<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>netCUBED Blog &#187; phonetic</title>
	<atom:link href="http://blog.netcubed.de/tag/phonetic/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.netcubed.de</link>
	<description>Just another web developer's weblog</description>
	<lastBuildDate>Mon, 29 Jun 2009 20:58:35 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Phonetic search with DBIC</title>
		<link>http://blog.netcubed.de/2009/05/phonetic-search-with-dbic/</link>
		<comments>http://blog.netcubed.de/2009/05/phonetic-search-with-dbic/#comments</comments>
		<pubDate>Fri, 29 May 2009 14:49:42 +0000</pubDate>
		<dc:creator>Moritz Onken</dc:creator>
				<category><![CDATA[Perl]]></category>
		<category><![CDATA[dbic]]></category>
		<category><![CDATA[phonetic]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://blog.netcubed.de/?p=97</guid>
		<description><![CDATA[I recently ran into the problem that many people have almost identical last names, and it is hard to guess every possible spelling until you finally find the person you are looking for.
In German, there are a lot of possible spellings for the name &#8220;Maier&#8221; which all sound the same: Meyer, Meier, Mayer, [...]]]></description>
			<content:encoded><![CDATA[<p>I recently ran into the problem that many people have almost identical last names, and it is hard to guess every possible spelling until you finally find the person you are looking for.</p>
<p>In German, there are a lot of possible spellings for the name &#8220;Maier&#8221; which all sound the same: Meyer, Meier, Mayer, Mayr, Meyr etc.</p>
<p>A phonetic algorithm reduces a given word to a digest that is the same for all words that sound similar. After asking for a proper name on #dbix-class, I came up with the DBIx-Class-PhoneticSearch module.</p>
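<p>To get a feeling for what such a digest looks like, here is a minimal sketch that calls <code>Text::Phonetic</code> directly, without DBIC. It assumes the <code>Text::Phonetic</code> distribution is installed from CPAN and uses its <code>Koeln</code> encoder; the list of names is just the example from above.</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">use strict;
use warnings;
use Text::Phonetic::Koeln;   # assumption: Text::Phonetic is installed from CPAN

# One encoder object can be reused for any number of words.
my $koeln = Text::Phonetic::Koeln-&gt;new();

# Map every spelling to its phonetic digest.
my %digest = map { $_ =&gt; $koeln-&gt;encode($_) } qw(Maier Meyer Meier Mayer Mayr Meyr);

# Print spelling and digest side by side.
printf "%-6s %s\n", $_, $digest{$_} for sort keys %digest;</pre></div></div>

<p>If the algorithm treats these spellings as homophones, every line shows the same digest. Searching by digest instead of by spelling is the basic idea behind a phonetic search.</p>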
<p>For now it is only available from <a href="http://github.com/monken/DBIx-Class-PhoneticSearch/tree/master">github</a>, but it will be on CPAN sometime soon :-)</p>
<p>The usage is pretty easy:</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">    <span style="color: #000066;">package</span> MySchema<span style="color: #339933;">::</span><span style="color: #006600;">User</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">use</span> base <span style="color: #ff0000;">'DBIx::Class'</span><span style="color: #339933;">;</span>
&nbsp;
    __PACKAGE__<span style="color: #339933;">-&gt;</span><span style="color: #006600;">load_components</span><span style="color: #009900;">&#40;</span><span style="color: #000066;">qw</span><span style="color: #009900;">&#40;</span>PhoneticSearch Core<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    __PACKAGE__<span style="color: #339933;">-&gt;</span><span style="color: #006600;">table</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'user'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    __PACKAGE__<span style="color: #339933;">-&gt;</span><span style="color: #006600;">add_columns</span><span style="color: #009900;">&#40;</span>
      id       <span style="color: #339933;">=&gt;</span> <span style="color: #009900;">&#123;</span> data_type <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'integer'</span><span style="color: #339933;">,</span> auto_increment <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      surname  <span style="color: #339933;">=&gt;</span> <span style="color: #009900;">&#123;</span> data_type <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'character varying'</span><span style="color: #339933;">,</span> 
                    phonetic_search <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1</span> <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      forename <span style="color: #339933;">=&gt;</span> <span style="color: #009900;">&#123;</span> data_type <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'character varying'</span><span style="color: #339933;">,</span> 
                    phonetic_search <span style="color: #339933;">=&gt;</span> <span style="color: #009900;">&#123;</span> algorithm <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'Koeln'</span><span style="color: #339933;">,</span> 
                                         no_indices <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">1</span> <span style="color: #009900;">&#125;</span> <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
&nbsp;
    <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    __PACKAGE__<span style="color: #339933;">-&gt;</span><span style="color: #006600;">set_primary_key</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'id'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    __PACKAGE__<span style="color: #339933;">-&gt;</span><span style="color: #006600;">resultset_class</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'DBIx::Class::ResultSet::PhoneticSearch'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>This defines a result class with a forename and a surname column, both of which are phonetic-enabled. surname uses the default algorithm, while forename uses the Koeln algorithm, which has been optimized for German names and words. Make sure you <code>deploy()</code> the schema again, or add the two phonetic columns to the existing table yourself:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">ALTER</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`user`</span> <span style="color: #993333; font-weight: bold;">ADD</span> <span style="color: #993333; font-weight: bold;">COLUMN</span> <span style="color: #ff0000;">`surname_phonetic_phonix`</span> CHARACTER VARYING;
<span style="color: #993333; font-weight: bold;">ALTER</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`user`</span> <span style="color: #993333; font-weight: bold;">ADD</span> <span style="color: #993333; font-weight: bold;">COLUMN</span> <span style="color: #ff0000;">`forename_phonetic_koeln`</span> CHARACTER VARYING;</pre></div></div>

<p>Now you can search for any user by a similar sounding name:</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">  <span style="color: #0000ff;">$rs</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$schema</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">resultset</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'User'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #0000ff;">$rs</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">create</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span> forename <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'John'</span><span style="color: #339933;">,</span> surname <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'Night'</span> <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #0000ff;">$rs</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">search_phonetic</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span> forename <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'Jon'</span> <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">first</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">forename</span><span style="color: #339933;">;</span>  <span style="color: #666666; font-style: italic;"># John</span>
  <span style="color: #0000ff;">$rs</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">search_phonetic</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span> surname <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'Knight'</span> <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">first</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">surname</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;"># Night</span></pre></div></div>

<p>The default algorithm is <a href="http://search.cpan.org/perldoc?Text::Phonetic::Phonix">Phonix</a>, which is IMHO far superior to the popular Soundex algorithm. For example, the last search above (Knight -> Night) does not work with Soundex.</p>
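<p>To see why Soundex fails here, the following is a small, self-contained sketch of the classic American Soundex rules in plain Perl. It is only an illustration and not code from DBIx-Class-PhoneticSearch; the <code>soundex</code> function is a hypothetical helper written for this example, and in real code you would probably reach for a CPAN implementation instead.</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">use strict;
use warnings;

# Hypothetical helper: classic American Soundex (first letter + three digits).
sub soundex {
    my ($name) = @_;

    # Consonant groups share a digit; vowels and Y have no code.
    my %code;
    @code{qw(B F P V)}         = (1) x 4;
    @code{qw(C G J K Q S X Z)} = (2) x 8;
    @code{qw(D T)}             = (3) x 2;
    $code{L}                   = 4;
    @code{qw(M N)}             = (5) x 2;
    $code{R}                   = 6;

    my @letters = grep { /[A-Z]/ } split //, uc $name;
    return '' unless @letters;

    my $first = shift @letters;      # the first letter is always kept as-is
    my $prev  = $code{$first} // 0;
    my $out   = $first;

    for my $ch (@letters) {
        next if $ch =~ /[HW]/;       # H and W do not separate equal codes
        my $digit = $code{$ch} // 0; # vowels and Y count as 0
        $out .= $digit if $digit and $digit != $prev;
        $prev = $digit;
    }

    # Pad with zeros and cut to four characters.
    return substr($out . '000', 0, 4);
}

printf "%s =&gt; %s\n", $_, soundex($_) for qw(Knight Night);
# prints:
# Knight =&gt; K523
# Night =&gt; N230</pre></div></div>

<p>Soundex always keeps the first letter, so the silent K puts the two names into different buckets and a lookup for Knight never finds Night.</p>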
]]></content:encoded>
			<wfw:commentRss>http://blog.netcubed.de/2009/05/phonetic-search-with-dbic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
