I recently ran into the problem that many people have almost identical last names, and it is hard to guess every possible spelling before you finally find the person you are looking for.

In German, there are many possible spellings of the name “Maier” that all sound the same: Meyer, Meier, Mayer, Mayr, Meyr, etc.

A phonetic algorithm reduces a given word to a digest that is the same for all words that sound similar. After asking for a proper name on #dbix-class, I came up with the DBIx-Class-PhoneticSearch module.

For now it is only available from GitHub, but sometime soon it will be on CPAN :-)

The usage is pretty easy:

    package MySchema::User;
 
    use base 'DBIx::Class';
 
    __PACKAGE__->load_components(qw(PhoneticSearch Core));
 
    __PACKAGE__->table('user');
 
    __PACKAGE__->add_columns(
      id       => { data_type => 'integer', auto_increment => 1, },
      surname  => { data_type => 'character varying', 
                    phonetic_search => 1 },
      forename => { data_type => 'character varying', 
                    phonetic_search => { algorithm => 'Koeln', 
                                         no_indices => 1 } },
 
    );
 
    __PACKAGE__->set_primary_key('id');
 
    __PACKAGE__->resultset_class('DBIx::Class::ResultSet::PhoneticSearch');

This defines a result class with a forename and a surname column, both phonetic-enabled. surname uses the default algorithm, while forename uses the Koeln algorithm, which has been optimized for German names and words. Make sure you deploy() that schema again or add the two phonetic columns to your existing table:

    ALTER TABLE `user` ADD COLUMN `surname_phonetic_phonix` CHARACTER VARYING;
    ALTER TABLE `user` ADD COLUMN `forename_phonetic_koeln` CHARACTER VARYING;

Now you can search for any user by a similar sounding name:

    $rs = $schema->resultset('User');
    $rs->create({ forename => 'John', surname => 'Night' });

    $rs->search_phonetic({ forename => 'Jon' })->first->forename;  # John
    $rs->search_phonetic({ surname => 'Knight' })->first->surname; # Night

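The default algorithm is Phonix, which is IMHO far superior to the popular Soundex algorithm. For example, the last search (Knight -> Night) does not work with Soundex, because Soundex always keeps the first letter of a word as it is, so the silent K ends up in the digest.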
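To make the digest idea and the Soundex limitation a bit more concrete, here is a minimal pure-Perl sketch of classic American Soundex. It is not the Phonix or Koeln code the module relies on, and the `soundex()` helper and the `%CODE` table are just illustrative names I made up for this post.

```perl
use strict;
use warnings;

# Soundex consonant classes; vowels, Y, H and W have no code.
my %CODE;
@CODE{qw(B F P V)}         = (1) x 4;
@CODE{qw(C G J K Q S X Z)} = (2) x 8;
@CODE{qw(D T)}             = (3) x 2;
$CODE{L}                   = 4;
@CODE{qw(M N)}             = (5) x 2;
$CODE{R}                   = 6;

# Illustrative helper: returns the four-character Soundex digest of a name.
sub soundex {
    my ($name) = @_;
    my @letters = grep { /[A-Z]/ } split //, uc $name;
    return '' unless @letters;

    my $first  = shift @letters;     # the first letter is always kept as is
    my $digest = $first;
    my $prev   = $CODE{$first} // 0;

    for my $ch (@letters) {
        next if $ch eq 'H' || $ch eq 'W';   # H and W do not separate equal codes
        my $code = $CODE{$ch} // 0;         # vowels and Y map to 0
        $digest .= $code if $code && $code != $prev;
        $prev = $code;
    }

    # Pad with zeros and cut to one letter plus three digits.
    return substr($digest . '000', 0, 4);
}

for my $name (qw(Maier Meyer Meier Mayer Mayr Meyr Knight Night)) {
    printf "%-7s %s\n", $name, soundex($name);
}
```

All six spellings of Maier collapse to the digest M600, so a lookup by digest finds every variant. Knight becomes K523 while Night becomes N230, which is why a Soundex-based search cannot match the two.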