Tuesday, March 19, 2013

Techniques for approximate string matching

Since we are planning to add fuzzy matching of strings (user names) to our "Who's Calling?" Android application, here is a list of tools and algorithms that can be useful for this task.

  1. Levenshtein distance: we used this years ago in the first Casus prototype. Levenshtein is a tricky name; I remembered it as Levenstein, but since its Levenshtein distance to the original name is just one, it was not too difficult to find the real one ;-) 
  2. Approximate String matching 
  3. Somewhat related: a very interesting article on writing your own spell checker 
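
To make the first item in the list concrete, here is a minimal sketch of Levenshtein distance in Python, using the standard dynamic-programming approach with two rows. It is only an illustration, not the code from the Casus prototype or from our Android application. The `closest_names` helper, its `max_distance` threshold of 2, and the sample names are assumptions made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory

    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_names(query, names, max_distance=2):
    """Hypothetical helper: names within max_distance of query, nearest first.

    The comparison is case-insensitive; ties are ordered alphabetically.
    """
    scored = [(levenshtein(query.lower(), name.lower()), name) for name in names]
    return [name for distance, name in sorted(scored) if distance <= max_distance]


print(levenshtein("Levenstein", "Levenshtein"))    # → 1
print(closest_names("Jon", ["John", "Joan", "Mary"]))  # → ['Joan', 'John']
```

Comparing the query against every name costs time proportional to the product of the two string lengths per pair, which should be fine for a phone-sized contact list; a larger list would probably need some form of indexing.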

This list is by no means complete. Perhaps there will be a pointer to a follow-up article once there is some progress on the implementation of our text matching solution.

Please feel free to add more ideas or pointers in the comments area. Thank you.
