Monday, July 12, 2010

Detecting similar strings (username like password)

Here are a few Ruby libraries that may help you out. I found them a while back on a Ruby site
when I was looking to compare strings for duplicates while cleansing data.

The first computes the Levenshtein (edit) distance between strings. (I prefer this one.)
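
To show what the distance idea looks like in practice, here is a minimal pure-Ruby sketch. It does not use the linked library: edit_distance and too_similar? are hypothetical helper names, and the threshold of 2 is only an assumption you would want to tune for your own data.

```ruby
# Hypothetical helper (not the linked library's API): counts the minimum
# number of single-character inserts, deletes and substitutions needed
# to turn string a into string b.
def edit_distance(a, b)
  # prev[j] is the distance between the prefix of a handled so far
  # and the first j characters of b.
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # delete from a
               curr[j - 1] + 1,        # insert into a
               prev[j - 1] + cost].min # substitute (or keep)
    end
    prev = curr
  end
  prev.last
end

# Hypothetical check for "username like password".
# max_distance is an assumed threshold, not a recommended value.
def too_similar?(username, password, max_distance = 2)
  edit_distance(username.downcase, password.downcase) <= max_distance
end
```

With this sketch, too_similar?("jsmith", "JSmith1") would flag the pair, since the two differ by a single character once case is ignored.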

The second is the traditional Soundex: keep the first letter, drop the vowels, and encode the
remaining consonants as digits. It's not very reliable, but it may help.
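
For reference, the Soundex steps can be sketched in plain Ruby roughly like this. It follows the common American Soundex rules as I understand them and is not taken from the linked library; SOUNDEX_CODES and soundex are names made up for this example.

```ruby
# Assumed letter-to-digit table for American Soundex.
# Vowels, H, W and Y have no code.
SOUNDEX_CODES = {
  "BFPV" => "1", "CGJKQSXZ" => "2", "DT" => "3",
  "L" => "4", "MN" => "5", "R" => "6"
}.flat_map { |letters, digit| letters.chars.map { |c| [c, digit] } }.to_h

# Hypothetical helper: returns a 4-character code such as "R163".
def soundex(word)
  letters = word.upcase.gsub(/[^A-Z]/, "")
  return "" if letters.empty?

  result = letters[0].dup
  last_code = SOUNDEX_CODES[letters[0]]
  letters[1..-1].each_char do |ch|
    next if ch == "H" || ch == "W" # H and W do not break a run of equal codes
    code = SOUNDEX_CODES[ch]
    result << code if code && code != last_code
    last_code = code               # a vowel resets the run (code is nil)
  end
  result.ljust(4, "0")[0, 4]       # pad with zeros, keep four characters
end
```

Two names are treated as sounding alike when their codes match, for example soundex("Robert") == soundex("Rupert"). Very different spellings can also collide, which is part of why it is not very reliable.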

The third, Metaphone, is like Soundex but handles larger strings.

http://raa.ruby-lang.org/project/levenshtein/
http://raa.ruby-lang.org/project/soundex/
http://raa.ruby-lang.org/project/metaphone/

Here is the link to the original post: