SOUNDEX is an algorithm for encoding a name as its first letter followed by three digits, so that similar-sounding names produce the same code even though their spelling may differ considerably. If you try to find a BAILEY in the telephone directory, you are advised to try also BAILLIE, BAYLEY, BAYLY, and BAILYE. All these versions produce the same code, namely B400.

The algorithm has been around for a long time. It is often traced back to the famous 1890 U.S. Census, during which Dr. Hollerith introduced punched cards (he later formed a company which developed into IBM), but it keeps cropping up in magazines in various versions. It could be useful if you have a large address book with similar-sounding names, but it can also help with look-up of frequently misspelt words. In the dictionary part of the program, the code for 'scimitairey' will match 'cemetery', 'scimitar', and 'symmetry', so exact spelling is not essential.

The algorithm, as coded here, requires two passes. The first pass eliminates all non-letter characters and saves the group code of each letter, provided it is not a duplicate of the preceding one. The second pass sets the first character of the code to either the first letter or the group of the first letter, depending on the choice of the mode, then eliminates the non-significant copies.
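
The two passes described above can be sketched in Python as follows. This is an illustration only, not the original listing: the function name soundex, the dictionary_mode flag and the GROUPS table are assumptions, and the letter groups are the ones commonly quoted for Soundex, with vowels, H, W and Y placed in a non-significant group 0.

```python
# Assumed letter groups (the text does not list them); group 0 is non-significant.
GROUPS = {
    ch: str(digit)
    for digit, letters in enumerate(
        ("AEIOUYHW", "BFPV", "CGJKQSXZ", "DT", "L", "MN", "R")
    )
    for ch in letters
}


def soundex(name, dictionary_mode=False):
    """Hypothetical two-pass encoder following the description in the text."""
    # Pass 1: drop non-letters and save each letter's group code,
    # unless it duplicates the immediately preceding code.
    first_letter = ""
    codes = []
    for ch in name.upper():
        if ch not in GROUPS:
            continue  # not a plain letter A-Z
        if not first_letter:
            first_letter = ch
        group = GROUPS[ch]
        if not codes or codes[-1] != group:
            codes.append(group)
    if not codes:
        return ""  # nothing to encode

    # Pass 2: the first character is the first letter (name mode) or its
    # group (dictionary mode); then the non-significant 0 codes are dropped.
    head = codes[0] if dictionary_mode else first_letter
    tail = [group for group in codes[1:] if group != "0"]
    # Pad with zeros and cut to four characters.
    return (head + "".join(tail) + "000")[:4]
```

With these assumptions, soundex("BAILEY") and soundex("BAILLIE") both give B400, and in dictionary mode 'scimitairey', 'cemetery', 'scimitar' and 'symmetry' all give 2536.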

Simpler forms of the algorithm use only one pass, but they do not eliminate the repetition of the first letter ('llama' would be encoded as L450 rather than L500), and they do not recognise letters belonging to the same group but separated by one or more non-significant letters ('decision' and 'thicken' would both be encoded as 3250 in the dictionary mode, rather than 3225 and 3250 respectively).
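
For comparison, one possible one-pass form with both shortcomings might look like the sketch below. It is a guess at what such simpler versions do, not a listing from any particular magazine; the name soundex_one_pass, the dictionary_mode flag and the GROUPS table are assumptions, with the same commonly quoted letter groups as usual.

```python
# Assumed letter groups (the text does not list them); group 0 is non-significant.
GROUPS = {
    ch: str(digit)
    for digit, letters in enumerate(
        ("AEIOUYHW", "BFPV", "CGJKQSXZ", "DT", "L", "MN", "R")
    )
    for ch in letters
}


def soundex_one_pass(name, dictionary_mode=False):
    """Hypothetical simplified encoder showing the two weaknesses named above."""
    code = ""
    previous = ""  # last significant group appended to the code
    for ch in name.upper():
        if ch not in GROUPS:
            continue  # not a plain letter A-Z
        group = GROUPS[ch]
        if not code:
            # The first letter is stored, but its group is not remembered,
            # so a repeat of it is coded again ('llama' gives L450).
            code = group if dictionary_mode else ch
            continue
        # Non-significant letters are skipped without resetting 'previous',
        # so C and S in 'decision' collapse into a single 2.
        if group != "0" and group != previous:
            code += group
            previous = group
    # Pad with zeros and cut to four characters.
    return (code + "000")[:4]
```

Under these assumptions, soundex_one_pass("llama") returns L450, and both 'decision' and 'thicken' return 3250 in dictionary mode, as described in the text.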


★ PUBLISHER: The Amstrad User (Australia)
★ YEAR: 1986


