
When an application searches for specific text data, you can generally rely on it to get the search term right. For example, if it needs to find ‘sausages’, you won’t expect to receive a search on ‘sossyjez’. When people search your website or application, however, they have grown to expect it to cope with exactly that sort of input. In this article, I’ll be explaining some of the strategies that you can adopt to make the searches that are made by users more satisfactory for them. Although they are well-tried in the industry, it is rare to hear of them being used in SQL.

When you create your application, you will need to have an ‘inversion table’ that lists all the words that are legitimately ‘searchable’. If, for example, you are selling widgets, the inversion table would contain a list of widgets, and the widget spares, repairs, advice, instructions and so on. At this stage, we’ll stick to a single-language site, but if your site is multi-language, then the structure of the related tables is rather different. This word list will be referenced, by a foreign key constraint, from a large narrow table that records where the various entities that are associated with the word are stored, so that relevant matches, maybe a link, a phrase, a description or an image, can be returned to the user. This second table usually records at least the location of the text and the sequence number of the word within that text. It is the inversion table, containing the list of words, that we’ll focus on in this article.

When users search the site, even if they are using a single language, they will misspell or use national spellings. They will also use dialect words, or specialist words, for the widget, and even then they may get the spelling wrong. They will use argot or even textspeak (‘I H8 U 4 UR HCC’). In short, your “fuzzy search” algorithm ought to be able to cope with a lot of creative ways to search for the same thing, for example:

dialect differences – crayfish, crawfish, écrevisse, crawdads, yabbies
national spelling differences – yoghourt, yogurt and yoghurt
national word differences – pants and trousers
multiple valid spellings – encyclopaedia, encyclopedia
misspellings – definitely, often rendered as definitly, definately or defiantly
mistyping – computer as copmuter or comptuer

We’ll use three basic techniques to cope with all this: the ‘edit difference’, the hardcoded ‘common misspelling’ lookup, and the sound-alike.
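To make the three techniques named in this introduction more concrete, here is a minimal sketch in Python rather than SQL, purely for illustration. It is not the article's implementation. The word list, the `COMMON_MISSPELLINGS` table and the `suggest` helper are hypothetical names invented for the example; the ‘edit difference’ is assumed to be plain Levenshtein distance, and the sound-alike is assumed to be classic American Soundex.

```python
# Illustrative word list; in practice it would come from the inversion table.
WORDS = ["definitely", "computer", "encyclopedia", "yoghurt", "crayfish", "sausages"]

# Hypothetical hardcoded lookup of common misspellings.
COMMON_MISSPELLINGS = {"definately": "definitely", "definitly": "definitely"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


# Soundex digit for each consonant; vowels, y, h and w have no digit.
SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Classic four-character American Soundex code; non a-z characters are ignored."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    encoded = letters[0].upper()
    last_code = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != last_code:
            encoded += code
        last_code = code  # a vowel resets last_code, so a repeat is coded again
    return (encoded + "000")[:4]


def suggest(term: str, words: list[str], max_distance: int = 2) -> list[str]:
    """Try exact match, then the lookup, then edit distance, then sound-alike."""
    term = term.lower()
    if term in words:
        return [term]
    if term in COMMON_MISSPELLINGS:
        return [COMMON_MISSPELLINGS[term]]
    close = [w for w in words if edit_distance(term, w) <= max_distance]
    if close:
        return sorted(close, key=lambda w: edit_distance(term, w))
    code = soundex(term)
    return [w for w in words if soundex(w) == code]
```

With this word list, `suggest("definately", WORDS)` is resolved by the lookup, `suggest("copmuter", WORDS)` by edit distance (a transposition costs two edits under plain Levenshtein), and `suggest("sossyjez", WORDS)` falls through to the sound-alike step, where both spellings share the Soundex code `S222`. The order of the fallbacks and the distance threshold of 2 are arbitrary choices made for the sketch.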

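For the storage side described in this introduction, the inversion table and the narrow table that references it could be laid out roughly as follows. This is a sketch only, using an in-memory SQLite database driven from Python so that it can be run anywhere; the table and column names (`word`, `word_occurrence`, `location`, `sequence_no`) and the sample rows are assumptions made for the example, not the schema the article goes on to use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- The inversion table: one row per legitimately searchable word.
CREATE TABLE word (
    word_id INTEGER PRIMARY KEY,
    word    TEXT NOT NULL UNIQUE
);

-- The large narrow table: where each word occurs, and at which position.
CREATE TABLE word_occurrence (
    word_id     INTEGER NOT NULL REFERENCES word (word_id),
    location    TEXT    NOT NULL,   -- e.g. a page, link or description
    sequence_no INTEGER NOT NULL,   -- position of the word within that text
    PRIMARY KEY (word_id, location, sequence_no)
);
""")

# Hypothetical sample data for a widget shop.
conn.execute("INSERT INTO word (word_id, word) VALUES (1, 'widget')")
conn.executemany(
    "INSERT INTO word_occurrence (word_id, location, sequence_no) VALUES (?, ?, ?)",
    [(1, "/products/widgets", 3), (1, "/advice/widget-repairs", 12)],
)

# Resolve a search word to every place it occurs.
rows = conn.execute(
    """
    SELECT o.location, o.sequence_no
    FROM word AS w
    JOIN word_occurrence AS o ON o.word_id = w.word_id
    WHERE w.word = ?
    ORDER BY o.location
    """,
    ("widget",),
).fetchall()
```

Once a fuzzy search has settled on a word in the inversion table, a join of this kind is all that is needed to return the matching links, phrases or descriptions. The foreign key ensures that no occurrence can refer to a word that is missing from the word list.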