There are a few ways computers can tackle the name-search problem. A simple database search, called a key-based search, looks for a string of characters and tries to find a match. But if the string is not exact, the search comes up empty.
A more sophisticated search, using what is called fuzzy logic, can look for similar strings and return the closest matches. But that can return many inaccurate results. Researchers at Language Analysis Systems have found, for example, that looking for names with similar consonant sounds, a longstanding method of name analysis, could find these matches for the name Criton: Courtmanche, Corradino and Cortinez.
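The article does not say which consonant-based method the researchers tested. As an illustration only, the sketch below assumes American Soundex, one longstanding scheme of this kind, which keeps a name's first letter and encodes the consonants that follow as digits. The function name `soundex` and the table `CODES` are choices made for this example.

```python
# Minimal sketch of American Soundex (assumed here; the article names no specific method).
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name):
    """Return the four-character Soundex code for a name, or "" if it has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (result + "000")[:4]  # pad or truncate to one letter plus three digits


for name in ["Criton", "Courtmanche", "Corradino", "Cortinez"]:
    print(name, soundex(name))
```

Under this scheme all four names reduce to the same code, C635, which suggests why a search built on consonant sounds alone can return names that look nothing like the query.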
Language Analysis Systems has created NameClassifier, a program that attempts to identify the cultural origins of a name, and NameHunter, which searches for names with the same linguistic parameters, such as whether affixes like "y" and "de la" are used. NameHunter uses a pair-wise strategy, a technique that accounts for dissimilarities and random errors by basing its search on a series of comparisons, looking for shared properties.
" 'Maria De la Cruz Vasco de Gamma' has a lot more different things going on than a short Chinese name," said Dr. Hermansen. "If you can discover what culture you are looking at, you can apply the best computational techniques."
Intelligent Search Technology's system, NameSearch, also ranks results. But rather than identify a name's cultural origin, NameSearch employs a key-based strategy that uses multiple keys for each name, taking into account affixes, for example, or various phonetic spellings and nicknames. The search is then performed using a series of comparisons that determine how similar a name is to the keys.
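A multiple-key search of this general kind can be sketched in a few lines. This is not NameSearch's actual method, which the article does not detail. The affix and nickname tables, the function names and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical tables for illustration; a real system would use far larger ones.
AFFIXES = {"de", "la", "del", "y", "van", "von"}
NICKNAMES = {"bill": "william", "bob": "robert", "peggy": "margaret"}


def keys_for(name):
    """Build several search keys for one name."""
    tokens = [NICKNAMES.get(t, t) for t in name.lower().split()]
    full = " ".join(tokens)                                      # nicknames expanded
    stripped = " ".join(t for t in tokens if t not in AFFIXES)   # affixes removed
    keys = {full, stripped, stripped.replace(" ", "")}           # plus a run-together form
    keys.discard("")
    return keys


def score(query, candidate):
    """Best similarity (0.0 to 1.0) between any key of the query and any key of the candidate."""
    return max(
        (SequenceMatcher(None, q, k).ratio()
         for q in keys_for(query)
         for k in keys_for(candidate)),
        default=0.0,
    )


def rank(query, candidates):
    """Return candidates ordered from most to least similar to the query."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)


print(rank("Bill De la Cruz", ["John Smith", "William Cruz", "Wilma Cruise"]))
```

Because each name produces several keys, "Bill De la Cruz" and "William Cruz" share the key "william cruz" and score as an exact match, while less similar names fall lower in the ranking.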
Another company, Basis Technology, in Cambridge, Mass., has developed a language analyzer, based on phonics, for translating Latinized names back into their original alphabets so they can be searched against lists in the original languages.
Joel Ross, vice president of military and intelligence services for Basis, noted that "Qaddafi," for example, can be spelled at least 60 ways in the Latin alphabet. To match a name on a watch list, the Basis system takes a Latinized name and compares it to the company's own transcription scheme, so that it will match with the one Arabic version of the name.
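The idea of collapsing many Latin spellings onto one original-script entry can be shown with a toy example. The sketch below is not the Basis system. The normalization rules, the `normalize` and `match` functions and the one-entry watch list are invented for illustration and cover only a handful of spellings.

```python
import re

# Hypothetical one-entry watch list: normalized Latin key -> Arabic-script name.
WATCH_LIST = {"kadafi": "القذافي"}


def normalize(name):
    """Reduce a Latin spelling to a rough key using a few toy rules."""
    s = re.sub(r"[^a-z]", "", name.lower())      # keep letters only
    s = s.replace("q", "k").replace("g", "k")    # treat q, g and k as one sound
    s = s.replace("kh", "k").replace("dh", "d")  # drop the h in kh and dh
    s = re.sub(r"(.)\1+", r"\1", s)              # collapse doubled letters
    return s.replace("y", "i")                   # treat y as i


def match(name):
    """Return the Arabic-script watch-list entry for a Latin spelling, or None."""
    return WATCH_LIST.get(normalize(name))


for spelling in ["Qaddafi", "Gadhafi", "Kadafi", "Khadafy", "Qadhafi"]:
    print(spelling, "->", normalize(spelling), match(spelling))
```

All five spellings reduce to the same key and so find the same Arabic entry. Rules this crude would also merge names that should stay distinct, which is one reason a production system needs a far more careful transcription scheme.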
The future of name-searching, according to the companies working on it, is not in watch lists, but in sifting through huge quantities of digital documents, like those that might be found on terrorists' computers or intercepted online.
"Going through massive amounts of text and identifying when there's a name, city, corporation," is the next challenge, said Richard Wagner, the president of Intelligent Search Technology. "That's where this industry is headed."