Vendor simplifies mining of data

Search Software America provides high-level data search and matching

Software that helps child welfare agencies locate delinquent parents and credit bureaus identify tardy payments can easily be adapted to help law enforcement track down potential terrorists, according to industry experts.

For example, Search Software America (SSA), which is expanding its presence in the federal market, is one of many data mining companies offering to help organizations search databases for information on individuals. The company currently provides search-matching capabilities for accurate customer lookup, fraud identification and screening of individuals.

The Greenwich, Conn.-based company recently rolled out a new version of its data mining software, Identity Systems (IDS) 2.1, which does not require databases to be reformatted into a specific model or "scrubbed." The software has been deployed worldwide in the private sector as well as by federal and state government agencies in the United States.

Data mining has been gaining prominence since the Sept. 11, 2001, terrorist attacks, after disclosures that federal agencies failed to share information about two terrorists who had been on a law enforcement watch list. Months later, the former Immigration and Naturalization Service disclosed that it had issued visas to two of the hijackers.

Many systems integrators are implementing SSA's software at state agencies in systems ranging from taxation to child welfare.

Meanwhile, the Internal Revenue Service and two federal intelligence agencies are using SSA's software because it is able to find matches "despite extreme error and variation," said Michael Dunkerley, the company's vice president of global marketing. The FBI also uses a version of the software for its National Crime Information Center.

The IRS is using the software for its Name Search Project, which is designed to match the name of a person or business with a tax identification number, according to Lee Smallwood, database administrator for the project.

"It is pretty good at what it does," Smallwood said. "The strength of it is that the vendor maintains all the translation tables and all the things that happen behind the scenes. You don't have to maintain the patterns of common spelling — the vendor does it."

IDS' core component is a search server application that indexes and searches an organization's identity data to match names, addresses and other identifying information. The search server runs on IBM Corp. OS/390 mainframes, Microsoft Corp. Windows NT/2000 and Unix platforms.

Without programming or changes to the organization's database design, IDS can be configured for use on operational systems or to build searchable data repositories.

The software searches data stored in IBM DB2, Microsoft SQL Server and Oracle Corp. databases. Its matching algorithms can relate one database table to another to uncover hidden relationships. The data does not have to be cleaned or reformatted first, and searches are supported in 55 languages.

While there are many data-matching approaches around, Dunkerley said SSA has developed database keys that "maximize our chance of finding all of the relevant candidates in the search."
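The idea behind such keys can be illustrated with a short sketch. This is not SSA's algorithm, which the article does not describe; it is a minimal, hypothetical scheme in which each name token is reduced to a coarse key (first letter plus a collapsed consonant skeleton), so that misspelled or reordered names still land on at least one key shared with the stored record. The function names and the key format are assumptions made for illustration only.

```python
from collections import defaultdict

VOWELS = "aeiouy"

def search_keys(name):
    """Return a set of coarse keys for a name (hypothetical scheme).

    Each token becomes: first letter + following consonants, with
    repeated letters collapsed, truncated to four characters.
    """
    keys = set()
    for token in name.lower().replace(",", " ").split():
        skeleton = token[0]
        for ch in token[1:]:
            # Drop vowels and collapse immediate repeats such as "ll".
            if ch not in VOWELS and ch != skeleton[-1]:
                skeleton += ch
        keys.add(skeleton[:4])
    return keys

def build_index(records):
    """Map every key to the IDs of the records that produce it."""
    index = defaultdict(set)
    for rec_id, name in records.items():
        for key in search_keys(name):
            index[key].add(rec_id)
    return index

def candidates(index, query):
    """Return IDs of all records sharing at least one key with the query."""
    found = set()
    for key in search_keys(query):
        found |= index.get(key, set())
    return found

records = {1: "John Smith", 2: "Smyth, Jon", 3: "Mary Jones"}
index = build_index(records)
print(candidates(index, "Jon Smithe"))  # record IDs 1 and 2
```

Generating several keys per record trades a larger index for a better chance that the right record appears among the candidates; a separate scoring step would then rank those candidates.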

SSA's software can match a name whether it appears as "William," "Bill," "Billy" or "Will," or as "Peter," "Pete," "Pietro" or "Pierre."
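A simple way to picture this kind of matching is a lookup table of name variants, similar in spirit to the vendor-maintained translation tables Smallwood describes. The sketch below is an assumption for illustration, not SSA's implementation: the table contents come from the examples above, and the names `NAME_VARIANTS`, `canonical_name` and `same_given_name` are hypothetical.

```python
# Hypothetical variant table: canonical given name -> known variants.
NAME_VARIANTS = {
    "william": {"william", "bill", "billy", "will"},
    "peter": {"peter", "pete", "pietro", "pierre"},
}

# Invert the table so any variant maps straight to its canonical form.
CANONICAL = {
    variant: canon
    for canon, variants in NAME_VARIANTS.items()
    for variant in variants
}

def canonical_name(given):
    """Normalize a given name; unknown names map to themselves."""
    key = given.strip().lower()
    return CANONICAL.get(key, key)

def same_given_name(a, b):
    """True when both names reduce to the same canonical form."""
    return canonical_name(a) == canonical_name(b)

print(same_given_name("Bill", "William"))  # True
print(same_given_name("Pietro", "Pete"))   # True
print(same_given_name("Bill", "Pete"))     # False
```

A production table would be far larger and language-specific, which is why having the vendor maintain it matters to users.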

Federal law enforcement agencies are looking at SSA's software, and countries worldwide, including Canada, are tapping the software for a wide range of purposes, including immigration matching and finding financial data, SSA officials said.

"More and more it is being recognized that there is information that could be available to agencies across many databases," said Simon Haigh, president of Lazy Software Inc., a British data mining company expanding in the United States.

"One of the problems we face is being able to aggregate that information [because], in general, all they have done is build extra silos of information," he said.

Since establishing a U.S. base last year, Haigh said, his company has been welcomed by agencies such as the Federal Emergency Management Agency and the FBI, which "basically indicated that we need to get firmly into partnerships with integrators that support them."

"We're seeing that certainly the largest interest is coming from Virginia and the Washington [D.C.] area," Haigh said.

***

Search and match

Search Software America's Identity Systems (IDS) server allows organizations to search databases for information on individuals.

IDS can be used on stand-alone systems or integrated with multiple databases and systems to form a centralized search index.

IDS provides:

* Intelligent indexing and searching of identity data for real-time or batch inquiries.

* Matching of data including account information, addresses, names, divisions or organizations.
