Tapping a data mother lode
CIA spin-off sees wealth of information in e-mail, Web pages and other sources
About 85 percent of the information in the databases of agencies or private corporations is so-called unstructured data — e-mail messages, word-processing documents and other files that do not fit neatly into organized rows and columns — but rarely do organizations tap into it.
Looking to reverse that trend, In-Q-Tel, the venture capital arm of the CIA, has invested more than $1 million in Stratify Inc., a specialist in managing such data, and signed on as a customer.
Stratify's tool automatically organizes millions of documents, e-mail messages and Web pages into an easy-to-navigate hierarchy that can be integrated with structured data, allowing analysts to draw new insights from immense bodies of information.
It's the kind of analysis that usually happens only with the quantitative data (dollars spent, number bought) stored in traditional databases.
"The goal here is to help the company develop products that government agencies can use off the shelf, because it's cheaper, quicker to implement and [it costs less] to maintain," said Gilman Louie, chief executive officer and president of In-Q-Tel.
In-Q-Tel contacted Stratify in March and in less than six months had agreed to invest in the firm and buy its software for internal use, said Nimish Mehta, Stratify's president and CEO.
There's a compelling argument for getting a better grasp on that unstructured data, Mehta said.
"Key corporate insight resides in that [unstructured] information," Mehta said. Management is relying only on structured data — which is about 15 percent of the total data — but "when running an agency, they ought to be using all the resources and data available.... This tool allows the same kind of structured access to unstructured information as, for example, an Oracle [Corp.] database gives for structured data."
Ramon Barquin, president and CEO of Barquin and Associates Inc., an information technology consulting firm specializing in knowledge management, said agencies have recognized the need to dig into their unstructured data for some time and that the events of Sept. 11 have only reinforced that need.
Barquin said Stratify's software could aid an agency that is sorting through myriad pieces of unstructured data "just fishing for things that have a bearing on that problem, and [that needs to] manage and find content that is going to be helpful in that problem solving."
In-Q-Tel officials agree.
"In a crisis like [the Sept. 11 attacks], people are easily buried in too much information, but every piece of information is important," Louie said. If you normally receive 30 e-mail messages a day and now receive 300 or 3,000, "you need a tool to organize the information that allows you to ingest and digest what's there and more easily categorize it."
The software can handle the entire Microsoft Corp. Office suite and Adobe Systems Inc.'s PDF, as well as Web-based HTML data and text. At the urging of In-Q-Tel, Stratify is now finishing work on support for Microsoft Exchange and Lotus Development Corp.'s Notes and is adding support for western European and Middle Eastern languages. Both additions should make its products more attractive to government buyers, Mehta said.
"We told Stratify, as part of their commercial strategy, that they can't be English only," Louie said. "And you better support Microsoft Exchange and Lotus Notes...to compete in the knowledge management world."
Stratify's Mehta said the company has had a number of discussions with federal agencies, and "by summer of next year, we expect to announce several significant government relationships."