FirstGov completes its search

Government Web portal will use search engine that clusters results

The federal government will soon lure citizens to its Web pages with a revamped official government Web portal that looks and feels a lot like popular search sites.

FirstGov users will get search results clustered into groups of related hits through new technologies that the General Services Administration is buying.

Vivisimo, which operates the Clusty.com search engine, and Microsoft MSN Search will power the government portal under a $1.8 million contract announced late last month. Vivisimo and Microsoft will offer FirstGov users more organized and comprehensive results from media outlets, image libraries and government Web pages, officials said.

The move indicates a shift in government acquisition practices toward mirroring industry trends rather than playing catch-up. FirstGov had been using search supplier Fast Search and Transfer since 2002. When government officials decided they wanted FirstGov to adopt more advanced, user-friendly search capabilities, they awarded a new contract in less than three months.

"Since we first awarded that [Fast Search] contract in 2002, search has changed a lot," said M.J. Pizzella, associate administrator of GSA's Office of Citizen Services and Communications, which oversees FirstGov. "We try to stay on the cutting edge as best we can, considering the restraints on government with budgets."

The contract is part of an $18 million, five-year blanket purchase agreement that FirstGov set up to acquire future search capabilities. The BPA enables officials to award a new contract every year to a host of vendors, including Vivisimo, Fast Search and Gigablast. Well-known search companies Google and Yahoo did not bid, government officials said.

"Next year, when the renewal date comes up, we could have the option to look at any other company that is under the BPA," Pizzella said. "This technology is moving so fast that if you lock yourself in, you can't move as fast as technology."

Pizzella said the government "is often woefully behind technology. Hopefully, this will allow the government to move in real time with the pace of technology."

The new search engine will be three times as large and half as costly as FirstGov's current search capabilities. Now, FirstGov indexes about 8 million federal government Web pages. Vivisimo and MSN will comb through about 24 million pages from federal, state, local, tribal and territorial government Internet sources. For the first time, FirstGov will retrieve images and news articles from the mainstream press. In the same search, the engine will also scour government images and official agency news releases.

FirstGov will query multiple databases and Web pages, then list results grouped by subject. This feature is possible because of two next-generation search enhancements, metasearch and clustering. Metasearch digs through multiple databases, search engines and the Web to answer queries. Clustering clumps search results based on textual and linguistic similarities. Users see subject-matter hyperlinks, which they can click to get subsets of results. Clustering lets users see results that would otherwise be buried at the end of ranked lists.

For instance, a search for "nursing jobs" may return one cluster of job listings on USAJobs, the federal government's job database, and a separate group of links to health programs culled from other government sites.
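
As a rough illustration of the two ideas, here is a minimal Python sketch that merges hits from several sources (metasearch) and then groups them by a shared title word (clustering). It is not Vivisimo's or FirstGov's actual method. The source names, titles, example.gov URLs, stopword list and labeling rule are all hypothetical, chosen only to mirror the "nursing jobs" example.

```python
import re
from collections import defaultdict

# Hypothetical sample data: each source returns (title, url) pairs.
SOURCES = {
    "job_database": [
        ("Nursing jobs: current federal openings", "https://example.gov/jobs/1"),
        ("Federal openings for nursing assistant jobs", "https://example.gov/jobs/2"),
    ],
    "agency_pages": [
        ("Nursing education health programs", "https://example.gov/health/1"),
        ("Rural health programs for nursing staff", "https://example.gov/health/2"),
        # Duplicate of a hit already returned by the job database.
        ("Nursing jobs: current federal openings", "https://example.gov/jobs/1"),
    ],
}

STOPWORDS = {"at", "for", "the", "and", "of"}  # illustrative, not exhaustive


def metasearch(sources):
    """Merge results from every source, keeping the first copy of each URL."""
    seen, merged = set(), []
    for hits in sources.values():
        for title, url in hits:
            if url not in seen:
                seen.add(url)
                merged.append((title, url))
    return merged


def cluster(results, query):
    """Group results under the title word they share with the most other hits.

    Query words are ignored because they appear in nearly every hit and
    would not separate the results into useful groups.
    """
    ignored = STOPWORDS | set(re.findall(r"[a-z]+", query.lower()))

    def words(title):
        return set(re.findall(r"[a-z]+", title.lower())) - ignored

    # Count how many result titles contain each remaining word.
    counts = defaultdict(int)
    for title, _ in results:
        for word in words(title):
            counts[word] += 1

    clusters = defaultdict(list)
    for title, url in results:
        candidates = sorted(words(title))  # sorted so ties break alphabetically
        label = max(candidates, key=lambda w: counts[w]) if candidates else "other"
        clusters[label].append(url)
    return dict(clusters)


merged = metasearch(SOURCES)
groups = cluster(merged, "nursing jobs")
```

With this sample data, the job listings land in one group and the health program pages in another, and the duplicate job listing is kept only once. A production system would rely on far richer linguistic analysis than a single shared word.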

"We're going to build the best government search portal in the world," said Raul Valdes-Perez, co-founder of Vivisimo and an adjunct associate professor of computer science at Carnegie Mellon University.

Clusty's development was funded by a grant from the National Science Foundation. Vivisimo has since worked with defense organizations, the Energy Information Administration and the Social Security Administration.

"We think it's a great payback to the taxpayer," Valdes-Perez said.

Government computer programmers say GSA made the right decision in choosing a known private-sector provider, adding that Vivisimo's metasearch and clustering will revolutionize FirstGov's service.

Tamas Doszkocs, a computer scientist at the National Library of Medicine, has been working on ToxSeek, a metasearch and clustering engine, for almost a decade. ToxSeek scours toxicology and environmental health databases at government agencies, then groups results under categories such as "National Library of Medicine," "MedlinePlus," and "exhibits and archives."

"If I were in their shoes, I would do the same thing," Doszkocs said. "It's always safer and more reliable to go with a proven commercial solution. I'm glad that the public will get a quantum leap in the quality of information."

But he said more sophisticated search engines exist, including ToxSeek, Google and Yahoo. Government employees were not allowed to submit proposals.

"If the others didn't compete, then this is the best solution," Doszkocs said.

Before the award was announced, he and his colleagues had talked with FirstGov officials and demonstrated a prototype of ToxSeek that could enhance whichever commercial service they selected. The Homeland Security Digital Library, a program run by the Homeland Security Department and the Naval Postgraduate School, deployed a version of ToxSeek more than six months ago.

Before Vivisimo was chosen, Doszkocs said, "Our hope is to get together once the dust settles and somehow achieve a way to utilize a metasearch and clustering technology on top of the new FirstGov system."

He said FirstGov's promised clustering capability will show other government agencies an alternative to plain-vanilla search functions.

Google, noticeably absent from the FirstGov project, said it had wanted to serve as a subcontractor, which conflicted with the task order's requirement that the search provider act as the prime contractor.

"Because the requirements for FirstGov search were highly customized, Google proposed to work as a subcontractor alongside a partner who could better deliver on some of the specialized requirements," said Dave Girouard, general manager of Google Enterprise. "But this structure was not acceptable to FirstGov."

Valdes-Perez declined to name Google but said Vivisimo, as a matter of corporate policy, spoke to all Web search companies before teaming with Microsoft.

"They were very flexible and accommodating to the needs of FirstGov, blending their Microsoft search results with other search results, [as part of metasearch] and clustering," he said.

Valdes-Perez added that recent reports show people notice little difference in quality between Yahoo, Google and Microsoft. "If anything, the MSN search will get better because it's more recently launched."

Speculating about Google's absence from the FirstGov agreement, Doszkocs said, "This is small potatoes for them."

Finding what you want

FirstGov will soon offer users smoother navigation of the Web. Here are some of its new features.

  • Search results will be clustered into groups of related hits.
  • The current index of about 8 million federal government Web pages will more than triple when more federal, state, local, tribal and territorial government sites are added.
  • Images and news articles from the media will be available for the first time.

— Aliya Sternstein