Document system to support Obama's transparency initiative, GPO says

The agency launched the new network in February to make federal digital records available, even if document formats like Microsoft Word change.

A digital publishing system the Government Printing Office is developing can support President Obama's goal to make government more transparent, according to a recent letter from the head of the agency.

The Federal Digital System, which GPO launched on Feb. 4, publishes and stores the increasing number of digital documents agencies produce. The public can log on to the system's Web site to search federal records. Later this year, the agency hopes to add more functionality, including the ability for agencies to upload digital documents and reports directly, and a searchable version of the mammoth Code of Federal Regulations.

"That's a big document. It's quite a task," said Mike Wash, the chief information officer at GPO. "Having that available online would be very useful."

In a letter to Obama released in mid-March, Public Printer Robert Tapella said the Federal Digital System could support the president's mandate to increase government transparency. "We now have a flexible and extensible digital system that can put into action your call for transparency and open government," he wrote.

GPO is the central repository for government documents and publishes a multitude of them online, including congressional publications, new regulations and a daily compilation of presidential papers.

"The letter was a reinforcement of our mission," Wash said. "We believe we've done a very nice job of engineering an information management system that if the White House would like us to, we can provide automated ways of doing things that are currently done manually. The data is all there."

The agency began work on the Federal Digital System in 2003 so it could publish and house a permanent collection of government documents. It established an FDSys program office in 2004 and spent the next two years analyzing the document storage market, examining issues such as how to guarantee digital records remain permanent and how to ensure that publications remain unchanged.

GPO began developing the system in 2007. The program's budget is $29 million, and to date, the agency has spent $20 million. It expects to spend another $5 million to $10 million upgrading the system.

Wash said GPO had to solve numerous issues to meet its goals because of the enormous amounts of data it has to process and store. For example, saving records in a particular format is problematic because software programs such as Microsoft Word or Adobe Acrobat, which are used to create documents, are updated over time, making it difficult if not impossible to recall documents stored in older formats. But federal records must be accessible for decades or even centuries.

FDSys automatically re-formats documents using an open format with XML tags, making them searchable. The system saves a copy of the XML document for its archives and duplicates it, using the second copy to create a publicly accessible record.
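
As a rough illustration of that approach, the Python sketch below wraps a document's text in simple XML tags, keeps one copy as the archival record, duplicates it as the public access copy, and searches the tagged text. The element names (document, title, body, para) and the function names are hypothetical, chosen only for this example; the article does not describe GPO's actual schema or code.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def to_tagged_xml(title: str, paragraphs: List[str]) -> str:
    """Wrap plain text in simple XML tags (hypothetical schema, not GPO's)."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = title
    body = ET.SubElement(doc, "body")
    for text in paragraphs:
        ET.SubElement(body, "para").text = text
    return ET.tostring(doc, encoding="unicode")


def ingest(title: str, paragraphs: List[str]) -> Dict[str, str]:
    """Create the archival XML record and a duplicate for public access."""
    archival = to_tagged_xml(title, paragraphs)
    access = str(archival)  # second copy, used to build the public rendering
    return {"archival": archival, "access": access}


def search(xml_text: str, term: str) -> List[str]:
    """Return the paragraphs whose text contains the term (case-insensitive)."""
    root = ET.fromstring(xml_text)
    needle = term.lower()
    return [p.text for p in root.iter("para") if p.text and needle in p.text.lower()]


copies = ingest(
    "Sample address",
    ["We will be transparent.", "Open government matters."],
)
matches = search(copies["access"], "open")
```

Because the tags describe structure rather than a vendor's file layout, a later rendering tool would only need to read the tags to produce a new public version from the archival copy.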

"Fifty years from now if someone wants to see Obama's inauguration speech, we'll have a new rendering on FDSys, compatible with whatever the current access tool is," Wash said.

XML is increasingly the format used for government databases and regulatory filings, but the real strength of the technology is that, because it is an open standard, files can more easily be converted to whatever formats are developed later.

The system also saves four copies of each document in addition to the version the public can access. GPO also duplicates the entire collection at another location.

The agency sends copies of the most important federal documents to 53 regional depository libraries, which house massive collections of government information. Barring a catastrophic, nationwide attack, Wash believes the system ensures that federal documents will be available for future generations.
