Preserving e-history for all

Efforts to capture Web snapshots expose e-records dilemmas

In just two weeks, all federal agencies are supposed to have turned in "snapshots" of their Web sites as they appeared in the final hours of the Clinton administration.

The project is intended to preserve a picture of electronic government at the peak of its evolution under the first electronic administration.

Thus, the National Archives and Records Administration expects an avalanche of electronic data by March 20: 21 terabytes in all, enough to fill about 14 million floppy disks, or roughly twice the amount of data contained in the books of the Library of Congress.

In the process of creating a digital time capsule, the snapshot project is also spotlighting a problem that has been quietly nagging at archivists: How should agencies handle electronic records that exist only on the Internet?

The answer is elusive. According to an Archives memo, the Web site snapshots may be the first effort some agencies have made to preserve Internet records.

"During the past eight years, agency use of the Internet has grown tremendously," according to the memo. Yet, "many agencies have not yet scheduled their Web site records."

"Schedule" is a term archivists use for the process of determining which documents are important enough to be official records and how long various records must be kept.

It is neither practical nor affordable for federal agencies to routinely preserve snapshots of their Web sites, records experts say.

"We hope this does not become a regular thing. We hope we do not do it every four years or every eight years," said Michael Miller, director of modern records programs at the Archives. However, preserving documents on an ongoing basis is essential. "We hope to develop guidelines for managing Web sites and keeping Web records."

Miller said he hopes to have "Web guidance that will be useful for the long-term management of records on federal Web sites" by the end of March.

"Good luck," said J. Timothy Sprehe, a records management expert and president of Sprehe Information Management Associates. He and other records experts have found that use of the Internet is increasing too fast for records management policies to keep up.

For example, a number of federal agencies now routinely accept e-mailed public comments on rule-makings and post them on their Web sites. Such comments qualify as official records, Sprehe said, but in many instances may never be committed to paper—thus never becoming part of the agencies' official records.

Some federal agencies have begun conducting "town meetings" and other interactive events using the Internet. During these sessions, agency officials typically respond to questions from the public. The responses constitute official records, yet in some instances they have not been recorded or preserved.

Beyond the question of what is a record lies the question of how to preserve it. The Archives has developed methods for storing the content of electronic documents independent of their formats. Now the agency must develop methods and standards for digital images, digital photographs, videos and e-mail messages and attachments.

"Almost no agency in the federal government is paying serious attention to e-mail records management," Sprehe said.

A historical treasure trove

When departing Clintonites ordered all federal agencies to take Web site "snapshots," their concern was preserving the administration's electronic legacy. But for the National Archives and Records Administration, the project promises to create a treasure trove of useful information for future researchers.

"We save these things for one reason and find that people find tons of ways to use them," said Michael Miller, director of modern records programs at NARA. He cites the example of Nazi accounting records captured during the collapse of Germany in 1945. Largely unused for half a century, they have proven invaluable in recent years for tracing gold and treasure looted by the Nazis.

There is sure to be similar utility in the Jan. 20 snapshots of government Web sites, Miller said. Some agencies may find them useful in settling legal disputes, and researchers can use them to trace the early development of e-government.

NARA still must figure out how to store and, ultimately, how to search 21 terabytes of digital information. "We're not scaled to do that now," Miller said. But, "we felt we would be kicking ourselves if we did not" jump at the chance to preserve such a large chunk of digital history.
