Lose the file cabinets

New PDF/A standard preps electronic documents for long-term archiving.

About three years ago, when Enron and other corporate giants were crumbling, they left behind a mountain of records. Around that same time, the Administrative Office of the U.S. Courts began collaborating with Adobe to create a file format that would retain data from those and other records, maintain the data's formatting and visual presentation and adapt to future generations of technology.

After seeking input from vendors, academia, federal agencies and others, Adobe and the U.S. Courts founded the Association for Information and Image Management's standards working group. The International Organization for Standardization (ISO) has approved the working group's results: the PDF Archival (PDF/A) standard. The standard will be available from ISO at www.iso.org after publication in about a month, according to a National Information Standards Organization newsletter.

PDF/A specifies a subset of the PDF file format that is more suitable for long-term preservation than the traditional PDF. For example, the standard will forbid links to outside documents and require fonts to be embedded in documents, making the documents completely independent of outside resources, said Diana Helander, Adobe's business development manager for worldwide standards. The documents must include metadata so that archived documents can be fully searchable, auditable and traceable.

"There's no encryption allowed, so that if Adobe doesn't exist in 200 years, someone can still open the document," Helander said. "We expect to see a lot more government agencies adopting PDF/A."

Government officials and experts say PDF/A could help cut costs. Future access to electronic documents depends on maintaining the ability to read them. The PDF/A standard would allow PDF documents to be retained longer and reduce the costs associated with long-term preservation.

"Standardization really gets down to consistency and cost-containment," said Stephen Levenson, a special assistant in the Office of Information Technology at U.S. Courts and chairman of the working group. "It's really nothing different than saying paper has to be 8 1/2 by 11 inches."

Planning for the future now will help cut costs, he added.

At a June 20 meeting in Hamburg, Germany, the working group addressed specifications for the next version of the standard, including security issues such as digital signatures.

Experts say archivists should incorporate PDF/A into the National Archives and Records Administration's Electronic Records Archives program. The $500 million ERA program is a federal effort to save government records in any format and make them available on future hardware and software.

In August NARA officials will award the project to a team led by either Harris or Lockheed Martin.

Charles Dollar, a senior consultant for Cohasset Associates in Chicago, said PDF/A seems like a reasonable choice for any electronic archive program given the standard's flexibility.

"There's a great deal of work that needs to be done upstream," he said. "Use of PDF/A would help mitigate some of these problems."

Because of the cost of creating, saving and storing records, an easily transferable solution such as PDF/A should generate savings for U.S. Courts, the U.S. Patent and Trademark Office and other agencies, Dollar said.

Electronic records management system vendors say they welcome the standard but wish access to PDF/A were free.

"Adobe's PDF has gained such popularity that it has almost become the de facto standard for modern systems," said Straughan Schofield, general manager for product development and support at Tower Software. "PDF/A becoming an official international standard is a natural progression and recognition of current reality. Having said that, there remain issues of intellectual property ownership. Naturally, as an independent software vendor, Tower Software acknowledges Adobe's rights, but suggests that Adobe has a lot to gain by letting go a little."

PDF/A requirements

To ensure that archived electronic documents will be useful decades in the future, the new standard requires documents' data to be self-contained and independent of other documents.

PDF/A outlaws the use of:

  • Encryption.
  • Embedded files.
  • External links.
  • JavaScript.

PDF/A mandates the use of:

  • Embedded fonts.
  • Metadata.
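
For readers who want to see how the two lists might be applied, the following is a minimal sketch in Python. It assumes a hypothetical DocumentFeatures record that a records system has already filled in for each file; the record, its field names and the check_requirements function are illustrative inventions, not part of the standard or of any Adobe product, and the sketch does not read PDF files or test conformance to the published ISO text.

```python
from dataclasses import dataclass

# Feature names taken from the two lists in this sidebar.
FORBIDDEN = ("encryption", "embedded_files", "external_links", "javascript")
REQUIRED = ("embedded_fonts", "metadata")


@dataclass
class DocumentFeatures:
    """Hypothetical summary of one document; not produced by any real library."""
    encryption: bool = False
    embedded_files: bool = False
    external_links: bool = False
    javascript: bool = False
    embedded_fonts: bool = False
    metadata: bool = False


def check_requirements(doc: DocumentFeatures) -> list[str]:
    """Return a list of problems; an empty list means both lists are satisfied."""
    problems = []
    for name in FORBIDDEN:
        if getattr(doc, name):
            problems.append(f"forbidden feature present: {name}")
    for name in REQUIRED:
        if not getattr(doc, name):
            problems.append(f"required feature missing: {name}")
    return problems


if __name__ == "__main__":
    # An encrypted document with embedded fonts but no metadata.
    sample = DocumentFeatures(encryption=True, embedded_fonts=True)
    for problem in check_requirements(sample):
        print(problem)
```

Keeping the two lists as plain data means the check could be extended if a later version of the standard adds or relaxes a rule, without changing the function itself.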