Navy takes pass on hot e-publishing technology
Saddled with an outdated electronic document system that wouldn't survive the Year 2000 rollover, the Navy's Trident II missile assembly contractor had to make a choice.
Lockheed Martin Space Systems had to select an electronic publishing
format for nearly 30,000 pages of documentation for building the missiles
and an extra 225,000 pages' worth of archives about missiles already produced.
Despite the buzz surrounding extensible markup language (XML), the Navy
Strategic Systems Program Office directed Lockheed to convert the documents
to standard generalized markup language (SGML).
The popularity of XML, a subset of SGML tailored for the World Wide
Web, is driven by the growing interest in e-commerce (see sidebar). But
it was SGML's ability to produce many types of documents from the same source
file that won over the Navy, said Moulton Schwab, project engineer at the
Navy Strategic Systems Program Office in Washington, D.C.
"We want to use it as an interactive document as well as be able to
print it," Schwab said. "We were looking for a common format to produce
whatever product somebody might want."
Both XML and SGML, a document-coding specification adopted as an
International Organization for Standardization standard in the mid-1980s,
work by placing electronic tags in a document that identify the
document's structure and the data itself. The tags allow data to be displayed
in different formats across media, including on the Internet, from CDs,
in printouts and onscreen.
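
As a rough illustration of that single-source idea, the sketch below renders one tagged fragment two ways, once as HTML and once as plain text. It is not taken from the Trident documentation: the element names (procedure, title, step) and the sample content are invented for the example, and XML is used as a stand-in because Python's standard library ships an XML parser but no SGML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged source. The element names and text are illustrative
# assumptions, not the tag set used for the Trident documents.
SOURCE = """<procedure id="p1">
  <title>Inspect connector</title>
  <step>Remove the cover.</step>
  <step>Check the pins.</step>
</procedure>"""


def to_html(root):
    """Render the procedure as a heading plus an ordered list."""
    items = "".join(f"<li>{step.text}</li>" for step in root.findall("step"))
    return f"<h1>{root.findtext('title')}</h1><ol>{items}</ol>"


def to_text(root):
    """Render the same procedure as numbered plain text for printing."""
    lines = [root.findtext("title").upper()]
    for number, step in enumerate(root.findall("step"), start=1):
        lines.append(f"{number}. {step.text}")
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_out = to_html(root)
text_out = to_text(root)
print(html_out)
print(text_out)
```

Because the tags describe what each piece of text is rather than how it looks, adding another output format means writing another small renderer, not re-keying the document.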
With an SGML-formatted file, it is easy to later convert the document
to Adobe Systems Inc.'s portable document format (PDF), Hypertext Markup
Language (HTML) or XML, Schwab said.
"Since we didn't know how much we wanted of each or where the industry's
going, we wanted a known standard," Schwab said.
Other government agencies with large document libraries also are looking
at how to store and display the information. There are numerous options,
each with different strengths and weaknesses, but the most popular are those
such as SGML and XML that are based on public standards.
Eventually, the whole Trident assembly process could become paperless,
and the conversion from the original Xerox Integrated Composition System
(XICS)-encoded ASCII text to SGML was a big step in that process, Schwab
said. The electronic documents are used at the Strategic Weapons Facility
Atlantic in Kings Bay, Ga., the Eastern Range at Cape Canaveral Air Station,
Fla., and aboard submarines.
Because the original documents were not all created in a uniform way
and Lockheed had little experience with SGML, the firm outsourced the job
to the Data Conversion Laboratory, which completed the conversion last March.
New York City-based DCL also provided Lockheed with a software tool
that lets the firm convert additional documents into SGML on its own, according to Mark
Ferrara, technical lead of the Advanced Systems Group for Lockheed Martin
Space Systems.
"For anybody doing serious publishing of paper, there's no substitute
for SGML," he said.
Lockheed had about a week to provide DCL with the Trident source data,
a collection of several gigabytes of text files, Ferrara said. Lockheed
provided DCL with an XICS-to-SGML standard conversion map with instructions
for handling exceptions. DCL created a test run of 1,000 pages. Once that
was successful, it fired up the production conversion system, plowing through
1,500 to 2,000 pages per week.
DCL automated much of the process by using software tools that can infer
what the tags in the original document should be in SGML, said Mark Gross,
president of DCL. But when the system encountered problems, having the instructions
from Lockheed on what to do helped minimize delays.
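
A map-driven conversion with an exception list can be sketched in a few lines. This is a simplified guess at the general shape of such a process, not DCL's software: the article does not show any XICS codes, so the legacy codes (.HD, .PP, .ST), the target tags and the sample lines below are all hypothetical.

```python
from xml.sax.saxutils import escape

# Hypothetical conversion map from legacy formatting codes to target tags.
# Real XICS codes are not given in the article; these are placeholders.
CONVERSION_MAP = {".HD": "title", ".PP": "para", ".ST": "step"}


def convert(lines, mapping):
    """Tag each line using the map; set aside lines whose code is unknown."""
    converted, exceptions = [], []
    for number, line in enumerate(lines, start=1):
        code, _, text = line.partition(" ")
        tag = mapping.get(code)
        if tag is None:
            # No rule for this code: record it for a person to review
            # instead of guessing at a tag.
            exceptions.append((number, line))
            continue
        converted.append(f"<{tag}>{escape(text)}</{tag}>")
    return converted, exceptions


sample = [
    ".HD Inspect connector",
    ".PP Wear gloves.",
    ".ZZ unknown code",
    ".ST Remove the cover.",
]
converted, exceptions = convert(sample, CONVERSION_MAP)
print(converted)
print(exceptions)
```

Keeping unmapped lines in a separate list, with their line numbers, is one way written instructions for exceptions could be applied without stopping the whole run.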
The military prefers SGML because it puts information in a standard
form that lends itself to creating interactive electronic technical manuals,
Gross said.
"They want to be able to get information to fix a tank off the laptop
or they want the computer to be smart enough to follow the steps the technician
is following to know what the next page will be," Gross said. In the case
of the Trident program, the Navy hopes to link barcodes on missile parts
with appropriate sections in the text, Schwab said.
DCL also is working for the Library of Congress and the Agriculture
Department, Gross said.
The National Library of Medicine has converted 250 book-length documents
to SGML during the past six years. The documents can be as large as 1,000
pages each, said Maureen Prettyman, computer specialist and project leader
at NLM. The books were converted from a variety of document formats.
"In order to guide the user through this morass of information, we have
to provide some structure," Prettyman said. When Prettyman started the conversion,
SGML was an emerging standard. The library chose SGML because it is platform-independent
and supports object identification and maintenance in a hierarchical structure
of text, she said.
Because the National Library of Medicine did not take any shortcuts
or use advanced SGML features, the data can easily be converted to XML for
use on the Web. A lot of government agencies are switching to XML, Prettyman
said. But because mass conversions are expensive, those who already use
SGML tend to stay with it, she said.
Documents such as the Congressional Record, which need exact document
replication, are better suited for PDF, Prettyman said. But for searchable
text, PDF isn't the best option, she said.