Navy takes pass on hot e-publishing technology

Saddled with an outdated electronic document system that wouldn't survive the Year 2000 rollover, the Navy's Trident II missile assembly contractor had to make a choice.

Lockheed Martin Space Systems had to select an electronic publishing format for nearly 30,000 pages of documentation for building the missiles and an extra 225,000 pages' worth of archives about missiles already produced.

Despite the buzz surrounding extensible markup language (XML), the Navy Strategic Systems Program Office directed Lockheed to convert the documents to standard generalized markup language (SGML).

The popularity of XML, a subset of SGML tailored for the World Wide Web, is driven by the growing interest in e-commerce (see sidebar). But it was SGML's ability to produce many types of documents from the same source file that won over the Navy, said Moulton Schwab, project engineer at the Navy Strategic Systems Program Office in Washington, D.C.

"We want to use it as an interactive document as well as be able to print it," Schwab said. "We were looking for a common format to produce whatever product somebody might want."

Both XML and SGML, a document-coding specification adopted as an International Organization for Standardization standard in the mid-1980s, format a document by placing electronic tags in it that identify the document's structure and the data itself. The tags allow data to be displayed in different formats across media, including on the Internet, from CDs, in printouts and onscreen.
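The single-source idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the element names (`procedure`, `title`, `step`) and the sample text are invented and are not taken from the Trident documentation. The sample is written as XML rather than SGML because Python's standard library ships an XML parser but not an SGML one.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical tagged source: the tags describe structure, not appearance.
SOURCE = """<procedure id="p1">
  <title>Install cover plate</title>
  <step>Align the plate with the mounting holes.</step>
  <step>Tighten the four bolts.</step>
</procedure>"""


def to_html(root):
    """Render the tagged source as an HTML fragment for onscreen display."""
    items = "".join(f"<li>{escape(step.text)}</li>" for step in root.findall("step"))
    return f"<h1>{escape(root.findtext('title'))}</h1><ol>{items}</ol>"


def to_text(root):
    """Render the same source as plain text suitable for a printout."""
    lines = [root.findtext("title").upper()]
    for number, step in enumerate(root.findall("step"), start=1):
        lines.append(f"{number}. {step.text}")
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_text(root)
```

Both outputs come from the same tagged file; only the rendering function changes, which is the property Schwab cites as the reason for choosing a structural markup format.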

With an SGML-formatted file, it is easy to later convert the document to Adobe Systems Inc.'s portable document format (PDF), Hypertext Markup Language (HTML) or XML, Schwab said.

"Since we didn't know how much we wanted of each or where the industry's going, we wanted a known standard," Schwab said.

Other government agencies with large document libraries also are looking at how to store and display the information. There are numerous options, each with different strengths and weaknesses, but the most popular are those such as SGML and XML that are based on public standards.

Eventually, the whole Trident assembly process could become paperless, and the conversion from the original Xerox Integrated Composition System (XICS)-encoded ASCII text to SGML was a big step in that process, Schwab said. The electronic documents are used at the Strategic Weapons Facility Atlantic in Kings Bay, Ga., the Eastern Range at Cape Canaveral Air Station, Fla., and aboard submarines.

Because the original documents were not all created in a uniform way and Lockheed had little experience with SGML, the firm outsourced the job to the Data Conversion Laboratory, which completed the conversion last March.

New York City-based DCL also provided Lockheed with a software tool so that Lockheed could convert additional documents into SGML on its own, according to Mark Ferrara, technical lead of the Advanced Systems Group for Lockheed Martin Space Systems.

"For anybody doing serious publishing of paper, there's no substitute for SGML," he said.

Lockheed had about a week to provide DCL with the Trident source data, a collection of several gigabytes of text files, Ferrara said. Lockheed provided DCL with an XICS-to-SGML standard conversion map with instructions for handling exceptions. DCL created a test run of 1,000 pages. Once that was successful, it fired up the production conversion system, plowing through 1,500 to 2,000 pages per week.

DCL automated much of the process by using software tools that can infer what the tags in the original document should be in SGML, said Mark Gross, president of DCL. But when the system encountered problems, having the instructions from Lockheed on what to do helped minimize delays.
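A map-driven conversion with an exception list, of the general kind described above, might look like the following Python sketch. It is not DCL's tool or Lockheed's conversion map: the legacy codes (`@HEAD`, `@PARA`, `@WARN`, `@FIG7`), the target element names and the sample lines are all hypothetical, and real XICS markup is not reproduced here. The sketch also assumes the text contains no characters that would need escaping in the output.

```python
# Hypothetical conversion map: legacy formatting code -> target element name.
CONVERSION_MAP = {"@HEAD": "title", "@PARA": "para", "@WARN": "warning"}


def convert(lines, conversion_map):
    """Convert coded lines to tagged lines; collect anything the map cannot handle."""
    converted, exceptions = [], []
    for line_number, line in enumerate(lines, start=1):
        # Each sample line is "<code> <text>"; split on the first space.
        code, _, text = line.partition(" ")
        element = conversion_map.get(code)
        if element is None:
            # Unknown code: set it aside for a person (or extra rules) to resolve.
            exceptions.append((line_number, line))
            continue
        converted.append(f"<{element}>{text}</{element}>")
    return converted, exceptions


legacy = [
    "@HEAD Motor inspection",
    "@PARA Check the casing for cracks.",
    "@FIG7 see drawing",
]
sgml_lines, unresolved = convert(legacy, CONVERSION_MAP)
```

Keeping unmapped lines in a separate list, rather than guessing a tag for them, mirrors the role the article gives to Lockheed's exception instructions: the automated pass handles the routine cases and the leftovers are resolved by explicit rules.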

The military prefers SGML because it puts information in a standard form that lends itself to creating interactive electronic technical manuals, Gross said.

"They want to be able to get information to fix a tank off the laptop or they want the computer to be smart enough to follow the steps the technician is following to know what the next page will be," Gross said. In the case of the Trident program, the Navy hopes to link barcodes on missile parts with appropriate sections in the text, Schwab said.
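One way such a barcode link could work is to record the part identifier as an attribute on each section and look it up when a barcode is scanned. The Python sketch below assumes exactly that; the `part` attribute, the `PN-` identifiers and the section titles are invented for illustration, and the article does not say how the Navy plans to implement the link. The sample is XML so that Python's standard parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical manual in which each section carries the part number it documents.
MANUAL = """<manual>
  <section id="s1" part="PN-0001"><title>Nozzle assembly</title></section>
  <section id="s2" part="PN-0002"><title>Guidance housing</title></section>
</manual>"""


def section_for_barcode(root, barcode):
    """Return the first section tagged with the scanned part number, or None."""
    for section in root.iter("section"):
        if section.get("part") == barcode:
            return section
    return None


root = ET.fromstring(MANUAL)
match = section_for_barcode(root, "PN-0002")
missing = section_for_barcode(root, "PN-9999")
```

Here `match` is the section element for the scanned part, and an unknown barcode yields `None` so the calling program can report that no section was found.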

DCL also is working for the Library of Congress and the Agriculture Department, Gross said.

The National Library of Medicine has converted 250 book-length documents to SGML during the past six years. The documents can be as large as 1,000 pages each, said Maureen Prettyman, computer specialist and project leader at NLM. The books were converted from a variety of document formats.

"In order to guide the user through this morass of information, we have to provide some structure," Prettyman said. When Prettyman started the conversion, SGML was an emerging standard. The library chose SGML because it is platform-independent and supports object identification and maintenance in a hierarchical structure of text, she said.

Because the National Library of Medicine did not take any shortcuts or use SGML's advanced features, the data can easily be converted to XML for use on the Web. A lot of government agencies are switching to XML, Prettyman said. But because mass conversions are expensive, those who already use SGML tend to stay with it, she said.

Documents such as the Congressional Record, which need exact document replication, are better suited for PDF, Prettyman said. But for searchable text, PDF isn't the best option, she said.