Framework could aid global information exchange

UDEF is attracting the interest of the National Cancer Institute and IT groups.

An open-standards group has created a framework that could facilitate the global exchange of information among organizations. The naming system could benefit a wide range of disciplines, from disaster response to medical research.

The Open Group’s Universal Data Element Framework (UDEF) has the potential to hasten information exchange by indexing the world’s datasets — from e-commerce services to government registries and medical research databases — in one universally shared semantic repository.

And evidence shows that UDEF works. In October 2005, Open Group officials demonstrated the framework for members of the information technology community.

The demonstration applied UDEF to a disaster response situation. In the scenario, an imaginary emergency response team wanted information about the availability of 9-volt batteries in a retailer’s inventory database. An address repository managed by the U.S. Postal Service allowed the workers to determine the response team’s location in relation to manufacturers and retailers. Access to Office Depot’s database allowed the hypothetical workers to quickly check the batteries’ inventory status. Finally, MapQuest let users plot driving routes to stores that had batteries available.

Ron Schuldt, chairman of the Open Group UDEF Forum, said the framework’s coding provides a semantic link among disparate datasets. “If the UDEF is adopted on a global scale, enterprises will be able to reduce the costs of building and maintaining interfaces between enterprise applications,” said Schuldt, who is also a senior staff systems architect at Lockheed Martin Enterprise Information Systems.

UDEF provides a rigorous rules-based naming system. It involves mapping a data descriptor to a structured identifier that resembles the 123.123.123.123 format used in IP addresses. Schuldt simplified the idea by likening it to any basic classification system. “You go into a library to find a book, and whether you know it or not, the Dewey Decimal System is behind the scenes,” he said.
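The mapping from a data descriptor to a dotted identifier can be illustrated with a minimal Python sketch. It is not the Open Group's implementation: the `TERM_CODES` table, the numeric codes in it and the `descriptor_to_id` function are all hypothetical, invented only to show how a rules-based lookup might turn words into a structured identifier.

```python
# Hypothetical vocabulary: these codes are invented for illustration
# and are NOT real UDEF identifiers.
TERM_CODES = {
    "person": "5",
    "patient": "12",
    "height": "8",
    "name": "10",
}


def descriptor_to_id(descriptor: str) -> str:
    """Map a space-separated data descriptor to a dotted identifier.

    Each word is looked up in the shared vocabulary, and the resulting
    codes are joined with dots, in the same order as the words.
    """
    words = descriptor.lower().split()
    unknown = [w for w in words if w not in TERM_CODES]
    if unknown:
        # A rules-based scheme rejects names outside the vocabulary
        # rather than guessing a code for them.
        raise KeyError(f"no code registered for: {', '.join(unknown)}")
    return ".".join(TERM_CODES[w] for w in words)


print(descriptor_to_id("Patient Height"))
```

Under these assumptions, two systems that label the same field differently would still arrive at the same identifier, provided both names resolve to the same vocabulary entries.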

UDEF expands on traditional e-commerce by providing a means to link all components, such as bank accounts, inventories and other automated systems, to one central semantic hub called the Global UDEF Registry.

The disaster response demonstration provided only a glimpse of UDEF’s possibilities. The emergency workers had access to the phone numbers of local stores in the test, but a real-world deployment could allow users to place items on hold or ship them. Adding radio frequency identification (RFID) technology could also help track supplies.

The catalyst for the UDEF movement has been the increasing push for semantic interoperability. “The timing, I think, is right [for] the whole notion of getting one’s arms around the semantics,” Schuldt said. “You hear at the [World Wide Web Consortium] level the words ‘Semantic Web.’ You go to many different conferences on interoperability, and they all conclude that the piece that’s missing is dealing with the semantics. If you went to conferences five years ago, you wouldn’t hear that.”

He said the success of UDEF implementation hinges on the government, manufacturers and other industries providing controlled access to their real-time inventory data in a government-managed Web portal.

“A critical next step is to generate momentum and support for the Open Group’s effort to launch the Global UDEF Registry. They need participants,” Schuldt said.

A number of groups have expressed interest in UDEF’s potential for resolving some of the challenges to global interoperability, including the IPv6 community, the National Institutes of Health’s National Cancer Institute (NCI), the Electronics Industry Data Exchange Group and the Aerospace Industry Association. Most groups are awaiting the launch of the Global UDEF Registry, which Schuldt said he expects in the second quarter of 2006, before committing to the framework.

NCI has already agreed to pursue a limited test using its data infrastructure. The idea is that researchers can use the same technology that locates supplies to discover new research worldwide.

NCI is a prime candidate for demonstrating the power of a global registry because it already has an open-source data registry based on the same international data standard used by UDEF: the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) 11179 IT standard for metadata registries. That standard describes the grammar, or structure, of UDEF names. Any data registry based on the ISO/IEC 11179 standard can be mapped to UDEF names.

The main reason the framework can bridge such diverse communities is that it is standards-based.

“Right now, everybody is looking for a way to achieve semantic interoperability, not only within their enterprise, but across industries,” said Denise Warzel, an associate director at NCI’s Center for Bioinformatics. “UDEF codes present a way it could actually work. As far as we are concerned, there really is no other option for semantic interoperability, other than one-to-one negotiation between parties wanting to collaborate.”

UDEF standardizes the coding and meaning of language. “UDEF Global Registry becomes a centralized place for us to access and register our information using a common semantic naming convention, like a synonym,” Warzel said.

Communities that hesitate to participate in the Global UDEF Registry for privacy or security reasons can adopt security modules and other techniques to protect sensitive or proprietary data, as NCI has done. It created its data registry, the Cancer Data Standards Repository, in 1999, and it is accessible online. Researchers from a network of cancer centers in the United States, the United Kingdom and Iceland are formatting their data descriptors using the ISO/IEC 11179 standard when they collect the results of scientific research studies.

For instance, the standard allows a researcher who is testing how a drug interacts with a particular cancer to share findings with an oncologist using another computer application that connects to a different system in a different country. About a dozen NCI-designated cancer centers have contributed to the data registry as part of a new initiative, and more than 50 centers can access the data.

“Normally, scientific research is not shared until it is published,” Warzel said. “And when it is, rarely is it actually interoperable with other data repositories. This is a whole new paradigm for conducting cancer research.”

She suggested to Schuldt that NCI could lead a test program to show the potential power of the Global UDEF Registry.

“We’re confident it’s going to work for cancer, and if we can help influence the broader community, we would love to do it,” Warzel said.

Government officials have planned a conference for April 27 to promote UDEF and other naming technologies.

Schuldt’s test planted the seed in several sectors, including the CIO Council’s Semantic Interoperability Community of Practice (SICoP), the IPv6 sphere and the RFID industry, said Brand Niemann, co-chairman of SICoP.

Last year, the Open Group approached Niemann with the idea of conducting a UDEF test to enhance interoperability within the data reference model. He was so impressed by the outcome that his community of practice gave special recognition to the group at Lockheed Martin’s annual conference in February.

Niemann said UDEF is also particularly well-suited for the Defense Department and Centers for Disease Control and Prevention because officials must act fast under uncertain conditions. One academic researcher has already examined how data fusion can support military operations by conducting a defense supply logistics demonstration.

NCI wants to share selected research

Although the National Cancer Institute does not need the Universal Data Element Framework (UDEF) to accomplish its mission, the framework offers NCI the potential to share selected cancer research information with a global community. Therefore, the institute is considering a pilot project to demonstrate that possibility.

Mapping the whole NCI data registry, the Cancer Data Standards Repository, is not necessary to demonstrate the effectiveness of a Global UDEF Registry. In addition, converting the NCI repository would require a new computer program or months of manual labor.

NCI plans to carve out a subset of its registry and then have someone manually map NCI’s coding to UDEF’s coding. While NCI and UDEF use the same standard for data identifiers, each group has its own vocabulary for describing concepts.

For instance, if an NCI user wanted to contribute information that uses the word “patient,” the user would look up the word “patient” under the UDEF category “person” and use the code for that concept.

If done manually, it might take a person 10 to 15 minutes to find the UDEF code for “patient height.” Eventually, computers will be able to automatically look up equivalent codes if organizations adopt UDEF and integrate the coding into their systems.
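The automated lookup described above could, in principle, be as simple as a crosswalk table maintained by whoever does the manual mapping. The sketch below assumes that arrangement. The `CROSSWALK` entries, the placeholder codes and the `lookup_codes` function are hypothetical and do not reflect NCI's or the Open Group's actual data or software.

```python
# Hypothetical crosswalk from local registry terms to shared codes.
# The "hyp." codes are placeholders, not real UDEF identifiers.
CROSSWALK = {
    "patient height": "hyp.1.2",
    "drug name": "hyp.3.4",
}


def lookup_codes(local_terms, crosswalk):
    """Split local terms into those with a known code and those without.

    Returns (mapped, unmapped): a dict of term -> code, and a list of
    terms that still need a person to find the equivalent code.
    """
    mapped = {}
    unmapped = []
    for term in local_terms:
        key = term.strip().lower()  # normalize before matching
        if key in crosswalk:
            mapped[term] = crosswalk[key]
        else:
            unmapped.append(term)
    return mapped, unmapped


mapped, unmapped = lookup_codes(
    ["Patient Height", "Drug Name", "Pathology Report"], CROSSWALK
)
print(mapped)
print(unmapped)
```

Terms that land in the unmapped list are the ones that would still cost a person the 10 to 15 minutes of manual searching, so a small pilot keeps that list short.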

Denise Warzel, an associate director at the NCI Center for Bioinformatics, anticipates a meaningful pilot project could include 10 to 50 data items, such as drug name, lab and pathology report.

—Aliya Sternstein
