How green is your data?

The information side of IT — the capture, management and storage of data — provides another candidate for energy-saving initiatives.

Green information technology is often treated as a straight hardware play. Powering down PCs at night and consolidating servers top most organizations’ energy conservation checklists.

But the information side of information technology, meaning the capture, management and storage of data, provides another candidate for greening. The growth in data is not likely to slow anytime soon, and all that information has to be stored on disk and tape systems that add to data centers’ power and cooling requirements. So consolidating storage, or at least slowing the proliferation of new devices, can help an IT shop boost its environmental profile.

Greening your data

Here are some basic steps for reducing the environmental footprint of electronically stored data.

  • Identify redundant and poor-quality data.
  • Assess the cost of managing that data, factoring in expenses such as the additional storage hardware required.
  • Develop a strategy for eliminating redundant and poor-quality data.
  • Create a single, authoritative source of data, often called a master or gold copy, that multiple departments can use, rather than each one creating its own.
  • Deliver and exchange data via reusable data services, as service-oriented architectures do.
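
The first two steps lend themselves to a quick audit script. What follows is a minimal Python sketch, not a method described by anyone quoted in this article. It assumes records are simple dictionaries, treats two records as redundant when their fields match after trimming whitespace and ignoring case, and flags a record as poor quality when a required field is blank. The field names, the sample records and the function names are hypothetical.

```python
from collections import Counter


def normalize(record):
    """Return a hashable form of a record that ignores case and padding."""
    return tuple(sorted((k, str(v).strip().lower()) for k, v in record.items()))


def audit_records(records, required_fields):
    """Count redundant copies and records with blank required fields."""
    counts = Counter(normalize(r) for r in records)
    # Every copy beyond the first of an identical record is redundant.
    redundant = sum(n - 1 for n in counts.values())
    # A record is poor quality if any required field is absent or blank.
    poor = sum(
        1
        for r in records
        if any(not str(r.get(f, "")).strip() for f in required_fields)
    )
    return {"total": len(records), "redundant": redundant, "poor_quality": poor}


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Ada Smith", "city": "Salem"},
    {"name": "ada smith ", "city": "SALEM"},
    {"name": "Bo Lee", "city": ""},
]
report = audit_records(records, required_fields=["name", "city"])
print(report)
```

Multiplying the redundant count by an average record size and a per-gigabyte storage cost would give a rough input for the second step, though both figures would have to come from the organization’s own systems.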

“The typical interest in green and sustainability is around the data center and energy efficiency, but organizations across the spectrum are beginning to realize there are other dimensions,” said Harsh Sharma, chairman of the Object Management Group’s Sustainability Special Interest Group. “How green is your data? That’s another important dimension.”

But with a few exceptions, government entities, like everyone else, appear to be focused on hardware when it comes to green initiatives. A CDW Government report on energy-efficient IT listed the top three energy-saving measures among federal agencies interviewed for the study: migrate to LCD monitors, buy Energy Star-compliant gear and pursue server virtualization.

One obstacle on the path to green data is the lack of a documented return on investment. Reports abound on the server side — many from vendors interested in selling newer, energy-efficient products — but scant information exists about data-related energy costs and the potential for reducing them, which makes it harder for enterprises to justify projects that address green data.

Moreover, systematic approaches to green computing of any kind are uncommon. Therefore, agencies are on their own as they try to develop more comprehensive and strategic sustainability plans that include data management among other initiatives, such as green buildings, regulatory compliance and procurement practices.

The Object Management Group’s sustainability group is leading an effort to develop such a holistic approach with input from several federal agencies and other organizations. The group’s Sustainability Assessment Model would give organizations a way to define a standard set of metrics for assessing, measuring and reporting on different dimensions of green initiatives, including data management, Sharma said.

In the meantime, some organizations are reporting good results when they include data in their broader energy conservation efforts.

A green adjunct

Oregon took on green data management as an adjunct to the state’s data center consolidation effort. About four years ago, Oregon consolidated 11 centers into one. Server virtualization — migrating multiple servers to operate as virtual machines on fewer physical devices — helped the state reduce energy costs and its carbon footprint. A new storage strategy also contributed to the savings.

“When we first consolidated in 2006, we spent a lot of time re-architecting the data storage and putting in tiered storage,” said Dugan Petty, Oregon’s chief information officer.

The data center chose a mix of disk storage and virtual tape libraries. State agencies and departments that use the data center’s services can select the storage approach that is most cost-effective and uses the least amount of energy for their particular needs.

The data center established a rate structure that provides incentives for adopting that approach. Rates start low for storage that has a smaller energy footprint, such as on-site tape, and increase for higher tiers of storage that offer better performance but consume more energy.
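
A rate structure of this kind can be illustrated with a short Python sketch. The article gives no actual rates, so every number below is invented, as are the performance scores, the tier names other than on-site tape, and the function names. The sketch only shows the selection logic: a customer picks the cheapest tier that still meets its performance needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rate_per_gb_month: float  # hypothetical chargeback rate, in dollars
    performance: int          # hypothetical score: higher means faster access


# Illustrative tiers only; rates rise with performance and energy use.
TIERS = [
    Tier("on-site tape", 0.02, 1),
    Tier("midrange disk", 0.05, 2),
    Tier("high-performance disk", 0.15, 3),
]


def cheapest_tier(required_performance, tiers=TIERS):
    """Return the lowest-rate tier whose performance meets the requirement."""
    candidates = [t for t in tiers if t.performance >= required_performance]
    if not candidates:
        raise ValueError("no tier meets the required performance")
    return min(candidates, key=lambda t: t.rate_per_gb_month)


def monthly_charge(size_gb, required_performance):
    """Return the chosen tier name and the monthly charge for size_gb."""
    tier = cheapest_tier(required_performance)
    return tier.name, round(size_gb * tier.rate_per_gb_month, 2)


print(monthly_charge(500, required_performance=1))
print(monthly_charge(500, required_performance=3))
```

Because the charge scales with the rate, an agency that can tolerate slower access has a direct financial reason to choose the lower-energy tier.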

Petty said the data center will eventually consider a strategy to identify and eliminate duplicate copies of data. Deduplication, usually done with an automated software tool, eliminates redundant data, thereby shrinking storage and backup requirements. The reduced storage demand in turn eases cooling and power requirements.

Rockwell Bonecutter, global green IT lead at Accenture, said deduplication can shrink the physical space required for datasets and files by 30 to 90 percent.
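
To make the idea concrete, here is a minimal Python sketch of one common deduplication approach: split the data into fixed-size chunks, hash each chunk, and store each distinct chunk only once. It is an illustration under stated assumptions, not the tool Oregon or Accenture would use. The 4 KB chunk size, the sample data and the function name `dedup_savings` are hypothetical, and the sketch ignores the index overhead a real product would carry.

```python
import hashlib


def dedup_savings(data: bytes, chunk_size: int = 4096) -> float:
    """Return the fraction of space saved by storing each unique chunk once."""
    if not data:
        return 0.0
    unique = {}  # maps chunk digest -> chunk length in bytes
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique.values())
    return 1 - stored / len(data)


# Hypothetical sample: four identical 4 KB chunks plus one distinct chunk.
sample = b"A" * 4096 * 4 + b"B" * 4096
print(f"{dedup_savings(sample):.0%}")
```

The savings depend entirely on how repetitive the data is, which is consistent with the wide 30 to 90 percent range cited above.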

Outside the data center, Oregon officials have already sought to reduce the redundancy of geographic information systems data.

The state’s Geospatial Enterprise Office used to make copies of GIS imagery it acquired and distribute them on CDs to state agencies and counties. Agencies would then load the data on their local servers to boost performance for users. Now, the office operates an imagery portal through which agencies can download data as needed.

The need for a more centralized approach became evident as the size of the state’s GIS holdings grew. In the mid-1990s, its GIS dataset was 325 GB, but it grew to 4.1 TB in 2005 and is expected to reach about 7 TB this year, said Ed Arabas, Oregon’s senior operations and policy analyst. That growth stems from the arrival of color imagery and half-meter resolution in 2005. The still-larger 2009 dataset will include infrared imagery.

Chilling data

The California Institute of Technology’s Spitzer Science Center offers another approach to green data and storage. The center handles science tasks for NASA’s Spitzer Space Telescope, and its data center houses an archive of the data the telescope collects, among other things.

A couple of years ago, Eugean Hacopians, a senior systems engineer who runs Spitzer’s data center, began talking to Nexsan Technologies about the storage vendor’s massive array of idle disks (MAID) technology. MAID reduces the power consumption of storage because the disk drives spin at full speed only when they are in use.

Hacopians said the MAID discussions inspired him to think about the broader implications of energy consumption, so the data center began taking additional steps to reduce energy use. A plastic wall was installed to shrink the area of the computer room that needed cooling, with plastic panels isolating the cold and hot aisles. Hacopians said the idea came from a freezer curtain at a supermarket. Overall, the project cost $25,000, but it cut the center’s air conditioning requirements in half.

“Why waste so much energy running the air conditioning full blast and making the computer room like an icebox?” he asked.

Hacopians said green was not the project’s initial goal but a side effect of making air conditioning and power usage more manageable.

Nevertheless, some agencies remain focused on servers in their green efforts. The Los Alamos National Laboratory began a server virtualization push in 2006. The lab has eliminated 105 physical servers and now runs more than 250 virtual servers on 13 physical hosts. Virtualization has saved the lab 873,000 kilowatt-hours per year and about $639,000 in energy costs.

Before virtualization, the lab was “running into the proverbial brick wall in terms of cooling and power in the data center,” said Anil Karmel, solutions architect at Los Alamos’ Network and Infrastructure Engineering division. He said the lab will investigate the data side of the equation in the future, but he believes there are still savings to be had in virtual servers.

“I honestly haven’t seen any data regarding energy efficiency as it relates to storage,” Karmel said. “All the metrics that we have seen have been for traditional server virtualization.”

Sharma said that although the lack of metrics has kept some organizations from pursuing green data initiatives, that attitude will likely change as they advance beyond the easy wins of server and computer efficiency and develop more comprehensive approaches to sustainable energy use.