Data deduplication is Green



There's been a lot of attention on Green IT (sometimes known as Green Computing).

Green IT is about sustainable and responsible IT practices: using the resources that power the data center responsibly, so that what this generation consumes does not compromise the ability of the next generation to do the same.

IT is inherently wasteful. The two biggest drains on resources are power and cooling. Have you ever thought about how much electricity is used to power a data center, and how much of that power actually goes into driving the equipment in the data center?

According to one study by Lawrence Livermore National Laboratory, for every 1.0 watt consumed by the system, 0.7 watt is used to cool it. That means 1.7 watts are required to power and cool a system, a 70% overhead "wasted" on cooling (about 41% of the total draw). That study did not include the energy lost in power distribution as electricity is delivered to the equipment in the data center, which could be as much as 20%.
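
To make the arithmetic concrete, here is a small Python sketch. The 0.7 watt of cooling per watt and the 20% distribution loss come from the figures above; the function name and the way the loss is modeled (a fraction of the supplied power that never reaches the system or its cooling) are my own assumptions, so treat the output as a rough illustration rather than a measurement.

```python
def watts_drawn_per_system_watt(cooling_per_watt=0.7, distribution_loss=0.0):
    """Estimate watts drawn from the supply for every watt the system consumes.

    Assumption: distribution_loss is the fraction of supplied power lost
    before it reaches the system and its cooling.
    """
    if not 0.0 <= distribution_loss < 1.0:
        raise ValueError("distribution_loss must be in the range [0, 1)")
    load = 1.0 + cooling_per_watt           # system plus cooling
    return load / (1.0 - distribution_loss)


no_loss = watts_drawn_per_system_watt()
with_loss = watts_drawn_per_system_watt(distribution_loss=0.2)
cooling_share = 0.7 / no_loss               # cooling as a share of the total

print(f"Power + cooling only:       {no_loss:.3f} W per system watt")
print(f"With 20% distribution loss: {with_loss:.3f} W per system watt")
print(f"Cooling share of the total: {cooling_share:.0%}")
```

Under these assumptions, the worst case means more than two watts are drawn for every watt that does useful work in the system.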

The study was done in 2004 and, given the voracious demand for information today, the demand for electricity has never been stronger. In another study by the U.S. Environmental Protection Agency (EPA) in 2006, one of the tables (shown below) indicates that 50% of the electricity goes to site infrastructure, including cooling.



That is a lot of wastage and it's not Green.

As we go through the various features of any storage system, one of the most compelling is data deduplication. I have written about data deduplication in previous blog posts and yes, it is definitely a must-have for all storage systems. But the hype and the fuss have left many confused.

Where should I run dedupe? What types of data should I dedupe?

I can see a virtual "boxing ring" right now. In the blue corner, primary storage dedupe, and in the red, backup dedupe ...

I went through the NAS vs SAN wars some years back, and data deduplication should not go that way. There is no competition; there is no "mine is better than yours" thingy. All this fuss and hype is made by dedupe vendors in the market, peddling their wares.

Customers must not be sucked into these things, because ultimately the customer must always fall back to what he/she wants to achieve in the overall IT architecture and design. Evaluate well and thoroughly. Run the spreadsheets if you must because I believe, ultimately, it is about the bottom line.

Always look at operational efficiency as one of the key criteria when evaluating data deduplication features.

If Green is on the customer's agenda, then yes, data deduplication can help. It significantly reduces the storage capacity needed to store the original data (when it works). By reducing inactive or semi-active data, and even aligning the deduped data with storage tiers, it also reduces the amount of power and cooling required to run and process the active data set.
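
For readers who want to see where the capacity reduction comes from, here is a minimal Python sketch. It assumes fixed-size 4 KB chunks fingerprinted with SHA-256, which is only one simple way to dedupe; it is not how Ocarina or any particular vendor does it, and the function name and sample data are made up for illustration.

```python
import hashlib


def dedupe_stats(data: bytes, chunk_size: int = 4096):
    """Return (logical_bytes, stored_bytes) after fixed-size chunk dedupe.

    Assumption: a chunk is stored only the first time its SHA-256
    fingerprint is seen; later copies cost nothing but a reference.
    """
    seen = set()
    stored = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            stored += len(chunk)
    return len(data), stored


# Hypothetical sample: ten 4 KB chunks, only two of them unique.
sample = (b"A" * 4096) * 8 + (b"B" * 4096) * 2
logical, stored = dedupe_stats(sample)
ratio = logical / stored

print(f"Logical: {logical} bytes, stored: {stored} bytes, ratio {ratio:.1f}:1")
```

Real data rarely repeats this neatly, which is why "when it works" matters: the ratio depends entirely on how much duplication is actually in the data set.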

That's a biggie because when power and cooling are reduced, operational efficiency goes up, along with productivity and the responsiveness of the storage system. It lowers costs and hence improves operational profits.

And that is also the primary reason why Real Data Matrix's i-Infrastructure Optimization includes Ocarina Networks ECOsystem. It is probably one of the most efficient primary storage data deduplication and optimization solutions today.
