To dedupe or not to dedupe? Is that a question?



There is too much hype about deduplication software these days. Customers are confused because every storage or data management vendor in town is touting how great the technology is. Yes, deduplication is good. In fact, it's a GREAT feature!

But do customers and vendors need to be so engrossed with deduplication that they miss the true nature of what they want to achieve with it in their storage environment? Come on, ask yourself the simple question, "What do I want from deduplication?" I have come across many who say that dedupe is great; it is mind-blowing; blah, blah, blah. But when I ask what it is that they want from dedupe, many blabber something that sounds intelligent, hoping it will save their egos.

Really, what do you want to achieve with dedupe? Is it ROI? Is it the overall strategy to save cost? Is it to improve operational efficiency?

Like all new technologies (remember how hot iSCSI or thin provisioning was?), deduplication should not be treated as a panacea for all IT woes. Find out more. Test the solution thoroughly before deciding to invest in deduplication technology.

In fact, there are many implementations of deduplication. Two prominent ones are deduplication for backup and deduplication for primary storage. They achieve different objectives, and customers must understand the difference. NetApp's deduplication (formerly known as A-SIS), for example, is meant for primary storage. It slows the growth of storage capacity consumption, but at the expense of additional space for post-processing. The same goes for EMC Celerra's File Level Deduplication.

However, when considering EMC Data Domain, the objective is likely backup efficiency, so that the success rate of backups and the recovery rate of restores improve. Backup and restore exist for service continuity: data can be recovered so that the business and its operations continue to function.

Therefore, consider carefully. Here are 5 main areas to consider.

1. Local dedupe or global dedupe
2. Hashing differential or delta differential
3. Reverse referencing or forward referencing
4. Post processing or Inline processing
5. Source dedupe or Target dedupe
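
As a rough illustration of the hashing approach in item 2, here is a minimal Python sketch. It is not any vendor's implementation: it assumes fixed-size 4 KB chunks and SHA-256 fingerprints, and the function name dedupe_ratio is made up for this example. It stores each unique chunk once and reports logical size divided by stored size.

```python
import hashlib


def dedupe_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return logical size / stored size for simple hash-based dedupe.

    Assumptions (illustrative only): fixed-size chunks, SHA-256 as the
    fingerprint, and an in-memory set as the chunk index.
    """
    if not data:
        return 1.0  # nothing to store, so treat as no reduction

    seen = set()   # fingerprints of chunks already stored
    stored = 0     # bytes actually kept after dedupe

    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            stored += len(chunk)

    return len(data) / stored


# Ten identical chunks collapse to one stored chunk.
repetitive = b"A" * 4096 * 10
# Two different chunks cannot be reduced at all.
unique = b"A" * 4096 + b"B" * 4096

print(dedupe_ratio(repetitive))
print(dedupe_ratio(unique))
```

The same code gives very different ratios for the two inputs, which is why a ratio measured on one data set says little about another.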

Don't be fooled by that large dedupe ratio. Some vendors are cautious and mention a 20:1 ratio. Some even tout 500:1. The number is just a number. Ultimately, here's Real Data Matrix's advice to you ...

1. Understand the ROI to achieve
2. Work from a set of requirements, a realistic view of what can be achieved, and a set timeline
3. Evaluate properly and test thoroughly until satisfied
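
When working out the ROI in item 1, it can help to translate a quoted dedupe ratio into the fraction of capacity actually saved. The short Python sketch below does that arithmetic, using saved = 1 - 1/ratio. The function name space_saved_percent is made up for this example, and the ratios are simply the ones mentioned above plus two smaller ones for comparison.

```python
def space_saved_percent(ratio: float) -> float:
    """Convert a dedupe ratio (e.g. 20 for 20:1) to percent of space saved."""
    if ratio < 1:
        raise ValueError("a dedupe ratio below 1:1 is not meaningful here")
    return (1 - 1 / ratio) * 100


# Illustrative ratios only; real results depend on the data being stored.
for ratio in (2, 10, 20, 500):
    print(f"{ratio}:1 -> {space_saved_percent(ratio):.1f}% saved")
```

Going from 20:1 to 500:1 moves the saving from 95% to about 99.8%, so the headline ratio grows far faster than the capacity you actually get back.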

At Real Data Matrix, we work closely with our vendors to bring the best to our customers. We understand technologies such as NetApp Deduplication, EMC Avamar and Data Domain, SilverPeak for WAN level deduplication and even content-aware deduplication from Ocarina Networks.

In short, don't be caught by the fancy talk. Understand and test well.
