How effective is data deduplication?

The effectiveness of data deduplication depends on how much redundancy the data contains. For backup data, it can reduce the stored volume by up to 95%.

Data deduplication is very effective on virtual machine images. It is an efficient way to reduce storage demands in environments with large numbers of VM disk images. Deduplication of VM disk images can save 80% or more of the space required to store the operating system and application environment; it is particularly effective when disk images correspond to different versions of a single operating system “lineage,” such as Ubuntu or Fedora.
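To make the link between redundancy and savings concrete, here is a minimal Python sketch of fixed-size block deduplication. It is an illustration only, not how any particular product works: the function name `dedup_savings`, the 4096-byte block size, and the synthetic "images" are all assumptions made for the example. Real systems often use variable-size chunking and also keep metadata, which this sketch ignores.

```python
import hashlib


def dedup_savings(images, block_size=4096):
    """Return the fraction of bytes saved if each unique block is stored once.

    `images` is an iterable of bytes objects (hypothetical disk images).
    Blocks are compared by their SHA-256 digest.
    """
    unique_blocks = {}  # digest -> length of the block in bytes
    total_bytes = 0
    for data in images:
        total_bytes += len(data)
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique_blocks.setdefault(hashlib.sha256(block).digest(), len(block))
    if total_bytes == 0:
        return 0.0
    return 1 - sum(unique_blocks.values()) / total_bytes


# Synthetic example: five "VM images" share eight operating-system blocks,
# and each image adds one block of its own.
shared = b"".join(bytes([i]) * 4096 for i in range(8))
images = [shared + bytes([100 + i]) * 4096 for i in range(5)]
savings = dedup_savings(images)
print(f"space saved: {savings:.1%}")
```

In this synthetic case 45 blocks are written but only 13 are unique, so the more the images share, the closer the savings get to the figures quoted above.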


What are the best use cases for data deduplication?

The best use cases for deduplication are backup, archive, disaster recovery, and database dumps.


What is Software-Defined Storage (SDS)?

Software-defined storage (SDS) is an approach to data storage in which the programming that controls storage-related tasks is decoupled from the physical storage hardware.

Software-defined storage emphasizes storage services, such as deduplication or replication, rather than storage hardware. Without the constraints of a physical system, a storage resource can be used more efficiently, and its administration can be simplified through automated, policy-based management. For example, a storage administrator can provision storage by service level without having to think about hardware attributes. Storage can, in effect, become a shared pool that runs on commodity hardware.
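The idea of provisioning by service level can be sketched in a few lines of Python. Everything here is hypothetical and exists only for illustration: the `Policy` fields, the service-level names ("gold", "silver", "archive"), and the `provision` function do not correspond to any real SDS product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Storage services attached to a service level (illustrative fields)."""
    replicas: int
    deduplicate: bool
    tier: str


# Hypothetical service levels; an administrator picks one of these
# instead of choosing specific arrays or disks.
POLICIES = {
    "gold": Policy(replicas=3, deduplicate=False, tier="ssd"),
    "silver": Policy(replicas=2, deduplicate=True, tier="hdd"),
    "archive": Policy(replicas=2, deduplicate=True, tier="cold"),
}


def provision(name, size_gb, service_level):
    """Describe a volume request purely in terms of its service level."""
    if service_level not in POLICIES:
        raise ValueError(f"unknown service level: {service_level}")
    policy = POLICIES[service_level]
    return {
        "volume": name,
        "size_gb": size_gb,
        "replicas": policy.replicas,
        "deduplicate": policy.deduplicate,
        "tier": policy.tier,
    }


request = provision("nightly-backups", 500, "archive")
print(request)
```

The caller never names a device or vendor; the mapping from service level to hardware stays inside the software layer, which is the decoupling described above.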

Software-defined storage is part of a larger industry trend that includes software-defined networking (SDN) and software-defined data centers (SDDC). As is the case with SDN, software-defined storage enables flexible management at a much more granular level through programming. [1]


What is the real value of Software-Defined Storage?

At present, the majority of businesses use mixed, heterogeneous, multi-vendor storage environments that were built for an older generation of applications. If they are to run the applications they want without relying on public cloud storage, businesses need a simple, low-cost, scale-out alternative, and one that does not require building a new purpose-built array for each new application (unconscionable in terms of both cost and work hours).

This is where software-defined storage really proves its value: it delivers scale-out storage on whatever infrastructure the customer wants, including highly cost-effective commodity hardware. It allows businesses to build storage capabilities with the same economies and hyper-scale that were previously available only to the likes of Amazon and Google.

When it comes down to it, this is why software-defined storage is proving so popular: it enables enterprises to build a modern, hyper-scale storage infrastructure that leverages commodity platforms. Moreover, with support for industry-standard object storage APIs, it accelerates the development of modern applications. Businesses can build the applications they want and run them where they want (for example, in a private cloud if required), at a low cost and with a simple management interface. [2]