At the heart of the modern business is data. While its storage has always played an integral role in business operations, in the past it was seldom considered critical to success. This is because data was not yet recognised as the highly valuable asset, powerful enough to determine a company’s fortunes, that it is today.
However, we are currently in the middle of a digital revolution which is redefining the way in which businesses and individuals conduct themselves and interact with each other. Big data sets, now considered treasure-chests of information and insight, can give companies a competitive edge.
By 2020, the volume of machine-generated data will be 15 times greater than that created by people. But common IT best practices for implementing, operating, and managing terascale and petascale storage have already become unsustainable when capacities move into the hundreds of PBs.
When the underlying media no longer keep pace with data storage growth you end up with a “blivet,” an impossibility, or 10 pounds of stuff in a five-pound bag. The subsequent fallout creates storage problems in terms of scalability, data resilience, a constant need for data center refreshes and, as a result, an excessive total cost of ownership. So how has this impacted technological advancements in the data center?
Given the momentous changes in the way in which businesses operate and the unstoppable creation of extremely large volumes of data, it is little surprise that traditional file systems, storage-area network (SAN), network-attached storage (NAS), and unified-storage systems are inadequate to cope with the data storage needs of enterprises. Data center technology needs to keep pace with the new reality in which data growth is exponential. The answer? Object storage. But before we go too far, let’s take a look at what the data storage landscape looked like up until just a few years ago to understand how and why object storage has so quickly become so hot.
There are two variations of traditional storage. First, there is block storage, which manages data as blocks within sectors and tracks. Second, there is file storage, which manages files organised into hierarchical file systems. Block storage is used by SANs, where a disk array is connected to servers via SCSI, iSCSI, or Fibre Channel networks, while file storage provides standard network file-sharing protocols to exchange file content between systems.
These legacy technologies had a handful of negative aspects but were, on the whole, capable of servicing businesses’ data storage requirements. This was until massive scale-up was required to cope with exponential data growth: according to IDC, between 2015 and 2019 storage hardware used for Big Data deployments will see a CAGR of 29.9 percent, reaching 73.41 EB in 2019. As a whole, storage hardware, software, and services for Big Data will grow at a CAGR of 24.6 percent.
Enter object storage
Let’s take a look at what object storage is and how it has helped budding digital businesses transform into global online giants. While many struggled with exascale storage, enterprises such as Amazon, Google, Facebook and Yahoo! found innovative technologies to deal with the challenges related to massive storage growth — data resilience, data durability, infrastructure management, power, cooling and total cost of ownership, to name a few — and they have flourished.
The biggest hurdle, though, was scalability, and these early adopters of object storage were quick to recognise that the advantages it offers have, in theory, no limits: to date, stored data has been scaled to beyond 10 exabytes. Difficult to ignore for this reason among others, object storage quickly worked its way into the mainstream at the enterprise level. So how exactly does it help to overcome exascale challenges?
Object storage is based on a shared-nothing architecture: a distributed-nodal computing architecture where none of the nodes share system resources including CPU, memory, or physical storage media. There is no single point of failure in shared-nothing architectures, nor system contention. Scalability is near linear in both capacity and performance. Hardware includes commodity low-cost servers, HDDs, SSDs, NICs, etc.
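To make the shared-nothing idea concrete, here is a minimal sketch of how an object store might place objects across independent nodes using consistent hashing. This is an illustrative assumption about the general technique, not Scality's or any specific vendor's implementation; the class and node names are hypothetical.

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy consistent-hash ring: any node can compute an object's owner
    locally, with no shared state or central coordinator."""

    def __init__(self, nodes, vnodes=100):
        # Virtual nodes smooth the key distribution across physical nodes.
        self.ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, object_key):
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect_right(self.keys, self._hash(object_key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("photo-0001.jpg")
```

Because placement is a pure function of the key, adding a node only remaps the keys that fall on its new ring positions, which is one way such systems achieve near-linear scaling.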
Object storage is a variation of software-defined storage (SDS) in that it converts low-cost server hardware into a highly scalable, resilient storage system. In fact, some object storage software that utilises erasure coding allows even lower-cost desktop HDDs — with a much lower mean time between failures and a higher bit-error rate — to be utilised with no loss of performance or scalability.
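The intuition behind erasure coding can be shown with a deliberately simplified single-parity scheme: split the data into k fragments, store an XOR parity fragment alongside them, and rebuild any one lost fragment from the rest. Real object stores typically use Reed-Solomon codes that tolerate multiple simultaneous losses; this toy version, with hypothetical helper names, only survives one.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data, k):
    """Split data into k equal fragments plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    frags = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags, parity

def recover(frags, parity, lost_index):
    """XOR of the parity with the surviving fragments rebuilds the lost one."""
    survivors = [f for i, f in enumerate(frags) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)

frags, parity = encode(b"object storage payload!!", k=4)
rebuilt = recover(frags, parity, lost_index=2)
```

This is why less reliable drives become usable: durability comes from the code's redundancy spread across many devices, not from any single disk's reliability.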
Today there are four different object storage options to choose from: two are commercial and two are open source. In general, all four use shared-nothing architectures. They’re designed to be:
- Completely distributed with no single point of failure
- Scalable to exascale levels
- Automatically self-healing
- For the most part self-managing with little admin intervention
- Able to run on commodity server hardware with embedded storage media.
Object storage is designed to be massively scalable and is therefore fundamentally different from traditional block or file storage systems: it organises information into containers of flexible sizes, referred to as objects. Each object includes the data itself as well as its associated metadata and has a globally unique identifier instead of a file name and a file path. These unique identifiers are arranged in a flat address space, which removes the complexity and scalability challenges of a hierarchical file system based on complex file paths.
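The structure described above — data plus metadata behind a globally unique identifier in a single flat namespace — can be sketched as a minimal in-memory store. The class name, method names and sample metadata below are hypothetical, chosen only to illustrate the model.

```python
import uuid

class ObjectStore:
    """Toy flat-namespace object store: one key space maps a globally
    unique ID to (data, metadata), with no directory hierarchy."""

    def __init__(self):
        self._objects = {}  # flat address space: object_id -> (data, metadata)

    def put(self, data, metadata=None):
        object_id = str(uuid.uuid4())  # globally unique identifier, not a path
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

store = ObjectStore()
oid = store.put(b"\x89PNG...", {"content-type": "image/png", "owner": "alice"})
data, meta = store.get(oid)
```

Note that retrieval is a single flat lookup by ID: there is no path to resolve component by component, which is precisely what removes the hierarchical file system's scalability bottleneck.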
In such a data-hungry environment, in which every aspect of our lives is data-driven and all data — past, present, and future — is a potential gold mine of business and other insight, traditional limited-scale storage models are doomed. Instead, object storage is the technology best suited to provide the hyperscale and performance that enable IT professionals to analyse and store the vast volumes of unstructured data prevalent today.
Emilio Roman is vice president of sales for EMEA at Scality.