Flash technology will be superseded within a decade. There are many competing storage formats in development, each with their own characteristics, and Flash does not have a long runway, believes Barry Whyte of the SAN virtualisation architecture and development systems group at IBM.
IBM is developing “Racetrack” storage, where data is stored magnetically on silicon nanowires with retrieval governed by “massless motion”. In short, no moving parts and vastly reduced power needs, meaning the amount of energy needed to store, retrieve and back up your data will fall inexorably.
Racetrack is for the future. But ask about power and cooling efficiency of the storage boxes and you are quickly sidetracked down the traditional routes of data management with some spin-down capabilities.
EXPANDING NEEDS
There are savings to be had in migrating your data from Flash to Tape via HDD as quickly as the needs of the application, and therefore the business, allow. But dynamic provisioning is not ubiquitous.
Whyte says: “Policy-based placement that can migrate data automatically, abstracting to spare capacity – thus allowing you to power down disks…That is the next step.”
There already exists the capability to manage storage separately from the server in a form “where the server does not know or care where the data is stored”, he says.
This level of virtualisation is available to midrange storage systems to raise utilisation and improve management across hardware from different vendors, and it is delivering efficiencies. But in the world of storage, nothing stands still. And often, that is part of the problem. All those spinning disks are drawing power and, despite spin-down technologies, the policy-based allocation that would significantly cut storage power draw has yet to arrive.
If you develop a policy-based approach and then find the software that will do it for you, one day you will be able to dictate which media are holding your data. You will then be able to dynamically move your data to maximise disk utilisation and power down those disks that are not being used. A variation of this was once called Information Lifecycle Management – it was touted heavily but faded away. It might make a comeback under the guise of power management.
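The consolidation idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's placement engine: the `Disk` class, the `consolidate` function and the first-fit-decreasing packing rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical model of one disk and the volumes it currently holds."""
    name: str
    capacity_gb: float
    volumes: dict = field(default_factory=dict)  # volume name -> size in GB

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - sum(self.volumes.values())


def consolidate(disks):
    """Pack all volumes onto as few disks as this simple rule manages.

    Uses first-fit decreasing, a heuristic that is not guaranteed to find
    the best (or even a feasible) packing. Returns (placement, idle):
    placement maps volume -> disk name, and idle lists the disks left
    empty, which are the candidates for powering down.
    """
    # Gather every volume, largest first, so big volumes are placed early.
    volumes = sorted(
        ((vol, size) for d in disks for vol, size in d.volumes.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    # Work on empty copies of the disks, largest capacity first.
    targets = sorted(
        (Disk(d.name, d.capacity_gb) for d in disks),
        key=lambda d: d.capacity_gb,
        reverse=True,
    )
    placement = {}
    for vol, size in volumes:
        for disk in targets:
            if disk.free_gb >= size:
                disk.volumes[vol] = size
                placement[vol] = disk.name
                break
        else:
            raise ValueError(f"no disk has room for volume {vol!r}")
    idle = [d.name for d in targets if not d.volumes]
    return placement, idle


if __name__ == "__main__":
    disks = [
        Disk("disk-a", 1000, {"mail": 300}),
        Disk("disk-b", 1000, {"logs": 200}),
        Disk("disk-c", 1000, {"archive": 400}),
    ]
    placement, idle = consolidate(disks)
    print("placement:", placement)
    print("can be powered down:", idle)
```

A real policy engine would also weigh performance, redundancy and the cost of moving the data, none of which this sketch attempts.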
MORE TIERS
Dr Mike McCaig of Bull Information Systems encourages the implementation of a tiered storage design – keeping data on the right media for its required use, as an operational energy saving action. He says larger disk drives with slower operating speeds use less energy than small high-speed (high IO-rate) drives.
High-capacity, low-power SATA drives produce a carbon footprint that can be less than 10 per cent of high-speed enterprise class drives. Tapes and virtualised tape libraries are also among the most energy efficient means of providing storage for backup.
A review of the organisation’s data retention policy and information lifecycle management approach may reveal where data center online capacity could be replaced with offline storage. Similarly, reviewing backup policies and utilising deduplication technology will reduce the total work performed in backup systems and reduce the volume of data being stored at any time (see box below).
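A tiering review of this kind boils down to a rule that maps each data set to a medium. The sketch below shows one possible rule in Python. The one-year threshold, the tier names and the function names are assumptions made for illustration, not figures from McCaig or Bull.

```python
from collections import defaultdict

# Hypothetical threshold, chosen only for illustration.
ARCHIVE_DAYS = 365  # untouched for a year: candidate for offline tape


def choose_tier(days_since_access, needs_high_io):
    """Return the tier for one data set under the assumed policy."""
    if needs_high_io:
        return "high-speed enterprise disk"
    if days_since_access >= ARCHIVE_DAYS:
        return "tape"
    return "high-capacity SATA disk"


def capacity_by_tier(datasets):
    """Sum the capacity (TB) that the policy assigns to each tier.

    Each data set is a tuple: (name, size_tb, days_since_access, needs_high_io).
    """
    totals = defaultdict(float)
    for _name, size_tb, days, high_io in datasets:
        totals[choose_tier(days, high_io)] += size_tb
    return dict(totals)


if __name__ == "__main__":
    datasets = [
        ("orders-db", 2.0, 0, True),
        ("file-shares", 5.0, 12, False),
        ("2005-invoices", 3.0, 900, False),
    ]
    for tier, tb in capacity_by_tier(datasets).items():
        print(f"{tier}: {tb} TB")
```

The totals per tier give a first estimate of how much online capacity could move to slower, lower-power media.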

Dr Mike McCaig of Bull Information Systems
Estimates of power savings from the use of thin provisioning range widely (from 10 per cent to more than 75 per cent) and, like any other energy efficiency initiative, are determined by a number of operating factors. Assuming an energy consumption of 100W/TB, a data center operation running 10TB of storage and achieving a 40 per cent saving from thin provisioning would save 0.4kW (0.4kWh for every hour of operation), says McCaig.
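The arithmetic behind that example is simple enough to check in code. The short Python sketch below reproduces it. The function name and parameters are hypothetical, and the 100W/TB figure is the article's working assumption rather than a measured value.

```python
def thin_provisioning_saving(capacity_tb, watts_per_tb, saving_fraction, hours=1.0):
    """Return (saved_kw, saved_kwh) for a given capacity and saving rate.

    saved_kw is the reduction in continuous power draw; saved_kwh is the
    energy that reduction adds up to over the given number of hours.
    """
    baseline_kw = capacity_tb * watts_per_tb / 1000.0  # W -> kW
    saved_kw = baseline_kw * saving_fraction
    saved_kwh = saved_kw * hours
    return saved_kw, saved_kwh


if __name__ == "__main__":
    # The article's example: 10TB at 100W/TB with a 40 per cent saving,
    # evaluated over one year of continuous operation (8,760 hours).
    kw, kwh = thin_provisioning_saving(10, 100, 0.40, hours=24 * 365)
    print(f"power saved: {kw:.1f} kW")
    print(f"energy saved per year: {kwh:.0f} kWh")
```

The baseline draw is 10 x 100W = 1kW, so a 40 per cent saving is 0.4kW, which over a year comes to roughly 3,500kWh.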
With storage products improving in energy performance and capacity each year, simply retiring old technology is an obvious step towards reducing power consumption. Today’s latest drives deliver more than 20TB/kW – more than double what was available five years ago.
Great inroads have been made in bigger caches (for performance), deduplication and thin provisioning (to save capacity, see box page 43) and virtualisation (for management). The latest generation of disk subsystems has features such as policy-based spin-down and adaptive cooling designed into it in a bid to become more energy efficient.
For example, EMC built disk spin-down capabilities into its mid-range Clariion CX4, which automatically places inactive drives in sleep mode and activates them on demand. This policy-based feature was developed from its virtual tape systems for applications that have regular spans of time with little or no activity.
Though this functionality is not policy based, the system has management options.
EMC Clariion product marketing manager Rodan Zadeh said: “Spin down is mostly used in archiving and backup. You can identify which RAID groups you want to be power aware. With the CX4 we did extensive analysis on active power management and on adaptive cooling. We actively monitor the air flow to the array and, based on the temperature and the air flow, the array cooling fans will step up or run at lower rpm to cool the system.”
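The spin-down behaviour Zadeh describes, where only RAID groups marked as power aware are put to sleep when inactive and woken on demand, can be sketched generically in Python. This is not EMC's interface: the `RaidGroup` class, the function names and the 30-minute idle threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RaidGroup:
    """Hypothetical model of one RAID group and its activity state."""
    name: str
    power_aware: bool      # has the administrator opted this group in?
    idle_seconds: float    # time since the last I/O request
    spinning: bool = True


def apply_spin_down(groups, idle_threshold_s=1800):
    """Spin down opted-in groups that have been idle past the threshold.

    Returns the names of the groups put to sleep on this pass. The
    1800-second default is an assumed value, not a vendor setting.
    """
    spun_down = []
    for group in groups:
        if group.power_aware and group.spinning and group.idle_seconds >= idle_threshold_s:
            group.spinning = False
            spun_down.append(group.name)
    return spun_down


def on_io(group):
    """Wake a group on demand and reset its idle timer."""
    group.idle_seconds = 0
    group.spinning = True


if __name__ == "__main__":
    groups = [
        RaidGroup("backup-rg", power_aware=True, idle_seconds=7200),
        RaidGroup("archive-rg", power_aware=True, idle_seconds=600),
        RaidGroup("oltp-rg", power_aware=False, idle_seconds=7200),
    ]
    print("spun down:", apply_spin_down(groups))
    on_io(groups[0])
    print("backup-rg spinning again:", groups[0].spinning)
```

Keeping the opt-in flag separate from the idle timer reflects the point in the quote: the administrator decides which groups may sleep, and the array decides when.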
As for publishing kW per GB or TB, Zadeh says that mid-tier customers are aware of the balance needed between power efficiency and performance, and that EMC does not promote the Clariion purely on one or the other.
And all the major array manufacturers are pushing hybrid disk arrays. This marriage of HDD and Flash storage is being heavily promoted.
| THE BASICS OF DEDUPLICATION PRODUCTS |
|
AN OVERVIEW As the amount of data that organisations need to store grows, the industry is becoming increasingly concerned about the expanding costs of storage equipment and the associated costs of maintaining and managing that equipment. One way to reduce our storage footprint is by the implementation of deduplication. The Storage Networking Industry Association (whose members include EMC, HP, NetApp, Cisco, IBM and many other large vendors) defines deduplication as “replacement of multiple copies of data – at variable levels of granularity – with references to a shared copy in order to save storage space and/or bandwidth”. Many major storage vendors have put deduplication solutions on the market that differ in design, methodology and performance. Gideon Senderov reviewed the various types of deduplication at a recent industry event in San Francisco. He is a product management and technical marketing director at NEC Corporation of America.
DESIGN APPROACHES There are several approaches to deduplication-solution design and different vendors take different paths. One approach is using a component. It can be a card installed on a server that performs deduplication for blocks served into it. Another approach is gateway – a standalone device that performs deduplication but does not have any storage components. A dedupe solution can also be an appliance, which is simply a gateway with disks attached. It has both a deduplication engine and storage capacity. Another approach is a storage system capable of performing deduplication for particular data sets, file systems or volumes. There is also the grid-storage approach, which entails putting several controllers together to create a joint “deduplication pool”.
TARGET BASED VERSUS SOURCE BASED DEDUPE All the aforementioned designs (except the component approach) are target-based, where the deduplication process takes place at the target device. Another design approach – software – typically performs source-based deduplication, where local agents dedupe within the source before sending data to a repository. Neither of the two types (target-based or source-based) saves more space than the other.
SCOPE One major consideration in picking the right dedupe solution is the scope of deduplication a product provides. Several repository/controller configurations are available. There can be multiple repositories per controller: multiple servers are coming into a storage system where each of the volumes or file systems has its own repository. Some solutions provide one repository per controller. This is similar to the aforementioned appliance-type design approach: “Whatever application servers come to that particular controller or that appliance, they would all get deduplicated into a single repository,” Senderov explains. Finally, multiple controllers can share a single repository, where various servers point to various controllers within the grid (the grid approach mentioned above) with a single dedupe repository on the back end.
FIXED VERSUS VARIABLE Another factor to consider is whether a solution offers deduplication segments of fixed or variable granularity. Most vendors provide variable-size deduplication. Giving the engine the ability to automatically adjust the size of the chunks it cuts increases the chances of identifying similar chunks. Solutions can also provide inline or post-process deduplication. With inline deduplication, data is deduplicated before being written to disk. With post-process deduplication, it is deduplicated after being written. Whether deduplication is performed inline or post-process affects replication, capacity and scalability, among other impacts. |
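The SNIA definition quoted in the box, replacing multiple copies of data with references to a shared copy, can be illustrated with a minimal Python sketch. It uses the simpler fixed-size variant and a SHA-256 fingerprint per chunk. The function names, the 4,096-byte chunk size and the in-memory dictionary standing in for a repository are assumptions for the example, not a description of any product reviewed by Senderov.

```python
import hashlib


def dedupe(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one copy of each.

    Returns (store, refs): store maps a chunk fingerprint to the single
    shared copy of that chunk, and refs is the ordered list of
    fingerprints needed to rebuild the original data.
    """
    store = {}
    refs = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy seen
        refs.append(digest)
    return store, refs


def restore(store, refs) -> bytes:
    """Rebuild the original data by following the references in order."""
    return b"".join(store[digest] for digest in refs)


if __name__ == "__main__":
    # Four 4KB chunks, three of which are identical.
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, refs = dedupe(data)
    stored = sum(len(chunk) for chunk in store.values())
    print(f"original: {len(data)} bytes, stored: {stored} bytes")
    print("round trip ok:", restore(store, refs) == data)
```

A variable-size engine would choose chunk boundaries from the content itself rather than from fixed offsets, which is what lets it keep finding matches after data has shifted by a few bytes.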
SOLUTION IN VIRTUALISATION
As an example of energy saving, power company E.ON consolidated onto a Hitachi Data Systems platform to drive down energy costs. It opted for virtualisation to manage its enterprise infrastructure. It also planned to free up more capacity. “These improvements mean we can offer more flexible, responsive support that, in turn, means better customer service,” E.ON said. “Critically, we have also been able to create an economically superior, energy efficient storage environment, directly benefiting our bottom line.”
In storage there is never a shortage of developments.
In the future, we are told, we will all have dynamic infrastructure-based data centers where storage is virtual and physical location and media are not a concern (although we will be paying by kWh per TB).
But we are not in the future yet.