Solid-state storage, which today mostly means flash memory, has been absolutely transformational in certain enterprise applications. It can dramatically accelerate I/O-heavy transactional processes, and as a result has proven popular in areas such as financial services. It has even turned around virtual desktop projects that were failing because their disk storage could not cope with hundreds or thousands of users all trying to boot their remote desktops at 9am.
 
Those tend to be tactical installations to solve a particular problem, though. “Even in large companies, flash is mainly used for accelerating specific applications and point solutions, rather than a grand end-to-end data center adoption,” admits Jay Prassl, VP of marketing at SolidFire, a developer of solid-state arrays (SSAs). He says that the real change will come when solid-state storage is there for every data center application that can make use of it. So why isn’t this happening – or has it already begun to happen, and we just didn’t notice?

Part of the problem has been that, although their vendors might have described them as enterprise-grade, SSAs until recently lacked many of the data management tools that data center users expect – features such as de-duplication, compression, thin provisioning, snapshots, replication, and support for multi-tenanted applications and role-based user access.

That has changed very rapidly, however, as SSA vendors add these capabilities and as data management software companies optimise their software for flash.

Certainly, a recent report from market research company Gartner noted “a keen interest to harness [SSAs] for multiple workloads, given the maturing data services.” Gartner’s analysts added that while users needing high performance and low latency storage for tasks such as online transaction processing (OLTP), analytics and virtual desktop infrastructure (VDI) had been able to tolerate the feature shortcomings, those with multiple application workloads were only now finding SSAs acceptable.

DISK OR FLASH?
The next blocker that is being addressed is price: although flash and other non-volatile solid-state memory technologies are still more expensive than spinning disk on a per-gigabyte basis, that is changing. It is not just that the production cost of flash is declining—the cost of hard disk capacity is falling too as developers invent better magnetic coatings and ever finer read-write technologies—it is also that SSA developers have realised that flash has properties that allow them to change that equation.

In particular, flash’s speed allows them to layer on data reduction techniques such as de-duplication and compression, explains Vaughn Stewart, the chief technical evangelist at Pure Storage. These are technologies which incur significant overhead, and in the disk world are therefore usually restricted to tier-two archival storage, or are applied offline, once the data is stored and at rest.

“The emergence of SSAs was around making flash scalable and highly-available, for example the likes of Texas Memory Systems [now part of IBM] and Violin Memory, but it was also price-premium storage systems aligned to transaction processing and so on,” Stewart says. “What we’ve seen since is vendors understanding that they can invert that price premium by including data reduction technologies to make flash affordable. The result is that applications perform better and end users are happier.”

He adds, “Different datasets respond differently to different data reduction technologies – de-dupe is great for files and pictures but not databases for instance, while compression works well on databases but not on images or operating system binaries. That means the more technologies a platform has and the more granular they are, the more it can provide.

“The question then becomes, if you can get to price equality, would you rather have disk or flash? I think that in two years time we will see all-flash in the tier two capacity storage market too – the gains there are it’s a tenth the power and a tenth of the footprint, so that’s more storage per rack or floor tile.”
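
A quick way to see why the gains differ by dataset: the short sketch below compresses highly repetitive, database-style records and then the same volume of random bytes, a stand-in for images and already-compressed files. The zlib library here is only a placeholder for an array’s inline compression engine, and the data is synthetic.

```python
import os
import zlib

# Repetitive, structured records (standing in for database rows) compress well...
structured = b"order_id=12345,status=shipped,total=99.90;" * 1000
# ...while random bytes (standing in for images or already-compressed files) barely shrink.
random_like = os.urandom(len(structured))

for label, data in [("structured", structured), ("random-like", random_like)]:
    ratio = len(data) / len(zlib.compress(data))
    print(f"{label}: roughly {ratio:.1f}:1")
```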

Of course, it is not quite as simple as that, because the benefits of data reduction vary considerably depending on the data. In some rare cases, for example where you are backing up virtual machines that are largely identical, savings as high as 50:1 have been reported, but 2:1 or 3:1 is more likely. This is going to make SSAs a rather more nuanced decision for data center managers, because the usable capacity, and therefore the return on investment, now depends more on your data profile than on the supplier’s specification.
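
As a rough feel for that arithmetic, the sketch below turns an assumed raw price per gigabyte and an assumed reduction ratio into an effective cost per usable gigabyte. Every price and ratio in it is an illustrative placeholder, not a vendor figure.

```python
# Rough illustration of effective cost per usable gigabyte once data reduction
# is applied. All prices and reduction ratios below are assumptions, not vendor figures.

def cost_per_usable_gb(raw_cost_per_gb, reduction_ratio):
    """Effective $/GB after de-duplication and compression shrink the stored data."""
    return raw_cost_per_gb / reduction_ratio

disk_raw = 0.05   # assumed $/GB for capacity disk, with little or no inline reduction
flash_raw = 0.50  # assumed $/GB for raw flash

workloads = [("near-identical VM backups", 50),
             ("typical mixed workload", 3),
             ("poorly reducible data", 1.2)]
for label, ratio in workloads:
    flash_usable = cost_per_usable_gb(flash_raw, ratio)
    print(f"{label}: flash ~${flash_usable:.3f}/usable GB vs disk ~${disk_raw:.3f}/usable GB")
```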

THE HYBRID ROUTE
And in any case, SSAs are not the only way for the data center manager to leverage the superior performance of flash. “Three forms of flash have gone in [to enterprises],” says Suresh Vasudevan, the CEO of Nimble Storage, another SSA specialist. “The first is flash in servers, for absolute performance. That’s fine as long as the data also exists elsewhere, in case of a server crash.

“Second is the flash-only array for those applications that have such performance needs that a 10 or 15-times price premium is worth paying. And third is the hybrid, they’re going after mainstream applications. In fact, every storage array now uses flash somewhere, and all storage vendors now include flash at some level.”

That hybrid route could well become the biggest one. It means combining flash with other technologies, typically spinning disk, whether it is performance-optimised tier one or capacity-optimised tier two disk. All the major array suppliers can include a tier of flash now, while many also leverage server-side flash within their subsystem controllers.

As an example, 70 percent of NetApp’s FAS filer shipments now include some proportion of flash, says NetApp solution marketing manager Laurence James. “We have shipped over 100PB of flash so far, a mixture of solid-state disk and server-side cache,” he adds. James notes that established suppliers such as NetApp were well placed to sell flash into the enterprise because they already had the necessary data management tools in place.

The specialist SSA developers are sceptical, of course, pointing out that treating flash as if it were spinning disk will not take best advantage of it. James acknowledges that SSAs may well outperform a traditional array that has been upgraded with flash, such as a NetApp FAS. He argues, though, that as we move away from point solutions into the wider data center, the traditional array’s mature data management capabilities, and the fact that it consolidates all your storage needs instead of adding extra devices to manage, will come to the fore.

“The requirements for data management do not change,” he says. “You have to look at what performance the customer actually needs – do you really need a million IOPS? FAS may not have the top-end performance, but it may be a better fit for the customer over time. Plus, we don’t want to create new islands for our customers.”

The other big problem, according to anyone with experience of selling flash into the enterprise, is its limited lifespan. Quite simply, each flash cell gradually wears out and becomes less reliable as it is erased and rewritten. This is because erasing flash is relatively hard and slow – it requires a higher voltage, which damages the cell ever so slightly, and it must be done a block at a time, which means any live data still in the block has to be moved elsewhere first.

THE POWER OF MEMORY
On the one hand, endurance is continually improving as the technology matures, but on the other it is getting worse as chipmakers move to finer silicon processes and denser cells to pack more bits into the same space. The original SLC (single-level cell) flash stores one bit per cell and is good for at least 100,000 rewrites, while cheaper MLC (multi-level cell) flash stores two bits per cell but can only stand perhaps 5,000 to 10,000 rewrites. EMLC (enterprise MLC) sits between those two on both cost and endurance, and then there is the new TLC (triple-level cell) technology, which stores three bits per cell but guarantees just 1,000 rewrites.

Is all this really a problem, though? That endurance figure is only the point at which problems might start, and even then a lot can be done with software error correction. In addition, designers typically allocate a proportion of the available flash as spare blocks to replace ones that go bad, and tripling the capacity per chip gives them plenty of scope to do that. Indeed, many of today’s low-cost SSAs are built on the fact that memory controllers are now so powerful that it is cheaper to error-correct than to use higher-grade flash. The downside is that more error correction means more work for the controller, which adds latency.

The controller will also do wear-levelling, intelligently spreading the writes and erasures across the memory to avoid hotspots that might cause unrecoverable failures. This means that the risk of encountering endurance problems will also depend on your usage. With the right management, it could take years or even decades to exceed the wear limit across an entire solid-state drive. Plus, if your usage is so intense that you cannot trust flash, there are probably better solid-state options to consider such as battery-backed DRAM.
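
To put those endurance numbers in context, here is a back-of-envelope estimate of how long a drive might last under a steady write load, assuming perfect wear-levelling. The capacity, cycle rating, over-provisioning and write-amplification values are assumptions chosen purely for illustration.

```python
# Back-of-envelope wear estimate for a flash drive. All inputs are illustrative
# assumptions, not measurements from any particular product.

def years_to_wear_out(capacity_gb, pe_cycles, daily_writes_gb,
                      write_amplification=2.0, overprovisioning=0.28):
    """Years until the rated program/erase cycles are exhausted, assuming the
    controller wear-levels perfectly across every cell."""
    physical_flash_gb = capacity_gb * (1 + overprovisioning)   # spare blocks add headroom
    lifetime_write_budget_gb = physical_flash_gb * pe_cycles
    flash_writes_per_day_gb = daily_writes_gb * write_amplification
    return lifetime_write_budget_gb / flash_writes_per_day_gb / 365

# Example: a 1TB MLC drive rated for 5,000 cycles, completely rewritten once a day
print(f"~{years_to_wear_out(1000, 5000, daily_writes_gb=1000):.0f} years")
```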

For many people working with flash, the real issue is the file system. Some have taken an open-source route and adopted derivatives of ZFS, the Sun-developed file system and logical volume manager, while others have opted to develop their own. For example, Vasudevan says Nimble’s founders believed flash could only be truly disruptive if it had a file system optimised for its strengths and weaknesses, such as write amplification: while flash is stronger than spinning disk at random reads, it can be much poorer at random writes.

“So we set up the company to build a flash-optimised file system, then use commodity hardware to package that as an appliance,” Vasudevan explains. He adds that new non-volatile memory technologies are on the way too, such as phase-change memory and magneto-resistive MRAM, so any new file system must also be open to those in the future.
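
To make the earlier write amplification point more concrete, the simplified model below shows what happens when a few pages of an erase block are rewritten in place: the block’s remaining live pages must be copied out before the erase, so the flash absorbs far more writes than the host issued. The page and block sizes are assumptions, and a log-structured, flash-aware layout of the kind Vasudevan describes exists largely to avoid this pattern.

```python
# Simplified model of write amplification from an in-place update: rewriting a
# few pages forces the drive to relocate the block's other live pages before it
# can erase. Page and block sizes are illustrative; real drives and file systems differ.

PAGE_KB = 16
PAGES_PER_BLOCK = 256  # one erase block = 4MB in this sketch

def write_amplification(pages_updated, live_pages_in_block):
    """Flash writes actually performed per host write for a single block update."""
    assert live_pages_in_block <= PAGES_PER_BLOCK
    host_write_kb = pages_updated * PAGE_KB
    # Live pages that were not updated must be copied out before the block is erased.
    relocated_kb = (live_pages_in_block - pages_updated) * PAGE_KB
    return (host_write_kb + relocated_kb) / host_write_kb

# Host rewrites 4 pages of a block that still holds 200 live pages:
print(f"write amplification ~{write_amplification(4, 200):.0f}x")  # roughly 50x
```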

Does that mean the hybrid arrays are bound to win out? SolidFire’s Prassl concedes that they will be better in some cases, but argues that it is more to do with the size of the organisation and its IT capabilities. He says that even when SSAs are islands in a larger disk-based sea, their performance outweighs the additional management overhead.

“I really believe those tiered storage systems that are strong in the mid-market, such as Nimble, that’s where all-flash will struggle, because the small and mid-size market won’t settle for two types of storage,” he says. “But in the large environment, tiered systems don’t make sense – it is more efficient and less of a management challenge to have two separate [capacity-optimised and performance-optimised] storage pools.”

And those islands tend to grow, argues Pure’s Stewart. “The majority of engagements may be tactical – the customer has an application performance problem and identifies flash as a way to fix it, for example,” he says. “But from there they very quickly understand that flash offers day-to-day benefits too, such as easier management, so it quickly becomes strategic.”