The list of providers offering Apache Hadoop on top of public cloud infrastructure services has grown rapidly over the past few months. The most recent addition was the announcement by Cloudera of partnerships with a handful of IaaS providers to offer its Hadoop suite to their customers.
The partners that the Palo Alto, California-based software company announced in late October included Verizon, Savvis (a CenturyLink subsidiary), IBM's SoftLayer and T-Systems (a Deutsche Telekom company). Verizon made its own announcement of the offering this week.
Michael Olson, Cloudera CEO, said what makes Cloudera's Hadoop suite stand out among competitors is its set of features crucial for enterprises. Those competitors include heavyweights like Amazon's Elastic MapReduce, Microsoft's HDInsight and MapR's Hadoop distribution on top of Google's Compute Engine, among others.
Classic Hadoop distribution vendors, Olson said, generally offer the kind of infrastructure that companies like Yahoo! and Facebook rolled out and ran in 2008. “It's got scale-out storage, it's batch MapReduce.” There may be some components that are a couple of years old, such as HBase, the open source non-relational database built as part of the Hadoop project.
Cloudera's pitch is built around features like enterprise security, access control, logging, compliance and policy-based governance, which competitors do not offer, Olson said. It also offers real-time analysis tools.
Cloudera Enterprise is a software package that includes CDH, the company's own Hadoop distribution, and tools for managing server clusters and the data they store and process. All features will be available as a service from Verizon once the company rolls it out.
The provider announced an entirely new infrastructure to support its public cloud offerings only in October and expects to launch the cloud services in public beta before the end of the year.
As with Cloudera's other provider partners, customers will generally interface with Verizon to buy and provision Hadoop software. The providers also decide how they price the services and bill their clients, Olson said.
“We provide back-line support services and we'll deal directly with customers when cases demand that,” he said.
Cloudera's enterprise angle for its cloud offering is evident from the list of providers it has chosen to partner with. All of them are strong players in the data center infrastructure services market for enterprise clients.
As a technology, Hadoop lends itself well to being sold to the partners' existing customers, since its main appeal is the ability to work with data where it is stored, as opposed to moving it elsewhere for processing. Verizon already hosts large chunks of IT infrastructure for many customers, and it will make more sense for those clients to use the same provider's cloud Hadoop services than to go to a different provider or stand up their own Hadoop clusters, Olson said.