Datacenter operating system (DCOS) maker Mesosphere and open source Big Data firm MapR have launched a project that allows datacenters to improve their multi-tasking.

The Myriad project is a resource management framework that lets people run and manage Hadoop Big Data jobs alongside other applications and services in enterprise and cloud datacenters. It uses the Apache YARN model, which is the next version (2.0) of the MapReduce model that underlies Hadoop.

A team effort between developers at MapR, Mesosphere and eBay, Myriad is an open source system which aims to consolidate big data with other workloads in the datacenter, creating a single pool of resources that can be handled more efficiently. It is now available for download from GitHub.

Untangled YARN

The developers said they plan to submit Myriad as an Apache Incubator project with the Apache Software Foundation in the first quarter of 2015.

The new resource management framework is a solution to problems that have dogged Apache Hadoop developers, who have been forced to run big data jobs on dedicated clusters to stop them affecting other processes already running. Putting big data in a separate pool isolates the big data job from other applications and services in production, but the resulting underuse of servers wastes both CPU and power resources.

Myriad uses Apache YARN and Apache Mesos to allow big data workloads to run alongside streaming applications (like Storm), build systems, continuous integration tools (like Jenkins), HPC jobs (like MPI), Docker containers, as well as custom scripts and applications.

“Big data developers no longer have to choose between YARN and Mesos for managing clusters,” said Mesosphere CEO Florian Leibert.

Big data developers can now get the best of YARN’s power for Hadoop-driven workloads and benefit from Mesos’ ability to run any other kind of workload, he said.

Apache Mesos is a distributed systems kernel that abstracts computing resources like CPU, memory and storage so developers can program against the datacenter as a single pool of resources. It forms the core of Mesosphere’s Datacenter Operating System (DCOS).