Planning for scale with enterprise NoSQL

All applications have inherent limits in what they can support. As organizations grow, so do their expectations of the apps they use. Development and operations teams need to plan for large scale from day one rather than react to it when it finally arrives.

When you hit the limitations of your system, you’ll need to determine how best to scale. This can mean changing the cluster hardware configuration, patching or replacing the database software, redesigning schemas, migrating data to other services, or a host of other tasks that often exacerbate issues in the short term or prolong downtime. The process can be costly and error-prone as organizations learn what pitfalls their database management system (DBMS) vendor hasn’t told them about or hasn’t yet discovered.

The limitations of the traditional, relational approach to the DBMS are becoming more exposed as enterprise cloud services become more ingrained in daily business. NoSQL databases offer a solution to large scale but can be complex to implement. So what do you need to know?

Design tradeoffs
Traditional relational databases are designed for consistency. A raft of NoSQL databases target different use cases and often emphasise availability of the data instead. The choice is between using expensive hardware that is powerful enough to provide database consistency at large scale (‘vertical scaling’) and using newer database software that supports availability across groups of commodity servers (‘horizontal scaling’).

Popular relational databases like MySQL scale well vertically but are complex to scale horizontally. As demand grows for enterprise cloud applications, there are good reasons for using NoSQL databases that are designed to scale horizontally:

  1. Size of data — if your data set doesn't fit on one machine, you need two
  2. Concurrency — if a server can handle 10,000 req/sec and you get 20,000, you need two machines

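The two rules of thumb above can be turned into a rough sizing calculation. This is a minimal sketch, not a capacity-planning method: the function name and the storage figures are hypothetical, the request rates are the ones quoted in the list, and replication and headroom are deliberately ignored.

```python
import math


def machines_needed(data_gb, capacity_gb_per_node, peak_rps, rps_per_node):
    """Smallest node count that covers both data size and concurrency.

    Illustrative only: ignores replication, headroom and uneven load.
    """
    for_storage = math.ceil(data_gb / capacity_gb_per_node)
    for_throughput = math.ceil(peak_rps / rps_per_node)
    return max(1, for_storage, for_throughput)


# The data fits on one machine, but 20,000 req/sec against a
# 10,000 req/sec server still forces a second machine.
print(machines_needed(data_gb=500, capacity_gb_per_node=1000,
                      peak_rps=20_000, rps_per_node=10_000))  # prints 2
```

Whichever constraint demands more machines wins, which is why a small but heavily used data set can still push a database onto several servers.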

When a database needs more than one machine to handle demand, you run into the CAP theorem. Posed by Eric Brewer, the CAP theorem states that a distributed database system can guarantee at most two of three properties at once: consistency, high availability, and partition tolerance.

While horizontally scaled databases can scale effectively and easily with cloud applications, a multi-machine system introduces the problem of partitioning events: nodes will fail and networks will get cut off. Because partitions cannot be ruled out, the practical choice is between a system that stays available during a partition (AP) and one that stays consistent (CP). With CP, all nodes must agree before data is written. This means availability is hampered during a network partition, because no one is able to write. With AP, consistency is sacrificed to allow the storage layer to remain available and usable to the application. The drawback is that different clients may see different data during periods when nodes haven’t all received the latest data.
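
The difference between the CP and AP behaviours can be seen in a toy simulation. This is only a sketch under strong assumptions: `ToyStore`, `Replica` and their methods are hypothetical names, the "network" is a per-replica flag, and healing simply copies the newest version everywhere. It is not how any particular database implements replication.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0
        self.reachable = True


class ToyStore:
    """Toy replicated store (hypothetical, for illustration only).

    'CP' mode rejects writes unless every replica is reachable.
    'AP' mode accepts writes on whichever replicas are reachable.
    """

    def __init__(self, mode, names):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {name: Replica(name) for name in names}
        self.clock = 0  # simple global version counter

    def set_reachable(self, name, reachable):
        self.replicas[name].reachable = reachable

    def write(self, value):
        up = [r for r in self.replicas.values() if r.reachable]
        if not up or (self.mode == "CP" and len(up) < len(self.replicas)):
            return False  # write refused: availability is lost
        self.clock += 1
        for r in up:
            r.value, r.version = value, self.clock
        return True

    def read(self, name):
        # Models a client that can only talk to this one replica.
        return self.replicas[name].value

    def heal(self):
        # Simplification: the newest version overwrites all replicas.
        newest = max(self.replicas.values(), key=lambda r: r.version)
        for r in self.replicas.values():
            r.reachable = True
            r.value, r.version = newest.value, newest.version


for mode in ("CP", "AP"):
    store = ToyStore(mode, ["a", "b", "c"])
    store.write("v1")
    store.set_reachable("c", False)   # partition: replica c is cut off
    accepted = store.write("v2")
    print(mode, accepted, store.read("a"), store.read("c"))
    store.heal()
    print(mode, "after heal:", store.read("c"))
```

In CP mode the second write is refused and every client still reads `v1`. In AP mode the write is accepted, so a client attached to replica `c` reads the stale `v1` until the partition heals and the replicas converge on `v2`.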

Plan for success
Worrying about scaling issues only when they hit is no longer appropriate, especially in a world where demand can no longer be controlled (e.g. the ‘app store effect’). Users hit servers immediately, with no limits on when and how they use apps. In fact, the only way to assert control in the app store era is to lock down an application. You either make users wait until servers can handle more load or pull the application entirely. Such draconian measures hardly qualify as planning for success.

Large, heavily used systems bring with them a high probability that a portion of the system will fail. A database engineered around this assumption, one that prioritizes availability and eventual consistency, is better suited to keeping your application online.

ATMs are a great example. Inconsistent banking data is why it’s still possible to overdraw an account without realising it. It is unrealistic to present a consistent view of your account balance throughout the entire banking system if every node in the network needs to halt and record this figure before continuing operations. It’s better to make the system highly available.

Enterprise collaboration software poses a new problem in the age of mobile data. When mobile devices lack network access and go offline, there are essentially two disconnected systems on which users are updating data. Allowing users to change data on their phones or tablets while offline is an important feature, but syncing these updates for thousands of users when the devices come back online is a huge problem.
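
One common way to make the sync problem tractable is to have each device remember which revision of a document it last saw, and to flag any offline edit whose base revision is out of date. The sketch below illustrates that idea only; `Doc`, `OfflineEdit` and `sync` are hypothetical names, and this is a generic revision check rather than the replication protocol of CouchDB, Cloudant or any other product.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    value: str
    rev: int = 0


@dataclass
class OfflineEdit:
    doc_id: str
    base_rev: int    # server revision the device last saw before going offline
    new_value: str


def sync(server, edits):
    """Apply offline edits in order and return the ones that conflict.

    An edit is accepted only if nobody else changed the document since
    the device last saw it; otherwise it is reported, not silently lost.
    """
    conflicts = []
    for edit in edits:
        doc = server[edit.doc_id]
        if doc.rev == edit.base_rev:
            doc.value = edit.new_value
            doc.rev += 1
        else:
            conflicts.append(edit)
    return conflicts


server = {"notes": Doc("draft", rev=1)}
phone = OfflineEdit("notes", base_rev=1, new_value="draft + phone edit")
tablet = OfflineEdit("notes", base_rev=1, new_value="draft + tablet edit")

conflicts = sync(server, [phone, tablet])
print(server["notes"])   # the phone edit was applied, revision is now 2
print(conflicts)         # the tablet edit needs merging or user review
```

What happens to a flagged edit (automatic merge, keeping both versions, or asking the user) is an application decision that this sketch leaves open.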

Deal with NoSQL complexity now
The view that we take with Cloudant’s database-as-a-service, along with CouchDB and other NoSQL databases, is that it’s better to expose developers to the complexities of large scale early in the design process. Address scaling issues head on, so that you don’t have to solve them at 3am or right before that important demo.

Building these systems is challenging, which is why there are plenty of vendors who offer consulting and support for managing distributed NoSQL databases, and others who will host and manage these systems for you. Whether you decide to have help or build it yourself, when you’re launching a new cloud application for your business, keep application availability at large scale in mind. The enterprise version of the app store effect is coming.

Related images

  • Cloudant's Simon Metson
