Best defence – a four-site data centre configuration
Part 4 of IBM Global Technology Services’ paper on data centre topology design

In the previous instalment of this five-part series on designing data centre topologies, the authors (engineers at IBM Global Technology Services) examined two- and three-site topologies. Part four explores four-site topologies and compares them with the two topology types explored last week.

Four-site topologies
Today, large organisations often have multiple data centres. This proliferation may be due to a highly decentralised IT delivery model (where decisions are left to individual business units) or to mergers and acquisitions. At some point, these organisations are likely to undertake a data centre consolidation program as part of their enterprise IT transformation strategy.

A data centre consolidation program not only reduces the number of data centres, but also provides an opportunity to improve enterprise resilience by designing a new data centre configuration. The four-site topology is currently being used by selected financial institutions. The European Central Bank, for example, has four data centres: an active-active pair in Italy and another in Germany.

The bank switches from one pair to the other on a regular basis. As previously described, an out-of-region data centre in a three-site active/active/standby configuration becomes a single point of failure following a simultaneous failure of the two in-region data centres. Even within a region, mission-critical business systems require a minimum of two data centres to provide the geographic dispersion needed to achieve continuous availability targets. The four-site topology is simply two in-region pairs, with sufficient geographical separation between the pairs to prevent a simultaneous failure of both (see Figure 6).

In a two-in-region pair configuration, production workload is placed in the primary pair (sites A and B), and non-production workload such as application development or disaster-recovery testing is placed in the secondary pair (sites C and D). Unidirectional asynchronous disk replication is used to replicate the production data from the primary to the secondary pair. The secondary pair is sized to support the entire production workload, which is switched to it following a failure of the primary pair.

Recovery at the secondary pair involves restarting the mission-critical business system and reconnecting end users. Because database updates are sent asynchronously to the out-of-region pair, there is a risk of some data loss. Switching to the out-of-region pair is more complex and usually takes longer than switching between in-region sites. For organisations with multiple mission-critical business systems that use the two-in-region pair configuration, recoverability is planned by distributing the mission-critical business systems across sites.

A temporary loss of a single site usually requires the recovery of approximately 25 percent of the mission-critical business systems. A single site loss leaves approximately 50 percent of mission-critical business systems vulnerable to a second site failure until the first site is restored.
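The 25 and 50 percent figures can be reproduced with a small sketch. It rests on assumptions that are not spelled out in the paper: systems are spread evenly (round-robin) across the four sites, each system has one home site, and each site's only in-region partner is the other member of its pair. The names `PAIRS`, `place_systems` and `impact_of_site_loss` are hypothetical and used only for illustration.

```python
from fractions import Fraction

# Assumed layout: two in-region pairs (A/B and C/D), as in the four-site topology.
PAIRS = {"A": "B", "B": "A", "C": "D", "D": "C"}


def place_systems(systems):
    """Assign each system a home site round-robin (an assumed placement policy)."""
    sites = sorted(PAIRS)
    return {name: sites[i % len(sites)] for i, name in enumerate(systems)}


def impact_of_site_loss(placement, lost_site):
    """Return (fraction needing recovery, fraction exposed to a second site failure)."""
    total = len(placement)
    partner = PAIRS[lost_site]
    # Systems homed on the lost site must be recovered on the in-region partner.
    to_recover = [s for s, home in placement.items() if home == lost_site]
    # Every system in the affected pair now depends on the one surviving site.
    exposed = [s for s, home in placement.items() if home in (lost_site, partner)]
    return Fraction(len(to_recover), total), Fraction(len(exposed), total)


systems = [f"system-{i}" for i in range(8)]
placement = place_systems(systems)
recover, exposed = impact_of_site_loss(placement, "A")
print(recover, exposed)  # 1/4 1/2
```

Under these assumptions the result is the same whichever site is lost, because every site hosts a quarter of the systems and every pair hosts half.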

Comparison of data centre topologies
A comparison of data centre topologies where the number of sites varies between two and four is shown in Table 1. The table covers three failure scenarios: single-site failure, regional disaster and cascading failure. In each case, it characterises the data centre configuration by two measures: recovery point objective (RPO) and recovery time objective (RTO). RPO entries reflect whether the disk replication is synchronous or asynchronous. With synchronous replication, the loss of data is minimal. With asynchronous replication, the loss corresponds to the length of the failover interval in minutes, during which time transactions are lost.

In a single-site failure, a significant event renders the site inoperable for at least six months. The event may be a natural hazard such as a fire, a tornado or a flood. All data centre configurations protect against this failure.

In a regional disaster, a significant event renders two in-region sites inoperable for at least six months. The event may be a natural hazard such as a hurricane. Two-site configurations using synchronous data replication are assumed to be too close together to protect against this failure. Because the two-site active/active shared-nothing configuration uses asynchronous data replication, the sites may be far enough apart to withstand this type of failure. Each of the remaining configurations uses asynchronous replication to an out-of-region site, which protects them against this failure.

In a cascading site failure, a significant event, following either a single-site failure or a regional disaster, renders one site inoperable for at least six months. No two- or three-site configuration is able to withstand this failure. Although the three-site all-active configuration may have a surviving site, its capacity can support only 50 percent of the production workload. The two-in-region pairs, however, have one remaining site able to support 100 percent of the production workload. Not surprisingly, the two-in-region pairs data-centre configuration (with the largest number of sites) is able to withstand the broadest scope of events.
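The capacity argument in the cascading-failure scenario can be illustrated with a rough sketch. The per-site figures below are assumptions inferred from the text, not taken from Table 1: each site in the three-site all-active configuration is assumed to carry 50 percent of the production workload, and each site in the two-in-region pairs configuration is assumed to be able to carry 100 percent. `CAPACITY` and `supported_workload` are hypothetical names.

```python
# Assumed per-site capacity, as a percentage of the production workload.
CAPACITY = {
    "three-site all-active": {"A": 50, "B": 50, "C": 50},
    "two-in-region pairs": {"A": 100, "B": 100, "C": 100, "D": 100},
}


def supported_workload(config, failed_sites):
    """Percentage of production workload the surviving sites can carry (capped at 100)."""
    surviving = sum(
        cap for site, cap in CAPACITY[config].items() if site not in failed_sites
    )
    return min(100, surviving)


# Cascading failure after a single-site failure: two of three sites are lost.
print(supported_workload("three-site all-active", {"A", "B"}))     # 50

# Cascading failure after a regional disaster: three of four sites are lost.
print(supported_workload("two-in-region pairs", {"A", "B", "C"}))  # 100
```

The sketch only models capacity. It says nothing about RPO or RTO, which depend on the replication mode and the recovery procedure.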

Not included in our comparison is the risk of a pandemic, a widespread event likely to occur in multiple waves over a six-month time frame. Up to 40 percent of the IT staff may be absent from work at any time. This event differs from the other risks considered in that its impact is on people, rather than IT equipment. Configurations operated by IT staff physically residing in multiple regions may offer better protection than those with the IT staff residing in a single region.

In the final part of the series (coming Wednesday), IBM engineers present field data collected from real-life applications of two-, three- and four-site data centre topologies.

Authors: Richard Cocchiara, Distinguished Engineer and the Chief Technology Officer for Business Continuity and Resiliency Services at IBM
Dr. Hugh Davis, Lead Architect in IBM’s Global Business Resilience Consulting Practice
Doug Kinnaird, Executive IT Architect in IT Strategy and Architecture Practice at IBM

Keywords: IBM, data centre, data centre topology, four-site topology

© DatacenterDynamics 2010