Real-life applications of several different data centre topologies
Part 5 of IBM Global Technology Services' paper on data centre topology design

In prior instalments of the series, IBM engineers presented a method for selecting a data centre topology and explored in detail three different topology choices.

In this final instalment, we present some preliminary results from applying the topology-selection method described in prior instalments. In the three cases described below, we highlight how our clients were able to meet their business-resilience goals. We selected these examples to illustrate the use of two-, three- and four-site topologies.

Field data

The data centre topologies considered in this paper can be evaluated only in the context of the risks being mitigated. Whereas some companies may balk at using a two-site topology, such a configuration may provide an acceptable solution to others. In working with our clients, we recommend and design data centre topologies, considering both the advantages and the deficiencies of the various solutions.

Ultimately, it is the client who must determine which of these solutions will address its availability requirements.

When the mission-critical Web-based banking application of a large Nordic bank had an outage, there was considerable pressure to determine what had caused it and what should be done to ensure that it would never happen again. Using the methodology presented in this paper, critical business processes were identified and the risks associated with these processes were evaluated.

After a careful analysis and prioritisation of potential risks, it was concluded that the two-site configuration in use could meet business resilience requirements, provided that several enhancements were made to improve failover capabilities.

Although this two-in-region data centre configuration might not have been suitable in other circumstances, a configuration with two data centres only 5 km apart proved to be adequate here because of the unique nature of Nordic geology, the region's political and social stability and the robustness of its civil infrastructure.

However, to ensure rapid recovery, it was deemed necessary to eliminate single points of failure in all areas, including organisation, technology, processes and facilities. Remediation plans included creating a separate business system, capable of supporting critical business applications in the event of a failure in the production or backup systems, and linking these systems using the IBM GDPS application availability solution. This solution is, in essence, a way of enhancing the availability of a mission-critical business system by introducing a third system into a two-in-region configuration.

Following the September 11, 2001, attacks, a large U.S. bank undertook a programme to reassess the risks to its data centre facilities and the centralised nature of its data centre topology. First, the risks associated with the client's IT operations were identified and prioritised. Once the financial impact of the risks associated with the close proximity of the primary and backup data centres had been assessed, the client fully understood the risk to the business and requested a data centre architecture redesign, using a distributed topology.

The resulting design was based on a three-site topology that included a data centre, a data bunker in close proximity (for rapid data replication) and a third data centre several hundred miles away (with capabilities for backup from the data bunker when necessary). The solution also included organisational dispersion across the two regions. The changes made to technology, processes, organisation and facilities enabled the client to achieve the desired risk mitigation, while improving overall systems availability and business resilience.

A study by a large European bank examined the regulatory requirements for an extremely low recovery time objective (RTO) and a similarly stringent recovery point objective (RPO), close to no loss of data, and also the increasing threat of regional disasters, both man-made and natural. An in-depth risk analysis was followed by an equally thorough examination of current technologies that support rapid recovery and minimal data loss (including in-flight transactions) in case of failure.

Given the increasing instability of the power grid (several major widespread outages in the past few years as the region moved to a power-trading model similar to the North American one), changing weather patterns (including widespread severe weather) and an increase in the likelihood of terrorist attacks, it was determined that business risks had to be addressed at the regional level.

A four-site data centre topology (two-in-region pair) was selected and, as a result, the entire IT infrastructure had to be upgraded and reliable data replication strategies had to be put into place. A four-site topology was also used for the official Web site IBM built for the 1998 Olympic Winter Games in Nagano, Japan.

On the operational front, the bank adopted a strategy in which the primary production role is reassigned quarterly from the active/active pair in one region to the pair in the other region. This is, in effect, a way of exercising the disaster-recovery capabilities of the system. The changes made to technology, processes and facilities have enabled the client to achieve the desired risk mitigation, while meeting strict regulatory requirements and improving overall business resilience.
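The quarterly reassignment described above amounts to a simple alternating schedule. The following Python sketch is an illustration only, not the bank's actual procedure: the region labels, the `start_year` reference point and the function name `primary_region` are all assumptions introduced here.

```python
# Hypothetical labels for the two regions, each hosting an active/active pair.
REGIONS = ("region-1", "region-2")


def primary_region(year, quarter, start_year=2010):
    """Return the region assumed to hold the primary production role.

    Assumption: region-1 is primary in Q1 of start_year, and the role
    moves to the other region at the start of every quarter.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    # Number of quarters elapsed since the assumed starting quarter.
    elapsed = (year - start_year) * 4 + (quarter - 1)
    return REGIONS[elapsed % len(REGIONS)]


if __name__ == "__main__":
    for q in (1, 2, 3, 4):
        print(2010, f"Q{q}", primary_region(2010, q))
```

Because the role moves on a fixed calendar, each region's pair carries production for two quarters a year, which is what makes the rotation a recurring exercise of the failover path rather than a one-off test.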

So what does all this mean?

As more business is conducted online, more business systems are classified as mission-critical. Typically, 58 percent of an organisation's business systems are classified as mission-critical, and these require additional investments in data centre facilities, information technology, business processes and IT staff.

Many IT organisations have legacy data centres that have reached their capacity limit. These organisations need a data centre strategy that addresses their capacity problems and provides support for future mission-critical business systems.

When expanding the number of data centre sites is not possible, the solution for some organisations may lie in taking advantage of their existing data centres. Mergers and acquisitions also present opportunities not only for IT consolidation but also for protecting mission-critical business systems against an expanded set of threats.

Data centre site selection and infrastructure architecture are important considerations when designing a mission-critical business system because all components contribute to the achievable uptime. Data centres are susceptible to threats such as electrical power loss, natural hazards (for example, hurricanes) and man-made hazards (for example, physical security breaches). The best data centre configuration places mission-critical business systems in the lowest possible threat profile.

Two geographically separated active data centres have been used to achieve continuous availability targets. Indeed, the most common data centre topology used today to support mission-critical business systems is based on two sites within 50 km of each other. However, their proximity leaves the mission-critical business system exposed to an in-region disaster that causes both sites to fail simultaneously. Therefore, a minimum of two active in-region data centres with an alternate site out of region should be used. This topology still leaves an enterprise with reduced availability during a regional event.

When considering a wider range of data centre availability solutions, the addition of two more sites (a second active/active pair) made a marked difference in overall availability. The two-in-region-pair configuration, which is simply two in-region data centre pairs with sufficient geographical separation to prevent a simultaneous failure of both pairs, was able to withstand the broadest range of threats.
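The comparison between the two-, three- and four-site topologies can be made concrete with a small model. The following Python sketch is an illustration only, not part of the IBM method: the site names, region labels and the three outcome categories (`outage`, `reduced`, `full`) are assumptions introduced here, and the rule that full availability requires a surviving active/active pair is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str      # hypothetical site label
    region: str    # hypothetical region label
    active: bool   # True for a production site, False for an alternate site


# Simplified versions of the topologies discussed in the article.
TOPOLOGIES = {
    "two-site": [
        Site("A1", "region-1", True),
        Site("A2", "region-1", True),
    ],
    "three-site": [
        Site("A1", "region-1", True),
        Site("A2", "region-1", True),
        Site("B1", "region-2", False),  # out-of-region alternate site
    ],
    "four-site": [
        Site("A1", "region-1", True),
        Site("A2", "region-1", True),
        Site("B1", "region-2", True),
        Site("B2", "region-2", True),
    ],
}


def after_regional_event(sites, lost_region):
    """Classify what is left when every site in lost_region fails at once."""
    remaining = [s for s in sites if s.region != lost_region]
    if not remaining:
        return "outage"
    # Assumption: full availability needs a surviving active/active pair.
    if sum(1 for s in remaining if s.active) >= 2:
        return "full"
    return "reduced"


if __name__ == "__main__":
    for name, sites in TOPOLOGIES.items():
        print(name, after_regional_event(sites, "region-1"))
```

Under these assumptions, losing the first region takes the two-site topology down entirely, leaves the three-site topology running on its alternate site alone, and leaves the four-site topology with a complete active pair.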

Keywords: IBM, data centre topology, disaster recovery

© DatacenterDynamics 2010