In the heart of today’s tech landscape lies the data center, a hub of interconnected systems that facilitate the seamless flow of information. Central to this is the network architecture, continuously evolving to improve data processing and transmission. The broadcast “Evolving your network infrastructure for reliability” takes you inside the data center, revealing the technologies and methodologies transforming the digital world.

1. Inside the data center

At the core of every data center is its network architecture, which manages the vast amounts of data generated and processed there. Over the years, network designs have evolved to meet the needs of hyperscalers and the growing demand for efficiency and scalability.

One of the most popular configurations today is the spine and leaf design, with spine switches interconnecting with leaf switches to form a network fabric. This modular design allows for easy expansion by adding more servers and bandwidth, while also ensuring redundancy – a critical feature for maintaining uninterrupted operations.
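
To make that scaling property concrete, the sketch below estimates the capacity and oversubscription of a small two-tier fabric. All figures and names are illustrative, not taken from the broadcast:

```python
# Hypothetical spine-and-leaf sizing sketch: every leaf connects to every
# spine, so fabric bandwidth grows simply by adding switches.

def fabric_summary(spines: int, leaves: int, uplink_gbps: int,
                   server_ports_per_leaf: int, server_gbps: int) -> dict:
    """Estimate capacity and oversubscription for a two-tier fabric."""
    uplink_bw = spines * uplink_gbps                    # per-leaf uplink capacity
    downlink_bw = server_ports_per_leaf * server_gbps   # per-leaf server capacity
    return {
        "leaf_uplinks_gbps": uplink_bw,
        "leaf_downlinks_gbps": downlink_bw,
        "oversubscription": downlink_bw / uplink_bw,
        "fabric_links": spines * leaves,                # full mesh between tiers
    }

if __name__ == "__main__":
    # Example values only: 4 spines, 8 leaves, 400G uplinks, 32 x 100G servers.
    print(fabric_summary(spines=4, leaves=8, uplink_gbps=400,
                         server_ports_per_leaf=32, server_gbps=100))
```

Adding a leaf adds server capacity, while adding a spine adds fabric bandwidth and another redundant path between any two leaves, which is where the design gets its resilience.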

The need for speed

As demand for faster data processing and AI applications grows, the need for speed and efficiency in data centers has never been more critical. AI pods, which require substantial additional network infrastructure, have driven major changes in rack design.

The shift from traditional top-of-rack switches to middle-of-row architectures improves switch port utilization and reduces power consumption.

However, managing such high-density terminations into the rack presents new challenges.

Innovations in the connector space, including connectors terminating multiple fibers, are addressing issues such as link loss and polarity mismatches.
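
As an illustration of the polarity problem these connectors solve, the sketch below models the standard polarity mappings for a 12-fiber MPO trunk; a link only works when the cascade of trunks and cords delivers each transmit fiber to its partner's receiver:

```python
# Illustrative sketch of MPO trunk polarity types (per TIA-568); a polarity
# mismatch means a transmit fiber arrives at the wrong receive position.

def mpo_map(polarity: str, fibers: int = 12) -> list[int]:
    """Return the far-end fiber position for each near-end position 1..n."""
    if polarity == "A":                      # straight-through: 1 -> 1
        return list(range(1, fibers + 1))
    if polarity == "B":                      # reversed: 1 -> 12
        return list(range(fibers, 0, -1))
    if polarity == "C":                      # pairwise flip: 1 -> 2, 2 -> 1
        out: list[int] = []
        for i in range(1, fibers + 1, 2):
            out += [i + 1, i]
        return out
    raise ValueError(f"unknown polarity type: {polarity}")

# A Type B trunk lands near-end fiber 1 on far-end position 12.
assert mpo_map("B")[0] == 12
```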

Pushing performance

To meet the needs of high-speed AI pods, data center operators are embracing new form factors for pluggable optics designed to handle higher data rates. These optics rely on complex modulation schemes, which introduce new potential points of failure.

Nicholas Cole, data center solutions manager at EXFO, describes the testing process:

“Our solution addresses this issue with a product that rigorously tests these transceivers. We run comprehensive scripts to evaluate various parameters, including receive power, transmit power, bit error rate, signal skew, physical power, and temperature. By validating these transceivers within the data center, we ensure reliability and performance, mitigating the risks associated with high data rates and complex signal processing.”
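
The quote does not detail the scripts themselves, so the following is a hypothetical sketch of what a pass/fail check over the parameters Cole lists might look like; the field names and limits are illustrative, not EXFO's:

```python
# Hypothetical transceiver acceptance check in the spirit of the parameters
# Cole lists; thresholds and field names are illustrative only.

PASS_LIMITS = {
    "rx_power_dbm":  (-10.0, 2.0),   # acceptable receive power window
    "tx_power_dbm":  (-8.0, 4.0),    # acceptable transmit power window
    "pre_fec_ber":   (0.0, 2.4e-4),  # must remain correctable by FEC
    "skew_ps":       (0.0, 30.0),    # lane-to-lane signal skew budget
    "module_power_w": (0.0, 15.0),   # physical power draw of the module
    "temperature_c": (0.0, 70.0),    # commercial operating range
}

def validate(readings: dict[str, float]) -> list[str]:
    """Return a list of failures; an empty list means the module passes."""
    failures = []
    for param, (lo, hi) in PASS_LIMITS.items():
        value = readings.get(param)
        if value is None or not lo <= value <= hi:
            failures.append(f"{param}={value} outside [{lo}, {hi}]")
    return failures

sample = {"rx_power_dbm": -4.2, "tx_power_dbm": 1.1, "pre_fec_ber": 1.0e-5,
          "skew_ps": 12.0, "module_power_w": 11.3, "temperature_c": 48.5}
print(validate(sample) or "PASS")
```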

AI-driven applications also increase the need for network ports, as servers like the Nvidia DGX H100 require multiple connections for computing and storage fabrics. This results in denser networks with more cables, posing challenges in cable identification and fault resolution. Efficient management of port connectivity and quick fault detection have become essential.

The growing complexity of networks generates more performance data, making data analytics invaluable. Analyzing test data enhances troubleshooting, drives efficiencies, and provides intelligence on network performance. This trend is expected to be significant in 2024 as customers seek to leverage data for improved network management.

2. Across the campus

Data centers require extensive communication with the outside world, leading to increasing traffic between buildings. Larger fiber cables, with capacities of up to 6,912 fibers, are being deployed to meet current and future demands. Romain Tursi, solutions manager for centralized fiber testing & monitoring at EXFO, explains:

“This is what we are doing at EXFO, ensuring that during construction, real-time feedback is available on-site, with all joints and connectivity points analyzed during installation. This paradigm shift is addressed by our ‘test as you build’ approach.”

This new approach allows for efficient handling of the increased complexity of modern fiber installations.

“Test as you build” methodology in practice

The “test as you build” approach moves traditional optical time-domain reflectometer (OTDR) testing into rack-mounted units paired with optical switching, allowing a single OTDR to test hundreds or even thousands of fibers. Previously, OTDR testing was performed manually at the end of construction, producing a single final report; now fibers can be tested remotely as they are spliced, with immediate feedback at each step, industrializing and automating the entire process. Tursi describes how it works:

“There is a remote OTDR system that controls and manages the optical switches. This setup allows for bulk testing preparations throughout the workday. By day’s end, tests can be initiated for all completed work, providing immediate feedback the following day on any issues encountered.”
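
As a rough illustration of that workflow, the sketch below queues fibers for testing as splicing completes and sweeps them in bulk at day's end; every function name and threshold here is hypothetical, not EXFO's API:

```python
# Hypothetical sketch of the "queue during the day, test overnight" workflow
# Tursi describes: a remote OTDR drives an optical switch so one instrument
# can reach many fibers. All names and limits are illustrative.

from dataclasses import dataclass, field

@dataclass
class TestQueue:
    pending: list[int] = field(default_factory=list)   # switch ports to test

    def enqueue(self, port: int) -> None:
        """Technicians register each fiber as splicing on it completes."""
        self.pending.append(port)

    def run_overnight(self, select_port, run_otdr) -> dict[int, bool]:
        """At day's end, sweep every queued fiber and record pass/fail."""
        results = {}
        for port in self.pending:
            select_port(port)                 # route the OTDR to this fiber
            trace = run_otdr()                # acquire the OTDR trace
            results[port] = trace["max_event_loss_db"] < 0.5  # illustrative limit
        self.pending.clear()
        return results

# Stub hardware hooks so the sketch runs standalone.
queue = TestQueue()
for p in (3, 7, 12):
    queue.enqueue(p)
print(queue.run_overnight(select_port=lambda p: None,
                          run_otdr=lambda: {"max_event_loss_db": 0.21}))
```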

Cole goes on to explain how EXFO's MaxTester complements a centralized test approach by providing a portable test solution, so technicians can troubleshoot issues on links across the campus, or those that require additional scrutiny:

“This portable OTDR allows you to take a picture of a link between two points of connectivity to try and isolate the failure, so you can measure where the problem is.


For example, if there’s an issue with this patch cord that’s connecting two racks, you can see this on the instrument highlighted in red.

The instrument will show you what a bad patch cord looks like and what a good one looks like, where you have low loss and low reflections. So you need low reflection to make these high data rates work properly.”

This visual and diagnostic capability aids technicians in swiftly identifying and rectifying issues to maintain optimal network performance, ensuring minimal loss and reflections for efficient high-speed data transmission.
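
To put "low loss" in numbers, the sketch below tallies an estimated loss budget for a campus link from typical component values; the figures are illustrative, and real budgets come from the cable and optics specifications. A bad patch cord shows up as an event far above these numbers, which is exactly what the OTDR trace highlights:

```python
# Back-of-the-envelope link loss budget using typical component values
# (illustrative figures only; consult actual component datasheets).

FIBER_DB_PER_KM = 0.35   # single-mode attenuation near 1310 nm
CONNECTOR_DB    = 0.5    # per mated connector pair
SPLICE_DB       = 0.1    # per fusion splice

def link_loss(km: float, connectors: int, splices: int) -> float:
    return km * FIBER_DB_PER_KM + connectors * CONNECTOR_DB + splices * SPLICE_DB

budget_db = 4.0          # illustrative power budget for the optics
loss = link_loss(km=2.0, connectors=4, splices=2)
print(f"estimated loss {loss:.2f} dB -> "
      f"{'OK' if loss <= budget_db else 'OVER BUDGET'}")
```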

3. Between metro and edge sites

Cole introduces the next trend: connecting data centers using coherent optics, which encode data in the amplitude and phase of light across two polarizations to transmit data more efficiently. This approach helps operators move toward an open vendor ecosystem, reducing infrastructure costs and power consumption.

However, it introduces complexities, such as managing signal-to-noise ratios and ensuring compatibility across different components and wavelengths. Cole emphasizes the importance of rigorous testing during activation to confirm link stability and simplify troubleshooting, marking a significant advancement in interconnecting data centers.

Many unmanned edge sites

Building on this, Tursi discusses the rise of 5G technology, which requires more computing power closer to end users and is driving the establishment of smaller, unmanned edge data centers.

Remote supervision and control via robust network architectures are essential for managing these sites, including deploying remote test units across the network for continuous monitoring. The focus is on ensuring reliability and readiness for future needs, even on standby fibers, through automated monitoring and alarm systems.

This approach aims to maintain operational integrity without continuous on-site human supervision, crucial for both active and standby network routes.

Proactive network monitoring and maintenance, enabled by real-time visibility into network issues, are crucial for optimizing network reliability. Analyzing network data trends helps detect gradual degradation and predict potential failures, facilitating proactive maintenance scheduling.
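
As a simple illustration of that idea, the sketch below fits a straight line to periodic loss measurements on a monitored fiber and extrapolates when the alarm threshold will be crossed; a production system would rely on its monitoring platform's own analytics, and the data here is invented:

```python
# Illustrative degradation-trend check: fit a least-squares line to periodic
# loss measurements and estimate when the alarm threshold will be crossed.

def days_until_threshold(samples: list[tuple[float, float]],
                         threshold_db: float):
    """samples: (day, loss_db) pairs. Returns days of margin left, or None."""
    n = len(samples)
    sx = sum(d for d, _ in samples)
    sy = sum(l for _, l in samples)
    sxx = sum(d * d for d, _ in samples)
    sxy = sum(d * l for d, l in samples)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    if slope <= 0:
        return None                      # not degrading; nothing to predict
    crossing_day = (threshold_db - intercept) / slope
    return crossing_day - samples[-1][0]

history = [(0, 1.10), (30, 1.18), (60, 1.27), (90, 1.33)]  # illustrative data
print(days_until_threshold(history, threshold_db=2.0))      # days of margin left
```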


Embracing the future of data center networks

The broadcast explores the key priorities in scaling dense networks through automation, using remote monitoring and analytics for proactive maintenance, and improving the efficiency of troubleshooting and restoration processes. These capabilities are crucial for managing the increasing complexity of data center and campus networks. Tune in to uncover how cutting-edge solutions are paving the way for a more connected world.

Learn more about EXFO’s data center testing solutions at exfo.com.