Exponential growth in data creation, storage and processing is driving global demand for hyperscale data centers. 

The infrastructure of these data centers, most of which is designed and managed by Cloud Service Providers (CSPs) such as Amazon, Microsoft, and Google, relies on robust physical connectivity to ensure that deployed storage and compute resources are fully utilized. 

A case can be made for using an automated solution to test and troubleshoot the data link layer of these critical physical connections. But first the links and link types should be examined.

The three most common physical links found within hyperscale data centers are Direct Attach Copper (DAC) cables, Active Optical Cables (AOCs) and fiber optic cable assemblies connected to transceivers. All three of these connectivity elements can be viewed as highways over which data travels.

Not only is it important that the underlying electrical and optical connectivity of these components function properly, it's also critical that the data link layer operates within the established network/manufacturer specifications. 

These connectivity elements vary primarily by data rate and physical link length (maximum reach). DACs are used to connect devices separated by short distances and are most often leveraged to connect two elements within the same rack.

AOCs have extended reach that enables the connection of two devices within the same row or even in a neighboring row. Fiber optic cable assemblies (including long-haul outside plant cables spliced into data center entrance panels) can connect two devices at great distance from one another (typically up to 100 km without amplification, depending on the transceivers to which they are connected).

No matter what the data rate or distance of span, all of these links must be monitored and tested to ensure that both physical and data link layer functionality are within specification.

Direct Attach Copper (DAC) Cables

Direct Attach Copper (DAC) cables are an alternative to optical cabling in which the cable itself is made of copper rather than optical fiber. A DAC may be passive, providing a direct electrical connection, or active, with signal processing circuitry integrated into its built-in connectors.

Just as with an AOC, a DAC is terminated by a Small Form Factor Pluggable (SFP) or a Quad Small Form Factor Pluggable (QSFP) connector, depending on the line rate. By comparison, AOC cables support longer transmission distances and are lighter than DAC cables. However, AOCs cost more, and optical fibers are more easily damaged than copper cables. Both AOC and DAC cables are also available as breakouts.

Active Optical Cables (AOC)

Active Optical Cables (AOC) are used in point-to-point interconnect applications in data centers, most often within the same row.

Compared with fiber optic cable assemblies connected to transceivers (also referred to as pluggable optics, e.g., SFPs and QSFPs), AOCs are simpler to install: there is no interconnection loss to consider and no need to clean and inspect fiber end faces before making a connection.

However, AOC cables cannot be used in End of Row (EOR) or Middle of Row (MOR) configurations that use patch panels. For high-speed links at 40GE, 100GE and 400GE, an AOC typically carries multiple lanes of data over ribbon cables. At 10GE, 25GE or 50GE, a single lane or fiber per direction is sufficient.

A key attribute is that AOC employs the same cages as pluggable optics and performs electrical-to-optical conversions at each cable end. In practice, this means QSFP terminations for 40GE and 100GE (QSFP-DD for 400GE) and SFP terminations for 10GE and 25GE.
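The rate-to-termination pairing described above can be captured as a small lookup table. The sketch below is illustrative only: the names `AocTermination`, `AOC_BY_RATE` and `aoc_termination` are hypothetical, the form-factor labels are the family names used in this article rather than specific module generations, and the 50GE form factor is left unset because the article does not state it.

```python
from typing import NamedTuple, Optional


class AocTermination(NamedTuple):
    """Termination family and lane layout for one Ethernet rate (illustrative)."""
    form_factor: Optional[str]  # None where the article gives no form factor
    multi_lane: bool            # True when multiple lanes are typically used


# Values taken from the article text; nothing here is read from a real cable.
AOC_BY_RATE = {
    "10GE": AocTermination("SFP", False),
    "25GE": AocTermination("SFP", False),
    "50GE": AocTermination(None, False),   # single lane; form factor not stated
    "40GE": AocTermination("QSFP", True),
    "100GE": AocTermination("QSFP", True),
    "400GE": AocTermination("QSFP-DD", True),
}


def aoc_termination(rate: str) -> AocTermination:
    """Look up the termination family for a rate such as '100GE'."""
    try:
        return AOC_BY_RATE[rate]
    except KeyError:
        raise ValueError(f"rate not covered by this sketch: {rate!r}") from None


if __name__ == "__main__":
    for rate in AOC_BY_RATE:
        print(rate, aoc_termination(rate))
```

A table like this is mainly useful as an inventory sanity check, for example to flag a cable whose labeled rate does not match the cage it is meant to plug into.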

The AOC is therefore active and includes transceivers, control chips and modules, in addition to the fiber optic cable. AOC cables are of fixed length, starting at just a few meters and extending to 100 meters or more.

Technically, an AOC does not have to comply with any of the numerous Ethernet interface types, although many advertise a certain type in the coded information.

Because DAC and AOC do not provide test access to the actual fiber or copper cabling, traditional media test and certification tools cannot be used to certify or troubleshoot the cable. Instead, a test tool that can accept dual SFP/QSFP transceivers and generate and analyze traffic must be used.

Testing DAC and AOC cables is a critical step in ensuring that network performance issues are not caused by the cable or its installation. Consider that it is costlier to troubleshoot a faulty cable once it is installed than to test it up front.

For one thing, the far end of an installed cable must first be traced and located. Causes of DAC/AOC cable failure include simple manufacturing defects such as wrong or reversed polarities, mislabeling, and damage during shipment.

An AOC may be bent excessively, causing high loss, or its fibers may be crushed. A DAC can suffer EMI degradation that results in excessive bit errors. Add the increased number of cables to be tested in the hyperscale data center and it’s easy to understand the need to automate the testing process.

Edge deployments & disaggregation: Balancing time/cost efficiencies of testing at install

Hyperscale data center construction and commissioning in the era of optical network disaggregation means that contractors are now also responsible for testing and certification of performance, operability, stress and reliability of every fiber link within these multi-component white box networks.

Combine this with the exponential growth of hyperscalers and their need to be ever closer to the end user and the result is more Edge deployments (network virtualization). This is forcing hyperscalers to increase speed, security and efficiency at the same time as they minimize latency.

The coinciding need to turn up Edge deployments quickly can add fuel to the decision not to test all cables prior to installation, opting instead to wait and address any connectivity issues during troubleshooting.

Similarly, the need to minimize downtime during troubleshooting often leads to the decision to cut or disconnect a cable and lay a new cable rather than troubleshoot or remove the existing cable.

It is common for untested cables pulled out of cabinets in haste to be returned to the manufacturer only to have the manufacturer say there is nothing wrong with the cable, or to collect large volumes of “non-working” cables with no ability to diagnose them.

This is costly, since cables can range from tens to thousands of dollars depending on the line rate, and leaving dead cables in place creates unwanted bulk in the cabinet. That bulk is cumbersome, and it can also lead to mislabeling, confusion and an increased probability of unplugging a live cable.

Because old cables cannot be used in an upgrade due to their rate-specificity, leaving cut and dead cables in a cabinet creates unsustainable volume and weight that can compromise the function of the rack and the structure.

The value of Bit Error Rate testing

Accurately illustrating the time-cost benefit of testing and verifying every cable at install is difficult due to the varying costs of the cables among other factors.

However, it is not difficult to see how testing too few cables at install could make future troubleshooting and network upgrades more time-consuming and costly.

The simplest and most cost-effective way to test cables is to run a test pattern where the results can be compared to a Bit Error Rate (BER) threshold. DAC and AOC cables including breakouts usually have a BER rating on their datasheets, especially when they are meant to be used with devices implementing the RS-FEC algorithm.

This BER rating depends on the type of cable, line rate and type of Ethernet interface. In the case of a cable meant for use with RS-FEC encoded traffic, which is typical at 400GE, 100GE, 50GE and 25GE, there may even be both a pre-FEC rating (before error correction) and a post-FEC rating (after error correction).

In such a case, it is recommended to perform a cable test using a pre-FEC BER threshold close to the cable BER rating and ensure the measured BER is smaller than the threshold for a successful test.

For 40GE and 10GE cables where RS-FEC is not used, the expected BER threshold needs to be quite a bit smaller as there is no error correction on those circuits. In such cases, if there is no BER rating for the DAC or AOC, the recommended threshold BER is 10^-12.
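The threshold rules above reduce to a short decision followed by a comparison. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the error and bit counts are assumed to come from whatever test instrument is in use, and the case of an RS-FEC cable with no datasheet rating is rejected because the article gives no default for it.

```python
from typing import Optional

# Fallback from the article for links without RS-FEC (e.g. 40GE and 10GE).
DEFAULT_NO_FEC_BER = 1e-12


def select_threshold(uses_rs_fec: bool, datasheet_ber: Optional[float]) -> float:
    """Pick the BER threshold for a cable test.

    A datasheet rating (the pre-FEC rating for RS-FEC cables) wins when
    present; otherwise fall back to 1e-12 for links without RS-FEC.
    """
    if datasheet_ber is not None:
        return datasheet_ber
    if not uses_rs_fec:
        return DEFAULT_NO_FEC_BER
    # The article gives no default here, so this sketch refuses to guess.
    raise ValueError("RS-FEC cable has no datasheet BER rating to test against")


def measured_ber(bit_errors: int, bits_received: int) -> float:
    """Bit error rate = errored bits / total bits received."""
    if bits_received <= 0:
        raise ValueError("bits_received must be positive")
    return bit_errors / bits_received


def cable_passes(bit_errors: int, bits_received: int, threshold: float) -> bool:
    """A test succeeds when the measured BER is smaller than the threshold."""
    return measured_ber(bit_errors, bits_received) < threshold


if __name__ == "__main__":
    threshold = select_threshold(uses_rs_fec=False, datasheet_ber=None)
    print(threshold, cable_passes(0, 6 * 10**11, threshold))
```

The comparison is strict (`<`) to match the "smaller than the threshold" wording; a measured BER exactly equal to the threshold is treated as a failure in this sketch.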

Test times of one minute per cable are more than sufficient to obtain meaningful BER results with line rates of 10Gbps or higher. Best practice procedures for cable tests will result in the generation of test reports including a cable identifier, such as the serial number, which can be read from a DAC or AOC cable.
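To get a feel for what a given test time covers, it helps to count the bits that cross the link and the number of errors one would expect at the threshold BER. The sketch below is an illustration, not part of any test tool. The function names are hypothetical and the example pre-FEC threshold of 1e-5 is a placeholder rather than a datasheet value. The zero-error confidence formula, 1 - exp(-N x BER), is a common statistical rule of thumb that assumes independent random bit errors; it does not come from the article.

```python
import math


def bits_observed(line_rate_bps: int, duration_s: int) -> int:
    """Total bits carried in one direction during the test."""
    return line_rate_bps * duration_s


def expected_errors(line_rate_bps: int, duration_s: int, ber: float) -> float:
    """Average number of bit errors if the link ran exactly at `ber`."""
    return bits_observed(line_rate_bps, duration_s) * ber


def zero_error_confidence(line_rate_bps: int, duration_s: int, ber: float) -> float:
    """Confidence that the true BER is below `ber` after an error-free run.

    Rule of thumb assuming independent, random bit errors.
    """
    return 1.0 - math.exp(-expected_errors(line_rate_bps, duration_s, ber))


if __name__ == "__main__":
    one_minute = 60
    cases = [
        ("10 Gbps, no FEC, threshold 1e-12", 10_000_000_000, 1e-12),
        ("100 Gbps, RS-FEC, placeholder pre-FEC threshold 1e-5", 100_000_000_000, 1e-5),
    ]
    for label, rate, ber in cases:
        print(label)
        print("  bits observed:  ", bits_observed(rate, one_minute))
        print("  expected errors:", expected_errors(rate, one_minute, ber))
        print("  zero-error conf:", zero_error_confidence(rate, one_minute, ber))
```

With these inputs, one minute at 10 Gbps covers 6 x 10^11 bits. Against a pre-FEC threshold, millions of errors would be expected at the limit, so a one-minute measurement is statistically well populated. Against a 10^-12 threshold, the same run corresponds to fewer than one expected error, so a test plan that needs a stricter statistical guarantee at that threshold may choose a longer run.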

Testing DAC or AOC cables against their target BER thresholds is a meaningful method to ensure more cables are functional when connected.
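A per-cable test report of the kind described above, keyed by the cable identifier, could be represented as a simple record. This is a hypothetical sketch: the `CableTestReport` fields and the JSON layout are assumptions for illustration and do not reflect the report format of any particular test instrument. The serial number shown is a placeholder, since reading it from the cable is left to the test tool.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CableTestReport:
    """One BER test result for a single DAC or AOC cable (illustrative)."""
    cable_id: str        # e.g. the serial number read from the cable
    cable_type: str      # "DAC" or "AOC"
    line_rate: str       # e.g. "100GE"
    threshold_ber: float
    bit_errors: int
    bits_received: int
    duration_s: int

    @property
    def measured_ber(self) -> float:
        return self.bit_errors / self.bits_received

    @property
    def passed(self) -> bool:
        # Pass only when the measured BER is smaller than the threshold.
        return self.measured_ber < self.threshold_ber


def report_to_json(report: CableTestReport) -> str:
    """Serialize the stored fields plus the derived BER and verdict."""
    record = asdict(report)
    record["measured_ber"] = report.measured_ber
    record["passed"] = report.passed
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    example = CableTestReport(
        cable_id="SN-EXAMPLE-0001",   # placeholder identifier
        cable_type="AOC",
        line_rate="100GE",
        threshold_ber=1e-5,           # placeholder pre-FEC threshold
        bit_errors=0,
        bits_received=6 * 10**12,
        duration_s=60,
    )
    print(report_to_json(example))
```

Keeping the identifier in every record lets a cable that is later pulled from a cabinet be matched to its install-time result.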

VIAVI Solutions offers the T-BERD®/MTS 5800 Automated Cable Test Suite, an all-in-one automated testing and troubleshooting solution for hyperscalers. For more information, visit the dedicated product page.