If we were in couples counseling, I think we’d have to admit that DevOps has been taking NetOps for granted. If development and operations teams are going to continue to get along well with their network operations colleagues, everyone needs to play together. The way to achieve this is with a sandbox — in fact, a digital sandbox is exactly what NetOps needs.
DevOps assumes that the network is agile, can change rapidly and is easily consumable. But this isn’t always the case. The challenge for data center operators is to use the network fabric efficiently by tuning it on the fly as applications and services scale up or down. We saw during the pandemic, for instance, that as more people abruptly took to working from home, cloud services required entirely different network resources to meet user demand.
This will likely be the new normal. Enterprise adoption of 5G, AI, IoT and Industry 4.0, with an emphasis on the edge cloud, will significantly multiply the number of applications placing demands on the network fabric, and those demands will not necessarily be predictable. The only way to achieve the agility and flexibility to match them, and to reduce risk, will be through automation, just as we’ve seen for continuous development and continuous integration.
In DevOps, automation scripts are thoroughly tested and validated in self-contained developer sandboxes before they’re unleashed on the production server. On the NetOps side, testing and validation have typically been done on physical network equipment in the lab. These microcosms of the actual network are resource-intensive and expensive to maintain. They also tend to assume that the network is somewhat stable, since reconfiguring the test network consumes time and resources. The risk with today’s data center fabrics is that the actual network configuration is likely to drift away from the lab configuration, undermining the testing and validation that have occurred in the lab-based network. This will not simply give rise to problems; it will make them more difficult to troubleshoot and solve.
Enter the digital sandbox for the network
The solution is to emulate the sandbox approach of DevOps in NetOps. This requires a flexible, software-driven environment that can produce a digital twin of the planned or live network fabric over which the DevOps team intends to deliver its application. The code running in the sandbox would be the same as that running on the switches in the physical environment, and the sandbox would also emulate the topology and the connections of the physical network.
Such a digital sandbox would give the NetOps teams the ability to replicate and test multiple scenarios, including new configurations and applications, to evaluate how they will perform prior to full deployment. The configuration needs to be identical, and the sandbox must emulate, not merely simulate, both the control plane and the data plane of the network. This will allow the teams to see exactly what the expected and real behaviors are going to be before releasing the configuration or application to production.
There are some NetOps systems that do some of these things now, but very few orchestrate the emulation environment so that, at the push of a button, it’s possible to create a digital twin of the physical network in a virtualized environment. Ideally, the twin should set itself up automatically with the same production configuration: the same connections, the same control plane and a scaled-down data plane. This can’t be solved by individual network equipment vendors unless they adopt an open-source approach, because data center fabrics are in almost all cases multi-vendor. Physical devices from multiple vendors need to connect into the sandbox so that live interop can be validated and tested as well. Additionally, external sources of information like BGP peers, route reflectors or traffic generators should be able to connect to the sandbox.
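To make the idea of an automatically derived twin concrete, here is a minimal Python sketch. It is not any vendor's API: the `Fabric` and `Link` types, the `make_twin` function and the `data_plane_divisor` parameter are all hypothetical names invented for illustration. It assumes the production fabric can be described as a set of node configurations plus a list of links, and it copies both unchanged while shrinking only link capacity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Link:
    a: str        # node name at one end
    b: str        # node name at the other end
    gbps: float   # link capacity


@dataclass(frozen=True)
class Fabric:
    nodes: dict   # node name -> configuration text
    links: tuple  # tuple of Link objects


def make_twin(prod: Fabric, data_plane_divisor: float = 100) -> Fabric:
    """Copy a fabric for the sandbox: same nodes, configs and connections,
    with every link's capacity divided by data_plane_divisor."""
    if data_plane_divisor < 1:
        raise ValueError("data_plane_divisor must be >= 1")
    return Fabric(
        nodes=dict(prod.nodes),  # identical configuration per node
        links=tuple(
            replace(link, gbps=link.gbps / data_plane_divisor)
            for link in prod.links
        ),
    )


# Illustrative two-leaf, one-spine fabric (made-up names and configs).
prod = Fabric(
    nodes={"spine1": "router bgp 65000", "leaf1": "router bgp 65001",
           "leaf2": "router bgp 65002"},
    links=(Link("spine1", "leaf1", 100), Link("spine1", "leaf2", 100)),
)
twin = make_twin(prod)
```

In this sketch the twin keeps the control-plane inputs (configurations and adjacency) identical to production and reduces only bandwidth, which mirrors the "same connections, same control plane, scaled-down data plane" goal described above. A real orchestrator would also have to launch the emulated network operating systems and attach external peers, which this sketch does not attempt.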
Reducing operational risk
The sandbox should also give NetOps breathing space when it comes to Day 2+ management, troubleshooting and application development. As with DevOps, once released, problems with applications and services will occur, so there needs to be a graceful way to recover. The NetOps system needs to constantly monitor when and where changes happen and monitor variations from the intents of the overlay or underlay services on each node. The key is to sort the important issues from those that are not. By instantly firing up a digital twin of the live network, the team has a safe and risk-free place to go and investigate without the potential of causing additional problems.
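The monitoring step described here, comparing each node against its intent and then separating important deviations from minor ones, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a description of any real NetOps product: it assumes intent and observed state are available as per-node dictionaries of settings, and the names `Deviation`, `find_drift`, `important` and `critical_keys` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deviation:
    node: str
    key: str
    intended: object
    observed: object


def find_drift(intent: dict, observed: dict) -> list:
    """Return one Deviation for every intended setting whose observed
    value differs (a missing setting is reported as None)."""
    deviations = []
    for node, wanted in intent.items():
        actual = observed.get(node, {})
        for key, value in wanted.items():
            if actual.get(key) != value:
                deviations.append(Deviation(node, key, value, actual.get(key)))
    return deviations


def important(deviations: list, critical_keys: set) -> list:
    """Keep only deviations on settings the team has marked as critical."""
    return [d for d in deviations if d.key in critical_keys]


# Made-up example data: leaf1 has drifted on MTU and on its description.
intent = {
    "leaf1": {"bgp_asn": 65001, "mtu": 9000, "description": "rack 1"},
    "leaf2": {"bgp_asn": 65002, "mtu": 9000, "description": "rack 2"},
}
observed = {
    "leaf1": {"bgp_asn": 65001, "mtu": 1500, "description": "rack one"},
    "leaf2": {"bgp_asn": 65002, "mtu": 9000, "description": "rack 2"},
}

drift = find_drift(intent, observed)
critical = important(drift, critical_keys={"bgp_asn", "mtu"})
```

With this data, `drift` holds two deviations on `leaf1`, and `critical` narrows them to the MTU mismatch, the kind of issue worth reproducing in a digital twin. Which settings count as critical is a policy choice that each team would have to make for itself.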
Once a problem has been identified, the root cause can be isolated, and fixes tested, in the digital sandbox, without the potential of isolating other customers. The system should also automate the introduction of applications and fixes to the production environment, in ways similar to continuous development principles. The digital sandbox will make post-deployment validation easier, as it is a working reference of the expected network behavior. It can also be used for innovation, for example, to instantly test a new telemetry application, enable new protocols or migrate from a current technology to a new one.
Scale operations to keep pace
Not having equipment in a physical lab will save money, but the real savings are in time. By not having to set up physical hardware to test, configure and validate software, NetOps teams can reduce the time to complete operational tasks from days and weeks to minutes and hours. They can also decrease risk and improve the metrics they are judged by, such as uptime, performance, mean time to repair and mean time to innocence. And they can scale operations to keep pace with business growth while meeting the increasing needs of their DevOps colleagues for speed and flexibility.
The digital sandbox will enable NetOps teams to use the time saved to do even more exhaustive testing, optimize the network for new workloads, develop network applications, and tackle new strategic challenges. But most importantly, it will allow them to work more closely and integrate with their DevOps colleagues.