Complexity and Network Automation
By Louis Spencer JR | September 24, 2018
This is the first blog in a series that will dive into the details around a modern approach to network automation.
Network automation of complex configurations is much, much harder than automation of simple configurations. What makes a configuration complex and what is required in an automation system to handle complex configurations?
Let’s start by looking at some simple network automation tasks. Automating simple configurations is a good way to get started with network automation. The configurations are easy to build, easy to check, and are generally low risk to the proper operation of the network. My favorites for simple configurations are syslog, SNMP, NTP, and password maintenance.
The simple configurations are limited in scope—they don’t interact with packet forwarding subsystems, so they tend to be low risk. In many cases, the new configuration can be added and verified before removing the old configuration statements. This two-step process reduces risk at the expense of greater effort.
Simple configurations use a few commands that have few variables. The variables are easily identified and provide an opportunity to learn how an automation system performs variable replacement. One-line commands like ntp server ip-address or snmp-server community string are good examples.
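As a rough illustration of variable replacement, here is a minimal Python sketch that uses the standard library's string.Template. The template list, the variable names (ntp_server, community), and the sample values are assumptions made up for this example; they are not taken from any particular automation product.

```python
from string import Template

# Hypothetical templates built from the one-line commands mentioned above.
TEMPLATES = [
    Template("ntp server $ntp_server"),
    Template("snmp-server community $community"),
]


def render(variables):
    """Return the configuration lines with every $variable replaced."""
    # substitute() raises KeyError when a variable is missing, so an
    # incomplete variable set fails loudly instead of producing bad config.
    return [template.substitute(variables) for template in TEMPLATES]


# Sample values are placeholders (documentation address range).
lines = render({"ntp_server": "192.0.2.10", "community": "example-ro"})
print("\n".join(lines))
```

Failing on a missing variable is a deliberate choice in this sketch: a half-rendered command is harder to spot on a device than an error at render time.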
Configuration complexity goes up when the variables are references to other things in the configuration. Adding snmp-server trap-source interface-id is simple as long as the interface name can be predicted (e.g., loopback 0). The complexity rises quickly when the automation system must extract information from the existing configuration for use in the configuration update.
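To make the extraction step concrete, the following Python sketch looks up a loopback interface in an existing configuration and uses its name in the trap-source command. The sample configuration, the function names, and the rule "use the first loopback found" are all assumptions for illustration; a real tool would need a more careful selection policy.

```python
import re


def find_loopback(running_config):
    """Return the name of the first Loopback interface, or None."""
    match = re.search(
        r"^interface (Loopback\d+)\s*$",
        running_config,
        re.MULTILINE | re.IGNORECASE,
    )
    return match.group(1) if match else None


def trap_source_update(running_config):
    """Build the trap-source command from the existing configuration."""
    loopback = find_loopback(running_config)
    if loopback is None:
        # Refuse to guess: the interface name cannot be predicted here.
        raise ValueError("no loopback interface found; cannot set trap source")
    return f"snmp-server trap-source {loopback}"


# Hypothetical configuration excerpt used only for this example.
SAMPLE = """\
interface GigabitEthernet0/0
 ip address 192.0.2.1 255.255.255.0
interface Loopback10
 ip address 198.51.100.1 255.255.255.255
"""

print(trap_source_update(SAMPLE))
```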
Network automation tools must handle the case where a simple task becomes complex without requiring that the network administrators have programming skills. This is where Gluware beats simple tools like Ansible. Gluware’s modeling of configuration features allows it to understand the current configuration and to implement only those changes that are needed to transition to the new (intended) configuration. If the configuration already matches the intended state, it does not perform any changes.
Complex configuration tasks contain multiple elements that interact with one another and can influence the forwarding plane, increasing risk of an outage. Examples of complex configuration are QoS, MPLS, and DMVPN.
QoS is a great example. The differentiated services model needs to classify and mark traffic, convert policy into the configuration for queueing, shaping, and policing of traffic, and apply the defined policy to interfaces. The specific configuration is a combination of class maps, policy maps, and ACLs that must be configured to work together. To add more complexity, an edge interface would have a classification, marking, and queueing policy while an uplink interface would typically only have a queueing policy. Putting all these pieces together for an eight-queue design can be pretty complex. And the complexity is compounded if you need a hierarchical policy. All of this is before accounting for differences in QoS policy across multiple device types that use different queueing engines.
Many of the configuration components used by complex configurations use two additional mechanisms that complicate the configuration. The first is grouping. A routing protocol definition could include the default metric, a list of network address ranges, a list of neighbors, and a security configuration. Indentation is often used in the command line interface as a visual cue, as shown in the BGP example below.
router bgp 64500
 network 192.168.0.0 mask 255.255.252.0
 network 10.0.0.0 mask 255.0.0.0
 neighbor 10.0.0.2 remote-as 64500
 neighbor 10.0.0.3 remote-as 64500
 neighbor 172.16.1.2 remote-as 64300
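An automation tool has to recover this grouping before it can reason about the configuration. The Python sketch below shows one simple way to do that, under the assumption that child statements are indented by at least one space beneath their parent and that only one level of nesting matters. The function name group_by_indent is hypothetical, and real device configurations often nest more deeply than this handles.

```python
def group_by_indent(config_text):
    """Map each top-level command to the list of indented lines under it."""
    groups = {}
    current = None
    for raw in config_text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if raw[0].isspace():
            # Indented line: belongs to the most recent top-level command.
            if current is None:
                raise ValueError(f"indented line with no parent: {raw!r}")
            groups[current].append(raw.strip())
        else:
            current = raw.strip()
            groups.setdefault(current, [])
    return groups


# The BGP example from the text, with its indentation restored.
BGP_CONFIG = """\
router bgp 64500
 network 192.168.0.0 mask 255.255.252.0
 network 10.0.0.0 mask 255.0.0.0
 neighbor 10.0.0.2 remote-as 64500
 neighbor 10.0.0.3 remote-as 64500
 neighbor 172.16.1.2 remote-as 64300
"""

for parent, children in group_by_indent(BGP_CONFIG).items():
    print(parent, "->", len(children), "child statements")
```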
The second mechanism that increases complexity is order dependence. Access control lists are the best-known example, as shown in the following sample configuration. Switching the order of the two statements will deny all IP traffic instead of allowing established TCP connections.
access-list 105 permit tcp any any gt 1023 established
access-list 105 deny ip any any
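The effect of ordering can be demonstrated with a small first-match evaluator in Python. This is a deliberately simplified model of access list 105, not a faithful ACL engine: the Packet fields, the predicate functions, and the treatment of "established" as a single boolean are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    protocol: str      # "tcp", "udp", ...
    dst_port: int
    established: bool  # simplified stand-in for the ACK/RST check


def tcp_high_port_established(packet):
    """Models: permit tcp any any gt 1023 established"""
    return (
        packet.protocol == "tcp"
        and packet.dst_port > 1023
        and packet.established
    )


def any_ip(packet):
    """Models: deny ip any any"""
    return True


def evaluate(acl, packet):
    """Return the action of the first matching entry (first match wins)."""
    for action, matches in acl:
        if matches(packet):
            return action
    return "deny"  # implicit deny at the end of every access list


ORIGINAL = [("permit", tcp_high_port_established), ("deny", any_ip)]
SWAPPED = list(reversed(ORIGINAL))

reply = Packet(protocol="tcp", dst_port=50000, established=True)
print("original order:", evaluate(ORIGINAL, reply))
print("swapped order: ", evaluate(SWAPPED, reply))
```

Because evaluation stops at the first match, the same two entries give different results for the same packet depending only on their order, which is why an automation tool cannot treat an ACL as an unordered set of lines.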
The configuration automation tool must handle this complexity and at the same time provide a simple user interface. For example, if there is already a QoS policy implemented in some percentage of the devices, does the automation tool know how to identify the old policy and remove it while installing the new policy? Or does the resulting configuration contain both the old and the new policy? Even though no interfaces may reference the old policy, it is a potentially confusing artifact of the old configuration. Ideally, the configuration automation system will identify the old policy and remove it.
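One way to picture the cleanup step is a check for policy maps that nothing references. The Python sketch below assumes an IOS-style configuration in which policies are attached with "service-policy input" or "service-policy output"; the policy names and function names are invented for the example. It does not handle hierarchical (child) policies or typed policy maps, which a real tool would need to account for before removing anything.

```python
import re


def unreferenced_policy_maps(running_config):
    """Return policy-map names that no service-policy statement references."""
    defined = re.findall(r"^policy-map (\S+)", running_config, re.MULTILINE)
    referenced = set(
        re.findall(
            r"^[ \t]+service-policy (?:input|output) (\S+)",
            running_config,
            re.MULTILINE,
        )
    )
    return [name for name in defined if name not in referenced]


def cleanup_commands(running_config):
    """Build the removal commands for policies that are no longer in use."""
    return [
        f"no policy-map {name}"
        for name in unreferenced_policy_maps(running_config)
    ]


# Hypothetical configuration: OLD-QOS is left over from an earlier rollout.
SAMPLE = """\
policy-map OLD-QOS
 class class-default
  fair-queue
policy-map NEW-QOS
 class class-default
  fair-queue
interface GigabitEthernet0/1
 service-policy output NEW-QOS
"""

print(cleanup_commands(SAMPLE))
```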
Modern automation platforms, like Gluware, let you define the desired state of configuration for each supported feature. The platform discovers the current configuration and state of a device, using either API or CLI interfaces, analyzes that state information, and compares the current state with the intended state. Configuration changes are then executed to bring the device configuration into compliance with the intended state. The platform can handle the grouping and statement ordering that are required for complex configurations that include class-maps, policy-maps, ACLs, and interface configuration.
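The compare-and-converge idea can be sketched in a few lines of Python. This is only an illustration of the general desired-state pattern, not a description of how Gluware or any other product implements it. It assumes a feature whose statements are order-independent global lines that can be removed by prefixing "no"; grouped or ordered features such as ACLs would need a richer model. The function name plan_changes and the sample addresses are hypothetical.

```python
def plan_changes(current, intended):
    """Return the commands that move `current` to `intended`.

    An empty list means the device is already compliant, so nothing is sent.
    """
    current_set, intended_set = set(current), set(intended)
    # Add missing statements first, then remove stale ones, so the new
    # configuration is in place before the old one is taken away.
    to_add = [line for line in intended if line not in current_set]
    to_remove = [f"no {line}" for line in current if line not in intended_set]
    return to_add + to_remove


# Placeholder NTP servers from the documentation address range.
current = ["ntp server 192.0.2.10", "ntp server 192.0.2.99"]
intended = ["ntp server 192.0.2.10", "ntp server 192.0.2.20"]

print(plan_changes(current, intended))
print(plan_changes(intended, intended))  # already compliant: no changes
```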
Where Is the Industry Today?
Most automation tools in the industry are capable of making simple changes like adding a single configuration statement, or replacing part of a configuration, line for line. However, once the changes start to become complex, these tools begin to fail. This is why the number of automated network configuration tasks is under 20% in most enterprises (Forrester Research). That’s unfortunate, because Gartner’s paper Avoid These ‘Bottom 10’ Networking Worst Practices identifies Manual Network Changes and Lack of Automation as one of the worst practices.
Gluware is a particularly interesting product because the network staff doesn’t have to develop programming skills. While such skills are valuable over the long run, being able to make progress in the short term is what organization executives want to see. Then the task becomes one of determining which tasks should be automated. I prefer the approach of starting with simple, easily accomplished tasks, using a platform that is capable of complex changes when needed. Build on successful implementation of the simple tasks. Become familiar with the new processes that are required. It is a cultural change, as described in a recent blog Is Automation Really Faster? A slightly different view is presented by Olivier Huynh Van, the Gluware CTO, in a very interesting blog titled 7 Steps Toward Network Automation.
There is a lot of interest in simple network automation tools like Ansible, Salt, and NAPALM. But once you get past simple automation tasks, these tools begin to require more advanced programming skills. Is this the direction your organization wants to take? If not, then some of the commercial tools become important and Gluware is one of the candidates.
Adopting network configuration automation is a journey. Investigate products and test-drive several of the available tools to decide for yourself which ones will work best in your environment. This means dedicating time for the product evaluations. Executive support will be essential.
After selecting a product, spend the time to make it work in your environment. It takes time and investment to identify the new processes that are needed and to transition from old processes. It is important that everyone across the organization understands the importance of network automation, and more importantly, how increasing network automation can drive business agility and profit. Publicize the successes, failures, and lessons learned in order to continue to receive support across the organization, especially from the executives.