Network Automation with State Machines
Talk Title | Network Automation with State Machines |
Speakers | Yihua He, Zoe Blevins |
Conference | NANOG68 |
Conf Tag | |
Location | Dallas, Texas |
Date | Oct 17 2016 - Oct 19 2016 |
URL | Talk Page |
Slides | Talk Slides |
Video | Talk Video |
Automation has become vital to building large-scale networks. However, building these networks and managing their entire life cycle with minimal human intervention remains a challenge. We realized that the fundamental action in this automation can be abstracted as reconciling the difference between the actual state and the desired state of the system. Guided by state machines, we have implemented a fully automated system to provision, turn up, and manage our data center networks at Yahoo.

In this system, the network architecture is modeled as a set of configuration templates, and a no-touch configuration-generation engine is built on top of the model. The actual state of devices within the system is collected in real time by agents running on the devices. Additional data is pulled in from external sources, such as inventory databases, to feed the templating engine. Changes to the desired state come from engineers' input via the API, as well as from state data collected from the devices. These changes then trigger the model to generate the desired configurations for devices.

Once a new version of a configuration has been generated, it advances through three states: GENERATED, RELEASED, and VALIDATED. These states track the progress of a change and control the rate and sequence at which new configurations are released into the network. That control is applied at the transition from GENERATED to RELEASED, which will be explored in depth as part of this presentation. Once a configuration is in the RELEASED state, it is ready to be picked up by the network device. The device then applies the configuration, runs a series of health checks, and reports the version of the active configuration to the system.

This presentation will cover the overall design of the system, share the details of the state machines, walk through a specific use case, and discuss challenges faced when implementing the system.
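To make the configuration life cycle concrete, here is a minimal Python sketch of the three states named in the abstract. It is an illustration under stated assumptions, not the system built at Yahoo: the names `ConfigVersion`, `release_batch`, and `report_active_version` are hypothetical, the `max_in_flight` cap is just one possible way to control release rate, and the rule that a healthy device reporting the matching version moves a configuration to VALIDATED is inferred from the abstract rather than stated in it.

```python
from dataclasses import dataclass
from enum import Enum


class ConfigState(Enum):
    GENERATED = "GENERATED"
    RELEASED = "RELEASED"
    VALIDATED = "VALIDATED"


# Forward transitions, in the order the abstract lists the states.
NEXT_STATE = {
    ConfigState.GENERATED: ConfigState.RELEASED,
    ConfigState.RELEASED: ConfigState.VALIDATED,
}


@dataclass
class ConfigVersion:
    """One generated configuration version for one device (hypothetical type)."""
    device: str
    version: int
    state: ConfigState = ConfigState.GENERATED

    def advance(self) -> None:
        """Move to the next state; VALIDATED has no successor."""
        if self.state not in NEXT_STATE:
            raise ValueError(f"{self.state.name} is a terminal state")
        self.state = NEXT_STATE[self.state]


def release_batch(configs, max_in_flight):
    """Release GENERATED configs in list order (the sequence), keeping at most
    max_in_flight configs RELEASED but not yet VALIDATED (the rate).
    This policy is an assumption chosen for illustration."""
    in_flight = sum(1 for c in configs if c.state is ConfigState.RELEASED)
    released = []
    for cfg in configs:
        if in_flight >= max_in_flight:
            break
        if cfg.state is ConfigState.GENERATED:
            cfg.advance()
            in_flight += 1
            released.append(cfg)
    return released


def report_active_version(cfg, active_version, healthy):
    """Handle a device report. Assumption: a RELEASED config becomes VALIDATED
    only if health checks passed and the device runs this exact version."""
    if cfg.state is ConfigState.RELEASED and healthy and active_version == cfg.version:
        cfg.advance()
    return cfg.state


# Usage: three devices receive version 2, with at most two releases in flight.
configs = [ConfigVersion(f"switch-{i}", version=2) for i in range(3)]
release_batch(configs, max_in_flight=2)
print([c.state.name for c in configs])  # ['RELEASED', 'RELEASED', 'GENERATED']
```

In this sketch, the third device stays in GENERATED until one of the first two reports a healthy, matching version, which is one simple way a release gate could limit how far a bad change spreads.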