Network Programmability with NX-OS & Python – First Steps

*Note: Some basic introductory knowledge of Python will be assumed for these blog posts. If you’re new to Python, I suggest having a look at Al Sweigart’s Automate The Boring Stuff online book for a beginner’s guide. Also, I’ll come back and update this post with real working code samples that you can try yourself. For now, I’m using loose Python-like syntax to demonstrate the behavior rather than the individual lines of code.

Preface

Over the last year, my daily responsibilities have shifted (mostly thanks to $DAYJOB) toward the datacentre space of networking. The largest part of these new responsibilities has been improving our deployments of network gear, reducing time-to-market and eliminating as many of the common errors caused by these rollouts as possible. Without going into too much detail, these implementations are done manually: logging into devices, applying a configuration and verifying that the change completed successfully. This involves a lot of copy-pasta of CLI commands and has caused our department a myriad of headaches (and late-night phone calls).

Hardware & Software

We needed to deploy a lot of new gear. This was brought on by the common enterprise IT culprits for putting in new kit, such as end-of-life concerns and adding capacity to the current network. I’m sure everyone has been in the same planning meetings, where you need to replace lots of old, aging and creaky equipment as well as address new business requirements.

The platform of our choice has been the Cisco Nexus 9300 line of top-of-rack switches. After rolling the standard 3-tier network architecture for many years, we decided to switch gears and build a modern leaf-spine Clos fabric using Cisco’s VXLAN EVPN technology. This has given us great capacity for 10G server access and scalability using 40G & 100G in the spine & core layers.

The best benefit of rolling with this technology & newer hardware has been the programmatic interfaces that Cisco has included. This comes to us in the form of NX-API.

NX-API

Cisco’s NX-API is an interface that allows an operator to interact with a Nexus device via standard HTTP calls. While this by itself doesn’t seem to be all that useful (and admittedly, for most network engineers, adds complexity to their workflow), the real benefit comes from the way in which NX-API interacts with the device’s state.

Take this simple example of a manual CLI configuration:

switch1# conf t
 switch1(config)# vlan 100
 switch1(config-vlan)# name MyVlan100
 switch1(config-vlan)# interface Vlan100
 switch1(config-if)# ip address 10.0.100.1/24
 switch1(config-if)# no ip redirects
 switch1(config-if)# ip arp timeout 300
 switch1(config-if)# int Ethernet1/4
 switch1(config-if)# switchport mode trunk
 switch1(config-if)# switchport trunk allowed vlan 100

How would you typically verify this config was applied?

show vlan id 100
show interface Vlan100
show interface Ethernet1/4

All good so far. Now, imagine you had to verify that this configuration was active on 5 devices. What about 50 devices? You can see how this gets unruly extremely quickly. Unfortunately, this has been the reality for most network engineers for decades, as our vendors have not exposed interfaces that return this data in a structured manner.

Let’s say you had 36 Cisco Nexus 9K’s on which you wanted to verify this exact configuration. With NX-API, you use these same commands and get something back like this:

{ 'result': 
 [
  { 'body': 
   [
    { 'TABLE_VLAN':
     [
      { 'ROW_VLAN':
       [ 
         { 'vlan_id': 100,
           'vlan_name': 'MyVlan100',
           'vlan_state': 'active' } ],
            ... 
          }
       ]
      }
    ]
   }
 ]
}

This particular example shows that same “show vlan id 100” command but formatted in JSON. NX-API takes most common CLI commands and puts wrappers on the individual components of the data that is typically shown in the CLI output. This allows you, using a scripting language, to drill down to exactly the data you want and present it in a deterministic fashion. For example, ‘vlan_id’ will always return an integer in the range 1-4094, ‘vlan_name’ will return a string of characters representing the name of the VLAN, etc. This means that you can perform basic operations such as checking that the VLAN ID is greater than or less than a given number, that the VLAN name conforms to some pattern that you set for your naming conventions, or that the VLAN is even active at all. Here’s how you might use it in a script:

import requests
import json

url='http://YOURIP/ins'
switchuser='USERID'
switchpassword='PASSWORD'

myheaders={'content-type':'application/json-rpc'}
payload=[
  {
    "jsonrpc": "2.0",
    "method": "cli",
    "params": {
      "cmd": "show interface",
      "version": 1
    },
    "id": 1
  }
]
response = requests.post(url,data=json.dumps(payload), headers=myheaders,
                         auth=(switchuser,switchpassword)).json()

The above code snippet is directly taken from NX-API Sandbox.
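
The Sandbox snippet carries a single command. Here is a minimal sketch of batching the three verification commands from earlier into one request body. The helper name build_payload is my own, and I’m assuming (based on the list-shaped payload the Sandbox generates) that NX-API accepts one JSON-RPC object per command with incrementing ids.

```python
import json


def build_payload(commands):
    """Return a list of JSON-RPC request objects, one per CLI command.

    Hypothetical helper: it mirrors the shape of the Sandbox payload above.
    """
    return [
        {
            "jsonrpc": "2.0",
            "method": "cli",
            "params": {"cmd": cmd, "version": 1},
            "id": index,
        }
        for index, cmd in enumerate(commands, start=1)
    ]


# The same show commands used for manual verification earlier in this post.
verify_commands = [
    "show vlan id 100",
    "show interface Vlan100",
    "show interface Ethernet1/4",
]

payload = build_payload(verify_commands)
print(json.dumps(payload, indent=2))
```

The resulting list can be passed to requests.post() in place of the single-command payload from the Sandbox snippet.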

Another nice feature of this API is the Cisco NX-API Sandbox. So long as you’ve added “feature nxapi” to the device configuration, you can open up your web browser and log straight into this sandbox (by simply opening http://switch_ip_address/). The Sandbox will also give you example code snippets for building the proper payload and sending the HTTP request in Python, which you can then put in your scripts. Check out Cisco’s Programmability Guide for a quick guide on using the Sandbox.

Once you have a few working scripts that are using the API, you can even send configuration commands using similar HTTP calls. I’ll cover this more in a later post.

Consistent Data

With an API such as this available, you can combine your CLI-fu with some basic knowledge of Python data types to interact with a device in interesting ways. For example, if you pull the list of BGP peers via “show ip bgp summary”, you can loop through the returned JSON data, which is essentially a list of dictionaries, to pull out the data you’re looking for. That is what you see in the for loop in the code sample below.

for item in result['key1']['key2']['key3']['list_of_peers']:
    # Whenever we loop over this particular list of dictionaries ('list_of_peers'),
    # the dictionary we get back will almost always have the same set of keys to access
    print(item['bgp_peer_ip'])
    print(item['bgp_asn'])
    print(item['bgp_prefixes_received'])

Every time you send that command to any switch, you’re always going to receive the same nested list of key-value pairs, which you can loop over quickly in Python to pull out only the data you care about. No longer do you need to SSH to a device (either manually or via an SSH client library), authenticate, execute the desired command and scrape through the output (again, either by eye in a terminal window or by parsing it as a giant string in your interpreter).
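
Here’s a minimal sketch of that idea using the VLAN example from earlier. The responses are canned data shaped like the loose sample in this post, not real NX-API output, and the helper names (make_response, vlan_rows, vlan_is_ok) and switch names are made up for illustration.

```python
def make_response(vlan_id, vlan_name, vlan_state):
    """Build a canned response shaped like the loose sample in this post."""
    row = {"vlan_id": vlan_id, "vlan_name": vlan_name, "vlan_state": vlan_state}
    return {"result": [{"body": [{"TABLE_VLAN": [{"ROW_VLAN": [row]}]}]}]}


def vlan_rows(response):
    """Yield every VLAN dictionary found in one switch's response."""
    for result in response["result"]:
        for body in result["body"]:
            for table in body["TABLE_VLAN"]:
                yield from table["ROW_VLAN"]


def vlan_is_ok(response, vlan_id, name_prefix):
    """True if the VLAN exists, follows the naming convention and is active."""
    return any(
        row["vlan_id"] == vlan_id
        and row["vlan_name"].startswith(name_prefix)
        and row["vlan_state"] == "active"
        for row in vlan_rows(response)
    )


# Hypothetical switches: one good, one suspended VLAN, one badly named VLAN.
responses = {
    "switch1": make_response(100, "MyVlan100", "active"),
    "switch2": make_response(100, "MyVlan100", "suspended"),
    "switch3": make_response(100, "Vlan0100", "active"),
}

failures = sorted(
    name for name, resp in responses.items()
    if not vlan_is_ok(resp, 100, "MyVlan")
)
print(failures)  # prints ['switch2', 'switch3']
```

Because every switch returns the same structure, the same three small functions work whether the dictionary holds 3 responses or 36.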

Conclusion

If you get overwhelmed by the code above and all the fancy brackets, don’t worry. A good starting point would be to go through a beginner Python course such as LPTHW and familiarize yourself with variable types (such as integers, strings, lists, dictionaries, etc.) and how to interact with them. This can also be done in any language of your choice. All modern programming languages and interpreters have the necessary libraries (typically in their standard libraries) for sending HTTP/1.1 requests to an endpoint and parsing well-known notations such as JSON or XML.

When you combine that with what you already know about using network devices every day, you can start automating your repetitive, boring and simple day-to-day tasks and/or troubleshooting. It can be extended to everything from gathering routing & interface statistics to parsing MAC address tables to verifying SNMP community strings. The possibilities truly are endless.

In my next post, I’ll focus on taking another step in using tools such as Jinja2 combined with NX-API to generate plain-text configuration files, which can help build your configs in an error-free & consistent way.

BGP Aggregate Addresses

This month I am studying in preparation for my CCIP BGP+MPLS exam (booked June 27th). I decided to go through with the CCIP certification, despite my annoyance with the new Service Provider track, because the BGP, MPLS and QoS topics are covered at length on the CCIE Routing & Switching blueprints. I figure this is a good bridge to fill in the gaps that CCNP R&S leaves out (in particular MPLS and QoS, which aren’t covered at all in any of the CCNP blueprints).

As such, I’ve been able to dive into all the knobs and switches that BGP offers to control routing policy. For those who have gone through the newer CCNP R&S track, the BGP fundamentals are explained and covered enough to get engineers familiarized with its operations. There’s a lot of depth lacking in CCNP and for good reason…BGP can be a career in and of itself. In service provider environments, when you’re pulling half a million IPv4 routes from upstream peers and providing L3VPN services to your customers via MPLS, you need a protocol like BGP that can scale.

Route summarization, when half a million routes are available on the global Internet table, can help keep specific and unnecessary routes from propagating out to upstream providers and thus reduce the memory and CPU required for carrying these thousands of routes. To summarize a set of routes in BGP, you have a few options:

  • Manual static Null0 routes advertised in BGP
  • aggregate-address command

Let’s look at a scenario. This is taken out of a BGP topology I’ve been working on this week to help me gain a better understanding of some of the more advanced BGP topics.

Subnets*:

  • 100.100.255.0/24 for all CE-facing Point-to-Point links
  • 100.100.254.0/24 for BGP update source loopbacks
  • 100.100.253.0/24 for all inter-AS Point-to-Point links
  • 100.100.200.0/24 allocated to Enterprise A from this ISP
  • 100.100.0.0/16 allocated to ISP from registry

*Note: this is my best guess of how an ISP would assign addressing in its network. Being an enterprise guy, I’ve yet to be exposed to any service provider network. If you have more experience and spot anything that needs correcting, please let me know in the comments below 🙂

In this topology, we have one route reflector “RR” with IBGP running between RR and all the PE routers (just PE1 and PE2 in this case).
We want to aggregate all of the ISP’s routes to advertise upstream to Upstream SP at AS 200.

Below is the BGP RIB on our route reflector before any aggregation. These are all the routes advertised by the PE routers as well as any allocations given to customers who require more than a single address.

RR#sh ip bgp
BGP table version is 31, local router ID is 100.100.254.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
* i100.100.200.0/24 100.100.254.3            0    100      0 65501 i
*>i                 100.100.254.1            0    100      0 65501 i
*>i100.100.255.0/31 100.100.254.1            0    100      0 i
*>i100.100.255.8/31 100.100.254.3            0    100      0 i
!
!
AS200#sh ip bgp
BGP table version is 37, local router ID is 100.100.253.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.200.0/24 100.100.253.1                          0 400 i
*> 100.100.255.0/31 100.100.253.1                          0 400 i
*> 100.100.255.8/31 100.100.253.1                          0 400 i

As you can see, in a huge service provider network, the BGP RIB would be filled with any public IP addresses used to connect its customers to the outside world, as well as any allocations given by this ISP to its larger customers (such as Enterprise A in this case, which is dual-homed at PE1 and PE2). Also included is the BGP RIB of the upstream AS 200 router, which receives these specific prefixes from AS 400.

Now let’s reduce the routing table by aggregating them into a summarized route. First, we’ll start by adding in a static route to the Null0 interface and advertise it in BGP:

! On RR:
conf t
 ip route 100.100.0.0 255.255.0.0 Null0
!
router bgp 400
 network 100.100.0.0 mask 255.255.0.0
!
RR#sh ip ro static
100.0.0.0/8 is variably subnetted, 11 subnets, 4 masks
  S 100.100.0.0/16 is directly connected, Null0
RR#sh ip bgp
BGP table version is 25, local router ID is 100.100.254.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
          r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.0.0/16   0.0.0.0                            32768 i
*>i100.100.255.0/31 100.100.254.1            0    100      0 i
*>i100.100.255.8/31 100.100.254.3            0    100      0 i

And to verify on the upstream AS:

AS200#sh ip bgp
BGP table version is 31, local router ID is 100.100.253.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
          r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.0.0/16   100.100.253.1            0             0 400 i
*> 100.100.200.0/24 100.100.253.1                          0 400 i
*> 100.100.255.0/31 100.100.253.1                          0 400 i
*> 100.100.255.8/31 100.100.253.1                          0 400 i

Since we’re still advertising the more specific routes inside the ISP AS 400, manual filtering will be required on the route reflector. This can be accomplished by a simple prefix list or route-map on RR.

! on RR
ip prefix-list OurAlloc permit 100.100.0.0/16 
!
! Match only our allocated address space
!
router bgp 400
 neighbor 100.100.253.0 prefix-list OurAlloc out

The problem with this approach is that, while it is fairly simple, it does require you to manually filter any more-specific routes on the edge of your AS. Also, if you are serving multihomed customers with their own address allocation (independent of this ISP’s allocation), you will have to take those into account as well in your filtering.
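
To make the filtering behaviour concrete, here’s a rough Python sketch (not router code) of how an exact-match prefix list treats outbound routes. I’m assuming the usual behaviour that an entry without ge/le matches only that exact prefix and that anything not permitted is implicitly denied. The function name permitted_out and the customer prefix 203.0.113.0/24 are made up for illustration.

```python
import ipaddress


def permitted_out(prefix, prefix_list):
    """Exact match only, like a prefix-list entry with no ge/le keywords."""
    return ipaddress.ip_network(prefix) in prefix_list


# Equivalent of: ip prefix-list OurAlloc permit 100.100.0.0/16
our_alloc = {ipaddress.ip_network("100.100.0.0/16")}

# Routes RR would otherwise send upstream, plus a hypothetical
# customer-owned prefix that is not part of the ISP allocation.
candidates = [
    "100.100.0.0/16",
    "100.100.200.0/24",
    "100.100.255.0/31",
    "100.100.255.8/31",
    "203.0.113.0/24",
]

advertised = [p for p in candidates if permitted_out(p, our_alloc)]
print(advertised)  # prints ['100.100.0.0/16']

# The customer's own allocation is dropped too unless it is added by hand.
with_customer = our_alloc | {ipaddress.ip_network("203.0.113.0/24")}
advertised_with_customer = [p for p in candidates if permitted_out(p, with_customer)]
print(advertised_with_customer)  # prints ['100.100.0.0/16', '203.0.113.0/24']
```

The second list shows the maintenance burden described above: every independently allocated customer prefix needs its own entry.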

The other way to aggregate a set of routes in BGP is through the aggregate-address command. This command not only creates a Null0 route for the aggregate automatically but, when the summary-only keyword is added, also suppresses the more-specific routes so that they are no longer advertised. With summary-only configured, upstream peers will only receive the aggregated route and not the individual more-specific prefixes.

! on RR
router bgp 400
 aggregate-address 100.100.0.0 255.255.0.0 summary-only
!
RR#sh ip bgp
BGP table version is 31, local router ID is 100.100.254.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
          r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.0.0/16   0.0.0.0                            32768 i
s i100.100.200.0/24 100.100.254.3            0    100      0 65501 i
s>i                 100.100.254.1            0    100      0 65501 i
s>i100.100.255.0/31 100.100.254.1            0    100      0 i
s>i100.100.255.8/31 100.100.254.3            0    100      0 i

AS200#sh ip bgp
BGP table version is 37, local router ID is 100.100.253.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
          r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 100.100.0.0/16   100.100.253.1            0             0 400 i

As you can see, after using the aggregate-address command on AS 400 RR, only our configured summarized address is advertised out to AS200. You can also see the suppressed routes in RR’s BGP RIB, since we used the “summary-only” parameter in the aggregate-address command. All more-specific routes are suppressed from being advertised to BGP peers thus reducing what used to be many routes to just what’s configured.
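
If you want to sanity-check which prefixes an aggregate would cover before configuring it, Python’s ipaddress module can do the containment test. This is only a sketch of the arithmetic, not of how IOS implements suppression. The prefixes come from the tables above, except 203.0.113.0/24, which is a made-up customer-owned prefix, and split_by_aggregate is my own helper name.

```python
import ipaddress


def split_by_aggregate(prefixes, aggregate):
    """Split prefixes into those covered by the aggregate and the rest."""
    covered, untouched = [], []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        # A more-specific route is a strict subnet of the aggregate.
        if network != aggregate and network.subnet_of(aggregate):
            covered.append(prefix)
        else:
            untouched.append(prefix)
    return covered, untouched


aggregate = ipaddress.ip_network("100.100.0.0/16")
rib = [
    "100.100.200.0/24",
    "100.100.255.0/31",
    "100.100.255.8/31",
    "203.0.113.0/24",  # hypothetical prefix outside the ISP allocation
]

suppressed, still_advertised = split_by_aggregate(rib, aggregate)
print(suppressed)        # prints ['100.100.200.0/24', '100.100.255.0/31', '100.100.255.8/31']
print(still_advertised)  # prints ['203.0.113.0/24']
```

The first list matches the entries marked “s” in RR’s table; anything in the second list would still need its own advertisement or filter.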

Aggregation, combined with proper filtering, should be performed wherever and whenever possible. As of today, CIDR Report indicates over 410,000 routes exist in the global table. With aggregation (as estimated by CIDR Report), as much as half of all the routes in existence today can be aggregated.

OSPF Area ID’s – Dotted or Decimal?

I had a thought the other day while at work regarding OSPF area numbers in Cisco IOS.

Having been brought up on Cisco Net Academy in college, Cisco always teaches the following command syntax:

R1(config)#router ospf 1
R1(config-router)#network 10.1.1.0 0.0.0.255 area 0

In fact, the vast majority of Cisco documentation and command references express OSPF Area ID’s as decimal numbers.
One thing you probably want to keep in mind is what is actually transmitted on the wire.
As an experiment, I used area ID’s of 1000 and 2000 on two directly connected routers, and used both formats for configuring the areas.

On R1
router ospf 1
router-id 1.1.1.1
log-adjacency-changes
network 10.0.0.0 0.0.0.3 area 0

On R2
router ospf 1
router-id 2.2.2.2
log-adjacency-changes
network 10.0.0.0 0.0.0.3 area 0.0.0.0

As expected, the OSPF neighbors came up, since an area configured as decimal 0 is sent as dotted decimal 0.0.0.0 in OSPF packets regardless.

R1#sh ip os int fa1/0
FastEthernet1/0 is up, line protocol is up
Internet Address 10.0.0.1/30, Area 0
Process ID 1, Router ID 1.1.1.1, Network Type BROADCAST, Cost: 1
Transmit Delay is 1 sec, State BDR, Priority 1
Designated Router (ID) 2.2.2.2, Interface address 10.0.0.2
Backup Designated router (ID) 1.1.1.1, Interface address 10.0.0.1
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
truncated...

R2#sh ip osp int fa1/0
FastEthernet1/0 is up, line protocol is up
Internet Address 10.0.0.2/30, Area 0.0.0.0
Process ID 1, Router ID 2.2.2.2, Network Type BROADCAST, Cost: 1
Transmit Delay is 1 sec, State DR, Priority 1
Designated Router (ID) 2.2.2.2, Interface address 10.0.0.2
Backup Designated router (ID) 1.1.1.1, Interface address 10.0.0.1
Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
truncated...

Now that I have these routers talking to each other over the backbone area, let’s see how LSA’s for the loopback areas (1000 and 2000) look when configured with decimals under Cisco IOS:


R1(config-if)#do sh run int fa1/1
Building configuration...

Current configuration : 116 bytes
!
interface FastEthernet1/1
ip address 10.10.1.1 255.255.255.0
ip ospf 1 area 1000

R2(config-if)#do sh run int fa1/1
Building configuration...

Current configuration : 116 bytes
!
interface FastEthernet1/1
ip address 10.20.1.1 255.255.255.0
ip ospf 1 area 2000

And the captures:

R1 OSPF Hello on Area 1000
R2 OSPF Hello on Area 2000

As we can see, even though we use decimal notation, OSPF packets are always sent using dotted decimal.
The reason is fairly simple: it’s just a matter of binary conversion of decimal numbers to dotted decimal format.
For example, R1 was configured on Area ID 1000.

Decimal 1000 in binary = 0011 1110 1000
Expanding that out to dotted decimal, we have the following in binary = 0000 0000.0000 0000.0000 0011.1110 1000

Converting it back to dotted decimal, we see in the hellos that OSPF Area ID 1000 = Area ID 0.0.3.232.

We can do the same for R2 on Area ID 2000.

Decimal 2000 = 0111 1101 0000 binary
32-bit binary 0000 0000.0000 0000.0000 0111.1101 0000 = 0.0.7.208

So regardless of which way you configure your area ID’s, OSPF will always transmit using dotted decimal format.
While it is easy to maintain and read decimal base-10 on small OSPF networks, an area ID greater than 255 will result in the next 8-bit octet incrementing as your area ID numbers increase (i.e. Area ID 256 = 0.0.1.0 in dotted decimal). While 1000 and 2000 are nice numbers in a running config, they will always be transmitted in dotted decimal.
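
If you’d rather not do the binary conversion by hand, here’s a small Python sketch. It leans on the fact that an area ID is a 32-bit value just like an IPv4 address, so the standard library’s ipaddress module can render it. The function names are my own.

```python
import ipaddress


def area_to_dotted(area_id):
    """Render a decimal OSPF area ID in 32-bit dotted-decimal form."""
    return str(ipaddress.IPv4Address(area_id))


def dotted_to_area(dotted):
    """Convert a dotted-decimal OSPF area ID back to a decimal number."""
    return int(ipaddress.IPv4Address(dotted))


for area in (0, 256, 1000, 2000):
    print(area, "=", area_to_dotted(area))

# Output:
# 0 = 0.0.0.0
# 256 = 0.0.1.0
# 1000 = 0.0.3.232
# 2000 = 0.0.7.208
```

This matches the area IDs seen in the hellos from R1 and R2.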

Configuration differences in multi-vendor networks

One of the challenges of working in a multi-vendor environment is trying to keep track of all the configuration differences. Having all the guides handy helps when you have to look up a command or two, but putting together a whole end-to-end configuration for a vendor you don’t normally work with can be challenging.

I spent this week training a new employee on both Brocade FastIron/NetIron and BNT isCLI, so I got a chance to refresh my non-Cisco vendor configuration. It can get pretty insane having to jump between the different OS’s so I’ll be making periodic posts (such as this one) for both my own reference and to give some exposure to the various CLI’s that aren’t as widespread as Cisco IOS.

Brocade (formerly Foundry Networks) kit looks pretty innocent at first and is very IOS-like:

FastIron>en
No password has been assigned yet...
FastIron#conf t
FastIron(config)#
FastIron(config)#enable super-user-password br0cade
FastIron(config)#int ethernet 1/1/1
FastIron(config-if-e1000-1/1/1)#

…However, Brocade FastIron has a different way of configuring your basic Layer 2 settings:

FastIron(config-if-e1000-1/1/1)#switchport
Unrecognized command
FastIron(config-if-e1000-1/1/1)#exit
FastIron(config)#vlan 100
FastIron(config-vlan-100)#tagg
tagged                     802.1Q tagged port
FastIron(config-vlan-100)#tagged eth 1/1/1
Added tagged port(s) ethe 1/1/1 to port-vlan 100.

This is the equivalent of Cisco’s “switchport mode trunk” followed by “switchport trunk allowed vlan 1,100”.
In my opinion, Brocade has a much more sane way of configuring your L2. You can also use a range of ports (tagged eth 1/1/1 to 1/1/48 eth...), so it doesn’t suffer the drawback of having to go into each interface to assign a VLAN. To configure an “access switchport”, you add ports to the VLAN as “untagged”.
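
To show the difference in approach side by side, here’s a rough Python sketch that emits the two styles of config using only the commands shown in this post. The function names and the Cisco interface name GigabitEthernet0/1 are made up, and the VLAN lists are passed in explicitly rather than derived, so treat it as an illustration of interface-centric versus VLAN-centric config and not as a converter.

```python
def cisco_trunk(interface, vlans):
    """Interface-centric: enter the port, then list the VLANs it carries."""
    vlan_list = ",".join(str(vlan) for vlan in vlans)
    return [
        f"interface {interface}",
        " switchport mode trunk",
        f" switchport trunk allowed vlan {vlan_list}",
    ]


def brocade_tagged(port, vlans):
    """VLAN-centric: enter each VLAN, then add the port to it as tagged."""
    lines = []
    for vlan in vlans:
        lines.append(f"vlan {vlan}")
        lines.append(f" tagged eth {port}")
    return lines


# Hypothetical interface name on the Cisco side; port 1/1/1 as in the example above.
cisco_lines = cisco_trunk("GigabitEthernet0/1", [1, 100])
brocade_lines = brocade_tagged("1/1/1", [100])

print("\n".join(cisco_lines))
print("\n".join(brocade_lines))
```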

Some other Layer 2 features in Brocade:

Rapid Spanning Tree
FastIron(config-vlan)#spanning-tree 802-1w
FastIron(config-vlan)#spanning-tree 802-1w priority 4096
FastIron(config-vlan)#spanning-tree 802-1w admin-pt2pt-mac

Voice VLANs
FastIron(config)#vlan 110 name voice
FastIron(config-vlan-110)#tagged eth 1/1/20
Added tagged port(s) ethe 1/1/20 to port-vlan 110.
FastIron(config-vlan-110)#vlan 120 name data
FastIron(config-vlan-120)#tagged eth 1/1/20
Added tagged port(s) ethe 1/1/20 to port-vlan 120.
FastIron(config-vlan-120)#int eth 1/1/20
FastIron(config-if-e1000-1/1/20)#dual-mode 120

Virtual Router Interface (SVI)
FastIron(config)#vlan 100
FastIron(config-vlan-100)#router-interface ve 1
FastIron(config-vlan-100)#int ve 1
FastIron(config-vif-1)#ip address 10.10.10.1/24

As you can see, the FastIron CLI (this also applies to NetIron) is very much IOS-like, so the learning curve between IOS and Brocade is pretty minimal. Some other differences include being able to use any “show” command at any level of the CLI hierarchy and the use of “enable”/”disable” commands to bring interfaces up or down (instead of your “no shut”/”shut”).

Since this post is looking a bit long in the tooth, I’ll leave the BNT configs for next time. 🙂

Docs used: Brocade FastIron 07.3.00 Configuration Guide