I often work remotely on customers’ infrastructures with their remote hands on-site. When a small office or branch changes ISPs or IP blocks, I occasionally find myself in a position where I have to change the only public IP address of a device like a branch office router or firewall, with no out-of-band management. The trouble with this is fairly obvious (on a Cisco device): by changing the IP address via which I am accessing the device over SSH, I will lose my own management session to it. Once the management session is lost, I can’t update the default route, and now the device is broken and I get to walk the on-site hands (who are often not very Cisco-literate) through changing a default route.
There are, of course, several ways to avoid this situation altogether:
- Have out-of-band access using a 3G/4G/LTE-connected terminal server (I wrote about one of these before)
- Use a remote app like GetConsole so the remote hands can give me out-of-band console access from their smartphone
- Use something with a proper commit/rollback mechanism like a Juniper device
- Dial-up modem to the AUX port!
Clearly, from the list above, there are means to avoid this situation but I don’t always have access to one of them. That’s life in a VAR or consulting situation where you have to play the hand that’s dealt and sometimes you just can’t get OOB management where you need it when you need it. I have also tried various other tricks over the years including uploading a text file to flash with a small command set like:
    interface GigabitEthernet0/0
     ip address 192.0.2.50 255.255.255.248
    no ip route 0.0.0.0 0.0.0.0 188.8.131.52
    ip route 0.0.0.0 0.0.0.0 192.0.2.49
And then applying this by doing a “copy flash:/change.txt running-config”. Results vary. Sometimes that has worked fine. Other times it has done scary things like locking the router’s VTY lines for a few minutes before either aborting or, possibly, completing the change. Ugh.

Another option is what I’ve always known as a “Hail Mary”: uploading a new startup-config with the modified values and praying hard during the reboot that you didn’t screw the config up to the point that it is rejected. This is even worse.

So recently I got to thinking about whether any built-in automation tools could help. Surely, if one had an on-site configuration management tool of some flavor, a change could be scheduled to occur when needed. But again, my clients don’t always have such things available. The next thought I had was to use Cisco’s Embedded Event Manager (EEM) tool. EEM might just be the ticket, as scripts can be run on a trigger or on demand, do not have to block on CLI input or output, and can execute just about any command available in IOS along with some surrounding logic.
To see how EEM can help with this remote IP change, here is the basic example network I’m using:
So, our router R1 is initially connected to ISP1, with the link addressed as 192.0.2.0/24. We will be migrating the Fa0/0 link of R1 over to ISP2, where it will now be addressed as 184.108.40.206/24 (apologies to AS7018…). The interface is FastEth0/0 in both cases — we are representing a hard cutover here, where an on-site helper moves the cable from the old ISP to the new ISP at our signal.
First, we establish that R1 can, in fact, talk to ISP1:
    R1#sh run int fa0/0
    Building configuration...
    Current configuration : 93 bytes
    !
    interface FastEthernet0/0
     ip address 192.0.2.1 255.255.255.0
    end
    R1#ping 192.0.2.2
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 192.0.2.2, timeout is 2 seconds:
    .!!!!
    Success rate is 80 percent (4/5), round-trip min/avg/max = 4/17/40 ms
With that established, I took my first crack at an EEM script to make the IP change. Initially, I used the “timer countdown” event, but this was a poor idea: the countdown starts as soon as the applet is configured, so the event fires a fixed delay after configuration rather than at a moment of my choosing. I was looking for something that could be remotely triggered by the consultant (that’s me!) just before the wire move was to take place. I tried again using “event none”, which requires a manual invocation to start:
R1:

    event manager applet IP-CHANGE
     event none
     action 1.0 syslog priority alerts msg "IP change 30 second countdown initiated"
     action 1.1 wait 30
     action 2.0 syslog priority alerts msg "Proceeding with IP address change"
     action 3.0 cli command "enable"
     action 4.0 cli command "config term"
     action 5.0 cli command "interface fast0/0"
     action 6.0 cli command "ip addr 220.127.116.11 255.255.255.0"
     action 7.0 cli command "no ip route 0.0.0.0 0.0.0.0 192.0.2.2"
     action 8.0 cli command "ip route 0.0.0.0 0.0.0.0 18.104.22.168"
     action 9.9 syslog priority warnings msg "IP change complete."
When remotely running this script, my CLI would hang after issuing the “event manager run IP-CHANGE” command, and after 20 seconds would simply return to the CLI prompt. After some logging and debugging work, I caught this:
*Mar 27 18:21:17.379: EEM policy IP-CHANGE has exceeded it's elapsed time limit of 20.0 seconds
Ahh, my 30-second wait exceeded the applet’s default maximum run time of 20 seconds, a safeguard against runaway scripts. OK, after some more research and a couple of tweaks, I tried again:
R1:

    event manager applet IP-CHANGE
     event none sync no default 60 maxrun 60
     action 100 syslog priority alerts msg "IP change 30 second countdown initiated"
     action 110 wait 30
     action 120 syslog priority alerts msg "Proceeding with IP address change"
     action 130 cli command "enable"
     action 140 cli command "config term"
     action 150 cli command "interface fast0/0"
     action 160 cli command "ip addr 22.214.171.124 255.255.255.0"
     action 170 cli command "no ip route 0.0.0.0 0.0.0.0 192.0.2.2"
     action 180 cli command "ip route 0.0.0.0 0.0.0.0 126.96.36.199"
     action 200 syslog priority notifications msg "IP change complete."
Setting the “default” and “maxrun” limits to 60 seconds gives the script time to execute. Further, setting “sync no” on the event makes the applet non-blocking, as it runs asynchronously with the CLI. The CLI comes back right away after issuing the command to run the script. The drawback of this approach is that the script cannot “echo” anything to the CLI, since the running instance of the script is essentially detached from the CLI session that invoked it. However, the script works!
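Before trusting an applet like this on a production box, it seems worth a quick check that it registered with the settings you expect and, after a run, that it actually fired. A minimal sketch using two standard EEM show commands (output omitted here, since the exact format varies by IOS release):

```
! Confirm the applet is registered and review its event line and actions
R1#show event manager policy registered
! After a run, list the policies that were recently triggered
R1#show event manager history events
```

Because the applet runs detached from the CLI session, these commands and the syslog messages are the main ways to see what it did.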
R3 is a router deeper in my “Internet” that I was using to launch the EEM applet from. I wanted to get a realistic test in which the EEM applet was invoked by a machine that was connected remotely to the device being re-addressed over the interface that was being modified.
    R3#telnet 192.0.2.1
    Trying 192.0.2.1 ... Open
    R1#event man run IP-CHANGE
    R1#exit
    [Connection to 192.0.2.1 closed by foreign host]
    !------ Wait for the connection to be moved to ISP2
    R3#telnet 188.8.131.52
    Trying 184.108.40.206 ... Open
    R1#show run int fa0/0
    Building configuration...
    Current configuration : 94 bytes
    !
    interface FastEthernet0/0
     ip address 220.127.116.11 255.255.255.0
    end
    R1#show run | i ^ip route 0
    ip route 0.0.0.0 0.0.0.0 18.104.22.168
    R1#show log | i HA_EM
    *Mar 27 19:25:33.051: %HA_EM-1-LOG: IP-CHANGE: IP change 30 second countdown initiated
    *Mar 27 19:26:03.051: %HA_EM-1-LOG: IP-CHANGE: Proceeding with IP address change
    *Mar 27 19:26:03.423: %HA_EM-4-LOG: IP-CHANGE: IP change complete.
From the log entries and the fact that I could reach the router at its new IP, I know that my change worked. The problem now is that the script is pretty dumb. What if the new Internet connection to ISP2 has a problem of some sort? Or my on-site hands can’t find the right port? Or for some other reason we can’t complete the move? It would be nice to be able to automatically roll the change back if necessary. Next time, we’ll enhance this EEM applet to do just that!
I’ve used EEM in this way a few times myself, though your technique is a lot more fleshed out than mine. For configuration rollback, I use a two-stage process within the applet. The first stage is to issue a “reload in 5” command at the beginning, and the second is to use “configure terminal revert timer 2” to enter configuration mode. This gives me a two-minute window to verify and confirm the changes once the applet has run before the configuration is rolled back. In the event that the rollback fails, the reload is the final catch-all that will undo everything.
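Typed by hand rather than from inside an applet, that two-stage safety net might look roughly like the sketch below. It assumes the configuration archive is enabled, which the revert timer depends on; the archive path is a placeholder and not something taken from this comment.

```
! One-time prerequisite: enable the config archive (path is a placeholder)
archive
 path flash:rollback
!
! Stage 1: schedule a reload as the last-resort undo
R1#reload in 5
! Stage 2: enter config mode with a 2-minute automatic rollback
R1#configure terminal revert timer 2
! ... make the address and route changes, then exit config mode ...
! If the router is reachable at its new address, keep the change:
R1#configure confirm
R1#reload cancel
```

If neither “configure confirm” nor “reload cancel” is entered, the revert timer rolls the running config back first, and the reload later restores the saved startup-config, provided the change was never written to it. Note that “reload in” asks for confirmation, so inside an applet those prompts would have to be answered by the script.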
Great input. Funny, I’ve never really gotten into the habit of using Cisco’s rollback method. I really should. The more advanced method I have worked up does much fancier checking, but the “rollback” was still “manual” within the script. I sort of like your idea of having an emergency “reload in” as part of the script just in case things *really* go sideways and perhaps using the archive/revert function for a cleaner rollback than the manual method I am using now.
Thanks for reading and providing your experience/input!