(SOLVED) Problem upgrading from NX-OS 7.0(3)I7(5a) to 7.0(3)I7(7) on Nexus 3000 switches for a NTAP Cluster

Last week I heard a lot about the big Cisco CDP bug. I offered to help a customer upgrade their Nexus 3132Q-V switches from NX-OS 7.0(3)I7(5a) to 7.0(3)I7(7) and then apply the Software Maintenance Update (SMU) that fixes the CDP bug. I have done plenty of NX-OS upgrades, but this was the first time I got stuck, and it was all due to a bug. Here we go:

Before upgrading anything, you should always consult the relevant locations for upgrade advice. These include NetApp’s Active IQ (login required, for ONTAP upgrade advice) and the (old) Software Download Page (login required; the Cluster Network switch details are not on the new location as of this post), where you select your switch brand (Broadcom, Cisco, NetApp) in the pull-down menu next to Cluster Network/Management Switches. These pages show the compatibility matrices for ONTAP, switch OS versions and Reference Configuration File (RCF) versions.

In our case, we are using ONTAP 9.6P5 on the Cluster and the Nexus 3132Q-V switches were running 7.0(3)I7(5a). According to Cisco, the CDP bug above is fixed by upgrading to 7.0(3)I7(7) and then applying the SMU.
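
When several switches need checking, comparing version strings by eye gets tedious. Here is a minimal Python sketch that compares two NX-OS version strings. It assumes the strings always have the shape seen in this post, such as 7.0(3)I7(5a); the function names are my own and other release trains would need a different pattern.

```python
import re

# Matches only the shape seen in this post, e.g. "7.0(3)I7(5a)".
# Assumption: other NX-OS release trains may not fit this pattern.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)I(\d+)\((\d+)([a-z]?)\)$")


def parse_nxos_version(text):
    """Turn '7.0(3)I7(5a)' into a tuple that sorts in release order."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, maint, train, rebuild, suffix = match.groups()
    # The trailing letter (if any) is kept as a string; '' sorts before 'a'.
    return (int(major), int(minor), int(maint), int(train), int(rebuild), suffix)


def needs_upgrade(running, target):
    """Return True when the running version is older than the target."""
    return parse_nxos_version(running) < parse_nxos_version(target)


if __name__ == "__main__":
    print(needs_upgrade("7.0(3)I7(5a)", "7.0(3)I7(7)"))
```

This only tells you whether a switch is behind the target release. Whether that release is supported with your ONTAP and RCF versions still has to come from the compatibility pages mentioned above.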

We followed directions from NetApp to do the upgrade. There are a number of verification steps and actions done from ONTAP that essentially force the Cluster Network onto one switch. At that point the unused switch is upgraded and brought back into service. This post picks up at that point in the docs.

First we copy the new code to the switch:

ntapclus-sw1# copy http://laptop:8123/nxos.7.0.3.I7.7.bin bootflash:///nxos.7.0.3.I7.7.bin vrf management

This completed in about 3-4 minutes (nearly a 1GB file). The next step is to simply install the new NX-OS code:

ntapclus-sw1# install all nxos bootflash:///nxos.7.0.3.I7.7.bin
Installer will perform compatibility check first. Please wait.
Installer is forced disruptive

Verifying image bootflash:/nxos.7.0.3.I7.7.bin for boot variable "nxos".
[################## ] 86% -- FAIL.
Return code 0x40450030 (Digital signature verification failed).
Pre-upgrade check failed. Return code 0x40930011 (Image verification failed).
ntapclus-sw1#

Huh? What is this "Digital signature verification failed"? I tried multiple times, and it always failed at either 85% or 86%. So for due diligence, I decided to check the MD5SUM:

ntapclus-sw1# sho file bootflash:///nxos.7.0.3.I7.7.bin md5sum
a9d40fbfaf43c214c3d97cb290788d06
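
As a cross-check on the laptop side, the same MD5 can be computed before the image is ever copied to the switch. Here is a minimal Python sketch using only the standard library. The function names are illustrative, and the expected value is whatever Cisco publishes on the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so a ~1GB image does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare the local file's MD5 with the value from the vendor page."""
    return md5_of_file(path) == published_md5.strip().lower()
```

Calling matches_published("nxos.7.0.3.I7.7.bin", "a9d40fbfaf43c214c3d97cb290788d06") on the laptop should agree with what the switch reports, assuming the file was downloaded intact.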

Well, the checksum on the switch matches the one on Cisco’s website exactly, so I know the code downloaded to my laptop fine and subsequently transferred to the switch without error. Off to search on Google. Within a few minutes, while it was not exactly what I was hoping for, I found a Cisco defect (CSCvm37015, Cisco login required!) indicating that this problem exists in NX-OS 7.0(3)I7(5a). The workaround is to disable digital image signature verification. This is easily done:

ntapclus-sw1# configure
Enter configuration commands, one per line. End with CNTL/Z.
ntapclus-sw1(config)# no feature signature-verification
WARNING: This will disable digital image signature verification for all NxOS software attempted to be installed using any install method.
Are you sure you want to continue? (y/n) : [n] y
WARNING: Image Signature Verification has been Disabled!
ntapclus-sw1(config)# end
ntapclus-sw1# copy run st
[########################################] 100%
Copy complete, now saving to disk (please wait)…
Copy complete.

Ok, I tried installing NX-OS 7.0(3)I7(7) again to see what would happen:

ntapclus-sw1# install all nxos bootflash:nxos.7.0.3.I7.7.bin
Installer will perform compatibility check first. Please wait.
Installer is forced disruptive
Verifying image bootflash:/nxos.7.0.3.I7.7.bin for boot variable "nxos".
[####################] 100% -- SUCCESS

Verifying image type.
[####################] 100% -- SUCCESS -- SUCCESS

Preparing "nxos" version info using image bootflash:/nxos.7.0.3.I7.7.bin.
[####################] 100% -- SUCCESS

Preparing "bios" version info using image bootflash:/nxos.7.0.3.I7.7.bin.
[####################] 100% -- SUCCESS -- SUCCESS

Performing module support checks. -- SUCCESS

Notifying services about system upgrade. -- SUCCESS

Compatibility check is done:
Module bootable         Impact Install-type Reason
------ -------- -------------- ------------ ------
     1      yes     disruptive        reset default upgrade is not hitless

Images will be upgraded according to following table:
Module      Image Running-Version(pri:alt)        New-Version Upg-Required
------ ---------- ------------------------ ------------------ ------------
     1       nxos             7.0(3)I7(5a)        7.0(3)I7(7)          yes
     1       bios       v04.24(04/21/2016) v04.24(04/21/2016)           no

Switch will be reloaded for disruptive upgrade.    
Do you want to continue with the installation (y/n)? [n] y

Install is in progress, please wait.
Performing runtime checks. -- SUCCESS

Setting boot variables.
[####################] 100% -- SUCCESS

Performing configuration copy.
[####################] 100% -- SUCCESS

Module 1: Refreshing compact flash and upgrading bios/loader/bootrom.
Warning: please do not remove or power off the module at this time.
[####################] 100% -- SUCCESS

Finishing the upgrade, switch will reboot in 10 seconds.
ntapclus-sw1#
Network error: Software caused connection abort

Excellent! Looks like that solved the issue! I waited a few minutes for the switch to come back online. Then I logged in to re-enable digital image signature verification:

ntapclus-sw1# conf
Enter configuration commands, one per line. End with CNTL/Z.
ntapclus-sw1(config)# feature signature-verification
ntapclus-sw1(config)# end
ntapclus-sw1# copy run startup-config
[########################################] 100%
Copy complete, now saving to disk (please wait)…

And then apply the SMU:

ntapclus-sw1# install add bootflash:nxos.CSCvr09175-n9k_ALL-1.0.0-7.0.3.I7.7.lib32_n9000.rpm activate
Adding the patch (/nxos.CSCvr09175-n9k_ALL-1.0.0-7.0.3.I7.7.lib32_n9000.rpm)
[####################] 100%
Install operation 1 completed successfully at Fri Feb 14 15:27:40 2020

Activating the patch (/nxos.CSCvr09175-n9k_ALL-1.0.0-7.0.3.I7.7.lib32_n9000.rpm)
[####################] 100%
Install operation 2 completed successfully at Fri Feb 14 15:27:50 2020

ntapclus-sw1# install commit
[####################] 100%
Install operation 3 completed successfully at Fri Feb 14 15:27:55 2020

ntapclus-sw1#

After that finished, we repeated the process (using the NetApp docs from above) to return to normal operations and then shift the Cluster LIFs to the other switch. This allows the second switch to be updated as well.
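
Since the whole procedure is repeated on the second switch, it can help to capture the terminal session to a log file and scan it afterwards. Here is a minimal Python sketch that looks for the two kinds of lines seen in this post: the "Return code 0x... (...)" failures and the "Install operation N completed successfully" confirmations. The function names are my own, and the patterns are based only on the output shown above, so other NX-OS releases may word these lines differently.

```python
import re

# Patterns taken from the transcripts in this post (assumption: the wording
# may differ on other NX-OS releases).
RETURN_CODE_RE = re.compile(r"Return code (0x[0-9A-Fa-f]+) \((.+?)\)\.")
INSTALL_OP_RE = re.compile(r"Install operation (\d+) completed successfully")


def find_failures(transcript):
    """Return (code, reason) pairs for every 'Return code' line found."""
    return RETURN_CODE_RE.findall(transcript)


def completed_operations(transcript):
    """Return the numbers of the install operations reported as successful."""
    return [int(number) for number in INSTALL_OP_RE.findall(transcript)]


if __name__ == "__main__":
    sample = (
        "Return code 0x40450030 (Digital signature verification failed).\n"
        "Pre-upgrade check failed. Return code 0x40930011 "
        "(Image verification failed).\n"
    )
    for code, reason in find_failures(sample):
        print(code, reason)
```

An empty result from find_failures together with operations 1, 2 and 3 from completed_operations would correspond to the clean SMU run shown above.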
