Monthly Archives: March 2016

OnCommand Unified Manager and OnCommand Performance Manager -> Fully Integrated? Mostly.

I am working at a customer site on a residency just outside of Baltimore, MD. We have installed and implemented OnCommand Unified Manager 6.4RC1 (OCUM) and OnCommand Performance Manager 2.1RC1 (OCPM) using the Full Integration feature available in these two products from this release onward. We used the vApp/ESXi versions here, but I suspect other variants will produce similar results.

After the installation, we determined that the email address for the “admin” account needed to change. I figured I would just go into the GUI and modify the admin email address.

After doing this, everything OCUM-related picked up the update. We also verified this on the OCUM maintenance console:

root@OCUM:/home/diag# mysql -e " select id,name,emailAddress from ocum.authorizationunit;"
| id   | name         | emailAddress         |
|    1 | admin        | |
|    2 | ocpm         |     |
|    3 | Cloud-Admins | NULL                 |
|    4 | RAD-NetOps   | NULL                 |
|    6 | RAD-Archive  | NULL                 |
|    7 | tmccar14     |      |
|  100 | cliadmin     |  |
| 1001 | tmac         | NULL                 |

When we ran the equivalent query on OCPM, the admin account had not picked up the change:

root@OCPM:/home/diag# mysql -e " select id,name,emailAddress from ocf.authorizationunit;"
| id   | name  | emailAddress      |
|    1 | admin | |
| 1002 | tmac  | NULL              |

Currently, the only way to *fix* this is to enable the diagnostic user and log into the maintenance console. (I will not explain how to do that here; consult NetApp Tech Support if you really need to do this!) Once on the maintenance console, I was instructed to use this command to fix the database:

root@OCPM:/home/diag# mysql -e "update ocf.authorizationunit set emailAddress='' where id=1;"

Re-running the select query showed the updated info:

root@OCPM:/home/diag# mysql -e " select id,name,emailAddress from ocf.authorizationunit;"
| id   | name  | emailAddress         |
| 1    | admin | |
| 1002 | tmac  | NULL                 |

A bug has been opened to look into this behavior. Hopefully, NetApp will be able to fix this minor issue soon.
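The comparison done by eye above (same account, different email address on OCUM and OCPM) can be sketched as a small shell helper. This is only an illustration under stated assumptions: `email_mismatches` and the file names `ocum.tsv` and `ocpm.tsv` are hypothetical, not part of any NetApp tooling, and the dumps are assumed to have been created with the same queries shown above, run with the mysql client's `-B` (batch, tab-separated) and `-N` (no column names) options and then copied to one machine.

```shell
# Hypothetical helper, not a NetApp tool.
# Usage: email_mismatches OCUM_DUMP OCPM_DUMP
# Each dump is tab-separated "name<TAB>emailAddress", one account per line,
# for example (assumed invocations, run on each maintenance console):
#   mysql -B -N -e "select name,emailAddress from ocum.authorizationunit;" > ocum.tsv
#   mysql -B -N -e "select name,emailAddress from ocf.authorizationunit;"  > ocpm.tsv
# Prints the name of every account that exists in both dumps but has a
# different email address in each.
email_mismatches() {
  awk -F'\t' '
    FILENAME == ARGV[1] { ocum[$1] = $2; next }   # first file: remember OCUM values
    ($1 in ocum) && ocum[$1] != $2 { print $1 }   # second file: report differences
  ' "$1" "$2"
}
```

Running `email_mismatches ocum.tsv ocpm.tsv` prints nothing when the shared accounts agree. Accounts that exist on only one side (such as the OCUM-only groups above) are deliberately ignored.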

Full Integration of OnCommand Unified Manager and Performance Manager

NetApp has recently released a “full integration” of the two core Clustered Data ONTAP monitoring products, OnCommand Unified Manager (vSphere version) and OnCommand Performance Manager (vSphere version).

What does this mean?

Historically, when using these two products, you needed to set up and manage each one individually. With the “full integration” release, you still perform a basic setup on both. If you use HTTPS certificates generated by your own Certificate Authority, generate the signing requests, obtain and install the certificates, and then, following the documentation, configure the “full integration” on the maintenance console of the Performance Manager. After a few minutes, you are presented with an updated single management pane through OnCommand Unified Manager. Nearly all configuration options that apply to one will apply to the other as needed. In fact, when full integration is used, the OnCommand Performance Manager GUI is gone as a standalone product (hitting the OCPM IP address with a browser no longer works).

Partial Integration is what the application used in prior releases and is still a viable option. The preferred method moving forward is the Full Integration.




Power Supplies causing other issues? Really!



So, I have recently been involved in a couple of cases regarding power supplies. Back in October I was asked to come to a site during a maintenance window to see about fixing a problem that would not go away.

Case #1:

This first case had the following symptoms:

  • The IOM3-B module appeared quasi-online. It was there, but not quite.
    • Firmware updates did not work. Resetting/re-seating did not do much.
  • The DS4246 shelf would not allow the shelf ID to be set.
  • I am sure there were other undiagnosed issues, but these two were the most obvious.

NetApp was baffled. I asked for and received a whole new shelf, two Power Supply modules and two IOM3 modules to basically have everything on hand to fix whatever the problem could be. This had been festering for a few weeks. The customer and NetApp Support simply wanted this fixed.

During our outage, the first thing we did was eliminate the shelf. We moved all disks, Power Supplies and IOMs over to the new shelf and powered it on. The Shelf ID LED would not come on... at all. Mmm? OK. Swap the IOM3s for the new ones. Still nothing! Swap the Power Supplies. Ah HA! The Shelf ID light came on.

To isolate the problem further, we shuffled the Power Supplies around and found that one bad Power Supply was causing significant problems. When it was in *any* shelf, problems followed. Remove that Power Supply and the problems disappeared.

After looking at older ASUPs, it seems likely we could have deduced a bad power supply sooner, but the details were in a less commonly used section of the environment output.

Case #2:

This second case had the following symptoms:

  • Upon performing A-side / B-side power testing, both power supplies showed as unknown according to the NetApp environment command!
  • Some / most of the drives powered down.
  • After power-cycling the shelf (both power supplies), NONE of the drives would power up!

Here we tried a few things: power-cycling a few times and resetting the IOM6 modules. For this case, we removed ONE power supply (PSU #4, lower right when viewed from the back of the shelf). As soon as that ONE power supply was removed, the drives started powering on.

This was very odd. Fortunately for me, after I got this rectified and that power supply replaced, my NetApp case owner just happened to be an Electrical Engineer! He was able to dive into the many AutoSupport (ASUP) messages and further determine that power supply #1 in the same shelf was also on the fritz and it should be replaced also.

He was able to deduce that voltages and amperages were not quite right and strongly recommended replacing power supply #1... which we did.

The takeaway

Never discount the power supplies. Also, be careful when you pull them out if you suspect them. In case #2, we did the A-side test and all appeared OK when power was restored. It was after the B-side test that everything went nuts, so I figured that was the place to start. In hindsight, I would also use the environmental commands to verify amperage and voltage, among other items, before pulling a power supply.
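As a rough illustration of that voltage check, the sketch below flags readings that drift too far from their nominal value. Everything about it is an assumption made for illustration: `check_rails` is a hypothetical helper, the input is a hand-transcribed text file (not the native output of any NetApp command), and the 5% default tolerance is a placeholder rather than a NetApp specification.

```shell
# Hypothetical helper, not a NetApp command.
# Usage: check_rails READINGS_FILE [TOLERANCE_PERCENT]
# READINGS_FILE holds one whitespace-separated reading per line:
#   label  nominal  measured
# The file format and the 5% default tolerance are assumptions; use the
# limits given by the hardware documentation or your support engineer.
check_rails() {
  awk -v tol="${2:-5}" '
    NF >= 3 && $2 != 0 {
      dev = ($3 - $2) / $2 * 100      # signed deviation from nominal, in percent
      if (dev < 0) dev = -dev         # take the absolute value
      if (dev > tol) printf "%s: %.1f%% off nominal\n", $1, dev
    }
  ' "$1"
}
```

Any line it prints names a reading worth a closer look before a power supply is pulled. No output means every transcribed reading was within the chosen tolerance, which is not the same as a clean bill of health.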