Blame the SAN

Over the years, I wish I had kept track of all the times that my SAN at work was blamed for causing problems.  It’s on my mind today after some work we did…

In our main facility, we have a Cisco MDS 9513 director-class chassis with eight internal switch modules.  At a high level, the switches work much like Ethernet switches: they allow connections between the end devices plugged into their ports.  Connections are controlled by zones, which can be compared to network VLANs in that a zone allows its member devices, or end points, to talk to each other.  A basic zone, for example, could include Server A and the ports associated with Storage Array B so that the server can access the LUNs on that storage array.  On my switches, and probably as a common setting on most others, a server that is not included in any zone can’t talk to anything else.  Good for security as well as sanity!
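
To make the zoning idea a bit more concrete, here is a toy Python model of it.  This is only a sketch of the concept, not how the MDS implements zoning, and every name in it (Fabric, can_talk, server_a, array_b_port1 and so on) is made up for illustration.  The one thing it tries to capture is the default-deny behavior: two devices can talk only if at least one zone contains both of them.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Fabric:
    """Toy model of a fabric with default-deny zoning (illustrative only)."""

    # Zone name -> set of member device names (hypothetical names, not real WWNs).
    zones: dict[str, set[str]] = field(default_factory=dict)

    def add_zone(self, name: str, members: list[str]) -> None:
        """Create or overwrite a zone with the given members."""
        self.zones[name] = set(members)

    def can_talk(self, a: str, b: str) -> bool:
        """Two devices can talk only if some zone contains both of them."""
        return any(a in members and b in members for members in self.zones.values())


fabric = Fabric()

# A basic zone: Server A plus the ports of Storage Array B.
fabric.add_zone("server_a__array_b", ["server_a", "array_b_port1", "array_b_port2"])

print(fabric.can_talk("server_a", "array_b_port1"))  # zoned together
print(fabric.can_talk("server_c", "array_b_port1"))  # server_c is in no zone
```

In this model, server_c gets nothing until someone explicitly adds it to a zone, which is the behavior described above.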

So, this morning a couple of FC-attached tape drives were installed, and I connected them to the MDS.  Once they were powered up and the switch ports were activated, I configured them just like normal.  I zoned them with several OpenVMS servers, since those would be the ones using the tape drives for backup.  After the servers scanned for new devices, they came up blank.  Why weren’t the drives showing up?  It must be a problem with the SAN.  I double-checked my end a couple of times and nothing was amiss.  It was a similar configuration to other FC-attached tape drives we have had online for years, so I highly doubted that some aspect of it would suddenly be failing.

It turned out that some OS-specific scanning steps had to be run before the servers could recognize the new drives, so all was well in the end.  And it only took a few hours to get to that point.

I am not writing this to vent or to complain, because I believe everything we do, right or wrong, is a learning experience for those involved.  I am not trying to put the blame back on any other system administrator, because I too have probably been guilty of the mentality that says it can’t be a problem with my stuff, it has to be yours.  I do know, though, from many years of experience, that I have often seen fellow workers get very defensive when a problem comes up and be quick to point fingers at others, only to find later that it was their own issue.  I also know that my own systems, like the FC switches, have worked without issue for a very long time, and I trust that a change similar to what I’ve done dozens of times in the past is going to work just like normal.

I also know that I am willing to do what I can to help someone out in trying to figure out an issue, especially if it involves my hardware.  I may not know everything (well, of course not!) but I’ll give you what I can.  Please, don’t just keep saying it’s the SAN… what are you doing on your end to help figure it out?
