External storage enclosures in Solaris

Over the past few years, I’ve been working on various parts of Solaris platform integration, with an emphasis on disk monitoring. While the majority of my time has been focused on fishworks, I have managed to implement a few more pieces of the original design.

About two months ago, I integrated the libscsi and libses libraries into Solaris Nevada. These libraries, originally written by Keith Wesolowski, form an abstraction layer upon which higher level software can be built. The modular nature of libses makes it easy to extend with vendor-specific support libraries in order to provide additional information and functionality not present in the SES standard, something difficult to do with the kernel-based ses(7d) driver. And since it is written in userland, it is easy to port to other operating systems. This library is used as part of the fwflash firmware upgrade tool, and will be used in future Sun storage management products.

While libses itself is an interesting platform, its true raison d’être is to serve as the basis for enumeration of external enclosures as part of libtopo. Enumeration of components in a physically meaningful manner is a key component of the FMA strategy. These components form FMRIs (fault managed resource identifiers) that are the target of diagnoses. These FMRIs provide a way of not just identifying that “disk c1t0d0 is broken”, but that this device is actually in bay 17 of the storage enclosure whose chassis serial number is “2029QTF0809QCK012”. In order to do that effectively, we need a way to discover the physical topology of the enclosures connected to the system (chassis and bays) and correlate it with the in-band I/O view of the devices (SAS addresses). This is where SES (SCSI enclosure services) comes into play. SES processes show up as targets in the SAS fabric, and by using the additional element status descriptors, we can correlate physical bays with the attached devices under Solaris. In addition, we can also enumerate components not directly visible to Solaris, such as fans and power supplies.

The SES enumerator was integrated in build 93 of Nevada, and all of these components now show up in the libtopo hardware topology (commonly referred to as the “hc scheme”). To do this, we walk over all the SES targets visible to the system, grouping targets into logical chassis (something that is not as straightforward as it should be). We use this list of targets and a snapshot of the Solaris device tree to fill in which devices are present on the system. You can see the result by running fmtopo on a build 93 or later Solaris machine:

# /usr/lib/fm/fmd/fmtopo
...
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:serial=2029QTF0000000002:part=Storage-J4400:revision=3R13/ses-enclosure=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:part=123-4567-01/ses-enclosure=0/psu=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:part=123-4567-01/ses-enclosure=0/psu=1
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/fan=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/fan=1
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/fan=2
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/fan=3
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=2029QTF0811RM0386:part=375-3584-01/ses-enclosure=0/controller=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=2029QTF0811RM0074:part=375-3584-01/ses-enclosure=0/controller=1
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/bay=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=5QD0PC3X:part=SEAGATE-ST37500NSSUN750G-0720A0PC3X:revision=3.AZK/ses-enclosure=0/bay=0/disk=0
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=/ses-enclosure=0/bay=1
...
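
If you want to consume these names from a script, the hc FMRI strings above are easy to pull apart. What follows is only a minimal sketch in Python, not part of libtopo: the function names `parse_hc_fmri` and `describe_location` are made up for illustration, and it assumes the simple form shown in the listing, where the authority fields are separated by colons and the path components are `name=instance` pairs with no escaped characters.

```python
def parse_hc_fmri(fmri):
    """Split an hc-scheme FMRI string into (authority, path).

    authority is a dict of the colon-separated key=value fields;
    path is a list of (component name, instance number) tuples.
    Assumes no escaped ':' or '/' characters inside the values.
    """
    prefix = "hc://"
    if not fmri.startswith(prefix):
        raise ValueError("not an hc-scheme FMRI: %r" % fmri)

    # Everything before the first '/' is the authority, the rest is the path.
    authority_part, _, path_part = fmri[len(prefix):].partition("/")

    authority = {}
    for field in authority_part.split(":"):
        if not field:
            continue  # the authority starts with ':', giving one empty field
        key, _, value = field.partition("=")
        authority[key] = value

    path = []
    for component in path_part.split("/"):
        if not component:
            continue
        name, _, instance = component.partition("=")
        path.append((name, int(instance)))
    return authority, path


def describe_location(fmri):
    """Render an FMRI as a short human-readable location string."""
    authority, path = parse_hc_fmri(fmri)
    where = " / ".join("%s %d" % (name, inst) for name, inst in path)
    return "chassis %s: %s" % (authority.get("chassis-id", "unknown"), where)


if __name__ == "__main__":
    disk = ("hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012"
            ":server-id=:serial=5QD0PC3X"
            ":part=SEAGATE-ST37500NSSUN750G-0720A0PC3X:revision=3.AZK"
            "/ses-enclosure=0/bay=0/disk=0")
    print(describe_location(disk))
```

Feeding it the disk entry from the listing gives the chassis serial number plus the enclosure, bay, and disk instance, which is exactly the physical location information the FMRI is meant to carry.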

To really get all the details, you can use the ‘-V’ option to fmtopo to dump all available properties:

# fmtopo -V '*/ses-enclosure=0/bay=0/disk=0'
TIME                 UUID
Jul 14 03:54:23 3e95d95f-ce49-4a1b-a8be-b8d94a805ec8
hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=5QD0PC3X:part=SEAGATE-ST37500NSSUN750G-0720A0PC3X:revision=3.AZK/ses-enclosure=0/bay=0/disk=0
group: protocol                       version: 1   stability: Private/Private
resource          fmri      hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=5QD0PC3X:part=SEAGATE-ST37500NSSUN750G-0720A0PC3X:revision=3.AZK/ses-enclosure=0/bay=0/disk=0
ASRU              fmri      dev:///:devid=id1,sd@TATA_____SEAGATE_ST37500NSSUN750G_0720A0PC3X_____5QD0PC3X____________//scsi_vhci/disk@gATASEAGATEST37500NSSUN750G0720A0PC3X5QD0PC3X
label             string    SCSI Device  0
FRU               fmri      hc://:product-id=SUN-Storage-J4400:chassis-id=2029QTF0809QCK012:server-id=:serial=5QD0PC3X:part=SEAGATE-ST37500NSSUN750G-0720A0PC3X:revision=3.AZK/ses-enclosure=0/bay=0/disk=0
group: authority                      version: 1   stability: Private/Private
product-id        string    SUN-Storage-J4400
chassis-id        string    2029QTF0809QCK012
server-id         string
group: io                             version: 1   stability: Private/Private
devfs-path        string    /scsi_vhci/disk@gATASEAGATEST37500NSSUN750G0720A0PC3X5QD0PC3X
devid             string    id1,sd@TATA_____SEAGATE_ST37500NSSUN750G_0720A0PC3X_____5QD0PC3X____________
phys-path         string[]  [ /pci@0,0/pci10de,377@a/pci1000,3150@0/disk@1c,0 /pci@0,0/pci10de,375@f/pci1000,3150@0/disk@1c,0 ]
group: storage                        version: 1   stability: Private/Private
logical-disk      string    c0tATASEAGATEST37500NSSUN750G0720A0PC3X5QD0PC3Xd0
manufacturer      string    SEAGATE
model             string    ST37500NSSUN750G 0720A0PC3X
serial-number     string    5QD0PC3X
firmware-revision string       3.AZK
capacity-in-bytes string    750156374016
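
The property dump is just as scriptable. Here is a rough sketch, again in Python and again hypothetical (`parse_properties` is my own name, and the parsing is based only on the layout of the output above, which is not a stable interface): each `group:` line starts a new property group, and every following line is a name, a type, and an optional value.

```python
def parse_properties(text):
    """Parse 'fmtopo -V'-style output into {group: {property: (type, value)}}.

    Lines before the first 'group:' line (timestamp, UUID, node FMRI)
    are ignored. A property with no value is stored as an empty string.
    """
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("group:"):
            # e.g. "group: storage   version: 1   stability: Private/Private"
            current = groups.setdefault(line.split()[1], {})
        elif current is not None and line:
            # Split into name, type, and the rest; the value may contain spaces.
            fields = line.split(None, 2)
            if len(fields) < 2:
                continue
            value = fields[2] if len(fields) == 3 else ""
            current[fields[0]] = (fields[1], value)
    return groups


# A shortened copy of the output shown above, used as sample input.
SAMPLE = """\
group: authority                      version: 1   stability: Private/Private
product-id        string    SUN-Storage-J4400
chassis-id        string    2029QTF0809QCK012
server-id         string
group: storage                        version: 1   stability: Private/Private
manufacturer      string    SEAGATE
model             string    ST37500NSSUN750G 0720A0PC3X
capacity-in-bytes string    750156374016
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    ptype, capacity = props["storage"]["capacity-in-bytes"]
    print("capacity: %d bytes" % int(capacity))
```

Note that the capacity is reported as a string property, so a consumer has to convert it to an integer itself.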

So what does this mean, other than providing a way for you to finally figure out where disk ‘c3t0d6’ is really located? Currently, it allows the disks to be monitored by the disk-transport fmd module to generate faults based on predictive failure, over-temperature, and self-test failure. The really interesting part is where we go from here. In the near future, thanks to work by Rob Johnston on the sensor framework, we’ll have the ability to manage LEDs for disks that are part of external enclosures, diagnose failures of power supplies and fans, as well as the ability to read sensor data (such as fan speeds and temperature) as part of a unified framework.

I often like to joke about the amount of time that I have spent just getting a single LED to light. At first glance, it seems like a pretty simple task. But doing it in a generic fashion that can be generalized across a wide variety of platforms, correlated with physically meaningful labels, and made to incorporate a diverse set of diagnoses (ZFS, SCSI, HBA, etc.) requires an awful lot of work. Once it’s all said and done, however, future platforms will require little to no integration work, and you’ll be able to see a bad drive generate checksum errors in ZFS, resulting in an FMA diagnosis indicating the faulty drive, activate a hot spare, and light the fault LED on the drive bay (wherever it may be). Only then will we have accomplished our goal of an end-to-end storage strategy for Solaris – and hopefully someone besides me will know what it has taken to get that little LED to light.

Posted on July 13, 2008 at 9:23 pm by eschrock
In: OpenSolaris

6 Responses


  1. Written by sean walmsley
    on July 14, 2008 at 3:50 pm

    Thanks for the update on this stuff. From what you describe, this will be a huge improvement over current ad hoc means of tracking down device locations.
    I keep seeing references to "Fishworks" but no details – can you provide any?

  2. Written by Eric Schrock
    on July 15, 2008 at 8:37 am

    Sadly, Fishworks is still under wraps at the moment. If you search for "fishworks solaris" you’ll find a few interesting tidbits, but we’re not quite ready to get into the details. You can bet there will be a flood of blog posts once we are.

  3. Written by Chris
    on July 17, 2008 at 10:18 am

    No one bother about fishworks. It was supposed to come out last year. Then all went quiet after Sun was sued by NetApp. So, I can only assume that fishworks possibly conflicts with some of the items in the lawsuit. Just assume it’s vaporware. We’ll all be better off.

  4. Written by Ross
    on August 11, 2008 at 5:26 am

    This is great to see, can’t wait for all of this to be fully integrated into Solaris. Just one question: do you know if any of this will work for third-party hardware (such as Supermicro SES-2 capable chassis), or will it be for Sun enclosures only?
    Much as I love Sun kit, there isn’t anything in the x64 range comparable to things like the Supermicro 836 chassis, so it would be a real boon to get enclosure status lights working on there.

  5. Written by Eric Schrock
    on August 11, 2008 at 10:12 am

    Ross -
    Yes, this will work with external enclosures. The only downside is that the SES-2 standard doesn’t always map well to how people are implementing the hardware. In particular, vendors often have multiple SES targets in the same physical chassis with different logical identifiers. In these more complex systems, we have no way of knowing the true layout of the physical chassis, and so you will end up with multiple "enclosures" that really correspond to different SES targets. This will not break any programmatic consumers, but it may be confusing to the user.
    - Eric

  6. Written by Ross
    on August 12, 2008 at 12:34 am

    Sounds fair enough. Am I right in thinking that "programmatic consumers" means I could relatively easily write a script / utility that could present the SES information to our particular hardware layout?
    We’ll be standardising on one or two chassis here, so if that’s possible it will be well worth the effort.
