Jerry Jelinek's blog

SVM V20z root mirror panic

April 18, 2005

One of the problems I have been working on recently has to
do with running Solaris Volume Manager (SVM) on V20z and V40z
servers. In general, SVM does not care what kinds of disks
it is layered on top of. It just passes the I/O requests through
to the drivers for the underlying storage. However, we were seeing
problems on the V20z when it was configured for root mirroring.

With root mirroring, a common test is to pull the primary boot disk
and reboot to verify that the system comes up properly on the other
side of the mirror. What was happening was that Solaris would start
to boot, then panic.

It turns out that we were hitting a limitation in the boot support for
disks that are connected to the system with an mpt(7D) based HBA. The
problematic code exists in bootconf.exe within the Device Configuration
Assistant (DCA). The DCA is responsible for loading and starting the
Solaris kernel.
The problem is that the bootconf code was not failing over to the
altbootpath device path, so Solaris would
start to boot but then panic because the DCA was passing it the old
bootpath device path. With the primary boot disk removed from the
system, this was no longer the correct boot device path.
You can see if this limitation might impact you by using the “prtconf -D”
command and looking for the mpt driver.
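
A minimal sketch of that check, assuming `prtconf -D` labels each bound device node with a trailing `(driver name: ...)` tag. The function name `uses_mpt` is made up for illustration. It reads the output on stdin so it can also be run against saved output from another machine.

```shell
# Sketch: report whether any device node is bound to the mpt driver.
# Expects `prtconf -D` output on stdin.
uses_mpt() {
    # Redirect instead of `grep -q`, which older greps may lack.
    if grep 'driver name: mpt)' >/dev/null; then
        echo "mpt HBA found: the DCA limitation may apply"
    else
        echo "no mpt HBA found"
    fi
}

# On the live system:
#   prtconf -D | uses_mpt
```

A match only means the mpt driver is in use. Whether the boot disks actually sit behind that HBA still has to be confirmed from the device paths.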

We have some solutions for this limitation in the pipeline, but in
the meantime, there is an easy workaround for this. You need to edit
the /boot/solaris/bootenv.rc file and remove the entries for bootpath
and altbootpath. At this point, the DCA should
automatically detect the correct boot device path and pass it into the kernel.
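
The edit can be done by hand in any editor. As a rough sketch, a small filter can also do it, assuming the entries appear as `setprop bootpath ...` and `setprop altbootpath ...` lines like the other properties in the file. The function name `strip_bootpaths` and the backup file name are hypothetical.

```shell
# Sketch: copy bootenv.rc text from stdin to stdout, dropping only the
# bootpath and altbootpath setprop lines.
strip_bootpaths() {
    awk '!($1 == "setprop" && ($2 == "bootpath" || $2 == "altbootpath"))'
}

# Possible use, keeping a backup first (bootenv.rc.orig is an assumed name):
#   cp /boot/solaris/bootenv.rc /boot/solaris/bootenv.rc.orig
#   strip_bootpaths < /boot/solaris/bootenv.rc.orig > /boot/solaris/bootenv.rc
```

All other properties pass through untouched, so the result can be compared against the backup with diff before rebooting.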

There are a couple of limitations to this workaround. First, it only
works for S10 and later. In S9, it won’t automatically boot. Instead
it will enter the DCA and you will have to manually choose the boot
device. Also, it only works for systems where both disks in the root
mirror are attached via the mpt HBA. This is a typical configuration
for the V20z and V40z. We are working on better solutions to this
limitation, but hopefully this workaround is useful in the meantime.

10 Responses

  1. I have a Solaris 10 on a v20z with all patches as of 10-05-2005 and wanted to mirror the drives but it looks like a gamble.

    With your proposed workaround, I assume (since you cannot just PULL a disk) that you have to PULL a disk and then PUT in a different disk (so that two disks are still attached to the mpt HBA).

    The real downside is that it will never auto-boot once a mirror goes south.

  2. I think you are misreading my note. If
    you want to be able to pull a disk then
    you need to follow the workaround I described.
    If you do that, the system should boot ok,
    even when the primary boot disk has been removed.
    Jerry

  3. Hi Jerry,
    can you tell me when you have a patch for this problem ?
    Thanks,
    Ahmad

  4. Hi!
    I tried your workaround, but I still got
    panics on boot. In my bootenv.rc I only
    had one line with the bootpath.
    An alternate bootpath was not defined.
    Any ideas ?
    Regards,
    Ahmad

  5. As far as I know, there is no patch for this
    that is being developed since the DCA is gone
    in S10u1. It has been replaced by GRUB so
    we no longer see this bug. Also, it doesn’t
    look like you did the workaround correctly
    since you have to remove both the bootpath
    and altbootpath entries.
    Jerry

  6. Hi!
    Thanks for your reply.
    Where do I find the altbootpath?
    I could not find it in the script.
    Here is a cut of the config:
    Thanks in advance,
    Ahmad
    master[admin] > cat /boot/solaris/bootenv.rc
    #
    # Copyright 2004 Sun Microsystems, Inc. All rights reserved.
    # Use is subject to license terms.
    #
    #ident "@(#)bootenv.rc 1.34 04/11/10 SMI"
    #
    # bootenv.rc -- boot "environment variables"
    #
    setprop auto-boot? 'true'
    setprop auto-boot-cfg-num '-1'
    setprop auto-boot-timeout '5'
    setprop boottimeout '0'
    setprop bshfirst 'false'
    setprop output-device 'ttya'
    setprop input-device 'ttya'
    setprop boot-file ''
    setprop target-driver-for-scsi 'sd'
    setprop target-driver-for-direct 'cmdk'
    setprop target-driver-for-csa 'cmdk'
    setprop target-driver-for-dsa 'cmdk'
    setprop target-driver-for-smartii 'cmdk'
    setprop target-driver-for-pci1000,30 'sd'
    setprop target-driver-for-pci1000,50 'sd'
    setprop target-driver-for-pci1000,532 'sd'
    setprop target-driver-for-pci1028,a 'sd'
    setprop target-driver-for-pci1028,e 'sd'
    setprop target-driver-for-pci1028,f 'sd'
    setprop target-driver-for-pci1028,493 'sd'
    setprop target-driver-for-pci1028,518 'sd'
    setprop target-driver-for-pci1028,520 'sd'
    setprop target-driver-for-pci9005,285 'sd'
    setprop pciide 'true'
    setprop prealloc-chunk-size '0x2000'
    setprop ata-dma-enabled '1'
    setprop atapi-cd-dma-enabled '0'
    setprop ttyb-rts-dtr-off 'false'
    setprop ttyb-ignore-cd 'true'
    setprop ttya-rts-dtr-off 'false'
    setprop ttya-ignore-cd 'true'
    setprop ttyb-mode '9600,8,n,1,-'
    setprop ttya-mode '9600,8,n,1,-'
    setprop kbd-type 'US-English(104-Key)'
    setprop boot-device 'disk0 disk1'

  7. The altbootpath is not normally in this
    file, so the fact that you don’t see it is
    not a big deal. If you had set the altbootpath
    via the eeprom command, then it would be there.
    Jerry

  8. Hi!
    I did not configure an altbootpath, but perhaps
    I have configured it without knowing how.
    So I’ve checked the eeprom.
    Could this line be a problem:
    master[admin] > eeprom | grep -i boot
    auto-boot?=true
    auto-boot-cfg-num=-1
    auto-boot-timeout=5
    boottimeout=0
    boot-file=
    boot-device=disk0 disk1
    Could the boot-device be the problem ?
    Is there another way to find out, whether the
    altbootpath is set ?
    regards,
    Ahmad

  9. Curious: Infodoc 83605 tells us to do exactly the opposite, to have altbootpath in there.
    I really, really wish that running Solaris 10 on the x4?00 were less painful than running it on commodity legacy Microcult hardware. Perhaps my error is in not recognizing the former *as* the latter.
