The Observation Deck

More blog sifting

June 17, 2005

If you didn’t see it,
Liane Praza
picked up where
my sifting
left off, adding
a blog entry pointing to
more Opening Day entries
— this
time
in the categories of
devices and device configuration, security, networking,
and standards. But there are still a ton of entries to
categorize, so picking up again in no particular order…

  • System calls.
    System calls are among the most fundamental mechanisms in operating systems:
    they are the mechanism by which untrusted, unprivileged software requests
    a service of trusted, privileged software. We are lucky to have two
    great entries describing the architecture-specific mechanisms of
    system calls in Solaris:
    check out
    Russ Blaine’s entry
    on
    system calls on x86, and
    Gavin Maltby’s
    entry on
    system calls on SPARC. Then, to understand the architecture-neutral aspects of
    system calls, head over to
    Eric Schrock‘s
    entry on
    how to add a system call.

    As a quick aside, that
    last entry is a great example of how we in Solaris Kernel Development
    are using blogs to write
    down information that (believe it or not) has just been an unspoken part
    of the craft before now. As
    Tim Bray observed,
    blogs have become a critical conduit of information for us — we believe
    that they are the most scalable way to get information from the
    people who have it to the people who need it. If (when?) you become
    an OpenSolaris developer,
    you can expect some friendly peer pressure to create a blog and
    join the party.

  • Build process and workspace management.
    We pride ourselves on a seamless build process,
    and a couple of entries have gone into various aspects of this in depth.
    To give you an idea of how seriously we take the build process — and
    why — check out
    Scott Rotondo’s
    entry on using lint to find security vulnerabilities.
    In particular, note what Scott says when he added a new lint option that
    generated
    500 new warnings: “I needed to fix all of these before integrating
    my change to
    Makefile.master because we require the Solaris source to be
    lint-clean.” To which I add only, “dammit.”
    Next, head over to
    Jim Carlson’s
    entry describing the work he did to support
    non-root builds. Jim’s entry demonstrates how difficult it is to
    radically change the build process — and how he managed to pull it off.
    Finally, if you want to really let your makefile flag fly,
    check out
    Mark Nelson’s
    entry describing the build support for localized messages.

    In terms of workspace management, you’ll want to check out
    Will Fiveash’s
    entry describing our workspace management tool, wx. For a long
    time, wx
    was a shell script in
    Bonwick’s home directory.
    It was incredibly useful, but it was also easy to accidentally blow your
    brains out.
    (As
    Bart is fond of saying, it
    was “all blade and no handle.”) Will’s rewrite made for a much
    safer, much more sophisticated wx, and it was a huge help to
    us in automating the final approach of the
    DTrace integration.

  • Debuggability. If you read just a couple of the
    Opening Day entries,
    you probably noticed a trend: many of the entries were about finding
    some nasty bug in the system.
    This is an accurate reflection of our ethos in developing Solaris:
    the operating system must be reliable above all else, and we view
    debugging the operating system as our primary responsibility.
    This responsibility runs deeper than just the act of debugging, because
    our needs so outstripped existing tools that we designed and built
    our own, most notably mdb and DTrace.
    Fortunately, we ship these tools to you, so you can use them on your
    own system and on your own applications.

    There are many entries describing these tools and how they were used
    to tackle a problem.
    Fittingly, a good place to start is
    Mike Shapiro’s
    entry describing using mdb to debug a sendmail bug. This bug is
    described in
    4278156,
    which has one of the
    greatest bug synopses of all time: “sendmail died in a two SIGALRM fire.”
    For more on the power of mdb,
    take a look at
    Eric Saxe’s
    entry on
    using mdb to debug a scheduling problem,
    Ashish Mehta’s entry
    on using mdb to debug a race condition, and
    Eric Kustarz’s entry demonstrating an mdb debugger command (“dcmd”) that he wrote to
    retrieve NFSv4 recovery messages postmortem.
    This last example is a particularly good one
    because this is exactly the kind of custom debugging
    infrastructure that mdb’s modular architecture makes easy to build.
    For a comprehensive example of how we have developed subsystem-specific
    debugging infrastructure, read
    Sasha Kolbasov’s
    entry on the mdb dcmds related to STREAMS.
    As Sasha mentions, the place to start for learning to write your
    own modules is the documentation,
    but you can get a flavor for it by reading
    Yu Xiangning’s
    entry on writing a module for kmdb.
    kmdb is the in-situ kernel debugger that implements mdb, and when you
    need it, nothing else will do — as
    Dan Mick describes
    in his entry on debugging with kmdb and moddebug.
    For more details on kmdb itself,
    check out
    Matt Simmons’
    entry on
    kmdb’s design and implementation.
    To see how mdb can help debug your application, take a look at
    Will Fiveash’s notes
    on debugging application memory problems. Will
    mentions ::findleaks, a debugger command that I originally
    implemented for kernel crash dumps, and that
    Jonathan Adams
    subsequently
    ported to work on application core files, reworking it
    substantially in the process (as he mentions in his entry).

    While mdb is the acme of postmortem debugging,
    if the manifestation of a bug is non-fatal, it’s often more
    effective
    to use DTrace to debug it.
    For an example of this,
    look at
    Bart Smaalders’
    entry on using DTrace to debug jitter.
    It was gratifying to see Bart debug this problem using DTrace, because
    latency bubbles were actually one of the motivating pathologies behind
    DTrace.
    And finally, debuggability doesn’t end with tools; subsystems must be
    designed with
    debuggability in mind, as
    Stephen Hahn
    describes in his entry on
    designing libuutil for debuggability.
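
To make the system call discussion above a little more concrete, here is a minimal sketch of unprivileged code asking the kernel for a service. It is written in Python purely for brevity and is not taken from any of the linked entries. It assumes a POSIX system, where os.pipe, os.write, and os.read are thin wrappers over pipe(2), write(2), and read(2); the helper names pipe_round_trip and rejected_request_errno are hypothetical.

```python
import errno
import os


def pipe_round_trip(payload: bytes) -> bytes:
    """Push a small payload through a kernel pipe and read it back.

    Keep the payload well under the pipe buffer size; a larger write
    could block because nothing drains the pipe concurrently.
    """
    read_fd, write_fd = os.pipe()               # pipe(2): the kernel creates both ends
    try:
        written = os.write(write_fd, payload)   # write(2): request a copy into the kernel
        return os.read(read_fd, written)        # read(2): request a copy back out
    finally:
        os.close(read_fd)
        os.close(write_fd)


def rejected_request_errno() -> int:
    """Make a request the kernel must refuse and return the errno it reports."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)
    try:
        os.write(write_fd, b"x")                # the descriptor is closed, so this is invalid
    except OSError as err:
        return err.errno
    return 0                                    # not expected on a POSIX system


if __name__ == "__main__":
    print(pipe_round_trip(b"hello"))
    print(errno.errorcode[rejected_request_errno()])
```

The second helper is the more telling one: the caller can only ask, and it is the privileged side that validates the descriptor and refuses the request.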

I think that about does it for today. As someone pointed out on Liane’s
blog, we need a Wiki for this; we agree — it’s on the list of planned
enhancements for
opensolaris.org. Until then,
stay tuned for more sifting…


2 Responses

  1. Thank you so much, Bryan, for your big effort in giving an excellent summary of these Sun gurus’ blogs.
    I actually tried to build OpenSolaris kernel on SPARC following ReleaseNotes using the tar balls I downloaded from download center, but failed. I was quite disappointed to see a couple of errors in the ReleaseNotes. Maybe a lot of senior developers won’t follow or even bother to read ReleaseNotes, but for OpenSolaris buildable source code, ReleaseNotes should be reviewed carefully before released to the outside world.
