More blog sifting
If you didn’t see it,
Liane Praza
picked up where
my sifting
left off, adding
a blog entry pointing to
more Opening Day entries — this
time
in the categories of
devices and device configuration, security, networking,
and standards. But there are still a ton of entries to
categorize, so picking up again in no particular order…
-
System calls.
System calls are among the most fundamental mechanisms in operating systems:
they are the mechanism by which untrusted, unprivileged software requests
a service of trusted, privileged software. We are lucky to have two
great entries describing the architecture-specific mechanisms of
system calls in Solaris:
check out
Russ Blaine’s entry
on
system calls on x86, and
Gavin Maltby’s
entry on
system
calls on SPARC. Then, to understand the architecture-neutral aspects of
system calls, head over to
Eric Schrock’s
entry on
how to add a system call.

As a quick aside, that
last entry is a great example of how we in Solaris Kernel Development
are using blogs to write
down information that (believe it or not) has just been an unspoken part
of the craft before now. As
Tim Bray observed,
blogs have become a critical conduit of information for us — we believe
that they are the most scalable way to get information from the
people who have it to the people who need it. If (when?) you become
an OpenSolaris developer,
you can expect some friendly peer pressure to create a blog and
join the party.
-
Build process and workspace management.
We pride ourselves on a seamless build process,
and a couple of entries have gone into various aspects of this in depth.
To give you an idea of how seriously we take the build process — and
why — check out
Scott Rotondo’s
entry on using lint to find security vulnerabilities.
In particular, note what Scott says when he added a new lint option that
generated
500 new warnings: “I needed to fix all of these before integrating
my change to
Makefile.master because we require the Solaris source to be
lint-clean.” To which I add only, “dammit.”
Next, head over to
Jim Carlson’s
entry describing the work he did to support
non-root builds. Jim’s entry demonstrates how difficult it is to
radically change the build process — and how he managed to pull it off.
Finally, if you want to really let your makefile flag fly,
check out
Mark Nelson’s
entry describing the build support for localized messages.

In terms of workspace management, you’ll want to check out
Will Fiveash’s
entry describing our workspace management tool, wx. For a long
time, wx
was a shell script in
Bonwick’s home directory.
It was incredibly useful, but it was also easy to accidentally blow your
brains out.
(As
Bart is fond of saying, it
was “all blade and no handle.”) Will’s rewrite made for a much
safer, much more sophisticated wx — and it was a huge help to
us in automating the final approach of the
DTrace integration.
-
Debuggability. If you read just a couple of the
Opening Day entries,
you probably noticed a trend: many of the entries were about finding
some nasty bug in the system.
This is an accurate reflection of our ethos in developing Solaris:
the operating system must be reliable above all else, and we view
debugging the operating system as our primary responsibility.
This responsibility runs deeper than just the act of debugging, because
our needs so outstripped existing tools that
we designed and built
our own — most notably
mdb
and DTrace.
Fortunately, we ship these tools to you, so you can use them on your
own system and on your own applications.

There are many entries describing these tools and how they were used
to tackle a problem.
Fittingly, a good place to start is
Mike Shapiro’s
entry describing using mdb to debug a sendmail bug. This bug is
described in
4278156,
which has one of the
greatest bug synopses of all time: “sendmail died in a two SIGALRM fire.”
For more on the power of mdb,
take a look at
Eric Saxe’s
entry on
using mdb to debug a scheduling problem,
Ashish Mehta’s entry
on
using
mdb to debug
a race condition, and
Eric Kustarz’s entry demonstrating an mdb debugger command (“dcmd”) that he wrote to
retrieve NFSv4 recovery messages postmortem.
This last example is a particularly good one
because this is exactly the kind of custom debugging
infrastructure that mdb’s modular architecture makes easy to build.
For a comprehensive example of how we have developed subsystem-specific
debugging infrastructure, read
Sasha Kolbasov’s
entry on the
mdb
dcmds related to STREAMS.
As Sasha mentions, the place to start for learning to write your
own modules is the
documentation –
but you can get a flavor for it by reading
Yu Xiangning’s
entry on writing
a module for kmdb.
kmdb is the in-situ kernel debugger that implements mdb, and when you
need it, nothing else will do — as
Dan Mick describes
in his entry on debugging with kmdb and moddebug.
For more details on kmdb itself,
check out
Matt Simmons’
entry on
kmdb’s design and implementation.
To see how mdb can help debug your application, take a look at
Will Fiveash’s notes
on debugging application memory problems. Will
mentions ::findleaks, a debugger command that I originally
implemented for kernel crash dumps, and that
Jonathan Adams
subsequently
ported to work on application core files and, as he mentions in
his entry,
reworked substantially in the process.

While mdb is the acme of postmortem debugging,
if the manifestation of a bug is non-fatal, it’s often more
effective
to use DTrace to debug it.
For an example of this,
look at
Bart Smaalders’
entry on using DTrace to debug jitter.
It was gratifying to see Bart debug this problem using DTrace, because
latency bubbles were actually one of the motivating pathologies behind
DTrace.
And finally, debuggability doesn’t end with tools; subsystems must be
designed with
debuggability in mind, as
Stephen Hahn
describes in his entry on
designing libuutil for debuggability.
I think that about does it for today. As someone pointed out on Liane’s
blog, we need a Wiki for this; we agree — it’s on the list of planned
enhancements for
opensolaris.org. Until then,
stay tuned for more sifting…
Technorati tags:
OpenSolaris
Solaris
DTrace
mdb
on June 17, 2005 at 6:33 pm
Thank you so much, Bryan, for your big effort in giving an excellent summary of these Sun gurus’ blogs.
I actually tried to build the OpenSolaris kernel on SPARC following the ReleaseNotes, using the tarballs I downloaded from the download center, but failed. I was quite disappointed to see a couple of errors in the ReleaseNotes. Maybe a lot of senior developers won’t follow or even bother to read the ReleaseNotes, but for OpenSolaris buildable source code, the ReleaseNotes should be reviewed carefully before being released to the outside world.
on June 18, 2005 at 1:41 pm
Qiang -
If you have any trouble building OpenSolaris, please head over to the opensolaris-help forum, where we can assist you and improve the Release notes for everyone else’s benefit.
- Eric