The Observation Deck

Month: July 2004

One of the downsides of being an operating systems developer is that the
demos of the technology that you develop often suck. (“Look, it boots!
And hey, we can even run programs and it doesn’t crash!”) So it’s been a
pleasant change to develop DTrace, a technology that packs a jaw-dropping
demo. In demonstrating DTrace for customers around the world, I have had
the distinct (and rare) pleasure of impressing the most technically adept
(and often jaded) audiences. My typical demonstration is on my
Solaris x86 laptop, where I use DTrace to
instrument the running system — exploring
with the audience the peculiarities that exist even on an idle laptop.
(This usually involves discovering and understanding the unnecessary work
being done by acroread, dhcpagent, sendmail, etc.)
This ad hoc demo
shows DTrace as it’s meant to be used: dynamically answering questions
that themselves were formed on-the-fly.

And when I demonstrate DTrace, I always do so on the absolute latest
Solaris 10 build. Our mantra in Solaris Kernel Development is “FCS Quality
All the Time” — we believe that the product should always be
ready to be run in production. And if we’re going to tell a customer that
it’s ready to be run in production, we damn well better run it in
production ourselves. This has the added advantage that we tend to run
into bugs before our customers do, allowing us to ship a final product that
is that much more solid. Over the past year, I have given hundreds of
DTrace demonstrations in front of customers running latest bits, and before
last week, it had always gone off without a hitch…1

Last week, I had the opportunity to give a DTrace demonstration for a
highly technical — and highly influential — audience at a Fortune 100
company. When I demonstrate DTrace, I typically do a couple of invocations
on the command line before things become sufficiently complicated to merit
writing a DTrace script. And it was when I went to run
the first such script (a script that explored the activity of
xclock) that
it happened:

# dtrace -s ./xclock.d
Segmentation Fault (core dumped)
#

If you’ve never had it, there’s no feeling quite like having a demo blow up
on you: it’s as if you peed your pants, failed an exam and were punched in
the gut — all at the same horrifying instant. It’s a feeling that every
software developer should have exactly once in their lives: that unique
rush of shock,
and then humiliation and then despair, followed by the adrenal surge of a
fight-or-flight reaction. In the time it takes a single process to dump
core, you go from an (over)confident technologist to a frightened woodland
creature, transfixed by the light of an oncoming freight train. For the
woodland creature, at least it all ends mercifully quickly; the creature is
spared the suffering of trying to explain away its foolishness. The
hapless technologist, on the other hand, is left with several options:

  1. Pretend that you didn’t write the software: “Boy, will you get a load
    of those fancy-pants software engineers? Overpriced underworked morons,
    every last one!”

  2. Explain that this is demo software and isn’t expected to work:
    “Well, that’s why we haven’t shipped it yet! I mean, what fool would
    run this stuff anyway? Other than me, that is.”

  3. Make light of it: “Hey, knock knock! Who’s there? Not my
    software, that’s for sure! Wocka wocka wocka!”

  4. Suck it up: “That’s a serious problem. If you can excuse me for
    a second, let me get a handle on what we’ve got here that we can demo.”

I always aim for this last option, but
on the rare occasion that this has happened to me (and this is — honest —
probably the worst that a customer-facing demo has gone for me)
I usually end up with
some combination of the last three, often with plenty of stuttering,
some mild swearing (“Damn! Damn!”) and profuse sweating.

In my particular case, the worst part was not knowing the exact pathology
of the bug that I had just
run into. Was there something basic that was broken or toxic about
my machine? Would all scripts that I tried to run dump
core? And if this was broken, what else was broken? Would I panic the
machine or crash a target app if I continued? (Much more serious
problems, both.) In an effort to get a handle on it, I did a quick
pstack on the core file:

0804718f ???????? (8046604, 2)
d137c839 dt_instr_size (82d051a, 8067320, 223, d1380fe2) + 59
d137c0c2 dt_pid_create_return_probe (81651b8, 8067320, 8046af0, 8047170, 80472d
d137370d dt_pid_per_sym (80472ac, 8047170, d087b02c) + 15b
d13739ae dt_pid_sym_filt (80472ac, 8047170, d087b02c, 804715c) + 7c
d13152ca Psymbol_iter_com (81651b8, ffffffff, 8069060, 1, 407, 1) + 1e0
d13153ae Psymbol_iter_by_addr (81651b8, 8069060, 1, 407, d1373932, 80472ac) + 1
d1373b81 dt_pid_per_mod (80472ac, 82cf600, 8069060) + 191
d1373d56 dt_pid_mod_filt (80472ac, 82cf600, 8069060) + a3
d1314fe4 Pobject_iter (81651b8, d1373cb3, 80472ac) + 4f
d13740b4 dt_pid_create_probes (82cafa0, 8067320) + 344
d1353af8 dt_setcontext (8067320, 82cafa0) + 42
d13537d4 dt_compile_one_clause (8067320, 82be430, 82cdae0) + 32
d1353a9c dt_compile_clause (8067320, 82be430) + 26
d1354d66 dt_compile (8067320, 16a, 3, 0, 80, 1) + 3d9
d1355263 dtrace_program_strcompile (8067320, 8047ec2, 3, 80, 1, 8066848) + 23
080526ef ???????? (8066e48)
0805370e main     (3, 8047df8, 8047e08) + 8fc
0805177a ???????? (3, 8047eb8, 8047ebf, 8047ec2, 0, 8047edf)

This was dying in the code that analyzes a target binary as part of
creating pid provider probes. There was at least a chance that
this problem was localized to something specific about the xclock
program text — it was worth trying a similar script on a different
process.
Fortunately, I was able to stave off total panic long
enough to write such a script and — even better —
this one worked. The problem did indeed seem to be localized to something
specific in xclock. And thanks to my coreadm settings, the core file from the
seg faulting dtrace had been stashed away for later analysis; the best thing
I could do at that point was drive on with the rest of the demo.

And this is what I did. The rest of the demo went well, and the
audience was ultimately impressed with the technology. And while I never quite
regained my stride (in part because my mind was racing
about which change to DTrace could have introduced the problem), I
was at least sufficiently effective — we achieved the goals of the
meeting.2
On the plane back home, I root-caused the problem and developed a fix.
The next day, I integrated the fix into Solaris — and I don’t think
I’ve ever been so relieved to put latest bits on my laptop!

In the end, having the demo blow up certainly wasn’t a pleasant experience —
but I wouldn’t change my
decision to demo on the latest bits. Not only did we discover a serious
bug, we discovered the hole in our test suite that prevented us from
finding the bug before it integrated. So who am I to get upset about a
little personal humiliation if the upshot is a better product? 😉


1 This is a slight exaggeration. I had actually run into
DTrace bugs in front of customers, but they were always sufficiently
small that only a trained eye would realize that something was amiss —
things like slightly incorrect error messages.

2 The primary goal of such a demo is often to get the customer
sufficiently excited about Solaris 10
to download Solaris Express
(usually for x86) and start playing around
with the technology themselves.
We are nearly always successful in this — and I have
even had a few customers start downloading Solaris Express before
the end of the meeting!

I’ve been following with interest this thread on the linux-kernel mailing list. The LTT folks have apparently given up on the claim that they’ve got “basically almost everything [DTrace] claims to provide.” They now acknowledge the difference in functionality between LTT and DTrace, but they’re using it to continue an attack on the Linux development model. (Or more accurately, to attack how that model was applied to them.) The most interesting
paragraph is this one in a post by Karim Yaghmour:

As for DTrace, then its existence and current feature set is only a
testament to the fact that the Linux development model can sometimes have
perverted effects, as in the case of LTT. The only reason some people can
rave and wax about DTrace's capabilities and others, as yourself, drag
LTT through the mud because of not having these features, is because the
DTrace developers did not have to put up with having to be excluded from
their kernel for 5 years. As I had said earlier, we would be eons ahead
if LTT had been integrated into the kernel in one of the multiple attempts
that was made to get it in in the past few years. Lest you think that
DTrace actually got all its features in a single merge iteration ...

Some points of clarification: we actually did get most of our features in our initial integration into Solaris (on September 3, 2003). In Solaris, projects that integrate are expected to be in a completed state; if there is follow-on work planned, fine — but what integrates into the gate must be something that is usable and complete from a customer’s perspective. So contrary to Karim’s assertion, most of DTrace came in that first (giant) putback.
As a consequence of this, DTrace spent a long time living the same way LTT has lived: outside the gate, trying to keep in sync while development was underway. Admittedly, DTrace did this for two years — not five. And this is Solaris, not Linux; it’s easier to keep in sync if only because there is only one definition of the latest Solaris. (The newest DTrace build was always based off of the latest Solaris 10 build.) Still: we didn’t let the fact that we had not yet integrated prevent us from developing DTrace, nor did we let it prevent us from building a user community around DTrace. By the time DTrace (finally!) integrated into Solaris 10, we had hundreds of internal users, and a long list of actual problems that were found only with the help of DTrace. Not that DTrace would have been unable to integrate without these things, but having them certainly accelerated the process of integration.

More generally, though, I’m getting a little tired of this argument that LTT would be exactly where DTrace is had they only been allowed into the Linux kernel five years ago. I believe that there is some fundamental innovation in DTrace that LTT simply did not anticipate. For insight into what LTT did anticipate, look at the LTT To Do List from 2002. In that document, you will find many future directions, but not so much as a whisper of the need for DTrace basics like aggregations,
speculative tracing, thread-local variables or
associative arrays — let alone DTrace arcana
like stability or translators. Would LTT be further along now had it been allowed to integrate into Linux five years ago? Surely. Would it be anywhere near where DTrace is today? In the immortal words of the Magic 8-Ball, “very doubtful.”

Recently, Karim Yaghmour posted the following to the linux-kernel mailing list:

As I noted when discussing this with Andrew, we've been trying to get
LTT into the kernel for the past five (5) years. During that time we've
repeatedly encountered the same type of arguments for not including it,
and have provided proof as to why those arguments are not substantiated.
Lately I've at least got Andrew to admit that there were no maintenance
issues with the LTT trace statements (given that they've literally
remained unchanged ever since LTT was introduced.) In an effort to
address the issues regarding the usefulness of such a tool, I direct
those interested to this article on DTrace, a trace utility for Solaris:
http://www.theregister.co.uk/2004/07/08/dtrace_user_take/
<rant>
With LTT and DProbes, we've basically got almost everything this tool
claims to provide, save that we would be even further down the road if
we did not need to spend so much time updating patches ...
</rant>
Karim
--
Author, Speaker, Developer, Consultant

Now, Karim’s really only interested in DTrace in that it helps him make his larger point that his project has been unfairly (or unwisely) denied entry into the Linux kernel.
His is a legitimate point, and something that is often lost in the assertions that Linux is developed faster than other operating systems: for all of its putative development speed, Linux has a surprising number of otherwise valuable projects that have been repeatedly denied entry for reasons that seem to be petty and non-technical. DProbes/LTT is certainly one example of such a project, and LKCD is probably another.

But what of Karim’s assertion that LTT and DProbes “basically [have] everything [DTrace] claims to provide”? This claim is false, and indicates that while Karim may have scanned
The Register article, he didn’t bother to browse even
our USENIX paper — let alone
our documentation. From these, one will note that
while LTT lacks many DTrace niceties, it also lacks several vital features. Two among these are
aggregations and
thread-local variables — two features that are not syntactic sugar or bolted-on afterthoughts, but rather are core to the DTrace architecture. These features turn out to be essential in using DTrace to quickly resolve problems. For an example of how these features are used, see Section 9 of
our USENIX paper — and note that every script that we wrote to debug that problem used aggregations, and that several critical steps were only possible with thread-local variables.
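
To make the role of these two features concrete, here is a minimal sketch of the pattern in Python rather than D. It is a conceptual model only, not DTrace code: the Tracer class and its method names are hypothetical, the thread-local object stands in for a thread-local variable that carries a timestamp from an entry probe to the matching return probe, and the counter stands in for an aggregation with power-of-two buckets.

```python
import threading
import time
from collections import Counter, defaultdict


def bucket(ns):
    """Return the largest power of two <= ns (0 for non-positive values)."""
    return 0 if ns <= 0 else 1 << (ns.bit_length() - 1)


class Tracer:
    """Hypothetical model: per-thread state feeding a shared aggregation."""

    def __init__(self):
        self._tls = threading.local()        # models a thread-local variable
        self._agg = defaultdict(Counter)     # models an aggregation keyed by name
        self._lock = threading.Lock()        # simplification; see note below

    def entry(self):
        # Record the start time in this thread's private slot.
        self._tls.ts = time.perf_counter_ns()

    def ret(self, name):
        # Ignore returns with no matching entry on this thread.
        ts = getattr(self._tls, "ts", None)
        if ts is None:
            return
        elapsed = time.perf_counter_ns() - ts
        with self._lock:
            self._agg[name][bucket(elapsed)] += 1
        self._tls.ts = None                  # clear the slot after use

    def snapshot(self):
        # Copy the aggregation as {name: {bucket: count}}.
        with self._lock:
            return {k: dict(v) for k, v in self._agg.items()}
```

Calling entry() and then ret("read") from several threads yields a single latency distribution for "read", and no thread ever sees another thread's timestamp. The shared lock is a shortcut taken for this sketch; it says nothing about how DTrace itself stores aggregation data.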

And fortunately, you don’t even have to take my word for it: Red Hat developer Daniel Berrangé has
posted a comparison of DTrace and DProbes/LTT that reaches roughly the same conclusions…

Ted Leung noted the discussion that Werner and I have been having, and observed that we should consider Rob Pike’s (in)famous polemic, “Systems Software Research is Irrelevant.” I should say that I broadly agree with most of Pike’s conclusions — and academic systems software research has seemed increasingly irrelevant in the last five years. That said, I think that what Pike characterizes as “systems research” is far too skewed to the interface to the system — which (tautologically) is but the periphery of the larger system. In my opinion, “systems research” should focus not on the interface of the system, but rather its guts: those hidden Rube Goldberg-esque innards that are rife with old assumptions and unintended consequences. Pike would perhaps dismiss the study of these innards as “phenomenology”, but I would counter that understanding phenomena is a prerequisite to understanding larger systemic truths. Of course, the problem to date has been that much systems research has not been able to completely understand phenomena — the research has often consisted merely of characterizing it.

As evidence that systems research has become irrelevant, Pike points to the fact that SOSP has had markedly fewer papers presenting new operating systems, observing that “a new language or OS can make the machine feel different, give excitement, novelty.” While I agree with the sentiment that innovation is the source of excitement (and that such exciting innovation has been woefully lacking from academic systems research), I disagree with the implication that systems innovation is restricted to a new language or OS; a new file system, a new debugger, or a new approach to virtualization can be just as exciting. So the good news is that work need not be a new system to be important systems work, but the bad news is that while none of these is as large as a new OS, they’re still huge projects, far more than a graduate student (or even a lab of graduate students) can be expected to complete in a reasonable amount of time.

So if even these problems are too big for academia, what’s to become of academic systems research? For starters, if it’s to be done by graduate students, it will have to be content with smaller innovation. This doesn’t mean that it need be any less innovative — just that the scope of innovation will be naturally narrower. As an extreme example, take the new nohup -p in Solaris 9. While this is a very small body of work, it is exciting and innovative. And yet, most academics would probably dismiss this work as completely uninteresting — even though most could probably not describe the mechanism by which it works. Is this a dissertation? Certainly not — and it’s not even clear how such a small body of work could be integrated into a larger thesis. But it’s original, novel work, and it solves an important and hard (if small) problem. Note, too, that this work is interesting because of the phenomenon that prohibited a naive implementation: any solution that doesn’t address the deadlock inherent in the problem isn’t actually an acceptable solution. This is an extreme example, but it should make the point that smaller work can be interesting — as long as it’s innovative, robust and thorough.

But if the problems that academic systems researchers work on are going to become smaller, the researchers must have the right foundation upon which to build their work: small work is necessarily more specific, and work is markedly less relevant if it’s based on an obsolete system. And (believe it or not) this actually brings us to one of our primary motivations for open sourcing Solaris: we wish to provide complete access to a best-of-breed system that allows researchers to solve new problems instead of revisiting old ones. Will an open source Solaris single-handedly make systems research relevant? Certainly not — but it should make for one less excuse…

Ashlee Vance of The Register has recently compared me to infamous kitchen gadget pitchman
Ron Popeil. Let me clear up two misconceptions that have apparently arisen from this comparison. First, despite some claims to the contrary, DTrace cannot be used to make turkey-jerky. And second, the rumors of a DTrace infomercial starring
Tom Vu are absolutely false. (That said, it is true that many have used DTrace to work their way up from lowly busboys to yacht-owning multi-millionaires…)

Werner Vogels, a member of the USENIX ’04 Program Committee, has written very thoughtful responses to some of my observations. And it’s clear that Werner and I see the same problem: there is insufficient industrial/academic cooperation in computer science systems research — and the lack of cooperation is to the detriment of both groups.

That said, it’s clear that there are some different perspectives as to how to address the problem. A common sentiment that I’m seeing in the comments is that it is up to industry to keep USENIX relevant (in Werner’s words, “industry will need to be more pro-active in making researchers aware of what the problems are that they need to solve”). I don’t entirely agree; in my opinion, the responsibility for keeping USENIX relevant doesn’t lie exclusively with industry — and it doesn’t lie exclusively with academia, either. Rather, the responsibility lies with USENIX itself, for it is the mission of USENIX to encourage research with a “practical bias.” As such, it is up to USENIX to assemble a Program Committee that will reflect this mission, and it is up to both academia and industry to participate as requested. This means that USENIX cannot simply wait for volunteers from industry to materialize — USENIX must seek out people in industry who understand both the academic and the industrial sides of systems research, and they must convince these people to work on a Program Committee. Now, I know that this has happened in the past — and frankly I thought that the USENIX ’04 Program Committee was a step in the right direction: where USENIX ’03 had four (of sixteen) members from industry, USENIX ’04 had six (of seventeen). But unfortunately, USENIX ’05 seems to be a marked decline in industry participation, even from USENIX ’03: the number from industry has dropped back to four (of eighteen). Worse, all four are from industry labs; where both USENIX ’03 and USENIX ’04 had at least one product-oriented member from industry, USENIX ’05 has none.
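
For a sense of scale, the three committee counts quoted above can be turned into percentages. This is a minimal sketch using only the figures in the preceding paragraph; the function name industry_share is made up for illustration.

```python
# (industry members, total members), as quoted in the paragraph above
committees = {
    "USENIX '03": (4, 16),
    "USENIX '04": (6, 17),
    "USENIX '05": (4, 18),
}


def industry_share(industry, total):
    """Percentage of Program Committee members who come from industry."""
    return 100 * industry / total


for name, (industry, total) in committees.items():
    share = industry_share(industry, total)
    print(f"{name}: {industry}/{total} = {share:.1f}% from industry")
```

That works out to 25.0% for USENIX '03, 35.3% for USENIX '04 and 22.2% for USENIX '05.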

Examining these three years of USENIX brings up an interesting question: what has the Program Committee composition looked like over time? That is, is the situation getting better or worse vis a vis industry participation? To answer this question, I looked at the Program Committee composition for the last nine years.

The results are perhaps well-known, but they were shocking to me:

To me, this trend should be deeply disconcerting: an organization that has dedicated itself to research with a “practical bias” is clearly losing that bias in its flagship conference.

So what to do? First, we need some recognition from the USENIX side that this is a serious issue, and that it requires substantial corrective action. I believe that the USENIX Board should charter a committee that consists of academia and industry (both labs and product groups) in roughly equal measure. This committee should hash out some of the misconceptions that each group has of the other, clearly define the problems, develop some long-term (measurable) goals, and make some concrete short- and medium-term recommendations. The deliverable of the committee should be a report summarizing their findings and recommendations — recommendations that the Board should consider but is obviously free to ignore.

The situation is serious, and there is much work to be done to rectify it — but I am heartened by the amount of thought that Werner has put into this issue. If we can find more like him from both industry and academia, we can get the “practical bias” back into USENIX.
