DTrace at Google

Recently, I gave a Tech Talk at Google on DTrace, the video for which is now online. If you’ve seen me present before and you don’t want to suffer through the same tired anecdotes, arcane jokes, and disturbing rants, jump ahead to about 57:16 to see a demo of DTrace for Python, and in particular John Levon’s incredible Python ustack helper. You also might want to skip ahead to 1:10:46 for the Q&A; any guesses what the first question was? (Hint: it was such an obvious question that both the room and I erupted with laughter.) Note that my rather candid answer to that question represents my opinion alone and does not constitute legal advice, nor does it represent any sort of official position of Sun…

Posted on August 21, 2007 at 7:29 am by bmc
In: Solaris

11 Responses


  1. Written by Marc
    on August 21, 2007 at 9:39 am

    Very cool! I had not heard about the python work.
    Would have been great with a resolution good enough to read the screen…

  2. Written by Colin Burgess
    on August 21, 2007 at 12:28 pm

    Wonderful so far – just had to let you know that I nearly snorted hot coffee through my nostrils when you explained about some smartypants wanting MAP_FIXED.
    If I were paid by the number of times I have to explain to some people just WHY choosing your own mapping addresses is Not A Good Idea(tm), I’d be on an island somewhere…
    And the sad thing is… it’s oftentimes the same people that I explained to the last time!

  3. Written by Steinar H. Gunderson
    on August 21, 2007 at 1:19 pm

    Hi,
    I was present at the talk (which was really good, BTW — little hype and more _why_ is it cool), and somehow forgot the follow-up to the obvious question: If Sun is interested in seeing DTrace in Linux, would simply dual-licensing the code (dual CDDL/GPLv2) be an option?

  4. Written by Bryan Cantrill
    on August 21, 2007 at 2:01 pm

    Marc, apologies about the screen resolution. You might want to check out John’s blog entry (linked above) for details on what I was describing.
    Colin, glad I provided amusement on MAP_FIXED, and yes, it’s a Bad Idea — a great example of a little knowledge being a dangerous thing.
    Steinar, dual licensing is a mess, unfortunately — it opens up nasty pathologies like license-based forks that represent unacceptable risk. Should you be interested in a flame war on the subject, see this blog entry and its comments:
    http://blogs.sun.com/ahl/entry/dtrace_knockoffs

  5. Written by Alok Bisani
    on August 21, 2007 at 4:49 pm

    A very cool demonstration, and no sales pitch can beat this kind of demo. The resolution was very poor on Google Video; it would be nice if you could publish the few DTrace scripts that you used somewhere. (And add a comment on Google Video?)
    Incidentally, we moved off Solaris 8 to Linux blade servers recently at work. Having such facilities available on other systems, via licensing (or as a paid-for kernel module? does such a thing exist??), would definitely be useful. (Best to sell stuff while you can … )

  6. Written by Edward O'Callaghan
    on August 21, 2007 at 6:09 pm

    Hey,
    Would be cool if more devs used DTrace on Xorg to help clean up some of its many bugs. :P
    Good work,
    Edward.

  7. Written by Steinar H. Gunderson
    on August 21, 2007 at 11:33 pm

    Bryan, thanks for the clarification wrt. dual-licensing.
    On a totally different note: When it comes to mixer_applet2, I’m not sure if my point got through at the presentation (I’m not a native speaker :-) ), but in case you’re curious, mixer_applet2 is the GNOME mixer applet (i.e., the volume control; it does not mix PCM audio or anything like that). What it does every 100ms is poll the sound card’s volume controls, so it can show the correct volume in its icon if something else were to change it.
    Now, this causes the CPU to wake up a lot, as you found out live, and waking up is bad for a laptop, obviously. The Intel people have made a Linux program called PowerTOP designed to find causes of CPU wakeup (so, like a more specialized version of what you constructed in D at the presentation) in order to let the CPU sleep longer and thus save battery life when the machine is idle. One of the first issues fixed as a result of these efforts was in fact this one: newer versions of mixer_applet2 can subscribe to ALSA notification events (via HAL) and simply get a message whenever the value changes. (I am unsure if this works for Solaris, though, as I don’t think OSS has any sort of messaging like this.)

  8. Written by Matthew Johnson
    on August 23, 2007 at 1:12 am

    Hi Bryan,
    It’s a joy to see your presentation again.
    I don’t know if you remember, but you visited my company in Australia last year (might have been the year before) after a DTrace presentation in Sydney. We put you straight to work on the terminal to demonstrate DTrace in action with our software.
    Anyway, our software makes very heavy use of threads, mutexes, and condition variables. We have found that running 3 instances of our software performs much better than 1 instance (configured with the same number of threads as the 3 instances combined), while using only 50% of CPU (no iowait) on a fully loaded E2900.
    So we have theorized that there are shared mutexes or condition variables that threads are blocking behind.
    Could DTrace identify the top x mutexes/condition variables that have the most threads waiting on them?
    Which probes should we be looking at?
    Would love to see you out in Australia again :)
    Thanks
    Matt.

  9. Written by Bryan Cantrill
    on August 23, 2007 at 6:42 am

    Hey Matt,
    I definitely remember you, and the fun afternoon we had drinking beers and debugging performance issues in Sydney in October, 2005. And I still retell one of the things we found that day: as you might recall, you had seen a performance regression going from S9 to S10 — which turned out to be due to a configuration change to use /var/tmp (an on-disk filesystem) instead of /tmp (an in-memory filesystem) for your temporary files. To me, that’s a great example of a mistake that anyone could easily make — but that can be reasonably difficult to find without a tool like DTrace.
    As for your recent problem: yes (of course!) DTrace can be of help here. Take a look at lockstat(1M) (for in-kernel locks) and plockstat(1M) (for user-level locks), both of which are implemented in terms of DTrace. I think you might also want to make use of the sched provider to understand how your threads are being scheduled (and why three instances perform better than one!).
    That should point you in the right direction, but if you’re looking for more help, consider asking dtrace-discuss@opensolaris.org — there’s a ton of helpful DTrace expertise on that list.

  10. Written by Harish Mallipeddi
    on August 26, 2007 at 8:56 am

    Bryan,
    If I wanted to get the Python ustack working with dtrace, which version of Solaris should I be downloading? And where do I download it from? Sorry I’m just new to Solaris!

  11. Written by Bryan Cantrill
    on August 26, 2007 at 6:24 pm

    Harish,
    You want the latest Solaris Express Community Edition (which is updated every two weeks, and contains the latest version of the operating system). It’s available here:
    http://opensolaris.org/os/downloads/
