USDT Providers Redux

In this post I’m going to review DTrace USDT providers and show a complete working example that I hope will be a useful reference for people interested in building providers for their own applications.

First, the prerequisites:


Tektronix MSO 4034 (photo by Rostislav Lisovy (flickr))

Back in 2009 I posted about how we put together an Apache USDT provider for the Sun Storage appliances. Although I wasn’t able to include any of the actual source, this post became the de facto documentation on implementing a USDT provider with translated arguments. Two years later, as part of our work on Cloud Analytics we’ve written (from scratch) a new Apache provider called mod_usdt, and this time the complete source is available on GitHub.

I hope this provider can be a useful reference for others building USDT providers. There are plenty of other providers out there already, but many of them don’t use translated arguments (see below), and most of them are (naturally) embedded inside a complex application and build system, so it’s hard to pick out what’s actually relevant. Since this module is built separately from Apache, nearly 100% of the code in mod_usdt relates directly to the provider.

The main files are these:

There are also some example DTrace scripts in the repo for tracing request latency; see the README for details. Between these various pieces, you have everything you need to define and use the provider.

At this point, if you want to understand the pieces of the implementation, your best bet is to go read all the code in mod_usdt. It’s not actually that much code, and it’s well documented. The rest of this post describes how the surrounding mechanism works.

USDT probes

The provider implementation uses macros in the source to define probe points:

    HTTPD_REQUEST_START(request->method, request->uri);

This example (not actually from mod_usdt) includes two probe arguments, which can be accessed from D scripts to filter or aggregate on this information (the HTTP method and URI in this example). Passing simple arguments works well for simple probes, but starts getting unwieldy as you add a bunch more arguments. So DTrace also allows you to define C-style structs that are passed as arguments to the probes. To do this, you must define translators that take the actual arguments from the application and populate a struct for use in the probe. (Don’t worry if you didn’t catch all that. This will become clearer below.)

Building the shared object

We start with the provider definition file, httpd_provider.d, which defines probes like this:

probe request__start(dthttpd_t *p) : (conninfo_t *p, httpd_rqinfo_t *p);

This defines a probe called “request-start”. The application will pass a pointer to dthttpd_t, and the actual D probes will get a conninfo_t and a httpd_rqinfo_t.

When you run “make”, the first step is to take this provider definition and generate a header file that defines the macros that the application uses to fire the probes:

dtrace -xnolibs -h -o build/httpd_provider.h -s src/httpd_provider.d

The header file has definitions like this:

#define HTTPD_REQUEST_START(arg0) \
        __dtrace_httpd___request__start(arg0)

Conceptually, the application uses this macro to fire the probe. (As we’ll see, there’s some magic involved.) Next, we compile the source file that uses the macro:

gcc -Wall -Werror -fPIC  -Ibuild -I/opt/local/include/apr-1 -I/opt/local/include -I/opt/local/include/db4 -D_REENTRANT -I/usr/include -DLDAP_DEPRECATED -I/opt/local/include/httpd -c -o build/usdt.o src/usdt.c

The resulting object file indeed includes a call to the special __dtrace function, but there’s another important step next:

dtrace -xnolibs -G -o build/httpd_provider.o -s src/httpd_provider.d build/usdt.o

The “dtrace -G” pass iterates over each of the objects (just usdt.o in this case), identifies these function calls, and for each one records which probe is being fired and the location in the program text[2]. It then replaces the function calls with “nop” instructions, which do nothing when executed. A new object file (httpd_provider.o) is generated that includes the probe and location information in a special SUNW_dof section. This object also contains an “_init” function, which we’ll get to shortly. To wrap up the build, we link these objects into the final shared library:

gcc -shared -fPIC  -o build/mod_usdt.so build/usdt.o build/httpd_provider.o
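To make the call-to-nop rewriting more concrete, here is a toy C sketch that works on a plain byte buffer. It is not DTrace’s code: probe_site_t and strip_probe_calls are hypothetical names, and the sketch assumes each probe site is a 5-byte x86 “call rel32” instruction (opcode 0xe8) whose offset is already known.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_CALL 0xe8   /* opcode of a near "call rel32" */
#define X86_NOP  0x90   /* single-byte nop */
#define CALL_LEN 5      /* opcode byte plus 32-bit displacement */

/* Hypothetical record of one probe call site (an assumption of this sketch). */
typedef struct {
    const char *probe;  /* probe name, e.g. "request-start" */
    size_t offset;      /* offset of the call within the text buffer */
} probe_site_t;

/*
 * Overwrite each valid call site with nops. A site is skipped if it does
 * not fit in the buffer or does not start with the call opcode.
 * Returns the number of sites that were patched.
 */
size_t
strip_probe_calls(uint8_t *text, size_t len, const probe_site_t *sites,
    size_t nsites)
{
    size_t patched = 0;

    for (size_t i = 0; i < nsites; i++) {
        size_t off = sites[i].offset;

        if (off > len || len - off < CALL_LEN)
            continue;   /* site would run past the end of the text */
        if (text[off] != X86_CALL)
            continue;   /* not a call instruction */

        memset(text + off, X86_NOP, CALL_LEN);
        patched++;
    }

    return (patched);
}
```

The real pass also has to write the probe names and locations into the SUNW_dof section; the sketch only covers the patching half.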

Loading the module

When the library is actually loaded, the _init function transmits the information in the SUNW_dof section down to the DTrace kernel module. See the source for details. With this, DTrace knows which probes are available for this process and their locations in memory.
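The load-time handoff can be imitated in plain C. The sketch below is only an analogy: it uses the GCC/Clang constructor attribute in place of the generated _init function, and an in-process array (fake_kernel_table) in place of the DTrace kernel module. All names and offsets here are hypothetical.

```c
#include <stddef.h>

/* Hypothetical description of one probe site, standing in for DOF data. */
typedef struct {
    const char *name;
    size_t offset;
} probe_desc_t;

/* The probes this module would advertise (made-up offsets). */
static const probe_desc_t module_probes[] = {
    { "request-start", 0x40 },
    { "request-done", 0x9c },
};

/* Stand-in for the kernel's per-process probe table. */
#define MAX_PROBES 16
static probe_desc_t fake_kernel_table[MAX_PROBES];
static size_t fake_kernel_count;

static void
fake_kernel_register(const probe_desc_t *probes, size_t n)
{
    for (size_t i = 0; i < n && fake_kernel_count < MAX_PROBES; i++)
        fake_kernel_table[fake_kernel_count++] = probes[i];
}

/* Runs automatically when the object is loaded, before main(). */
__attribute__((constructor))
static void
module_init(void)
{
    fake_kernel_register(module_probes,
        sizeof (module_probes) / sizeof (module_probes[0]));
}

size_t
registered_probe_count(void)
{
    return (fake_kernel_count);
}

/* Returns the name of the i-th registered probe, or NULL if out of range. */
const char *
registered_probe_name(size_t i)
{
    return (i < fake_kernel_count ? fake_kernel_table[i].name : NULL);
}
```

The point of the analogy is that the application never calls a registration function itself; loading the object is enough.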

Enable the probes

When a user runs dtrace(1M) to instrument a USDT probe, DTrace replaces the “nop” instructions at all of that probe’s call sites with “int 3” (0xcc) instructions, which will cause a fast trap into the kernel. The DTrace kernel module can tell (from the location of the trapping instruction) that this corresponds to an enabled probe, so it fires the probe and returns back to userland.

When the probes are disabled again (i.e. when you CTRL-C the dtrace(1M) process), the “int 3” instructions are changed back to nops.
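The enable and disable steps amount to flipping one byte at each probe site. Here is a toy C sketch of that idea on a byte buffer. The function names are hypothetical, and for simplicity the sketch identifies a trap by the offset of the patched byte itself, which glosses over how a real trap handler derives the address.

```c
#include <stddef.h>
#include <stdint.h>

#define X86_NOP  0x90   /* single-byte nop */
#define X86_INT3 0xcc   /* breakpoint instruction, "int 3" */

/* Enable: put an int 3 at the first byte of every site for this probe. */
void
probe_enable(uint8_t *text, const size_t *sites, size_t nsites)
{
    for (size_t i = 0; i < nsites; i++)
        text[sites[i]] = X86_INT3;
}

/* Disable: restore the nop at every site. */
void
probe_disable(uint8_t *text, const size_t *sites, size_t nsites)
{
    for (size_t i = 0; i < nsites; i++)
        text[sites[i]] = X86_NOP;
}

/* Returns 1 if the trapping offset is one of the known probe sites. */
int
trap_is_probe(const size_t *sites, size_t nsites, size_t trap_offset)
{
    for (size_t i = 0; i < nsites; i++) {
        if (sites[i] == trap_offset)
            return (1);
    }
    return (0);
}
```

Only the first byte of each site changes; the remaining nops stay in place, so the instruction stream is still valid after the probe fires.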

The upshot of all this is that when DTrace is not enabled, the application runs essentially the same as it did before USDT support was added at all. There’s no overhead to just having DTrace support.

Translated arguments

The files in /usr/lib/dtrace define structures and translators that take arguments passed into probes from the application and convert them to semantically meaningful structures. For details, see src/httpd.d in mod_usdt.

As described above, you can get away without translated arguments by passing primitives (ints and pointers, with which you can access strings too) directly to the probes. That way, you don’t have to deliver files into /usr/lib/dtrace.
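As a loose analogy for what a translator does, here is a C sketch that maps an application-side struct onto a script-facing one. Real translators are written in D (see src/httpd.d), not C. The types app_request_t and script_rqinfo_t and their fields are hypothetical stand-ins for dthttpd_t and httpd_rqinfo_t, not the actual definitions in mod_usdt.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical application-side argument (stand-in for dthttpd_t). */
typedef struct {
    const char *method;
    const char *uri;
    int status;
} app_request_t;

/* Hypothetical script-facing struct (stand-in for httpd_rqinfo_t). */
typedef struct {
    char rq_method[16];
    char rq_uri[128];
    int rq_status;
} script_rqinfo_t;

/* Copy a possibly-NULL string into a fixed buffer, always terminating it. */
static void
copy_field(char *dst, size_t dstlen, const char *src)
{
    (void) snprintf(dst, dstlen, "%s", src != NULL ? src : "");
}

/* Populate the script-facing struct from the application's own struct. */
void
translate_request(const app_request_t *in, script_rqinfo_t *out)
{
    copy_field(out->rq_method, sizeof (out->rq_method), in->method);
    copy_field(out->rq_uri, sizeof (out->rq_uri), in->uri);
    out->rq_status = in->status;
}
```

The application only ever passes its own struct to the probe, so the script-facing layout can stay stable even if the application’s struct changes.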

And a few more things

Believe it or not, all of the above is a gross simplification of what actually goes on. To see all the details of implementing a provider, you’ll have to read all of mod_usdt. For a taste of how complex the mechanism for firing userland probes really is, check out Bryan’s post on what happens when magic collides.

Also, I explained above that there’s no overhead for disabled probes. That’s true only as long as there’s no code required to set up the arguments for the DTrace probe. But often you do want to include more complex arguments. To minimize the overhead of setting them up when DTrace is not enabled, you can use ISENABLED macros to tell at run-time whether the probe is enabled. mod_usdt uses these to avoid setting up the structure if the probes are not enabled.
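The is-enabled pattern looks roughly like the C sketch below. So that it builds without DTrace, the two macros are hand-written stand-ins for what the generated httpd_provider.h would provide (the _ENABLED suffix follows the naming used by “dtrace -h”). The sketch_* variables, sketch_args_t, and handle_request are hypothetical and are not taken from mod_usdt.

```c
#include <stddef.h>

/*
 * Stand-ins for the generated macros. Here the "enabled" state is a plain
 * variable toggled by hand; with the real header it reflects whether a
 * DTrace consumer has enabled the probe.
 */
int sketch_probe_enabled;
unsigned long sketch_fired;
#define HTTPD_REQUEST_START_ENABLED()   (sketch_probe_enabled)
#define HTTPD_REQUEST_START(arg0) \
    do { (void)(arg0); sketch_fired++; } while (0)

/* Hypothetical argument struct handed to the probe. */
typedef struct {
    const char *method;
    const char *uri;
} sketch_args_t;

/* Counts how often the argument setup code actually ran. */
unsigned long sketch_setups;

void
handle_request(const char *method, const char *uri)
{
    /* Only pay for argument setup when someone is tracing. */
    if (HTTPD_REQUEST_START_ENABLED()) {
        sketch_args_t args;

        args.method = method;
        args.uri = uri;
        sketch_setups++;
        HTTPD_REQUEST_START(&args);
    }

    /* ... normal request handling would continue here ... */
}
```

With the probe disabled, the only cost on the request path is the check itself; the struct is never populated.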

Conclusion

USDT is extremely powerful for tracing semantically meaningful events in userland applications. While implementing a provider doesn’t require a lot of work per se, the steps required are not very obvious. I hope that mod_usdt will serve as a useful reference for those looking to add USDT to their own application, besides being useful in its own right for those using Apache.


[1] This contrasts with the pid provider, which allows users to trace almost any instruction, function entry, and function exit in an application. This flexibility makes the pid provider extremely valuable for ad-hoc investigation, but the resulting dependence on application implementation details makes it unsuitable for stable tools.

[2] The “dtrace -G” pass emits relocations, since the final locations of these instructions won’t be known until the run-time linker runs.

Posted on December 13, 2011 at 11:13 am by dap
In: DTrace, Joyent, SmartOS

3 Responses

  1. Written by Dave Pacheco's Blog » Anatomy of a DTrace USDT provider
    on December 13, 2011 at 11:17 am

    [...] the information in this post has been updated with a complete source example. This post remains for historical [...]

  2. Written by Richard Elling
    on December 16, 2011 at 7:10 pm

    Thanks Dave! DTracers really appreciate the work you do!

  3. Written by dap
    on December 18, 2011 at 1:56 pm

    Thanks, Richard!