The Observation Deck

Solaris 10 Revealed

January 25, 2005

Or some of it, anyway. If you haven’t yet seen it, today we open sourced some of Solaris. This isn’t actually the formal launch of OpenSolaris — we’re still busily working away on that — but we wanted to reveal enough of the juicy bits of the Solaris source for the world to realize that we’re serious about OpenSolaris. I view OpenSolaris as an important milestone in the history of operating systems — doubly so because it is Solaris 10 that we are open sourcing, an operating system that I believe to be an important historical milestone in its own right — so it is an honor for me to report that the source code that we decided to release today is that for DTrace.

But enough with the pomp; let’s talk source. I assume that most people who download the source today will download it, check to see that it’s actually source code,1 and then say to themselves, “now what?” To help answer that question, I thought I would take a moment to describe the layout of the source, and point you to some interesting tidbits therein.

As with any Unix variant with Bell Labs roots, you’ll find all of the source under the “usr/src” directory. Under usr/src, you’ll find four directories:

  • usr/src/cmd contains commands. Soon, this directory will be populated with its 400+ commands, but for now it just contains subdirectories for each of the following DTrace consumers: dtrace(1M), intrstat(1M), lockstat(1M) and plockstat(1M). This directory additionally contains a subdirectory for the isaexec command, as the DTrace consumers are all isaexec(3C)’d.
  • usr/src/lib contains libraries. Soon, this directory will be populated with its 150+ libraries, but for now it just contains the libraries upon which the DTrace consumers depend. Each of these libraries is in an eponymous subdirectory:
    • libdtrace(3LIB) is the library that does much of the heavy lifting for a DTrace consumer. The D compiler lives here, along with all of the infrastructure to process outbound data from the kernel. The kernel/user interface is discussed in detail in <sys/dtrace.h>.
    • libctf is a library that is able to interpret the Compact C Type Format (CTF2). CTF is much more compact than the traditional type stabs, and we use it to represent the kernel’s types. (CTF is what allows ::print to work in mdb. If you’ve never done it, try “echo '::print -at vnode_t' | mdb -k” as root on a Solaris 9 or Solaris 10 machine.) DTrace is very dependent on libctf and hence the reason that we’re including it now. Note that much of the source for libctf is in usr/src/common; see below.
    • libproc is a library that exports interfaces for process control via /proc. (The Solaris /proc is vastly different from the Linux /proc in that it is used primarily as a means of process control and process information — not simply as a means of system information; see proc(4) for details.) Many, many Solaris utilities link with libproc including pcred(1), pfiles(1), pflags(1), pldd(1), plimit(1), pmap(1), ppgsz(1), ppriv(1), prctl(1), preap(1), prstat(1), prun(1), pstack(1), pstop(1), ptime(1), ptree(1) and pwdx(1). Thanks to the powerful interfaces in libproc, many of these utilities are quite short in terms of lines of code. These interfaces aren’t yet public, but you can get a taste for them by looking at /usr/include/libproc.h — or by reading the source, of course!
  • usr/src/uts is the main event — the kernel. (“uts” stands for “Unix Time-Sharing System”, and is another artifact from Bell Labs.) The subdirectories here are roughly what you might expect:
    • common contains common code
    • i86pc contains code specific to the PC machine architecture
    • intel contains code specific to the x86 instruction set architecture
    • sparc contains code specific to the SPARC instruction set architecture
    • sun4 contains code specific to the sun4u machine architecture, but general to all platform architectures within that machine architecture

    The difference between instruction set architecture and machine architecture is a bit fuzzy, especially when there is a one-to-one relationship between the two. (And in case you’re not yet confused, platform architectures add another perplexing degree of freedom within machine architectures.) All of this made more sense when there was a one-to-many relationship between instruction sets and machine architectures, but sun4m, sun4d and sun4c have been EOL’d and the source for these machine architectures has been removed. This layout may seem confusing, but I’m here to describe the source layout — not defend it — so moving on…

    In terms of DTrace, most of the excitement is in usr/src/uts/common/dtrace, usr/src/uts/intel/dtrace and usr/src/uts/sparc/dtrace. In usr/src/uts/common/dtrace, you’ll find the meat of the in-kernel DTrace component in the 13,000+ lines of dtrace.c. In this directory you will additionally find the source for common providers like profile(7D) and systrace(7D) along with the common components for the lockstat(7D) provider. In the ISA-specific directories, you’ll find the ISA-specific halves to these providers, along with wholly ISA-specific providers like fbt(7D). You will also find the ISA-specific components to DTrace itself in dtrace_asm.s and dtrace_isa.c.

So that’s the basic layout of the source but…now what? If you’re like me, you don’t have the time or interest to understand something big and foreign — and you mainly look at source code to get a flavor for things. Maybe you just want to grep for “XXX” (you’ll find two — but we on the DTrace team are responsible for neither one) or look for curse words (you’ll find none — at least none yet) or just search for revealingly frank words like “hack”, “kludge”, “vile”, “sleaze”, “ugly”, “gross”, “mess”, etc. (of which you will regrettably find at least one example of each).

But in the interest of leaving you with more than just whiffs of our dirty laundry, let me guide you to some interesting tidbits that require little or no DTrace knowledge to appreciate. I’m not claiming that these tidbits are particularly novel or even particularly core to DTrace; I’m only claiming that they’re interesting for one reason or another. For example, check out this comment in dtrace.c:

/*
 * We want to have a name for the minor.  In order to do this,
 * we need to walk the minor list from the devinfo.  We want
 * to be sure that we don't infinitely walk a circular list,
 * so we check for circularity by sending a scout pointer
 * ahead two elements for every element that we iterate over;
 * if the list is circular, these will ultimately point to the
 * same element.  You may recognize this little trick as the
 * answer to a stupid interview question -- one that always
 * seems to be asked by those who had to have it laboriously
 * explained to them, and who can't even concisely describe
 * the conditions under which one would be forced to resort to
 * this technique.  Needless to say, those conditions are
 * found here -- and probably only here.  Is this the only
 * use of this infamous trick in shipping, production code?
 * If it isn't, it probably should be...
 */

This is the code that executes the ddi_pathname function in D. A critical constraint on executing D is that any programming errors must be caught and handled gracefully. (We call this the “safety constraint” because failure to abide by it will induce fatal system failure.) While many programming environments recover from memory-related errors, we in DTrace must additionally guard against infinite iteration — a much harder problem. (In fact, this problem is so impossibly hard that its name has become synonymous with undecidability: this is the halting problem that Turing proved impossible to solve in 1936.) We skirt the halting problem in DTrace by not allowing programmer-specified iteration whatsoever: DIF doesn’t allow backwards branches. But some D functions — like ddi_pathname() — require iteration over untrusted data structures to function. When iterating over an untrusted list, our fear is circularity (be it innocent or pernicious), and the easiest way for us to determine this circularity is to use the interview question described above.

I wrote this code a while ago; now that I read it again I actually can imagine some uses for this in other production code — but I would imagine it would all be of the assertion variety. (That is, again the data structure is effectively untrusted.) Any other use still strikes me as busted (prove me wrong?), and I still have disdain for those that ask it as an interview question (apologies if this includes you, gentle reader).
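For anyone who hasn't had the interview question inflicted on them, here is a minimal sketch of the scout-pointer technique that the comment describes. It is not the code from dtrace.c: the node_t type and the list_is_circular() and list_link() functions are hypothetical names invented for illustration, and the real code walks the devinfo minor list rather than a toy list like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node; not the devinfo minor structure from dtrace.c. */
typedef struct node {
	struct node *next;
} node_t;

/*
 * Returns true if following next pointers from head never reaches NULL.
 * The scout advances two elements for every one that the walker advances;
 * if the list is circular, the two must eventually point to the same element.
 */
static bool
list_is_circular(const node_t *head)
{
	const node_t *walker = head;
	const node_t *scout = head;

	while (scout != NULL && scout->next != NULL) {
		walker = walker->next;
		scout = scout->next->next;

		if (walker == scout)
			return (true);
	}

	return (false);
}

/*
 * Test helper: chains the n nodes of an array together and points the last
 * node at tail_to (NULL for a terminated list, or any node to form a cycle).
 */
static void
list_link(node_t *nodes, size_t n, node_t *tail_to)
{
	assert(n > 0);

	for (size_t i = 0; i + 1 < n; i++)
		nodes[i].next = &nodes[i + 1];

	nodes[n - 1].next = tail_to;
}
```

Linking five nodes with list_link(nodes, 5, NULL) gives a list for which list_is_circular() returns false; pointing the tail back at any node in the chain makes it return true. The appeal for untrusted data is that the check needs only constant space and never writes to the list.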

Here’s another interesting tidbit, also in dtrace.c:

	switch (v) {
	case DIF_VAR_ARGS:
		ASSERT(mstate->dtms_present & DTRACE_MSTATE_ARGS);
		if (i >= sizeof (mstate->dtms_arg) /
		    sizeof (mstate->dtms_arg[0])) {
			int aframes = mstate->dtms_probe->dtpr_aframes + 2;
			dtrace_provider_t *pv;
			uint64_t val;
			pv = mstate->dtms_probe->dtpr_provider;

			if (pv->dtpv_pops.dtps_getargval != NULL)
				val = pv->dtpv_pops.dtps_getargval(pv->dtpv_arg,
				    mstate->dtms_probe->dtpr_id,
				    mstate->dtms_probe->dtpr_arg, i, aframes);
			else
				val = dtrace_getarg(i, aframes);

			/*
			 * This is regrettably required to keep the compiler
			 * from tail-optimizing the call to dtrace_getarg().
			 * The condition always evaluates to true, but the
			 * compiler has no way of figuring that out a priori.
			 * (None of this would be necessary if the compiler
			 * could be relied upon to _always_ tail-optimize
			 * the call to dtrace_getarg() -- but it can't.)
			 */
			if (mstate->dtms_probe != NULL)
				return (val);
			ASSERT(0);
			...

This is the code that retrieves an argument due to a reference to args[n] (or argn). The clause above will only be executed if n is equal to or greater than five — in which case we need to go fishing in the stack frame for the argument. And here’s where things get a bit gross: in order to be able to find the right stack frame, we must know exactly how many stack frames have been artificially pushed in the process of getting into DTrace. This includes frames that the provider may have pushed (tracked in the probe as the dtpr_aframes variable) and the frames that DTrace itself has pushed (rather bogusly represented by the constant “2”, above: one for dtrace_probe() and one for dtrace_dif_emulate()). The problem is that if the call to dtrace_getarg() is tail-optimized, our calculation is incorrect. We therefore have to trick the compiler by having an expression after the call that the compiler is forced to evaluate after the call. We do this by having an expression that always evaluates to true, but dereferences through a pointer. Because dtrace_getarg() is in another object file, no amount of alias disambiguation is going to figure out that it doesn’t modify dtms_probe; the compiler doesn’t tail-optimize the above call, the stack frame calculation is correct, and the arguments are correctly fished out of the (true) caller’s stack frame.

There’s an interesting footnote to the above code: recently, we ran a research tool that performs static analysis of code on the source for the Solaris kernel. The tool was actually pretty good, and found all sorts of interesting issues. Among other things, the tool flagged the above code, observing that dtms_probe is never NULL. (The tool may be clever enough to determine that, but it obviously can’t be clever enough to know that we’re trying to outsmart the compiler here.) While this might give us pause, it needn’t: the tool might warn about it, but no compiler could safely avoid evaluating the expression — because dtrace_getarg() is not in the same object file, it cannot be absolutely certain that dtrace_getarg() does not store to dtms_probe.
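Stripped of the DTrace specifics, the shape of the trick looks something like the sketch below. Everything here is a hypothetical stand-in (mstate_t, ms_probe, getarg_f, fetch_arg() and sample_getarg() are not names from the DTrace source), and the callee is reached through a function pointer only so that the example fits in one file; in the real code the opacity comes from dtrace_getarg() living in another object file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-probe state; only one field matters. */
typedef struct mstate {
	const void *ms_probe;	/* assumed non-NULL whenever a probe fires */
} mstate_t;

/* Hypothetical argument fetcher that counts frames back from its caller. */
typedef uint64_t (*getarg_f)(int argno, int aframes);

static uint64_t
fetch_arg(mstate_t *ms, getarg_f getarg, int argno, int aframes)
{
	uint64_t val = getarg(argno, aframes);

	/*
	 * If the function simply returned getarg(...), the call would be in
	 * tail position, and a compiler would be free to replace it with a
	 * jump that reuses this frame.  Loading ms->ms_probe after the call
	 * leaves work to do once getarg() returns, so this frame must stay.
	 * The compiler cannot hoist the load above the call because it cannot
	 * prove that the callee leaves *ms untouched.
	 */
	if (ms->ms_probe != NULL)
		return (val);

	assert(0);
	return (0);
}

/* Hypothetical fetcher used only to exercise the sketch. */
static uint64_t
sample_getarg(int argno, int aframes)
{
	return ((uint64_t)argno * 100 + (uint64_t)aframes);
}
```

With ms_probe pointing at anything non-NULL, fetch_arg(&ms, sample_getarg, 7, 3) simply hands back what the fetcher computed. The interesting part is what the compiler is prevented from doing, and that is only visible in the generated assembly.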

As long as we’re going through dtrace_getarg(), though, it may be interesting to look at a routine that implements this on SPARC.3 This routine, found in usr/src/uts/sparc/dtrace/dtrace_asm.s, fishes a specific argument out of a specified register window — without causing a window spill trap. Here’s the function:

#if defined(lint)
/*ARGSUSED*/
int
dtrace_fish(int aframes, int reg, uintptr_t *regval)
{
	return (0);
}
#else   /* lint */
	ENTRY(dtrace_fish)
	rd      %pc, %g5
	ba      0f
	add     %g5, 12, %g5
	mov     %l0, %g4
	mov     %l1, %g4
	mov     %l2, %g4
	mov     %l3, %g4
	mov     %l4, %g4
	mov     %l5, %g4
	mov     %l6, %g4
	mov     %l7, %g4
	mov     %i0, %g4
	mov     %i1, %g4
	mov     %i2, %g4
	mov     %i3, %g4
	mov     %i4, %g4
	mov     %i5, %g4
	mov     %i6, %g4
	mov     %i7, %g4
0:
	sub     %o1, 16, %o1            ! Can only retrieve %l's and %i's
	sll     %o1, 2, %o1             ! Multiply by instruction size
	add     %g5, %o1, %g5           ! %g5 now contains the instr. to pick
	rdpr    %ver, %g4
	and     %g4, VER_MAXWIN, %g4

	!
	! First we need to see if the frame that we're fishing in is still
	! contained in the register windows.
	!
	rdpr    %canrestore, %g2
	cmp     %g2, %o0
	bl      %icc, 2f
	rdpr    %cwp, %g1
	sub     %g1, %o0, %g3
	brgez,a,pt %g3, 0f
	wrpr    %g3, %cwp

	!
	! CWP minus the number of frames is negative; we must perform the
	! arithmetic modulo MAXWIN.
	!
	add     %g4, %g3, %g3
	inc     %g3
	wrpr    %g3, %cwp
0:
	jmp     %g5
	ba      1f
1:
	wrpr    %g1, %cwp
	stn     %g4, [%o2]
	retl
	clr     %o0                     ! Success; return 0.
2:
	!
	! The frame that we're looking for has been flushed to the stack; the
	! caller will be forced to
	!
	retl
	add     %g2, 1, %o0             ! Failure; return deepest frame + 1
	SET_SIZE(dtrace_fish)
#endif

First, apologies for the paucity of comments in the above. The lack of comments is particularly unfortunate because the function is somewhat subtle, as it uses some dicey register window manipulation plus an odd SPARC technique known as “instruction picking”: the jmp with the ba in the delay slot picks one of the instructions out of the table that follows the ba 0f, thus allowing the caller to specify any register to fish out of the window without requiring any compares.4 If you’re interested in the details of the register window manipulation logic in this function, you should consult the SPARC V9 Architecture Manual.
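Since the assembly is light on comments, here is a rough C rendering of the two pieces of arithmetic in dtrace_fish(): computing which instruction in the table to pick, and computing which register window to switch to. This is only a model of the arithmetic under my reading of the listing above; pick_offset() and target_cwp() are hypothetical names, and real code obviously cannot write %cwp from C.

```c
#include <assert.h>
#include <stdint.h>

#define	SPARC_INSTR_SIZE	4	/* every SPARC instruction is 4 bytes */
#define	FIRST_LOCAL_REG		16	/* %l0 is register 16; %i7 is 31 */

/*
 * Byte offset of the "mov" to pick, relative to the first entry in the table.
 * This mirrors the "sub %o1, 16, %o1" and "sll %o1, 2, %o1" pair; the table
 * base itself is the %pc read by the rd instruction plus 12, which skips the
 * rd, the ba and the add in the delay slot.
 */
static uint32_t
pick_offset(int reg)
{
	/* Only %l0-%l7 and %i0-%i7 live in the table. */
	assert(reg >= FIRST_LOCAL_REG && reg <= 31);

	return ((uint32_t)(reg - FIRST_LOCAL_REG) * SPARC_INSTR_SIZE);
}

/*
 * The window to switch to in order to look aframes frames back.  The maxwin
 * argument models the masked VER_MAXWIN value, which is the highest valid
 * window index; a negative result therefore wraps by adding maxwin + 1.
 */
static int
target_cwp(int cwp, int aframes, int maxwin)
{
	int target = cwp - aframes;

	if (target < 0)
		target += maxwin + 1;

	return (target);
}
```

So asking for %i0 (register 24) selects the ninth mov in the table, at byte offset 32, and looking three frames back from window 1 on a machine whose highest window index is 7 lands in window 6. The assembly only gets this far if %canrestore says the frame is still in the register windows; otherwise it bails out to label 2.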

That about does it for tidbits, at least for now. As you browse the DTrace source, you may well find yourself asking “does it need to be this complicated?” The short answer is, in most cases, “regrettably yes.” If you’re looking for some in-depth discussion on the specific issues that complicate specific features, I would direct you to functions like dtrace_hres_tick() (in usr/src/uts/common/os/dtrace_subr.c), dtrace_buffer_reserve() (in usr/src/uts/common/dtrace/dtrace.c) and dt_consume_begin() (in usr/src/lib/libdtrace/common/dt_consume.c). These functions are good examples of how seemingly simple DTrace features like timestamps, ring buffers and BEGIN/END probes can lead to much more complexity than one might guess.

Finally, I suppose there’s an outside chance that you might actually want to understand how DTrace works — perhaps even to modify it yourself. If this describes you, you should first heed this advice from usr/src/uts/common/dtrace/dtrace.c:

/*
 * DTrace - Dynamic Tracing for Solaris
 *
 * This is the implementation of the Solaris Dynamic Tracing framework
 * (DTrace).  The user-visible interface to DTrace is described at length in
 * the "Solaris Dynamic Tracing Guide".  The interfaces between the libdtrace
 * library, the in-kernel DTrace framework, and the DTrace providers are
 * described in the block comments in the <sys/dtrace.h> header file.  The
 * internal architecture of DTrace is described in the block comments in the
 * <sys/dtrace_impl.h> header file.  The comments contained within the DTrace
 * implementation very much assume mastery of all of these sources; if one has
 * an unanswered question about the implementation, one should consult them
 * first.
 * ...

This is important advice, because we (by design) put many of the implementation comments in a stock header file, <sys/dtrace_impl.h>. We did this because we believe in the Unix idea that the system implementation should be described as much as possible in its publicly available header files.5 Discussing the comments in <sys/dtrace_impl.h> evokes a somewhat amusing anecdote; take this comment describing the implementation of speculative tracing:

/*
 * DTrace Speculations
 *
 * Speculations have a per-CPU buffer and a global state.  Once a speculation
 * buffer has been committed or discarded, it cannot be reused until all CPUs
 * have taken the same action (commit or discard) on their respective
 * speculative buffer.  However, because DTrace probes may execute in arbitrary
 * context, other CPUs cannot simply be cross-called at probe firing time to
 * perform the necessary commit or discard.  The speculation states thus
 * optimize for the case that a speculative buffer is only active on one CPU at
 * the time of a commit() or discard() -- for if this is the case, other CPUs
 * need not take action, and the speculation is immediately available for
 * reuse.  If the speculation is active on multiple CPUs, it must be
 * asynchronously cleaned -- potentially leading to a higher rate of dirty
 * speculative drops.  The speculation states are as follows:
 *
 *  DTRACESPEC_INACTIVE       <= Initial state; inactive speculation
 *  DTRACESPEC_ACTIVE         <= Allocated, but not yet speculatively traced to
 *  DTRACESPEC_ACTIVEONE      <= Speculatively traced to on one CPU
 *  DTRACESPEC_ACTIVEMANY     <= Speculatively traced to on more than one CPU
 *  DTRACESPEC_COMMITTING     <= Currently being committed on one CPU
 *  DTRACESPEC_COMMITTINGMANY <= Currently being committed on many CPUs
 *  DTRACESPEC_DISCARDING     <= Currently being discarded on many CPUs
 *
 * The state transition diagram is as follows:
 *
 *     +----------------------------------------------------------+
 *     |                                                          |
 *     |                      +------------+                      |
 *     |  +-------------------| COMMITTING |<-----------------+   |
 *     |  |                   +------------+                  |   |
 *     |  | copied spec.            ^             commit() on |   | discard() on
 *     |  | into principal          |              active CPU |   | active CPU
 *     |  |                         | commit()                |   |
 *     V  V                         |                         |   |
 * +----------+                 +--------+                +-----------+
 * | INACTIVE |---------------->| ACTIVE |--------------->| ACTIVEONE |
 * +----------+  speculation()  +--------+  speculate()   +-----------+
 *     ^  ^                         |                         |   |
 *     |  |                         | discard()               |   |
 *     |  | asynchronously          |            discard() on |   | speculate()
 *     |  | cleaned                 V            inactive CPU |   | on inactive
 *     |  |                   +------------+                  |   | CPU
 *     |  +-------------------| DISCARDING |<-----------------+   |
 *     |                      +------------+                      |
 *     | asynchronously             ^                             |
 *     | copied spec.               |       discard()             |
 *     | into principal             +------------------------+    |
 *     |                                                     |    V
 *  +----------------+             commit()              +------------+
 *  | COMMITTINGMANY |<----------------------------------| ACTIVEMANY |
 *  +----------------+                                   +------------+
 */

In writing up this comment, I became elated that I was able to render the state transition diagram as a planar graph. In fact, I was so (excessively) proud that I showed the state transition diagram to my wife, explaining that I wanted to have it tattooed on my back.6 But my initial cut of this had a typo: instead of saying “discard() on active CPU,” that edge was labelled “dicard() on active CPU.” Of course, my wife saw this instantly, and — without missing a beat — responded “please don’t tattoo ‘dicard’ on your back.” Pride goeth before a fall, but it’s especially painful when it goeth mere seconds before…
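For those who would rather read a switch statement than ASCII art, the edges drawn in the diagram can be transcribed into a small transition function. To be clear, this is a sketch of the diagram and not of the implementation: the spec_state_t and spec_event_t types, the SPEC_ and EV_ names and spec_next() are all hypothetical, EV_FINISHED lumps together the three "copied" and "cleaned" edges back to INACTIVE, and any state and event pair that the diagram does not draw is reported as SPEC_NONE even though the real code may well handle it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names mirroring the DTRACESPEC_* states in the comment. */
typedef enum spec_state {
	SPEC_INACTIVE,
	SPEC_ACTIVE,
	SPEC_ACTIVEONE,
	SPEC_ACTIVEMANY,
	SPEC_COMMITTING,
	SPEC_COMMITTINGMANY,
	SPEC_DISCARDING,
	SPEC_NONE		/* no such edge in the diagram */
} spec_state_t;

typedef enum spec_event {
	EV_SPECULATION,		/* speculation() */
	EV_SPECULATE,		/* speculate() */
	EV_COMMIT,		/* commit() */
	EV_DISCARD,		/* discard() */
	EV_FINISHED		/* buffer copied into principal, or cleaned */
} spec_event_t;

/*
 * Returns the state that follows cur when ev occurs.  The on_active_cpu
 * flag is only consulted in ACTIVEONE, which is the only state whose edges
 * the diagram labels with "active CPU" or "inactive CPU".
 */
static spec_state_t
spec_next(spec_state_t cur, spec_event_t ev, bool on_active_cpu)
{
	assert(cur != SPEC_NONE);

	switch (cur) {
	case SPEC_INACTIVE:
		return (ev == EV_SPECULATION ? SPEC_ACTIVE : SPEC_NONE);

	case SPEC_ACTIVE:
		switch (ev) {
		case EV_SPECULATE:
			return (SPEC_ACTIVEONE);
		case EV_COMMIT:
			return (SPEC_COMMITTING);
		case EV_DISCARD:
			return (SPEC_DISCARDING);
		default:
			return (SPEC_NONE);
		}

	case SPEC_ACTIVEONE:
		switch (ev) {
		case EV_SPECULATE:
			return (on_active_cpu ? SPEC_NONE : SPEC_ACTIVEMANY);
		case EV_COMMIT:
			return (on_active_cpu ? SPEC_COMMITTING : SPEC_NONE);
		case EV_DISCARD:
			return (on_active_cpu ? SPEC_INACTIVE : SPEC_DISCARDING);
		default:
			return (SPEC_NONE);
		}

	case SPEC_ACTIVEMANY:
		switch (ev) {
		case EV_COMMIT:
			return (SPEC_COMMITTINGMANY);
		case EV_DISCARD:
			return (SPEC_DISCARDING);
		default:
			return (SPEC_NONE);
		}

	case SPEC_COMMITTING:
	case SPEC_COMMITTINGMANY:
	case SPEC_DISCARDING:
		return (ev == EV_FINISHED ? SPEC_INACTIVE : SPEC_NONE);

	default:
		return (SPEC_NONE);
	}
}
```

The one edge that skips the intermediate states entirely is a discard() on the active CPU from ACTIVEONE, which goes straight back to INACTIVE. That is the optimization the comment describes: if only one CPU has ever traced to the speculation, nothing needs to be cleaned asynchronously.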

Anyway, if you’ve already read all three of these (the Solaris Dynamic Tracing Guide, <sys/dtrace.h> and <sys/dtrace_impl.h>) then you’re ready to start reading the source for purposes of understanding it. If you run into source that you don’t understand (and certainly if you believe that you’ve found a bug), please post to the DTrace forum. Not only will one of us answer your question, but there’s a good chance that we’ll update the comments as well; if you can’t understand it, we probably haven’t been sufficiently clear in our comments. (If you haven’t already inferred it, source readability is very important to us.)

Well, that should be enough to get oriented. If it isn’t obvious, we’re very excited to be making the source to Solaris available. And hopefully this hors d’oeuvre of DTrace source will hold your appetite until we serve the main course of Solaris source. Bon appetit!


1 It’s unclear what passes for convincing in this regard. Perhaps people just want to be sure that it’s not just a bunch of files filled with the results of “while true; do banner all work and no play makes jack a dull boy ; done”?

2 The reason that it’s CTF and not CCTF is a kind of strange inside joke: the format is so compact, even its acronym is compressed. Yes, this is about as unfunny as the Language H humor that we never seem to get sick of…

3 Please don’t infer from this that I’m a SPARC bigot; both of my laptops, my desktop and my build machine are all AMD64 boxes running the 64-bit kernel. It’s only that the SPARC version of this particular operation happens to be interesting, not that it’s interesting because it happens to be SPARC…

4 This brings up a fable of sorts: once, many years ago, there was a system for dynamic instrumentation of SPARC. Unlike DTrace, however, this system was both aggressive and naïve in its instrumentation and, as a result of this unfortunate combination of attributes, couldn’t guarantee safety. In particular, the technique used by the function here — using a DCTI couple to effect instruction picking — was incorrectly instrumented by the system. When confronted with this in front of a large-ish and highly-technical audience at Sun, the author of said system (who was interviewing for a job at Sun at the time) responded (in a confident, patronizing tone) that the SPARC architecture didn’t allow such a construct. The members of the audience gasped, winced and/or snickered: to not deal correctly with this construct was bad enough, but to simply pretend it didn’t exist (and to be a prick about it on top of it all) was beyond the pale; the author didn’t get the job offer, and the whole episode entered local lore. The moral of the story is this: don’t assume that someone asking you a question is an idiot — especially if the question is about the intricacies of SPARC DCTI couples.

5 It’s unclear if this was really a deliberate philosophy or more an accident of history, but it’s certainly a Solaris philosophy at any rate. Perhaps its transition from Unix accident to Solaris philosophy was marked by Jeff’s “Riemann sum” comment (and accompanying ASCII art diagram) in <sys/kstat.h>?

6 I’m pretty sure that I was joking…

26 Responses

  1. Congratulations!

    It is a very wise decision to start OpenSolaris with DTrace. DTrace is really _the_ major feature of Solaris 10. And such a feature is missing in all Open Source operating systems…

    Something completely different:

    Which was the static analysis tool you mentioned? Is it UNO from Bell Labs (http://spinroot.com/uno/)? It does a global, inter-module analysis of the source code and can thus detect the case you described.

  2. Ralf: I’ll check on the name of the static analysis tool. (I wasn’t the one who ran the tool; I was just cc:’d on the analysis of the results.)
    And Mr. 192.18.42.11: I’m flattered by your high praise and impressed by your close read of dtrace_fish() — you are clearly someone who measures things down to the last micron. And indeed, you won’t find anything like dtrace_fish() outside of dtrace_asm.s…

  3. “And so it begins…”

    Dtrace will be a nice intro to Solaris for everyone.

    To the public: We honestly find and/or root-cause a LOT of bugs with dtrace. You hear this all the time, but it’s not just hype… honest!

    Bryan: You mention usr/src/common, but then don’t elaborate on it. Either you meant to say usr/src/uts/common, or forgot to elaborate on usr/src/common.

  4. G’Day,
    I had to laugh at the cyclic linked list solution – I haven’t heard of it in production code before, however recently Nathan Kroenert (Sun PTS) wrote it for fun and sent me a copy.
    I first saw it on p334 of “Deep C Secrets” – Peter van der Linden. 🙂
    Brendan Gregg
    [Sydney, Australia]
    (visiting CA, USA)

  5. The O(1) space, two-pointer circular list detector can be found in the implementation of (print) or equivalent in most lisps. Since print is a key component of LISP debugging, it has to not go nuts when it stumbles across circular structures.
    It took me not long to find an instance of this in GNU emacs…

  6. Dan: thanks for spotting the missing explanation of usr/src/common; I’ll add that tonight or tomorrow morning. Brendan: thanks for the van der Linden pointer; I’ll have to dig up my copy to see his discussion of the technique. And Bill: damn you for proving me wrong! 😉 I have a wad brewing at the moment, so maybe I’ll change that comment to reflect (print)…

  7. actually there is more to circular list detector than bill explains. all lisps with so called destructive semantics (rplaca, rplacd and its modern variants) that enable the programmers to cut into lists would have to implement circularity detection in the core (eg. list length) for any production use. the technique dtrace implements is ancient though. i guess it is somewhat surprising that bill had to look into emacs to find an instance…

  8. Matt: google “dtrace” for more information on DTrace. That will take you to the BigAdmin page that has many links, including to the documentation.

    Carl: I suppose I shouldn’t be surprised to see the usual indignant drivel from dilettantes, but that doesn’t make it any less depressing. Because you apparently didn’t bother to read it, may I refer you to Section 2.4 of our USENIX paper, which explicitly contrasts Kerninst and DTrace. Not like you’ll read it now, but as that section explains, Kerninst is unsafe for use on production systems, doesn’t allow for aggregation based on arbitrary tuples, does not allow for arbitrary predicates, and has no support for arbitrary actions. And just out of curiosity, what did you think the “aggressive and naïve” instrumentation framework from footnote 4 was, anyway?

  9. Humility boy. Little wonder Sun loses money by the bucketful when it pays people like you to spout venom. Now that’s going to endear us to Solaris and DTrace. What makes you think other people don’t read your silly papers? Does your manager know you are such a good ambassador for Solaris what with such literate and colorful writing? Perhaps the day you wake up and realize other “unsafe” products make 100x more money than Solaris you’ll learn to shut up and do what customers want instead of acting like you know it all. All you know is how to act like an idiot. Granted you excel at that.

  10. Carl: Apologies if you interpreted it as “venom” — it’s become tedious to hear criticism that we have addressed so directly in our USENIX paper, and my frustration clearly showed through. I think my point still stands, though, so please do read the paper if you get the chance or (I’ve always wanted to say this) download the source and play around with it yourself!

  11. Carl –
    You will find that a venomous attitude begets a similar response. You failed to pose a thoughtful question, instead mounting a character attack clearly meant to disparage Sun and its engineers. “Trust Sun to spin anceint technology” is not exactly respectful and has no bearing on the technical merits of either DTrace or KernInst. If you had phrased your question as “This seems similar to KernInst, what are the major differences?” you would probably have gotten a more cordial response.
    You have to learn to treat people with respect if you expect the same in return.

  12. Hi Bryan,
    Thanks for the sweet introduction. It sounds like Richard and Jim are planning to update the Solaris Internals book. Any way you can cover the dtrace kernel hooks in the next revision of the book? It will be nice to review the actual Solaris source code while reading the updated version of the book.
    – Ryan

  13. Ryan, Richard and Jim are planning an update to the book, and I know that they’re planning to use DTrace quite a bit — but I don’t know how much of the internals of DTrace they plan to discuss. The key with DTrace is that the vast majority of probes don’t require any sort of “hook” per se — the providers understand how to instrument text that they’ve never seen before. For example, see usr/src/uts/intel/dtrace/fbt.c and usr/src/uts/sparc/dtrace/fbt.c for details on how this is done for the FBT provider.

  14. With libctf now available, could someone point me to the (elsewhere mentioned) script that allows to add CTF to third-party libraries and kernel modules?

  15. I don’t know firsthand, but I would imagine mark-and-sweep type garbage collection has to have some circularity detection to know how to throw away otherwise unreferenced but self-linked objects. And I’m afraid I am one of those people that ask this as an interview question — I don’t care if they know or come up with the “right” answer, in fact it’s better to walk through all the inelegant solutions and find out how much people know about time-efficiency of algorithms, etc. Besides, you should know that the purpose of interview questions isn’t to see if they know the answer but to see how they operate under some level of pressure.

  16. Bryan, as a collector/preserver/fan of older Sun hardware, this small comment stuck out:
    “All of this made more sense when there was a one-to-many relationship between instruction sets and machine architectures, but sun4m, sun4d and sun4c have been EOL’d and the source for these machine architectures has been removed.”
    It seems that OpenSolaris would be the perfect opportunity for Sun to release that old code for hobbyists and tinkerers to play with, with a very strict caveat of course that those are unsupported architectures… In particular, I’ve got a basement (& storage unit) full of XDbus machines – a full SS1000E, an SC2000, three 2000Es, and two CS6400s. 🙂 Since sun4d support extended through Sol8, but sun4d6/cray4d support was cut off “officially” at 2.6, I’d be interested in looking at what it would take to extend the life of the CS6400 through Sol8 as well (figuring that’d be relatively “easy”, given support for the 1000/2000).
    Do you think the powers-that-be could be convinced at some point to release the code for those older architectures, if some intrepid souls were interested in trying to bring it up-to-date?
