Solaris 10 Launch

So it’s been an exciting week for Solaris: at long last, we officially launched Solaris 10 on Monday. Unlike most product launches, the Solaris 10 launch was heavy on both technical details and customer testimonials: it was very important to us that those covering the event understand that this isn’t ballyhooed nothingness — this is real technology that is having a tangible impact on those using it. To that end, Mike, Andy and I described the Solaris 10 technology areas in some depth to a group of fifty journalists in a Solaris “boot camp” on the morning of the launch. I was pleased by how many journalists were there to begin with, and impressed that none left over the two hours or so of informal presentations: this showed a real willingness on the part of the press to understand what we had done. (Impressively, they even stayed after I suggested to one journalist that he and I strip to the waist and wrestle to settle a difference of opinion. Fortunately, we were able to settle the difference without resorting to fisticuffs.)

But my favorite part of the launch — hands down — was when Don Fike from FedEx stood on the stage and described the application performance problems that FedEx has found using DTrace. It’s always gratifying to see a customer achieve a win with DTrace (which of course is what motivated us to write DTrace in the first place), but it’s something else entirely to have a customer be willing to stand on a stage with you and put their reputation on the line by vouching for your technology. And on top of it all, to have that customer be FedEx — a company that I (and most, I suspect) hold in very high regard — well, it nearly brought a tear to my eye; moments like that just don’t come often in one’s career…

Overall, the launch was a great success. Driving back up to the City with Mike, we wondered aloud: how would the competition respond? As it turns out, we didn’t have to wait long: Martin Fink, HP’s VP of Linux, dashed off a hasty diatribe against Solaris 10. As others have pointed out, this is pure HP FUD: it doesn’t attack our technology in any concrete fashion, but rather attempts to put baseless fear in the minds of those who might be considering it. In particular, Fink returns to a classic FUD attack from the early 1990s: fear of a mixed-endianness planet. This was certainly a surprising angle of attack: given that this issue has been technically solved for nearly a decade, I naturally assumed that this was a dead issue for any technologist. But then, his attack reveals what is confirmed by Fink’s bio (and photo?): Fink isn’t a technologist. But most amusing was Ben Rockwood’s hilarious response. Thank you, Ben, for responding with the pluck and thoroughness that I believe characterize the Solaris community…

Posted on November 18, 2004 at 10:30 pm by bmc
In: Solaris

2 Responses

  1. Written by Josh McCormick
    on November 24, 2004 at 6:39 pm

    Here’s my question for you, Bryan. And I know you’re going to have an enthusiastic answer!
    I’ve been a Solaris sysadmin for 7 years. In the large group of people who I work with, I’m among the best in tracking down performance problems. I’ve got a decent enough handle on C, which it seems so many of my coworkers don’t have.
    I absolutely see the benefits in DTrace. Let me explain why, with a truss example. Recently, my customer had a production problem with a very slow in-house application which accessed a Sybase database. They had no idea why. As a last resort, they asked the SA to take a look at it in production and see what they could see. By just running truss, I was able to spot something unusual.
    And usually it takes me a few passes at truss to spot something interesting. I start off sounding like a cold read on a psychic hotline and steadily grow more and more pinpointed. What I found is that most of the application activity consisted of a loop. It polled a file descriptor to see if anything was there. Then, 99.9% of the time, it immediately followed that by a read operation.
    What did that tell me? Either each read operation was not grabbing enough to make a good dent in the buffer, or the database was providing way more data than expected. (Okay. I used lsof to figure out it was talking to the database. Sun should provide lsof, but I’m hoping you’ll tell me I can do the same with DTrace.)
    Now, this was news to the application team. Why? Because, as it turns out, they did API calls to a vendor’s black box to work a particular business function. The problem wasn’t in their code, it was in the vendor’s. So they went to the vendor, armed with what I had found, and got a very quick fix.
    The point here, though, is that I had to have some good general knowledge of C, system calls, and the general situation in order to help debug the issue. My coworkers don’t code in C. Not in the slightest. They’re systems administrators.
    They wouldn’t have had a chance to figure out what a constant cycle of polling and reading meant. They might say, “I see it is doing a lot of reading, but I don’t see anything wrong.” But once I point out to them what was happening, they’d get it.
    Now I go back to your Register article. After reading it, I understand that mmap, msync, and munmap would be costly, but I wonder if I could have come to that conclusion myself. Perhaps with times in the tracings, I could see that it was taking a long time. But I wouldn’t necessarily know that opening with O_DSYNC would have been a better replacement.
    You could say, well, really this is more of a software development/debugging tool, and not a systems administrator’s tool. But then the story would be reversed in that the application team can’t be expected to understand much from an OS perspective.
    I think one of my coworkers hit the nail on the head when he asked, “Do you have to know how to program in C in order to use dtrace?” I think that question is two-fold. First, in creating some good traces in D. Second, in interpreting what dtrace tells you, because applications are generally written in C, and the OS is in C. The second item seems harder than the first item.
    So I kind of wonder what it takes to train someone up in dtrace. (BTW, I’d love to take a course from SunSolve on DTrace!) They’d have to be a systems administrator. They’d have to know C. How much C is an interesting point for debate. You might require them to have taken the Solaris Internals class? Maybe there are other requirements.
    So, while I think DTrace is totally awesome, I’m just wondering. Is it too awesome for my coworkers to get some real use out of?
    My follow-up question is that while I have a particular problem that I know I can get some use out of today (an application that performs slowly in production, but just fine in test/dev), there is no way that I can get them to migrate to Solaris 10. The vendor won’t have it certified. Monitoring, backups, internal tools. All not Solaris 10 ready. (Yes, I know… the binary compatibility guarantee. It just doesn’t work that way in the business world.)
    I’m just a bit unhappy that it is going to be some time until I can sink my teeth into some real problems with DTrace, because a new version of the OS limits my opportunities of where I can implement this. And in an environment where we’re finding our performance problems in production and not so much in test/dev.
    Sorry for so many words.

  2. Written by Pete Fritchman
    on December 13, 2004 at 1:13 pm

    Glad you enjoyed Don’s words about DTrace; we talked a bit beforehand about what kind of things we had Jarod show us and what we did after he left. On a side note, I’m working on starting a community site, http://www.dtrace.info, where people can share some clever dtrace scripts and browse others. Would you mind if I snag some of the useful D scripts you’ve used on your blog for the initial site? I’d be quite happy if there were a couple Sun DTrace people frequently contributing their scripts, too :-)
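
The poll-then-read loop described in the first comment can be reproduced in miniature. The sketch below is an illustration only, not the vendor code from that story: it is written in Python, uses a local pipe as a stand-in for the database connection, and relies on a hypothetical helper named `drain`. It counts how many poll/read round trips are needed to consume the same payload with a small and a large read buffer, which is roughly the pattern a tool such as truss would make visible.

```python
import os
import select


def drain(payload_size, read_size):
    """Count the poll+read round trips needed to consume payload_size bytes.

    Hypothetical helper for illustration: a pipe stands in for the
    connection to the database, and read_size plays the role of the
    application's read buffer.
    """
    read_fd, write_fd = os.pipe()
    try:
        # Queue the whole payload, then close the writer so the reader sees EOF.
        os.write(write_fd, b"x" * payload_size)
        os.close(write_fd)
        write_fd = None

        trips = 0
        while True:
            # "Poll" the descriptor: block until it is readable (data or EOF).
            select.select([read_fd], [], [])
            chunk = os.read(read_fd, read_size)
            if not chunk:  # an empty read means EOF
                break
            trips += 1
        return trips
    finally:
        os.close(read_fd)
        if write_fd is not None:
            os.close(write_fd)


if __name__ == "__main__":
    small = drain(4096, 64)    # undersized buffer: many round trips
    large = drain(4096, 4096)  # buffer matches the payload: one round trip
    print("64-byte reads:", small, "round trips")
    print("4096-byte reads:", large, "round trips")
```

Each round trip costs two system calls, so the undersized buffer pays for the same data many times over; that repetition is what stands out in a system call trace even without access to the application's source.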