Assessing software engineering candidates

Note: This blog entry reproduces RFD 151. Comments are encouraged in the discussion for RFD 151.

How does one assess candidates for software engineering positions? This is an age-old question without a formulaic answer: software engineering is itself too varied to admit a single archetype.

Most obviously, software engineering is intellectually challenging; it demands minds that not only enjoy the thrill of solving puzzles, but can also stay afloat in a sea of numbing abstraction. This raw capacity, however, is insufficient; there are many more nuanced skills that successful software engineers must possess. For example, software engineering is an almost paradoxical juxtaposition of collaboration and isolation: successful software engineers are able to work well with (and understand the needs of!) others, but are also able to focus intensely on their own. This contrast extends to the conveyance of ideas, where they must be able to express their own ideas well enough to persuade others, but also be able to understand and be persuaded by the ideas of others — and be able to implement all of these on their own. They must be able to build castles of imagination, and yet still understand the constraints of a grimy reality: they must be arrogant enough to see the world as it isn’t, but humble enough to accept the world as it is. Each of these is a balance, and for each, long-practicing software engineers will cite colleagues who have been ineffective because they have erred too greatly on one side or another.

The challenge is therefore to assess prospective software engineers, without the luxury of firm criteria. This document is an attempt to pull together accumulated best practices; while it shouldn’t be inferred to be overly prescriptive, where it is rigid, there is often a painful lesson behind it.

In terms of evaluation mechanism: using in-person interviewing alone can be highly unreliable and can select predominantly for surface aspects of a candidate’s personality. While we advocate (and indeed, insist upon) interviews, they should come relatively late in the process; as much assessment as possible should be done by allowing the candidate to show themselves the way software engineers truly work: on their own, in writing.

Traits to evaluate

How does one select for something so nuanced as balance, especially when the road ahead is unknown? We must look at a wide variety of traits, presented here in the order in which they are traditionally assessed:

Aptitude

As the ordering implies, there is a temptation in traditional software engineering hiring to focus on aptitude exclusively: to use an interview exclusively to assess a candidate’s pure technical pulling power. While this might seem to be a reasonable course, it in fact leads down the primrose path to pop quizzes about algorithms seen primarily in interview questions. (Red-black trees and circular linked list detection: looking at you.) These assessments of aptitude are misplaced: software engineering is not, in fact, a spelling bee, and one’s ability to perform during an arbitrary oral exam may or may not correlate to one’s ability to actually develop production software. We believe that aptitude is better assessed where software engineers are forced to exercise it: based on the work that they do on their own. As such, candidates should be asked to provide three samples of their work: a code sample, a writing sample, and an analysis sample.

Code sample

Software engineers are ultimately responsible for the artifacts that they create, and as such, a code sample can be the truest way to assess a candidate’s ability.

Candidates should be guided to present code that they believe best reflects them as a software engineer. If this seems too broad, it can be easily focused: what is some code that you’re proud of and/or code that took you a while to get working?

If candidates do not have any code samples because all of their code is proprietary, they should write some: they should pick something that they have always wanted to write but have been needing an excuse — and they should go write it! On such a project, the guideline to the candidate should be to spend at least (say) eight hours on it, but no more than twenty-four — and over no longer than a two week period.

If the candidate is to write something de novo and/or there is a new or interesting technology that the organization is using, it may be worth guiding the candidate to use it (e.g., to write it in a language that the team has started to use, or using a component that the team is broadly using). This constraint should be uplifting to the candidate (e.g., “You may have wanted to explore this technology; here’s your chance!”). At Joyent in the early days of node.js, this was what we called “the node test”, and it yielded many fun little projects — and many great engineers.

Writing sample

Writing good code and writing good prose seem to be found together in the most capable software engineers. That these skills are related is perhaps unsurprising: both types of writing are difficult; both require one to create wholly new material from a blank page; both demand the ability to revise and polish.

To assess a candidate’s writing ability, they should be asked to provide a writing sample. Ideally, this will be technical writing, e.g.:

If a candidate has all of these, they should be asked to provide one of each; if a candidate has none of them, they should be asked to provide a writing sample on something else entirely, e.g. a thesis, dissertation or other academic paper.

Analysis sample

Part of the challenge of software engineering is dealing with software when it doesn’t, in fact, work correctly. At this moment, a software engineer must flip their disposition: instead of an artist creating something new, they must become a scientist, attempting to reason about a foreign world. In having candidates only write code, analytical skills are often left unexplored. And while this can be explored conversationally (e.g., asking for “debugging war stories” is a classic — and often effective — interview question), an oral description of recalled analysis doesn’t necessarily allow the true depths of a candidate’s analytical ability to be plumbed. For this, candidates should be asked to provide an analysis sample: a written analysis of software behavior from the candidate. This may be difficult for many candidates: for many engineers, these analyses may be most often found in defect reports, which may not be public. If the candidate doesn’t have such an analysis sample, the scope should be deliberately broadened to any analytical work they have done on any system (academic or otherwise). If this broader scope still doesn’t yield an analysis sample, the candidate should be asked to generate one to the best of their ability by writing down their analysis of some aspect of system behavior. (This can be as simple as asking them to write down the debugging story that would be their answer to the interview question — giving the candidate the time and space to answer the question once, and completely.)

Education

We are all born uneducated — and our own development is a result of the informal education of experience and curiosity, as well as a better structured and more formal education. To assess a candidate’s education, both the formal and informal aspects of education should be considered.

Formal education

Formal education is easier to assess by its very formality: a candidate’s education is relatively easily evaluated if they had the good fortune of discovering their interest and aptitude at a young age, had the opportunity to pursue and complete their formal education in computer science, and had the further good luck of attending an institution that one knows and has confidence in.

But one should not be bigoted by familiarity: there are many terrific software engineers who attended little-known schools or who took otherwise unconventional paths. The completion of a formal education in computer science is much more important than the institution: the strongest candidate from a little-known school is almost assuredly stronger than the weakest candidate from a well-known school.

In other cases, it’s even more nuanced: there have been many later-in-life converts to the beauty and joy of software engineering, and such candidates should emphatically not be excluded merely because they discovered software later than others. For those that concentrated in entirely non-technical disciplines, further probing will likely be required, with greater emphasis on their technical artifacts.

The most important aspect of one’s formal education may not be its substance so much as its completion. Like software engineering, there are many aspects of completing a formal education that aren’t necessarily fun: classes that must be taken to meet requirements; professors that must be endured rather than enjoyed; subject matter that resists quick understanding or appeal. In this regard, completion of a formal education represents the completion of a significant task. Conversely, the failure to complete one’s formal education may constitute an area of concern. There are, of course, plausible life reasons to abandon one’s education prematurely (especially in an era when higher education is so expensive), but there are also many paths and opportunities to resume and complete it. The failure to complete formal education may indicate deeper problems, and should be understood.

Informal education

Learning is a life-long endeavor, and much of one’s education will be informal in nature. Assessing this informal education is less clear, especially because (by its nature) there is little formally to show for it — but candidates should have a track record of being able to learn on their own, even when this self-education is arduous. One way to probe this may be with a simple question: what is an example of something that you learned that was a struggle for you? As with other questions posed here, the question should have a written answer.

Motivation

Motivation is often not assessed in the interview process, which is unfortunate because it dictates so much of what we do and why. For many companies, it will be important to find those that are intrinsically motivated — those who do what they do primarily for the value of doing it.

Selecting for motivation can be a challenge, and defies formula. Here, open source and open development can be a tremendous asset: it allows others to see what is being done, and, if they are excited by the work, to join the effort and to make their motivation clear.

Values

Values are often not evaluated formally at all in the software engineering process, but they can be critical to determine the “fit” of a candidate. To differentiate values from principles: values represent relative importance versus the absolute importance of principles. Values are important in a software engineering context because we so frequently make tradeoffs in which our values dictate our disposition. (For example, the relative importance of speed of development versus rigor; both are clearly important and positive attributes, but there is often a tradeoff to be had between them). Different engineering organizations may have different values over different times or for different projects, but it’s also true that individuals tend to develop their own values over their career — and it’s essential that the values of a candidate do not clash with the values of the team that they are to join.

But how to assess one’s values? Many will speak to values that they don’t necessarily hold (e.g., rigor), so simply asking someone what’s important to them may or may not yield their true values. One observation is that one’s values — and the adherence or divergence from those values — will often be reflected in happiness and satisfaction with work. When work strongly reflects one’s values, one is much more likely to find it satisfying; when values are compromised (even if for a good reason), work is likely to be unsatisfying. As such, the specifics of one’s values may be ascertained by asking candidates some probing questions, e.g.:

Our values can also be seen in the way we interact with others. As such, here are some questions that may have revealing answers:

The answers to these questions should be written down to allow them to be answered thoughtfully and in advance — and then to serve as a starting point for conversation in an interview.

Some questions, however, are more amenable to a live interview. For example, it may be worth asking some situational questions like:

Integrity

In an ideal world, integrity would not be something we would need to assess in a candidate: we could trust that everyone is honest and trustworthy. This view, unfortunately, is naïve with respect to how malicious bad actors can be; for any organization — but especially for one that is biased towards trust and transparency — it is essential that candidates be of high integrity: an employee who operates outside of the bounds of integrity can do nearly unbounded damage to an organization that assumes positive intent.

There is no easy or single way to assess integrity for people with whom one hasn’t endured difficult times. By far the most accurate way of assessing integrity in a candidate is for them to already be in the circle of one’s trust: for them to have worked deeply with (and be trusted by) someone that is themselves deeply trusted. But even in these cases where the candidate is trusted, some basic verification is prudent.

Criminal background check

The most basic integrity check involves a criminal background check. While local law dictates how these checks are used, the check should be performed for a simple reason: it verifies that the candidate is who they say they are. If someone has made criminal mistakes, these mistakes may or may not disqualify them (much will depend on the details of the mistakes, and on local law on how background checks can be used), but if a candidate fails to be honest or remorseful about those mistakes, it is a clear indicator of untrustworthiness.

Credential check

A hidden criminal background in software engineering candidates is unusual; much more common is a slight “fudging” of credentials or other elements of one’s past: degrees that were not in fact earned; grades or scores that have been exaggerated; awards that were not in fact bestowed; gaps in employment history that are quietly covered up by changing the time that one was at a previous employer. These transgressions may seem slight, but they can point to something quite serious: a candidate’s willingness or desire to mislead others to advance themselves. To protect against this, a basic credential check should be performed. This can be confined to degrees, honors, and employment.

References

References can be very tricky, especially for someone coming from a difficult situation (e.g., fleeing poor management). Ideally, a candidate is well known by someone inside the company who is trusted — but even this poses challenges: sometimes we don’t truly know people until they are in difficult situations, and someone “known” may not, in fact, be known at all. Worse, references are most likely to break down when they are most needed: dishonest, manipulative people are, after all, dishonest and manipulative; they can easily fool people — and even references — into thinking that they are something that they are not. So while references can provide value (and shouldn’t be eliminated as a tool), they should also be used carefully and kept in perspective.

Interviews

For individuals outside of that circle of trust, checking integrity is probably still best done in person. There are several potential mechanisms here:

Mechanics of evaluation

Interviews should begin with phone screens to assess the most basic viability, especially with respect to motivation. This initial conversation might include some basic (and unstructured) homework to gauge that motivation. The candidate should be pointed to material about the company and sources that describe methods of work and specifics about what that work entails. The candidate should be encouraged to review some of this material and send formal written thoughts as a quick test of motivation. If one is not motivated enough to learn about a potential employer, it’s hard to see how they will suddenly find the motivation to see themselves through difficult problems.

If and when a candidate is interested in deeper interviews, everyone should be expected to provide the same written material.

Candidate-submitted material

The candidate should submit the following:

Candidate-submitted material should be collected and distributed to everyone on the interview list.

Before the interview

Everyone on the interview schedule should read the candidate-submitted material, and a pre-meeting should then be held to discuss approach: based on the written material, what are the things that the team wishes to better understand? And who will do what?

Pre-interview job talk

For senior candidates, it can be effective to ask them to start the day by giving a technical presentation to those who will interview them. On the one hand, it may seem cruel to ask a candidate to present to a roomful of people who will be later interviewing them, but to the candidate this should be a relief: this allows them to start the day with a home game, where they are talking about something that they know well and can prepare for arbitrarily. The candidate should be allowed to present on anything technical that they’ve worked on, and it should be made clear that:

  1. Confidentiality will be respected (that is, they can present on proprietary work)

  2. The presentation needn’t be novel — it is fine for the candidate to give a talk that they have given before

  3. Slides are fine but not required

  4. The candidate should assume that the audience is technical, but not necessarily familiar with the domain that they are presenting

  5. The candidate should assume about 30 minutes for presentation and 15 minutes for questions.

The aim here is severalfold.

First, this lets everyone get the same information at once: it is not unreasonable that the talk that a candidate would give would be similar to a conversation that they would have otherwise had several times over the day as they are asked about their experience; this minimizes that repetition.

Second, it shows how well the candidate teaches. Assuming that the candidate is presenting on a domain that isn’t intimately known by every member of the audience, the candidate will be required to instruct. Teaching requires both technical mastery and empathy — and a pathological inability to teach may point to deeper problems in a candidate.

Third, it shows how well the candidate fields questions about their work. It should go without saying that the questions themselves shouldn’t be trying to find flaws with the work, but should be entirely in earnest; seeing how a candidate answers such questions can be very revealing about character.

All of that said: a job talk likely isn’t appropriate for every candidate — and shouldn’t be imposed on (for example) those still in school. One guideline may be: those with more than seven years of experience are expected to give a talk; those with fewer than three are not expected to give a talk (but may do so); those in between can use their own judgement.

Interviews

Interviews shouldn’t necessarily take one form; interviewers should feel free to take a variety of styles and approaches — but should generally refrain from “gotcha” questions and/or questions that may conflate surface aspects of intellect with deeper qualities (e.g., Microsoft’s infamous “why are manhole covers round?”). Mixing interview styles over the course of the day can also be helpful for the candidate.

After the interview

After the interview (usually the next day), the candidate should be discussed by those who interviewed them. The objective isn’t necessarily to reach consensus first (though that too, ultimately), but rather to surface areas of concern. In this regard, the post-interview conversation must be handled carefully: the interview is deliberately constructed to allow broad contact with the candidate, and it is possible that someone relatively junior or otherwise inexperienced will see something that others will miss. The meeting should be constructed to assure that this important data isn’t suppressed; bad hires can happen when reservations aren’t shared out of fear of disappointing a larger group!

One way to do this is to structure the meeting this way:

  1. All participants are told to come in with one of three decisions: Hire, Do not hire, Insufficient information. All participants should have one of these positions and they should not change their initial position. (That is, one’s position on a candidate may change over the course of the meeting, but the initial position shouldn’t be retroactively changed.) If it helps, this position can be privately recorded before the meeting starts.

  2. The meeting starts with everyone who believes Do not hire explaining their position. While starting with the Do not hire positions may seem to give the meeting a negative disposition, it is extremely important that the meeting start with the reservations lest they be silenced — especially when and where they are so great that someone believes a candidate should not be hired.

  3. Next, those who believe Insufficient information should explain their position. These positions may be relatively common, and it means that the interview left the interviewer with unanswered questions. By presenting these unanswered questions, there is a possibility that others can provide answers that they may have learned in their interactions with the candidate.

  4. Finally, those who believe Hire should explain their position, perhaps filling in missing information for others who are less certain.

If there are any Do not hire positions, these should be treated very seriously, for it is saying that the aptitude, education, motivation, values and/or integrity of the candidate are in serious doubt or are otherwise unacceptable. Those who believe Do not hire should be asked for the dimensions that most substantiate their position. Especially where these reservations are around values or integrity, a single Do not hire should raise serious doubts about a candidate: the risks of bad hires around values or integrity are far too great to ignore someone’s judgement in this regard!

Ideally, however, no one has the position of Do not hire, and through a combination of screening and candidate self-selection, everyone believes Hire and the discussion can be brief, positive and forward-looking!

If, as is perhaps most likely, there is some mix of Hire and Insufficient information, the discussion should focus on the information that is missing about the candidate. If other interviewers cannot fill in the information about the candidate (and if it can’t be answered by the corpus of material provided by the candidate), the group should together brainstorm about how to ascertain it. Should a follow-up conversation be scheduled? Should the candidate be asked to provide some missing information? Should some aspect of the candidate’s background be explored? The collective decision should not move to Hire as long as there remain unanswered questions preventing everyone from reaching the same decision.

Assessing the assessment process

It is tautologically challenging to evaluate one’s process for assessing software engineers: one lacks data on the candidates that one doesn’t hire, and therefore can’t know which candidates should have been extended offers of employment but weren’t. As such, hiring processes can induce a kind of ultimate survivorship bias in that it is only those who have survived (or instituted) the process who are present to assess it — which can lead to deafening echo chambers of smug certitude. One potential way to assess the assessment process: ask candidates for their perspective on it. Candidates are in a position to be evaluating many different hiring processes concurrently, and likely have the best perspective on the relative merits of different ways of assessing software engineers.

Of course, there is peril here too: while many organizations would likely be very interested in a candidate who is bold enough to offer constructive criticism on the process being used to assess them while it is being used to assess them, the candidates themselves might not realize that — and may instead offer bland bromides for fear of offending a potential employer. Still, it has been our experience that a thoughtful process will encourage a candidate’s candor — and we have found that the processes described here have been strengthened by listening carefully to the feedback of candidates.

Posted on October 5, 2018 at 11:02 am by bmc

Should KubeCon be double-blind?

With a paltry 13% acceptance rate, KubeCon is naturally going to generate a lot of disappointment — the vast, vast majority of proposals aren’t being accepted. But as several have noted, a small number of vendors account for a significant number of accepted talks. Is this an issue? In particular, review for KubeCon isn’t double-blind; should it be?

In terms of my own perspective here, I view conferences for practitioners (and especially their concomitant hallway tracks) as essential for the community of our craft. Historically, I have been troubled by the strangulation of practitioner conferences by academic computer science: after we presented DTrace at USENIX 2004, I publicly wondered about the fate of USENIX — which engendered some thoughtful discussion. When USENIX had me keynote their annual technical conference twelve years later, I used the opportunity to express my concerns with the conference model, and wondered about finding the right solution both for practitioners and for academic computer science. That evening, we had a birds-of-a-feather session, which (encouragingly) was very well attended. There were many interesting perspectives, but the one that stood out to me was from Kathryn McKinley, who makes a compelling case that reviews should be double-blind. In the BOF, McKinley was emphatic and persuasive that conferences absolutely must be double-blind in their review — and that anything less is a disservice to the community and the discipline.

Wanting to take that advice, when we organized Systems We Love later that year, we ran it double-blind with a very large (and, if I may say, absolutely awesome!) program committee. We had many, many submissions — well over ten times the number of slots! We were double-blind for the first few stages of review, until the number of submissions had been reduced by a factor of five. Once we had reduced the number of submissions to “merely” double the number of slots, we de-blinded to get the rest of the way to a program. (Which was agonizing — too many great submissions!) By de-blinding, we were essentially using factors about the submitter as a tie-breaker to differentiate submissions of similarly high quality — and as a way to get voices we might not otherwise hear from.

Personally, I feel that we were able to hit a sweet spot by doing it this way — and there were quite a few surprises when we de-blinded. Of note, at least a quarter of the speakers (and perhaps more, as I didn’t ask everyone) were presenting for the first time. Equally surprising: several “big names” had submissions that we rejected while blinded — but looking at their submissions, the submissions themselves just weren’t that great! (Which isn’t to say that they don’t have a ton of terrific work to their name — just that every swing of the bat is not going to be a home run.)

So: should KubeCon be double-blind? I consider myself firmly in McKinley’s camp in that I believe that any oversubscribed conference needs to be double-blind to a very significant degree. That said, I also think our challenges as practitioners don’t exactly map to the challenges in academic computer science. (For example, because we aren’t using conferences as a publishing vector, I don’t think we need to be double-blind-until-accept — I think we can de-blind ourselves to our rejections.) I also don’t even think we need to be double-blind all the way through the process: we should be double-blind until the program committee has reduced the number of submissions to the point that every remaining submission is deemed one that the program committee wants to accept. (That is, to the point that were it not for the physical limits of the conference, the program committee would want to accept the remaining submissions.) De-blinding at this point assures that the quality of the content is primarily due to the merit of the submission — not due to the particulars of the submitter. (That is, not based on what they’ve done in the past — or who their employer happens to be.) That said, de-blinding at the point of quality does allow these other factors to be used to mold the final program.

For KubeCon — and for other practitioner conferences — I think a hybrid model is the best approach: double-blind for a significant fraction of review, de-blinded for a final program formulation, and then perhaps “invited talks” for talks that were rejected when blind, but that the program committee wishes to accept based on the presenter. This won’t lead to less disappointment at KubeCon (13% is too low an acceptance rate to not be rejecting high-quality submissions), but I believe that a significantly double-blind process will give the community the assurance of a program that best represents it!

Posted on October 3, 2018 at 11:41 am by bmc

The relative performance of C and Rust

My blog post on falling in love with Rust got quite a bit of attention — with many being surprised by what had surprised me as well: the high performance of my naive Rust versus my (putatively less naive?) C. However, others viewed it as irresponsible to report these performance differences, believing that these results would be blown out of proportion or worse. The concern is not entirely misplaced: system benchmarking is one of those areas where — in Jonathan Swift’s words from three centuries ago — “falsehood flies, and the truth comes limping after it.”

There are myriad reasons why benchmarking is so vulnerable to leaving truth behind. First, it’s deceptively hard to quantify the performance of a system simply because the results are so difficult to verify: the numbers we get must be validated (or rejected) according to the degree that they comport with our expectations. As a result, if our expectations are incorrect, the results can be wildly wrong. To see this vividly, please watch (or rewatch!) Brendan Gregg’s excellent (and hilarious) lightning talk on benchmarking gone wrong. Brendan recounts his experience dealing with a particularly flawed approach, and it’s a talk that I always show anyone who is endeavoring to benchmark the system: it shows how easy it is to get it totally wrong — and how important it is to rigorously validate results.

Second, even if one gets an entirely correct result, it’s really only correct within the context of the system. As we succumb to the temptation of applying a result more universally than this context merits — as the asterisks and the qualifiers on a performance number are quietly amputated — a staid truth is transmogrified into a flying falsehood. Worse, some of that context may have been implicit in that the wrong thing may have been benchmarked: in trying to benchmark one aspect of the system, one may inadvertently merely quantify an otherwise hidden bottleneck.

So take all of this as disclaimer: I am not trying to draw large conclusions about “C vs. Rust” here. To the contrary, I think that it is a reasonable assumption that, for any task, a lower-level language can always be made to outperform a higher-level one. But with that said, a pesky fact remains: I reimplemented a body of C software in Rust, and it performed better for the same task; what’s going on? And is there anything broader we can say about these results?

To explore this, I ran some statemap rendering tests on SmartOS on a single-socket Haswell server (Xeon E3-1270 v3) running at 3.50GHz. The C version was compiled with GCC 7.3.0 with -O2 level optimizations; the Rust version was compiled with 1.29.0 with --release. All of the tests were run bound to a processor set containing a single core; all were bound to one logical CPU within that core, with the other logical CPU forced to be idle. cpustat was used to gather CPU performance counter data, with one number denoting one run with pic0 programmed to that CPU performance counter. The input file (~30MB compressed) contains 3.5M state changes, and in the default config will generate a ~6MB SVG.

Here are the results for a subset of the counters relating to the cache performance:

Counter statemap-gcc statemap-rust delta
cpu_clk_unhalted.thread_p 32,166,437,125 23,127,271,226 -28.1%
inst_retired.any_p 49,110,875,829 48,752,136,699 -0.7%
cpu_clk_unhalted.ref_p 918,870,673 660,493,684 -28.1%
mem_uops_retired.stlb_miss_loads 8,651,386 2,353,178 -72.8%
mem_uops_retired.stlb_miss_stores 268,802 1,000,684 272.3%
mem_uops_retired.lock_loads 7,791,528 51,737 -99.3%
mem_uops_retired.split_loads 107,969 52,745,125 48752.1%
mem_uops_retired.split_stores 196,934 41,814,301 21132.6%
mem_uops_retired.all_loads 11,977,544,999 9,035,048,050 -24.6%
mem_uops_retired.all_stores 3,911,589,945 6,627,038,769 69.4%
mem_load_uops_retired.l1_hit 9,337,365,435 8,756,546,174 -6.2%
mem_load_uops_retired.l2_hit 1,205,703,362 70,967,580 -94.1%
mem_load_uops_retired.l3_hit 66,771,301 33,323,740 -50.1%
mem_load_uops_retired.l1_miss 1,276,311,911 105,524,579 -91.7%
mem_load_uops_retired.l2_miss 69,671,774 34,616,966 -50.3%
mem_load_uops_retired.l3_miss 2,544,750 1,364,435 -46.4%
mem_load_uops_retired.hit_lfb 1,393,631,815 157,897,686 -88.7%
mem_load_uops_l3_hit_retired.xsnp_miss 435 526 20.9%
mem_load_uops_l3_hit_retired.xsnp_hit 1,269 740 -41.7%
mem_load_uops_l3_hit_retired.xsnp_hitm 820 517 -37.0%
mem_load_uops_l3_hit_retired.xsnp_none 67,846,758 33,376,449 -50.8%
mem_load_uops_l3_miss_retired.local_dram 2,543,699 1,301,381 -48.8%

 

So the Rust version is issuing a remarkably similar number of instructions (within less than one percent!), but with a decidedly different mix: just three quarters of the loads of the C version and (interestingly) many more stores. The cycles per instruction (CPI) drops from 0.65 to 0.47, indicating much better memory behavior — and indeed the L1 misses, L2 misses and L3 misses are all way down. The L1 hits as an absolute number are actually quite high relative to the loads, giving Rust a 96.9% L1 hit rate versus the C version’s 77.9% hit rate. Rust also lives much better in the L2, where it has half the L2 misses of the C version.

Okay, so Rust has better memory behavior than C? Well, not so fast. In terms of what this thing is actually doing: the core of statemap processing is coalescing a large number of state transitions in the raw data into a smaller number of rectangles for the resulting SVG. When presented with a new state transition, it picks the “best” two adjacent rectangles to coalesce based on a variety of properties. As a result, this code spends all of its time constantly updating an efficient data structure to be able to make this decision. For the C version, this is a binary search tree (an AVL tree), but Rust (interestingly) doesn’t offer a binary search tree — and it is instead implemented with a BTreeSet, which implements a B-tree. B-trees are common when dealing with on-disk state, where the cost of loading a node contained in a disk block is much, much less than the cost of searching that node for a desired datum, but they are less common as a replacement for an in-memory BST. Rust makes the (compelling) argument that, given the modern memory hierarchy, the cost of getting a line from memory is far greater than the cost of reading it out of a cache — and B-trees make sense as a replacement for BSTs, albeit with a much smaller value for B. (Cache lines are somewhere between 64 and 512 bytes; disk blocks start at 512 bytes and can be much larger.)
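To make the comparison concrete, here is a minimal sketch of BTreeSet standing in for a balanced BST as an ordered set; it is not taken from the statemap code, and the names and values are purely illustrative:

use std::collections::BTreeSet;

fn main() {
    // BTreeSet keeps its elements sorted, so it can stand in for a balanced
    // BST (like the AVL tree in the C version) wherever ordered insertion,
    // neighbor lookup, and ordered removal are needed.
    let mut starts: BTreeSet<u64> = BTreeSet::new();

    for &t in &[40u64, 10, 30, 20] {
        starts.insert(t);
    }

    // Find the nearest element at or below a given key: the kind of
    // ordered neighbor lookup a coalescing pass might need.
    assert_eq!(starts.range(..=25u64).next_back(), Some(&20));

    // Removal by value, as when a coalesced rectangle leaves the set.
    starts.remove(&20);
    assert_eq!(starts.range(..=25u64).next_back(), Some(&10));
}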

Could the performance difference that we’re seeing simply be Rust’s data structure being — per its design goals — more cache efficient? To explore this a little, I varied the number of rectangles in the statemap, as this affects both the size of the tree (more rectangles mean a larger tree, leading to a bigger working set) and the number of deletions (more rectangles result in fewer deletions, leading to less compute time).

The results were pretty interesting:

A couple of things to note here: first, there are 3.5M state transitions in the input data; as soon as the number of rectangles exceeds the number of states, there is no reason for any coalescing, and some operations (namely, deleting from the tree of rectangles) go away entirely. So that explains the flatline at roughly 3.5M rectangles.

Also not surprisingly, the worst performance for both approaches occurs when the number of rectangles is set at more or less half the number of state transitions: the tree is huge (and therefore has relatively poorer cache performance for either approach) and each new state requires a deletion (so the computational cost is also high).

So far, this seems consistent with the BTreeSet simply being a more efficient data structure. But what is up with that lumpy Rust performance?! In particular there are some strange spikes; e.g., zooming in on the rectangle range up to 100,000 rectangles:

Just from eyeballing it, they seem to appear at roughly logarithmic frequency with respect to the number of rectangles. My first thought was perhaps some strange interference relationship with respect to the B-tree and the cache size or stride, but this is definitely a domain where an ounce of data is worth much more than a pound of hypotheses!

Fortunately, because Rust is static (and we have things like, say, symbols and stack traces!), we can actually just use DTrace to explore this. Take this simple D script, rustprof.d:

#pragma D option quiet

profile-4987hz
/pid == $target && arg1 != 0/
{
        @[usym(arg1)] = count();
}

END
{
        trunc(@, 10);
        printa("%10@d %A\n", @);
}

I ran this against two runs: one at a peak (e.g., 770,000 rectangles) and then another at the adjacent trough (e.g., 840,000 rectangles), demangling the resulting names by sending the output through rustfilt. Results for 770,000 rectangles:

# dtrace -s ./rustprof.d -c "./statemap --dry-run -c 770000 ./pg-zfs.out" | rustfilt
3943472 records processed, 769999 rectangles
      1043 statemap`<alloc::collections::btree::map::BTreeMap<K, V>>::remove
      1180 statemap`<std::collections::hash::map::DefaultHasher as core::hash::Hasher>::finish
      1208 libc.so.1`memmove
      1253 statemap`<serde_json::read::StrRead<'a> as serde_json::read::Read<'a>>::parse_str
      1320 statemap`<std::collections::hash::map::HashMap<K, V, S>>::remove
      1695 libc.so.1`memcpy
      2558 statemap`statemap::statemap::Statemap::ingest
      4123 statemap`<std::collections::hash::map::HashMap<K, V, S>>::insert
      4503 statemap`<std::collections::hash::map::HashMap<K, V, S>>::get
     26640 statemap`alloc::collections::btree::search::search_tree

And now the same thing, but against the adjacent valley of better performance at 840,000 rectangles:

# dtrace -s ./rustprof.d -c "./statemap --dry-run -c 840000 ./pg-zfs.out" | rustfilt
3943472 records processed, 839999 rectangles
       971 statemap`<std::collections::hash::map::DefaultHasher as core::hash::Hasher>::write
      1071 statemap`<alloc::collections::btree::map::BTreeMap<K, V>>::remove
      1158 statemap`<std::collections::hash::map::DefaultHasher as core::hash::Hasher>::finish
      1228 libc.so.1`memmove
      1348 statemap`<serde_json::read::StrRead<'a> as serde_json::read::Read<'a>>::parse_str
      1628 libc.so.1`memcpy
      2524 statemap`statemap::statemap::Statemap::ingest
      2948 statemap`<std::collections::hash::map::HashMap<K, V, S>>::insert
      4125 statemap`<std::collections::hash::map::HashMap<K, V, S>>::get
     26359 statemap`alloc::collections::btree::search::search_tree

The samples in btree::search::search_tree are roughly the same — but the poorly performing one has many more samples in HashMap<K, V, S>::insert (4123 vs. 2948). What is going on? The HashMap implementation in Rust uses Robin Hood hashing and linear probing — which means that hash maps must be resized when they hit a certain load factor. (By default, the hash map load factor is 90.9%.) And note that I am using hash maps to effectively implement a doubly linked list: I will have a number of hash maps that — between them — will contain the specified number of rectangles. Given that we only see this at particular sizes (and given that the distance between peaks increases exponentially with respect to the number of rectangles), it seems entirely plausible that at some numbers of rectangles, the hash maps will grow large enough to induce quite a bit more probing, but not quite large enough to be resized.

To explore this hypothesis, it would be great to vary the hash map load factor, but unfortunately the load factor isn’t currently dynamic. Failing that, we could explore this by using with_capacity to preallocate our hash maps, but the statemap code doesn’t necessarily know how much to preallocate because the rectangles themselves are spread across many hash maps.
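For reference, here is a minimal sketch of what that preallocation looks like. It is not the statemap code itself (which, as noted, can’t know the right capacity), and the capacity figure below is purely hypothetical:

use std::collections::HashMap;

fn main() {
    // Hypothetical figure: in the real statemap code, rectangles are spread
    // across many per-entity maps, so no single map knows this number.
    let expected_rects = 770_000;

    // with_capacity sizes the table up front: as long as insertions stay
    // within the requested capacity, the map never rehashes and never
    // hovers just below its load-factor threshold.
    let mut rects: HashMap<u64, String> = HashMap::with_capacity(expected_rects);

    for start in 0..10u64 {
        rects.insert(start, format!("rect at {}", start));
    }

    // capacity() reports at least what was requested; no growth has occurred.
    assert!(rects.capacity() >= expected_rects);
}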

Another option is to replace our use of HashMap with a different data structure — and in particular, we can use a BTreeMap in its place. If the load factor isn’t the issue (that is, if there is something else going on for which the additional compute time in HashMap<K, V, S>::insert is merely symptomatic), we would expect a BTreeMap-based implementation to have a similar issue at the same points.

With Rust, conducting this experiment is absurdly easy:

diff --git a/src/statemap.rs b/src/statemap.rs
index a44dc73..5b7073d 100644
--- a/src/statemap.rs
+++ b/src/statemap.rs
@@ -109,7 +109,7 @@ struct StatemapEntity {
     last: Option,                      // last start time
     start: Option,                     // current start time
     state: Option,                     // current state
-    rects: HashMap<u64, RefCell>, // rectangles for this entity
+    rects: BTreeMap<u64, RefCell>, // rectangles for this entity
 }

 #[derive(Debug)]
@@ -151,6 +151,7 @@ use std::str;
 use std::error::Error;
 use std::fmt;
 use std::collections::HashMap;
+use std::collections::BTreeMap;
 use std::collections::BTreeSet;
 use std::str::FromStr;
 use std::cell::RefCell;
@@ -306,7 +307,7 @@ impl StatemapEntity {
             description: None,
             last: None,
             state: None,
-            rects: HashMap::new(),
+            rects: BTreeMap::new(),
             id: id,
         }
     }

That’s it: because the two (by convention) have the same interface, there is nothing else that needs to be done! And the results, with the new implementation in light blue:

Our lumps are gone! In general, the BTreeMap-based implementation performs a little worse than the HashMap-based implementation, but without as much variance. Which isn’t to say that this is devoid of strange artifacts! It’s especially interesting to look at the variation at lower levels of rectangles, when the two implementations seem to alternate in the pole position:

I don’t know what happens to the BTreeMap-based implementation at about ~2,350 rectangles (where it degrades by nearly 10% but then recovers when the number of rectangles hits ~2,700 or so), but at this point, the effects are only academic for my purposes: for statemaps, the default number of rectangles is 25,000. That said, I’m sure that digging there would yield interesting discoveries!

So, where does all of this leave us? Certainly, Rust’s foundational data structures perform very well. Indeed, it might be tempting to conclude that, because a significant fraction of the delta here is the difference in data structures (i.e., BST vs. B-tree), the difference in language (i.e., C vs. Rust) doesn’t matter at all.

But that would be overlooking something important: part of the reason that using a BST (and in particular, an AVL tree) was easy for me is because we have an AVL tree implementation built as an intrusive data structure. This is a pattern we use a bunch in C: the data structure is embedded in a larger, containing structure — and it is the caller’s responsibility to allocate, free and lock this structure. That is, implementing a library as an intrusive data structure completely sidesteps both allocation and locking. This allows for an entirely robust arbitrarily embeddable library, and it also makes it really easy for a single data structure to be in many different data structures simultaneously. For example, take ZFS’s zio structure, in which a single contiguous chunk of memory is on (at least) two different lists and three different AVL trees! (And if that leaves you wondering how anything could possibly be so complicated, see George Wilson’s recent talk explaining the ZIO pipeline.)

Implementing a B-tree this way, however, would be a mess. The value of a B-tree is in the contiguity of nodes — that is, it is the allocation that is a core part of the win of the data structure. I’m sure it isn’t impossible to implement an intrusive B-tree in C, but it would require so much more caller cooperation (and therefore a more complicated and more error-prone interface) that I do imagine that it would have you questioning life choices quite a bit along the way. (After all, a B-tree is a win — but it’s a constant-time win.)

Contrast this to Rust: intrusive data structures are possible in Rust, but they are essentially an anti-pattern. Rust really, really wants you to have complete orthogonality of purpose in your software. This leads you to having multiple disjoint data structures with very clear trees of ownership — where before you might have had a single more complicated data structure with graphs of multiple ownership. This clear separation of concerns in turn allows for these implementations to be both broadly used and carefully optimized. For an in-depth example of the artful implementation that Rust allows, see Alexis Beingessner’s excellent blog entry on the BTreeMap implementation.

All of this adds up to the existential win of Rust: powerful abstractions without sacrificing performance. Does this mean that Rust will always outperform C? No, of course not. But it does mean that you shouldn’t be surprised when it does — and that if you care about performance and you are implementing new software, it is probably past time to give Rust a very serious look!

Update: Several have asked if Clang would result in materially better performance; my apologies for not having mentioned that when I did my initial analysis, I had included Clang and knew that (at the default of 25,000 rectangles) it improved things a little but not enough to approach the performance of the Rust implementation. But for completeness’ sake, I ran the Clang-compiled binary at the same rectangle points:

Despite its improvement over GCC, I don’t think that the Clang results invalidate any of my analysis — but apologies again for not including them in the original post!

Posted on September 28, 2018 at 6:28 pm by bmc

Falling in love with Rust

Let me preface this with an apology: this is a technology love story, and as such, it’s long, rambling, sentimental and personal. Also befitting a love story, it has a When Harry Met Sally feel to it, in that its origins are inauspicious…

First encounters

Over a decade ago, I worked on a technology to which a competitor paid the highest possible compliment: they tried to implement their own knockoff. Because this was done in the open (and because it is uniquely mesmerizing to watch one’s own work mimicked), I spent way too much time following their mailing list and tracking their progress (and yes, taking an especially shameful delight in their occasional feuds). On their team, there was one technologist who was clearly exceptionally capable — and I confess to being relieved when he chose to leave the team relatively early in the project’s life. This was all in 2005; for years for me, Rust was “that thing that Graydon disappeared to go work on.” From the description as I read it at the time, Graydon’s new project seemed outrageously ambitious — and I assumed that little would ever come of it, though certainly not for lack of ability or effort…

Fast forward eight years to 2013 or so. Impressively, Graydon’s Rust was not only still alive, but it had gathered a community and was getting quite a bit of attention — enough to merit a serious look. There seemed to be some very intriguing ideas, but any budding interest that I might have had frankly withered when I learned that Rust had adopted the M:N threading model — including its more baroque consequences like segmented stacks. In my experience, every system that has adopted the M:N model has lived to regret it — and it was unfortunate to have a promising new system appear to be ignorant of the scarred shoulders that it could otherwise stand upon. For me, the implications were larger than this single decision: I was concerned that it may be indicative of a deeper malaise that would make Rust a poor fit for the infrastructure software that I like to write. So while impressed that Rust’s ambitious vision was coming to any sort of fruition at all, I decided that Rust wasn’t for me personally — and I didn’t think much more about it…

Some time later, a truly amazing thing happened: Rust ripped it out. Rust’s reasoning for removing segmented stacks is a concise but thorough damnation; their rationale for removing M:N is clear-eyed, thoughtful and reflective — but also unequivocal in its resolve. Suddenly, Rust became very interesting: all systems make mistakes, but few muster the courage to rectify them; on that basis alone, Rust became a project worthy of close attention.

So several years later, in 2015, it was with great interest that I learned that Adam started experimenting with Rust. On first read of Adam’s blog entry, I assumed he would end what appeared to be excruciating pain by deleting the Rust compiler from his computer (if not by moving to a commune in Vermont) — but Adam surprised me when he ended up being very positive about Rust, despite his rough experiences. In particular, Adam hailed the important new ideas like the ownership model — and explicitly hoped that his experience would serve as a warning to others to approach the language in a different way.

In the years since, Rust continued to mature and my curiosity (and I daresay, that of many software engineers) has steadily intensified: the more I have discovered, the more intrigued I have become. This interest has coincided with my personal quest to find a programming language for the back half of my career: as I mentioned in my Node Summit 2017 talk on platform as a reflection of values, I have been searching for a language that reflects my personal engineering values around robustness and performance. These values reflect a deeper sense within me: that software can be permanent — that software’s unique duality as both information and machine afford a timeless perfection and utility that stand apart from other human endeavor. In this regard, I have believed (and continue to believe) that we are living in a Golden Age of software, one that will produce artifacts that will endure for generations. Of course, it can be hard to hold such heady thoughts when we seem to be up to our armpits in vendored flotsam, flooded by sloppy abstractions hastily implemented. Among current languages, only Rust seems to share this aspiration for permanence, with a perspective that is decidedly larger than itself.

Taking the plunge

So I have been actively looking for an opportunity to dive into Rust in earnest, and earlier this year, one presented itself: for a while, I have been working on a new mechanism for system visualization that I dubbed statemaps. The software for rendering statemaps needs to inhale a data stream, coalesce it down to a reasonable size, and render it as a dynamic image that can be manipulated by the user. This originally started off as being written in node.js, but performance became a problem (especially for larger data sets) and I did what we at Joyent have done in such situations: I rewrote the hot loop in C, and then dropped that into a node.js add-on (allowing the SVG-rendering code to remain in JavaScript). This was fine, but painful: the C was straightforward, but the glue code to bridge into node.js was every bit as capricious, tedious, and error-prone as it has always been. Given the performance constraint, the desire for the power of a higher level language, and the experimental nature of the software, statemaps made for an excellent candidate to reimplement in Rust; my intensifying curiosity could finally be sated!

As I set out, I had the advantage of having watched (if from afar) many others have their first encounters with Rust. And if those years of being a Rust looky-loo taught me anything, it’s that the early days can be like the first days of snowboarding or windsurfing: lots of painful falling down! So I took a deliberate approach with Rust: rather than do what one is wont to do when learning a new language and tinker a program into existence, I really sat down to learn Rust. This is frankly my bias anyway (I always look for the first principles of a creation, as explained by its creators), but with Rust, I went further: not only did I buy the canonical reference (The Rust Programming Language by Steve Klabnik, Carol Nichols and community contributors), I also bought an O’Reilly book with a bit more narrative (Programming Rust by Jim Blandy and Jason Orendorff). And with this latter book, I did something that I haven’t done since cribbing BASIC programs from Enter magazine back in the day: I typed in the example program in the introductory chapters. I found this to be very valuable: it got the fingers and the brain warmed up while still absorbing Rust’s new ideas — and debugging my inevitable transcription errors allowed me to get some understanding of what it was that I was typing. At the end was something that actually did something, and (importantly), by working with a program that was already correct, I was able to painlessly feel some of the tremendous promise of Rust.

Encouraged by these early (if gentle) experiences, I dove into my statemap rewrite. It took a little while (and yes, I had some altercations with the borrow checker!), but I’m almost shocked about how happy I am with the rewrite of statemaps in Rust. Because I know that many are in the shoes I occupied just a short while ago (namely, intensely wondering about Rust, but also wary of its learning curve — and concerned about the investment of time and energy that climbing it will necessitate), I would like to expand on some of the things that I love about Rust other than the ownership model. This isn’t because I don’t love the ownership model (I absolutely do) or that the ownership model isn’t core to Rust (it is rightfully thought of as Rust’s epicenter), but because I think its sheer magnitude sometimes dwarfs other attributes of Rust — attributes that I find very compelling! In a way, I am writing this for my past self — because if I have one regret about Rust, it’s that I didn’t see beyond the ownership model to learn it earlier.

I will discuss these attributes in roughly the order I discovered them with the (obvious?) caveat that this shouldn’t be considered authoritative; I’m still very much new to Rust, and my apologies in advance for any technical details that I get wrong!

1. Rust’s error handling is beautiful

The first thing that really struck me about Rust was its beautiful error handling — but to appreciate why it so resonated with me requires some additional context. Despite its obvious importance, error handling is something we haven’t really gotten right in systems software. For example, as Dave Pacheco observed with respect to node.js, we often conflate different kinds of errors — namely, programmatic errors (i.e., my program is broken because of a logic error) with operational errors (i.e., an error condition external to my program has occurred and it affects my operation). In C, this conflation is unusual, but you see it with the infamous SIGSEGV signal handler that has been known to sneak into more than one undergraduate project moments before a deadline to deal with an otherwise undebuggable condition. In the Java world, this is slightly more common with the (frowned upon) behavior of catching java.lang.NullPointerException or otherwise trying to drive on in light of clearly broken logic. And in the JavaScript world, this conflation is commonplace — and underlies one of the most serious objections to promises.

Beyond the ontological confusion, error handling suffers from an infamous mechanical problem: for a function that may return a value but may also fail, how is the caller to delineate the two conditions? (This is known as the semipredicate problem after a Lisp construct that suffers from it.) C handles this as it handles so many things: by leaving it to the programmer to figure out their own (bad) convention. Some use sentinel values (e.g., Linux system calls cleave the return space in two and use negative values to denote the error condition); some return defined values on success and failure and then set an orthogonal error code; and of course, some just silently eat errors entirely (or even worse).

C++ and Java (and many other languages before them) tried to solve this with the notion of exceptions. I do not like exceptions: for reasons not dissimilar to Dijkstra’s in his famous admonition against “goto”, I consider exceptions harmful. While they are perhaps convenient from a function signature perspective, exceptions allow errors to wait in ambush, deep in the tall grass of implicit dependencies. When the error strikes, higher-level software may well not know what hit it, let alone from whom — and suddenly an operational error has become a programmatic one. (Java tries to mitigate this sneak attack with checked exceptions, but while well-intentioned, they have serious flaws in practice.) In this regard, exceptions are a concrete example of trading the speed of developing software with its long-term operability. One of our deepest, most fundamental problems as a craft is that we have enshrined “velocity” above all else, willfully blinding ourselves to the long-term consequences of gimcrack software. Exceptions optimize for the developer by allowing them to pretend that errors are someone else’s problem — or perhaps that they just won’t happen at all.

Fortunately, exceptions aren’t the only way to solve this, and other languages take other approaches. Closure-heavy languages like JavaScript afford environments like node.js the luxury of passing an error as an argument — but this argument can be ignored or otherwise abused (and it’s untyped regardless), making this solution far from perfect. And Go uses its support for multiple return values to (by convention) return both a result and an error value. While this approach is certainly an improvement over C, it is also noisy, repetitive and error-prone.

By contrast, Rust takes an approach that is unique among systems-oriented languages: first leveraging algebraic data types — whereby a thing can be exactly one of an enumerated list of types and the programmer is required to be explicit about its type to manipulate it — and then combining them with its support for parameterized types. Together, this allows functions to return one thing that’s one of two types: one type that denotes success and one that denotes failure. The caller can then pattern match on the type of what has been returned: if it’s of the success type, it can get at the underlying thing (by unwrapping it), and if it’s of the error type, it can get at the underlying error and either handle it, propagate it, or improve upon it (by adding additional context) and then propagate it. What it cannot do (or at least, cannot do implicitly) is simply ignore it: it has to deal with it explicitly, one way or the other. (For all of the details, see Recoverable Errors with Result.)

To make this concrete, in Rust you end up with code that looks like this:

use std::fs;
use std::fs::File;
use std::io;

fn do_it(filename: &str) -> Result<(), io::Error> {
    let stat = match fs::metadata(filename) {
        Ok(result) => { result },
        Err(err) => { return Err(err); }
    };

    let file = match File::open(filename) {
        Ok(result) => { result },
        Err(err) => { return Err(err); }
    };

    /* ... */

    Ok(())
}

Already, this is pretty good: it’s cleaner and more robust than multiple return values, return sentinels and exceptions — in part because the type system helps you get this correct. But it’s also verbose, so Rust takes it one step further by introducing the propagation operator: if your function returns a Result, when you call a function that itself returns a Result, you can append a question mark on the call to the function denoting that upon Ok, the result should be unwrapped and the expression becomes the unwrapped thing — and upon Err the error should be returned (and therefore propagated). This is easier seen than explained! Using the propagation operator turns our above example into this:

fn do_it_better(filename: &str) -> Result<(), io::Error> {
    let stat = fs::metadata(filename)?;
    let file = File::open(filename)?;

    /* ... */

    Ok(())
}

This, to me, is beautiful: it is robust; it is readable; it is not magic. And it is safe in that the compiler helps us arrive at this and then prevents us from straying from it.

Platforms reflect their values, and I daresay the propagation operator is an embodiment of Rust’s: balancing elegance and expressiveness with robustness and performance. This balance is reflected in a mantra that one hears frequently in the Rust community: “we can have nice things.” Which is to say: while historically some of these values were in tension (i.e., making software more expressive might implicitly be making it less robust or more poorly performing), through innovation Rust is finding solutions that don’t compromise one of these values for the sake of the other.

2. The macros are incredible

When I was first learning C, I was (rightly) warned against using the C preprocessor. But like many of the things that we are cautioned about in our youth, this warning was one that the wise give to the enthusiastic to prevent injury; the truth is far more subtle. And indeed, as I came of age as a C programmer, I not only came to use the preprocessor, but to rely upon it. Yes, it needed to be used carefully — but in the right hands it could generate cleaner, better code. (Indeed, the preprocessor is very core to the way we implement DTrace’s statically defined tracing.) So if anything, my problems with the preprocessor were not its dangers so much as its many limitations: because it is, in fact, a preprocessor and not built into the language, there were all sorts of things that it would never be able to do — like access the abstract syntax tree.

With Rust, I have been delighted by its support for hygienic macros. This not only solves the many safety problems with preprocessor-based macros, it allows them to be outrageously powerful: with access to the AST, macros are afforded an almost limitless expansion of the syntax — but invoked with an indicator (a trailing bang) that makes it clear to the programmer when they are using a macro. For example, one of the fully worked examples in Programming Rust is a json! macro that allows for JSON to be easily declared in Rust. This gets to the ergonomics of Rust, and there are many macros (e.g., format!, vec!, etc.) that make Rust more pleasant to use.

Another advantage of macros: they are so flexible and powerful that they allow for effective experimentation. For example, the propagation operator that I love so much actually started life as a try! macro; that this macro was being used ubiquitously (and successfully) allowed a language-based solution to be considered. Languages can be (and have been!) ruined by too much experimentation happening in the language rather than in how it’s used; through its rich macros, it seems that Rust can enable the core of the language to remain smaller — and to make sure that when it expands, it is for the right reasons and in the right way.
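
To make this slightly more concrete, here is a minimal sketch of a declarative macro in the spirit of the old try! macro. (This is my own illustration rather than anything from the statemap code; the attempt! and file_len names are invented for the example.)

macro_rules! attempt {
    ($expr:expr) => {
        match $expr {
            Ok(val) => val,
            Err(err) => return Err(err),
        }
    };
}

use std::fs;
use std::io;

/* Hypothetical helper: return a file's size, propagating any error. */
fn file_len(filename: &str) -> Result<u64, io::Error> {
    let stat = attempt!(fs::metadata(filename));
    Ok(stat.len())
}

Because the expansion is hygienic, the val and err bindings inside the macro can't collide with names in the calling function, and the bang at the call site makes it plain that something beyond an ordinary function call is happening.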

3. format! is a pleasure

Okay, this is a small one but it’s (another) one of those little pleasantries that has made Rust really enjoyable. Many (most? all?) languages have an approximation or equivalent of the venerable sprintf, whereby variable input is formatted according to a format string. Rust’s variant of this is the format! macro (which is in turn invoked by println!, panic!, etc.), and (in keeping with one of the broader themes of Rust) it feels like it has learned from much that came before it. It is type-safe (of course) but it is also clean in that the {} format specifier can be used on any type that implements the Display trait. I also love that the {:?} format specifier denotes that the argument’s Debug trait implementation should be invoked to print debug output. More generally, all of the format specifiers map to particular traits, allowing for an elegant approach to an historically grotty problem. There are a bunch of other niceties, and it’s all a concrete example of how Rust uses macros to deliver nice things without sullying syntax or otherwise special-casing. None of the formatting capabilities are unique to Rust, but that’s the point: in this (small) domain (as in many) Rust feels like a distillation of the best work that came before it. As anyone who has had to endure one of my talks can attest, I believe that appreciating history is essential both to understand our present and to map our future. Rust seems to have that perspective in the best ways: it is reverential of the past without being incarcerated by it.
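
As a small (and contrived) illustration, not taken from the statemap code: deriving Debug is enough to make {:?} work, while {} requires an explicit Display implementation.

use std::fmt;

#[derive(Debug)]
struct Rgb(u8, u8, u8);

impl fmt::Display for Rgb {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "#{:02x}{:02x}{:02x}", self.0, self.1, self.2)
    }
}

fn main() {
    let c = Rgb(240, 248, 255);
    println!("{}", c);     /* Display: prints "#f0f8ff" */
    println!("{:?}", c);   /* Debug: prints "Rgb(240, 248, 255)" */
}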

4. include_str! is a godsend

One of the filthy aspects of the statemap code is that it is effectively encapsulating another program — a JavaScript program that lives in the SVG to allow for the interactivity of the statemap. This code lives in its own file, which the statemap code should pass through to the generated SVG. In the node.js/C hybrid, I am forced to locate the file in the filesystem — which is annoying because it has to be delivered along with the binary and located, etc. Now Rust — like many languages (including ES6) — has support for raw-string literals. As an aside, it’s interesting to see the discussion leading up to its addition, and in particular, how a group of people really looked at every language that does this to see what should be mimicked versus what could be improved upon. I really like the syntax that Rust converged on: r followed by one or more octothorpes followed by a quote to begin a raw string literal, and a quote followed by a matching number of octothorpes to end it, e.g.:

    let str = r##""What a curious feeling!" said Alice"##;

This alone would have allowed me to do what I want, but it would still be a tad gross in that it’s a bunch of JavaScript living inside a raw literal in a .rs file. Enter include_str!, which allows me to tell the compiler to find the specified file in the filesystem during compilation, and statically drop it into a string variable that I can manipulate:

        ...
        /*
         * Now drop in our in-SVG code.
         */
        let lib = include_str!("statemap-svg.js");
        ...

So nice! Over the years I have wanted this many times over for my C, and it’s another one of those little (but significant!) things that make Rust so refreshing.

5. Serde is stunningly good

Serde is a Rust crate that allows for serialization and deserialization, and it’s just exceptionally good. It uses macros (and, in particular, Rust’s procedural macros) to generate structure-specific routines for serialization and deserialization. As a result, Serde requires remarkably little programmer lift to use and performs eye-wateringly well — a concrete embodiment of Rust’s repeated defiance of the conventional wisdom that programmers must choose between abstractions and performance!

For example, in the statemap implementation, the input is concatenated JSON that begins with a metadata payload. To read this payload in Rust, I define the structure, and denote that I wish to derive the Deserialize trait as implemented by Serde:

#[derive(Deserialize, Debug)]
#[allow(non_snake_case)]
struct StatemapInputMetadata {
    start: Vec<u64>,
    title: String,
    host: Option<String>,
    entityKind: Option<String>,
    states: HashMap<String, StatemapInputState>,
}

Then, to actually parse it:

     let metadata: StatemapInputMetadata = serde_json::from_str(payload)?;

That’s… it. Thanks to the magic of the propagation operator, the errors are properly handled and propagated — and it has handled tedious, error-prone things for me like the optionality of certain members (itself beautifully expressed via Rust’s ubiquitous Option type). With this one line of code, I now (robustly) have a StatemapInputMetadata instance that I can use and operate upon — and this performs incredibly well on top of it all. In this regard, Serde represents the best of software: it is a sophisticated, intricate implementation making available elegant, robust, high-performing abstractions; as legendary White Sox play-by-play announcer Hawk Harrelson might say, MERCY!
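
And Serde cuts both ways: deriving Serialize gets you the corresponding writer. Here is a hypothetical sketch (not from the statemap code; the StatemapSummary type and its fields are invented, and it assumes serde with its derive feature plus serde_json as dependencies):

use serde::Serialize;

#[derive(Serialize)]
struct StatemapSummary {
    title: String,
    nrecords: u64,
    nrects: u64,
}

fn summarize(title: &str, nrecords: u64, nrects: u64)
    -> Result<String, serde_json::Error>
{
    let summary = StatemapSummary {
        title: title.to_string(),
        nrecords,
        nrects,
    };

    /* Serde generates the JSON serializer for the structure. */
    serde_json::to_string(&summary)
}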

6. I love tuples

In my C, I have been known to declare anonymous structures in functions. More generally, in any strongly typed language, there are plenty of times when you don’t want to have to fill out paperwork to be able to structure your data: you just want a tad more structure for a small job. For this, Rust borrows an age-old construct from ML in tuples. Tuples are expressed as a parenthetical list, and they basically work as you expect them to work in that they are static in size and type, and you can index into any member. For example, in some test code that needs to make sure that names for colors are correctly interpreted, I have this:

        let colors = vec![
            ("aliceblue", (240, 248, 255)),
            ("antiquewhite", (250, 235, 215)),
            ("aqua", (0, 255, 255)),
            ("aquamarine", (127, 255, 212)),
            ("azure", (240, 255, 255)),
            /* ... */
        ];

Then colors[2].0 (say) will be the string “aqua”, and (colors[1].1).2 will be the integer 215. Don’t let the absence of a type declaration in the above deceive you: tuples are strongly typed; it’s just that Rust is inferring the type for me. So if I accidentally try to (say) add an element to the above vector that contains a tuple of mismatched signature (e.g., the tuple ((188, 143, 143), "rosybrown"), which has the order reversed), Rust will give me a compile-time error.

The full integration of tuples makes them a joy to use. For example, if a function returns a tuple, you can easily assign its constituent parts to disjoint variables, e.g.:

fn get_coord() -> (u32, u32) {
   (1, 2)
}

fn do_some_work() {
    let (x, y) = get_coord();
    /* x has the value 1, y has the value 2 */
}

Great stuff!

7. The integrated testing is terrific

One of my regrets on DTrace is that we didn’t start on the DTrace test suite at the same time we started the project. And even after we started building it (too late, but blessedly before we shipped it), it still lived away from the source for several years. And even now, it’s a bit of a pain to run — you really need to know it’s there.

This represents everything that’s wrong with testing in C: because it requires bespoke machinery, too many people don’t bother — even when they know better! Viz.: in the original statemap implementation, there is zero testing code — and not because I don’t believe in it, but just because it was too much work for something relatively small. Yes, there are plenty of testing frameworks for C and C++, but in my experience, the integrated frameworks are too constrictive — and again, not worth it for a smaller project.

With the rise of test-driven development, many languages have taken a more integrated approach to testing. For example, Go has a rightfully lauded testing framework, Python has unittest, etc. Rust takes a highly integrated approach that combines the best of all worlds: test code lives alongside the code that it’s testing — but without having to make the code bend to a heavyweight framework. The workhorses here are conditional compilation and Cargo, which together make it so easy to write tests and run them that I found myself doing true test-driven development with statemaps — namely writing the tests as I develop the code.
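
To give a flavor of what that looks like, here is a minimal sketch (not the statemap tests themselves; parse_color_component is a made-up helper): the #[cfg(test)] module lives in the same file as the code it exercises, is compiled only when testing, and runs with a plain cargo test.

fn parse_color_component(s: &str) -> Option<u8> {
    /* A color component must be an integer in 0..=255. */
    s.trim().parse::<u8>().ok()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_valid_component() {
        assert_eq!(parse_color_component(" 215 "), Some(215));
    }

    #[test]
    fn rejects_out_of_range() {
        assert_eq!(parse_color_component("300"), None);
    }
}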

8. The community is amazing

In my experience, the best communities are ones that are inclusive in their membership but resolute in their shared values. When communities aren’t inclusive, they stagnate or rot (or worse); when communities don’t share values, they feud and fracture. This can be a very tricky balance, especially when so many open source projects start out as the work of a single individual: it’s very hard for a community not to reflect the idiosyncrasies of its founder. This is important because in the open source era, community is critical: one is selecting a community as much as one is selecting a technology, as each informs the future of the other. One factor that I weight less heavily is sheer size: some of my favorite communities are small ones — and some of my least favorite are huge.

For purposes of a community, Rust has the luxury of clearly articulated, broadly shared values that are featured prominently and reiterated frequently. If you head to the Rust website, this is the first sentence you’ll read:

Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.

That gets right to it: it says that as a community, we value performance and robustness — and we believe that we shouldn’t have to choose between these two. (And we have seen that this isn’t mere rhetoric, as so many Rust decisions show that these values are truly the lodestar of the project.)

And with respect to inclusiveness, it is revealing that you will likely read that statement of values in your native tongue, as the Rust web page has been translated into thirteen languages. Just the fact that it has been translated into so many languages makes Rust nearly unique among its peers. But perhaps more interesting is where this globally inclusive view likely finds its roots: among the sites of its peers, only Ruby is similarly localized. Given that several prominent Rustaceans like Steve Klabnik and Carol Nichols came from the Ruby community, it would not be unreasonable to guess that they brought this globally inclusive view with them. This kind of inclusion is one that you see again and again in the Rust community: different perspectives from different languages and different backgrounds. Those who come to Rust bring with them their experiences — good and bad — from the old country, and the result is a melting pot of ideas. This is an inclusiveness that runs deep: by welcoming such disparate perspectives into a community and then uniting them with shared values and a common purpose, Rust achieves a rich and productive heterogeneity of thought. That is, because the community agrees about the big things (namely, its fundamental values), it has room to constructively disagree (that is, achieve consensus) on the smaller ones.

Which isn’t to say this is easy! Check out Ashley Williams in the opening keynote from RustConf 2018 for how exhausting it can be to hash through these smaller differences in practice. Rust has taken a harder path than the “traditional” BDFL model, but it’s a qualitatively better one — and I believe that many of the things that I love about Rust are a reflection of (and a tribute to) its robust community.

9. The performance rips

Finally, we come to the last thing I discovered in my Rust odyssey — but in many ways, the most important one. As I described in an internal presentation, I had experienced some frustrations trying to implement in Rust the same structure I had had in C. So I mentally gave up on performance, resolving to just get something working first, and then optimize it later.

I did get it working, and was able to benchmark it, but to give some context for the numbers, here is the time to generate a statemap in the old (slow) pure node.js implementation for a modest trace (229M, ~3.9M state transitions) on my 2.9 GHz Core i7 laptop:

% time ./statemap-js/bin/statemap ./pg-zfs.out > js.svg

real	1m23.092s
user	1m21.106s
sys	0m1.871s

This is bad — and larger input will cause it to just run out of memory. And here’s the version as reimplemented as a C/node.js hybrid:

% time ./statemap-c/bin/statemap ./pg-zfs.out > c.svg

real	0m11.800s
user	0m11.414s
sys	0m0.330s

This was (as designed) a 10X improvement in performance, and represents speed-of-light numbers in that this seems to be an optimal implementation. Because I had written my Rust naively (and my C carefully), my hope was that the Rust would be no more than 20% slower — but I was braced for pretty much anything. Or at least, I thought I was; I was actually genuinely taken aback by the results:

$ time ./statemap.rs/target/release/statemap ./pg-zfs.out > rs.svg
3943472 records processed, 24999 rectangles

real	0m8.072s
user	0m7.828s
sys	0m0.186s

Yes, you read that correctly: my naive Rust was ~32% faster than my carefully implemented C. This blew me away, and in the time since, I have spent some time on a real lab machine running SmartOS (where I have reproduced these results and been able to study them a bit). My findings are going to have to wait for another blog entry, but suffice it to say that despite executing a shockingly similar number of instructions, the Rust implementation has a different load/store mix (it is much more store-heavy than C) — and is much better behaved with respect to the cache. Given the degree that Rust passes by value, this makes some sense, but much more study is merited.

It’s also worth mentioning that there are some easy wins that will make the Rust implementation even faster: after I had publicized the fact that I had a Rust implementation of statemaps working, I was delighted when David Tolnay, one of the authors of Serde, took the time to make some excellent suggestions for improvement. For a newcomer like me, it’s a great feeling to have someone with such deep expertise as David’s take an interest in helping me make my software perform even better — and it is revealing as to the core values of the community.

Rust’s shockingly good performance — and the community’s desire to make it even better — fundamentally changed my disposition towards it: instead of seeing Rust as a language to augment C and replace dynamic languages, I’m looking at it as a language to replace both C and dynamic languages in all but the very lowest layers of the stack. C — like assembly — will continue to have a very important place for me, but it’s hard to not see that place as getting much smaller relative to the barnstorming performance of Rust!

Beyond the first impressions

I wouldn’t want to imply that this is an exhaustive list of everything that I have fallen in love with about Rust. That list is much longer and would include (at the very least) the ownership model, the trait system, Cargo, and the type inference system. And I feel like I have just scratched the surface; I haven’t waded into known strengths of Rust like the FFI and the concurrency model! (Despite having written plenty of multithreaded code in my life, I haven’t so much as created a thread in Rust!)

Building a future

I can say with confidence that my future is in Rust. As I have spent my career doing OS kernel development, a natural question would be: do I intend to rewrite the OS kernel in Rust? In a word, no. To understand my reluctance, take some of my most recent experience: this blog entry was delayed because I needed to debug (and fix) a nasty problem with our implementation of the Linux ABI. As it turns out, Linux and SmartOS make slightly different guarantees with respect to the interaction of vfork and signals, and our code was fatally failing on a condition that should be impossible. Any old Unix hand (or quick study!) will tell you that vfork and signal disposition are each semantic superfund sites in their own right — and that their horrific (and ill-defined) confluence can only be unimaginably toxic. But the real problem is that actual software implicitly depends on these semantics — and any operating system that is going to want to run existing software will itself have to mimic them. You don’t want to write this code, because no one wants to write this code.

Now, one option (which I honor!) is to rewrite the OS from scratch, as if legacy applications essentially didn’t exist. While there is a tremendous amount of good that can come out of this (and it can find many use cases), it’s not a fit for me personally.

So while I may not want to rewrite the OS kernel in Rust, I do think that Rust is an excellent fit for much of the broader system. For example, at the recent OpenZFS Developers Summit, Matt Ahrens and I were noodling the notion of user-level components for ZFS in Rust. Specifically: zdb is badly in need of a rewrite — and Rust would make an excellent candidate for it. There are many such examples spread throughout ZFS and the broader system, including a few in the kernel. Might we want to have a device driver model that allows for Rust drivers? Maybe! (And certainly, it’s technically possible.) In any case, you can count on a lot more Rust from me into the indefinite future — whether in the OS, near the OS, or above the OS.

Taking your own plunge

I wrote all of this up in part not only to explain why I took the plunge, but also to encourage others to take their own. If you were where I was and are contemplating diving into Rust, here are a couple of pieces of advice, for whatever they’re worth:

I’m sure that there’s a bunch of stuff that I missed; if there’s a particular resource that you found useful when learning Rust, message me or leave a comment here and I’ll add it.

Let me close by offering a sincere thanks to those in the Rust community who have been working so long to develop such a terrific piece of software — and especially those who have worked so patiently to explain their work to us newcomers. You should be proud of what you’ve accomplished, both in terms of a revolutionary technology and a welcoming community — thank you for inspiring so many of us about what infrastructure software can become, and I look forward to many years of implementing in Rust!

Talks I have given

Increasingly, people have expressed the strange urge to binge-watch my presentations. This potentially self-destructive behavior seems likely to have unwanted side-effects like spontaneous righteous indignation, superfluous historical metaphor, and near-lethal exposure to tangential anecdote — and yet I find myself compelled to enable it by collecting my erstwhile scattered talks. While this blog entry won’t link to every talk I’ve ever given, there should be enough here to make anyone blotto!

To accommodate the more recreational watcher as well as the hardened addict, I have also broken my talks up into a series of trilogies, with each following a particular subject area or theme. In the future, as I give talks that become available, I will update this blog entry. And if you find that a link here is dead, please let me know!

Before we get to the list: if you only watch one talk of mine, please watch Principles of Technology Leadership (slides) presented at Monktoberfest 2017. This is the only talk that I have asked family and friends to watch, as it represents my truest self — or what I aspire that self to be, anyway.

The talks

Talks I have given, in reverse chronological order:


Trilogies of talks

As with anyone, there are themes that run through my career. While I don’t necessarily give talks in explicit groups of three, looking back on my talks I can see some natural groupings that make for related sequences of talks.

The Software Values Trilogy

In late 2016 and through 2017, it felt like fundamental values like decency and integrity were under attack; it seems appropriate that these three talks were born during this turbulent time:

The Debugging Trilogy

While certainly not the only three talks I’ve given on debugging, these three talks present a sequence on aspects of debugging that we don’t talk about as much:

The Beloved Trilogy

A common theme across my Papers We Love and Systems We Love talks is (obviously?) an underlying love for the technology. These three talks represent a trilogy of beloved aspects of the system that I have spent two decades in:

The Open Source Trilogy

While I started my career developing proprietary software, I am blessed that most of it has been spent in open source. This trilogy reflects on my experiences in open source, from the dual perspectives of a commercial entity and an individual contributor:

The Container Trilogy

I have given many (too many!) talks on containers and containerization, but these three form a reasonable series (with hopefully not too much overlap!):

The DTrace Trilogy

Another area where I have given many more than three talks, but these three form a reasonable narrative:

The Surge Lightning Trilogy

For its six-year run, Surge was a singular conference — and the lightning talks were always a highlight. My lightning talks were not deliberately about archaic Unixisms; it just always seemed to work out that way — an accidental narrative arc across several years.

The sudden death and eternal life of Solaris

As had been rumored for a while, Oracle effectively killed Solaris on Friday. When I first saw this, I had assumed that this was merely a deep cut, but in talking to Solaris engineers still at Oracle, it is clearly much more than that. It is a cut so deep as to be fatal: the core Solaris engineering organization lost on the order of 90% of its people, including essentially all management.

Of note, among the engineers I have spoken with, I heard two things repeatedly: “this is the end” and (from those who managed to survive Friday) “I wish I had been laid off.” Gone is any of the optimism (however tepid) that I have heard over the years — and embarrassed apologies for Oracle’s behavior have been replaced with dismay about the clumsiness, ineptitude and callousness with which this final cut was handled. In particular, that employees who had given their careers to the company were told of their termination via a pre-recorded call — “robo-RIF’d” in the words of one employee — is both despicable and cowardly. To their credit, the engineers affected saw themselves as Sun to the end: they stayed to solve hard, interesting problems and out of allegiance to one another — not out of any loyalty to the broader Oracle. Oracle didn’t deserve them and now it doesn’t have them — they have been liberated, if in a depraved act of corporate violence.

Assuming that this is indeed the end of Solaris (and it certainly looks that way), it offers a time for reflection. Certainly, the demise of Solaris is at one level not surprising, but on the other hand, its very suddenness highlights the degree to which proprietary software can suffer by the vicissitudes of corporate capriciousness. Vulnerable to executive whims, shareholder demands, and a fickle public, organizations can simply change direction by fiat. And because — in the words of the late, great Roger Faulkner — “it is easier to destroy than to create,” these changes in direction can have lasting effect when they mean stopping (or even suspending!) work on a project. Indeed, any engineer in any domain with sufficient longevity will have one (or many!) stories of exciting projects being cancelled by foolhardy and myopic management. For software, though, these cancellations can be particularly gutting because (in the proprietary world, anyway) so many of the details of software are carefully hidden from the users of the product — and much of the innovation of a cancelled software project will likely die with the project, living only in the oral tradition of the engineers who knew it. Worse, in the long run — to paraphrase Keynes — proprietary software projects are all dead. However ubiquitous at their height, this lonely fate awaits all proprietary software.

There is, of course, another way — and befitting its idiosyncratic life and death, Solaris shows us this path too: software can be open source. In stark contrast to proprietary software, open source does not — cannot, even — die. Yes, it can be disused or rusty or fusty, but as long as anyone is interested in it at all, it lives and breathes. Even should the interest wane to nothing, open source software survives still: its life as machine may be suspended, but it becomes as literature, waiting to be discovered by a future generation. That is, while proprietary software can die in an instant, open source software perpetually endures by its nature — and thrives by the strength of its communities. Just as the existence of proprietary software can be surprisingly brittle, open source communities can be crazily robust: they can survive neglect, derision, dissent — even sabotage.

In this regard, I speak from experience: from when Solaris was open sourced in 2005, the OpenSolaris community survived all of these things. By the time Oracle bought Sun five years later in 2010, the community had decided that it needed true independence — illumos was born. And, it turns out, illumos was born at exactly the right moment: shortly after illumos was announced, Oracle — in what remains to me a singularly loathsome and cowardly act — silently re-proprietarized Solaris on August 13, 2010. We in illumos were indisputably on our own, and while many outsiders gave us no chance of survival, we ourselves had reason for confidence: after all, open source communities are robust because they are often united not only by circumstance, but by values, and in our case, we as a community never lost our belief in ZFS, Zones, DTrace and myriad other technologies like MDB, FMA and Crossbow.

Indeed, since 2010, illumos has thrived; illumos is not only the repository of record for technologies that have become cross-platform like OpenZFS, but we have also advanced our core technologies considerably, while still maintaining the highest standards of quality. Learning from some of the mistakes of OpenSolaris, we have a model that allows for downstream innovation, experimentation and differentiation. For example, Joyent’s SmartOS has always been focused on our need for a cloud hypervisor (causing us to develop big features like hardware virtualization and Linux binary compatibility), and it is now at the heart of a massive buildout for Samsung (who acquired Joyent a little over a year ago). For us at Joyent, the Solaris/illumos/SmartOS saga has been formative in that we have seen both the ill effects of proprietary software and the amazing resilience of open source software — and it very much informed our decision to open source our entire stack in 2014.

Judging merely by its tombstone, the life of Solaris can be viewed as tragic: born out of wedlock between Sun and AT&T and dying at the hands of a remorseless corporate sociopath a quarter century later. And even that may be overstating its longevity: Solaris may not have been truly born until it was made open source, and — certainly to me, anyway — it died the moment it was again made proprietary. But in that shorter life, Solaris achieved the singular: immortality for its revolutionary technologies. So while we can mourn the loss of the proprietary embodiment of Solaris (and we can certainly lament the coarse way in which its technologists were treated!), we can rejoice in the eternal life of its technologies — in illumos and beyond!

Reflections on Systems We Love

Last Tuesday, several months of preparation came to fruition in the inaugural Systems We Love. You never know what’s going to happen the first time you get a new kind of conference together (especially one as broad as this one!), but it was, in a word, amazing. The content was absolutely outstanding, with attendee after attendee praising the uniformly high quality. (For guided tours, check out both Ozan Onay’s excellent exegesis and David Cassel’s thorough New Stack story — and don’t miss Sarah Huffman’s incredible illustrations!) It was such a great conference that many were asking when we would do it again — and there is already interest in replicating it elsewhere. As an engineer, I find this slightly unnerving, as I believe that success often teaches you nothing: luck becomes difficult to differentiate from design. But at the risk of taunting the conference gods with the arrogance of a puny mortal, here’s some stuff I do think we did right:

Okay, so that’s a pretty long list of things that worked; what didn’t work so well? I would say that there was basically only a single issue: the packed schedule. We had 19 (!!) 20-minute talks, and there simply wasn’t time for the length or quantity of breaks that one might like. I think it worked out better than it sounds like it would (thanks to our excellent and varied presenters!), but it was nonetheless exhausting and I think everyone would have appreciated at least one more break. Still, there were essentially no complaints about the number of presentations, so we wouldn’t want to overshoot by slimming down too much; perhaps the optimal number is 16 talks spread over four sessions of four talks apiece?

So where to go from here? We know now that there is a ton of demand and a bunch of great content to match (I’m still bummed about the terrific submissions we turned away!), so we know that we can (and will) easily have this be an annual event. But it seems like we can do more: maybe an event on the east coast? Perhaps one in Europe? Maybe as a series of meetups in the style of Papers We Love? There are a lot of possibilities, so please let us know what you’d like to see!

Finally, I would like to reflect on the most personally satisfying bit of Systems We Love: simply by bringing so many like-minded people together in the same room and having them get to know one another, we know that lives have been changed; new connections have been made, new opportunities have been found, and new journeys have begun. We knew that this would happen in the abstract, but in recent days, we have seen it ourselves: in the new year, you will see new faces on the Joyent engineering team that we met at Systems We Love. (If it needs to be said, the love of systems is a unifying force across Joyent; if you find yourself captivated by the content and you’re contemplating a career change, we’re hiring!) Like most (if not all) of us, the direction of my life has been significantly changed by meeting or hearing the right person at the right moment; that we have helped facilitate similar changes in our own small way is intensely gratifying — and is very much at the heart of what Systems We Love is about!

Submitting to Systems We Love

We’ve been overwhelmed by the positive response to Systems We Love! As simple as this concept is, Systems We Love — like Papers We Love, !!Con and others that inspired it — has tapped into a current of enthusiasm. Adam Leventhal captured this zeitgeist in a Hacker News comment:

What catches our collective attention are systems we hate, systems that suck, systems that fail–or systems too new to know. It’s refreshing to consider systems established and clever enough to love. There are wheels we don’t need to reinvent, systems that can teach us.

Are you tantalized by Systems We Love but you don’t know what proposal to submit? For those looking for proposal guidance, my advice is simple: find the love. Just as every presentation title at !!Con must assert its enthusiasm by ending with two bangs, you can think of every talk at Systems We Love as beginning with an implicit “Why I love…” So instead of a lecture on, say, the innards of ZFS (and well you may love ZFS!), pick an angle on ZFS that you particularly love. Why do you love it or what do you love about it? Keep it personal: this isn’t about asserting the dominance of one system — this is about you and a system (or an aspect of a system) that you love.

Now, what if you don’t think you love anything at all? Especially if you write software for a living and you’ve been at it for a while, it can be easy to lose the love in the sluice of quotidian sewage that is a deployed system. But I would assert that beneath any sedimented cynicism there must be a core of love: think back to when you were first discovering software systems as your calling and to your initial awe of learning of how much more complicated these systems are than you realized (what a colleague of mine once called “the miracle of boot”) — surely there is something in that awe from which you draw (or at least, drew) inspiration! I acknowledge that this is the exception rather than the rule — that it feels like we are more often disappointed rather than pleasantly surprised — but this is the nature of the job: our work as software engineers takes us to the boundaries of systems that are emerging or otherwise don’t work properly rather than into the beautiful caverns deep below the surface. To phrase this in terms of an old essay of mine, we spend our time in systems that are grimy or fetid rather than immaculate — but Systems We Love is about the inspiration that we derive from those immaculate systems (or at least their immaculate aspects).

Finally, don’t set the bar too high for yourself: we are bound to have a complicated relationship with any system with which we spend significant time, and just because you love one aspect of a system doesn’t mean that other parts don’t enrage, troll or depress you! So just remember it’s not Systems We Know, Systems We Invented or Systems We Worship — it’s Systems We Love and we hope to see you there!

Systems We Love

One of the exciting trends of the past few years is the emergence of Papers We Love. I have long been an advocate of journal clubs, but I have also found that discussion groups can stagnate when confined to a fixed group or a single domain; by broadening the participants and encouraging presenters to select papers that appeal to the heart as well as the head, Papers We Love has developed a singular verve. Speaking personally, I have enjoyed the meetups that I have attended — and I was honored to be given the opportunity to present on Jails and Zones at Papers We Love NYC (for which, it must be said, I was flattered by the best introduction of all time). I found the crowd that gathered to be engaged and invigorating — and thought-provoking conversation went well into the night.

The energy felt at Papers We Love is in stark contrast to the academic venues in which computer science papers are traditionally presented, which I accentuated in a candid keynote at the USENIX Annual Technical Conference, pointing to PWL as a model that is much more amenable to cross-pollination of ideas between academics and practitioners. My keynote was fiery, and it may have landed on dry tinder: if Rik Farrow’s excellent summary of my talk is any indicator, the time is right for a broader conversation about how we publish rigorous work.

But for us practitioners, however well it is discussed, academic work remains somewhat ancillary: while papers are invaluable as a mechanism for the rigorous presentation of thinking, it is ultimately the artifacts that we develop — the systems themselves — that represent the tangible embodiment of our ideas. And for the systems that I am personally engaged in, I have found that getting together to discuss them is inspiring and fruitful, e.g., the quadrennial dtrace.conf or the more regular OpenZFS developer summit. My experiences with Papers We Love and with these system-specific meetings caused me to ask on a whim if there would be interest in a one-day one-track conference that tried to capture the PWL zeitgeist but for systems — a “Systems We Love.”

While I had thrown this idea out somewhat casually, the response was too clear to ignore: there was most definitely interest — to the point of expectation that it would happen! And here at Joyent, a company for which love of systems is practically an organizing principle, the interest quickly boiled into a frothy fervor; we couldn’t not do this!

It took a little while to get the logistics down, but I’m very happy to report that Systems We Love is on: December 13th in San Francisco! To determine the program, I am honored to be joined by an extraordinary program committee: hailing from a wide range of backgrounds, experience levels, and interests — and united by a shared love of systems. So: the call for proposals is open — and if you have a love of systems, we hope that you will consider submitting a proposal and/or joining us on December 13th!

Hacked by a bug?

Early this afternoon, I had just recorded a wide-ranging episode of Arrested DevOps with the incomparable Bridget Kromhout and noticed that I had a flurry of Twitter mentions, all in reaction to this tweet of mine. There was just one problem: I didn’t tweet it. With my account obviously hacked, I went into fight-or-flight mode and (thanks in no small part to Bridget’s calm presence) did the obvious things: I changed my Twitter password, revoked the privileges of all applications, and tried to assess the damage…

Other than the tweet, I (thankfully!) didn’t see any obvious additional damage: no crazy DMs or random follows or unfollows. In terms of figuring out where the malicious tweet had come from, the source of the tweet was “Twitter for Android” — but according to my login history, the last Twitter for Android login was from me during my morning commute about two-and-a-half hours before the tweet. (And according to Twitter, I have only used the one device to access my account.) The only intervening logins were two from Quora about an hour prior to the tweet. (Aside: WTF, Quora?! Revoked!)

Then there was the oddity of the tweet itself. There was no caption — just the two images from what I gathered to be Germany. Looking at the raw tweet, however, cleared up its source:

{
  "created_at": "Mon Sep 12 17:56:31 +0000 2016",
  "id": 775392664602554400,
  "id_str": "775392664602554369",
  "text": "https://t.co/pYKRhaAdvC",
  "truncated": false,
  "entities": {
    "hashtags": [],
    "symbols": [],
    "user_mentions": [],
    "urls": [],
    "media": [
      {
        "id": 775378240244449300,
        "id_str": "775378240244449280",
        "indices": [
          0,
          23
        ],
        "media_url": "http://pbs.twimg.com/media/CsKyZsBWgAAHgVq.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CsKyZsBWgAAHgVq.jpg",
        "url": "https://t.co/pYKRhaAdvC",
        "display_url": "pic.twitter.com/pYKRhaAdvC",
        "expanded_url": "https://twitter.com/MattAndersonBBC/status/775378264772775936/photo/1",
        "type": "photo",
        "sizes": {
          "medium": {
            "w": 1200,
            "h": 1200,
            "resize": "fit"
          },
          "large": {
            "w": 2048,
            "h": 2048,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "small": {
            "w": 680,
            "h": 680,
            "resize": "fit"
          }
        },
        "source_status_id": 775378264772776000,
        "source_status_id_str": "775378264772775936",
        "source_user_id": 1193503572,
        "source_user_id_str": "1193503572"
      }
    ]
  },
  "extended_entities": {
    "media": [
      {
        "id": 775378240244449300,
        "id_str": "775378240244449280",
        "indices": [
          0,
          23
        ],
        "media_url": "http://pbs.twimg.com/media/CsKyZsBWgAAHgVq.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CsKyZsBWgAAHgVq.jpg",
        "url": "https://t.co/pYKRhaAdvC",
        "display_url": "pic.twitter.com/pYKRhaAdvC",
        "expanded_url": "https://twitter.com/MattAndersonBBC/status/775378264772775936/photo/1",
        "type": "photo",
        "sizes": {
          "medium": {
            "w": 1200,
            "h": 1200,
            "resize": "fit"
          },
          "large": {
            "w": 2048,
            "h": 2048,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "small": {
            "w": 680,
            "h": 680,
            "resize": "fit"
          }
        },
        "source_status_id": 775378264772776000,
        "source_status_id_str": "775378264772775936",
        "source_user_id": 1193503572,
        "source_user_id_str": "1193503572"
      },
      {
        "id": 775378240248614900,
        "id_str": "775378240248614912",
        "indices": [
          0,
          23
        ],
        "media_url": "http://pbs.twimg.com/media/CsKyZsCWEAA4oOp.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CsKyZsCWEAA4oOp.jpg",
        "url": "https://t.co/pYKRhaAdvC",
        "display_url": "pic.twitter.com/pYKRhaAdvC",
        "expanded_url": "https://twitter.com/MattAndersonBBC/status/775378264772775936/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 680,
            "h": 680,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "medium": {
            "w": 1200,
            "h": 1200,
            "resize": "fit"
          },
          "large": {
            "w": 2048,
            "h": 2048,
            "resize": "fit"
          }
        },
        "source_status_id": 775378264772776000,
        "source_status_id_str": "775378264772775936",
        "source_user_id": 1193503572,
        "source_user_id_str": "1193503572"
      }
    ]
  },
  "source": "Twitter for Android",
  "in_reply_to_status_id": null,
  "in_reply_to_status_id_str": null,
  "in_reply_to_user_id": null,
  "in_reply_to_user_id_str": null,
  "in_reply_to_screen_name": null,
  "user": {
    "id": 173630577,
    "id_str": "173630577",
    "name": "Bryan Cantrill",
    "screen_name": "bcantrill",
    "location": "",
    "description": "Nom de guerre: Colonel Data Corruption",
    "url": "http://t.co/VyAyIJP8vR",
    "entities": {
      "url": {
        "urls": [
          {
            "url": "http://t.co/VyAyIJP8vR",
            "expanded_url": "http://dtrace.org/blogs/bmc",
            "display_url": "dtrace.org/blogs/bmc",
            "indices": [
              0,
              22
            ]
          }
        ]
      },
      "description": {
        "urls": []
      }
    },
    "protected": false,
    "followers_count": 10407,
    "friends_count": 1557,
    "listed_count": 434,
    "created_at": "Sun Aug 01 23:51:44 +0000 2010",
    "favourites_count": 2431,
    "utc_offset": -25200,
    "time_zone": "Pacific Time (US & Canada)",
    "geo_enabled": true,
    "verified": false,
    "statuses_count": 4808,
    "lang": "en",
    "contributors_enabled": false,
    "is_translator": false,
    "is_translation_enabled": false,
    "profile_background_color": "C0DEED",
    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
    "profile_background_tile": false,
    "profile_image_url": "http://pbs.twimg.com/profile_images/618537697670397952/gW9iQsvF_normal.jpg",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/618537697670397952/gW9iQsvF_normal.jpg",
    "profile_link_color": "0084B4",
    "profile_sidebar_border_color": "C0DEED",
    "profile_sidebar_fill_color": "DDEEF6",
    "profile_text_color": "333333",
    "profile_use_background_image": true,
    "has_extended_profile": false,
    "default_profile": true,
    "default_profile_image": false,
    "following": false,
    "follow_request_sent": false,
    "notifications": false
  },
  "geo": null,
  "coordinates": null,
  "place": {
    "id": "5a110d312052166f",
    "url": "https://api.twitter.com/1.1/geo/id/5a110d312052166f.json",
    "place_type": "city",
    "name": "San Francisco",
    "full_name": "San Francisco, CA",
    "country_code": "US",
    "country": "United States",
    "contained_within": [],
    "bounding_box": {
      "type": "Polygon",
      "coordinates": [
        [
          [
            -122.514926,
            37.708075
          ],
          [
            -122.357031,
            37.708075
          ],
          [
            -122.357031,
            37.833238
          ],
          [
            -122.514926,
            37.833238
          ]
        ]
      ]
    },
    "attributes": {}
  },
  "contributors": null,
  "is_quote_status": false,
  "retweet_count": 2,
  "favorite_count": 9,
  "favorited": false,
  "retweeted": false,
  "possibly_sensitive": false,
  "possibly_sensitive_appealable": false,
  "lang": "und"
}

Note in particular that the media has a source_status_id_str of 775378264772775936; it’s from this tweet roughly an hour before mine from Matt Anderson, the BBC Culture editor who (I gather) is Berlin-based.

Why would someone who had just hacked my account burn it by tweeting an innocuous (if idiosyncratic) photo of campaign posters on the streets of Berlin?! Suddenly this is feeling less like I’ve been hacked, and more like I’m the victim of data corruption.

Some questions I have, that I don’t know enough about the Twitter API to answer: first, how are tweets created that refer to media entities from other tweets? i.e., is there something about that tweet that can give a better clue as to how it was generated? Does the fact that it’s geolocated to San Francisco (albeit with the broadest possible coordinates) indicate that it might have come from the Twitter client misbehaving on my phone? (I didn’t follow Matthew Anderson and my phone was on my desk when this was tweeted — so this would be the app going seriously loco.) And what I’m most dying to know: what other tweets refer to the photos from the tweet from Matthew? (I gather that DataSift can answer this question, but I’m not a DataSift customer and they don’t appear to have a free tier.) If there’s a server-side bug afoot here, it wouldn’t be surprising if I’m not the only one affected.

I’m not sure I’m ever going to know the answers to these questions, but I’m leaving the tweet up there in hopes that it will provide some clues — and with the belief that the villain in the story, if ever brought to justice, will be a member of the shadowy cabal that I have fought my entire career: busted software.
