Recently, I made the mistake of rhetorically asking if I needed to spell out why unikernels are unfit for production. The response was overwhelming: whether people feel that unikernels are wrong-headed and are looking for supporting detail or are unikernel proponents and want to know what the counter-arguments could possibly be, there is clearly a desire to hear the arguments against running unikernels in production.
So, what’s the problem with unikernels? Let’s get a definition first: a unikernel is an application that runs entirely in the microprocessor’s privileged mode. (The exact nomenclature varies; on x86 this would be running at Ring 0.) That is, in a unikernel there is no application at all in a traditional sense; instead, application functionality has been pulled into the operating system kernel. (The idea that there is “no OS” serves to mislead; it is not that there isn’t an operating system but rather that the application has taken on the hardware-interfacing responsibilities of the operating system — it is “all OS”, if a crude and anemic one.) Before we discuss the challenges with this, it’s worth first exploring the motivations for unikernels — if only because they are so thin…
The primary reason to implement functionality in the operating system kernel is for performance: by avoiding a context switch across the user-kernel boundary, operations that rely upon transit across that boundary can be made faster. In the case of unikernels, these arguments are specious on their face: between the complexity of modern platform runtimes and the performance of modern microprocessors, one does not typically find that applications are limited by user-kernel context switches. And as shaky as they may be, these arguments are further undermined by the fact that unikernels very much rely on hardware virtualization to achieve any multi-tenancy whatsoever. As I have expanded on in the past, virtualizing at the hardware layer carries with it an inexorable performance tax: by having the system that can actually see the hardware (namely, the hypervisor) isolated from the system that can actually see the app (the guest operating system) efficiencies are lost with respect to hardware utilization (e.g., of DRAM, NICs, CPUs, I/O) that no amount of willpower and brute force can make up. But it’s not worth dwelling on performance too much; let’s just say that the performance arguments to be made in favor of unikernels have some well-grounded counter-arguments and move on.
The other reason given by unikernel proponents is that unikernels are “more secure”, but it’s unclear what the intellectual foundation for this argument actually is. Yes, unikernels often run less software (and thus may have less attack surface) — but there is nothing about unikernels in principle that leads to less software. And yes, unikernels often run new or different software (and are therefore not vulnerable to the OpenSSL vuln-of-the-week) but this security-through-obscurity argument could be made for running any new, abstruse system. The security arguments also seem to whistle past the protection boundary that unikernels very much depend on: the protection boundary between guest OS’s afforded by the underlying hypervisor. Hypervisor vulnerabilities emphatically exist; one cannot play up Linux kernel vulnerabilities as a silent menace while simultaneously dismissing hypervisor vulnerabilities as imaginary. To the contrary, by depriving application developers of the tools of a user protection boundary, the principle of least privilege is violated: any vulnerability in an application tautologically roots the unikernel. In the world of container-based deployment, this takes a thorny problem — secret management — and makes it much nastier (and with much higher stakes). At best, unikernels amount to security theater, and at worst, a security nightmare.
The final reason often given by proponents of unikernels is that they are small — but again, there is nothing tautologically small about unikernels! Speaking personally, I have done kernel implementation on small kernels and big ones; you can certainly have lean systems without resorting to the equivalent of a gastric bypass with a unikernel! (I am personally a huge fan of Alpine Linux as a very lean user-land substrate for Linux apps and/or Docker containers.) And to the degree that unikernels don’t contain much code, it seems more by infancy (and, for the moment, irrelevancy) than by design. But it would be a mistake to measure the size of a unikernel only in terms of its code, and here too unikernel proponents ignore the details of the larger system: because a unikernel runs as a guest operating system, the DRAM allocated by the hypervisor for that guest is consumed in its entirety — even if the app itself isn’t making use of it. Because running out of memory remains one of the most pernicious of application failure modes (especially in dynamic environments), memory sizing tends to be overengineered in that requirements are often blindly doubled or otherwise slopped up. In the unikernel model, any such slop is lost — nothing else can use it because the hypervisor doesn’t know that it isn’t, in fact, in use. (This is in stark contrast to containers in which memory that isn’t used by applications is available to be used by other containers, or by the system itself.) So here again, the argument for unikernels becomes much more nuanced (if not rejected entirely) when the entire system is considered.
So those are the reasons for unikernels: perhaps performance, a little security theater, and a software crash diet. As tepid as they are, these reasons constitute the end of the good news from unikernels. Everything else from here on out is bad news: costs that must be borne to get to those advantages, however flimsy.
The disadvantages of unikernels start with the mechanics of an application itself. When the operating system boundary is obliterated, one may have eliminated the interface for an application to interact with the real world of the network or persistent storage — but one certainly hasn’t forsaken the need for such an interace! Some unikernels (like OSv and Rumprun) take the approach of implementing a “POSIX-like” interface to minimize disruption to applications. Good news: apps kinda work! Bad news: did we mention that they need to be ported? And here’s hoping that your app’s “POSIX-likeness” doesn’t extend to fusty old notions like creating a process: there are no processes in unikernels, so if your app depends on this (ubiquitous, four-decades-old) construct, you’re basically hosed. (Or worse than hosed.)
If this approach seems fringe, things get much further afield with language-specific unikernels like MirageOS that deeply embed a particular language runtime. On the one hand, allowing implementation only in a type-safe language allows for some of the acute reliability problems of unikernels to be circumvented. On the other hand, hope everything you need is in OCaml!
So there are some issues getting your app to work, but let’s say you’re past all this: either the POSIX surface exposed by your unikernel of choice is sufficient for your app (or platform), or it’s already written in OCaml or Erlang or Haskell or whatever. Should you have apps that can be unikernel-borne, you arrive at the most profound reason that unikernels are unfit for production — and the reason that (to me, anyway) strikes unikernels through the heart when it comes to deploying anything real in production: Unikernels are entirely undebuggable. There are no processes, so of course there is no ps, no htop, no strace — but there is also no netstat, no tcpdump, no ping! And these are just the crude, decades-old tools. There is certainly nothing modern like DTrace or MDB. From a debugging perspective, to say this is primitive understates it: this isn’t paleolithic — it is precambrian. As one who has spent my career developing production systems and the tooling to debug them, I find the implicit denial of debugging production systems to be galling, and symptomatic of a deeper malaise among unikernel proponents: total lack of operational empathy. Production problems are simply hand-waved away — services are just to be restarted when they misbehave. This attitude — even when merely implied — is infuriating to anyone who has ever been responsible for operating a system. (And lest you think I’m an outlier on this issue, listen to the applause in my DockerCon 2015 talk after I emphasized the need to debug systems rather than restart them.) And if it needs to be said, this attitude is angering because it is wrong: if a production app starts to misbehave because of a non-fatal condition like (say) listen drops, restarting the app is inducing disruption at the worst possible time (namely, when under high load) and doesn’t drive at all towards the root cause of the problem (an insufficient backlog).
Now, could one implement production debugging tooling in unikernels? In a word, no: debugging tooling very often crosses the user-kernel boundary, and is most effective when leveraging the ad hoc queries that the command line provides. The organs that provide this kind of functionality have been deliberately removed from unikernels in the name of weight loss; any unikernel that provides sufficiently sophisticated debugging tooling to be used in production would be violating its own dogma. Unikernels are unfit for production not merely as implemented but as conceived: they cannot be understood when they misbehave in production — and by their own assertions, they never will be able to be.
All of this said, I do find some common ground with proponents of unikernels: I agree that the container revolution demands a much leaner, more secure and more efficient run-time than a shared Linux guest OS running on virtual hardware — and at Joyent, our focus over the past few years has been delivering exactly that with SmartOS and Triton. While we see a similar problem as unikernel proponents, our approach is fundamentally different: instead of giving up on the notion of secure containers running on a multi-tenant substrate, we took the already-secure substrate of zones and added to it the ability to natively execute Linux binaries. That is, we chose to leverage advances in operating systems rather than deny their existence, bringing to Linux and Docker not only secure on-the-metal containers, but also critical advances like ZFS, Crossbow and (yes) DTrace. This merits a final reemphasis: our focus on production systems is reflected in everything we do, but most especially in our extensive tooling for debugging production systems — and by bringing this tooling to the larger world of Linux containers, Triton has already allowed for production debugging that we never before would have thought possible!
In the fullness of time, I think that unikernels will be most productive as a negative result: they will primarily serve to demonstrate the impracticality of their approach for production systems. As such, they will join transactional memory and the M-to-N scheduling model as breathless systems software fads that fell victim to the merciless details of reality. But you needn’t take my word for it: as I intimated in my tweet, undebuggable production systems are their own punishment — just kindly inflict them upon yourself and not the rest of us!
At the beginning of the year, I laid down a few predictions. While I refuse on principle to engage in Stephen O’Grady-style self-flagellation, I do think it’s worth revisiting the headliner prediction, namely that 2015 is the year of the container. I said at the time that it wasn’t particularly controversial, and I don’t think it’s controversial now: 2015 was the year of the container, and one need look no further than the explosion of container conferences with container camps and container summits and container cons.
My second prediction was marginally more subtle: that the impedence mismatch between containers in development and containers in production would be a wellspring of innovation. If anything, this understated the case: the wellspring turned out to be more like an open sluice, and 2015 saw the world flooded with multiple ways of doing seemingly everything when it comes to containers. That all of these technologies and frameworks are open source have served to accelerate them all, and mutations abound (Hypernetes, anyone?).
On the one hand this is great, as we all benefit by so many people exploring so many different ideas. But on the other hand, the flurry of choice can become a blizzard of confusion — especially when and where there is seemingly overlap between two technologies. (Or worse, when two overlapping and opinionated technologies disagree ardently on those opinions!) This slide from Karl Isenberg of Mesosphere at KubeCon last month captured it; the point is neither the specific technologies (as Karl noted, plenty are missing) and nor is it about the specific layers (many would likely quibble with some of the details of Karl’s taxonomy) but rather about the explosion of abstraction (and concomitant confusion) in this domain.
One of the biggest challenges that we have in containers heading into 2016 is that this confusion now presents significant head winds for early-adopters and second-movers alike. This has become so acute that I posed a question to KubeCon attendees: are we at or near Peak Confusion in the container space? The conclusion among everyone I spoke with (vendors, developers, operators and others) was that we’re nowhere near Peak Confusion — with many even saying that confusion is still accelerating. (!) Even for those of us who have been in containers for years, this has been a little terrifying — and I can imagine for those entirely new to containers, it’s downright paralyzing.
So, what’s to be done? I think much of the responsibility lies with the industry: instead of viewing containers as new territory for conquest, we must take it upon ourselves to assure for users an interoperable and composable future — one in which technologies can differentiate themselves based on the qualities of their implementation rather than the voraciousness of their appetite. Lest this sound utopian, it is this same ethos that underlies our modern internet, as facilitated by the essential work of the Internet Engineering Task Force (IETF). Thanks to the IETF and its ethos of “rough consensus and running code” we ended up with the interoperable internet. (Indeed, this text itself was brought to you by RFC 791, RFC 793, RFC 1034, and RFC 2616 — among many, many others.)
As for an entity that can potentially serve an IETF-like role for container-based computing, I look with guarded optimism to today’s lauch of the Cloud-native Computing Foundation. Joyent has been involved with the CNCF since its inception, and based on what we’ve seen so far, we see great promise for it in 2016 and beyond. We believe that by elucidating component boundaries and by fostering open source projects that share the values of interoperability and composability, the CNCF can combine the best attributes of both the IETF and the Apache Foundation: rough consensus and running, open source software that allows elastic, container-deployed, service-oriented infrastructure. If the CNCF can do this it will (we believe) serve a vital mission for practitioners: displace confusion with clarity — and therefore accelerate our collective cloud-native future!
One of the exciting challenges of being an all open source company is figuring out how to get design conversations out of the lunch time discussion and the private IRC/Jabber/Slack channels and into the broader community. There are many different approaches to this, and the most obvious one is to simply use whatever is used for issue tracking. Issue trackers don’t really fit the job, however: they don’t allow for threading; they don’t really allow for holistic discussion; they’re not easily connected with a single artifact in the repository, etc. In short, even on projects with modest activity, using issue tracking for design discussions causes the design discussions to be drowned out by the defects of the day — and on projects with more intense activity, it’s total mayhem.
So if issue tracking doesn’t fit, what’s the right way to have an open source design discussion? Back in the day at Sun, we had the Software Development Framework (SDF), which was a decidedly mixed bag. While it was putatively shrink-to-fit, in practice it felt too much like a bureaucratic hurdle with concomitant committees and votes and so on — and it rarely yielded productive design discussion. That said, we did like the artifacts that it produced, and even today in the illumos community we find that we go back to the Platform Software Architecture Review Committee (PSARC) archives to understand why things were done a particular way. (If you’re looking for some PSARC greatest hits, check out PSARC 2002/174 on zones, PSARC 2002/188 on least privilege or PSARC 2005/471 on branded zones.)
In my experience, the best part of the SDF was also the most elemental: it forced things to be written down in a forum reserved for architectural discussions, which alone forced some basic clarity on what was being built and why. At Joyent, we have wanted to capture this best element of the SDF without crippling ourselves with process — and in particular, we have wanted to allow engineers to write down their thinking while it is still nascent, such that it can be discussed when there is still time to meaningfully change it! This thinking, as it turns out, is remarkably close to the original design intent of the IETF’s Request for Comments, as expressed in RFC 3:
The content of a note may be any thought, suggestion, etc. related to the software or other aspect of the network. Notes are encouraged to be timely rather than polished. Philosophical positions without examples or other specifics, specific suggestions or implementation techniques without introductory or background explication, and explicit questions without any attempted answers are all acceptable. The minimum length for a note is one sentence.
These standards (or lack of them) are stated explicitly for two reasons. First, there is a tendency to view a written statement as ipso facto authoritative, and we hope to promote the exchange and discussion of considerably less than authoritative ideas. Second, there is a natural hesitancy to publish something unpolished, and we hope to ease this inhibition.
We aren’t the only ones to be inspired by the IETF’s venerable RFCs, and the language communities in particular seem to be good at this: Java has Java Specification Requests, Python has Python Enhancement Proposals, Perl has the (oddly named) Perl 6 apocalypses, and Rust has Rust RFCs. But the other systems software communities have been nowhere near as structured about their design discussions, and you are hard-pressed to find similar constructs for operating systems, databases, container management systems, etc.
Encouraged by what we’ve seen by the language communities, we wanted to introduce RFCs for the open source system software that we lead — but because we deal so frequently with RFCs in the IETF context, we wanted to avoid the term “RFC” itself: IETF RFCs tend to be much more formalized than the original spirit, and tend to describe an agreed-upon protocol rather than nascent ideas. So to avoid confusion with RFCs while still capturing some of what they were trying to solve, we have started a Requests for Discussion (RFD) repository for the open source projects that we lead. We will announce an RFD on the mailing list that serves the community (e.g., sdc-discuss) to host the actual discussion, with a link to the corresponding directory in the repo that will host artifacts from the discussion. We intend to kick off RFDs for the obvious things like adding new endpoints, adding new commands, adding new services, changing the behavior of endpoints and commands, etc. — but also for the less well-defined stuff that captures earlier thinking.
Finally, for the RFD that finally got us off the mark on doing this, see RFD 1: Triton Container Naming Service. Discussion very much welcome!
Once, long ago, there was an engineer who broke the operating system particularly badly. Now, if you’ve implemented important software for any serious length of time, you’ve seriously screwed up at least once — but this was notable for a few reasons. First, the change that the engineer committed was egregiously broken: the machine that served as our building’s central NFS server wasn’t even up for 24 hours running the change before the operating system crashed — an outcome so bad that the commit was unceremoniously reverted (which we called a “backout”). Second, this wasn’t the first time that the engineer had been backed out; being backed out was serious, and that this had happened before was disconcerting. But most notable of all: instead of taking personal responsibility for it, the engineer had the audacity to blame the subsystem that had been the subject of the change. Now on the one hand, this wasn’t entirely wrong: the change had been complicated and the subsystem that was being modified was a bit of a mess — and it was arguably a preexisting issue that had merely been exposed by the change. But on the other hand, it was the change that exposed it: the subsystem might have been brittle with respect to such changes, but it had at least worked correctly prior to it. My conclusion was that the problem wasn’t the change per se, but rather the engineer’s decided lack of caution when modifying such a fragile subsystem. While the recklessness that had become a troubling pattern for this particular engineer, it seemed that there was a more abstract issue: how does one safely make changes to a large, complicated, mature software system?
Hoping to channel my frustration into something positive, I wrote up an essay on the challenges of developing Solaris, and sent it out to everyone doing work on the operating system. The taxonomy it proposed turned out to be useful and embedded itself in our engineering culture — but the essay itself remained private (it pre-dated blogs.sun.com by several years). When we opened the operating system some years later, the essay was featured on opensolaris.org. But as that’s obviously been ripped down, and because the taxonomy seems to hold as much as ever, I think it’s worth reiterating; what follows is a polished (and lightly updated) version of the original essay.
In my experience, large software systems — be they proprietary or open source — have a complete range of software quality within their many subsystems.
Some subsystems you find are beautiful works of engineering — they are squeaky clean, well-designed and well-crafted. These subsystems are a joy to work in but (and here’s the catch) by virtue of being well-designed and well-implemented, they generally don’t need a whole lot of work. So you’ll get to use them, appreciate them, and be inspired by them — but you probably won’t spend much time modifying them. (And because these subsystems are such a pleasure to work in, you may find that the engineer who originally did the work is still active in some capacity — or that there is otherwise a long line of engineers eager to do any necessary work in such a rewarding environment.)
Other subsystems are cobbled-together piles of junk — reeking garbage barges that have been around longer than anyone remembers, floating from one release to the next. These subsystems have little-to-no comments (or what comments they have are clearly wrong), are poorly designed, needlessly complex, badly implemented and virtually undebuggable. There are often parts that work by accident, and unused or little-used parts that simply never worked at all. They manage to survive for one or more of the following reasons:
- They work just well enough to not justify the cost of either rewriting them or switching them out
- The problem they solve isn’t important enough to justify the cost of rewriting them or switching them out
- The problem they solve is so nasty that the cost of a rewrite or a switch is enormous — or at least that it dwarfs the cost of ongoing maintenance
If you find yourself having to do work in one of these subsystems, you must exercise extreme caution: you will need to write as many test cases as you can think of to beat the snot out of your modification, and you will need to perform extensive self-review. You can try asking around for assistance, but you’ll quickly discover that no one is around who understands the subsystem. Your code reviewers probably won’t be able to help much either — maybe you’ll find one or two people that have had the same misfortune that you find yourself experiencing, but it’s more likely that you will have to explain most aspects of the subsystem to your reviewers. You may discover as you work in the subsystem that maintaining it is simply untenable — and it may be time to consider rewriting the subsystem from scratch. (After all, most of the subsystems that are in the first category replaced subsystems that were in the second.) One should not come to this decision too quickly — rewriting a subsystem from scratch is enormously difficult and time-consuming. Still, don’t rule it out a priori.
Even if you decide not to rewrite such a subsystem, you should improve it while you’re there in manners that don’t introduce excessive risk. For example, if something took you a while to figure out, don’t hesitate to add a block comment to explain your discoveries. And if it was a pain in the ass to debug, you should add the debugging support that you found lacking. This will make it slightly easier on the next engineer — and it will make it easier on you when you need to debug your own modifications.
Most subsystems, however, don’t actually fall neatly into either of these categories — they are somewhere in the middle. That is, they have parts that are well thought-out, or design elements that are sound, but they are also littered with implicit intradependencies within the subsystem or implicit interdependencies with other subsystems. They may have debugging support, but perhaps it is incomplete or out of date. Perhaps the subsystem effectively met its original design goals, but it has been extended to solve a new problem in a way that has left it brittle or overly complex. Many of these subsystems have been fixed to the point that they work reliably — but they are delicate and they must be modified with care.
The majority of work that you will do on existing code will be to subsystems in this last category. You must be very cautious when making changes to these subsystems. Sometimes these subsystems have local experts, but many changes will go beyond their expertise. (After all, part of the problem with these subsystems is that they often weren’t designed to accommodate the kind of change you might want to make.) You must extensively test your change to the subsystem. Run your change in every environment you can get your hands on, and don’t be be content that the software seems to basically work — you must beat the hell out of it. Obviously, you should run any tests that might apply to the subsystem, but you must go further. Sometimes there is a stress test available that you may run, but this is not a substitute for writing your own tests. You should review your own changes extensively. If it’s multithreaded, are you obeying all of the locking rules? (What are the locking rules, anyway?) Are you building implicit new dependencies into the subsystem? Are you using interfaces in a new way that may present some new risk? Are the interfaces that the subsystem exports being changed in a way that violates an implicit assumption that one of the consumers was making? These are not questions with easy answers, and you’ll find that it will often be grueling work just to gain confidence that you are not breaking or being broken by anything else.
If you think you’re done, review your changes again. Then, print your changes out, take them to a place where you can concentrate, and review them yet again. And when you review your own code, review it not as someone who believes that the code is right, but as someone who is certain that the code is wrong: review the code as if written by an archrival who has dared you to find anything wrong with it. As you perform your self-review, look for novel angles from which to test your code. Then test and test and test.
It can all be summed up by asking yourself one question: have you reviewed and tested your change every way that you know how? You should not even contemplate pushing until your answer to this is an unequivocal YES.. Remember: you are (or should be!) always empowered as an engineer to take more time to test your work. This is true of every engineering team that I have ever or would ever work on, and it’s what makes companies worth working for: engineers that are empowered to do the Right Thing.
Production quality all the time
You should assume that once you push, the rest of the world will be running your code in production. If the software that you’re developing matters, downtime induced by it will be painful and expensive. But if the software matters so much, who would be so far out of their mind as to run your changes so shortly after they integrate? Because software isn’t (or shouldn’t be) fruit that needs to ripen as it makes its way to market — it should be correct when it’s integrated. And if we don’t demand production quality all the time, we are concerned that we will be gripped by the Quality Death Spiral. The Quality Death Spiral is much more expensive than a handful of outages, so it’s worth the risk — but you must do your part by delivering production quality all the time.
Does this mean that you should contemplate ritual suicide if you introduce a serious bug? Of course not — everyone who has made enough modifications to delicate, critical subsystems has introduced a change that has induced expensive downtime somewhere. We know that this will be so because writing system software is just so damned tricky and hard. Indeed, it is because of this truism that you must demand of yourself that you not integrate a change until you are out of ideas of how to test it. Because you will one day introduce a bug of such subtlety that it will seem that no one could have caught it.
And what do you do when that awful, black day arrives? Here’s a quick coping manual from those of us who have been there:
- Don’t pretend it didn’t happen — you screwed up, but your mother still loves you
- Don’t minimize the problem, shrug it off or otherwise make light of it — this is serious business, and your colleagues take it seriously
- If someone spent time debugging your bug, thank them
- If someone was inconvenienced by your bug, apologize to them
- Take responsibility for your bug — don’t bother to blame other subsystems, the inherent complexity of the software, your code reviewers, your testers, the community, etc.
- If it was caught before it was running in production, be thankful that a production user wasn’t affected by it
But most importantly, you must ask yourself: what could I have done differently? If you honestly don’t know, ask a fellow engineer to help you. We’ve all been there, and we want to make sure that you are able to learn from it. Once you have an answer, take solace in it; no matter how bad you feel for having introduced a problem, you can know that the experience has improved you as an engineer — and that’s the most anyone can ask for.
The older I get, the more engineering values matter to me — and the more I seek out shared values in those with whom I endeavor to build things. For us at Joyent, those engineering values reflect that we operate the software we make: we believe that foundational systems must be designed to be robust and high-performing — and when they fail in this regard, it is incumbent upon the system itself to provide the tooling to diagnose the errant behavior. These values are not new (indeed, they are some of the oldest in computing), but there are times when they can feel endangered. It is our belief that the rise of cloud computing has — if anything — made the traditional values of systems software robustness more important. Recently, I’ve had the opportunity to get to know some of the Google engineers involved in the Kubernetes effort, and I have found that they broadly share Joyent’s engineering values — that they too seek to build a robust software substrate, as informed by their (substantial) experience operating systems at scale. Given our shared values, I was particularly pleased to learn of Google’s desire to create a new kind of foundation with their formation of the Cloud-native Computing Foundation. Today, I am excited to announce that Joyent is a charter member of the Cloud-native Computing Foundation, as it represents the values we sought to embody in the Triton stack — and I am honored to have been personally asked to serve on the foundation’s technical steering committee. We believe that we haven’t just joined a(nother) foundation, we have joined with those who share the mission that we have always had for ourselves: to help effect the next revolution in computing.
That I could possibly be so enthusiastic for a foundation merits further explanation, as I have historically been very forthright with my skepticism about foundations with respect to open source: three years ago, in a presentation on Corporate Open Source Anti-patterns (video), I described the insistence of giving newly-opened source code to a foundation as an anti-pattern, noting that giving up ownership also eschews leadership. I further cautioned that many underestimate the complexity and constraints of a 501(c)(3) — while overestimating the need for an explicitly non-profit organization’s involvement in a company’s open source efforts. While these statements about foundations were unequivocal, I also ended that presentation by saying that my observations shouldn’t be perceived as hard rules — and implied that the thinking may change over time as we continue to learn from our own experiences.
Three years after that presentation, I still broadly stand by my claims — but (as my enthusiasm for the Cloud Native Computing Foundation indicates) foundations are one area where my thinking has definitely shifted. In particular, in those rare instances when an open source technology reaches a level of ubiquity such as to sediment into collective bedrock, I believe that it actually does belong in a foundation. How do you know if your open source project is in this category? If multiple companies are betting their future on your open source project, congratulate yourself for laying down the bedrock upon which others are building — and then get it into a foundation to assure its future. This can be hard to internalize (after all, you have almost certainly put more resources into it than anyone else; why should you be expected to simply give that away?!), but the reality is that the commercial pressures that are now being exerted on your (incredibly popular!) technology will rip it apart if you don’t preserve its fate. This can be doubly frustrating when you feel you are acting in the community’s best interests, but as soon as that community includes rival commercial interests, only a foundation can provide the necessary (but not sufficient!) neutrality to assure the community that the technology’s future transcends the fate of any one company. Certainly, we learned all this the hard way with node.js — but the problem is in no way unique to node.js or to Joyent. Indeed, with open source now essentially a constraint on new infrastructure software, we can expect this transition (from corporate-owned open source to foundation-owned open source) will happen with increasing frequency. (Should you find yourself at OSCON this week, this trend and its ramifications is the subject of my talk on Thursday.)
In this regard, the Docker world has been particularly interesting of late: the domain is entirely open source, with many companies (including Joyent!) betting their futures not just on Docker, but on the many other technologies in the ecosystem. With so much bedrock suddenly forming, foundations were practically preordained — so it was no surprise to see the announcement of the Open Container Project at DockerCon just a few weeks ago. We at Joyent applaud these developments (and we are a charter member of the OCP), but I confess that the sprouting of foundations has left me feeling somewhat underwhelmed: are we really to have a foundation for every GitHub repo that reaches a certain level of popularity? To be clear, I don’t object to the foundations in the abstract so much as the cacophony of their putative missions: having the mission of a foundation being merely to promote a particular technology feels like it’s aiming a bit low in Maslow’s hierarchy of needs. Now, one can certainly collect open source software into a foundation like the Apache Foundation — but as we move to a world where an increasing amount of software is open source, what becomes of their mission? Foundations that are amalgamations of otherwise unrelated software seem to me to run the risk of becoming open source orphanages: providing shelter and a modicum of structure, perhaps, but lacking a sense of collective purpose.
The promise of the Cloud-native Computing Foundation is that it offers a potential third model: while the foundation will serve as the new home for Kubernetes, it’s not limited to Kubernetes — nor is it an open source dumping ground. Rather, this foundation is dedicated to a particular ethos: the creation of the new kinds of application and (especially) service stacks that represent modern, server-side computing. That is, it is a foundation with a true mission: to advance key open source technologies that constitute modern, elastic computing. As such, it seeks to transcend any single technology — it has a raison d’être that runs deeper than mere self-preservation. I would like to think that this third parth can serve as a model in the new, all-open world: foundations as entities that don’t let their corporate neutrality prevent them from being opinionated as to their mission, their constituent technologies or — importantly — their engineering values!
When Docker first rocketed into the nerdosphere in 2013, some wondered how we at Joyent felt about its popularity. Having run OS containers in multi-tenant production for nearly a decade (and being one of the most vocal proponents of OS-based virtualization), did we somehow resent the relatively younger Docker? Some were surprised to learn that (to the contrary!) we have been elated to see the rise of Docker: we share with Docker a vision for a containerized future, and we love that Docker has brought the technology to a much broader audience — and via an entirely different vector (namely, emphasizing developer agility instead of merely operational efficiency). Given our enthusiasm, you can imagine the question we posed to ourselves over a year ago: could we somehow combine the operational strength of SmartOS containers with the engaging developer experience of Docker? Importantly, we had no desire to develop a “better” Docker — we merely wanted to use SmartOS and SmartDataCenter as a substrate upon which to deploy Docker containers directly onto the metal. Doing this would leverage over a decade of deep operating systems engineering with technologies like Crossbow, ZFS, DTrace and (of course) Zones — and would deliver all of the operational advantages of pure OS-based virtualization to Docker containers: performance, elasticity, security and density.
That said, there was an obvious hurdle: while designed to be cross-platform, Docker is a Linux-borne technology — and the repository of Docker images is today a collection of Linux binaries. While SmartOS is Unix, it (somewhat infamously) isn’t Linux: applications need to be at least recompiled (if not ported) to work on SmartOS. Into this gap came a fortuitous accident: David Mackay, a member of the illumos community, attempted to revive LX-branded zones, an old Sun project that provided Linux emulation in a zone. While this project had been very promising when it was first done years ago, it had also been restricted to emulating a 2.4 Linux kernel for 32-bit binaries — and it was clear at the time that modernizing it was going to be significant work. As a result, the work sat unattended in the system for a while before being unceremoniously ripped out in 2010. It seemed clear that with the passage of time, this work would hardly be revivable: it had been so long, any resurrection was going to be tantamount to a rewrite.
But fortunately, David didn’t ask us our opinion before he attempted to revive it — he just did it. (As an aside: a tremendous advantage of open source is that the community can perform experiments that you might deem too risky or too expensive in terms of opportunity cost!) When David reported his results, we were taken aback: yes, this had the same limitations that it had always had (namely, 32-bit and lacking many modern Linux facilities), but given how many modern binaries still worked, it was also clear that this was a more viable path than we had thought. Energized by David’s results, Joyent’s Jerry Jelinek picked it up from there, reintegrating the Linux brand into SmartOS in March of last year. There was still much to do of course, but Jerry’s work was a start — and reflected the constraints we imposed on ourselves: do it all in the open; do it all on SmartOS master; develop general-purpose illumos facilities wherever possible; and aim to upstream it all when we were done.
Around this time, I met with Docker CTO Solomon Hykes to share our (new) vision. Honestly, I didn’t know what his reaction would be; I had great respect for what Docker had done and was doing, but didn’t know how he would react to a system bold enough to go its own way at such a fundamental level. Somewhat to my surprise, Solomon was incredibly supportive: not only was he aware of SmartOS, but he was also intimately familiar with zones — and he didn’t need to be convinced of the merits of our approach. Better, he asked a question near and dear to my heart: “Does this mean that I’ll be able to DTrace my Linux apps in a Docker container?” When I indicated that yes, that’s exactly what it would mean, he responded: “It will be the best of all worlds!” That Solomon (and by extension, Docker) was not merely willing but actually eager to see Docker on SmartOS was hugely inspirational to us, and we redoubled our efforts.
Back at Joyent, we worked assiduously under Jerry’s leadership over the spring and summer, and by the fall, we were ready for an attempt on the summit: 64-bit. Like other bringup work we’ve done, this work was terrifying in that we had very little forward visibility, and little ability to parallelize. As if he were Obi-Wan Kenobi meeting Darth Vader in the Death Star, Jerry had to face 64-bit — alone. Fortunately, Jerry didn’t suffer Ben Kenobi’s fate; by late October, he had 64-bit working! With the project significantly de-risked, everything kicked into high gear: Josh Wilsdon, Trent Mick and their team went to work understanding how to integrate SmartDataCenter with Docker; Josh Clulow, Patrick Mooney and I attacked some of the nasty LX-branded zone issues that remained; and Robert Mustacchi and Rob Gulewich worked towards completing their vision for network virtualization. Knowing what we were going to do — and how important open source is to modern infrastructure software in general and Docker in particular — we also took an important preparatory step: we open sourced SmartDataCenter and Manta.
Charged by having all of our work in the open and with a clear line of sight on what we wanted to deliver, progress was rapid. One major question: where to run the Docker daemon? In digging into Docker, we saw that much of what the actual daemon did would need to be significantly retooled to be phrased in terms of not only SmartOS but also SmartDataCenter. However, our excavations also unearthed a gem: the Docker Remote API. Discovering a robust API was a pleasant surprise, and it allowed us to take a different angle: instead of running a (heavily modified) Docker daemon, we could implement a new SDC service to provide a Docker Remote API endpoint. To Docker users, this would look and feel like Docker — and it would give us a foundation that we knew we could develop. At this point, we’re pretty good at developing SDC-based services (microservices FTW!), and progress on the service was quick. Yes, there were some thorny issues to resolve (and definitely note differences between our behavior and the stock Docker behavior!), but broadly speaking we have been able to get it to work without violating the principle of least surprise. And from a Docker developer perspective, having a Docker host that represents an entire datacenter — that is, a (seemingly) galactic Docker host — feels like an important step forward. (Many are as excited by this work as we are, but I think my favorite reaction is the back-handed compliment from Jeff Waugh of Canonical fame; somehow a compliment that is tied to an insult feels indisputably earnest.)
With everything coming together, and with new hardware being stood up for the new service, there was one important task left: we needed to name this thing. (Somehow, “SmartOS + LX-branded zones + SmartDataCenter + sdc-portolan + sdc-docker” was a bit of a mouthful.) As we thought about names, I turned back to Solomon’s words a year ago: if this represented the best of two different worlds, what mythical creatures were combinations of different animals? While this search yielded many fantastic concoctions (a favorite being Manticore — and definitely don’t mess with Typhon!), there was one that stood out: Triton, son of Poseidon. As half-human and half-fish and a god of the deep, Triton represents the combination of two similar but different worlds — and as a bonus, the name rolls off the tongue and fits nicely with the marine metaphor that Docker has pioneered.
So it gives me great pleasure to introduce Triton to the world — a piece of (open source!) engineering brought to you by a cast of thousands, over the course of decades. In a sentence (albeit a wordy one), Triton lets you run secure Linux containers directly on bare metal via an elastic Docker host that offers tightly integrated software-defined networking. The service is live, so if you want to check it out, sign up! If you’re looking for more technical details, check out both Casey’s blog entry and my Future of Docker in Production presentation. If you’d like it on-prem, get in touch. And if you’d prefer to DIY, start with sdc-docker. Finally, forgive me one shameless plug: if you happen to be in the New York City area in early April, be sure to join us at the Container Summit, where we’ll hear perspectives from analysts like Gartner, enterprise users of containers like Lucera and Walmart, and key Docker community members like Tutum, Shopify, and Docker themselves. Should make for an interesting afternoon!
Welcome to Triton — and to the best of all worlds!
Recently, Randy Bias of EMC (formerly of CloudScaling) wrote an excellent piece on Why “Vanilla OpenStack” doesn’t exist and never will. If you haven’t read it and you are anywhere near a private cloud effort, you should consider it a must-read: Randy debunks the myth of a vanilla OpenStack in great detail. And it apparently does need debunking; as Randy outlines, those who are deploying an on-premises cloud expect:
- A uniform, monolithic cloud operating system (like Linux)
- Set of well-integrated and interoperable components
- Interoperability with their own vendors of choice in hardware, software, and public cloud
We at Joyent can vouch for these expectations, because years ago we had the same aspirations for our own public cloud. Though perhaps unlike others, we have also believed in the operating system as differentiator — and specifically, that OS containers are the foundation of elastic infrastructure — so we didn’t wait for a system to emerge, but rather endeavored to write our own. That is, given the foundation of our own container-based operating system — SmartOS — we set out to build exactly what Randy describes: a set of well-integrated, interoperable components on top of a uniform, monolithic cloud operating system that would allow us to leverage the economics of commodity hardware. This became SmartDataCenter, a container-centric distributed system upon which we built our own cloud and which we open sourced this past November.
The difference between SmartDataCenter and OpenStack mirrors the difference between the expectations for OpenStack and the reality that Randy outlines: where OpenStack is accommodating of many different visions for the cloud, SmartDataCenter is deliberately opinionated. In SmartDataCenter you don’t pick the storage substrate (it’s ZFS) or the hypervisor (it’s SmartOS) or the network virtualization (it’s Crossbow). While OpenStack deliberately accommodates swapping in different architectural models, SmartDataCenter deliberately rejects it: we designed it for commodity storage (shared-nothing — for good reason), commodity network equipment (no proprietary SDN) and (certainly) commodity compute. So while we’re agnostic with respect to hardware (as long as it’s x86-based and Ethernet-based), we are prescriptivist with respect to the software foundation that runs upon it. The upshot is that the integrator/operator retains control over hardware (and the different economic tradeoffs that that control allows), but needn’t otherwise design the system themselves — which we know from experience can result in greatly reduced times of deployment. (Indeed, one of the great prides of SmartDataCenter is our ease of install: provided you’re racked, stacked and cabled, you can get a cloud stood up in a matter of hours rather than days, weeks or longer.)
So in apparent contrast to OpenStack, SmartDataCenter only comes in “vanilla” (in Randy’s parlance). This is not to say that SmartDataCenter is in any way plain; to the contrary, by having such a deliberately designed foundation, we can unlock rapid innovation, viz. our emerging Docker integration with SmartDataCenter that allows for Docker containers to be deployed securely and directly on the metal. We are very excited about the prospects of Docker on SmartDataCenter, and so are other people. So in as much as SmartDataCenter is vanilla, it definitely comes with whipped cream and a cherry on top!
Fifteen years ago, I initiated a time-honored tradition among my colleagues in kernel development at Sun: shortly after the first of every year, we would get together at our favorite local restaurant to form predictions for the coming year. We made one-year, three-year and six-year predictions for both our technologies and more broadly for the industry. We did this for nine years running — from 2000 to 2008 inclusive — and came to know the annual ritual as a play on the restaurant name: Predicteria.
I have always been in interested in our past notions of the future (hoverboards and self-lacing shoes FTW!), and looking back now at nearly a decade of our predictions has led me to an inescapable (and perhaps obvious) conclusion: predictions tell you more about the present than the future. That is, predictions reflect the zeitgeist of the day — both in substance and in tone: in good years, people predict halcyon days; in bad ones, the apocalypse. And when a particular company or technology happened to be in the news or otherwise on the collective mind, predictions tended to be centered around it: it was often the case that several people would predict that a certain company would be acquired or that a certain technology would flourish — or perish. (Let the record reflect that the demise of Itanium was accurately predicted many times over.)
Which is not to say that we never made gutsy predictions; in 2006, a colleague made a one-year prediction that “GOOG embarrassed by revelation of unauthorized US government spying at Gmail.” The timing may have been off, but the concern was disturbingly prescient. Sometimes the predictions were right, but for the wrong reasons: in 2003, one of my three-year predictions was that “Apple develops new ‘must-have’ gadget called the iPhone, a digital camera/MP3 player/cell phone.” This turned out to be stunningly accurate, even down to the timing (and it was by far my most accurate big prediction over the years), but if you can’t tell by the snide tone, I thought that such a thing would be Glass-like in its ludicrousness; I had not an inkling as to its transformative power. (And indeed, when the iPhone did in fact emerge a few years later, several at Predicteria predicted that it would be a disappointing flop.)
But accurate predictions were the exception, not the rule; our predictions were usually wrong — often wildly so. Evergreen wildly wrong predictions included: the rise of carbon nanotube-based memory, the relevance of quantum computing, and the death of tape, disk or volatile DRAM (each predicted several times over). We were also wrong by our omissions: as a group, we entirely failed to predict cloud computing — or even the rise of hardware-based virtualization.
I give all of this as a backdrop to some predictions for the coming year. If my experience taught me anything, it’s that these predictions may very well be right on trajectory, but wrong on timing — and that they may well capture current thinking more than they meaningfully predict the future. They also may be (rightfully) criticized for, as they say, talking our book — but we have made our bets based on where we think things are going, not vice versa. And finally, I apologize that these are somewhat milquetoast predictions; I’m afraid that practical concerns muffle the gutsy predictions that name names and boldly predict their fates!
Without further ado, looking forward to 2015:
- 2015 is the year of the container. If you’ve read our 2014 in review, you won’t be at all surprised to read this and I don’t really think that it’s controversial. Thanks to Docker, the world is finally figuring out that OS-based virtualization is actually highly disruptive (better performance and lower cost!), and I think that this realization will become mainstream in 2015. I don’t think that Docker will necessarily be the only substrate, but I believe that its momentum is such that it will continue to dominate in 2015.
- The impedance mismatch between containers in development and containers in production will be a wellspring of innovation. Currently, containers have a ton of developer enthusiasm — but limited production deployments, in part because of production concerns around security, persistence and network virtualization. All of these problems have multiple solutions — and there will be still more in 2015 (not least from us). It’s hard to predict winners (or if winners will even emerge in 2015), but it’s a sure bet that there will be (many) players tackling the problems in interesting ways.
- The on-prem versus public cloud debate will be increasingly seen as a false dichotomy. As I outlined at GigaOm Structure in June, the future is heterogenous: the public versus on-prem distinction will be be a rent-versus-buy decision based on economics, risk management and latency. Entities will not have one answer here, but many: one is not going to vanquish the other. That said, we also believe that choosing one versus the other shouldn’t involve having to change the underlying technology substrate — a major factor for us when we open sourced SmartDataCenter in November.
- The internet-of-things broadens beyond mere trend. I continue to believe that IoT is an absolute monster of a shift — one that will have profound changes for nearly every economic endeavor. As Marc Andreessen famously observed, “software is eating the world” — and ubiquitous computing will increasingly be the buffet. At Velocity 2014, I described architecting for the deluge of data, and I think that the arrival of machine-generated data (and the analytics to turn that data into business decisions) will be a defining characteristic of the next decade(s) — part of why we open sourced Manta when we opened SmartDataCenter. To what degree this happens in 2015 is unclear, but at some point we will stop thinking of IoT as a “trend” and realize that it is simply part of the rising tide of the information society.
- Wearable computing fizzles. If wearable computing ever has a time, it isn’t now: devices out there today are gimmicky, brittle, and — to put it euphemistically — overly flashy. These devices will continue to enjoy some life among the one-percenter Sharper Image set, but they won’t actually make the leap to broad consumer devices. (An exception may be the smart onesie; speaking from my own experience, first-time parents are suckers for anything that seems to relate to infant safety.)
- Security concerns become focused on intrusion and fraud detection. If the Target breach didn’t wake people up, the Sony Pictures fracas sure as hell did. As we head into 2015, it is being more broadly recognized that security is not a public vs. on-prem issue (after all, the biggest breaches have been on-prem) — and that disgruntled insiders may represent the greatest single threat to an entity. Security (and specifically, intrusion and fraud detection) will be viewed increasingly as a real-time big data problem — one that could save a business from an existential threat.
- Cryptocurrencies begin their long journey to laughingstock. In 2015, cryptocurrencies will stop being viewed as having “one bad year” — and start being viewed for what they are: modern-day Beanie Babies for ultra-libertarian math nerds. It’s stunning to me how many otherwise intelligent people fell for cryptocurrency, having forgotten (or failed to understand in the first place) what a currency actually is: a medium for exchange and a store of value. As such, currencies that experience acute inflation or (worse) acute deflation tautologically fail. We actually have a lot of (willfully ignored) macroeconomic history that tells us this; central banking was only invented after we had had disastrous experiences with every other way of managing currency (viz. the Panic of 1837). And this isn’t just like, my opinion, man: no one purchasing or selling goods is going to opt into a currency that introduces volatility risk in an orthogonal (and often low margin) commercial transaction. (Indeed, in the you-can’t-make-this-up department, Bitcoin has become too volatile to be reliably used for ransom.) Of course, proponents are now saying that “the value isn’t in the currency but in the blockchain application stack” — which is like saying the value of Beanie Babies is in their soft and cuddly nature: yes the value of Punchers the Lobster is non-zero, but it doesn’t mean that it’s actually worth 2,600 clams.
Right or wrong, these predictions point to an exciting 2015. And if nothing else you can rely on my for a candid self-assessment of my predictions — you’ll just need to wait fifteen years or so!
When looking back on 2014 from an infrastructure perspective, it’s hard not to have one word on the lips: Docker. (Or, as we are wont to do in Silicon Valley when a technology is particularly hot, have the same word on the lips three times over à la Gabbo: “Docker, Docker, DOCKER!”) While Docker has existed since 2013, 2014 was indisputably the year in which it transcended from an interesting project to a transformative technology — a shift which had profound ramifications for us at Joyent.
The enthusiasm for Docker has been invigorating: it validates Joyent’s core hypothesis that OS-based virtualization is the infrastructure substrate of the future. That said, going into 2014, there was also a clear impedance mismatch: while Docker was refreshingly open to being cross-platform, the reality is that it was being deployed exclusively on Linux — and that the budding encyclopedia of Docker images was exclusively Linux-based. Our operating system, SmartOS, is an illumos derivative that it many ways is similar to Linux (they’re both essentially Unix, after all), but it’s also different enough to be an impediment. So the arrival of Docker in 2013 left us headed into 2014 with a kind of dilemma: how can we enable Docker on our proven SmartOS-based substrate for OS containers while still allowing existing Linux-based images to function?
Into this quandary came a happy accident: David Mackay, an illumos community member, revived lx branded zones, work that had been explored some number of years ago to execute complete Linux binary environments in an illumos zone. This work was so old that, frankly, we didn’t feel it was likely to be salvageable — but we were pleasantly surprised when it seemed to still function for some modern binaries. (If it needs to be said, this is yet another example of why we so fervently believe in open source: it allows for others to explore ideas that may seem too radical for commercial entities with mortgages to pay and mouths to feed.)
Energized by the community, Joyent engineer Jerry Jelinek went to work in the spring, bolstering the emulation layer and getting it to work with progressively more and more modern Linux systems. By late summer, 32-bit was working remarkably well on Ubuntu 14.04 (an odyssey that I detailed in my illumos day Surge presentation) and we were ready to make an attempt at the summit: 64-bit Linux emulation. Like much bringup work, the 64-bit work was excruciating because it was very hard to forecast: you can be one bug away from a functioning system or a hundred — and the only way to really know is to grind through them all. Fortunately, we are nothing if not persistent, and by late fall we had 64-bit working on most stuff — and thanks to early adopter community members like Jorge Schrauwen, we were able to quickly find increasingly obscure software to validate it against. (Notes to self: (1) “Cabal hell” is a thing and (2) I bet HHVM is unaware of the implicit dependency they have on Linux address space layout.)
With the LX branded zone work looking very promising, Joyent engineer Josh Wilsdon led a team studying Docker to determine the best way to implement it on SmartOS for our orchestration software, SmartDataCenter. In doing this, we learned about a great Docker strength: its remote API. This API allows us to do exactly what robust APIs have allowed us to do for time immemorial: replace one implementation with a different one without breaking upstack software. Implementing a Docker API endpoint would also allow for a datacenter-wide Docker view that would solve many other problems for us as well; in late autumn, we set out building sdc-docker, a Docker engine for SDC that we have been developing in the open. As with the LX branded zone work, we are far enough along to validate the approach: we know that we can make this work.
In parallel to these two bodies of work, a third group of Joyent engineers led by Robert Mustacchi was tackling a long-standing problem: extending the infrastructure present in SmartOS for robust (and secure!) network virtualization for OS containers to the formation of virtual layer two networks that can span an entire datacenter (that is, finally breaking the shackles of .1q VLANs). We have wanted to do this for quite some time, but the rise of Docker has given this work a new urgency: of the Linux problems with respect to OS-based containers, network virtualization is clearly among the most acute — and we have heard over and over again that it has become an impediment to Docker in production. Robert and team have made great progress and by the end of 2014 had the first signs of life from the SDC integration point for this work.
The SmartDataCenter-based aspects of our Docker and network virtualization work embody an important point of distinction: while OpenStack has been accused of being “a software particle-board designed by committee”, SDC has been deliberately engineered based on our experience actually running a public cloud at scale. That said, OpenStack has had one (and arguably, only one) historic advantage: it is open source. While the components of SDC (namely, SmartOS and node.js) have been open, SDC itself was not. The rise of Docker — and the clear need for an open, container-based stack instead of some committee-designed VMware retread — allowed us to summon the organizational will to take an essential leap: on November 6th, we open sourced SDC and Manta.
Speaking of Manta: with respect to containers, Joyent has been living in the future (which, in case it sounds awesome, is actually very difficult; being ahead of the vanguard is a decidedly mixed blessing). If the broader world is finally understanding the merits of OS-based virtualization with respect to standing compute, it still hasn’t figured out that it has profound ramifications for scale-out storage. However, with the rise of Docker in 2014, we have more confidence than ever that this understanding will come in due time — and by open sourcing Manta we hope to accelerate it. (And certainly, you can imagine that we’ll help connect the dots by allowing Manta jobs to be phrased as Docker containers in 2015.)
Add it all up — the enthusiasm for Docker, the great progress of the LX-branded zone work, the Docker engine for SDC, the first-class network virtualization that we’re building into the system — and then give it the kicker of an entirely open source SmartDataCenter and Manta, and you can see that it’s been a hell of a 2014 for us. Indeed, it’s been a hell of a 2014 for the entire Docker community, and we believe that Matt Asay got it exactly right when he wrote that “Docker, hot as it was in 2014, will be even hotter in 2015.”
So here’s to a hot 2014 — and even hotter 2015!
Today we are announcing that we are open sourcing the two systems at the heart of our business: SmartDataCenter and the Manta object storage platform. SmartDataCenter is the container-based orchestration software that runs the Joyent public cloud; we have used it for the better half of a decade to run on-the-metal OS containers — securely and at scale. Manta is our multi-tenant ZFS-based object storage platform that provides first-class compute by allowing OS containers to be spun up directly upon objects — effecting arbitrary computation at scale without data movement. The unifying technological foundation beneath both SmartDataCenter and Manta is OS-based virtualization, a technology that Joyent pioneered in the cloud way back in 2006. We have long known the transformative power of OS containers, so it has been both exciting and validating for us to see the rise of Docker and the broadening of appreciation for OS-based virtualization. SmartDataCenter and Manta show that containers aren’t merely a fad or developer plaything but rather a fundamental technological advance that represents the foundation for the next generation of computing — and we believe that open sourcing them advances the adoption of container-based architectures more broadly.
Without any further ado — and to assure that we don’t fall into the most prominent of my own corporate open source anti-patterns — here is the source for SmartDataCenter and the source for Manta. These are sophisticated systems with many moving parts, and you’ll see that these two repositories are in fact meta-repositories that explain the design of each of the systems and then point to the (many) components that comprise them (all now open source, natch). We believe that some of these subcomponents will likely find use entirely outside of SDC and Manta. For example, Manatee is a ZooKeeper-based system that manages Postgres replication and automates failover; Moray is a key-value service that lives on top of Postgres. Taken together, Manatee and Moray implement a highly-available key-value service that we use as the foundation for many other components in SDC and Manta — and one that we think others will find useful as well.
In terms of source code mechanics, you’ll see that many of the components are implemented in either node.js or by extending C-based systems. This is not by fiat but rather by the choices of individual engineers; over the past four years, as we learned about the nuances of node.js error handling and as we invested heavily in tooling for running node.js in production, node.js became the right tool for many of our jobs — and we used it for many of the services that constitute SDC and Manta.
And because any conversation about open source has to address licensing at some point or another, let’s get that out of the way: we opted for the Mozilla Public License 2.0. While relatively new, there is a lot to like about this license: its file-based copyleft allows it to be proprietary-friendly while also forcing certain kinds of derived work to be contributed back; its explicit patent license discourages litigation, offering some measure of troll protection; its explicit warranting of original work obviates the need for a contributor license agreement (we’re not so into CLAs); and (best of all, in my opinion), it has been explicitly designed to co-exist with other open source licenses in larger derived works. Mozilla did terrific work on MPL 2.0, and we hope to see it adopted by other companies that share our thinking around open source!
In terms of the business ramifications, at Joyent we have long been believers in open source as a business model; as the leaders of the node.js and SmartOS projects, we have seen the power of open source to start new conversations, open up new markets and (importantly) yield new customers. Ten years ago, I wrote that open source is “a loss leader — minus the loss, of course”; after a decade of experience with open source business models, I would add that open source also serves as sales outreach without cold calls, as a channel without loss of margin, and as a marketing campaign without advertisements. But while we have directly experienced the business advantages of open source, we at Joyent have also lived something of a dual life: node.js and SmartOS have been open source, but the distributed systems that we have built using these core technologies have remained largely behind our walls. So that these systems are now open source does not change the fundamentals of our business model: if you would like to consume SmartDataCenter or Manta as a service, you can spin up an instance on the public cloud or use our Manta storage service. Similarly, if you want a support contract and/or professional services to run either SmartDataCenter or Manta on-premises, we’ll sell them to you. Based on our past experiences with open source, we do know that there will be one important change: these technologies will find their way into the hands of those that we have no other way of reaching — and that some fraction of these will become customers. Also based on past experience, we know that some (presumably much smaller) fraction of these new technologists will — by merits of their interest in and contributions to these projects — one day join us as engineers at Joyent. Bluntly, open source is our farm system, and broadening our hiring channel during a blazingly hot market for software talent is playing no small role in our decision here. In short, this is not an act of altruism: it is a business decision — if a multifaceted one that we believe has benefits beyond the balance sheet.
Welcome to open source SDC and Manta — and long-live the container revolution!