Visualizing the Cloud
I’ve worked on visualizations for a while, most recently with heatmaps for Joyent’s Cloud Analytics. While we’re using and enhancing these right now, we are also in a great position to continue developing new visualizations for cloud computing, given:
- Easy observability into all nodes via zones, and deeper analysis using DTrace.
- JavaScript and node.js to drive powerful visualizations.
- Large datacenters running interesting cloud computing workloads from a large variety of customers.
Before I get into deeper analysis with DTrace, I’ll show something simple that has proven interesting so far.
The goal was to visualize the entire cloud to get a sense of what is running. Basic process details were collected: zone, PID, PPID, process name, and recent CPU usage as a percentage (fetched using “ps -o zone,pid,ppid,pcpu,comm”). This was then graphed (using Graphviz for now).
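To make the collection and graphing step concrete, here is a minimal sketch. It is not the script used for the images in this post. It assumes the ps output has been captured as text in the column order given above, and the names Proc, parse_ps, to_dot and the sample data are made up for illustration. It turns each process into a Graphviz node and each parent-child relationship into an edge.

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    zone: str
    pid: int
    ppid: int
    pcpu: float
    comm: str


# Made-up sample, in the column order zone,pid,ppid,pcpu,comm.
SAMPLE = """\
    ZONE   PID  PPID %CPU COMMAND
  global     1     0  0.0 /sbin/init
   web01  1201     1  0.1 /usr/sbin/httpd
   web01  1202  1201  2.5 /usr/sbin/httpd
"""


def parse_ps(text: str) -> List[Proc]:
    """Parse ps output into records, skipping the header and short lines."""
    procs = []
    for line in text.splitlines():
        fields = line.split(None, 4)  # keep the command as one field
        if len(fields) < 5:
            continue
        zone, pid, ppid, pcpu, comm = fields
        try:
            procs.append(Proc(zone, int(pid), int(ppid), float(pcpu), comm.strip()))
        except ValueError:
            continue  # header line: "PID" is not a number
    return procs


def to_dot(procs: List[Proc]) -> str:
    """Emit a DOT digraph with one node per process and parent -> child edges."""
    pids = {p.pid for p in procs}
    lines = ["digraph cloud {"]
    for p in procs:
        label = p.comm.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  {p.pid} [label="{label}"];')
    for p in procs:
        # Only draw an edge when the parent was also collected.
        if p.ppid in pids and p.ppid != p.pid:
            lines.append(f"  {p.ppid} -> {p.pid};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(parse_ps(SAMPLE)))
```

The output can be saved to a file and rendered with Graphviz, for example with “dot -Tpng cloud.dot -o cloud.png”. Using the bare PID as the node ID only works for a single server; combining data from several servers would need a prefix such as the hostname to keep node IDs unique.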
1. Processes
Examining just a few processes to begin with (click any of these images for the full version):

Parent-child relationships are shown with arrows. The size of each process reflects recent CPU usage: bigger means busier. The color identifies the type of process: system processes are shown in light blue. These details can be adjusted – the process size could show memory footprint, for example.
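As an illustration of “bigger means busier”, the sketch below maps percent CPU to a Graphviz node width. The scaling is an assumption, not the formula used for these images: it uses a square root so that a circle’s area, rather than its diameter, grows roughly in proportion to CPU usage, and it clamps the result between a hypothetical minimum and maximum width in inches.

```python
import math


def node_width(pcpu: float, min_width: float = 0.3, max_width: float = 3.0) -> float:
    """Map percent CPU (0 to 100) to a node width in inches.

    The limits and the square-root scaling are illustrative assumptions.
    """
    clamped = max(0.0, min(pcpu, 100.0))
    return min_width + (max_width - min_width) * math.sqrt(clamped / 100.0)


def node_attrs(pcpu: float) -> str:
    """Build a Graphviz attribute list for a fixed-size circular node."""
    return f"shape=circle, fixedsize=true, width={node_width(pcpu):.2f}"


if __name__ == "__main__":
    for pcpu in (0.0, 2.5, 25.0, 100.0):
        print(pcpu, node_attrs(pcpu))
```

Swapping the input from percent CPU to memory footprint would only require a different clamp range; the node attributes stay the same.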
2. Zone
This is what a typical cloud computing node (also known as a “zone”, or a Joyent “SmartMachine”) looks like; in this case, a web server:

The master process for the web server can be seen surrounded by its worker processes, all shown in red. The worker processes are drawn larger, since they are busier on CPU doing work to respond to web requests. In the middle is a gray oval representing the “init” process of the zone (the real customer zone name has been scrubbed here). The full set of system processes that make up the zone can also be seen, along with their relationships.
3. Server
Now scaling to show an entire physical server, which is running nine zones (plus one “global” zone):
Green is for language-related processes, such as php, python, java, etc. Pink shows database processes, including MySQL, memcached, Riak, etc. The green/red zone is a Ruby/Apache server, and the top left zone has both mysqld and memcached. The largest pink process at the top is a busy MySQL server.
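The color scheme described so far (light blue for system processes, red for web servers, green for languages, pink for databases, gray for the zone’s init) can be sketched as a lookup on the process name. The specific name lists below are assumptions for illustration, as is the choice to treat any unrecognized process as a system process; the real classification may differ.

```python
import os

# Graphviz color names for each category described in the text.
CATEGORY_COLORS = {
    "system": "lightblue",
    "web": "red",
    "language": "green",
    "database": "pink",
    "init": "gray",
}

# Hypothetical name lists; extend as new process types are identified.
NAME_CATEGORIES = {
    "httpd": "web",
    "nginx": "web",
    "php": "language",
    "python": "language",
    "java": "language",
    "perl": "language",
    "ruby": "language",
    "mysqld": "database",
    "memcached": "database",
    "riak": "database",
}


def classify(comm: str) -> str:
    """Return the category for a command path such as /usr/sbin/httpd."""
    name = os.path.basename(comm)
    if name == "init":
        return "init"
    # Assumption: anything unrecognized is treated as a system process.
    return NAME_CATEGORIES.get(name, "system")


def fill_color(comm: str) -> str:
    """Return the Graphviz fill color for a command."""
    return CATEGORY_COLORS[classify(comm)]


if __name__ == "__main__":
    for comm in ("/sbin/init", "/usr/sbin/httpd", "mysqld", "/usr/bin/perl", "sshd"):
        print(comm, fill_color(comm))
```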
Previously we could look at lists of processes using ps or ptree to see the same data, but getting a quick sense of what’s running – and what’s busy – from pages of text output took a lot more time. Consider examining the same data on a rack of servers – this could become hundreds of pages of text output.
4. Rack
Visualizing all the zones in a rack:
More zone types pop out and can be identified quickly. The chain of five green circles is a Perl server, with five busy perl processes.
5. Availability Zone
Now for an entire Joyent “availability zone”, which consists of a fleet of racks in a datacenter:
It’s the first time we’ve seen every process that’s running on a single page. This includes over 300 servers and over 3500 zones. I could zoom this out a bit further to span an entire datacenter, and further still to span the entire company. However, the full version of the above image is already far too big to share on this blog!
This image can be generated automatically to look for anomalies and changes in the cloud. We’ve made many discoveries so far, with the graphs often beautiful and unexpected.
Dead zone
One of the discoveries can be seen in the middle of the graph above: six large zones that appear as concentric circles. Here’s how they look zoomed in:
Our jaws dropped when we first saw this. What’s happened is that this zone is running a shell program via cron (the system scheduler) that processes the result of getent. The getent process is stuck on an LDAP lookup that never completes, and so all its related processes are also stuck. Cron kept mindlessly launching more of these until the zone hit its process limit.
Fortunately these were old Joyent test zones that were not being used by a customer.
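A pattern like this could also be flagged from the same process data without looking at the image. The sketch below is an assumption about how one might do that, not something we run: it counts processes per zone and command name and reports any group at or above a threshold. The function name, the threshold of 50 and the sample data are all made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def suspicious_groups(
    procs: Iterable[Tuple[str, str]], threshold: int = 50
) -> List[Tuple[str, str, int]]:
    """Return (zone, command, count) for groups with at least `threshold`
    processes, largest first. The threshold is an arbitrary illustration."""
    counts = Counter(procs)
    flagged = [(zone, comm, n) for (zone, comm), n in counts.items() if n >= threshold]
    return sorted(flagged, key=lambda item: -item[2])


if __name__ == "__main__":
    # Made-up data: one zone with many stuck getent processes.
    sample = [("test01", "getent")] * 60 + [("web01", "httpd")] * 3
    print(suspicious_groups(sample))
```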
The beginning
This is just the beginning. The data above is very simple: process details from “ps”. We’ve also been using DTrace to add detail to these process maps, which I hope to blog about when I get the time.
These are not yet part of the Joyent Cloud Analytics product; whether they will be depends on proving their usefulness in solving real problems. So far it’s looking promising: we are finding useful information quickly with these experimental visualizations.

on October 4, 2011 at 4:22 pm
Hi Brendan, great post.
Thanks
on October 5, 2011 at 7:34 am
Hi Brendan,
it is said that “A picture is worth a thousand words”. You have proved it well enough :-)
on October 8, 2011 at 5:16 am
Any chance you’re willing to share your script?
on November 22, 2011 at 6:23 pm
Cool stuff. You may want to try Gephi (http://gephi.org) for the graph visualizations. It has a LabelAdjust algorithm that avoids label overlap.
on January 3, 2012 at 6:33 am
Brendan,
It’s great to see organisations doing innovative things to solve Cloud Visualisation. Here in the UK we have been working on this for the last 3 years and have spun out a new company called Real-Status, http://www.real-status.com (I am a co-founder). Our first product, called HyperGlance, allows you to show tens of thousands of nodes in 3D, which means that you can show much greater scale in one “single pane of glass”. We would love to be able to work with Joyent and to help visualise your and your customers’ data sets. Please check out our demo at
http://www.youtube.com/user/hyperglance
Regards
Peter Job
Chairman and Co-Founder, Real-Status