Visualizing the Cloud

I’ve worked on visualizations for a while, most recently with heatmaps for Joyent’s Cloud Analytics. While we’re using and enhancing these right now, we are also in a great position to continue developing new visualizations for cloud computing, given:

Before I get into deeper analysis with DTrace, I’ll show something simple that has proven interesting so far.

The goal was to visualize the entire cloud to get a sense of what is running. Basic process details were collected: zone, PID, PPID, process name, and recent percent CPU (fetched using “ps -o zone,pid,ppid,pcpu,comm”). This was then graphed (using Graphviz for now).
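The collect-then-graph step can be sketched in a few lines of Python. This is not the script used for the images in this post; it is a minimal illustration that assumes the ps output has a single header row followed by five whitespace-separated columns, and it emits Graphviz DOT text with one node per process and one edge per parent-child pair. The sample zone names and paths are made up.

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    zone: str
    pid: int
    ppid: int
    pcpu: float
    comm: str


def parse_ps(text: str) -> List[Proc]:
    """Parse text captured from `ps -o zone,pid,ppid,pcpu,comm` (header included)."""
    procs = []
    for line in text.splitlines()[1:]:   # skip the header row
        fields = line.split(None, 4)     # comm is last, so it may contain spaces
        if len(fields) != 5:
            continue                     # ignore blank or malformed lines
        zone, pid, ppid, pcpu, comm = fields
        procs.append(Proc(zone, int(pid), int(ppid), float(pcpu), comm.strip()))
    return procs


def to_dot(procs: List[Proc]) -> str:
    """Return a Graphviz digraph with parent -> child edges."""
    known = {p.pid for p in procs}
    lines = ["digraph processes {"]
    for p in procs:
        label = p.comm.replace('"', r'\"')   # keep the DOT string well-formed
        lines.append(f'  p{p.pid} [label="{label}"];')
    for p in procs:
        # Only draw an edge when the parent is also in the data set.
        if p.ppid in known and p.ppid != p.pid:
            lines.append(f"  p{p.ppid} -> p{p.pid};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample input; real output will differ.
SAMPLE = """\
    ZONE   PID  PPID %CPU COMMAND
  global     1     0  0.0 /sbin/init
  global   412     1  0.1 /usr/sbin/cron
   web01  2001     1  0.2 /usr/local/sbin/httpd
   web01  2002  2001  3.5 /usr/local/sbin/httpd
"""
```

The string returned by `to_dot(parse_ps(SAMPLE))` can be saved to a file and rendered with a Graphviz layout program such as `dot` or `neato`.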

1. Processes

Examining just a few processes to begin with:

Parent-child relationships are shown with arrows. The size of each process reflects recent CPU usage: bigger means busier. The color identifies the type of process: system processes are shown in light blue. These details can be adjusted – the process size could show memory footprint, for example.
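One way to make "bigger means busier" concrete is to map percent CPU onto the node's diameter. The sketch below is an assumption about how that could be done, not the scaling used for these images: the minimum and maximum diameters are arbitrary, and the square-root curve is a choice made here so that a few very busy processes do not dwarf everything else. It uses only standard Graphviz node attributes (`shape`, `fixedsize`, `width`, `height`, `style`, `fillcolor`).

```python
import math

MIN_DIAMETER = 0.3  # inches; hypothetical floor so idle processes stay visible
MAX_DIAMETER = 3.0  # inches; hypothetical size for a process at 100% CPU


def node_diameter(pcpu: float) -> float:
    """Map percent CPU to a diameter, clamping input to the 0-100 range."""
    frac = min(max(pcpu, 0.0), 100.0) / 100.0
    t = math.sqrt(frac)  # compress the range so small values remain visible
    return MIN_DIAMETER * (1.0 - t) + MAX_DIAMETER * t


def node_attrs(pcpu: float, fillcolor: str = "lightblue") -> str:
    """Build a Graphviz attribute list for one process node."""
    d = node_diameter(pcpu)
    return (
        f"shape=circle, fixedsize=true, width={d:.2f}, height={d:.2f}, "
        f'style=filled, fillcolor="{fillcolor}"'
    )
```

Swapping the input from percent CPU to memory footprint only requires a different normalization before the call to `math.sqrt`.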

2. Zone

This is what a typical cloud computing node (also known as a “zone”, or Joyent “SmartMachine”) looks like; in this case, a web server:

The master process for the web server can be seen surrounded by its worker processes, all shown in red. The worker processes are drawn larger, since they are busier on CPU doing work to respond to web requests. In the middle is a gray oval representing the “init” process of the zone (the real customer zone name has been scrubbed here). The full set of system processes that make up the zone can also be seen, with their relationship.

3. Server

Now scaling to show an entire physical server, which is running nine zones (plus one “global” zone):

Green is for language-related processes, such as php, python, java, etc. Pink shows database processes, including MySQL, memcached, Riak, etc. The green/red zone is a Ruby/Apache server, and the top left zone has both mysqld and memcached. The largest pink process at the top is a busy MySQL server.
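The color scheme described so far (light blue for system processes, red for the web server, green for language runtimes, pink for databases, gray for init) amounts to a lookup on the process name. A minimal sketch follows; the specific executable names in the table are illustrative assumptions and not the list behind these images.

```python
import os

# Executable names are assumed for illustration; extend to match what is running.
COLOR_BY_NAME = {
    "init": "gray",
    "httpd": "red",
    "nginx": "red",
    "php": "green",
    "python": "green",
    "java": "green",
    "perl": "green",
    "ruby": "green",
    "mysqld": "pink",
    "memcached": "pink",
    "riak": "pink",
}
DEFAULT_COLOR = "lightblue"  # anything unrecognized is treated as a system process


def process_color(comm: str) -> str:
    """Pick a fill color from the basename of the command path."""
    name = os.path.basename(comm.strip())
    return COLOR_BY_NAME.get(name, DEFAULT_COLOR)
```

Matching on the basename keeps the table short, at the cost of missing runtimes installed under versioned names.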

Previously we could look at lists of processes using ps or ptree to see the same data, but getting a quick sense of what’s running – and what’s busy – from pages of text output took a lot more time. Consider examining the same data on a rack of servers – this could become hundreds of pages of text output.

4. Rack

Visualizing all the zones in a rack:

More zone types pop out and can be identified quickly. The chain of five green circles is a Perl server, with five busy perl processes.

5. Availability Zone

Now for an entire Joyent “availability zone”, which consists of a fleet of racks in a datacenter:

It’s the first time we’ve seen every process that’s running on a single page. This includes over 300 servers and over 3500 zones. I could zoom this out a bit further to span an entire datacenter, and further still to span the entire company. However, the full version of the above image is already way too big to share on this blog!

This image can be generated automatically to look for anomalies and changes in the cloud. We’ve made many discoveries so far, with the graphs often beautiful and unexpected.

Dead zone

One of the discoveries can be seen in the middle of the graph above: six large zones that appear as concentric circles. Here’s how they look zoomed in:

Our jaws dropped when we first saw this. What’s happened is that this zone is running a shell program via cron (the system scheduler) that processes the output of getent. The getent process is stuck on an LDAP lookup that never completes, so all its related processes are also stuck. Cron kept mindlessly generating more of these until the zone hit its process limit.

Fortunately these were old Joyent test zones that were not being used by a customer.
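Zones like these could also be flagged without looking at the picture, by counting processes per zone in the same ps data. The sketch below assumes the data has already been reduced to (zone, command) pairs. The threshold is a made-up number, since the actual process limit is not stated here, and the zone names in the usage example are hypothetical.

```python
from collections import Counter


def crowded_zones(rows, threshold=500):
    """Return {zone: (total, (top_command, top_count))} for zones at or over threshold.

    rows is an iterable of (zone, command) pairs, one per process.
    """
    per_zone = {}
    for zone, comm in rows:
        per_zone.setdefault(zone, Counter())[comm] += 1

    flagged = {}
    for zone, counts in per_zone.items():
        total = sum(counts.values())
        if total >= threshold:
            # Report the most common command as a hint at what is piling up.
            flagged[zone] = (total, counts.most_common(1)[0])
    return flagged
```

For example, a zone with 400 `getent` and 300 `sh` processes would be reported as `(700, ("getent", 400))`, while a zone running ten `httpd` processes would not appear in the result.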

The beginning

This is just the beginning: the data above is very simple, process details from “ps”. We’ve also been using DTrace to add detail to these process maps, which I hope to blog about when I get the time.

These are not yet part of the Joyent Cloud Analytics product; whether they will be depends on proving their usefulness in solving real problems. So far it’s looking promising: we are finding useful information quickly with these experimental visualizations.

Posted on October 4, 2011 at 4:11 pm by Brendan Gregg
In: Joyent

7 Responses

  1. Written by Francesco Sullo
    on October 4, 2011 at 4:22 pm

    Hy Brendan, great post.
    Thanks

  2. Written by Przemyslaw Bak (przemol)
    on October 5, 2011 at 7:34 am

    Hi Brendan,

    it is said that “A picture is worth a thousand words”. You proved it enough well :-)

  3. Written by Repost: Visualizing the Joyent Cloud with DTrace « Joyeur
    on October 5, 2011 at 12:21 pm

    [...] put up this morning by Joyent’s Brendan Gregg over at the DTrace blog. It’s a very nice visualization of the Joyent Cloud using DTrace to map out our [...]

  4. Written by Mike D.
    on October 8, 2011 at 5:16 am

    Any chance you’re willing to share your script?

  5. Written by Mathieu
    on November 22, 2011 at 6:23 pm

    Cool stuff. You may want to try Gephi (http://gephi.org) for the graph visualizations. It has a LabelAdjust algorithm that avoid label overlapping.

  6. Written by How to Visualize the Cloud | Inside-Cloud.com
    on November 28, 2011 at 9:18 pm

    [...] amazing. Read the Full Story. Posted in Networking by Rich Brueckner 0 [...]

  7. Written by Peter Job
    on January 3, 2012 at 6:33 am

    Brendan,

    It’s great to see organisations doing innovative things to solve Cloud Visualisation. Here in the UK we have been working on this for the last 3 years and have spun out a new company called Real-Status, http://www.real-status.com. (I am a co-founder) Our first product called HyperGlance allows you to show 10′s of thousands of nodes in 3D, which means that you can show much greater scale in one “single pane of glass”. We would love to be able to work with Joyent and to help visualise yours and your customer’s data sets. Please check out our demo at
    http://www.youtube.com/user/hyperglance

    Regards
    Peter Job
    Chairman and Co-Founder, Real-Status
