Performance Analysis talk at SCALE10x

Last week I gave a talk at the Southern California Linux Expo (SCALE) titled “Performance Analysis: New Tools and Concepts from The Cloud”. There was a great turnout for my talk, which was videoed by Deirdré (video below). The conference was a lot of fun too. In my talk I presented six problems of performance [...]

Read more...
Posted on January 30, 2012 at 8:37 pm by Brendan Gregg · Permalink · Comments Closed
In: performance, slides, video

ZFS+10: illumos meetup

ZFS recently celebrated its informal 10th anniversary; to mark the occasion, Delphix hosted a ZFS-themed meetup for the illumos community (sponsored generously by Joyent). Many thanks to Deirdre Straughan, the new illumos community manager, for helping to organize and for filming the event. Three of my colleagues at Delphix presented work they’ve been doing in [...]

Read more...
Posted on January 20, 2012 at 2:39 pm by ahl · Permalink · Comments Closed
In: Delphix, illumos

Playing with Node/V8 postmortem debugging

(Image: “Post Mortem” by C. MacLaurin.) Several weeks ago I posted about postmortem debugging for Node.js, a critical technique for understanding fatal software failure (and thereby keeping up software quality). Now that the underlying pieces are freely available[1], you can use the documentation below to start debugging your own Node programs. With these tools you can [...]

Read more...
Posted on January 13, 2012 at 11:42 am by dap · Permalink · Comments Closed
In: Solaris, joyent

Activity of the ZFS ARC

Disk I/O is still a common source of performance issues, despite modern cloud environments, modern file systems and huge amounts of main memory serving as file system cache. Understanding how well that cache is working is a key task while investigating disk I/O issues. In this post, I’ll show the activity of the ZFS file [...]

Read more...
Posted on January 9, 2012 at 5:50 pm by Brendan Gregg · Permalink · Comments Closed
In: ARC, DTrace, Kernel, ZFS, performance

Where does your Node program spend its time?

(Photo by Julian Lim, flickr.) Performance analysis is one of the most difficult challenges in building production software. If a slow application isn’t spending much time on CPU, it could be waiting on filesystem (disk) I/O, network traffic, garbage collection, or many other things. We built the Cloud Analytics tool to help administrators and developers quickly [...]

Read more...
Posted on January 5, 2012 at 3:32 pm by dap · Permalink · Comments Closed
In: joyent

Visualizing Device Utilization

Device utilization is a key metric for performance analysis and capacity planning. In this post, I’ll illustrate different ways to visualize device utilization across multiple devices, and how that utilization is changing over time. As a system to study, I’ll examine a production cloud environment that contains over 5,000 virtual CPUs (over 600 physical processors). [...]

Read more...
Posted on December 18, 2011 at 2:47 pm by Brendan Gregg · Permalink · Comments Closed
In: cpu, heatmaps, performance, visualizations

Flame Graphs

(Image: MySQL Flame Graph.) Determining why CPUs are busy is a routine task for performance analysis, which often involves profiling stack traces. Profiling by sampling at a fixed rate is a coarse but effective way to see which code-paths are hot (busy on-CPU). It usually works by creating a timed interrupt that collects the current program [...]

Read more...
Posted on December 16, 2011 at 11:24 am by Brendan Gregg · Permalink · Comments Closed
In: DTrace, performance, profiling, visualizations

USDT Providers Redux

In this post I’m going to review DTrace USDT providers and show a complete working example that I hope will be a useful reference for people interested in building providers for their own applications. First, the prerequisites: DTrace is the comprehensive dynamic tracing framework available on Illumos-based, BSD, and MacOS systems. If you’ve never used [...]

Read more...
Posted on December 13, 2011 at 11:13 am by dap · Permalink · Comments Closed
In: Solaris, joyent

The case of the un-unmountable tmpfs

Every once in a rare while our development machines encounter a fatal error during boot because we couldn’t unmount tmpfs. This weekend I cracked the case, so I thought I’d share my uses of boot-time DTrace, and the musty corners of the operating systems that I encountered along the way. First I should explain a [...]

Read more...
Posted on December 12, 2011 at 9:39 am by ahl · Permalink · Comments Closed
In: DTrace, ZFS, anonymous, boot, pageout, tmpfs

2000x performance win

I recently helped analyze a performance issue in an unexpected but common place, where the fix improved performance of a task by around 2000x (two thousand times faster). As this is short, interesting and useful, I’ve reproduced it here in a lab environment to share details and screenshots. Issue: In a production SmartOS cloud environment, [...]

Read more...
Posted on December 9, 2011 at 12:57 am by Brendan Gregg · Permalink · Comments Closed
In: DTrace, performance, smartos