Performance Analysis talk at SCALE10x
Last week I gave a talk at the Southern California Linux Expo (SCALE) titled “Performance Analysis: New Tools and Concepts from The Cloud”. There was a great turnout for my talk, which was videoed by Deirdré (video below). The conference was a lot of fun too. In my talk I presented six problems of performance [...]
In: performance, slides, video
ZFS+10: illumos meetup
ZFS recently celebrated its informal 10th anniversary; to mark the occasion, Delphix hosted a ZFS-themed meetup for the illumos community (sponsored generously by Joyent). Many thanks to Deirdre Straughan, the new illumos community manager, for helping to organize and for filming the event. Three of my colleagues at Delphix presented work they’ve been doing in [...]
Playing with Node/V8 postmortem debugging
“Post Mortem” by C. MacLaurin
Several weeks ago I posted about postmortem debugging for Node.js, a critical technique for understanding fatal software failure (and thereby keeping up software quality). Now that the underlying pieces are freely available[1], you can use the documentation below to start debugging your own Node programs. With these tools you can [...]
Activity of the ZFS ARC
Disk I/O is still a common source of performance issues, despite modern cloud environments, modern file systems and huge amounts of main memory serving as file system cache. Understanding how well that cache is working is a key task while investigating disk I/O issues. In this post, I’ll show the activity of the ZFS file [...]
In: ARC, DTrace, Kernel, ZFS, performance
Where does your Node program spend its time?
Photo by Julian Lim (flickr)
Performance analysis is one of the most difficult challenges in building production software. If a slow application isn’t spending much time on CPU, it could be waiting on filesystem (disk) I/O, network traffic, garbage collection, or many other things. We built the Cloud Analytics tool to help administrators and developers quickly [...]
Visualizing Device Utilization
Device utilization is a key metric for performance analysis and capacity planning. In this post, I’ll illustrate different ways to visualize device utilization across multiple devices, and how that utilization is changing over time. As a system to study, I’ll examine a production cloud environment that contains over 5,000 virtual CPUs (over 600 physical processors). [...]
In: cpu, heatmaps, performance, visualizations
Flame Graphs
MySQL Flame Graph
Determining why CPUs are busy is a routine task for performance analysis, which often involves profiling stack traces. Profiling by sampling at a fixed rate is a coarse but effective way to see which code-paths are hot (busy on-CPU). It usually works by creating a timed interrupt that collects the current program [...]
In: DTrace, performance, profiling, visualizations
USDT Providers Redux
In this post I’m going to review DTrace USDT providers and show a complete working example that I hope will be a useful reference for people interested in building providers for their own applications. First, the prerequisites: DTrace is the comprehensive dynamic tracing framework available on Illumos-based, BSD, and MacOS systems. If you’ve never used [...]
The case of the un-unmountable tmpfs
Every once in a rare while our development machines encounter a fatal error during boot because we couldn’t unmount tmpfs. This weekend I cracked the case, so I thought I’d share my uses of boot-time DTrace, and the musty corners of the operating systems that I encountered along the way. First I should explain a [...]
In: DTrace, ZFS, anonymous, boot, pageout, tmpfs
2000x performance win
I recently helped analyze a performance issue in an unexpected but common place, where the fix improved performance of a task by around 2000x (two thousand times faster). As this is short, interesting and useful, I’ve reproduced it here in a lab environment to share details and screenshots. Issue: In a production SmartOS cloud environment, [...]
In: DTrace, performance, smartos

