The USE Method: SmartOS Performance Checklist

The USE Method provides a strategy for performing a complete check of system health, identifying common bottlenecks and errors. For each system resource, metrics for utilization, saturation and errors are identified and checked. Any issues discovered are then investigated using further strategies.

In this post, I’ll provide an example of a USE-based metric list for use within a SmartOS SmartMachine (Zone), such as those provided by the Joyent Public Cloud. These use the illumos kernel, so this list should also be mostly relevant for OmniOS Zones and, to a lesser degree (due to some missing features), Solaris Zones. This is primarily intended for users of the zones. System administrators of the physical systems have greater visibility via the Global Zone, and should also see the Solaris checklist.

Cloud limits (software resource controls) are listed first, as they are usually encountered before the physical limits.

Cloud Limits

These cover CPU, memory, disk I/O (file system), and network.

| component | type | metric |
| --- | --- | --- |
| CPU cap | utilization | sm-cpuinfo (previously jinf -c); raw counters: kstat -p caps::cpucaps_zone*:, “usage” == current CPU used, “value” == CPU cap |
| CPU cap | saturation | uptime load averages are zone-aware; per-process: prstat -mLc 1, “LAT”; rough counter: kstat -p caps::cpucaps_zone*:above_sec |
| CPU cap | errors | N/A |
| Memory cap | utilization | sm-meminfo rss for main memory (previously jinf -m); sm-meminfo swap for virtual memory; zonememstat, “RSS” vs “CAP”; prstat -Z, zone “RSS”, “SIZE” (VM); raw counters: kstat -p memory_cap:::, “rss” vs “physcap”, “swap” vs “swapcap” |
| Memory cap | saturation | zonememstat, increasing “NOVER” (# over) and “POUT” (paged out); per-process: prstat -mLc 1, “DFL”; some raw counters: kstat -p memory_cap:::anonpgin |
| Memory cap | errors | DTrace failed malloc()s; raw counters: kstat -p memory_cap:::anon_alloc_fail |
| FS I/O throttle | utilization | N/A – it kicks in only when needed (see saturation) |
| FS I/O throttle | saturation | vfsstat, “d/s” (delays/sec), and magnitude of “del_t” (average delay time, us) |
| FS I/O throttle | errors | N/A |
| FS capacity | utilization | df -h, “used” / “size” |
| FS capacity | saturation | once it’s full, ENOSPC |
| FS capacity | errors | DTrace errno for FS syscalls; /var/adm/messages file system full messages |
| Network cap | utilization | dladm show-linkprop -p maxbw for max bandwidth (if set); dladm show-link -s -i 1 net0, for current throughput; nicstat can also show throughput |
| Network cap | saturation | not available from within a zone (need to DTrace mac_bw_state & SRS_BW_ENFORCED) |
| Network cap | errors | N/A |
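
As a rough illustration of the cap utilization rows above, the raw counters can be turned into a percentage by dividing the "used" statistic by its limit. The following is a minimal sketch, assuming `kstat -p` prints one `module:instance:name:statistic` key followed by whitespace and a value per line. The helper names, the zone name, the instance numbers and the sample values are all made up for illustration.

```python
def parse_kstat(text):
    """Parse `kstat -p` style lines into {statistic: integer value}.

    Only the last field of the colon-separated key is kept, which assumes
    the text covers a single zone (as it would from within a zone).
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        statistic = key.rsplit(":", 1)[-1]
        try:
            stats[statistic] = int(value)
        except ValueError:
            continue  # skip non-numeric statistics such as names
    return stats


def cap_utilization(stats, used_key, limit_key):
    """Return used/limit as a percentage, or None if it cannot be computed."""
    used = stats.get(used_key)
    limit = stats.get(limit_key, 0)
    if used is None or limit <= 0:
        return None
    return 100.0 * used / limit


# Hypothetical sample text, not captured from a real system.
SAMPLE_CPU = (
    "caps:1:cpucaps_zone_1:usage\t150\n"
    "caps:1:cpucaps_zone_1:value\t400\n"
)
SAMPLE_MEM = (
    "memory_cap:1:myzone:rss\t536870912\n"
    "memory_cap:1:myzone:physcap\t1073741824\n"
)

cpu_pct = cap_utilization(parse_kstat(SAMPLE_CPU), "usage", "value")
mem_pct = cap_utilization(parse_kstat(SAMPLE_MEM), "rss", "physcap")
print("CPU cap utilization: %.1f%%" % cpu_pct)
print("Memory cap utilization: %.1f%%" % mem_pct)
```

The same two functions would apply to "swap" vs "swapcap". A cap reported as zero or missing is treated here as "cannot compute" rather than as 0% utilization.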

Storage devices (disks) are not listed, since limits for storage I/O are imposed at the file system layer.

Physical Resources

Since Zones are OS-Virtualization (OS partitioning), the physical resources are not emulated or virtualized, and many of the observability tools will show you the entire physical system. This can be both good (you can really understand what’s going on) and confusing (why are the resources busy when my system is idle?). When that happens, it’s someone else’s load: you can’t see their process address space.

| component | type | metric |
| --- | --- | --- |
| CPU | utilization | per-cpu: mpstat 1, “idl”; system-wide: vmstat 1, “id”; per-process: prstat -c 1 (“CPU” == recent), prstat -mLc 1 (“USR” + “SYS”); per-kernel-thread: not available from within a zone |
| CPU | saturation | system-wide: vmstat 1, “r”; per-process: prstat -mLc 1, “LAT” |
| CPU | errors | fmdump |
| Memory capacity | utilization | system-wide: vmstat 1, “free” (main memory), “swap” (virtual memory); per-process: prstat -c, “RSS” (main memory), “SIZE” (virtual memory) |
| Memory capacity | saturation | system-wide: vmstat 1, “sr” (bad now), “w” (was very bad); vmstat -p 1, “api” (anon page ins == pain), “apo”; per-process: prstat -mLc 1, “DFL” |
| Memory capacity | errors | fmdump; DTrace failed malloc()s |
| Network Interfaces | utilization | nicstat (see notes below); kstat (look for physical interface kstats, eg, kstat -p \| grep ifspeed to find their names, and then kstat -p ixgbe::mac: for ixgbe interfaces) |
| Network Interfaces | saturation | nicstat; kstat for whatever custom statistics are available (eg, "nocanputs", "defer", "norcvbuf", "noxmtbuf"); netstat -s, retransmits |
| Network Interfaces | errors | netstat -i, error counters; kstat for extended errors, look in the interface and "link" statistics (there are often custom counters for the card); driver internals not available from within a zone |
| Storage device I/O | utilization | system-wide: iostat -xnz 1, "%b" |
| Storage device I/O | saturation | iostat -xnz 1, "wait" |
| Storage device I/O | errors | iostat -En; driver internals not available from within a zone |
| Storage capacity | utilization | swap: swap -s; file systems: "df -h" |
| Storage capacity | saturation | once it's full, ENOSPC |
| Storage capacity | errors | DTrace errno on FS syscalls; /var/adm/messages file system full messages |
| Storage controller | utilization | iostat -Cxnz 1, compare to known IOPS/tput limits per-card |
| Storage controller | saturation | look for kernel queueing: sd (iostat "wait" again) |
| Storage controller | errors | /var/adm/messages; driver internals not available from within a zone |
| Network controller | utilization | infer from kstat or nicstat and known controller max tput |
| Network controller | saturation | see network interface saturation |
| Network controller | errors | kstat for whatever is there; driver internals not available from within a zone |
| CPU interconnect | utilization | not available from within a zone |
| CPU interconnect | saturation | not available from within a zone |
| CPU interconnect | errors | not available from within a zone |
| Memory interconnect | utilization | not available from within a zone |
| Memory interconnect | saturation | not available from within a zone |
| Memory interconnect | errors | not available from within a zone |
| I/O interconnect | utilization | not available from within a zone |
| I/O interconnect | saturation | not available from within a zone |
| I/O interconnect | errors | not available from within a zone |
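
For the network interface utilization row, the interface kstats can be converted into a percentage by hand. This is a minimal sketch, assuming two samples of the "rbytes64" and "obytes64" byte counters taken a known interval apart, plus "ifspeed" in bits per second. It treats the link as full duplex and reports the busier direction, which is an assumption of this sketch and may not match how a given tool reports utilization. The function name and sample numbers are hypothetical.

```python
def nic_utilization(prev, curr, interval_s, ifspeed_bps):
    """Percent utilization of the busier direction between two samples.

    prev and curr are dicts holding cumulative byte counters, keyed by
    "rbytes64" (received) and "obytes64" (sent).
    """
    if interval_s <= 0 or ifspeed_bps <= 0:
        return None
    rx_bits = (curr["rbytes64"] - prev["rbytes64"]) * 8
    tx_bits = (curr["obytes64"] - prev["obytes64"]) * 8
    busiest_bps = max(rx_bits, tx_bits) / interval_s
    return 100.0 * busiest_bps / ifspeed_bps


# Hypothetical samples one second apart on a 10 Gbit/s interface.
before = {"rbytes64": 0, "obytes64": 0}
after = {"rbytes64": 125_000_000, "obytes64": 10_000_000}
print(nic_utilization(before, after, 1.0, 10_000_000_000))
```

Because Zones expose the physical interface counters, a figure like this reflects all tenants on the interface, not only your own traffic.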

Software Resources

| component | type | metric |
| --- | --- | --- |
| Kernel mutex | utilization | not available from within a zone |
| Kernel mutex | saturation | mpstat "smtx" |
| Kernel mutex | errors | not available from within a zone |
| User mutex | utilization | plockstat -H (held time); DTrace plockstat provider |
| User mutex | saturation | plockstat -C (contention); prstat -mLc 1, "LCK"; DTrace plockstat provider |
| User mutex | errors | DTrace plockstat and pid providers, for EAGAIN, EINVAL, EPERM, EDEADLK, ENOMEM, EOWNERDEAD, ... see pthread_mutex_lock(3C) |
| Process capacity | utilization | kstat, "unix:0:var:v_proc" for system-wide max, system-wide current usage isn't available in a zone, but "unix:0:process_cache:slab_alloc" gives a rough idea; zone: "unix:0:system_misc:nproc" for current zone usage; prctl -n zone.max-processes -i zone ZONE, "privileged/system" for zone max, and "usage" for current usage. |
| Process capacity | saturation | queueing on pidlinklock in pid_allocate(), as it scans for available slots once the table gets full. |
| Process capacity | errors | "can't fork()" messages |
| Thread capacity | utilization | user-level: prctl -n zone.max-lwps -i zone ZONE, "privileged/system" for zone max, and "usage" for current zone usage; kernel: limited by system memory - see memory usage. |
| Thread capacity | saturation | threads blocking on memory allocation - see memory cap usage. |
| Thread capacity | errors | user-level: pthread_create() failures with EAGAIN, EINVAL, ...; kernel: not available from within a zone |
| File descriptors | utilization | system-wide (no limit other than RAM); per-process: pfiles vs ulimit or prctl -t basic -n process.max-file-descriptor PID; a quicker check than pfiles is ls /proc/PID/fd \| wc -l |
| File descriptors | saturation | I don't think there is any queueing or blocking, other than on memory allocation. |
| File descriptors | errors | truss or DTrace (better) to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...). |
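
The per-process file descriptor check above (count the entries under /proc/PID/fd and compare with the limit) can also be done from inside a process. This is a minimal sketch for the current process only, assuming a /proc file system that provides /proc/self/fd. The function name is made up, and the soft RLIMIT_NOFILE limit stands in for the ulimit value.

```python
import os
import resource


def fd_utilization():
    """Return (open_fds, soft_limit, percent) for the current process.

    percent is None when the soft limit is unlimited.
    """
    # Counting directory entries mirrors `ls /proc/PID/fd | wc -l`.
    # Listing the directory briefly opens one descriptor itself.
    open_fds = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY or soft_limit <= 0:
        return open_fds, soft_limit, None
    return open_fds, soft_limit, 100.0 * open_fds / soft_limit


open_fds, soft_limit, percent = fd_utilization()
print("open fds:", open_fds, "soft limit:", soft_limit, "percent:", percent)
```

A process that approaches 100% here is a candidate for the EMFILE errors listed in the errors row.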

What's Next

See the USE Method for the follow-up strategies after identifying a possible bottleneck. If you complete this checklist but still have a performance issue, move on to other strategies: drill-down analysis and latency analysis.

Also see the Solaris Performance Checklist if you have access to the physical host (global zone).

Posted on December 19, 2012 at 9:23 am by Brendan Gregg
In: Performance

USENIX LISA 2012: Performance Analysis Methodology

At USENIX LISA 2012, I gave a talk titled Performance Analysis Methodology. This covered ten performance analysis anti-methodologies and methodologies, including the USE Method. I wrote about these in the ACMQ article Thinking Methodically about Performance, which is worth reading for more detail. I’ve also posted USE Method-derived checklists for Solaris- and Linux-based systems.

The video of the talk is on the LISA site, and the slides are below, also available as a PDF.

I’ve summarized the methodologies in the talk below.

Methodology Summaries

Blame-Someone-Else Anti-Method:

  1. Find a system or environment component you are not responsible for
  2. Hypothesize that the issue is with that component
  3. Redirect the issue to the responsible team
  4. When proven wrong, go to 1

Streetlight Anti-Method:

  1. Pick observability tools that are
    • familiar
    • found on the Internet
    • found at random
  2. Run tools
  3. Look for obvious issues

Ad Hoc Checklist Method:

  1..N. Run A, if B, do C

Problem Statement Method:

  1. What makes you think there is a performance problem?
  2. Has this system ever performed well?
  3. What has changed recently? (Software? Hardware? Load?)
  4. Can the performance degradation be expressed in terms of latency or run time?
  5. Does the problem affect other people or applications (or is it just you)?
  6. What is the environment? What software and hardware is used? Versions? Configuration?

Scientific Method:

  1. Question
  2. Hypothesis
  3. Prediction
  4. Test
  5. Analysis

Workload Characterization Method:

  1. Who is causing the load? PID, UID, IP addr, …
  2. Why is the load called? code path
  3. What is the load? IOPS, tput, type
  4. How is the load changing over time?

Drill-Down Analysis Method:

  1. Start at highest level
  2. Examine next-level details
  3. Pick most interesting breakdown
  4. If problem unsolved, go to 2

Latency Analysis Method:

  1. Measure operation time (latency)
  2. Divide into logical synchronous components
  3. Continue division until latency origin is identified
  4. Quantify: estimate speedup if problem fixed
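
The final "quantify" step is simple arithmetic: if one component's time were removed entirely, the best-case speedup is the total latency divided by what remains. Here is a minimal sketch of steps 2 to 4, assuming the operation has already been divided into synchronous components that add up to the total. The component names and timings are hypothetical.

```python
def estimate_speedup(total_ms, component_ms):
    """Best-case speedup if component_ms were eliminated from total_ms."""
    remaining_ms = total_ms - component_ms
    if remaining_ms <= 0:
        return float("inf")  # the component accounts for all of the latency
    return total_ms / remaining_ms


def latency_origin(components_ms):
    """Return (largest component, best-case speedup from removing it)."""
    name = max(components_ms, key=components_ms.get)
    total_ms = sum(components_ms.values())
    return name, estimate_speedup(total_ms, components_ms[name])


# Hypothetical breakdown of a 40 ms operation.
breakdown_ms = {"parse": 2.0, "disk read": 30.0, "network send": 8.0}
print(latency_origin(breakdown_ms))
```

In practice the chosen component is divided again (step 3) until the origin is specific enough to fix. The estimate is an upper bound, since a fix rarely removes the component's time completely.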

USE Method:

For every resource, check:

  1. Utilization
  2. Saturation
  3. Errors
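
The "for every resource" loop can be sketched as a small checklist builder that flags any resource and type combination with no known metric. This is only an illustration of the iteration: the function name and the placeholder text are made up, and the sample metrics are abbreviated from the SmartOS checklist.

```python
USE_TYPES = ("utilization", "saturation", "errors")


def build_checklist(metrics):
    """Expand {resource: {type: metric}} into one row per resource and type.

    Missing combinations are kept and marked, so that gaps in
    observability show up in the checklist instead of being skipped.
    """
    rows = []
    for resource_name, by_type in metrics.items():
        for use_type in USE_TYPES:
            metric = by_type.get(use_type, "unknown (needs a metric)")
            rows.append((resource_name, use_type, metric))
    return rows


known_metrics = {
    "CPU": {
        "utilization": 'mpstat 1, "idl"',
        "saturation": 'vmstat 1, "r"',
        "errors": "fmdump",
    },
    "Storage device I/O": {
        "utilization": 'iostat -xnz 1, "%b"',
        "saturation": 'iostat -xnz 1, "wait"',
        # "errors" left out here to show how a gap is reported
    },
}

for row in build_checklist(known_metrics):
    print("%-20s %-12s %s" % row)
```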

Stack Profile Method:

  1. Profile thread stack traces (on- and off-CPU)
  2. Coalesce
  3. Study stacks bottom-up
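
The coalesce step can be sketched as counting identical stacks, so that the most frequently sampled code paths are listed first. This minimal example assumes each sample is already a list of function names ordered from the root frame to the leaf. The function names in the samples are hypothetical.

```python
from collections import Counter


def coalesce(samples):
    """Count how many times each distinct stack trace was sampled."""
    return Counter(tuple(stack) for stack in samples)


# Hypothetical sampled stacks, root frame first.
samples = [
    ["main", "read_file", "read"],
    ["main", "parse", "malloc"],
    ["main", "read_file", "read"],
]

for stack, count in coalesce(samples).most_common():
    print(count, " -> ".join(stack))
```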

Posted on December 13, 2012 at 2:51 pm by Brendan Gregg
In: Performance

USENIX LISA 2010: Visualizations for Performance Analysis

My USENIX LISA talk from 2010 is now available on YouTube, and is also embedded below. The title is Visualizations for Performance Analysis (and more), and it showed how the full distribution of data can be presented as a heat map. This is especially useful for latency analysis, as fast-path and slow-path differences can be studied, as well as latency outliers. I’ve also written about this before for ACM, in the article Visualizing System Latency.
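
As a rough illustration of how a latency heat map is built, each event can be placed into a (time column, latency row) bucket, with the count per bucket later mapped to a color. This is a minimal sketch assuming linear bucket sizes and events given as (timestamp in seconds, latency in microseconds) pairs. The function name, bucket sizes and sample events are hypothetical.

```python
from collections import Counter


def heatmap_buckets(events, time_step_s, latency_step_us):
    """Count events per (time column, latency row) bucket."""
    counts = Counter()
    for timestamp_s, latency_us in events:
        column = int(timestamp_s // time_step_s)
        row = int(latency_us // latency_step_us)
        counts[(column, row)] += 1
    return counts


# Hypothetical I/O events: (timestamp in seconds, latency in microseconds).
events = [(0.1, 120), (0.4, 180), (0.9, 2500), (1.2, 150)]

for bucket, count in sorted(heatmap_buckets(events, 1.0, 100).items()):
    print(bucket, count)
```

The slow 2500 us event keeps its own bucket instead of being folded into an average, which is how outliers stay visible in this kind of plot.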

The video and slides of the talk are below. The slides can also be downloaded as PDF.

I’m delighted to be speaking again at LISA this year in San Diego, on December 13th. My talk is titled Performance Analysis Methodology, and covers existing and new methodologies for approaching system performance.

Posted on December 10, 2012 at 9:36 am by Brendan Gregg
In: Performance