Performance Instrumentation Counters: short talk

Performance Instrumentation Counters (PICs) allow CPU internals to be observed, and are especially useful for identifying exactly why CPUs are busy – not just that they are. I’ve blogged about them before, as part of analyzing HyperTransport utilization and CPI (Cycles-per-Instruction). There are a number of performance analysis questions that can only be answered via PICs, whether using the command line cpustat/cputrack tools, developer suites such as Oracle Sun Studio, or DTrace. They include observing:

This information is useful, not just for developers writing code (who are typically more familiar with their existence from using Oracle Sun Studio), but also for system administrators doing performance analysis and capacity planning.

I’ve recently been doing more performance analysis with PICs and taking advantage of PAPI (Performance Application Programming Interface), which provides generic counters that are both easy to identify and work across different platforms. Over the years I’ve maintained a collection of cpustat-based scripts to answer questions from the above list. These scripts were written for specific architectures and became out of date when new processor types were introduced. PAPI solves this – I’m now writing a suite of cpustat-based scripts that use PAPI (out of necessity – performance analysis is my job), which will work across different and future processor types. If I can, I’ll post them here.
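
As a rough illustration of why generic counter names help (a minimal sketch, not one of the scripts mentioned above), the calculation below derives CPI from two counter totals. PAPI_tot_cyc and PAPI_tot_ins are PAPI preset names for total cycles and total instructions; the helper name and the sample values are hypothetical and made up for the example, not measured data.

```python
# Hypothetical helper: derive CPI (cycles per instruction) from two
# generic counter totals collected over the same interval.
# PAPI_tot_cyc / PAPI_tot_ins are PAPI preset event names; the sample
# values further down are invented for illustration only.

def cycles_per_instruction(counters):
    """Return CPI from a dict mapping counter name -> count."""
    cycles = counters["PAPI_tot_cyc"]
    instructions = counters["PAPI_tot_ins"]
    if instructions == 0:
        # Avoid dividing by zero when nothing ran in the interval.
        raise ValueError("no instructions counted in this interval")
    return cycles / instructions


# Made-up totals for one sampling interval.
sample = {"PAPI_tot_cyc": 3_000_000, "PAPI_tot_ins": 1_200_000}
print(f"CPI: {cycles_per_instruction(sample):.2f}")
```

Because the calculation only refers to the generic names, the same logic should apply on any processor where those two events are available, which is the portability that architecture-specific event names lacked.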

And the reason for this post: Roch Bourbonnais, Jim Mauro and I were recently in the same place at the same time, and used the opportunity to record a few informal talks about performance topics on video. These talks weren’t prepared beforehand; we just chatted about what we knew at the time, including advice and tips. This talk is on PICs:

download part 1 for iPod

download part 2 for iPod

I’m a fan of informal video talks, and I hope to do more – they are an efficient way to disseminate information. And for busy people like me, it can be the difference between never documenting a topic and providing something – albeit informal – to help others out. In my experience, the time it’s taken to generate different formats of content has been:

In fact, it’s taken twice as long to write this blog post about the videos as it took to plan and film them.

Documentation is another passion of mine, and we are doing some very cool things in the Fishworks product to create documentation deliverables in a smart and efficient way, which can be the topic of another blog post…

Posted on May 15, 2010 at 4:26 pm by Brendan Gregg
In: Performance

4 Responses

  1. Written by Peter
    on May 15, 2010 at 5:08 pm

    Hi Brendan,
    FYI: The second video download link results in a 9.6k unplayable file (for me at least)
    Sydney, Australia

  2. Written by Brendan Gregg
    on May 15, 2010 at 6:47 pm

    Thanks Peter, I think it’s fixed now.

  3. Written by Ivan Ostres
    on May 17, 2010 at 3:19 am

    Hi Brendan,
    this kind of information broadcast seems a very good idea and you did a decent job. I would propose showing examples on a live system, which would make a few breaks in the whiteboard talk. It’s the same as with long songs…you need a break section ;-).

  4. Written by himanshu khona
    on May 25, 2010 at 2:29 am

    Hi Brendan -
    I read almost all of your posts before purchasing a 7410. Now that we are at the implementation stage, there are teething problems with network failover for the 7410. We have 2 head nodes and one 22-disk JBOD with 2 Logzillas. The fundamental issue is with iSCSI failover during network failure. We have tried many things so far and are unable to get it to work as desired. We tried IPMP, MPIO (with 2 different 10GbE cards) and so on.
    I wanted to see if you or any of your team members can help us.
