<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Concurrency&#8217;s Shysters</title>
	<atom:link href="http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/feed/" rel="self" type="application/rss+xml" />
	<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/</link>
	<description>Views on software from Bryan Cantrill&#039;s deck chair</description>
	<lastBuildDate>Wed, 24 Aug 2011 00:55:27 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Bryan Cantrill</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-150</link>
		<dc:creator>Bryan Cantrill</dc:creator>
		<pubDate>Thu, 04 Dec 2008 21:53:12 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-150</guid>
		<description>David,
Thanks for the thoughtful comments -- it&#039;s always a relief when a fellow domain expert agrees!  As for HTM, I&#039;m skeptical that it does indeed gain over traditional techniques, especially for small critical sections.  After all, I&#039;m still doing read-to-own bus transactions (no way around that), and that&#039;s a much greater cost than the pipeline stalls from a compare&amp;swap. Further: are the scenarios in which HTM results in a higher-performing system for the contended case or the uncontended case?  If the former (that is, if HTM only putatively shows a benefit when contention is high), HTM is falling into a classic architecture pitfall:  optimizing for the wrong case.  In the Solaris kernel -- as in any mature system with fine-grained parallelism -- the vast, vast majority of locks are uncontended.  (Indeed, that&#039;s the whole damn point of fine-grained parallelism.)  And when we do find a lock that has high contention, we take the steps necessary to defract or eliminate that contention -- we don&#039;t optimize for the contention itself.
My HTM skepticism is also heightened by the fact that the world has already had the opportunity to experiment with a flavor of HTM:  namely, load-linked/store-conditional (implemented at least by Alpha and PowerPC).  Now, there is obviously a difference between LL/SC and full-blown HTM, but if one is making the argument that HTM is essentially most useful for &quot;very small critical sections&quot;, I think one needs to address what HTM would solve that LL/SC didn&#039;t...
</description>
		<content:encoded><![CDATA[<p>David,<br />
Thanks for the thoughtful comments &#8212; it&#8217;s always a relief when a fellow domain expert agrees!  As for HTM, I&#8217;m skeptical that it does indeed gain over traditional techniques, especially for small critical sections.  After all, I&#8217;m still doing read-to-own bus transactions (no way around that), and that&#8217;s a much greater cost than the pipeline stalls from a compare&amp;swap. Further: are the scenarios in which HTM results in a higher-performing system for the contended case or the uncontended case?  If the former (that is, if HTM only putatively shows a benefit when contention is high), HTM is falling into a classic architecture pitfall:  optimizing for the wrong case.  In the Solaris kernel &#8212; as in any mature system with fine-grained parallelism &#8212; the vast, vast majority of locks are uncontended.  (Indeed, that&#8217;s the whole damn point of fine-grained parallelism.)  And when we do find a lock that has high contention, we take the steps necessary to defract or eliminate that contention &#8212; we don&#8217;t optimize for the contention itself.<br />
My HTM skepticism is also heightened by the fact that the world has already had the opportunity to experiment with a flavor of HTM:  namely, load-linked/store-conditional (implemented at least by Alpha and PowerPC).  Now, there is obviously a difference between LL/SC and full-blown HTM, but if one is making the argument that HTM is essentially most useful for &quot;very small critical sections&quot;, I think one needs to address what HTM would solve that LL/SC didn&#8217;t&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Holmes</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-149</link>
		<dc:creator>David Holmes</dc:creator>
		<pubDate>Thu, 04 Dec 2008 18:34:41 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-149</guid>
		<description>Hear, hear, Bryan! As part of the core group involved in developing the Java Concurrency Utilities and having been teaching about concurrent programming in Java for over ten years, it was disheartening to read so much nonsense about the &quot;CMT sky is falling&quot;. Does CMT impose additional challenges for effective concurrent programming? Sure. But there are so many fallacies in the arguments being put forward: the main one being that a single application needs to keep all of those cores busy. I&#039;m all for additional parallelism, but even then the simplistic programming models being advocated in a number of languages/platforms don&#039;t even take into account that there are thresholds below which parallelizing a problem just doesn&#039;t make sense. (Just because you can doesn&#039;t mean you should!)
As for TM, well I&#039;ve long been a software TM skeptic, for the reasons you outline: TM relies on the &#039;M&#039; and once you have real programs that need to handle other things transactionally (or rather, that have things that can&#039;t be handled transactionally!) then STM breaks down. I&#039;ve read, and reviewed, a lot of academic papers on STM, and as the authors try to expand their models to cope with things that are inherently non-transactional, the programming model gets more and more complex, to the point where I don&#039;t believe that the resulting model is any better than &quot;threads &amp; locks&quot; - far from it. Plus the performance is terrible too.
Hardware TM is a slightly different story. You can take advantage of HTM for very small critical sections of code (that involve only memory) and gain performance benefits over alternative techniques. And the programming model is somewhat simpler compared to lock-free techniques (but only marginally so, given that the programming models at this low level are fairly simple to begin with).
Regards,
David Holmes
</description>
		<content:encoded><![CDATA[<p>Hear, hear, Bryan! As part of the core group involved in developing the Java Concurrency Utilities and having been teaching about concurrent programming in Java for over ten years, it was disheartening to read so much nonsense about the &quot;CMT sky is falling&quot;. Does CMT impose additional challenges for effective concurrent programming? Sure. But there are so many fallacies in the arguments being put forward: the main one being that a single application needs to keep all of those cores busy. I&#8217;m all for additional parallelism, but even then the simplistic programming models being advocated in a number of languages/platforms don&#8217;t even take into account that there are thresholds below which parallelizing a problem just doesn&#8217;t make sense. (Just because you can doesn&#8217;t mean you should!)<br />
As for TM, well I&#8217;ve long been a software TM skeptic, for the reasons you outline: TM relies on the &#8216;M&#8217; and once you have real programs that need to handle other things transactionally (or rather, that have things that can&#8217;t be handled transactionally!) then STM breaks down. I&#8217;ve read, and reviewed, a lot of academic papers on STM, and as the authors try to expand their models to cope with things that are inherently non-transactional, the programming model gets more and more complex, to the point where I don&#8217;t believe that the resulting model is any better than &quot;threads &amp; locks&quot; &#8211; far from it. Plus the performance is terrible too.<br />
Hardware TM is a slightly different story. You can take advantage of HTM for very small critical sections of code (that involve only memory) and gain performance benefits over alternative techniques. And the programming model is somewhat simpler compared to lock-free techniques (but only marginally so, given that the programming models at this low level are fairly simple to begin with).<br />
Regards,<br />
David Holmes</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bryan Cantrill</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-148</link>
		<dc:creator>Bryan Cantrill</dc:creator>
		<pubDate>Wed, 05 Nov 2008 15:37:06 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-148</guid>
		<description>Keith,
First, loved your follow-up to our work:
  &lt;a href=&quot;http://x86vmm.blogspot.com/2008/11/cantrill-and-bonwick-get-all-concurrent.html&quot; rel=&quot;nofollow&quot;&gt;http://x86vmm.blogspot.com/2008/11/cantrill-and-bonwick-get-all-concurrent.html&lt;/a&gt;
Yes, the microkernel debate is another interesting analogue, and your observations are spot-on (and I say this as one who did kernel work for a microkernel operating system).  Do you think there are enough of these for a book?  &quot;Locked in the Cellar: System Software&#039;s Crazy Aunts, 1970-present&quot;?
In terms of good faith:  the problem I have is not that the TM folks are incorrect, it&#039;s the arrogance of the sweeping assertions.  Speaking personally, I have attempted to inject a little data/reality into the thinking of some TM partisans, if only to get them to narrow the scope of their assertions a bit.  I have been roundly ignored each time.  Perhaps not malice, but it is certainly true that there is a point at which malice and incompetence become impossible to distinguish from one another...
</description>
		<content:encoded><![CDATA[<p>Keith,<br />
First, loved your follow-up to our work:<br />
  <a href="http://x86vmm.blogspot.com/2008/11/cantrill-and-bonwick-get-all-concurrent.html" rel="nofollow">http://x86vmm.blogspot.com/2008/11/cantrill-and-bonwick-get-all-concurrent.html</a><br />
Yes, the microkernel debate is another interesting analogue, and your observations are spot-on (and I say this as one who did kernel work for a microkernel operating system).  Do you think there are enough of these for a book?  &quot;Locked in the Cellar: System Software&#8217;s Crazy Aunts, 1970-present&quot;?<br />
In terms of good faith:  the problem I have is not that the TM folks are incorrect, it&#8217;s the arrogance of the sweeping assertions.  Speaking personally, I have attempted to inject a little data/reality into the thinking of some TM partisans, if only to get them to narrow the scope of their assertions a bit.  I have been roundly ignored each time.  Perhaps not malice, but it is certainly true that there is a point at which malice and incompetence become impossible to distinguish from one another&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Keith Adams</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-147</link>
		<dc:creator>Keith Adams</dc:creator>
		<pubDate>Wed, 05 Nov 2008 15:14:22 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-147</guid>
		<description>&quot;Shysters&quot; is, at least, uncharitable. The TM partisans are, in fact, wrong, and it&#039;s becoming increasingly acceptable to say so in polite company. But I suspect that they&#039;ve come to their wrongness in good faith...
</description>
		<content:encoded><![CDATA[<p>&quot;Shysters&quot; is, at least, uncharitable. The TM partisans are, in fact, wrong, and it&#8217;s becoming increasingly acceptable to say so in polite company. But I suspect that they&#8217;ve come to their wrongness in good faith&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bryan Cantrill</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-146</link>
		<dc:creator>Bryan Cantrill</dc:creator>
		<pubDate>Wed, 05 Nov 2008 14:59:36 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-146</guid>
		<description>RNC,
Yes, a good question -- and I suppose another way in which the two-level/TM analogue holds up is that Sun had/has a major stake in both. ;)  The answer is that I&#039;m not close enough to Rock to answer the implementation issue definitively -- but when the TM issue was initially discussed in the CPU architecture committees in which I participated (in 2001), I did not withhold my skepticism.  Rock is not -- or should not be -- &quot;dependent&quot; on TM.  They have added support for it, and if some body of researchers or (less likely in my opinion) practitioners find that support useful, great.  But TM in Rock should be a sideshow, not the main event...
</description>
		<content:encoded><![CDATA[<p>RNC,<br />
Yes, a good question &#8212; and I suppose another way in which the two-level/TM analogue holds up is that Sun had/has a major stake in both. <img src='http://dtrace.org/blogs/bmc/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />   The answer is that I&#8217;m not close enough to Rock to answer the implementation issue definitively &#8212; but when the TM issue was initially discussed in the CPU architecture committees in which I participated (in 2001), I did not withhold my skepticism.  Rock is not &#8212; or should not be &#8212; &quot;dependent&quot; on TM.  They have added support for it, and if some body of researchers or (less likely in my opinion) practitioners find that support useful, great.  But TM in Rock should be a sideshow, not the main event&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bryan Cantrill</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-145</link>
		<dc:creator>Bryan Cantrill</dc:creator>
		<pubDate>Wed, 05 Nov 2008 14:54:26 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-145</guid>
		<description>UX-admin,
Peter&#039;s right; check out those references for the architecture of the two-level model.  It should also be said (and I don&#039;t think I actually said it in my thesis) that the lab in which I worked had a Solaris source license.  Having the source was instrumental in me being able to do my research -- a fact which helped inform my bias towards open source after I arrived at Sun...
</description>
		<content:encoded><![CDATA[<p>UX-admin,<br />
Peter&#8217;s right; check out those references for the architecture of the two-level model.  It should also be said (and I don&#8217;t think I actually said it in my thesis) that the lab in which I worked had a Solaris source license.  Having the source was instrumental in me being able to do my research &#8212; a fact which helped inform my bias towards open source after I arrived at Sun&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Schow</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-144</link>
		<dc:creator>Peter Schow</dc:creator>
		<pubDate>Wed, 05 Nov 2008 08:07:28 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-144</guid>
		<description>@UX-admin:  Take a look at references [9] and [8] in the mentioned thesis.
</description>
		<content:encoded><![CDATA[<p>@UX-admin:  Take a look at references [9] and [8] in the mentioned thesis.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: RNC</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-143</link>
		<dc:creator>RNC</dc:creator>
		<pubDate>Wed, 05 Nov 2008 05:24:10 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-143</guid>
		<description>So what is your view on Hardware Transactional Memory as implemented in Rock? A CPU upon which Sun Micro&#039;s future appears dependent?
</description>
		<content:encoded><![CDATA[<p>So what is your view on Hardware Transactional Memory as implemented in Rock? A CPU upon which Sun Micro&#8217;s future appears dependent?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: UX-admin</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-142</link>
		<dc:creator>UX-admin</dc:creator>
		<pubDate>Wed, 05 Nov 2008 01:19:54 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-142</guid>
		<description>&quot;This article is half stern lecture on the merits of abstinence and half Kama Sutra.&quot;
I love your writing style!
BTW, a careful reader will have noted that you already knew about the two-level scheduling model in Solaris. Given that at that time Solaris wasn&#039;t open source, how did you know how threading was implemented in Solaris, before having actually been hired by Jeff to work at Sun?
</description>
		<content:encoded><![CDATA[<p>&quot;This article is half stern lecture on the merits of abstinence and half Kama Sutra.&quot;<br />
I love your writing style!<br />
BTW, a careful reader will have noted that you already knew about the two-level scheduling model in Solaris. Given that at that time Solaris wasn&#8217;t open source, how did you know how threading was implemented in Solaris, before having actually been hired by Jeff to work at Sun?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Garrett D'Amore</title>
		<link>http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-141</link>
		<dc:creator>Garrett D'Amore</dc:creator>
		<pubDate>Tue, 04 Nov 2008 09:50:01 +0000</pubDate>
		<guid isPermaLink="false">http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/#comment-141</guid>
		<description>Wow, an unbelievably excellent post on a topic that I have found myself wondering about as well.  TM has seemed, to me at least, to offer additional complexity without really solving the hard problems that I wind up using locks (and other synchronization primitives in the kernel) to solve, and your post puts into words what I&#039;ve been feeling for a while.  Thank you!
</description>
		<content:encoded><![CDATA[<p>Wow, an unbelievably excellent post on a topic that I have found myself wondering about as well.  TM has seemed, to me at least, to offer additional complexity without really solving the hard problems that I wind up using locks (and other synchronization primitives in the kernel) to solve, and your post puts into words what I&#8217;ve been feeling for a while.  Thank you!</p>
]]></content:encoded>
	</item>
</channel>
</rss>

