[cpp-threads] Asynchronous Function Proposal

Pablo Halpern phalpern at cilk.com
Fri Jun 5 02:26:08 BST 2009


Response below:

On 6/2/2009 8:07 PM, Boehm, Hans wrote:
> [Adding Pablo, Arch, and Doug:  This is a continuing discussion of Lawrence Crowl's proposal for a simple C++ facility to support asynchronous function call execution.  Reattaching Lawrence's proposal for newcomers.]
>   
[Bjarne seems to be the only one in this thread not subscribed to the 
cpp-threads list at present]

I'm glad to be included in this discussion.  I certainly have Cilk 
experience to contribute.  Since I've been out of the loop for a while, 
I haven't read Lawrence's latest proposal in detail yet.

> I'm not an expert on this stuff.  (Copying Pablo in case he can help and hasn't seen this.) But my understanding is that when Cilk is about to execute a spawned task, it pushes a work item on the task queue that represents the parent's continuation.  That pushed item may get stolen by another worker thread.  Thus the parent thread may appear to have migrated between threads.
Correct.
> My vague intuition is that this is necessary if you want to both:
>
> - Efficiently execute child tasks on the parent thread in the common case, and
> - Be reasonably sure that when you steal a task, you're likely to steal something reasonably large, from closer to the root of the tree.
>   
Actually, neither statement is completely correct.

1) The true argument for stealing continuations is space, not time. I.e., we can support ``for(i=1 to 10^9) cilk_spawn f();'' without running out of space.

   It may be true that stealing continuations is more efficient than stealing children, but the difference is not that large.  (E.g., TBB/fib is pretty fast, although not as fast as Cilk++/fib.)
   
2) Stealing from near the root has nothing to do with whether we steal 
the child or the continuation.  What it does, though, is allow the 
scheduling algorithm to pretty quickly find the work that's on the 
critical path, without ever suspending a worker unless there's no more 
parallel work to be done.

> If I understand this correctly, this doesn't mesh particularly well with a bunch of other thread facilities in the current draft, e.g. thread-local variables.  (There are more details on Cilk in their PLDI 98 paper http://supertech.csail.mit.edu/papers/cilk5.pdf .)
>   
Thread-local variables certainly don't play well with Cilk threads.  We 
have several ideas floating around at Cilk Arts for how to create 
something that works better than TLS.  As we've explored parallelism in 
general, we've come to see that TLS has the same problem as global 
variables.  Tying a variable to a thread only works for very limited 
scopes.  The problem is not limited to Cilk-style work-stealing.  Any 
time a single job can be divided among threads, TLS falls apart.  This 
can happen even in a do-it-yourself threading system with a sequential 
job: the job queues a sub-task onto the task queue; when the sub-task 
completes, it queues the next sub-task, which may be run on a different 
worker thread (with different TLS).

TLS works for situations where one thread == one complete job.  For 
example, it is fine for a UI thread that sits in an event loop or when 
the program itself can logically be split into producer and consumer 
threads.  TLS does not work well for fine-grained parallelism, where you 
try to take advantage of every opportunity to do work in parallel.

For async(), we have two choices:
1. Require that async() return on the same thread OR
2. Don't require that async() return on the same thread (and thus don't 
require that TLS be the same).

Cilk obviously chose #2 for cilk_spawn.

I'll just put in a note that we would eventually like to standardize the 
Cilk++ language, possibly by merging it into the next (post-C++0x) 
version of C++.  The Cilk++ spawn/sync system defines a consistent 
calculus with certain guarantees that probably cannot be implemented by 
an async() library facility.  My concern is that an overly-ambitious 
async() library facility might slow adoption of a more powerful language 
facility. (But perhaps I have it backwards: People will be happy to go 
from the almost-there library facility to the completely-there language 
facility.)
> TBB's tasks are more complicated to use, since they don't rely on language support.  They may avoid this kind of issue; I'm not sure. (Copying Arch in case he can help.)
>
> My impression is that for both Cilk and TBB, interactions with other C++0x thread facilities haven't really been worked out.  (Not too surprising, since we're having enough problems with interactions between C++0x facilities, not to mention weird interactions within older systems we're building on.)
>   
Alas, too true.
- Pablo
> Hans
>   
>


