[cpp-threads] Asynchronous Function Proposal

Sun Jun 21 16:40:33 BST 2009

Herb,

I don't disagree with you in principle on either the programming model 
or the desirability of running the async task in a thread pool.  My 
concern is that we have deferred thread pools to TR2 and I am reluctant 
to do anything here that might force certain decisions in TR2 thread 
pools.  For example, a thread pool proposal might have some strategy for 
dealing with the duration of thread-local storage.  I can imagine a 
quasi-join of a thread from a pool, though I have not thought out how 
that would work or what its characteristics would be.  Without having 
done the design work on thread pools, I'm not sure how any decision we 
make on async() might interact with a future thread pool proposal.  One 
possibility is that a thread pool should be a separate argument to async().
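
To make that last possibility concrete, here is a rough sketch of what
passing the pool explicitly might look like.  Everything in it is
hypothetical: simple_pool and async_on are names invented for
illustration, not part of any proposal, and the pool deliberately
ignores the thread-local storage question raised above.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool, just enough to illustrate the idea.
class simple_pool {
public:
    explicit simple_pool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~simple_pool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();  // queued jobs are drained first
    }
    void submit(std::function<void()> job) {
        assert(job);  // an empty std::function would throw when invoked
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shutting down and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
    std::vector<std::thread> workers_;
};

// Hypothetical async variant that takes the pool as a separate argument.
template <class F>
auto async_on(simple_pool& pool, F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    // packaged_task is move-only, so hold it in a shared_ptr to make the
    // queued job copyable as std::function requires.
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(f));
    std::future<R> result = task->get_future();
    pool.submit([task] { (*task)(); });
    return result;
}
```

A call would then read async_on(pool, []{ return f(); }), with the
caller rather than async() deciding which pool runs the task.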

That said, my feelings on the matter are not particularly strong.  I
have little hope that thread-local storage can be made much more useful
except by using Cilk++-like hyperobjects.  We should start now to
spread the word that thread-local storage is almost as bad as global
storage: it must be avoided when possible and treated with a lot of
care when it can't be avoided.  There is a large class of problems for
which the issues described in N2880 do not arise.

- Pablo

On 6/19/2009 7:51 PM, Herb Sutter wrote:
> Yes, let's first converge on the programming model.
>
> Lawrence wrote:
>   
>> async is not a replacement for OpenMP.
>>     
> [...]
>   
>> Be wary of trying to use this facility for massive parallelism,
>> it is a stop-gap mechanism for few-core systems until TR2 becomes
>> available.
>>     
>
> Agreed! More next:
>
>   
>> The programming model is:
>>    At the highest levels of the program, add async where appropriate.
>>    If enough concurrency has not been achieved, move down a layer.
>>    Repeat until you achieve the desired core utilization.
>>     
>
> Not agreed -- the above seems to be talking about 'using more cores to get the answer faster,' aka "parallelism" or Pillar 2 in < http://www.ddj.com/cpp/200001985 >. That's the space targeted by tools like OpenMP and work stealing and (common uses of) thread pools etc., and patterns like fork-join.
>
> The programming model for async is different:
>
>    Instead of a synchronous function call that returns a synchronous value,
>    make an asynchronous function call that returns an asynchronous value.
>
> This is talking about 'I want to do work (potentially) asynchronously, decoupled from this thread,' aka "concurrency" or Pillar 1 in < http://www.ddj.com/cpp/200001985 >. That's the space targeted by tools like OpenMPI and threads and (some uses of) thread pools etc., and patterns like pipelining. It's mostly not about using more cores at all, although it just so happens that when you express asynchronous work you may also be expressing asynchronous compute-bound work that can keep more cores busy at once, as with pipelining again as a specific example; but this is not about scaling work to saturate a manycore machine.
>
>   
>> A difference in the anticipated programming model may be at the root
>> of the lack of consensus.
>>     
>
> As I put it in my draft paper:
>
>   - That is, just as we can make a synchronous function call that returns a
>   - synchronous result:
>   -
>   -     T t  =  f();
>   -
>   - we want to be able to make an asynchronous function call that returns
>   - an asynchronous result [...] :
>   -
>   -     future<T> t  =  async([]{  return f();  });
>
> That really is all. Do we agree on that motivation?
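
A minimal, compilable version of the two calls in the quoted draft,
using std::async as it appears in the working draft.  The function f
and the choice of int for T are stand-ins of mine, not taken from the
paper.

```cpp
#include <cassert>
#include <future>

// Stand-in for the f() in the draft; here T is int.
int f() { return 42; }

// Synchronous call, synchronous result.
int call_synchronously() {
    int t = f();
    return t;
}

// Asynchronous call, asynchronous result: the value is only available
// once we ask the future for it.
int call_asynchronously() {
    std::future<int> t = std::async([] { return f(); });
    assert(t.valid());  // std::async hands back a future with shared state
    return t.get();     // blocks until the task has produced its value
}
```

Note that the lambda has to return the result of f(); otherwise the
call yields a future<void> rather than a future<T>.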
>
> Finally, without violating the motivation above, it's consistent to want the option of running the async() task on a pool, or on a work stealing implementation if the user opts in via "either"/"may-be-called," because of:
>
>   a) efficiency
>
>   b) not interfering with load balancing done by thread pools et al.
>
> Which brings us to:
>
>   
>>> There's another key problem that I don't think I mentioned
>>> explicitly before, which is oversubscription: A big reason
>>> to be able to run the async task on a thread pool is that
>>> the pool is already in the business of staying "rightsized"
>>> for the machine. In an application that uses today's thread
>>> pools, having compute-intensive work apart from the thread pool
>>> penalizes performance because it makes it harder for the pool to
>>> accurately match ready work to available cores and oversubscribes
>>> the machine. So that's another key reason that an efficient
>>> implementation needs to be able to run the work on a pool
>>> especially when the application is using that pool anyway.
>>>       
>> I agree that async must avoid the over-subscription problem.
>> However, thread pools are not necessary to avoid the problem.
>> In particular, a count of active threads, compared against
>> std::thread::hardware_concurrency() can provide all the information
>> necessary to determine whether a new thread should be created.
>>     
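
For reference, the counting scheme described just above could be
sketched roughly as follows.  The names active_tasks, slot_release and
throttled_async are invented for illustration, and falling back to
deferred execution when the count is at the limit is an assumption of
this sketch; the quoted message only says the count can decide whether
a new thread is created.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical process-wide count of tasks running on their own thread.
std::atomic<unsigned> active_tasks{0};

// Gives the slot back when the task finishes, even if it throws.
struct slot_release {
    ~slot_release() {
        unsigned before = active_tasks.fetch_sub(1);
        assert(before > 0);  // a slot must have been reserved
        (void)before;
    }
};

template <class F>
auto throttled_async(F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    unsigned limit = std::thread::hardware_concurrency();
    if (limit == 0) limit = 1;  // 0 means the value is not computable

    unsigned current = active_tasks.load();
    while (current < limit) {
        // Try to reserve a slot; on failure 'current' is reloaded and
        // the loop condition is checked again.
        if (active_tasks.compare_exchange_weak(current, current + 1)) {
            try {
                return std::async(std::launch::async, [f]() mutable -> R {
                    slot_release guard;
                    return f();
                });
            } catch (...) {
                active_tasks.fetch_sub(1);  // thread could not be started
                throw;
            }
        }
    }
    // Assumed fallback: run the task lazily in whichever thread waits on it.
    return std::async(std::launch::deferred, std::move(f));
}
```

The count only sees threads started through this function, so threads
owned by a pool elsewhere in the application remain invisible to it.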
>
> No, that's not at all what I'm saying. I'm saying that an "in a new thread" async specification doesn't play nice with thread pools, and therefore doesn't play nice with apps that *do already* (or will choose to) use thread pools for their compute-intensive work.
>
> Let me try to say it again with maybe slightly clearer phrasing:
>
> Applications that run their compute-intensive work on a thread pool really want all their compute-intensive work to run on the pool. The pool is already in the business of staying "rightsized" for the machine, and having compute-intensive work outside the thread pool interferes with the pool's ability to accurately match the number of ready threads to the available hardware. Each compute-intensive async task in a non-pool thread adds extra work that the thread pool doesn't know about and so results in oversubscribing the machine, providing more ready work than there is available hardware parallelism.
>
> If we mandate "in a new thread," then it will probably be unusable in practice for any compute-intensive task in an application that is using a thread pool to spread its compute-intensive work across the available hardware.
>
> Herb
>
>
> --
> cpp-threads mailing list
> cpp-threads at decadentplace.org.uk
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads
>