[cpp-threads] Asynchronous Execution Issues (caching_async)
Anthony Williams
anthony at justsoftwaresolutions.co.uk
Wed Apr 29 08:45:01 BST 2009
At Wed 29 Apr 2009 05:13:33 BST, "Robison, Arch"
<arch.robison at intel.com> wrote:
> The experience with TBB strongly supports Herb's point: users need
> both. For the first few years of TBB, we offered only work stealing,
> and users kept asking us for true asynchrony too. (Now TBB offers
> the latter via an approximation of std::thread).
>
> A key lesson from Cilk is that it's not just about granularity. The
> choice of schedule can profoundly affect the asymptotic space
> behavior of a program. For example, Cilk-style scheduling often
> delivers space bounds linear in the number of processors, whereas
> the equivalent code using true asynchrony can have exponential space
> blowup. (Think depth-first versus breadth-first.)
>
> Worse, "both" is probably not enough. Cilk-style scheduling is
> great for a certain class of problems, but there are other problems
> (e.g. stream processing) where FIFO-like scheduling makes more
> sense. Beman Dawes's suggestion of a more generic approach is worth
> investigating.
>
> To make non-preemptive schedulers useful, users have to understand
> their limitations, particularly with respect to deadlock. For
> example, it's overly simplistic to require that one caching_async()
> call never wait on another. At a minimum, I want a function being
> executed by a caching_async() call to be able to execute another
> caching_async() call and wait on the result. Invariably users want
> to do more complex work graphs.
>
> Thus I'm skeptical that caching_async can be both widely useful and
> avoid the hard issues of a thread pool.
It is for this reason that the BSI comment was worded as it was:
"Provide a simple function along the lines of:
template< typename F, typename ... Args > requires Callable< F, Args... >
future< Callable::result_type > async( F&& f, Args && ... args );
Semantics are similar to creating a thread object with a packaged_task
invoking f with forward<Args>(args)... but details are left
unspecified to allow different scheduling and thread spawning
implementations. It is unspecified whether a task submitted to async
is run on its own thread or a thread previously used for another async
task. If a call to async succeeds, it shall be safe to wait for it
from any thread. The state of thread_local variables shall be
preserved during async calls. No two incomplete async tasks shall see
the same value of this_thread::get_id(). [Note: this effectively
forces new tasks to be run on a new thread, or a fixed-size pool with
no queue. If the library is unable to spawn a new thread or there are
no free worker threads then the async call should fail.] "
This is deliberately basic. Aside from thread_locals it has the same
semantics as spawning a new thread every time, but it is allowed to
fail due to "too many threads", and may reuse old async threads once
the original task has finished.
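To make the semantics concrete, here is a minimal sketch of the
BSI-style async written against current (C++0x-era) facilities. The
name simple_async is invented for illustration; this version always
spawns a fresh thread, and leaves out the thread-reuse and "too many
threads" failure provisions (std::thread's constructor throwing
std::system_error would be the natural failure path):

```cpp
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Sketch only: every call spawns a fresh thread running a
// packaged_task whose future is handed back to the caller.
template <typename F, typename... Args>
auto simple_async(F&& f, Args&&... args)
    -> std::future<decltype(f(args...))>
{
    using result_type = decltype(f(args...));
    std::packaged_task<result_type()> task(
        std::bind(std::forward<F>(f), std::forward<Args>(args)...));
    std::future<result_type> fut = task.get_future();
    // std::thread's constructor may throw if no thread can be spawned,
    // which maps onto the "async call should fail" clause above.
    std::thread t(std::move(task));
    t.detach();  // the future synchronises on the result
    return fut;
}
```

Waiting on the returned future from any thread is safe, since the
shared state is owned by the future, not the detached thread.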
In general, I've seen three different models for dividing work:
1) Discrete long-running tasks which communicate, e.g. a GUI thread
plus a background worker. These are often well served by std::thread,
but if they return a result it might be good to have an async function
that can run them and return a future for the result. This really
requires that each function runs on a separate thread.
2) Many independent tasks that can run in parallel. These are ideally
suited to a work-stealing thread pool as there is no communication
(except maybe short-lived locks) and no dependent waiting.
3) A graph of inter-dependent tasks that can mostly be run in
parallel, but which have dependencies or communicate with each other.
A simple work-stealing pool can deadlock on such tasks, particularly
if they are nested on the same thread. In order to run such tasks on a
pool you need to provide dependency information to the pool. Things
like task groups can provide such information. Running each task on a
separate thread should "work", but will likely have poor performance
due to oversubscription, thread overheads and so forth.
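Model #2 in miniature might look like the following sketch, with
std::async standing in for submission to a work-stealing pool;
parallel_sum is an invented example, not part of any proposal:

```cpp
#include <future>
#include <numeric>
#include <vector>

// Many independent tasks, no dependent waiting: fan out chunks of the
// input, then join by collecting each future. No task ever waits on
// another task, only the submitter waits on its children.
int parallel_sum(const std::vector<int>& v, std::size_t chunks)
{
    std::size_t n = v.size() / chunks;
    std::vector<std::future<int>> futures;
    for (std::size_t i = 0; i < chunks; ++i) {
        auto first = v.begin() + i * n;
        auto last  = (i + 1 == chunks) ? v.end() : first + n;
        futures.push_back(std::async(std::launch::async,
            [first, last] { return std::accumulate(first, last, 0); }));
    }
    int total = 0;
    for (auto& f : futures)
        total += f.get();  // join step: the only waiting in the program
    return total;
}
```

Because the tasks share nothing and only the submitter waits, this
pattern is safe on any scheduler, pooled or not.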
Proper support for #3 is clearly outside the remit of what we're
doing at the moment. Support for #1 or #2 is doable, but they
require different thread properties --- #1 requires separate threads,
#2 works better with a limited pool of threads. If we provide support
for #1 or #2, I can see people trying to use it for #3.
The BSI suggestion handles #1, and provides working but obviously
poor support for #2 and #3.
If we provide good support for #2 this does not work for #1 or #3, as
the tasks may not run when required. User code may deadlock, and this
deadlock may depend on the number of available threads in the pool
e.g. with 4 threads everything is fine, but with 3 or 5 you get
deadlock as the scheduling properties change.
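The nested-wait deadlock is easy to reproduce with a deliberately tiny
pool. Everything below (one_thread_pool, nested_wait_deadlocks) is
invented for illustration; wait_for with a timeout is used so the demo
reports the deadlock instead of hanging:

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

// A one-thread "pool": the smallest scheduler on which the problem shows.
class one_thread_pool {
    std::queue<std::function<void()>> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::thread worker;
public:
    one_thread_pool() : worker([this] {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [this] { return done || !q.empty(); });
                if (done && q.empty()) return;
                task = std::move(q.front());
                q.pop();
            }
            task();
        }
    }) {}
    ~one_thread_pool() {
        { std::lock_guard<std::mutex> lk(m); done = true; }
        cv.notify_all();
        worker.join();
    }
    std::future<int> submit(std::function<int()> f) {
        auto task =
            std::make_shared<std::packaged_task<int()>>(std::move(f));
        auto fut = task->get_future();
        { std::lock_guard<std::mutex> lk(m); q.push([task] { (*task)(); }); }
        cv.notify_one();
        return fut;
    }
};

bool nested_wait_deadlocks() {
    one_thread_pool pool;
    auto outer = pool.submit([&pool] {
        auto inner = pool.submit([] { return 1; });  // queued behind us
        // Waiting here blocks the only worker, so the inner task can
        // never start: a real deadlock, detected here via a timeout.
        if (inner.wait_for(std::chrono::milliseconds(100))
                == std::future_status::timeout)
            return -1;
        return inner.get();
    });
    return outer.get() == -1;
}
```

With two worker threads the same code completes; the failure depends
purely on pool size, which is exactly the scheduling fragility
described above.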
It might be an option to support #1 and #2, which gets us back to
creating_async vs caching_async. It would be nice if we could do this
without making it hard to support #3 later in the same framework,
which leads us to Beman's generic async proposal with the properties
specified by an extra parameter. This is particularly good if we take
into account the issue that Arch noted above about different code
requiring different scheduling properties.
unique_future<retval> f1=async(std::own_thread,gui_func); //#1
unique_future<retval> f2=async(std::pool_thread,parallel_func); //#2
unique_future<retval> f3=async(my_task_group,dependent_parallel_func); //#3
unique_future<retval> f4=async(fifo_pool,dependent_parallel_func); //#3
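To make the policy-parameter shape concrete, here is a minimal sketch
of how such a dispatch might be written. The names (launch_policy,
own_thread, caller_thread, policy_async) are invented for illustration,
and only the two simplest policies are sketched; pool and task-group
policies would need a real scheduler behind them:

```cpp
#include <future>
#include <thread>
#include <utility>

// Invented policy tags: run on a dedicated thread (#1 style), or
// synchronously on the submitting thread (a trivial fallback policy).
enum class launch_policy { own_thread, caller_thread };

template <typename F>
auto policy_async(launch_policy p, F f) -> std::future<decltype(f())>
{
    using R = decltype(f());
    if (p == launch_policy::own_thread) {
        std::packaged_task<R()> task(std::move(f));
        auto fut = task.get_future();
        std::thread(std::move(task)).detach();  // dedicated thread
        return fut;
    }
    // caller_thread: run right here and hand back a ready future.
    std::promise<R> prom;
    prom.set_value(f());
    return prom.get_future();
}
```

The point is only that the first argument, not the function name,
selects the scheduling property, so richer policies can be added
without multiplying entry points.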
I think the thread_local issue is actually a distraction. It is always
possible to write classes which have destructors with behaviour that
is undesirable when the class is used as a thread_local. I think it
might be useful to have join() only return when the thread function
has completed AND all the thread_locals have been destroyed, but this
is not an issue for async. If you use async you do not know the
lifetime of thread_locals. They may exist before your task starts, and
may continue to exist after it finishes.
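The thread_local lifetime point can be seen by explicitly reusing one
thread for two tasks; bump and counter are invented for illustration:

```cpp
#include <thread>

thread_local int counter = 0;

int bump() { return ++counter; }

// Two "tasks" run back-to-back on one reused thread: the second task
// observes the thread_local state left behind by the first. A task
// handed to an async that reuses threads can therefore assume neither
// a fresh thread_local at entry nor its destruction at exit.
void reuse_demo(int& first, int& second)
{
    std::thread t([&] {
        first  = bump();  // fresh thread: counter 0 -> 1
        second = bump();  // same thread, "second task": counter 1 -> 2
    });
    t.join();
}
```

On a fresh-thread-per-task implementation both calls would see 1,
which is why portable code cannot rely on either behaviour.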
Anthony
--
Anthony Williams
Author of C++ Concurrency in Action | http://www.manning.com/williams
just::thread C++0x thread library | http://www.stdthread.co.uk
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Just Software Solutions Ltd, Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK