[cpp-threads] Asynchronous Execution Issues

Fri Apr 24 23:56:59 BST 2009

On 4/24/09, Bjarne Stroustrup <bs at cs.tamu.edu> wrote:
> Lawrence Crowl wrote:
>
> > At the Summit meeting I took the action item of producing a proposal
> > for a simple asynchronous execution.  The idea was two facilities,
> > with an API along the lines of:
> >
> >    auto x = std::creating_async( function1 );
> >    auto y = std::caching_async( function2 );
> >    ....
> >    auto a = x.get();
> >    auto b = y.get();
> >
> > The difference between the two is that creating_async would always
> > create a new thread, while the caching_async is permitted to reuse
> > threads.  The primary operational difference is that thread-local
> > storage is reused in the latter case.
> >
> >
>
> This is great! I was about to write something along those
> lines myself (thinking that I was among those so charged -
> and seriously behind with my work as usual)

Well, you asked to review my work early, so you were a little on
the hook.  :-)

> Naming is a big deal: I don't think those two names are good.

Fully agreed, but I'd rather start with a clearly bad name than a
subtly bad name.  Please send any suggestions that you might have.

> Also, I don't think the key issue is/was whether a thread might
> be reused, but whether a task might be executed on the thread
> that launched it. In that spirit, I propose the names:
>    async()   // execute somewhere
>    async_thread()  // execute on a thread different from that of the launcher

The concurrency subgroup didn't take this view.  The idea was that
the work should be able to include arbitrary synchronization, and
if the caller and the work are serialized, they cannot synchronize.

I agree that a facility to execute the work on the current thread
would be helpful, particularly in avoiding oversubscription, but
it does impose restrictions on the synchronization in the work.
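
To make the restriction concrete, here is a minimal sketch.  It uses
std::async with the std::launch::async and std::launch::deferred
policies from the later C++11 standard as stand-ins for the two
facilities under discussion.  That mapping, and the helper names
ran_on_other_thread and handshake, are assumptions for illustration
and not part of the proposal.

```cpp
#include <future>
#include <thread>

// Returns true when the work ran on a thread other than the caller's.
// launch::deferred runs the work inside get(), on the calling thread.
bool ran_on_other_thread(std::launch policy) {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() != caller;
}

// The caller and the work synchronize in both directions.
// This is only safe because launch::async gives the work its own thread.
int handshake() {
    std::promise<void> started;
    std::promise<int> go;
    std::future<void> started_f = started.get_future();
    std::future<int> go_f = go.get_future();

    std::future<int> work = std::async(std::launch::async, [&] {
        started.set_value();      // tell the caller the work is running
        return go_f.get() + 1;    // then wait for input from the caller
    });

    started_f.wait();   // with launch::deferred this wait would never return
    go.set_value(41);
    return work.get();
}
```

If the work were serialized with the caller, the wait on started_f
would block forever, because the work would not begin until get().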

> (is it async() or asynch() and why?)

It is async because without the trailing 'ronous', the ch would be
pronounced as in church.  English spelling is not ideal.

> > There are a couple of issues that make the task more difficult than
> > we thought at the time.  As a result, I have a couple of questions.
> >
> > First, consider the case of the caching_async.  Because the threads
> > may persist, we need a handle on the list of threads so that we
> > can inform a thread to die so that we destroy any thread-local
> > variables before the global variables that they reference.  So, we
> > start needing a manager object, and the whole facility is starting
> > to look too much like a thread pool.
>
> I'm not convinced that's necessary. It's up to whoever implements
> async_thread() to
>    (1) make sure that a task gets a "clean" thread (with no
>    information from previous tasks in local variables)
>    (2) a terminal error on an async_thread()ed thread is reflected
>    in its future and the thread is properly killed or recycled.
> I think that exposing any details of how/if/when that is done is
> against the spirit of asynch.

I am not worried about OS-level caching of threads (and many systems
will cache anyway) but specifically about the behavior of thread-local
variables
as a result of calling the async function.  If the variables are
always created and destroyed, that simplifies the problem.

> > I don't feel as though I have a mandate to propose anything
> > that looks like a thread pool.  Do you agree?
>
> Agreed.
>
>
> > Second, consider the case of the creating_async.  The problem here
> > is a touch more subtle.  In particular, once the working thread has
> > set its promise, it can start destroying thread-local variables.
>
> You are making too many assumptions here. My mental model is
> that the thread is clean when a task starts executing. It couldn't
> possibly know about the local variables of a previous task.

I'm worried about the thread-local variables of the new thread that
does the work.  For instance,

    thread_local complicated helper;

    auto f = async( []{ return foo( &helper ); } );
    ....
    auto x = f.get();

Running the work in another thread necessarily initializes a helper
in that thread.  Suppose the destructor of complicated uses a global
variable.  When will that use occur?  In short, we have no idea.

More specifically, we know that the return happens before the get,
but we have no synchronization between the destruction of helper and
the calling code.
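
The same concern can be written out as a self-contained sketch.  The
definitions of complicated and foo, the destroyed counter standing in
for the global variable, and the names observation and run_once are
all hypothetical.  std::thread and std::promise are used here in place
of the proposed async function.

```cpp
#include <atomic>
#include <future>
#include <thread>

std::atomic<int> destroyed{0};   // stands in for "a global variable"

struct complicated {
    int value = 7;
    ~complicated() { ++destroyed; }   // the destructor touches the global
};

thread_local complicated helper;

int foo(complicated* c) { return c->value; }

struct observation {
    int result;                 // value obtained through the future
    int destroyed_after_join;   // destructor count once the worker has exited
};

observation run_once() {
    std::promise<int> p;
    std::future<int> f = p.get_future();

    // Using helper in the worker initializes a helper in that thread.
    std::thread worker([&p] { p.set_value(foo(&helper)); });

    int x = f.get();
    // The value is available here, but the worker's helper may or may
    // not have been destroyed yet; nothing orders the two.

    worker.join();
    // Only after join() is the destruction known to have completed.
    return {x, destroyed.load()};
}
```

Between f.get() and worker.join() the count could read either 0 or 1,
which is exactly the window the paragraph above describes.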

> > Unfortunately, we have no idea when that has happened or might happen.
> > That thread could be stalled immediately after providing the value.
> > The solution is to wait for thread termination before returning the
> > value from the future.  This solution implies keeping the thread as
> > part of the future so that we can wait.  We have no such mechanism.
> > So, an asynchronous execution function needs another kind of future.
> > Is such a future going beyond my mandate?
>
> Sorry to be so brief, but I am just finishing the semester, etc.,
> and I think that the solution to these problems is to specify
> simply what must be done and carefully avoid saying how it is done.

The futures we have now do not have the necessary synchronization.
I can add a future, but if adding another future type would kill
the proposal, I'd rather not do the work to define it.  This mail
is a straw poll to see if I should do the work.
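
For the sake of discussion, one possible shape for such a future is
sketched below.  The names joining_future and joining_async are
hypothetical, the sketch assumes the work returns a non-void,
non-reference value, and it is not proposed wording.  The only point
is that the future keeps the thread and joins it before handing out
the value.

```cpp
#include <exception>
#include <future>
#include <thread>
#include <utility>

// Hypothetical future that owns its worker thread.
template <class T>
class joining_future {
public:
    joining_future(std::future<T> result, std::thread worker)
        : result_(std::move(result)), worker_(std::move(worker)) {}
    joining_future(joining_future&&) = default;
    ~joining_future() { if (worker_.joinable()) worker_.join(); }

    T get() {
        // Wait for thread termination first, so the worker's
        // thread-local variables are destroyed before the caller
        // sees the value.
        if (worker_.joinable()) worker_.join();
        return result_.get();
    }

private:
    std::future<T> result_;
    std::thread worker_;
};

// Hypothetical launcher: always creates a new thread for the work.
template <class F>
auto joining_async(F work) -> joining_future<decltype(work())> {
    using T = decltype(work());
    std::promise<T> p;
    std::future<T> f = p.get_future();
    std::thread t([p = std::move(p), work = std::move(work)]() mutable {
        try {
            p.set_value(work());
        } catch (...) {
            p.set_exception(std::current_exception());
        }
    });
    return joining_future<T>(std::move(f), std::move(t));
}
```

Usage would look like auto f = joining_async( []{ return 6 * 7; } );
followed by f.get().  The cost is that get() cannot return until the
whole thread has exited, not merely until the value is ready.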

-- 
Lawrence Crowl