[cpp-threads] Proposing a layered Thread API
Ion Gaztañaga
igaztanaga at gmail.com
Sat Sep 2 00:10:48 BST 2006
Hi,
After reviewing some implementations from Howard, Peter and others, and
recalling some opinions from the Redmond meeting, I'm wondering whether
a layered Thread/Task API could make everyone happy.
Some of the proposals are:
-> Some want a reference-counted, concurrently joinable future. Futures
referring to the same asynchronous execution can be joined from several
threads at the same time and each one gets a copy.
-> Some want a generic future, one that is independent of the
executor. This allows a single future type whether the function is being
executed by a thread pool or just a simple OS thread. This makes the
task system extensible with user-written executors.
-> Some want an easy, efficient implementation and don't want
to pay for all that reference-counted, multiple-join overhead.
I think we can get all of those using a layered approach. This
approach is not fully implemented, but I think it is implementable. These
are the levels:
------------------------------------------------------------
------------------------------------------------------------
Level 0: Direct OS thread management
------------------------------------------------------------
------------------------------------------------------------
-> thread<T>: A handle for the operating system thread, created by a
thread factory that can store settings (pthread_attr_t) and launch
threads.
-> Creating a thread<T> means creating an OS thread and joining it
means waiting for the OS thread termination.
-> The return value is _moved_ to the caller.
Example:
typedef std::vector<char> file_data_t;
file_data_t read_big_file(const char *file);
thread<file_data_t> t = launch_thread(bind(read_big_file, "myfile"));
//The whole vector is moved to the target, and no memory
//allocation is needed to obtain it
file_data_t data = t();
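A minimal sketch of how such a level 0 handle could behave, written against std::thread and C++17. The names os_thread and make_data are hypothetical stand-ins (make_data replaces read_big_file so the example does not touch the file system), and exceptions escaping the thread function are not handled here:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <stdexcept>
#include <thread>
#include <utility>
#include <vector>

typedef std::vector<char> file_data_t;

// Hypothetical level 0 handle (the name os_thread is an assumption).
// It owns exactly one OS thread; joining means std::thread::join().
template <class T>
class os_thread {
    // Heap slot, so moving the handle never moves the object that
    // the running thread writes its result into.
    std::unique_ptr<std::optional<T>> slot_;
    std::thread worker_;
public:
    template <class F>
    explicit os_thread(F f)
        : slot_(new std::optional<T>()),
          worker_([s = slot_.get(), fn = std::move(f)]() mutable {
              s->emplace(fn());  // an escaping exception terminates
          }) {}

    os_thread(os_thread&&) = default;

    ~os_thread() { if (worker_.joinable()) worker_.join(); }

    // One-shot join: waits for the OS thread, then moves the result out.
    T operator()() {
        if (!worker_.joinable())
            throw std::logic_error("already joined");
        worker_.join();
        assert(slot_ && slot_->has_value());
        return std::move(**slot_);
    }
};

// Hypothetical factory mirroring launch_thread from the example above.
template <class F>
auto launch_thread(F f) -> os_thread<decltype(f())> {
    return os_thread<decltype(f())>(std::move(f));
}

// Stand-in for read_big_file: builds n bytes in memory.
file_data_t make_data(std::size_t n) { return file_data_t(n, 'x'); }
```

Usage would look like `os_thread<file_data_t> t = launch_thread(std::bind(make_data, 1000)); file_data_t data = t();`. The vector built in the worker is move-constructed into data, and a second call to t() throws.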
This level 0 thread is a _must_ because:
-> Offers portable basic operating system thread control. This is why we
have so many C and C++ portable runtimes in many libraries (Apache,
Mozilla, ACE, Qt...). So it's clear that some C++ programmers find this
essential.
-> Many times I need an OS thread, and I do care whether the function is
being executed in another thread or not.
Example: I want to launch a GUI in another thread. That means that I
need all POSIX thread guarantees (IO, signals, completion ports,
asynchronous IO, etc...). I don't want the function to be blocked
because a thread pool considers that there are too many active threads.
I want to create a new OS thread and that's all. A generic future<T> is
not the answer, because I want to make sure that when I call "join",
that effectively means pthread_join().
-> Exceptions: Exceptions thrown by the thread can be propagated to the
caller using Beman's virtual functions in std::exception:
virtual unique_ptr<std::exception> clone() = 0;
virtual void throw_self() = 0;
Note that we don't need to use full-clone semantics because we are going
to destroy the launched exception anyway. We can use move semantics, so
that we create a new exception but using the move constructor. When
throwing self, instead of throwing a copy, we can throw a moved version.
virtual unique_ptr<std::exception> move_clone() = 0;
virtual void throw_moved_self() = 0;
-> Implementability: Already implemented by Howard.
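The move-based propagation described above can be sketched as follows. Since std::exception has no such virtual functions, the sketch uses a hypothetical base class movable_exception to stand in for the proposed additions; file_error and run_and_report are also illustrative names only:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical base standing in for the proposed std::exception
// additions (move_clone / throw_moved_self).
struct movable_exception {
    virtual ~movable_exception() {}
    virtual std::unique_ptr<movable_exception> move_clone() = 0;
    virtual void throw_moved_self() = 0;
};

// Illustrative exception type carrying a string payload.
struct file_error : movable_exception {
    std::string path;
    explicit file_error(std::string p) : path(std::move(p)) {}

    // Builds the stored exception with the move constructor: the
    // original is about to be destroyed anyway.
    std::unique_ptr<movable_exception> move_clone() override {
        return std::unique_ptr<movable_exception>(
            new file_error(std::move(*this)));
    }
    // Rethrows a moved version instead of a copy.
    void throw_moved_self() override {
        throw file_error(std::move(*this));
    }
};

// The worker catches, move-clones and stores; the caller rethrows
// after the join. Returns the payload seen by the caller.
std::string run_and_report() {
    std::unique_ptr<movable_exception> pending;
    std::thread worker([&pending] {
        try {
            throw file_error("myfile");
        } catch (movable_exception& e) {
            pending = e.move_clone();
        }
    });
    worker.join();
    assert(pending && "the worker always throws in this sketch");
    try {
        pending->throw_moved_self();
    } catch (file_error& e) {
        return e.path;
    }
    return std::string();
}
```

The string payload is moved twice (once into the stored clone, once into the rethrown object) and never copied.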
------------------------------------------------------------
------------------------------------------------------------
Level 1: Asynchronous task handle: aka cheap future.
------------------------------------------------------------
------------------------------------------------------------
-> task<T>: A handle for an asynchronous task that will be executed in
an executor (this can be a thread for each function, a thread pool, or
synchronous execution...).
-> This requires a standard interface for the implementation class, so
that we can plug more executors into the framework.
-> task<T> is a unique_ptr for an asynchronous task and can be joined
only once. This means that the return value is moved.
-> Movability is even more critical than in thread<T>: we might
be executing the task in an efficient thread pool or just synchronously,
and copy/mutex overhead might be noticeable because we are not creating
a thread for each task. The thread pool can be built on the
thread<void> abstraction, so that we can define our own portable executors.
-> We can just define an implementation interface so that users can
easily create their own executors:
template<class T>
class task_impl_interface
{
   public:
   virtual ~task_impl_interface() {}
   virtual T join() = 0;
   virtual void request_cancel() = 0;
   //....
};
//The task is just a lightweight, movable-only holder
//of the implementation. Forwards all operations to
//the implementation.
template<class T>
class task
{
   //....
   std::unique_ptr<task_impl_interface<T> > result_;
   public:
   task(std::unique_ptr<task_impl_interface<T> > impl)
      : result_(std::move(impl))
   {}
   T operator()()
   {  return result_->join();  }
   void request_kind_cancel()
   {  result_->request_cancel();  }
   // ...
};
Deriving from this interface we can have different executors: one thread
per function, a thread-pool, synchronous... This virtual interface is
not the only approach. Peter's approach registering a function object is
also valid. The idea is to have a single task<R> type for any executor.
The virtual interface is just one way to get that, but Peter's approach
also has many advantages, so we should study which approach is better.
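To illustrate deriving executors from the interface, here is a minimal C++17 sketch with two implementations: synchronous execution and one thread per function. The classes sync_impl and thread_impl and the factory create_task_synchronously are hypothetical names, cancellation is left out, and exceptions thrown by the function in the threaded case are not handled:

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <stdexcept>
#include <thread>
#include <utility>

template <class T>
class task_impl_interface {
public:
    virtual ~task_impl_interface() {}
    virtual T join() = 0;
};

// Move-only holder: one task<T> type regardless of the executor.
template <class T>
class task {
    std::unique_ptr<task_impl_interface<T>> impl_;
public:
    explicit task(std::unique_ptr<task_impl_interface<T>> impl)
        : impl_(std::move(impl)) {}
    T operator()() {
        if (!impl_) throw std::logic_error("empty task");
        // Take the implementation so the task is empty afterwards.
        std::unique_ptr<task_impl_interface<T>> once(std::move(impl_));
        return once->join();
    }
};

// Hypothetical executor 1: runs the function inside join().
template <class T, class F>
class sync_impl : public task_impl_interface<T> {
    F f_;
public:
    explicit sync_impl(F f) : f_(std::move(f)) {}
    T join() override { return f_(); }
};

// Hypothetical executor 2: one OS thread per function.
template <class T, class F>
class thread_impl : public task_impl_interface<T> {
    std::optional<T> result_;  // declared before worker_ on purpose
    std::thread worker_;
public:
    explicit thread_impl(F f)
        : worker_([this, fn = std::move(f)]() mutable {
              result_.emplace(fn());
          }) {}
    ~thread_impl() override {
        if (worker_.joinable()) worker_.join();
    }
    T join() override {
        assert(worker_.joinable());
        worker_.join();
        return std::move(*result_);  // the return value is moved out
    }
};

template <class F>
auto create_task_synchronously(F f) -> task<decltype(f())> {
    typedef decltype(f()) T;
    return task<T>(std::unique_ptr<task_impl_interface<T>>(
        new sync_impl<T, F>(std::move(f))));
}

template <class F>
auto create_task_in_a_thread(F f) -> task<decltype(f())> {
    typedef decltype(f()) T;
    return task<T>(std::unique_ptr<task_impl_interface<T>>(
        new thread_impl<T, F>(std::move(f))));
}
```

Both factories return the same task<int> type for a function returning int, so callers cannot tell (and need not care) which executor ran the function. A thread pool would be a third implementation of the same interface.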
Use cases:
task<T> t1 = create_task_in_a_thread(f);
task<T> t2 = create_task_in_a_thread_pool(f);
task<T> t3 = create_task_in_my_own_executor(f);
std::vector<task<T> > pending;
pending.push_back(std::move(t1));
pending.push_back(std::move(t2));
pending.push_back(std::move(t3));
//Task is movable only and can be placed in containers.
//task<T> tnew = pending[0]; //compilation error: task is not copyable
task<T> tnew(std::move(pending[0])); //Ok, task moved from vector to tnew
//Task is a one-shot function call and moves the return value:
//T(T &&) is called
T a = tnew();
//This throws, because the task is empty
pending[0]();
-> The main reason for task<T> is that many times we don't share
ownership of a future between threads and we don't want to pay
for a copy and a mutex lock on each join. The mutex lock would be needed
IMHO in a concurrently joinable future because the copy constructor of a
generic type is not required to be thread-safe at all: it might modify
mutable members in the source object.
-> This scheme of a unique joiner is pretty widespread: we will surely
have a "master" thread launching asynchronous operations, storing
them in a container and waiting until one of them ends. A
reference-counted, concurrently joinable future is only needed if a
group of threads launches asynchronous operations, passes them to other
threads and operates on the same future at the same time. With
task<T> I can pass (using std::move) a task from one thread to another,
but I can only call operator() once, and only from one thread.
-> Exceptions: The same as level 0.
-> Implementability: Easy. Peter has already implemented this approach.
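The point about copy constructors that write to the source object can be made concrete with a small sketch. The type cached_text is a hypothetical result type whose copy constructor updates a mutable member of the source; shared_result shows the lock that a concurrently joinable future would need around every copy (and that a one-shot, moving task<T> avoids):

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical result type: copying it modifies the *source* object.
class cached_text {
    std::string text_;
    mutable std::size_t copies_made_;  // bookkeeping, updated on copy
public:
    explicit cached_text(std::string t)
        : text_(std::move(t)), copies_made_(0) {}
    cached_text(const cached_text& other)
        : text_(other.text_), copies_made_(0) {
        ++other.copies_made_;  // unsynchronised write to the source
    }
    std::size_t copies_made() const { return copies_made_; }
};

// What a concurrently joinable holder has to do: serialise copies.
class shared_result {
    std::mutex m_;
    cached_text value_;
public:
    explicit shared_result(const cached_text& v) : value_(v) {}
    cached_text get() {
        std::lock_guard<std::mutex> lock(m_);
        return value_;  // the copy happens while the lock is held
    }
    std::size_t copies_made() {
        std::lock_guard<std::mutex> lock(m_);
        return value_.copies_made();
    }
};

// Several threads each take joins_each copies of the same result.
std::size_t copies_after_concurrent_joins(int threads, int joins_each) {
    assert(threads > 0 && joins_each > 0);
    shared_result r(cached_text("payload"));
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i)
        pool.emplace_back([&r, joins_each] {
            for (int j = 0; j < joins_each; ++j) r.get();
        });
    for (std::thread& t : pool) t.join();
    return r.copies_made();
}
```

Without the lock in get(), the increments of copies_made_ from different threads would be a data race, even though every caller only ever holds a const view of the shared value.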
------------------------------------------------------------
------------------------------------------------------------
Level 2: Future. Reference-counted value
------------------------------------------------------------
------------------------------------------------------------
-> future<T>: Basically the same as task<T> but can be freely copied
between threads, and you get a copy for each join operation. The copy
operation should be executed holding a mutex to guarantee that every
object is correctly copied when multiple threads are calling join().
-> I'm still thinking about whether we can reuse the same executors as
task<T> and/or convert a task<T> into a future<T>, so that a user can
design an executor returning task<T> that can also be used to obtain
futures.
-> Imagine that future<T> can be constructed from task<T> using move
semantics:
template<class T>
class future
{
//This future overrides task and
//takes control of the operation
future(task<T> &&t);
};
I see future<T> as the shared_ptr equivalent of an asynchronous task.
Converting from task<T> to future<T> can be seen as a conversion from
unique_ptr to shared_ptr:
unique_ptr<T> task( new T);
shared_ptr<T> future (task.release());
task<T> is emptied and morphs into a reference-counted, concurrently
joinable, full-powered future<T>.
-> Implementability: I haven't tested this, but I don't see any big
problem. The future holds a shared_ptr/intrusive_ptr to the task
implementation and holds a lock when calling join(). The real join
value is first moved to local storage and then copied for each join
request. The same can be done with exceptions.
Thoughts? Does a three-level approach look right to you? Is it too
complicated? Too hard to implement?
Regards,
Ion