[cpp-threads] Parallel input from cin etc.

N.M. Maclaren nmm1 at cam.ac.uk
Tue Nov 22 09:52:03 GMT 2011


27.4.1 [iostream.objects.overview] paragraph 4 says:

    Concurrent access to a synchronized (27.5.3.4) standard iostream
    object's formatted and unformatted input (27.7.2.1) ... functions
    or a standard C stream by multiple threads shall not result in a
    data race (1.10). [ Note: Users must still synchronize concurrent
    use of these objects and streams by multiple threads if they wish
    to avoid interleaved characters. -end note ]

27.5.3.4 [ios.members.static] says (roughly) that 'c = fgetc(f)' is
equivalent to 'c = str.rdbuf()->sbumpc();' and 'ungetc(c, f);' is
equivalent to 'str.rdbuf()->sputbackc(c);' for a synchronised stream.
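
To make that equivalence concrete, here is a minimal sketch.  The
names stream_fgetc and stream_ungetc are hypothetical helpers invented
for illustration; only sbumpc() and sputbackc() come from the standard.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>

// Hypothetical helper: the stream-side counterpart of 'c = fgetc(f)'.
// Consumes one character and returns it, or returns EOF.
int stream_fgetc(std::istream& str) {
    return str.rdbuf()->sbumpc();
}

// Hypothetical helper: the stream-side counterpart of 'ungetc(c, f)'.
// Tries to make 'c' the next character to be read again.
int stream_ungetc(int c, std::istream& str) {
    return str.rdbuf()->sputbackc(static_cast<char>(c));
}
```

On a std::istringstream holding "ab", stream_fgetc returns 'a', and
after stream_ungetc('a', ...) the next stream_fgetc returns 'a' again.
Note that the get and the put-back are two separate calls.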

27.7.2.1 [istream] paragraph 2 says "....  Both groups of input
functions are described as if they obtain (or extract) input characters
by calling rdbuf()->sbumpc() or rdbuf()->sgetc(). They may use other
public members of istream."
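
As a rough illustration of what "described as if" means for a number,
the following sketch reads unsigned decimal digits using only sgetc()
and sbumpc().  The function read_digits is hypothetical, it ignores
signs, whitespace and overflow, and it is not how any particular
library implements operator>>.

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: accumulate decimal digits from 'sb' into 'value'.
// Returns false if the next character is not a digit.
bool read_digits(std::streambuf& sb, int& value) {
    typedef std::char_traits<char> traits;
    bool any = false;
    value = 0;
    for (;;) {
        traits::int_type c = sb.sgetc();       // look at the next character
        if (traits::eq_int_type(c, traits::eof()))
            break;
        char ch = traits::to_char_type(c);
        if (ch < '0' || ch > '9')
            break;                             // terminator is left unread
        sb.sbumpc();                           // separate call: consume it
        value = value * 10 + (ch - '0');
        any = true;
    }
    return any;
}
```

Every digit takes two calls, a look and a consume, and the terminating
character has to be inspected before the number is known to be
complete.  Nothing ties those calls together if another thread is
using the same buffer.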

What this does is to state that reading from cin in several threads
without user synchronisation, if cin is a synchronised stream, is not
undefined behaviour.  The problem is that this is, in general,
unimplementable without the LIBRARY adding locking around extractors.
And even that has very serious problems, because it is unclear which
forms of deadlock are the user's error and which are implementation
bugs.  But the standard more than just implies that there is no
locking at that level.
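
To show what "locking around extractors" could look like, here is a
minimal sketch.  The wrapper locked_extract and the single global
mutex input_mutex are assumptions made for illustration; neither is in
the standard, and a real library would presumably need a lock per
stream.

```cpp
#include <istream>
#include <mutex>
#include <sstream>

// Hypothetical: one global lock standing in for a per-stream lock.
std::mutex input_mutex;

// Hypothetical wrapper: hold the lock for the whole of one extraction,
// including any look-ahead the extractor performs.
template <typename T>
std::istream& locked_extract(std::istream& in, T& value) {
    std::lock_guard<std::mutex> guard(input_mutex);
    return in >> value;
}
```

Even with this, locked_extract(cin, n) followed by
locked_extract(cin, c) releases the lock between the two calls, so the
pair in 'cin >> n >> c' is still not atomic with respect to other
threads.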

I am not good at describing such fiendishly complicated implementation
issues, as there is invariably a way round every problem in isolation.
Quite often there is a way round every combination of two, but not
round all of them at once :-(  But I could try, and I have implemented
this for real.  For now, let's consider just a few questions:

    1) Can a single character be 'consumed' by more than one thread?

    2) Can a character be completely ignored (i.e. effectively skipped
over, improperly)?

    3) Can a thread reach EOF, but the final stream still have data
waiting to be read?

    4) If not, how is this supposed to be implemented?
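
Questions 1 to 3 can be illustrated without real threads by replaying
one possible interleaving by hand on a single std::stringbuf.  This is
only a sketch: the thread names A and B, the struct names and the
function names are all hypothetical, and a real implementation may
interleave differently.

```cpp
#include <sstream>
#include <string>

typedef std::char_traits<char> traits;

struct PeekRace { char seen_by_a; char seen_by_b; char next_unread; };

// Questions 1 and 2: both threads look with sgetc(), then both consume.
PeekRace replay_peek_race() {
    std::stringbuf sb("123-456-789-", std::ios_base::in);
    PeekRace r;
    r.seen_by_a = traits::to_char_type(sb.sgetc());  // A looks at '1'
    r.seen_by_b = traits::to_char_type(sb.sgetc());  // B looks at '1' too
    sb.sbumpc();  // A consumes '1'
    sb.sbumpc();  // B consumes what it believes is '1'; it is really '2'
    r.next_unread = traits::to_char_type(sb.sgetc());
    return r;
}

struct PutbackRace { bool b_saw_eof; char left_after_putback; };

// Question 3: A consumes the terminator and puts it back too late.
PutbackRace replay_putback_race() {
    std::stringbuf sb("1-", std::ios_base::in);
    PutbackRace r;
    sb.sbumpc();                                 // A consumes '1'
    traits::int_type term = sb.sbumpc();         // A consumes '-'
    r.b_saw_eof =                                // B finds nothing to read
        traits::eq_int_type(sb.sbumpc(), traits::eof());
    sb.sputbackc(traits::to_char_type(term));    // A returns the '-'
    r.left_after_putback = traits::to_char_type(sb.sgetc());
    return r;
}
```

In the first replay '1' is seen by both A and B while '2' is seen by
neither, and the next unread character is '3'.  In the second, B is
told the input is exhausted and yet '-' is available afterwards.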

Ideally, it would be nice to see an existence proof (that is, a sample
implementation) whose correctness we could debate, covering, say, just
signed base-free integers and single characters, as in my example.  I
don't think that it can be done.

Regards,
Nick Maclaren.


// This is an explanation of why 27.4.1 [iostream.objects.overview]
// paragraph 4 should exclude cin from being safe against data races.
//
// Everywhere I can find (e.g. 27.7.2.1 [istream] paragraph 2), the
// description is solely in terms of character get operations, which
// is inadequate for implementing the specification and behaviour.
//
// Supply the input "123-456-789-" and assume unblocked input.  Now
// what if the iterations were in different threads?  When reading
// a number, a thread necessarily has to read the following '-' in
// order to determine that it is not a digit, so it needs to do the
// equivalent of an ungetc().  But what about the timing?  Specifically:
//
// As someone who has implemented thread-safe input, I can assure
// people that the current specification is either unimplementable
// or self-inconsistent.

#include <iostream>
using namespace std;

int main () {
    int n;
    char c;
    for (int i = 0; i < 3; ++i) {
        cin >> n >> c;
        cout << n << " '" << c << "' ";
    }
    cout << endl;
    return 0;
}




