Threads suck...

- Difficult to understand control flow
- Difficult to test and debug
  - Most synchronisation errors are timing-dependent and only result in breakage under heavy load
- Language and library semantics vaguely defined

...but they're a necessary evil

- Multiple threads are required to make use of multiple processor cores
  - Multicore and hyperthreading processors are becoming standard in desktops and even laptops
  - Servers already tend to have many processors
  - Reference: Sutter 2005
- Inter-thread communication is much cheaper than inter-process
  - Processes can also share memory, but that brings many of the same challenges as multithreading plus the problem of differing virtual addresses (which breaks C++ polymorphism)

How to structure applications into threads

- Easy routes
  - Server: one kernel thread per client
    - Performs badly under heavy use
      - Uses lots of VM space
      - High scheduling and synchronisation overhead
  - Interactive application: one kernel thread per task
    - Often one task uses the majority of the processor time, so this doesn't improve performance much if at all
- Better ways
  - Server: thread pool, one kernel thread per processor
    - Harder to write: the application needs its own scheduler
    - May not interact well with other applications
    - All I/O should be non-blocking (can have extra threads to be released if a running thread is about to block)
  - Interactive application: UI thread plus one thread per processor for processor-intensive work
    - Can be hard to parallelise
  - High-performance computing: use MPI, OpenMP, or other framework for parallelism

Thread synchronisation

- Do it too little and your program is broken
- Do it too much and your threads can spend most of their time waiting for each other
- Mutexes (a.k.a. locks) and condition variables
  - Good building blocks
- Lockless algorithms and data structures
  - For experts only
  - And even they make mistakes (e.g. double-checked locking)
  - Examples:
    - Double-checked locking with volatile in Java
    - Read-Copy-Update

What was that about semantics?

We tend to assume that the operations of our program - reading and writing variables, synchronising, and performing I/O - happen in the same order as written in the program. You probably all know, however, that compilers can reorder operations to make your program run faster. They can also eliminate or duplicate memory accesses in some cases by caching variables in registers. And some processors do not support writing to memory locations of certain sizes in a single operation, so writing to a variable may also involve reading and writing the memory storing other variables. (This is true for most processors in the case of bitfields.)

Aside from this, processors may reorder reads and writes depending on which part of the memory system has the current version of that line of memory (RAM, own cache, another processor's cache, etc.), within some restrictions.

All this is OK in a single-threaded program - though reordering by the compiler must be borne in mind when writing signal handlers. However, it can introduce disastrous race conditions into a multithreaded program (or one that shares memory with other processes). Synchronisation primitives such as mutexes deal with the processor reordering, but may not inhibit compiler reordering.

In order for a language to support multithreading properly, it must have a memory model that defines to what extent reordering is possible and how the programmer can limit it.

Reference: Boehm 2004

Few languages have such models yet, with the notable exception of Java. The C and C++ standards describe a single-threaded abstract machine, and the C binding for POSIX threads speaks vaguely in terms of "memory locations" rather than language semantics. However, there is active work on such a memory model for C++, some of which may be applicable to C. C#'s memory model is only vaguely specified.
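
To make the reordering problem concrete, here is a minimal sketch of the double-checked locking idiom in Java, the one language named above as having a defined memory model. It assumes the revised model from JSR-133 (Java 5 onwards); under the earlier model the idiom was not reliable even with volatile. The class names Config and DoubleCheckedLockingDemo, the construction counter, and the value 42 are all hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; none of these names come from the text.
final class Config {
    // Counts constructor calls so the demo can check single initialisation.
    static final AtomicInteger constructions = new AtomicInteger();

    // Without volatile, the compiler or processor may make the reference
    // visible before the constructor's writes, so another thread could
    // see a partially constructed object. With volatile (JSR-133), the
    // write below happens-before any read that observes it.
    private static volatile Config instance;

    private final int value;

    private Config() {
        constructions.incrementAndGet();
        value = 42;
    }

    static Config getInstance() {
        Config local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (Config.class) {
                local = instance;         // second check, under the lock
                if (local == null) {
                    local = new Config();
                    instance = local;     // volatile write publishes the object
                }
            }
        }
        return local;
    }

    int value() {
        return value;
    }
}

class DoubleCheckedLockingDemo {
    public static void main(String[] args) throws InterruptedException {
        // Several threads race to initialise the singleton.
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(Config::getInstance);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("constructed " + Config.constructions.get() + " time(s)");
    }
}
```

Running the demo prints "constructed 1 time(s)". A passing run proves little on its own, though: dropping volatile would usually give the same output, because the failure is timing-dependent and varies by architecture, which is exactly why the guarantee has to come from the language's memory model and not from testing.
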
For higher-level languages there may or may not be a problem to resolve.

References: Manson et al. 2004, 2005; Boehm 2006

How does this relate to Debian?

If you're a maintainer for a program that uses multithreading - and more and more programs do - you are likely to see increasing numbers of bug reports relating to synchronisation errors and maybe problems with language semantics as SMP becomes the norm. Debian's support for multiple architectures with different processor memory models also means that some such bugs will only appear on some architectures.

Further reading

Boehm, Threads Cannot Be Implemented As a Library, HP Technical Report 2004
http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html

Sutter, The Free Lunch is Over, Dr. Dobb's Journal 2005
http://www.gotw.ca/publications/concurrency-ddj.htm

Manson et al., Java Memory Model and Thread Specification (JSR-133), Java Community Process 2004
http://jcp.org/aboutJava/communityprocess/final/jsr133/index.html

Manson et al., The Java Memory Model, Principles of Programming Languages 2005
http://rsim.cs.uiuc.edu/Pubs/popl05.pdf

Boehm, Threads and memory model for C++, personal web site 2006
http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/