- Old Stuff
- Optional extensions to basic standard
- Status of Compilers
There is now a Draft Chapter under construction.
These are proposals on which we are generally agreed so far, with brief rationale.
- Some kind of concurrency will be included in the spec, including MVars for communication.
At least the following interface will be provided:
- We need a thread-safe interface to mutable state, for use in library code that does not otherwise use concurrency.
We have two choices:
- Use MVars. A non-concurrent implementation might implement them in terms of IORef, for example.
- Use STM. Easier to use, but not entirely trivial to implement, even in a single-threaded implementation, because exceptions have to abort a transaction (sample implementation).
- Concurrent foreign calls are required. A concurrent foreign call allows other Haskell threads to make progress before the foreign call returns.
- concurrent foreign calls are required to guarantee progress of other Haskell threads when one thread makes a blocking call.
- concurrent foreign calls are required for implementing I/O multiplexing, a principal use of concurrency.
- concurrent foreign calls are required to guarantee timely responsiveness of an interactive application in the presence of long-running foreign calls.
- Concurrent/reentrant foreign calls are required. A reentrant foreign call is a foreign call that calls a foreign-exported Haskell function. A concurent/reentrant foreign call is both concurrent and reentrant.
Haskell system must be able to process call-ins from arbitrary
external OS threads.
- the main loop of a GUI may block (hence concurrent) and makes callbacks (hence reentrant), we need to support this kind of usage.
- providing concurrent/reentrant foreign calls does not impose significant extra overhead on the rest of the system. For example, a call-in can check a thread-local variable (fast) to see whether it arose from a foreign call.
- Foreign calls will be able to specify independently whether they
are concurrent, reentrant, or both. (syntax and the sense of the annotations are still to be decided, see below).
- these annotations can have a profound impact on performance in some implementations.
- Bound threads are not required, but allowed as an extension, and we will specify their meaning.
There follows a list of issues on which decisions are still to be made. A numbered item, eg. 1.2, indicates a question, an item ending in a letter, eg. 1.2.a, indicates a possible choice for question 1.2.
1. Cooperative or preemptive concurrency?
Choice 1.a. The spec requires cooperative concurrency, and preemption is allowed as an extension. Both would be specified precisely in terms of what progress and fairness guarantees the programmer can expect.
- Allows many more implementations, including Hugs (although Hugs needs to be updated to handle concurrent and concurrent/reentrant foreign calls, and non-blocking I/O).
- Preemption isn't always required; a common case is an application that relies on concurrency for I/O multiplexing, where most threads are usually blocked.
- Cooperative systems can be faster, and are simpler to implement (see state threads reference).
- it is fairly difficult to tell the difference between a cooperative (with progress guarentee) and a preemptive system in practice on non contrived programs.
- STM is trivial and much more efficient to implement. no need to rollback ever except on an explicit exception. Can be almost as fast as IORefs.
- Portability problems: if a programmer develops a concurrent application on a preemptive system, there is no guarantee that it will work as expected on a cooperative system, and the compiler/runtime can give no useful feedback.
- Need to specify which operations are "yield points" in library documentation.
- Long-running pure code must be refactored into the IO monad so that explicit yield points can be inserted.
- Lazy I/O is not non-blocking in a cooperative implementation, so lazy I/O would be restricted to single-threaded programs.
- Extra restrictions: inside unsafePerformIO we can't guarantee that yield works, or that concurrent and/or reentrant foreign calls work as expected. (ditto unsafeInterleaveIO, which is why lazy I/O doesn't play with cooperative concurrency).
Choice 1.b. Preemption is required by the spec.
- Simpler from the programmer's point of view: no yield, no worrying about latency. "write code as if the current thread is the only one.".
- Imposes significant implementation constraints. Essentially only GHC and YHC would be able to implement it. JHC has no concept of thunks, which is a barrier to implementing general preemption.
- Even in a preemptive system, deadlocks are easy to program, and arbitrary starvation can result from laziness: evaluating arbitrary expressions while holding an MVar can prevent other threads from running. The fact that we therefore require seq and possibly deepSeq is disturbing, as is the notion that the programmer must think about "what is evaluated" when programming concurrent code.
- The main benefit of preemption, latency, cannot be guarenteed anyway.
1.b.1. Include thread priorities or not?
- Some applications require it
- It affects the fairness/progress guarantees, including the possibility of priorities from the outset may be simpler.
- Need to decide how rigidly to specify them. are they advisory? do we require priority inversion protection?
- Can be hard to implement, no implementations yet.
2. Syntax for foreign call annoatations.
2.1. choices for concurrent calls:
Choice 2.1.a. we annotate concurrent calls:
Rationale for using the term "block": blocking is the main reason for wanting concurrent calls. Concurrent calls allow the progress guarantee to be retained in the presence of a blocking foreign call. A foreign call that just takes a long time is still making progress.
Rationale for not using the term "block": the fact that the call blocks is immaterial, the property we want to provide is that it doesn't impede progress of other Haskell threads. A long-running call is indistinguishable from a blocked call in terms of the progress of other threads.
We often don't know whether a library call will block or not (it isn't documented), whereas saying a call should run concurrently with other threads is a choice the programmer can reasonably make.
Choice 2.1.b. we annotate non concurrent calls:
Rationale for annotating the non-concurrent calls: this is a performance issue. It is always correct to make a concurrent call, but it might be more efficient to make a non-concurrent call if the call does not block. An implementation might implement all calls as concurrent, for simplicity.
Against: John Meacham says "The FFI is inherently unsafe. We do not need to coddle the programer who is writing raw FFI code."
2.2. choices for non-reentrant calls:
Rationale for annotating the non-reentrant calls, as opposed to the reentrant ones: we want the "safe" option to be the default (as in the FFI spec).
2.3. should we annotate foreign calls according to whether they need to access thread-local state (TLS) or not?
- a call that doesn't need access to thread-local state, called from a bound thread, can be executed much more quickly on an implementation that doesn't run the Haskell thread directly on the bound OS thread, because it doesn't need to context switch.
- libraries that require TLS, eg. OpenGL, often have many fast TLS-using functions. So implementations that need the no-TLS annotation in order to get good performance, will probably still get poor performance from libraries that need TLS anyway.
3. Semantics of IORefs
MVar operations must be strictly ordered; that is, a thread must never observe MVar operations performed by another thread out of order.
We have a choice when it comes to IORefs, however. (Note that this only affects true multiprocessor implementations of concurrent Haskell).
Choice 3.a Specify a weak memory model, in which IORef updates may be observed out of order, but specify that certain operations (eg. MVar operations) constitute sequence points around which no re-ordering may happen.
Choice 3.b Specify a strong memory model in which no re-ordering is observable.
- Some processors provide this anyway (current generations of x86, x86-64)
- The implementation will require some synchronisation in any case in order to prevent threads from observing partially-written closures. For example, if one thread builds a closure and writes its address into an IORef, there must be a write barrier (and possibly a read barrier depending on the CPU) to prevent other threads from following the pointer and not finding the closure at the end of it. This synchronisation may be enough to provide the strong memory model anyway.
- Strong memory models are easier to program with, and leave fewer possibilities for a program to behave unexpectedly on a different processor or Haskell implementation.
- The Control.Concurrency module
- Concurrent Haskell (the original paper, including a semantics)
- Extending the Haskell FFI with Concurrency (a specification of the interaction between concurrency and the FFI, with a semantics)
- Software Transactional Memory
- Cheap (but Functional) Threads William L. Harrison. (a threading library entirely in user-space based on resumption monads, including support for system calls)
- A Language-based Approach to Unifying Events and Threads Peng Li and Steve Zdancewic. (another user-space implementation of concurrency, demonstrating efficiency and multiprocessor scalability).
What follows is old material, which we will probably want to incorporate into the main text later.
- if any haskell thread is runnable then at least one thread will be running.
- foreign calls marked 'concurrent' will not interfere will the above rule.
Additionally, we have a fairness property of MVars:
- every thread is guarenteed to run in a finite amount of time if a program yields infinitly often. (weak fairness)
- A thread blocked on an MVar will eventually run, provided there are no other threads holding the MVar indefinitely. (this is implied by the previous rule)
This means that MVar blocking must be implemented in a fair way, eg. a FIFO of blocked threads.
In order to meet the progress guarentee, an implementation must yield to another thread, waiting for an appropriate event, before any action that entails blocking for an indeterminate amount of time.
initial proposal is here:
Alternate, simpler proposal: full memory barrier at every putMVar and takeMVar.
perhaps a better phrasing of the first proposal exists, in practice, from a users point of view, it would be hard to tell the difference between the two models, but we should say something concrete on the matter.
Misc library stuff
yield is guarenteed to choose an alternate thread if another one exists and is runnable.
sleep guarentees the thread will wait as long as its argument at a minimum. it may be blocked for longer.
I/O operations from System.IO, System.Directory, System.Process (and others?) do not prevent other threads from making progress when they are waiting for I/O to complete.
We could provide a lower-level non-blocking I/O interface along the lines of threadWaitRead, threadWaitWrite, perhaps in Control.Concurent.IO.
Optional extensions to basic standard
These are optional extensions a compiler may implement. In some implementations they may entail a run-time cost to non-concurrent code or a compiler might need a special option to enable them. However, A compiler is not required to provide more than one concurrency model as long as it can meet the requirements of the standard and any options it claims to support.
If a compiler documents that it supports one of the following options, then it must adhere to the rules of that option as well.
Optional Feature 1 - Preemption
The standard only requires a progress guarentee, that a thread is always running, making progress. If an implementation supports context switching during arbitrary computations and meets the stronger fairness guarentee below, then it can be said to support the 'Preemption' option.
- no starvation
new library calls provided
- mergeIO, nmergeIO
Possibly: define the meaning of using concurrent and/or reentrant foreign calls with unsafePerformIO (see below)?
Optional Feature 2 - OS threads
The implementation additionally allows the following:
- foreign exported functions, and function pointers created by foreign import "wrapper", can be invoked from multiple OS threads
- bound threads: forkOS, isCurrentThreadBound, runInBoundThread, runInUnboundThread
- concurrent/reentrant foreign calls are supported
Although not mentioned in the standard, the use of Concurrency may affect the lazy sharing of computations. Consult an implementations documentation if this might be an issue for you.
The following operations inside unsafePerformIO are undefined:
- concurrent or reentrant foreign calls
- I/O operations
Additionally, foreign imports that are both declared to return a non-IO type and declared concurrent and/or reentrant are undefined.
Check an implementation's documentation for details before depending on any particular behavior.
Status of Compilers
just coroutines (does not meet standard): hugs (standard concurrency planned?)
standard concurrency: jhc(planned)
standard + preemptive: ghc, yhc
standard + preemptive + OS threads: "ghc -threaded"