Version 3 (modified by Simon Peyton Jones, 13 years ago) (diff)

# Bang patterns

## Goal

Our goal is to make it easier to write strict programs in Haskell. Programs that use 'seq' extensively are possible but clumsy, and it's often quite awkward or inconvenient to increase the strictness of a Haskell program. This proposal changes nothing fundamental; but it makes strictness more convenient.

## The basic idea

The main idea is to add a single new production to the syntax of patterns

```	pat ::= !pat
```

Matching an expression e against a pattern !p is done by first evaluating e (to WHNF) and then matching the result against p.

Example:

```f1 !x = True
```

f1 is strict in x. In this case, it's not so bad to write

```f1 x = x `seq` True
```

but when guards are involved the `seq` version becomes horrible:

```-- Duplicate the seq on y
f2 x y | g x = y `seq` rhs1
| otherwise = y `seq` rhs2

-- Have a weird guard
f2 x y | y `seq` False = undefined
| g x = rhs1
| otherwise = rhs2

-- Use bang patterns
f2 !x !y | g x = rhs1
| otherwise = rhs2
```

Bang patterns can be nested of course:

```f2 (!x, y) = [x,y]
```

f2 is strict in x but not in y. A bang only really has an effect if it precedes a variable or wild-card pattern:

```f3 !(x,y) = [x,y]
f4 (x,y)  = [x,y]
```

f3 and f4 are identical; putting a bang before a pattern that forces evaluation anyway does nothing.

```g5 x = let y = f x in body
g6 x = case f x of { y -> body }
g7 x = case f x of { !y -> body }
```

g5 and g6 mean exactly the same thing. But g7 evalutes (f x), binds y to the result, and then evaluates body.

## Non-recursive let and where bindings

First, we deal with non-recursive bindings only. In Haskell, let and where bindings can bind patterns. Their semantics is given by translation to a case, with an implicit implicit tilde (~).

```let [x,y] = e in b
```

means

```case e of { ~[x,y] -> b }
```

In our proposal, if there is a bang right at the top of a let-bound pattern, then the implicit tilde is replaced with an explicit bang:

```let ![x,y] = e in b
```

means

```case e of { ![x,y] -> b }
```

which is the same as

```case e of { [x,y] -> b }
```

Similarly

```let !y = f x in b
```

means

```case f x of { !y -> b }
```

which evaluates the (f x), thereby giving a strict `let`.

## Examples

Here is a more realistic example, a strict version of partition:

```partitionS p [] = ([], [])
partitionS p (x:xs)
| p x       = (x:ys, zs)
| otherwise = (ys, x:zs)
where
!(ys,zs) = partitionS p xs
```

The bang in the where clause ensures that the recursive call is evaluated eagerly. In Haskell today we are forced to write

```partitionS p [] = ([], [])
partitionS p (x:xs)
= case partitionS p xs of
(ys,zs) | p x       = (x:ys, zs)
| otherwise = (ys, x:zs)
```

which is clumsier (especially if there are a bunch of where-clause bindings, all of which should be evaluated strictly), and doesn't provide the opportunity to fall through to the next equation (not needed in this example but often useful).

## Top-level bang-pattern bindings

Does this make sense?

```  module Foo where
!x = factorial 1000
```

A top-level bang-pattern binding like this would imply that the binding is evaluated when the program is started; a kind of module initialisation. This makes some kind of sense, since (unlike unrestricted side effects) it doesn't matter in which order the module initialisation is performed.

But it's not clear why it would be necessary or useful. Conservative conclusion: no top-level bang-patterns.

## Recursive let and where bindings

It is just about possible to make sense of bang-patterns in recursive let and where bindings. At first you might think they don't make sense: how can you evaluate it strictly if it doesn't exist yet? But consider

```  let !xs = if funny then 1:xs else 2:xs in ...
```

Here the binding is recursive, but makes sense to think of it as equivalent to

```  let xs = if funny then 1:xs else 2:xs
in xs `seq` ...
```

But (a) it must be unusual, and (b) I have found it very hard to come up with a plausible translation (that deals with all the tricky points).

Conservative conclusion: bang-pattern bindings must be non-recursive.

### Tricky point: syntax

What does this mean?

```   f ! x = True
```

Is this a definition of `(!)` or a banged argument? (Assuming that space is insignificant.)

Proposal: resolve this ambiguity in favour of the bang-pattern. If you want to define `(!)`, use the prefix form

```   (!) f x = True
```

### Tricky point: nested bangs

Consider this:

```  let !(x, Just !y) = <rhs> in <body>
```

This should be equivalent to

```  case <rhs> of { (x, Just !y) -> <body> }
```

Notice that this meant that the entire pattern is matched (as always with Haskell). The `Just` may fail; `x` is not evaluated; but `y` is evaluated.

## Tricky point: polymorphism

```  let f :: forall a. Num a => a->a
Just f = <rhs>
in (f (1::Int), f (2::Integer))
```

But if we were to allow a bang pattern, `!Just f = <rhs>`, with the translation to a case expression given earlier, we would end up with

```   case <rhs> of { Just f -> (f (1::Int), f (2::Integer) }
```

But if this is Haskell source, then `f` won't be polymorphic.

## Changes to the Report

The changes to the Report would be these

• Section 3.17, add pat ::= !pat to the syntax of patterns. We would need to take care to make clear whether
```f !x = 3
```
was a definition of the function "!", or of "f". (There is a somewhat similar complication with n+k patterns; see the end of 4.3.3.2 in the Report. However we probably do not want to require parens thus
```f (!x) = 3
```
which are required in n+k patterns.
• Section 3.17.2: add new bullet 10, saying "Matching the pattern "!pat" against a value "v" behaves as follows:
• if v is bottom, the match diverges
• otherwise, "pat" is matched against "v".
• Fig 3.1, 3.2, add a new case (t):
```case v of { !pat -> e; _ -> e' }
= v `seq` case v of { pat -> e; _ -> e' }
```
• Section 3.12 (let expressions). In the translation box, wherever there is a bang right at the top of a pattern on the LHS of the translation rule, omit the implicit tilde that occurs at the top of the pattern on the RHS of the rule.

The last point is the only delicate one. If we adopt this proposal we'd need to be careful about the semantics. For example, are bangs ok in recursive bindings? (Yes.) And for non-recursive bindings the order of the bindings shouldn't matter.

#76
Bang patterns