[[PageOutline]]
= Bang patterns =
== Tickets ==
[[TicketQuery(description~=BangPatterns)]]
== Goal ==
Our goal is to make it easier to write strict programs in
Haskell. Programs that use 'seq' extensively are possible but clumsy,
and it's often quite awkward or inconvenient to increase the strictness
of a Haskell program. This proposal changes nothing fundamental;
but it makes strictness more convenient.
== The basic idea ==
The main idea is to add a single new production to the syntax of
patterns
{{{
pat ::= !pat
}}}
Matching an expression e against a pattern !p is done by first
evaluating e (to WHNF) and then matching the result against p.
Example:
{{{
f1 !x = True
}}}
f1 is strict in x. In this case, it's not so bad to write
{{{
f1 x = x `seq` True
}}}
but when guards are involved the {{{seq}}} version becomes horrible:
{{{
-- Duplicate the seq on y
f2 x y | g x = y `seq` rhs1
| otherwise = y `seq` rhs2
-- Have a weird guard
f2 x y | y `seq` False = undefined
| g x = rhs1
| otherwise = rhs2
-- Use bang patterns
f2 !x !y | g x = rhs1
| otherwise = rhs2
}}}
Bang patterns can be nested of course:
{{{
f2 (!x, y) = [x,y]
}}}
f2 is strict in x but not in y. A bang only really has an effect
if it precedes a variable or wild-card pattern:
{{{
f3 !(x,y) = [x,y]
f4 (x,y) = [x,y]
}}}
f3 and f4 are identical; putting a bang before a pattern that
forces evaluation anyway does nothing.
{{{
g5 x = let y = f x in body
g6 x = case f x of { y -> body }
g7 x = case f x of { !y -> body }
}}}
g5 and g6 mean exactly the same thing. But g7 evalutes (f x), binds y to the
result, and then evaluates body.
== Let and where bindings ==
In Haskell, let and where bindings can bind patterns. We propose to modify
this by allowing an optional bang at the top level of the pattern. Thus for example:
{{{
let ![x,y] = e in b
}}}
The "`!`" should not be regarded as part of the pattern; after all,
in a function argument `![x,y]` means the same as `[x,y]`. Rather, the "`!`"
is part of the syntax of `let` bindings.
The semantics is simple; the above program is equivalent to:
{{{
let p@[x,y] = e in p `seq` b
}}}
That is, for each bang pattern, invent a variable `p`, bind it to the
banged pattern (removing the bang) with an as-pattern, and `seq` on it
in the body of the `let`. (Thanks to Ben Rudiak-Gould for suggesting
this idea.)
A useful special case is when the pattern is a variable:
Similarly
{{{
let !y = f x in b
}}}
means
{{{
let y = f x in y `seq` b
}}}
which evaluates the `(f x)`, thereby giving a strict `let`.
A useful law is this. A ''non-recursive'' bang-pattern binding
is equivalent to a `case` expression:
{{{
let !p = in
is equivalent to (when the let is non-recursive)
case of !p ->
}}}
Here is a more realistic example, a strict version of partition:
{{{
partitionS p [] = ([], [])
partitionS p (x:xs)
| p x = (x:ys, zs)
| otherwise = (ys, x:zs)
where
!(ys,zs) = partitionS p xs
}}}
The bang in the where clause ensures that the recursive
call is evaluated eagerly. In Haskell today we are forced to
write
{{{
partitionS p [] = ([], [])
partitionS p (x:xs)
= case partitionS p xs of
(ys,zs) | p x = (x:ys, zs)
| otherwise = (ys, x:zs)
}}}
which is clumsier (especially if there are a bunch of where-clause
bindings, all of which should be evaluated strictly), and doesn't
provide the opportunity to fall through to the next equation (not
needed in this example but often useful).
----
= Changes to the Report =
The changes to the Report would be these. (Incomplete.)
* Section 3.17, add pat ::= !pat to the syntax of patterns.
We would need to take care to make clear whether
{{{
f !x = 3
}}}
was a definition of the function "!", or of "f". (There is
a somewhat similar complication with n+k patterns; see the
end of 4.3.3.2 in the Report. However we probably do not
want to require parens thus
{{{
f (!x) = 3
}}}
which are required in n+k patterns.
* Section 3.17.2: add new bullet 10, saying "Matching
the pattern "!pat" against a value "v" behaves as follows:
* if v is bottom, the match diverges
* otherwise, "pat" is matched against "v".
* Fig 3.1, 3.2, add a new case (t):
{{{
case v of { !pat -> e; _ -> e' }
= v `seq` case v of { pat -> e; _ -> e' }
}}}
* Section 3.12 (let expressions). In the translation box, first apply
the following transformation: for each pattern pi that is of
form `!qi = ei`, transform it to `xi@qi = ei`, and and replace `e0`
by {{{(xi `seq` e0)}}}. Then, when none of the left-hand-side patterns
have a bang at the top, apply the rules in the existing box.
----
= Discussion =
== Recursive let and where bindings ==
At first you might think that a recursive bang pattern
don't make sense: how can you evaluate it strictly if it doesn't exist
yet? But consider
{{{
let !xs = if funny then 1:xs else 2:xs in ...
}}}
Here the binding is recursive, but the translation still makes sense:
{{{
let xs = if funny then 1:xs else 2:xs
in xs `seq` ...
}}}
== Top-level bang-pattern bindings ==
Does this make sense?
{{{
module Foo where
!x = factorial 1000
}}}
A top-level bang-pattern binding like this would imply that
the binding is evaluated when the program is started; a kind of
module initialisation. This makes some kind of sense, since
(unlike unrestricted side effects) it doesn't matter in which order
the module initialisation is performed.
But it's not clear why it would be necessary or useful. '''Conservative
conclusion''': no top-level bang-patterns.
== Tricky point: syntax ==
What does this mean?
{{{
f ! x = True
}}}
Is this a definition of `(!)` or a banged argument? (Assuming that
space is insignificant.)
Proposal: resolve this ambiguity in favour of the bang-pattern. If
you want to define `(!)`, use the prefix form
{{{
(!) f x = True
}}}
Another point that came up in implementation is this. In GHC, at least,
''patterns'' are initially parsed as ''expressions'', because Haskell's syntax
doesn't let you tell the two apart until quite late in the day.
In expressions, the form {{{(! x)}}} is a right section, and parses fine.
But the form {{{(!x, !y)}}} is simply illegal. Solution:
in the syntax of expressions, allow sections without the wrapping parens
in explicit lists and tuples. Actually this would make sense generally:
what could {{{(+ 3, + 4)}}} mean apart from a tuple of two sections?
== Tricky point: pattern-matching semantics ==
A bang is part of a ''pattern''; matching a bang forces evaluation.
So the exact placement of bangs in equations matters. For example,
there is a difference between these two functions:
{{{
f1 x True = True
f1 !x False = False
f2 !x True = True
f2 x False = False
}}}
Since pattern matching goes top-to-bottom, left-to-right,
{{{(f1 bottom True)}}} is {{{True}}}, whereas {{{(f2 bottom True)}}}
is {{{bottom}}}.
== Tricky point: binding transformations ==
In Haskell 98, these two bindings are equivalent:
{{{
{ p1=e1; p2=e2 } and { (~p1,~p2) = (e1,e2) }
}}}
But with bang patterns this transformation only holds if `p1`, `p2` are
not bang-patterns. Remember, the bang is part of the binding, not the pattern,
== Tricky point: exceptions ==
Consider this program
{{{
test = "no exception"
where !_ = error "top down"
!_ = error "bottom up"
eval_order = evaluate test `catch` \e ->
case e of
ErrorCall txt -> return txt
_ -> throw e
}}}
Depending on whether the strict bindings induced by the bang patters in `test` are evaluated top-to-bottom or bottom-to-top, you'll get different results from `eval_order`. But this is exactly the point of of the [http://research.microsoft.com/%7Esimonpj/Papers/imprecise-exn.htm Imprecise Exceptions paper]: the semantics of `test` is a set of exception values, rather than a unique one. So this non-determinism (which can only be observed in the IO monad) is fine.
See this [http://article.gmane.org/gmane.comp.lang.haskell.general/15751 email thread]
== Tricky point: nested bangs (part 1) ==
Consider this:
{{{
let (x, Just !y) = in
}}}
Is `y` evaluted before `` is begun? No, it isn't. That would be
quite wrong. Pattern matching in a `let` is lazy; if any of the variables bound
by the pattern is evaluated, then the whole pattern is matched. In this example,
if `x` or `y` is evaluated, the whole pattern is matched, which in turn forces
evaluation of `y`. The binding is equivalent to
{{{
let t =
x = case t of { (x, Just !y) -> x }
y = case t of { (x, Just !y) -> y }
in
}}}
== Tricky point: nested bangs (part 2) ==
Consider this:
{{{
let !(x, Just !y) = in
}}}
This should be equivalent to
{{{
case of { (x, Just !y) -> }
}}}
Notice that this meant that the '''entire''' pattern is matched
(as always with Haskell). The `Just` may fail; `x` is
not evaluated; but `y` '''is''' evaluated.
This means that you can't give the obvious alternative translation that uses
just let-bindings and {{{seq}}. For example, we could attempt to translate
the example to:
{{{
let
t =
x = case t of (x, Just !y) -> x
y = case t of (x, Just !y) -> y
in
t `seq`
}}}
This won't work, because using `seq` on `t` won't force `y`.
However, the semantics says that the original code is equivalent to
{{{
let p@(x, Just !y) = in p `seq`
}}}
and we can desugar that in obvious way to
{{{
let
t =
p = case t of p@(x, Just !y) -> p
x = case t of p@(x, Just !y) -> x
y = case t of p@(x, Just !y) -> y
in
p `seq`
}}}
which is fine.
You could also build an intermediate tuple, thus:
{{{
let
t = case of p@(x, Just !y) -> (p,x,y)
p = sel13 t
x = sel23 t
y = sel33 t
in
t `seq`
}}}
Indeed GHC does just this for complicated pattern bindings.
== Tricky point: polymorphism ==
Haskell allows this:
{{{
let f :: forall a. Num a => a->a
Just f =
in (f (1::Int), f (2::Integer))
}}}
But if we were to allow a bang pattern, `!Just f = `,
with the translation to
a case expression given earlier, we would end up with
{{{
case of { Just f -> (f (1::Int), f (2::Integer) }
}}}
But if this is Haskell source, then `f` won't be polymorphic.
One could say that the translation isn't required to preserve the static
semantics, but GHC, at least, translates into System F, and being able
to do so is a good sanity check. If we were to do that, then we would
need
{{{
:: Maybe (forall a. Num a => a -> a)
}}}
so that the case expression works out in System F:
{{{
case of {
Just (f :: forall a. Num a -> a -> a)
-> (f Int dNumInt (1::Int), f Integer dNumInteger (2::Integer) }
}}}
The trouble is that `` probably has a type more like
{{{
:: forall a. Num a => Maybe (a -> a)
}}}
...and now the dictionary lambda may get in the way
of forcing the pattern.
This is a swamp. '''Conservative conclusion''': no generalisation (at all)
for bang-pattern bindings.
== Existentials ==
A strict pattern match could perhaps support existentials (which GHC currently
rejects in pattern bindings):
{{{
data T where
T :: a -> (a->Int) -> T
f x = let !(T y f) = x in ...
}}}
----
= A more radical proposal =
A more radical proposal (from John Hughes) is to make pattern matching
in `let` strict. Thus
{{{
let (x,y) = e in b
}}}
would be equivalent to
{{{
case e of (x,y) -> b
}}}
(in the non-recursive case) and to
{{{
let t = let { x = fst t; y = snd t } in e
in case t of (x,y) -> b
}}}
(in the recursive case).
To recover Haskell 98 semantics for a pattern binding, use a tilde:
{{{
let ~(x,y) = e in b
}}}
More precisely, the rules for desugaring binding are these:
1. Sort into strongly-connected components.
2. Apply the following rules:
{{{
let p1=e1 ; ... ; pn = en in e ==> let (p1,...,pn) = (e1,...,en) in e
let p = e1 in e0 ==> case e1 of p -> e0
where no variable in p occurs free in e1
let p = e1 in e0 ==> let p = fix (\ ~p -> e1) in e0
otherwise
}}}
== Why the strongly-connected component? ==
Consider this
{{{
let (y:ys) = xs
(z:zs) = ys
in ...
}}}
There is no real recursion here, but considered all together the
second binding does refer to the first. So the desugaring above would use
fix, and if you work it out you'll find that all the variables get bound to bottom.
Note that the ''static'' semantics already requires a strongly-conneted component
analysis.