3 | | Thi |
| 3 | This page describes a rather lightweight proposal for adding views to |
| 4 | Haskell Prime. I'm thinking of prototyping the idea in GHC, so I'm looking |
| 5 | for feedback. |
| 6 | |
| 7 | == The problem == |
| 8 | |
| 9 | We are keen on abstraction, but pattern matching is so convenient that |
| 10 | we break abstractions all the time. It's our dirty little secret. |
| 11 | Looked at this way, object-oriented folk are much more obsessive |
| 12 | about abstraction than we are: everything (including field access |
| 13 | these days) is a method. |
| 14 | |
| 15 | Views have, in one form or another, repeatedly been proposed as a |
| 16 | solution for this problem. (See the end for a comparison with related work.) |
| 17 | |
| 18 | == The proposal informally == |
| 19 | |
| 20 | The proposal introduces a new form of pattern, called a '''view pattern''' |
| 21 | Here are some function definitions using view patterns. |
| 22 | To read these definitions, imagine that `$sing` is |
| 23 | a sort of constructor that matches singleton lists. |
| 24 | {{{ |
| 25 | f :: [Int] -> Int |
| 26 | f (sing -> n) = n+1 -- Equiv to: f [x] = ... |
| 27 | f other = 0 |
| 28 | |
| 29 | g :: [Bool] -> Int |
| 30 | g (sing -> True) = 0 -- Equiv to: g [True] = ... |
| 31 | g (sing -> False) = 1 -- Equiv to: g [False] = ... |
| 32 | g other = 2 |
| 33 | |
| 34 | h :: [[Int]] -> Int |
| 35 | h ($sing -> x : $sing -> y : _) = x+y |
| 36 | -- Equiv to: h ([x]:[y]:_) = ... |
| 37 | h other = 0 |
| 38 | }}} |
| 39 | So what is `sing`? It is just an ordinary Haskell function that |
| 40 | returns a `Maybe` type: |
| 41 | {{{ |
| 42 | sing :: [a] -> Maybe a |
| 43 | sing [x] = Just x |
| 44 | sing other = Nothing |
| 45 | }}} |
| 46 | So `sing` simply identifies singleton lists, and returns the payload (that is, |
| 47 | the singleton element; otherwise it returns `Nothing`. |
| 48 | It is very important that '''there is nothing special about `sing`'''. It is |
| 49 | not declared to be a view; it can be called as a normal Haskell function; the author |
| 50 | of `sing` might not have intended it to be used in pattern matching. |
| 51 | |
| 52 | --------------------------- |
| 53 | == The proposal more formally == |
| 54 | |
| 55 | The only special stuff is in the pattern. |
| 56 | The sole change is this: add a single new sort of pattern, of the |
| 57 | form |
| 58 | (''expr'' `->` ''pat'') |
| 59 | |
| 60 | where ''expr'' is an arbitrary Haskell expression. I'll call a pattern |
| 61 | of this form a '''view pattern'''. |
| 62 | |
| 63 | From a '''scoping''' point of view, the variables bound by the pattern (''expr'' `->` ''pat'') |
| 64 | are simply the variables bound by ``pat``. |
| 65 | Any variables in ``expr`` are bound occurrences. |
| 66 | |
| 67 | The rule for '''pattern-matching''' is this: |
| 68 | To match a value ''v'' against a pattern ''($expr -> p)'', |
| 69 | * Evaluate ''(expr v)'' |
| 70 | * If the result is ''(`Just` w)'', match ''w'' against ''p'' |
| 71 | * If the result is `Nothing`, the match fails. |
| 72 | |
| 73 | The '''typing rule''' is similarly simple. |
| 74 | The expression ''expr'' must have type |
| 75 | ''t1 `-> Maybe` t2''. Then the pattern ''pat'' must have type ''t2'', and the |
| 76 | whole pattern (''expr'' `->` ''pat'') has type ''t1''. |
| 77 | |
| 78 | === The value input feature === |
| 79 | |
| 80 | Note that the ''expr'' is an arbitrary Haskell expression. For example, suppose you wrote |
| 81 | a regular expression matching function: |
| 82 | {{{ |
| 83 | regexp :: String -> String -> Maybe (String, String) |
| 84 | -- (regexp r s) parses a string matching regular expression r |
| 85 | -- the front of s, returning the matched string and remainder of s |
| 86 | }}} |
| 87 | then you could use it in patterns thus: |
| 88 | {{{ |
| 89 | f :: String -> String |
| 90 | f (regexp "[a-z]*" -> (name, rest)) = ... |
| 91 | }}} |
| 92 | Of course, the argument does not need to be a constant as it is here. |
| 93 | |
| 94 | '''This ability to pass arguments to the view function, to guide its matching |
| 95 | behaviour, is a key feature of this proposal,''' shared by some, but by no means |
| 96 | all view proposals. I'll call it the '''value input feature'''. |
| 97 | |
| 98 | Indeed, in a sense, patterns become first class. For example, one could pass a pattern |
| 99 | as an argument to a function thus: |
| 100 | {{{ |
| 101 | g :: (Int -> Maybe Int) -> Int -> ... |
| 102 | g p (p -> x) = ... |
| 103 | }}} |
| 104 | Here the first argument `p` can be thought of as a pattern passed to `g`, which |
| 105 | is used to match the second argument of `g`. |
| 106 | |
| 107 | === A possible extension === |
| 108 | |
| 109 | It would be quite useful to allow more than one sub-pattern in a view |
| 110 | pattern. To do this we'd need a `Maybe` data type that returns more than |
| 111 | one result, thus: |
| 112 | {{{ |
| 113 | data Maybe2 a b = Nothing2 | Just2 a b |
| 114 | data Maybe3 a b c = Nothing3 | Just3 a b c |
| 115 | -- ..etc..., up to 8 perhaps (sigh) |
| 116 | }}} |
| 117 | With this in hand we can extend the views story to have multiple sub-patterns. |
| 118 | Example: |
| 119 | {{{ |
| 120 | snoc :: [a] -> Maybe2 [a] a |
| 121 | snoc [] = Nothing2 |
| 122 | snoc (x:xs) = case snoc xs of |
| 123 | Nothing2 -> Just2 [] x |
| 124 | Just2 ys y -> Just2 (x:ys) y |
| 125 | |
| 126 | last :: [Int] -> Int |
| 127 | last (snoc -> xs x) = x |
| 128 | last other = error "empty list" |
| 129 | }}} |
| 130 | It is tiresome that we need types `Maybe2`, `Maybe3` etc, but we already have |
| 131 | that in Haskell; consider `zip3`, `zip4` and so on. |
| 132 | We could always get away without it, by sticking to unary view patterns and |
| 133 | using tuples, thus: |
| 134 | {{{ |
| 135 | snoc :: [a] -> Maybe2 ([a], a) |
| 136 | snoc [] = Nothing |
| 137 | snoc (x:xs) = case snoc xs of |
| 138 | Nothing -> Just ([], x) |
| 139 | Just (ys,y) -> Just (x:ys, y) |
| 140 | |
| 141 | last :: [Int] -> Int |
| 142 | last (snoc -> (xs, x)) = x |
| 143 | last other = error "empty list" |
| 144 | }}} |
| 145 | But the tuple looks a bit clumsy. |
| 146 | |
| 147 | Under this proposal, the number of sub-patterns in the view pattern determines |
| 148 | which return type the view function should have. E.g. in the pattern '(e -> p1 p2 p3)', |
| 149 | 'e' should return a `Maybe3`. |
| 150 | |
| 151 | If n=0, then we want `Maybe0`, which is called `Bool`. Thus |
| 152 | {{{ |
| 153 | even :: Int -> Bool |
| 154 | even n = n `div` 2 == 0 |
| 155 | |
| 156 | f (even ->) = ... -- Matches even numbers |
| 157 | f other = ... |
| 158 | }}} |
| 159 | Here `even` is used as a nullary view pattern, with no sub-patterns |
| 160 | following the `->`. |
| 161 | |
| 162 | == Efficiency == |
| 163 | |
| 164 | View patterns can do arbitrary computation, perhaps expensive. So |
| 165 | it's good to have a syntactically-distinct notation that reminds the |
| 166 | programmer that some computation beyond ordinary pattern matching may |
| 167 | be going on. |
| 168 | |
| 169 | It's reasonable to expect the compiler to avoid repeated computation when |
| 170 | pattern line up in a column: |
| 171 | {{{ |
| 172 | f (snoc -> x xs) True = ... |
| 173 | f (snoc -> x xs) False = ... |
| 174 | }}} |
| 175 | In pattern-guard form, common sub-expression should achieve the same |
| 176 | effect, but it's quite a bit less obvious. We should be able to give |
| 177 | clear rules for when the avoidance of repeat computation is |
| 178 | guaranteed. |
| 179 | |
| 180 | |
| 181 | --------------------------- |
| 182 | == More examples == |
| 183 | |
| 184 | === Erlang-style parsing === |
| 185 | |
| 186 | The expression to the left of the `->` can mention variables bound in earlier patterns. |
| 187 | For example, Sagonas et al describe an extension to Erlang that supports pattern-matching on bit-strings ([http://user.it.uu.se/~kostis/Papers/index.html#Conference "Application, implementation and performance evaluation of bit-stream programming in Erlang", PADL'07]). Suppose we had a parsing function thus: |
| 188 | {{{ |
| 189 | bits :: Int -> ByteString -> Maybe2 Word ByteString |
| 190 | -- (bits n bs) parses n bits from the front of bs, returning |
| 191 | -- the n-bit Word, and the remainder of bs |
| 192 | }}} |
| 193 | Then we could write patterns like this: |
| 194 | {{{ |
| 195 | parsePacket :: ByteString -> ... |
| 196 | parsePacket (bits 3 -> n (bits n -> val bs)) = ... |
| 197 | }}} |
| 198 | This parses 3 bits to get the value of `n`, and then parses `n` bits to get |
| 199 | the value of `val`. |
| 200 | |
| 201 | === Sets as lists === |
| 202 | |
| 203 | Here is a module implementing sets as lists: |
| 204 | {{{ |
| 205 | module Set( Set, empty, insert, delete, has) where |
| 206 | |
| 207 | newtype Set a = S [a] |
| 208 | |
| 209 | has :: Eq a => a -> Set a -> Maybe (Set a) |
| 210 | has x (S xs) | x `elem` xs = Just (xs \\ x) |
| 211 | | otherwise = Nothing |
| 212 | |
| 213 | delete :: a -> Set a -> Set a |
| 214 | delete x (has x -> s) = s |
| 215 | delete x s = s |
| 216 | |
| 217 | insert :: a -> Set a -> Set a |
| 218 | insert x s@(has x -> _) = s |
| 219 | insert x s = s |
| 220 | }}} |
| 221 | Notice that in the left-hand side `delete x (has x -> s)`, the first `x` is a binding occurrence, |
| 222 | but the second is merely an argument to `has` and is a bound occurrence. |
| 223 | |
| 224 | === N+k patterns === |
| 225 | |
| 226 | You can test for values. For example here's a view function that tests for values |
| 227 | greater than or equal to n: |
| 228 | {{{ |
| 229 | np :: Num a => a -> a -> Maybe a |
| 230 | np k n | k <= n = Just (n-k) |
| 231 | | otherwise = Nothing |
| 232 | |
| 233 | f :: Num a => a -> Int |
| 234 | f (np 10 -> n) = 0 -- Matches values >= 10 |
| 235 | f (np 4 -> n) = 1 -- Matches values >= 4 |
| 236 | f other = 2 |
| 237 | }}} |
| 238 | You will recognise these as (n+k) patterns, albeit with slightly different syntax. |
| 239 | (Incidentally, this example shows that the view function can be overloaded, and |
| 240 | that its use in a view pattern gives rise to a type-class constraint (in this case, |
| 241 | that in turn makes `f` overloaded). |
| 242 | |
| 243 | === Naming constants in one place === |
| 244 | |
| 245 | We are always taught to write magic numbers, or constants, in one place only. |
| 246 | In C you can write |
| 247 | {{{ |
| 248 | #define ERRVAL 4 |
| 249 | }}} |
| 250 | and then use `ERRVAL` in `switch` labels. You can't do that in Haskell, which is tiresome. |
| 251 | But with view pattern you can: |
| 252 | {{{ |
| 253 | errVal :: Int -> Bool |
| 254 | errVal = (== 4) |
| 255 | |
| 256 | f (errVal ->) = ... |
| 257 | }}} |
| 258 | |
| 259 | == Remarks == |
| 260 | |
| 261 | '''Note 0'''. A key feature of this proposal is its modesty; it is essentially simply some syntactic sugar for patterns: |
| 262 | * There is no new form of declaration (e.g. 'view' or 'pattern synonym'). |
| 263 | * The functions used in view patterns are ordinary Haskell functions, and can be called from ordinary Haskell code. They are not special view functions. |
| 264 | * Since the view functions are ordinary Haskell functions, no changes are needed to import or export mechanisms. |
| 265 | * Both static and dynamic semantics are extremely simple. |
| 266 | However, sometimes modest syntactic sugar can have profound consequences. |
| 267 | In this case, it's possible that people would start routinely hiding |
| 268 | the data representation and exporting view functions instead, which might |
| 269 | be an excellent thing. |
| 270 | |
| 271 | '''Note 1'''. All this could be done with pattern guards. For example `parsePacket` could be written |
| 272 | {{{ |
| 273 | parsePacket bs | Just (n, bs') <- bits 3 bs |
| 274 | | Just (val, bs'') <- bits n bs' |
| 275 | = ... |
| 276 | }}} |
| 277 | But it's a bit more clumsy. I'm hoping that support for view patterns might encourage people to export |
| 278 | view functions (ones with `Maybe` return types, and encouage their use in patten matching). That is, |
| 279 | just lower the barrier for abstraction a bit. |
| 280 | |
| 281 | '''Note 2'''. It is hard to check for completeness of pattern matching; and likewise for |
| 282 | overlap. But guards already make both of these hard; and GADTs make completness hard too. |
| 283 | So matters are not much worse than before. |
| 284 | |
| 285 | == Concrete syntax == |
| 286 | |
| 287 | Here are some other possible syntax choices I've considered: |
| 288 | {{{ |
| 289 | f ($snoc x xs) = ... -- Use prefix "$" |
| 290 | g ($(bits 3) x bs) = ... -- Extra parens for the value input feature |
| 291 | |
| 292 | f (%snoc x xs) = ... -- Or use prefix "%" instead |
| 293 | |
| 294 | f (.snoc x xs) = ... -- Or use prefix "." instead |
| 295 | |
| 296 | f (snoc | x xs) = .. -- Use "|" instead of "->" |
| 297 | g (bits 3 | b bs) = ... |
| 298 | }}} |
| 299 | |
| 300 | I also thought about infix view patterns, where the view function |
| 301 | appears between its (pattern) arguments, but I could not think of a |
| 302 | nice syntax for it, so it is not provided by this proposal. |
| 303 | |
| 304 | ------------------------- |
| 305 | == Related work == |
| 306 | |
| 307 | === Wadler's original paper (POPL 1987)] === |
| 308 | |
| 309 | Wadler's paper was the first concrete proposal. It proposed a top-level view |
| 310 | declaration, together with functions ''in both directions'' between the view type |
| 311 | and the original type, which are required to be mutually inverse. |
| 312 | This allows view constructors to be used in expressions |
| 313 | as well as patterns, which seems cool. Unfortunately this dual role proved |
| 314 | problematic for equational reasoning, and every subsequent proposal restricted |
| 315 | view constructors to appear in patterns only. |
| 316 | |
| 317 | === [http://haskell.org/development/views.html Burton et al views (1996)] === |
| 318 | |
| 319 | This proposal is substantially more complicated than the one above; in particular it |
| 320 | rquires new form of top-level declaration for a view type. For example: |
| 321 | {{{ |
| 322 | view Backwards a of [a] = [a] `Snoc` a | Nil |
| 323 | where |
| 324 | backwards [] = Nil |
| 325 | backwards (x:[]) = [] `Snoc` x |
| 326 | backwards (x1:(xs `Snoc` xn)) = (x1:xs) `Snoc` xn |
| 327 | }}} |
| 328 | Furthermore, it is in some ways less expressive than the proposal here; |
| 329 | the (n+k) pattern, Erlang `bits` pattern, and `regexp` examples are not |
| 330 | definable, because all rely on the value input feature. |
| 331 | |
| 332 | I think this proposal is substantially the same as "Pattern matching and |
| 333 | abstract data types", Burton and Cameron, JFP 3(2), Apr 1993. |
| 334 | |
| 335 | === [http://citeseer.ist.psu.edu/okasaki98view.html Okasaki: views in Standard ML] === |
| 336 | |
| 337 | Okasaki's design is very similar to Burton et al's, apart from differences due |
| 338 | to the different host language. Again, the value input feature is not supported. |
| 339 | |
| 340 | === [http://citeseer.ist.psu.edu/erwig96active.html Erwig: active patterns] === |
| 341 | |
| 342 | Erwig's proposal for active patterns renders the Set example like this: |
| 343 | {{{ |
| 344 | data Set a = Empty | Add a (Set a) |
| 345 | |
| 346 | pat Add' x _ = |
| 347 | Add y s => if x==y then Add y s |
| 348 | else let Add' x t = s |
| 349 | in Add x (Add y t) |
| 350 | |
| 351 | delete x (Add' x s) = s |
| 352 | delete x x = s |
| 353 | }}} |
| 354 | This requires a new top-leven declaration form `pat`; and I don't think it's nearly |
| 355 | as easy to understand as the current proposal. Notably, in the first equation for |
| 356 | `delete` it's ahrd to see that the second `x` is a bound occurrence; this somehow |
| 357 | follows from the `pat` declaration. |
| 358 | |
| 359 | Still the proposal does support the value input feature. |
| 360 | |
| 361 | === [http://citeseer.ist.psu.edu/erwig00pattern.html Erwig/Peyton Jones: transformational patterns] === |
| 362 | |
| 363 | This paper describes pattern guards, but it also introduces '''transformational patterns'''. (Although |
| 364 | it is joint-authored, the transformational-pattern idea is Martin's.) Transformational patterns |
| 365 | are very close to what we propose here. In particular, view functions are ordinary Haskell functions, |
| 366 | so that the only changes are to patterns themselves. |
| 367 | |
| 368 | There are two main differences (apart from syntax). |
| 369 | First, transformational patterns didn't have the value input feature, althought it'd be easy |
| 370 | to add (indeed that's what we've done). Second, : in the current |
| 371 | proposal the view function is expected to return a `Maybe` type, with `Nothing` to indicate match |
| 372 | failure. In the transformational pattern paper, this implicit matching is not performed. So, |
| 373 | using the new syntax, you'd have to write |
| 374 | {{{ |
| 375 | f (snoc -> Just2 xs x) = ... |
| 376 | }}} |
| 377 | The benefit of not having the implicit matching is that you can write functions that are, perhaps, |
| 378 | more view-like. Example: |
| 379 | {{{ |
| 380 | data Product = ....some big data type... |
| 381 | |
| 382 | data Size = Small | Medium | Big -- View type |
| 383 | prodSize :: Product -> Size |
| 384 | prodSize = .... |
| 385 | |
| 386 | f :: Product -> ... |
| 387 | f (prodSize -> Small) = ... |
| 388 | f (prodSize -> Medium) = ... |
| 389 | f (prodSize -> Big) = ... |
| 390 | }}} |
| 391 | With the current proposal, you'd instead write something like this: |
| 392 | {{{ |
| 393 | smallProd, medProd, bigProd :: Product -> Bool |
| 394 | smallProd p = ... |
| 395 | medProd p = ... |
| 396 | bigProd p = ... |
| 397 | |
| 398 | f :: Product -> ... |
| 399 | f (smallProd ->) = ... |
| 400 | f (medProd ->) = ... |
| 401 | f (bigProd ->) = ... |
| 402 | }}} |
| 403 | This is not obviously worse, except that the first version is more |
| 404 | obviously exhaustive. Incidentally, both should generate the same |
| 405 | code. |
| 406 | |
| 407 | While I think the implicit Maybe-stripping is a real win, it's an open |
| 408 | design choice. Perhaps a different arrow could suppress the |
| 409 | stripping? |
| 410 | |
| 411 | === Pattern synonyms === |
| 412 | |
| 413 | [http://hackage.haskell.org/trac/haskell-prime/wiki/PatternSynonyms Pattern synonyms] |
| 414 | are a requested Haskell Prime feature. John Reppy had the same idea years ago for Standard ML; see |
| 415 | [http://people.cs.uchicago.edu/~jhr/papers/1992/tr-sml-const.pdf Abstract value constructors], |
| 416 | Reppy & Aiken, TR 92-1290, Cornell, June 1992. |
| 417 | |
| 418 | The one way in which pattern synonyms are better than view patterns is that |
| 419 | they define by-construction bi-directional maps. Example |
| 420 | {{{ |
| 421 | data Term = Var String | Term String [Term] |
| 422 | |
| 423 | -- 'const' introduces a pattern synonym |
| 424 | const Plus a b = Term "+" [a,b] |
| 425 | |
| 426 | f :: Term -> Term |
| 427 | f (Plus a b) = Plus (f a) (f b) |
| 428 | f ... = ... |
| 429 | }}} |
| 430 | With pattern views, we'd have to write two functions for the "plus" view: |
| 431 | {{{ |
| 432 | plus :: Term -> Term -> Term |
| 433 | plus a b = Term "+" [a,b] |
| 434 | |
| 435 | isPlus :: Term -> Maybe2 Term Term |
| 436 | isPlus (Term "+" [a,b]) = Just2 a b |
| 437 | isPlus other = Nothing |
| 438 | |
| 439 | f :: Term -> Term |
| 440 | f (isPlus -> a b) = plus (f a) (f b) |
| 441 | }}} |
| 442 | But perhaps that is not so bad. Pattern synonyms also require a new form of top level declaration; |
| 443 | and are much more limited than view patterns (by design they cannot do computation). |
| 444 | |
| 445 | === First class abstractions === |
| 446 | |
| 447 | Several proposals suggest first class ''abstractions'' rather that first-class ''patterns''. |
| 448 | Here are the ones I know of |
| 449 | |
| 450 | * [http://hackage.haskell.org/trac/haskell-prime/ticket/114 Claus Reinke's lambda-match proposal] |
| 451 | * [http://citeseer.ist.psu.edu/tullsen00first.html Tullsen: first class patterns] |
| 452 | * [http://ttic.uchicago.edu/~blume/pub-cat.html Matthias Blume: Extensible programming with first-class cases] (ICFP06) |
| 453 | |
| 454 | All these proposals are more or less orthogonal to this one. For example, Reinke |
| 455 | proposes a compositional syntax for lambda abstractions |
| 456 | `(\p -> e)` where pattern matching failure on `p` can be caught and composed |
| 457 | with a second abstraction. Thus |
| 458 | {{{ |
| 459 | (| Just x -> x+1 ) +++ (| Nothing -> 0 ) |
| 460 | }}} |
| 461 | combines two abstractions, with failure from the first falling through to the seoond. |
| 462 | |
| 463 | Blume and Tullsen have a similar flavour. None of these proposals say |
| 464 | anything about the patterns themselves, which in turn is all this |
| 465 | proposal deals with. Hence orthgonal. |
| 466 | |