Version 11 (modified by nhn@…, 11 years ago) (diff)

Tweaks to the existing record system

This page is used to discuss _minor_ tweaks to the existing record system if it is decided that it will be left in basically unchanged. radical or brand new record systems should be discussed elsewhere.


Reallow punning, Foo {x,y,z} would be interpreted as Foo {x = x, y = y, z = z} in both declaration and pattern matching contexts.


Update syntax should not bottom out when fields are undefined, e.g.

data Foo = Foo { x :: String, y :: Int } | Bar { x :: String }

foo = Bar { x = "hello }

baz = foo { y = 3 }

should not result in an error, but rather pass foo through unchanged. update should update the record if it exists, but pass the type through otherwise. This would make the update syntax actually useful.

Label-based pattern-matching

The function:

f val { x = "foo" } = 4

should match if passed a Foo or a Bar with x being equal to "foo" and val would be bound to its argument (like an @ pattern)

g _ { y = 3 } = 4

would match only the Bar constructor since it is the only one with a y field.

This would mitigate the problems caused by accessors being partial functions since you can use a simple case statement to get the effect of an accesor that returns its result in a Maybe.

Note from Simon. I hate that the above defn of 'f' has just one argument (val {x="foo")), whereas it looks as if it has two. (This is a problem with existing Haskell.) It looks like 'f' has an argument 'val' and another arguement that is a free-standing record, something we really want in the end anyhow. Not sure how to fix this. val@{x="foo")?

First-class syntax

First class update and setting syntax (more advanced, needs better syntax). A syntax for updating and setting fields should be allowed. Some possibilites are

foo { x = }

would be equivalent to (\v -> foo { x = v })

foo { x \ }

would be equivalent to

(\f -> foo { x = case foo of _ {x} -> foo { x = f x }; _ -> foo })

Polymorphic record update

Given a record like:

data Foo a = Foo { bar :: a }

it would be nice to be able to update it like:

f = Foo { bar = 'a' }
g = f { bar = False }

Note the change in the type of the stored field. At the moment, such a record update must be written using the data constructor, not the update syntax.

'Open' statement

Having the ability to 'open' a record bringing all its values into scope would be useful for techniques such as first class modules when combined with PolymorphicComponents. a proposal is

data Record = Record { foo :: Int, bar :: String }

f :: Record -> Int
f x = ... where
   open x
   ans = ...

will desugar to

f x = ... where
   Record { foo = foo } = x 
   Record { bar = bar } = x 
   ans = ...

open x would be allowed at the top level, in a let binding, or in a where binding.


It is often useful to limit the ability of users to fill in or access parts of a data type arbitrarily to maintain invariants, instituting the following rule would let you enforce that to some degree:

  • Positional notation for pattern matching and constructor creation for a constructor may not be used unless all of its field labels are in scope.

This would insure that by not exporting a field label, it cannot be gotten around by using positional notation. This fix would also require the polymorphic setting ability mentioned above and would partially mitigate the need for ReadonlyConstructors

Polymorphic Record Update take II

(The following was discussed briefly on the Haskell' list.) Consider the following data type:

data T a
  = C1 { f1 :: a }
  | C2 { f1 :: a, f2 :: Int }
  | C3 { f2 :: Int }
deriving Show

Suppose we want to update the field f1 only in such a way that its type changes. We cannot use the record update syntax, as not all constructors have a field f1. So we write a utility function. However, we would prefer to do as little as possible when it comes to values constructed by constructors NOT having a field f2. One might naively try this:

foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x         = x

But of course, this does not type check as the type of x is different on the LHS and RHS. We can get around that by reconstructing the value on the RHS:

foo :: T a -> T Int
foo x@(C1 {})       = x {f1 = 1}
foo x@(C2 {})       = x {f1 = 2}
foo x@(C3 {f2 = n}) = C3 {f2 = n}

However, this is bad, because we have to change the code if further constructors are added, even when they do not have a field f1, and we also have to change the code if further fields are added to constructors not having the field f1. This is tedious, error prone, and really defeats one of the main reasons for using records in the first place. For example:

data T a
  = C1 { f1 :: a }
  | C2 { f1 :: a, f2 :: Int }
  | C3 { f2 :: Int, f3 :: Char }
  | C4 { f2 :: Int }
  deriving Show

foo :: T a -> T Int
foo x@(C1 {})               = x {f1 = 1}
foo x@(C2 {})               = x {f1 = 2}
foo x@(C3 {f2 = n, f3 = c}) = C3 {f2 = n, f3 = c}
foo x@(C4 {f2 = n})         = C4 {f2 = n}

One might think it would be possible to do better if we're furtunate enough to have a field that is common to *all* constructors not having a field f1, as is the case for f2 in this case:

foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x         = x {f2 = f2 x}

But this does not type check, and it would not apply anyway if there is no such common field.

What we really need is a function that reconstructs a value of type T a at type T b for all values constructed by a constructor that does not have a field f1:

coerce_no_f1 :: T a -> T b
coerce_no_f1 x@(C3 {f2 = n, f3 = c}) = C3 {f2 = n, f3 = c}
coerce_no_f1 x@(C4 {f2 = n})         = C4 {f2 = n}

foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x         = coerce_no_f1 x

But we'd rather not have to write such functions by hand, just as we'd rather not write update functions by hand. Maybe the record update syntax could be extended so that the function that gets generated behind the scenes only includes constructors that does NOT mention a particular field. For example, the field name(s) that must not occur could be prefixed by ~ which suggests negation in some settings. It does not have this connotation in Haskell, but at least ~ is already a special symbol. We could then write:

foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x         = x {~f1}

Now the code for foo only has to be changed if new constructors having a field f1 are added.

Of course, it should be possible to combine this with the normal record update syntax. E.g.

foo :: T a -> T Int
foo x@(C1 {}) = x {f1 = 1}
foo x@(C2 {}) = x {f1 = 2}
foo x         = x {~f1, f2 = f2 x + 1}


Due to a lack of experience with alternative record systems, the consensus seems to be that we should stick with the current system, perhaps with a few of the minor tweaks mentioned above. (Which ones is a question still open for discussion.)

However, the main reason for there being little use of alternative candidates, would seem to be that they are not compatible with current Haskell. Thus, it would be useful to have some mechanism for experimental records to be tried out in real Haskell implementations before the next language committee (Haskell-double-prime) starts its work. Then there might be a possibility of one of them being accepted.

A concrete suggestion is that we separate out everything from the Report to do with named-field records into something like a self-contained addendum. Whilst still an official part of the language standard, it might also be marked as a possibility for future removal. This would make it clear what parts of the language could be changed (or re-used without conflict) in an alternative records system. The re-use part is especially important, since taking some of the same syntax to mean something different is pretty-much essential for useability.