# Polymorphic Components

## Brief Explanation

Arguments of data constructors may have polymorphic types (marked with `forall`

)
and contexts constraining universally quantified type variables, e.g.

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

The constructor then has a rank-2 type:

MkSwizzle :: (forall a. Ord a => [a] -> [a]) -> Swizzle

If RankNTypes are not supported, these data constructors are subject to similar restrictions to functions with rank-2 types:

- polymorphic arguments can only be matched by a variable or wildcard (
`_`

) pattern - when the costructor is used, it must be applied to the polymorphic arguments

This feature also makes it possible to create explicit dictionaries, e.g.

data MyMonad m = MkMonad { unit :: forall a. a -> m a, bind :: forall a b. m a -> (a -> m b) -> m b }

The field selectors here have ordinary polymorphic types:

unit :: MyMonad m -> a -> m a bind :: MyMonad m -> m a -> (a -> m b) -> m b

## References

- From Hindley-Milner Types to First-Class Structures by Mark P. Jones, Haskell Workshop, 1995.
- distinguish from ExistentialQuantification (currently also marked with
`forall`

, but before the data constructor).

## Open Issues

- allow empty foralls?
data T a = Mk (forall . Show a => a)

- hugs vs. ghc treatment as keyword (see below)
- design choice: only wildcard & variables are allowed when pattern matching on polymorphic components (ghc allows as-patterns, hugs doesn't)

## Tickets

- #57
- add polymorphic components

## Pros

- type inference is a simple extension of Hindley-Milner.
- offered by GHC and Hugs for years
- large increment in expressiveness: types become impredicative, albeit with an intervening data constructor, enabling Church encodings and similar System F tricks.
Functions with rank-2 types may be trivially encoded.
Functions with rank-n types may also be encoded, at the cost of packing and unpacking
`newtype`

s. - useful for polymorphic continuation types, like the ReadP type used in a proposed replacement for the Read class.

## Cons

- more complex denotational semantics

# Report TODO List

List of items that need to change in the draft report.

## Syntax

Introduce a new *special identifier*, **forall**. This identifier has a special interpretation in types and type schemes (i.e., it is *not* a type variable).
However, **forall** can still be used as an ordinary variable in expressions.

Syntax for writing type schemes:

poly -> 'forall' tvar_1 ... tyvar_n '.' opt_ctxt type (n > 0) opt_ctxt -> context '=>' scheme -> '(' poly ')' | type ascheme -> '(' poly ')' | atype bscheme -> '(' poly ')' | btype

NOTE: Schemes should not contain quantified variables that are not mentioned in the scheme body because this probably indicates a programmer error.

Syntax for **data** and **newtype** declarations

-- Section 4.2.1 constr -> con acon_filed_1 ... acon_field_k (arity con = k, k>=0) | bcon_field conop bcon_field (infix conop) | con { fielddecl1 , ... , fielddecln } (n>=0) fielddecl-> vars :: con_field con_field -> ! ascheme | scheme -- or scheme instead of ascheme? acon_filed -> ! ascheme | ascheme bcon_filed -> ! ascheme | bscheme -- or bscheme instead of ascheme? -- Section 4.2.3 newconstr -> con ascheme | con { var :: scheme }

NOTE: The grammar in the Haskell 98 report contains a minor bug that seems to allow erroneous data declarations like the following:

data T = C !

For this reason I introduced the *con_field* productions.

## Working with datatypes

### Labeled fields

(Section 3.15?)

Type of the selector functions:

data T a = C { x :: forall b. ctxt => type } x :: ctxt => T a -> type

If different constructors of the same datatype contain the same label, then the corresponding schemes should be equal up to renaming of the quantified variables. This ensures that we can give a reasonable type for the selector functions.

### Constructors

(Section 4.2.1)

Constructors that have polymorphic components cannot appear in the program without values for their polymorphic fields. For example, consider the following declaration:

data T a = C Int (forall b. b -> a) a

The constructor function 'C' cannot be used with less then two arguments because the second argument is the last polymorphic component of 'C'. Here are some examples that illustrate what we can and cannot do with 'C':

ex1 :: a -> T a ex1 a = C 2 (const a) -- ok. ex2 = C 2 -- not ok, needs another argument.

This restriction ensures that all expressions in the program have ordinary Hindley-Milner types.

The values for the polymorphic components should have type schemes that are at least as polymorphic as what is declared in the datatype. This is similar to the check that is performed when an expression has an explicit type signature.

### Patterns

We do not allow nested patterns on fields that have polymorphic types. In other words, when we use a constructor with a polymorphic field as a pattern, we allow only variable and wild-card patterns in the positions corresponding to the polymorphic fields.

Discussion: We could lift this restriction but the resulting behavior may be more confusing than useful. Consider the following example:

data S = C (forall a. [a -> a]) test1 (C x) | null x = ('b', False) | otherwise = (f 'a', f True) where f = head x test2 (C []) = ('b', False) test2 (C (f : _)) = (f 'a', f True)

We may expect that these two definitions are equivalent but, in fact, the first one is accepted while the second one is rejected if we perform the usual translation of patterns. To see what goes wrong, consider how we desugar patterns:

test2 (C x) = case x of [] -> ('b',False) f:_ -> (f 'a', f True)

The use of `x`

in the **case** statement instantiates
it to a particular type, which makes `f`

monomorphic
and so we cannot instantiate it to different types.
An alternative might be to perform a generalization after
we pattern match on a polymorphic field but it is not clear
if this extra complexity is useful.

## TODO

- lots of english text in algebreic datatype declartions
- anything in "kind inference"?
- other?