Opened 13 years ago

Last modified 3 years ago

#94 new task

Syntax for specifying language extensions

Reported by: patrykz@… Owned by: none
Priority: normal Milestone:
Version: Keywords:
Cc: Meta Owner:
State: discussion Section: N/A or multiple
Related Tickets:

Description

One of the original Haskell goals was to define a language for experimentation with functional language features. This will not change with Haskell' - we already know that at least one feature that is in high demand (FDs) is not ready for inclusion in Haskell'.

However, Haskell still does not provide any internal mechanism to facilitate language experimentation. In particular, the language does not provide a mechanism for keeping track of the language extensions used by a particular piece of code.

To make things worse, many (most?) extensions require lexical and/or syntactic changes to the language. Some extensions in the past proved to be incompatible, and not all compilers implement all extensions.

So, looking at a particular piece of code, how do we know which compiler it can be compiled with, and which command-line flags do we pass to which compiler? The answer is, simply, we don't.

Currently, we do this by introducing ugly GHC-specific pragmas, comments in the documentation, and a plethora of similar half-measures.

I would like to propose a language feature to address this concern once and for all. I know that a similar proposal has been circulated on the Haskell mailing list in the past, but to the best of my knowledge, none of the current tickets addresses this problem.

The Proposal

Add explicit syntax for documenting language extensions required to compile a module.

The Problem

  • Current language does not provide a uniform extension mechanism.
  • Pragmas (as currently used in GHC) are NOT suitable for this. Specifically, by their design they are COMMENTS, and therefore should have no impact on the language semantics. Unrecognized pragmas should be ignorable without warnings, while, currently, if you omit the GHC options pragma for a module that requires some syntactic extension, you will receive thousands of lexical and syntactic errors completely unrelated to your extension.
  • I strongly believe that compiler pragmas should ONLY be used for extensions (such as INLINE, various SPECIALIZE, etc.) that have no impact on the program semantics.

The Solution

Add an extension clause to the language.

To avoid introduction of a new keyword, I propose to use the following syntax:

  module -> [extensions;] "module" modid [exports] "where" body
              |  body

  extensions -> extension_1 ; ... ; extension_n

  extension -> "import" "extension" varid [extparam_1 ... extparam_n]

  extparam -> varid | conid | literal

A module using some GHC woo-hah extension would look like:

  import extension GHC "woo-hah!"
  module Foo ...

Or, if an extension is common enough:

  import extension fundeps
  module Foo ...
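
As a rough illustration of the grammar above, the following is a minimal sketch of recognising a single extension clause in Haskell. The names ExtensionClause, ExtParam, tokens and parseClause are hypothetical and not part of the proposal. The sketch relies on the Prelude's lex for tokenising, and it assumes that an extension name may start with either an upper-case or a lower-case letter, since the grammar says varid while the GHC example uses a capitalised name.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit)

-- Hypothetical representation of one "import extension" clause.
data ExtParam = ParamName String | ParamLit String
  deriving (Eq, Show)

data ExtensionClause = ExtensionClause
  { extName   :: String
  , extParams :: [ExtParam]
  } deriving (Eq, Show)

-- Split a string into Haskell lexemes using the Prelude's 'lex'.
-- Returns Nothing if some part of the input cannot be lexed.
tokens :: String -> Maybe [String]
tokens s = case lex s of
  []              -> Nothing
  ("", _) : _     -> Just []          -- only whitespace was left
  (tok, rest) : _ -> (tok :) <$> tokens rest

-- Assumption: a name is a letter followed by identifier characters,
-- in either case (covers both "fundeps" and "GHC").
isName :: String -> Bool
isName (c : cs) = isAlpha c && all (\x -> isAlphaNum x || x `elem` "_'") cs
isName []       = False

-- extparam -> varid | conid | literal
toParam :: String -> Maybe ExtParam
toParam tok
  | isName tok    = Just (ParamName tok)
  | isLiteral tok = Just (ParamLit tok)
  | otherwise     = Nothing
  where
    isLiteral (c : _) = c == '"' || c == '\'' || isDigit c
    isLiteral []      = False

-- extension -> "import" "extension" name [extparam_1 ... extparam_n]
parseClause :: String -> Maybe ExtensionClause
parseClause src = do
  toks <- tokens src
  case toks of
    "import" : "extension" : name : params
      | isName name -> ExtensionClause name <$> traverse toParam params
    _ -> Nothing
```

With this sketch, parseClause applied to the second example line yields an ExtensionClause named "fundeps" with no parameters, while an ordinary import declaration is rejected because its second token is not "extension".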

This is a very conservative syntax. It does not support grouping, aliasing and renaming of extensions (as previously circulated on the Haskell mailing list), which I personally believe would be a very bad idea. I want to be able to look at a piece of Haskell code and tell immediately which extensions it uses, without being forced to browse through the included modules, etc. Further, the parsers and lexers must be able to tell where the list of extensions finishes and the module code (which uses these extensions) starts. The "module" keyword is a natural candidate for that.

Extensions would NOT be exported by a module that uses that extension, but would have to be specified separately by each module that uses the features provided by that extension. For example, one often hides uses of unboxed types, functional dependencies, etc, behind a curtain of abstract data types, and such data type implemented using non-standard features can be happily used within standard-conforming Haskell programs. On the other hand, if an extension is visible in an interface exported by a module, it has to be named explicitly (with import extension clauses) by any module importing that interface.

Extensions could be parameterized, since I can readily imagine extensions that would require such a thing. I would also recommend in the standard that every compiler group its own extensions under a common name (for example, GHC, HUGS, JHC, NHC, etc.) until they are in sufficiently common use to be standardized independently (such as fundeps), at which stage there should probably be a corresponding addendum to the standard for that extension.

Specifying extensions before the module keyword ensures that the lexer and parser can find them before parsing of the actual module. I recommend that bare modules without the module keyword cannot specify any extensions, and therefore must be written in pure Haskell'.

The standard itself should not define any extensions, and should state that the standard grammar and semantics describe the base language in the absence of any import extension clauses.

Each extension, including FFI, should be described in a separate addendum. Once an extension has been fully defined and tested in production code, it may be blessed by including it in the list of addenda to the standard, at which point its name will change from compiler-specific (e.g. GHC woohoo) to standard (e.g. woohoo).

Pros

  • Addresses a pending language design goal.
  • Useful for automatic documentation tools such as Haddock, which could even generate a hyperlink from an extension name to the relevant addendum when available.
  • Simple to implement.
  • Neat syntax.
  • Backwards-compatible (introduces no keyword pollution).
  • Makes all information required to compile a program available directly in the source code, rather than in ad-hoc places such as the command line, Cabal package descriptions, documentation, comments, pragmas and what-not.
  • Returns all comments (including pragmas) to their original role as semantically-neutral annotations on the source program.

Cons

  • Some implementation hassles. The compiler must use either a predicated parser and lexer (I believe most do already) or else parse the module until it finds the "module" keyword collecting the extension clauses, and then parse the actual module using an appropriate parser and lexer chosen according to the specified extensions.
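
The second strategy mentioned in this item (scan up to the "module" keyword, collect the extension clauses, then pick a parser) could look roughly like the sketch below. It is only an illustration under simplifying assumptions: each clause sits on its own line, the "module" keyword starts a line, and comments are not handled. The names scanHeader and unsupported are hypothetical.

```haskell
-- Split a source text into the extension specifications that precede the
-- "module" keyword and the remaining module text.
-- A bare module (no "module" keyword) may not specify extensions, so it
-- is returned unchanged with an empty extension list.
scanHeader :: String -> ([String], String)
scanHeader src = case break isModuleLine (lines src) of
  (_, [])        -> ([], src)
  (header, body) -> (extsOf header, unlines body)
  where
    isModuleLine l = take 1 (words l) == ["module"]
    -- Keep everything after "import extension" on each header line;
    -- lines that do not match the pattern are skipped.
    extsOf hdr =
      [ unwords spec | l <- hdr, "import" : "extension" : spec <- [words l] ]

-- Report the requested extensions that a compiler does not know about,
-- so it can fail with one clear message before parsing the module body.
unsupported :: [String] -> [String] -> [String]
unsupported known requested = filter (`notElem` known) requested
```

A compiler front end could call scanHeader first, reject the module if unsupported returns a non-empty list, and otherwise configure its lexer and parser from the collected extension list before processing the module text.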

Change History (5)

comment:1 Changed 13 years ago by john@…

topic: Syntax

comment:2 Changed 13 years ago by ijones@…

topic: Syntax → Annotations

comment:3 Changed 13 years ago by john@…

Priority: major → minor

comment:4 Changed 3 years ago by Herbert Valerio Riedel

Milestone:

moving non-milestoned many year old legacy tickets out of the way

comment:5 Changed 3 years ago by Herbert Valerio Riedel

Priority: minor → normal

Set default priority (as this confuses Trac otherwise)
