Changes between Version 2 and Version 3 of BinaryIO

Dec 17, 2005 3:14:12 AM

Expand on binary IO


  • BinaryIO

    v2 v3  
    11= Binary I/O =
    3 Haskell 98 treats I/O as character-based, and lacks a mechanism for binary I/O. It is currently impossible to read or write binary data in a portable manner.
     3Haskell 98 treats I/O as character-based, and lacks a well-defined mechanism for binary I/O. However, a number of competing external libraries exist, providing various forms of binary I/O, compressed I/O, and serialised, persistent data.
    55 * Character-based I/O is needed, at least because systems (e.g. Unix and Windows) have different line-termination conventions that should be hidden from programs. The problem becomes more acute when different environments use different character sets and encodings (see [wiki:Unicode]).
    66 * Binary I/O is needed both to handle binary data and as a base upon which general treatments of character-encoding conversions (see [wiki:Unicode]) may be layered.
    8 The proposal is to add a form of I/O over `Word8` (i.e. octets, 8-bit binary values). See the "Binary input and output" section of [ System.IO] for a rough design.
     8One proposal is to add a form of I/O over `Word8` (i.e. octets, 8-bit binary values). See the "Binary input and output" section of [ System.IO] for a rough design.
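As a rough illustration of octet-level I/O, here is a minimal sketch using functions that already exist in GHC's `System.IO` and `Foreign.Marshal.Array` (`withBinaryFile`, `hPutBuf`, `hGetBuf`, `withArrayLen`, `allocaArray`, `peekArray`). The helper names `writeOctets` and `readOctets` and the file name `octets.bin` are hypothetical, and this is not necessarily the design the proposal has in mind.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (allocaArray, peekArray, withArrayLen)
import System.IO

-- Hypothetical helper: write a list of octets to a file opened in
-- binary mode, so no newline or encoding translation is applied.
writeOctets :: FilePath -> [Word8] -> IO ()
writeOctets path ws =
  withBinaryFile path WriteMode $ \h ->
    -- For Word8 the element count equals the byte count.
    withArrayLen ws $ \n ptr -> hPutBuf h ptr n

-- Hypothetical helper: read at most n octets from a binary-mode file.
readOctets :: FilePath -> Int -> IO [Word8]
readOctets path n =
  withBinaryFile path ReadMode $ \h ->
    allocaArray n $ \ptr -> do
      got <- hGetBuf h ptr n   -- number of octets actually read
      peekArray got ptr

main :: IO ()
main = do
  let octets = [0x00, 0x0A, 0x0D, 0xFF]
  writeOctets "octets.bin" octets
  back <- readOctets "octets.bin" 16
  print (back == octets)
```

The values 0x0A and 0x0D are included deliberately: they are the octets that character-based I/O may rewrite when translating line terminators.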
     10Another would be to look at one of the binary I/O libraries based on [ The Bits Between The Lambdas], descendants of which have proliferated in the last couple of years. The advantage of this style over the simpler System.IO library is support for serialising more complex data types, using type classes to recursively define binary I/O routines for each type component of the data you wish to serialise. Instances of I/O may be written by hand, or derived mechanically with [ DrIFT].
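The type-class style can be sketched as follows. This is a simplified, hypothetical `Binary` class that works on pure lists of `Word8` rather than on Handles or memory buffers, so it is not the API of any of the libraries discussed here; the `Track` type and the big-endian encoding of `Word16` are likewise assumptions made for illustration. It only shows how an instance for a compound type is assembled from the instances of its components.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word8, Word16)

-- Hypothetical class: 'put' produces octets, 'get' consumes a prefix
-- of the input and returns the decoded value with the remaining octets.
class Binary a where
  put :: a -> [Word8]
  get :: [Word8] -> Maybe (a, [Word8])

instance Binary Word8 where
  put w = [w]
  get (w:rest) = Just (w, rest)
  get []       = Nothing

-- Big-endian encoding, chosen arbitrarily for this sketch.
instance Binary Word16 where
  put w = [fromIntegral (w `shiftR` 8), fromIntegral w]
  get (hi:lo:rest) = Just ((fromIntegral hi `shiftL` 8) .|. fromIntegral lo, rest)
  get _            = Nothing

instance Binary Bool where
  put b = [if b then 1 else 0]
  get (0:rest) = Just (False, rest)
  get (1:rest) = Just (True, rest)
  get _        = Nothing

-- A user-defined record (hypothetical).
data Track = Track Word8 Word16 Bool
  deriving (Eq, Show)

-- A hand-written instance: each field is handled by the instance for
-- its own type. A tool such as DrIFT would generate this boilerplate.
instance Binary Track where
  put (Track n len fav) = put n ++ put len ++ put fav
  get bs = do
    (n,   r1) <- get bs
    (len, r2) <- get r1
    (fav, r3) <- get r2
    return (Track n len fav, r3)

main :: IO ()
main = do
  let t = Track 3 258 True
  print (put t)
  print (get (put t) :: Maybe (Track, [Word8]))
```

The real libraries avoid intermediate lists for speed; the list-based representation here only keeps the sketch short.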
     12Issues to consider:
     13 * What language extensions are required?
     14 * Support for cyclic structures
     15 * Is it possible to derive I/O instances for types, or must they be written by hand?
     17Existing libraries for Binary I/O:
     18 * The simplest is probably [ System.IO], which provides hGetBuf-style I/O. Really only suitable for arrays.
     19 * [ Packed strings], layered over System.IO, are sometimes used for simple data types that can be easily converted to and from flat arrays using list functions.
     20 * The Binary class, a version of which is [ described here], is the de facto standard, and also the fastest option, for non-trivial data types. It is distributed with nhc, and used by GHC to deal with .hi files. DrIFT provides tool support for deriving new instances. Flavours include:
     21    * [ NHC's binary], the original
     22    * [ GHC's Binary], used internally by GHC.
     23    * [ NewBinary], the standard
     24    * [ Lambdabot/Hmp3's Binary], a faster, Handle-only version of Binary.
     25 * [ SerTH] is a Binary-alike, which uses Template Haskell to derive serialiser instances for each data type. It's an alternative to deriving your Binary instances with DrIFT or writing them by hand. Obviously requires TH. Supports serialising cyclic structures.
     26 * [ ByteStream], a new high-performance serialisation library, using gzip compression.
     28Further information:
     29 * [ A recent mailing list thread].
     30 * [ A page on the Haskell wiki]
     32The two simplest options are to go with only the System.IO extension, or the Binary class.
     35 * The Binary class (particularly as implemented in NewBinary) is simple, elegant and widely used.
     36 * Binary I/O is an oft-requested feature, the lack of which is sometimes considered a flaw in Haskell 98, so we should do something about it.
     39 * Ideally(?) Binary should be derivable without an external tool
     40 * Binary only supports I/O from Handles and memory buffers. Some people require other kinds of streams
     41 * There is an overlap with Storable that isn't exploited or explained in any existing library.
     42 * Some new developments are underway to combine SerTH's cyclic structure support with the speed of NewBinary
     43 * What about a NewIO library? How would it overlap or interact with this?
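To make the overlap with Storable noted above more concrete, here is a minimal sketch that turns any `Storable` value into octets and back using only existing `Foreign` functions (`poke`, `peek`, `sizeOf`, `alloca`, `allocaBytes`, `peekArray`, `pokeArray`, `castPtr`). The helper names `toOctets`, `fromOctets` and `ptrSize` are hypothetical. The octets come out in the host's byte order and layout, so unlike a Binary instance the result is not portable between machines.

```haskell
import Data.Word (Word8, Word32)
import Foreign.Marshal.Alloc (alloca, allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (Storable, peek, poke, sizeOf)

-- Hypothetical helper: the in-memory octets of a Storable value,
-- in host byte order.
toOctets :: Storable a => a -> IO [Word8]
toOctets x =
  allocaBytes (sizeOf x) $ \ptr -> do
    poke (castPtr ptr) x
    peekArray (sizeOf x) ptr

-- Size of the type a pointer refers to. 'sizeOf' does not evaluate
-- its argument, so passing 'undefined' is safe here.
ptrSize :: Storable a => Ptr a -> Int
ptrSize p = sizeOf (elemOf p)
  where
    elemOf :: Ptr b -> b
    elemOf _ = undefined

-- Hypothetical helper: rebuild a value from exactly 'sizeOf' octets;
-- any other input length yields Nothing.
fromOctets :: Storable a => [Word8] -> IO (Maybe a)
fromOctets ws =
  alloca $ \p ->
    if length ws /= ptrSize p
      then return Nothing
      else do
        pokeArray (castPtr p) ws
        fmap Just (peek p)

main :: IO ()
main = do
  octets <- toOctets (0xDEADBEEF :: Word32)
  print (length octets)
  back <- fromOctets octets :: IO (Maybe Word32)
  print (back == Just 0xDEADBEEF)
```

A Binary library could in principle offer a default instance along these lines for fixed-size types, leaving byte-order handling to a separate layer.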