| 63 | === Section 3 (Expressions) === |
| 64 | |
| 65 | '''remove''' |
| 66 | |
| 67 | In the syntax that follows, there are some families of nonterminals |
| 68 | indexed by precedence levels (written as a superscript). Similarly, |
| 69 | the nonterminals op, varop, and conop may have a double index: a |
| 70 | letter l, r, or n for left-, right- or non-associativity and a |
| 71 | precedence level. A precedence-level variable i ranges from 0 to 9; an |
| 72 | associativity variable a varies over {l, r, n}. For example |
| 73 | |
| 74 | {{{ |
| 75 | aexp -> ( expi+1 qop(a,i) ) |
| 76 | }}} |
| 77 | |
| 78 | actually stands for 30 productions, with 10 substitutions for i and 3 |
| 79 | for a. |
| 80 | |
| 81 | '''replace''' |
| 82 | |
| 83 | {{{ |
| 84 | exp -> exp0 :: [context =>] type (expression type signature) |
| 85 | | exp0 |
| 86 | |
| 87 | expi -> expi+1 [qop(n,i) expi+1] |
| 88 | | lexpi |
| 89 | | rexpi |
| 90 | lexpi -> (lexpi | expi+1) qop(l,i) expi+1 |
| 91 | lexp6 -> - exp7 |
| 92 | rexpi -> expi+1 qop(r,i) (rexpi | expi+1) |
| 93 | }}} |
| 94 | |
| 95 | '''with''' |
| 96 | |
| 97 | {{{ |
| 98 | exp -> infixexp :: [context =>] type (expression type signature) |
| 99 | | infixexp |
| 100 | |
| 101 | infixexp -> exp10 qop infixexp |
| 102 | | - infixexp |
| 103 | | exp10 |
| 104 | }}} |
| 105 | |
| 106 | '''replace''' |
| 107 | |
| 108 | {{{ |
| 109 | | ( expi+1 qop(a,i) ) (left section) |
| 110 | | ( lexpi qop(l,i) ) (left section) |
| 111 | | ( qop(a,i)<-> expi+1 ) (right section) |
| 112 | | ( qop(r,i)<-> rexpi ) (right section) |
| 113 | }}} |
| 114 | |
| 115 | '''with''' |
| 116 | |
| 117 | {{{ |
| 118 | | ( infixexp qop ) (left section) |
| 119 | | ( qop<-> infixexp ) (right section) |
| 120 | }}} |
| 121 | |
| 122 | '''replace''' |
| 123 | |
| 124 | Expressions involving infix operators are disambiguated by the |
| 125 | operator's fixity (see Section 4.4.2). Consecutive unparenthesized |
| 126 | operators with the same precedence must both be either left or right |
| 127 | associative to avoid a syntax error. Given an unparenthesized |
| 128 | expression "x qop(a,i) y qop(b,j) z", parentheses must be added |
| 129 | around either "x qop(a,i) y" or "y qop(b,j) z" when i=j unless a=b=l |
| 130 | or a=b=r. |
| 131 | |
| 132 | '''with |
| 133 | |
| 134 | Expressions involving infix operators are disambiguated by the |
| 135 | operator's fixity (see Section 4.4.2). Consecutive unparenthesized |
| 136 | operators with the same precedence must both be either left or right |
| 137 | associative to avoid a syntax error. Given an unparenthesized |
| 138 | expression "x qop(a,i) y qop(b,j) z" (where "qop(a,i)" means an |
| 139 | operator with associativity "a" and precedence "i"), parentheses |
| 140 | must be added around either "x qop(a,i) y" or "y qop(b,j) z" when |
| 141 | i=j unless a=b=l or a=b=r. |
| 142 | |
| 143 | An example algorithm for resolving expressions involving infix |
| 144 | operators is given in Section 9.6. |
| 145 | |
| 146 | |
| 147 | '''remove''' |
| 148 | |
| 149 | A note about parsing. Expressions that involve the interaction of |
| 150 | fixities with the let/lambda meta-rule may be hard to parse. For |
| 151 | example, the expression |
| 152 | |
| 153 | {{{ |
| 154 | let x = True in x == x == True |
| 155 | }}} |
| 156 | |
| 157 | cannot possibly mean |
| 158 | |
| 159 | {{{ |
| 160 | let x = True in (x == x == True) |
| 161 | }}} |
| 162 | because (==) is a non-associative operator; so the expression must |
| 163 | parse thus: |
| 164 | |
| 165 | (let x = True in (x == x)) == True |
| 166 | |
| 167 | However, implementations may well use a post-parsing pass to deal |
| 168 | with fixities, so they may well incorrectly deliver the former |
| 169 | parse. Programmers are advised to avoid constructs whose parsing |
| 170 | involves an interaction of (lack of) associativity with the |
| 171 | let/lambda meta-rule. |
| 172 | |
| 173 | '''replace''' |
| 174 | |
| 175 | For the sake of clarity, the rest of this section shows the syntax of |
| 176 | expressions without their precedences. |
| 177 | |
| 178 | '''with''' |
| 179 | |
| 180 | For the sake of clarity, the rest of this section will assume that |
| 181 | expressions involving infix operators have been resolved according |
| 182 | to the fixities of the operators. |
| 183 | |
| 184 | === Section 3.5 (Sections) === |
| 185 | |
| 186 | replace the grammar, as above. The text is still correct. |
| 187 | |
| 188 | === Section 9 (Syntax reference) === |
| 189 | |
| 190 | Make the corresponding changes as for Section 3. |
| 191 | |
| 192 | === Section 9.6, Resolving expressions with infix operators === |
| 193 | |
| 194 | '''New sub-section''' |
| 195 | |
| 196 | The following is an example implementation of fixity resolution for |
| 197 | Haskell expressions. The function @resolve@ takes a list consisting |
| 198 | of alternating expressions and operators (an instance of the |
| 199 | @infixexp@ non-terminal in the context-free grammar), and returns |
| 200 | either @Just e@ where @e@ is the resolved expression, or @Nothing@ if |
| 201 | the input does not represent a valid expression. |
| 202 | |
| 203 | {{{ |
| 204 | data Op = Op String Prec Fixity deriving Eq |
| 205 | data Fixity = Leftfix | Rightfix | Nonfix deriving Eq |
| 206 | type Prec = Int |
| 207 | |
| 208 | data Exp = Var Var | OpApp Exp Op Exp | Neg Exp deriving Eq |
| 209 | type Var = String |
| 210 | |
| 211 | data Tok = TExp Exp | TOp Op | TNeg |
| 212 | |
| 213 | resolve :: [Tok] -> Maybe Exp |
| 214 | resolve tokens = fmap fst $ parseNeg (Op "" (-1) Nonfix) tokens |
| 215 | where |
| 216 | parseNeg :: Op -> [Tok] -> Maybe (Exp,[Tok]) |
| 217 | parseNeg op1 (TExp e1 : rest) = parse op1 e1 rest |
| 218 | parseNeg op1 (TNeg : rest) |
| 219 | = do guard (prec1 < 6) |
| 220 | (r, rest') <- parseNeg (Op "-" 6 Leftfix) rest |
| 221 | parse op1 (Neg r) rest' |
| 222 | where |
| 223 | Op _ prec1 fix1 = op1 |
| 224 | |
| 225 | parse :: Op -> Exp -> [Tok] -> Maybe (Exp, [Tok]) |
| 226 | parse _ e1 [] = Just (e1, []) |
| 227 | parse op1 e1 (TOp op2 : rest) |
| 228 | -- case (1): check for illegal expressions |
| 229 | | prec1 == prec2 && (fix1 /= fix2 || fix1 == Nonfix) |
| 230 | = Nothing |
| 231 | |
| 232 | -- case (2): op1 and op2 should associate to the left |
| 233 | | prec1 > prec2 || (prec1 == prec2 && fix1 == Leftfix) |
| 234 | = Just (e1, TOp op2 : rest) |
| 235 | |
| 236 | -- case (3): op1 and op2 should associate to the right |
| 237 | | otherwise |
| 238 | = do (r,rest') <- parseNeg op2 rest |
| 239 | parse op1 (OpApp e1 op2 r) rest' |
| 240 | where |
| 241 | Op _ prec1 fix1 = op1 |
| 242 | Op _ prec2 fix2 = op2 |
| 243 | }}} |
| 244 | |
| 245 | The algorithm works as follows. At each stage we have a call |
| 246 | |
| 247 | {{{ |
| 248 | parse op1 E1 (op2 : tokens) |
| 249 | }}} |
| 250 | |
| 251 | which means that we are looking at an expression like |
| 252 | |
| 253 | {{{ |
| 254 | E0 `op1` E1 `op2` ... (*1) |
| 255 | }}} |
| 256 | |
| 257 | (the caller holds E0). The job of @parse@ is to build the expression |
| 258 | to the right of @op1@, returning the expression and any remaining |
| 259 | input. |
| 260 | |
| 261 | There are three cases to consider: |
| 262 | |
| 263 | (1) if @op1@ and @op2@ have the same precedence, but they do not |
| 264 | have the same associativity, or they are declared to be nonfix, then |
| 265 | the expression is illegal. |
| 266 | |
| 267 | (2) If @op1@ has a higher precedence than @op2@, or @op1@ and @op2@ |
| 268 | should left-associate, then we know that the expression to the right |
| 269 | of @op1@ is @E1@, so we return this to the caller. |
| 270 | |
| 271 | (3) Otherwise, we know we want to build an expression of the form |
| 272 | @E1 `op2` R@. To find R, we call @parseNeg op2 tokens@ to compute |
| 273 | the expression to the right of @op2@, namely @R@ (more about |
| 274 | @parseNeg@ below, but essentially if @tokens@ is of the form @(E2 : |
| 275 | rest)@, then this is equivalent to @parse op2 E2 rest@). Now, we |
| 276 | have |
| 277 | |
| 278 | E0 `op1` (E1 `op2` R) `op3` ... |
| 279 | |
| 280 | where @op3@ is the next operator in the input. This is an instance of |
| 281 | (*1) above, so to continue we call parse, with the new E1 == (E1 |
| 282 | `op2` R) |
| 283 | |
| 284 | To initialise the algorithm, we set @op1@ to be an imaginary operator |
| 285 | with precedence lower than anyything else. Hence @parse@ will consume |
| 286 | the whole input, and return the resulting expression. |
| 287 | |
| 288 | The handling of the prefix negation operator, @-@, complicates matters |
| 289 | only slightly. Recall that prefix negation has the same fixity as |
| 290 | infix negation: left-associative with precedence 6. The operator to |
| 291 | the left of @-@, if there is one, must have precedence lower than 6 |
| 292 | for the expression to be legal. The negation operator itself may |
| 293 | left-associate with operators of the same fixity (e.g. @+@). So for |
| 294 | example @-a + b@ is legal and resolves as @(-a) + b), but @a + -b@ is |
| 295 | illegal. |
| 296 | |
| 297 | The function @parseNeg@ handles prefix negation. If we encounter a |
| 298 | negation operator, and it is legal in this position (the operator to |
| 299 | the left has precedence lower than 6), then we proceed in a similar |
| 300 | way to case (3) above: compute the argument to '-' by recursively |
| 301 | calling @parseNeg@, and then continue by calling @parse@. |
| 302 | |
| 303 | Note that this algorithm is insensitive to the range of precedences. |
| 304 | There is no reason in principle that Haskell should be limited to |
| 305 | integral precedences in the range 1 to 10; a larger range, or |
| 306 | fractional values, would present no additional implementation |
| 307 | difficulties. |
| 308 | |
| 309 | |
| 310 | |