To restart this thread, do your solutions handle the following (legal) variation of the original example? if True: x = 3+4 y = (2 + 4 + 5) z = 5 else: x = 5 if False: x = 8 z = 2 Notice that the assignment of y wraps onto the next line at an *earlier* column. This is legal b/c it is surrounded by parens. However, it seems that the preprocessing approaches will fail for this example. Do you have a workaround? --Yitzhak On Jun 12, 2009, at 11:43 AM, Andreas Rossberg wrote: > On Jun 12, 2009, at 10.20 h, Andrej Bauer wrote: > >> I think I understand the general idea of inserting "virtual" tokens, >> but the details confuse me still. So starting with >> >>> if True: >>> x = 3 >>> y = (2 + >>> 4 + 5) >>> else: >>> x = 5 >>> if False: >>> x = 8 >>> z = 2 >> >> Martin suggests the following: >> >>> { >>> if True: >>> ; >>> { >>> x = 3 >>> ; >>> y = (2 + >>> ; >>> { >>> 4 + 5) >>> } >>> } >>> ; >>> else: >>> ; >>> { >>> x = 5 >>> ; >>> if False: >>> ; >>> { >>> x = 8 >>> ; >>> z = 2 >>> } >>> } >>> } >> >> I have two questions. Notice that the { ... } and ( ... ) need not be >> correctly nested (in the top half), so how are we going to deal with >> this? The second question is, why are there the separators after and >> just before "else:". I would expect separators inside { .... }, but >> not around "else". > > It depends on how exactly you define your layout rules. The usual > approach is to tie start of layout-sensitive blocks to particular > keywords -- this is essentially what Python and Haskell do. In that > case, the binding to y is not affected. Haskell's rules for optional > layout would rewrite your original program as > >>> if True: >>> {x = 3 >>> ;y = (2 + >>> 4 + 5) >>> }else: >>> {x = 5 >>> ;if False: >>> {x = 8 >>> ;z = 2 >>> }} > > The basic rules are fairly simple: > > 1. Insert "{" (assume width 0) before the first token following a > layout keyword (usually ":" in Python). This opens a block. > > 2. As long as inside a block, insert ";" before each token that is > on the _same_ column as the current (i.e. innermost) "{". > > 3. A block ends as soon as you see a line whose first token is > _left_ of the current "{". Insert "}" before that token. > > Blocks can be nested, so you need to maintain a stack of starting > columns in the parser. Note that rule 3 may end several blocks at > once. EOF is treated as a token at column 0. > > The way I implemented this is by wrapping the ocamllex-generated > lexer with a function that compares each token's column with the top > of the layout stack and inserts auxiliary tokens as necessary. > > Haskell has another rule for inserting "}" if there would be a parse > error without it (this is to allow inline blocks). This rule is > pretty fudgy, and almost impossible to implement properly with a > conventional parser generator. IMO, the only sane way to reformulate > this rule is again to tie it to specific keywords, e.g. insert "}" > before "else" if missing. This can be implemented in the parser by > making closing braces optional in the right places. > > - Andreas > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs -------------------------------------------------- Yitzhak Mandelbaum AT&T Labs - Research http://www.research.att.com/~yitzhak