caml-list - the Caml user's mailing list
From: Mike Lin <nilekim@gmail.com>
To: Yitzhak Mandelbaum <yitzhak@research.att.com>
Cc: Andreas Rossberg <rossberg@mpi-sws.org>,
	caml-list@inria.fr, Andrej Bauer <Andrej.Bauer@andrej.com>
Subject: Re: [Caml-list] ocamllex and python-style indentation
Date: Tue, 30 Jun 2009 16:19:50 -0400
Message-ID: <2a1a1a0c0906301319q77c20932gd6b879f5af212028@mail.gmail.com>
In-Reply-To: <E47AC31E-BF02-4440-A0BD-EB4B2D90182A@research.att.com>

More generally, you've got parentheses, comments, and string literals,
and you need to know to ignore whitespace within any of those -- and
to ignore e.g. an open parenthesis that occurs within a comment, or a
close comment that occurs within a string literal. So inevitably
you've got to lex and parse at some level to make this work for a
practical language.
There are still some byzantine cases that ocaml+twt doesn't handle
properly, even though I think it comes pretty close to the least
complex approach that is still usable in practice.
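
As a rough illustration of the bookkeeping described above (this is not how
ocaml+twt is implemented), here is a minimal Python sketch that reports which
lines begin outside any parenthesis, comment, or string literal, i.e. the only
lines whose indentation should count. The function name
layout_significant_lines is made up for this example. It assumes OCaml-style
nesting (* ... *) comments and double-quoted strings with backslash escapes,
and it ignores character literals and other corner cases.

```python
def layout_significant_lines(src):
    """Return the 1-based numbers of lines that begin outside any
    parenthesis, comment, or string literal."""
    significant = []
    paren_depth = 0      # "(" seen but not yet closed
    comment_depth = 0    # (* ... *) comments may nest
    in_string = False
    line_no = 1
    at_line_start = True
    i = 0
    while i < len(src):
        if at_line_start:
            if paren_depth == 0 and comment_depth == 0 and not in_string:
                significant.append(line_no)
            at_line_start = False
        ch = src[i]
        two = src[i:i + 2]
        if ch == "\n":
            line_no += 1
            at_line_start = True
            i += 1
        elif in_string:
            if ch == "\\" and src[i + 1:i + 2] not in ("", "\n"):
                i += 2               # skip the escaped character
            else:
                if ch == '"':
                    in_string = False
                i += 1
        elif ch == '"':
            in_string = True         # also tracked inside comments, as in OCaml
            i += 1
        elif two == "(*":
            comment_depth += 1
            i += 2
        elif two == "*)" and comment_depth > 0:
            comment_depth -= 1
            i += 2
        elif comment_depth > 0:
            i += 1                   # anything else inside a comment is ignored
        elif ch == "(":
            paren_depth += 1
            i += 1
        elif ch == ")":
            paren_depth = max(0, paren_depth - 1)
            i += 1
        else:
            i += 1
    return significant


example = 'y = (2 +\n4 + 5)   (* a stray ( in a comment *)\nz = "*)"\nw = 1\n'
print(layout_significant_lines(example))   # prints [1, 3, 4]
```

Line 2 is skipped because it starts inside the open parenthesis; the "(" in
the comment and the "*)" in the string do not disturb the counters.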

Mike

On Tue, Jun 30, 2009 at 2:58 PM, Yitzhak Mandelbaum <yitzhak@research.att.com> wrote:
> To restart this thread, do your solutions handle the following (legal)
> variation of the original example?
> if True:
>    x = 3+4
>    y = (2 +
> 4 + 5)
>    z = 5
> else:
>    x = 5
>    if False:
>        x = 8
>        z = 2
>
> Notice that the assignment of y wraps onto the next line at an *earlier*
> column. This is legal because the expression is enclosed in parentheses.
> However, it seems that the preprocessing approaches will fail for this
> example. Do you have a workaround?
> --Yitzhak
>
> On Jun 12, 2009, at 11:43 AM, Andreas Rossberg wrote:
>
> On Jun 12, 2009, at 10.20 h, Andrej Bauer wrote:
>
> I think I understand the general idea of inserting "virtual" tokens,
> but the details confuse me still. So starting with
>
> if True:
>    x = 3
>    y = (2 +
>      4 + 5)
> else:
>    x = 5
>    if False:
>        x = 8
>        z = 2
>
> Martin suggests the following:
>
> {
> if True:
> ;
>   {
>   x = 3
>   ;
>   y = (2 +
>   ;
>     {
>     4 + 5)
>     }
>   }
> ;
> else:
> ;
>   {
>   x = 5
>   ;
>   if False:
>   ;
>       {
>       x = 8
>       ;
>       z = 2
>       }
>   }
> }
>
> I have two questions. First, notice that the { ... } and ( ... ) need not
> be correctly nested (see the top half), so how are we going to deal with
> this? Second, why are there separators just before and after "else:"?
> I would expect separators inside { ... }, but not around "else".
>
> It depends on how exactly you define your layout rules. The usual approach
> is to tie the start of layout-sensitive blocks to particular keywords -- this
> is essentially what Python and Haskell do. In that case, the binding of y is
> not affected. Haskell's rules for optional layout would rewrite your
> original program as
>
> if True:
>    {x = 3
>    ;y = (2 +
>      4 + 5)
> }else:
>    {x = 5
>    ;if False:
>        {x = 8
>        ;z = 2
> }}
>
> The basic rules are fairly simple:
>
> 1. Insert "{" (assume width 0) before the first token following a layout
> keyword (usually ":" in Python). This opens a block.
> 2. As long as inside a block, insert ";" before each token that is on the
> _same_ column as the current (i.e. innermost) "{".
> 3. A block ends as soon as you see a line whose first token is _left_ of the
> current "{". Insert "}" before that token.
>
> Blocks can be nested, so you need to maintain a stack of starting columns in
> the parser. Note that rule 3 may end several blocks at once. EOF is treated
> as a token at column 0.
>
> The way I implemented this is by wrapping the ocamllex-generated lexer with
> a function that compares each token's column with the top of the layout
> stack and inserts auxiliary tokens as necessary.
>
> Haskell has another rule for inserting "}" if there would be a parse error
> without it (this is to allow inline blocks). This rule is pretty fudgy, and
> almost impossible to implement properly with a conventional parser
> generator. IMO, the only sane way to reformulate this rule is again to tie
> it to specific keywords, e.g. insert "}" before "else" if missing. This can
> be implemented in the parser by making closing braces optional in the right
> places.
>
> - Andreas
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
> --------------------------------------------------
> Yitzhak Mandelbaum
> AT&T Labs - Research
> http://www.research.att.com/~yitzhak
>
>
>
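
The three layout rules quoted above, together with one possible workaround for
Yitzhak's variation, can be sketched as a wrapper around a token stream. This
is a minimal Python sketch under stated assumptions, not Andreas's actual
code: tokenize is a toy stand-in for an ocamllex-generated lexer, the names
with_layout and track_brackets are made up, columns assume spaces rather than
tabs, and the bracket-depth counter is an addition that is not part of the
three quoted rules. While the depth is non-zero, rules 2 and 3 are simply
skipped, so a continuation line inside parentheses may start at any column.

```python
import re

# Toy lexer: identifiers, integers, and single punctuation characters.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")
OPENERS = {"(", "["}
CLOSERS = {")", "]"}


def tokenize(src):
    """Yield (text, line, column) triples."""
    for line_no, line in enumerate(src.splitlines(), start=1):
        for m in TOKEN.finditer(line):
            yield m.group(), line_no, m.start()


def with_layout(tokens, layout_keywords=(":",), track_brackets=True):
    """Insert "{", ";" and "}" into a token stream according to layout."""
    stack = []            # starting columns of open blocks, innermost last
    depth = 0             # bracket nesting depth (the assumed workaround)
    open_pending = False  # True right after a layout keyword
    last_line = 0
    for text, line, col in tokens:
        first_on_line = line != last_line
        last_line = line
        if open_pending:
            # Rule 1: the token after a layout keyword opens a block.
            yield "{"
            stack.append(col)
            open_pending = False
        elif first_on_line and depth == 0:
            # Rule 3: close every block that starts to the right of this token.
            while stack and col < stack[-1]:
                stack.pop()
                yield "}"
            # Rule 2: a token on the block's own column starts a new item.
            if stack and col == stack[-1]:
                yield ";"
        if track_brackets:
            if text in OPENERS:
                depth += 1
            elif text in CLOSERS and depth > 0:
                depth -= 1
        yield text
        # A layout keyword inside brackets does not open a block.
        open_pending = depth == 0 and text in layout_keywords
    # EOF closes whatever is still open.
    for _ in stack:
        yield "}"


yitzhak = "\n".join([
    "if True:",
    "   x = 3+4",
    "   y = (2 +",
    "4 + 5)",
    "   z = 5",
    "else:",
    "   x = 5",
    "   if False:",
    "       x = 8",
    "       z = 2",
])
naive = " ".join(with_layout(tokenize(yitzhak), track_brackets=False))
aware = " ".join(with_layout(tokenize(yitzhak)))
print(naive)
print(aware)
```

With track_brackets=False the first block is closed in the middle of the
parenthesized expression ("( 2 + } 4 + 5 )"); with the depth counter the
expression stays intact and "z = 5" remains in the first block. Empty blocks
and the "}"-before-"else" rule are not handled here.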


Thread overview: 22+ messages
2009-06-11 12:57 Andrej Bauer
2009-06-11 13:12 ` [Caml-list] " yoann padioleau
2009-06-11 13:21 ` Andreas Rossberg
2009-06-11 13:44 ` Martin Jambon
2009-06-12  8:20   ` Andrej Bauer
2009-06-12 12:56     ` Martin Jambon
2009-06-12 13:34     ` Martin Jambon
2009-06-12 15:43     ` Andreas Rossberg
2009-06-30 18:58       ` Yitzhak Mandelbaum
2009-06-30 20:19         ` Mike Lin [this message]
2009-06-30 22:06         ` Andreas Rossberg
2009-07-01  2:13           ` Mike Lin
2009-07-01  7:31             ` Andreas Rossberg
2009-07-01 14:02               ` Mike Lin
2009-07-01 14:17                 ` Andreas Rossberg
2009-07-01 14:21                   ` Andreas Rossberg
2009-07-01 14:37                     ` Mike Lin
2009-07-01 15:03                   ` Sylvain Le Gall
2009-07-01 15:16                     ` [Caml-list] " Andreas Rossberg
2009-07-01 16:26                       ` Sylvain Le Gall
2009-07-01 15:19                     ` [Caml-list] " Martin Jambon
2009-07-01 15:43                       ` Andreas Rossberg