caml-list - the Caml user's mailing list
From: Andreas Rossberg <rossberg@mpi-sws.org>
To: Andrej Bauer <Andrej.Bauer@andrej.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] ocamllex and python-style indentation
Date: Fri, 12 Jun 2009 17:43:50 +0200
Message-ID: <FBA1153F-776B-47FF-B267-22504D045671@mpi-sws.org>
In-Reply-To: <7d8707de0906120120x10cc8fe0p54adbd189003f3da@mail.gmail.com>

On Jun 12, 2009, at 10.20 h, Andrej Bauer wrote:

> I think I understand the general idea of inserting "virtual" tokens,
> but the details confuse me still. So starting with
>
>> if True:
>>    x = 3
>>    y = (2 +
>>      4 + 5)
>> else:
>>    x = 5
>>    if False:
>>        x = 8
>>        z = 2
>
> Martin suggests the following:
>
>> {
>> if True:
>> ;
>>   {
>>   x = 3
>>   ;
>>   y = (2 +
>>   ;
>>     {
>>     4 + 5)
>>     }
>>   }
>> ;
>> else:
>> ;
>>   {
>>   x = 5
>>   ;
>>   if False:
>>   ;
>>       {
>>       x = 8
>>       ;
>>       z = 2
>>       }
>>   }
>> }
>
> I have two questions. Notice that the { ... } and ( ... ) need not be
> correctly nested (in the top half), so how are we going to deal with
> this? The second question is, why are there the separators after and
> just before "else:". I would expect separators inside { .... }, but
> not around "else".

It depends on how exactly you define your layout rules. The usual
approach is to tie the start of a layout-sensitive block to particular
keywords -- this is essentially what Python and Haskell do. In that
case, the expression bound to y is not affected: "(" is not a layout
keyword, and the continuation line is indented further than the
enclosing block. Haskell's rules for optional layout would rewrite
your original program as

>> if True:
>>    {x = 3
>>    ;y = (2 +
>>      4 + 5)
>> }else:
>>    {x = 5
>>    ;if False:
>>        {x = 8
>>        ;z = 2
>> }}

The basic rules are fairly simple:

1. Insert "{" (assume width 0) before the first token following a  
layout keyword (usually ":" in Python). This opens a block.

2. While inside a block, insert ";" before each token that is in
the _same_ column as the current (i.e. innermost) "{".

3. A block ends as soon as you see a line whose first token is _left_  
of the current "{". Insert "}" before that token.

Blocks can be nested, so you need to maintain a stack of starting  
columns in the parser. Note that rule 3 may end several blocks at  
once. EOF is treated as a token at column 0.

The way I implemented this is by wrapping the ocamllex-generated lexer  
with a function that compares each token's column with the top of the  
layout stack and inserts auxiliary tokens as necessary.
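
To make that concrete, here is a minimal sketch of such a wrapper. It is
not my actual code: the token type, the toy `tokenize` function (standing
in for an ocamllex-generated lexer) and the names `layout` and `render`
are all made up for illustration, and ":" is assumed to be the only
layout keyword. Rules 2 and 3 are applied only to the first token of a
line, and EOF simply closes every block that is still open.

```ocaml
(* Hypothetical token type; a real lexer would have many more cases. *)
type token = WORD of string | COLON | LBRACE | SEMI | RBRACE | EOF

(* A token with its source position (line and column, both 0-based). *)
type located = { tok : token; line : int; col : int }

(* Toy stand-in for an ocamllex lexer: splits on spaces and treats a
   trailing ':' as the layout keyword. *)
let tokenize (lines : string list) : located list =
  let lex_line line s =
    let n = String.length s in
    let rec scan i acc =
      if i >= n then acc
      else if s.[i] = ' ' then scan (i + 1) acc
      else begin
        let j = ref i in
        while !j < n && s.[!j] <> ' ' do incr j done;
        let w = String.sub s i (!j - i) in
        let len = String.length w in
        let acc =
          if len > 1 && w.[len - 1] = ':' then
            { tok = COLON; line; col = !j - 1 }
            :: { tok = WORD (String.sub w 0 (len - 1)); line; col = i }
            :: acc
          else if w = ":" then { tok = COLON; line; col = i } :: acc
          else { tok = WORD w; line; col = i } :: acc
        in
        scan !j acc
      end
    in
    List.rev (scan 0 [])
  in
  List.concat (List.mapi lex_line lines)
  @ [ { tok = EOF; line = List.length lines; col = 0 } ]

(* Rule 3: pop every block whose starting column is right of [col]. *)
let rec close col stack acc =
  match stack with
  | top :: rest when col < top -> close col rest (RBRACE :: acc)
  | _ -> (stack, acc)

(* The wrapper: [stack] holds the starting columns of the open blocks,
   [opening] is true right after a layout keyword. *)
let layout (input : located list) : token list =
  let rec go input stack opening prev_line acc =
    match input with
    | [] -> List.rev acc
    | { tok = EOF; _ } :: _ ->
      (* EOF closes all remaining blocks. *)
      List.rev (EOF :: List.fold_left (fun a _ -> RBRACE :: a) acc stack)
    | { tok; line; col } :: rest ->
      let stack, acc =
        if opening then (col :: stack, LBRACE :: acc)        (* rule 1 *)
        else if line <> prev_line then begin
          let stack, acc = close col stack acc in            (* rule 3 *)
          match stack with
          | top :: _ when col = top -> (stack, SEMI :: acc)  (* rule 2 *)
          | _ -> (stack, acc)
        end
        else (stack, acc)
      in
      go rest stack (tok = COLON) line (tok :: acc)
  in
  go input [] false (-1) []

let show = function
  | WORD w -> w | COLON -> ":" | LBRACE -> "{" | SEMI -> ";"
  | RBRACE -> "}" | EOF -> "<eof>"

(* Convenience for experiments: lex, insert layout tokens, print. *)
let render lines = String.concat " " (List.map show (layout (tokenize lines)))
```

Calling `render` on the lines of your example program gives

  if True : { x = 3 ; y = (2 + 4 + 5) } else : { x = 5 ; if False : { x = 8 ; z = 2 } } <eof>

which matches the rewritten program above. With a real ocamllex lexer
you would read the column from the lexbuf positions instead, and buffer
the pending virtual tokens so that the wrapper can hand them to the
parser one at a time.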

Haskell has another rule for inserting "}" if there would be a parse  
error without it (this is to allow inline blocks). This rule is pretty  
fudgy, and almost impossible to implement properly with a conventional  
parser generator. IMO, the only sane way to reformulate this rule is  
again to tie it to specific keywords, e.g. insert "}" before "else" if  
missing. This can be implemented in the parser by making closing  
braces optional in the right places.

- Andreas

