[Caml-list] CFG's and OCaml

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* [Caml-list] CFG's and OCaml
@ 2004-08-13 14:04 David McClain
  2004-08-13 15:05 ` Damien Doligez
  2004-08-13 15:49 ` Brian Hurt
  0 siblings, 2 replies; 27+ messages in thread
From: David McClain @ 2004-08-13 14:04 UTC (permalink / raw)
  To: caml-list

Okay... here's a case where when I do "exactly" what the gurus at Inria 
do, I get a reduce/reduce conflict, and yet when I build OCaml it does 
not report any such conflicts. [I say "exactly" because obviously I'm 
not...]

simple_expr:
	constant
	...

simple_pattern:
	signed_constant
	...

constant:
	INT
| 	FLOAT

signed_constant:
	constant
|	MINUS INT
|	MINUS FLOAT
  ;; /*  ---------------------------------------------------------- */

The reduce/reduce conflict comes on deciding whether to assign an INT 
seen to signed_constant which will reduce to simple_pattern, or instead 
to become constant which reduces to simple_expr. Both Inria and I do 
completely different semantic reductions in these two cases, and so a 
reduce/reduce conflict could be fatal here...

[ I found many instances of my own reduce/reduce conflicts arising from 
error trapping clauses with things like missing RPAREN's etc. Both 
pattern and expressions had these error traps, and so it was simple to 
remove the reduce/reduce conflict by eliding a trap from one or the 
other of these two places. That's pretty harmless since the compiler 
aborts there anyway and it makes no difference which reduction is 
chosen.

But the last few reduce/reduce conflicts are like the one shown above 
and they do matter.]

So, as so often happens with the master's touch, everything works fine 
for them, but it doesn't for me. Why should this be, in this example 
case?

David McClain
Senior Corporate Scientist
Avisere, Inc.

david.mcclain@avisere.com
+1.520.390.7738 (USA)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 14:04 [Caml-list] CFG's and OCaml David McClain
@ 2004-08-13 15:05 ` Damien Doligez
  2004-08-13 15:26   ` David McClain
  2004-08-13 15:28   ` David McClain
  2004-08-13 15:49 ` Brian Hurt
  1 sibling, 2 replies; 27+ messages in thread
From: Damien Doligez @ 2004-08-13 15:05 UTC (permalink / raw)
  To: caml Caml

On Aug 13, 2004, at 16:04, David McClain wrote:

> simple_expr:
> 	constant
> 	...
>
> simple_pattern:
> 	signed_constant
> 	...
>
> constant:
> 	INT
> | 	FLOAT
>
> signed_constant:
> 	constant
> |	MINUS INT
> |	MINUS FLOAT
>  ;; /*  ---------------------------------------------------------- */
>
> The reduce/reduce conflict comes on deciding whether to assign an INT 
> seen to signed_constant which will reduce to simple_pattern, or 
> instead to become constant which reduces to simple_expr. Both Inria 
> and I do completely different semantic reductions in these two cases, 
> and so a reduce/reduce conflict could be fatal here...

But...  There can only be a reduce-reduce conflict if you have two 
derivations
of the same prefix, one of which expects a constant as the next 
nonterminal,
and the other expects a signed_constant.  In other words, we need more 
context
to tell what's going on.  You should give option -v to ocamlyacc and 
look
at the resulting *.output file.  It's a bit hard to understand at first,
but with some experience you can get a little information out of it.

-- Damien

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 15:05 ` Damien Doligez
@ 2004-08-13 15:26   ` David McClain
  2004-08-13 16:12     ` Damien Doligez
  2004-08-13 15:28   ` David McClain
  1 sibling, 1 reply; 27+ messages in thread
From: David McClain @ 2004-08-13 15:26 UTC (permalink / raw)
  To: Damien Doligez, caml Caml

?? I thought I had provided the context you want to see... That's
simple_expr and simple_pattern. Perhaps I should have also shown you what
else the simple_expr expects to see... indeed it is also looking for a unary
MINUS.

expr:
    simple_expr
    ...

simple_expr:
    ...
    | constant
    ...
    | MINUS expr  %prec prec_unary_minus
    ...

And yes, I have been religiously studying the parser.output files at every
step of the way.


----- Original Message ----- 
From: "Damien Doligez" <damien.doligez@inria.fr>
To: "caml Caml" <caml-list@inria.fr>
Sent: Friday, August 13, 2004 08:05
Subject: Re: [Caml-list] CFG's and OCaml


> On Aug 13, 2004, at 16:04, David McClain wrote:
>
> > simple_expr:
> > constant
> > ...
> >
> > simple_pattern:
> > signed_constant
> > ...
> >
> > constant:
> > INT
> > | FLOAT
> >
> > signed_constant:
> > constant
> > | MINUS INT
> > | MINUS FLOAT
> >  ;; /*  ---------------------------------------------------------- */
> >
> > The reduce/reduce conflict comes on deciding whether to assign an INT
> > seen to signed_constant which will reduce to simple_pattern, or
> > instead to become constant which reduces to simple_expr. Both Inria
> > and I do completely different semantic reductions in these two cases,
> > and so a reduce/reduce conflict could be fatal here...
>
> But...  There can only be a reduce-reduce conflict if you have two
> derivations
> of the same prefix, one of which expects a constant as the next
> nonterminal,
> and the other expects a signed_constant.  In other words, we need more
> context
> to tell what's going on.  You should give option -v to ocamlyacc and
> look
> at the resulting *.output file.  It's a bit hard to understand at first,
> but with some experience you can get a little information out of it.
>
> -- Damien
>
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives:
http://caml.inria.fr
> Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ:
http://caml.inria.fr/FAQ/
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 15:05 ` Damien Doligez
  2004-08-13 15:26   ` David McClain
@ 2004-08-13 15:28   ` David McClain
  1 sibling, 0 replies; 27+ messages in thread
From: David McClain @ 2004-08-13 15:28 UTC (permalink / raw)
  To: Damien Doligez, caml Caml

Sorry, mistyping...

expr:
   | simple_expr
    ...
   | MINUS expr %prec unary_minus
    ...

simple_expr:
    ...
  |  constant
    ...

simple_pattern:
    ...
   | signed_constant
    ...

constant:
    INT
  | FLOAT

signed_constant:
    constant
  | MINUS INT
  | MINUS FLOAT

----- Original Message ----- 
From: "Damien Doligez" <damien.doligez@inria.fr>
To: "caml Caml" <caml-list@inria.fr>
Sent: Friday, August 13, 2004 08:05
Subject: Re: [Caml-list] CFG's and OCaml


> On Aug 13, 2004, at 16:04, David McClain wrote:
>
> > simple_expr:
> > constant
> > ...
> >
> > simple_pattern:
> > signed_constant
> > ...
> >
> > constant:
> > INT
> > | FLOAT
> >
> > signed_constant:
> > constant
> > | MINUS INT
> > | MINUS FLOAT
> >  ;; /*  ---------------------------------------------------------- */
> >
> > The reduce/reduce conflict comes on deciding whether to assign an INT
> > seen to signed_constant which will reduce to simple_pattern, or
> > instead to become constant which reduces to simple_expr. Both Inria
> > and I do completely different semantic reductions in these two cases,
> > and so a reduce/reduce conflict could be fatal here...
>
> But...  There can only be a reduce-reduce conflict if you have two
> derivations
> of the same prefix, one of which expects a constant as the next
> nonterminal,
> and the other expects a signed_constant.  In other words, we need more
> context
> to tell what's going on.  You should give option -v to ocamlyacc and
> look
> at the resulting *.output file.  It's a bit hard to understand at first,
> but with some experience you can get a little information out of it.
>
> -- Damien
>
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives:
http://caml.inria.fr
> Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ:
http://caml.inria.fr/FAQ/
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 14:04 [Caml-list] CFG's and OCaml David McClain
  2004-08-13 15:05 ` Damien Doligez
@ 2004-08-13 15:49 ` Brian Hurt
  2004-08-13 16:04   ` David McClain
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Hurt @ 2004-08-13 15:49 UTC (permalink / raw)
  To: David McClain; +Cc: caml-list

On Fri, 13 Aug 2004, David McClain wrote:

> Okay... here's a case where when I do "exactly" what the gurus at Inria 
> do, I get a reduce/reduce conflict, and yet when I build OCaml it does 
> not report any such conflicts. [I say "exactly" because obviously I'm 
> not...]
> 
> simple_expr:
> 	constant
> 	...
> 
> simple_pattern:
> 	signed_constant
> 	...
> 
> constant:
> 	INT
> | 	FLOAT
> 
> signed_constant:
> 	constant
> |	MINUS INT
> |	MINUS FLOAT
>   ;; /*  ---------------------------------------------------------- */
> 
> The reduce/reduce conflict comes on deciding whether to assign an INT 
> seen to signed_constant which will reduce to simple_pattern, or instead 
> to become constant which reduces to simple_expr. Both Inria and I do 
> completely different semantic reductions in these two cases, and so a 
> reduce/reduce conflict could be fatal here...

When the compiler sees an int, which path should it take?

My advice would be to remove the constant from signed_constant's patterns.


> So, as so often happens with the master's touch, everything works fine 
> for them, but it doesn't for me. Why should this be, in this example 
> case?

They're doing something evil- take a look at line 355 of 
parsing/parser.mly in 3.08.0.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 15:49 ` Brian Hurt
@ 2004-08-13 16:04   ` David McClain
  2004-08-13 16:29     ` Brian Hurt
  2004-08-13 16:58     ` Paul Snively
  0 siblings, 2 replies; 27+ messages in thread
From: David McClain @ 2004-08-13 16:04 UTC (permalink / raw)
  To: Brian Hurt, caml-list

Uhuhh... Yes, I did that same "evil" thing as well, even before 
discussing all these reduce/reduce conflicts/

What I find is that these screwball little tricks might help, and the 
might not. YACC is just too darned sensitive to minor and non-obvious 
perturbations in the input grammar specification. Realizing its legacy, 
indeed it does arise from the old IBM-360, or more properly PDP-10, 
days. That was the style of programming back then.. I remember it well.

I see that the root of the problem really lies in the reducion from 
LL(1) to LALR(1) where driver table entries are shared as much as 
possible to diminish the size of these tables. However, when that 
occurs, we end up getting mixes like we have between the distinct 
expression and pattern subtrees in the grammar.

Indeed these do appear syntactically similar, yet by way of 
specification, we are hoping to apply completely different semantic 
actions to these reductions. This is a dilemma.

I had though the other night about creating a common subtree of 
semantic actions and then let the higher levels unravel that subtree. 
That would probably work well here, but it is a lot more work.

David McClain
Senior Corporate Scientist
Avisere, Inc.

david.mcclain@avisere.com
+1.520.390.7738 (USA)

On Aug 13, 2004, at 08:49, Brian Hurt wrote:

> On Fri, 13 Aug 2004, David McClain wrote:
>
>> Okay... here's a case where when I do "exactly" what the gurus at 
>> Inria
>> do, I get a reduce/reduce conflict, and yet when I build OCaml it does
>> not report any such conflicts. [I say "exactly" because obviously I'm
>> not...]
>>
>> simple_expr:
>> 	constant
>> 	...
>>
>> simple_pattern:
>> 	signed_constant
>> 	...
>>
>> constant:
>> 	INT
>> | 	FLOAT
>>
>> signed_constant:
>> 	constant
>> |	MINUS INT
>> |	MINUS FLOAT
>>   ;; /*  ---------------------------------------------------------- */
>>
>> The reduce/reduce conflict comes on deciding whether to assign an INT
>> seen to signed_constant which will reduce to simple_pattern, or 
>> instead
>> to become constant which reduces to simple_expr. Both Inria and I do
>> completely different semantic reductions in these two cases, and so a
>> reduce/reduce conflict could be fatal here...
>
> When the compiler sees an int, which path should it take?
>
> My advice would be to remove the constant from signed_constant's 
> patterns.
>
>
>> So, as so often happens with the master's touch, everything works fine
>> for them, but it doesn't for me. Why should this be, in this example
>> case?
>
> They're doing something evil- take a look at line 355 of
> parsing/parser.mly in 3.08.0.
>
> -- 
> "Usenet is like a herd of performing elephants with diarrhea -- 
> massive,
> difficult to redirect, awe-inspiring, entertaining, and a source of
> mind-boggling amounts of excrement when you least expect it."
>                                 - Gene Spafford
> Brian
>

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 15:26   ` David McClain
@ 2004-08-13 16:12     ` Damien Doligez
  0 siblings, 0 replies; 27+ messages in thread
From: Damien Doligez @ 2004-08-13 16:12 UTC (permalink / raw)
  To: caml Caml

On Aug 13, 2004, at 17:26, David McClain wrote:

> ?? I thought I had provided the context you want to see... That's
> simple_expr and simple_pattern. Perhaps I should have also shown you 
> what
> else the simple_expr expects to see... indeed it is also looking for a 
> unary
> MINUS.

The context we need is the productions that call expr and 
simple_pattern,
and recursively up to the entry point.  In other words, pretty much the
whole grammar.

-- Damien

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 16:04   ` David McClain
@ 2004-08-13 16:29     ` Brian Hurt
  2004-08-13 16:42       ` Xavier Leroy
  2004-08-13 16:58     ` Paul Snively
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Hurt @ 2004-08-13 16:29 UTC (permalink / raw)
  To: David McClain; +Cc: Ocaml Mailing List

On Fri, 13 Aug 2004, David McClain wrote:

> Uhuhh... Yes, I did that same "evil" thing as well, even before 
> discussing all these reduce/reduce conflicts/

Did you get all the symbols you needed to?  It looks like you need the 
entire first set for the "right" reduction rule.  If you missed a 
symbol, this could still be a problem.  This is why this trick is "evil".

> 
> What I find is that these screwball little tricks might help, and the 
> might not. YACC is just too darned sensitive to minor and non-obvious 
> perturbations in the input grammar specification. Realizing its legacy, 
> indeed it does arise from the old IBM-360, or more properly PDP-10, 
> days. That was the style of programming back then.. I remember it well.

The problem is that Yacc grammar is very dense.  Changing the yacc 
description in even minor ways can have very big effects on the grammar 
you are actually describing.

Take my example earlier of left vr.s right recursion.  Consider the 
expression 4 * 5 / 3.  With left recursion, this is parsed as
(4 * 5) / 3, or 20 / 3, or 6.  With right recursion, this is parsed as 4 * 
(5 / 3), or 4 * 1, or 4.  Opps.

One of the things I like about LALR(1) parsers is that my experience has 
been that if they get confused about something, sooner or later the user 
of the language will get confused about the exact same thing.  Even 
something as "harmless" as a shift/reduce conflict.  The classic example 
of this is the dangling else problem in C.  My number one complaint with 
Ocaml is the number of shift/reduce (and hidden reduce/reduce) conflicts 
in it's grammar.  These bite me on a regular basis.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 16:29     ` Brian Hurt
@ 2004-08-13 16:42       ` Xavier Leroy
  2004-08-13 17:18         ` Ken Rose
  2004-08-13 18:55         ` Brian Hurt
  0 siblings, 2 replies; 27+ messages in thread
From: Xavier Leroy @ 2004-08-13 16:42 UTC (permalink / raw)
  To: Brian Hurt; +Cc: David McClain, Ocaml Mailing List

> My number one complaint with 
> Ocaml is the number of shift/reduce (and hidden reduce/reduce) conflicts 
> in it's grammar.  These bite me on a regular basis.

>From this message and earlier messages of yours, I think you are under
the wrong impression that precedences and associativities can be used
(and would be used in OCaml's grammar) to resolve (or "hide" as you
say) reduce/reduce conflicts.  

This is incorrect: Yacc uses precedences and associativities to
resolve (i.e. choose to shift or choose to reduce) shift/reduce
conflicts only.  If there were reduce/reduce conflicts in OCaml's
grammar, Yacc would say so and no among of precedence tweaking would
hide them.

This said, it is true the OCaml grammar uses precedences a lot to deal
with shift/reduce situations.  It is equally true that some of these
situations correspond to syntactic corners of the language that can
confuse the user.

Concerning David McClain's problems, I can only repeat the advice
given at the beginning of the ocamlyacc chapter in the OCaml manual:

`` Readers unfamiliar with lex and yacc are referred to ``Compilers:
principles, techniques, and tools'' by Aho, Sethi and Ullman
(Addison-Wesley, 1986), or ``Lex & Yacc'', by Levine, Mason and Brown
(O'Reilly, 1992). ''

(The latter is more practice-oriented.)

- Xavier Leroy

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 16:04   ` David McClain
  2004-08-13 16:29     ` Brian Hurt
@ 2004-08-13 16:58     ` Paul Snively
  1 sibling, 0 replies; 27+ messages in thread
From: Paul Snively @ 2004-08-13 16:58 UTC (permalink / raw)
  To: David McClain; +Cc: Brian Hurt, caml-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi David!

On Aug 13, 2004, at 9:04 AM, David McClain wrote:

> Uhuhh... Yes, I did that same "evil" thing as well, even before 
> discussing all these reduce/reduce conflicts/
>
> What I find is that these screwball little tricks might help, and the 
> might not. YACC is just too darned sensitive to minor and non-obvious 
> perturbations in the input grammar specification. Realizing its 
> legacy, indeed it does arise from the old IBM-360, or more properly 
> PDP-10, days. That was the style of programming back then.. I remember 
> it well.
>
The issue isn't strictly one of style. You're using an entirely 
context-free parser generator to attempt to parse a grammar that _is 
not_ context free. The joker in this deck is that no non-trivial 
grammar is context-free, so all real-world "context free" grammars use 
some kind of trick to get past it. Hence the effort to turn (1) into 
(k) or, better still, (inf), cf. ANTLR, Spirit, Bison 1.50 or later, or 
Elkhound.

> I had though the other night about creating a common subtree of 
> semantic actions and then let the higher levels unravel that subtree. 
> That would probably work well here, but it is a lot more work.
>
Quite. My advice remains to be to use Elkhound.

> David McClain
> Senior Corporate Scientist
> Avisere, Inc.
>
> david.mcclain@avisere.com
> +1.520.390.7738 (USA)
>
Best regards,
Paul

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iEYEARECAAYFAkEc8yMACgkQbot1wzHBQBWHggCeNLjlA7AdwuvQBRZ+T5LDqvkQ
WLoAniMo7jA2ISdBz24MeDUZd8aJ7bNt
=JehV
-----END PGP SIGNATURE-----

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 16:42       ` Xavier Leroy
@ 2004-08-13 17:18         ` Ken Rose
  2004-08-13 18:55         ` Brian Hurt
  1 sibling, 0 replies; 27+ messages in thread
From: Ken Rose @ 2004-08-13 17:18 UTC (permalink / raw)
  To: Ocaml Mailing List; +Cc: David McClain

Xavier Leroy wrote:

> Concerning David McClain's problems, I can only repeat the advice
> given at the beginning of the ocamlyacc chapter in the OCaml manual:
> 
> `` Readers unfamiliar with lex and yacc are referred to ``Compilers:
> principles, techniques, and tools'' by Aho, Sethi and Ullman
> (Addison-Wesley, 1986), or ``Lex & Yacc'', by Levine, Mason and Brown
> (O'Reilly, 1992). ''

Which brings up the thought of comp.compilers, which John Levine 
moderates, and which is crowded with people who can probably help better 
than this group can.

  - ken

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 16:42       ` Xavier Leroy
  2004-08-13 17:18         ` Ken Rose
@ 2004-08-13 18:55         ` Brian Hurt
  2004-08-14  0:25           ` Jon Harrop
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Hurt @ 2004-08-13 18:55 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: David McClain, Ocaml Mailing List

On Fri, 13 Aug 2004, Xavier Leroy wrote:

> From this message and earlier messages of yours, I think you are under
> the wrong impression that precedences and associativities can be used
> (and would be used in OCaml's grammar) to resolve (or "hide" as you
> say) reduce/reduce conflicts.  
> 
> This is incorrect: Yacc uses precedences and associativities to
> resolve (i.e. choose to shift or choose to reduce) shift/reduce
> conflicts only.  If there were reduce/reduce conflicts in OCaml's
> grammar, Yacc would say so and no among of precedence tweaking would
> hide them.

OK.  I don't generally use explicitly stated precedences anyways.  Of 
course, as a programmer, I work under the rule that multiplication and 
division happen before addition and subtraction, and that I should put 
parens around everything else.

> 
> This said, it is true the OCaml grammar uses precedences a lot to deal
> with shift/reduce situations.  It is equally true that some of these
> situations correspond to syntactic corners of the language that can
> confuse the user.

I see why it's done the way it is.  It's an aesthetic decision.

> Concerning David McClain's problems, I can only repeat the advice
> given at the beginning of the ocamlyacc chapter in the OCaml manual:
> 
> `` Readers unfamiliar with lex and yacc are referred to ``Compilers:
> principles, techniques, and tools'' by Aho, Sethi and Ullman
> (Addison-Wesley, 1986), or ``Lex & Yacc'', by Levine, Mason and Brown
> (O'Reilly, 1992). ''
> 
> (The latter is more practice-oriented.)
> 

Having read all three, and several others as well, the O'Reilly book is a
good reference if you already basically know Lex and Yacc.  The Dragon
Book (Aho et. al.) is a better book for building a compiler.  But Holub's
"Compiler Design in C" does a Minix, and actually builds a lex/yacc like
tool.  But it does this at the cost of actually using the tools to make a
compiler, including such topics as code generation and optimization.  A
better title would be "Lex/Yacc Design in C".  As such, I've found it a
better book for explaining how the tools actually work, including why they
occassionally behave strangely.

IMHO, read Holub first.  Then read Aho.  The get the O'Reilly book to 
leave by your computer.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-13 18:55         ` Brian Hurt
@ 2004-08-14  0:25           ` Jon Harrop
  2004-08-14  0:57             ` Erik de Castro Lopo
                               ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Jon Harrop @ 2004-08-14  0:25 UTC (permalink / raw)
  To: Ocaml Mailing List

I have some (probably trivial) questions about parsers:

1. Are most programming languages designed to be implementable using lex and 
yacc?

2. If so, are their designs restricted by this?

3. If so, is the fact that most languages disallow "a<b<c" due to this?

4. Could that be added to OCaml? ;-)

5. Is it productive to think in terms of coercing lex and yacc into doing as 
much of the work as possible and then using postprocessing to do the rest 
(e.g. this is the way I'd implement a<b<c)?

Cheers,
Jon.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14  0:25           ` Jon Harrop
@ 2004-08-14  0:57             ` Erik de Castro Lopo
  2004-08-14  8:52               ` Alan Schmitt
  2004-08-14  3:33             ` Brian Hurt
  2004-08-14  7:55             ` skaller
  2 siblings, 1 reply; 27+ messages in thread
From: Erik de Castro Lopo @ 2004-08-14  0:57 UTC (permalink / raw)
  To: Ocaml Mailing List

On Sat, 14 Aug 2004 01:25:59 +0100
Jon Harrop <jon@jdh30.plus.com> wrote:

> 
> I have some (probably trivial) questions about parsers:
> 
> 1. Are most programming languages designed to be implementable using 
> lex and yacc?

Maybe not lex and yacc, but most are designed to be parsed by 
LALR(1) parsers.

Currently the most difficult to parse language seems to be C++.

> 2. If so, are their designs restricted by this?

Unlikely.

> 3. If so, is the fact that most languages disallow "a<b<c" due to this?

This should be parsable by an LALR(1) parser.

Anyway, this is really just syntactic sugar for "a < b && b < c".

> 5. Is it productive to think in terms of coercing lex and yacc into doing as 
> much of the work as possible and then using postprocessing to do the rest 
> (e.g. this is the way I'd implement a<b<c)?

If a, b and c are ints, then the assember version of

       if a<b<c then ....

would be exactly the same as:

	if a < b && b < c then ....

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"Anyone who considers arithmetical methods of producing random
digits is, of course, in a state of sin." - John Von Neumann (1951)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14  0:25           ` Jon Harrop
  2004-08-14  0:57             ` Erik de Castro Lopo
@ 2004-08-14  3:33             ` Brian Hurt
  2004-08-14  7:55             ` skaller
  2 siblings, 0 replies; 27+ messages in thread
From: Brian Hurt @ 2004-08-14  3:33 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sat, 14 Aug 2004, Jon Harrop wrote:

My answers (YMMV):

> 
> I have some (probably trivial) questions about parsers:
> 
> 1. Are most programming languages designed to be implementable using lex and 
> yacc?

They used to be...

> 
> 2. If so, are their designs restricted by this?

Less so than you might think, IMHO.  Mainly they push languages away from 
"bad" constructs.

> 
> 3. If so, is the fact that most languages disallow "a<b<c" due to this?

No.  "a<b<c" is parsed the same way as "a+b+c".

> 
> 4. Could that be added to OCaml? ;-)

Not my call.  Technically, yes it could.  Practically is a different 
question.

> 5. Is it productive to think in terms of coercing lex and yacc into doing as 
> much of the work as possible and then using postprocessing to do the rest 
> (e.g. this is the way I'd implement a<b<c)?

No.  I dislike coercing tools.  

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14  0:25           ` Jon Harrop
  2004-08-14  0:57             ` Erik de Castro Lopo
  2004-08-14  3:33             ` Brian Hurt
@ 2004-08-14  7:55             ` skaller
  2004-08-14 20:19               ` Jon Harrop
  2 siblings, 1 reply; 27+ messages in thread
From: skaller @ 2004-08-14  7:55 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sat, 2004-08-14 at 10:25, Jon Harrop wrote:
> I have some (probably trivial) questions about parsers:
> 
> 1. Are most programming languages designed to be implementable using lex and 
> yacc?

No :) Felix is, and many academic languages are,
but most industrial kind of languages aren't.
Heck, can you really use the word 'designed' for them?

> 2. If so, are their designs restricted by this?

There are some constraints I wish didn't exist
in the Felix grammar. Its not too bad -- I tend to
accept if a simple parsing engine can parse it,
so can a human, so conforming to the constraints
is a good idea anyhow.

> 3. If so, is the fact that most languages disallow "a<b<c" due to this?

Felix is LALR(1) and supports the following Pythonesque syntax:

  a < b &< c &< d

which is NOT quite what you proposed: we don't want
to get confused with

 (a < b) < c

which is the interpretation you'd get if you make <
a left associative binary operator.

It is possible to make < a chain operator instead,
that is, associative multi-ary: I do that with
operator + and * since it is necessary for the
correct interpretation of the type:

	a * b * c

which is NOT the same as

	(a * b) * c

> 4. Could that be added to OCaml? ;-)

Not without breaking existing code -- but Camlp4 could
add the

	a < b &< c &< d 

syntax easily I expect.

> 5. Is it productive to think in terms of coercing lex and yacc into doing as 
> much of the work as possible 

I personally think you should do the opposite -- let lex/yacc
do the least possible work since they're fairly rigid.
You may need to fiddle with your grammar to get the language
you want -- and it is better if that has the minimum
impact on your semantic logic. IMHO.

> and then using postprocessing to do the rest 
> (e.g. this is the way I'd implement a<b<c)?

For chain operator like '+' and '*' in Felix,
yacc just returns a list.

If the context is a type, the result builds a tuple kind
with n arguments.

If the context is executable code, it is remapped into
a left associative binary operator using a fold_left over
the list.

I actually *could* do this inside the parser, but
I don't, i use a separate 'desugaring' phase
to rewrite the syntax tree. (Actually i recode it
rather than rewriting it).

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14  0:57             ` Erik de Castro Lopo
@ 2004-08-14  8:52               ` Alan Schmitt
  0 siblings, 0 replies; 27+ messages in thread
From: Alan Schmitt @ 2004-08-14  8:52 UTC (permalink / raw)
  To: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 580 bytes --]

* Erik de Castro Lopo (ocaml-erikd@mega-nerd.com) wrote:
> On Sat, 14 Aug 2004 01:25:59 +0100
> Jon Harrop <jon@jdh30.plus.com> wrote:
> > 
> > 1. Are most programming languages designed to be implementable using 
> > lex and yacc?
> 
> Maybe not lex and yacc, but most are designed to be parsed by 
> LALR(1) parsers.
> 
> Currently the most difficult to parse language seems to be C++.

I vote for TeX as the most difficult language to lex and parse.

Alan Schmitt

-- 
The hacker: someone who figured things out and made something cool happen.
.O.
..O
OOO

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14  7:55             ` skaller
@ 2004-08-14 20:19               ` Jon Harrop
  2004-08-14 20:55                 ` Brian Hurt
  2004-08-14 22:13                 ` skaller
  0 siblings, 2 replies; 27+ messages in thread
From: Jon Harrop @ 2004-08-14 20:19 UTC (permalink / raw)
  To: Ocaml Mailing List

On Saturday 14 August 2004 04:33, Brian Hurt wrote:
> > 3. If so, is the fact that most languages disallow "a<b<c" due to this?
> 
> No.  "a<b<c" is parsed the same way as "a+b+c".

Sorry, I should have been more specific. With left- or right- or 
non-associative, commuting, 'a->'a->'a operators (like + and *) you can get 
away with parsing that way, e.g. "a+b+c" as:

either  (a+b)+c  or  a+(b+c)

But you can't do this with comparison 'a->'a->bool operators because it forces 
you to deviate from conventional mathematical meaning, e.g. you get a type 
error in OCaml on the "3" in "1<2<3" because it parses as "(1<2)<3" which 
evaluates to "true<3" which just doesn't make any sense.

IMHO, being able to do "1<2<false", although valid in OCaml and many other 
languages, is not terribly useful. Indeed, it can lead to overt 
confusingnesses:

# false=false=false;;
- : bool = false
# false=false=false=false;;
- : bool = true

I thought that the ML family were designed to mimic mathematical notation 
where possible but, AFAIK, most implementations don't do "a<b<c" this way.

I had always assumed that this was a limitation of LALR(1) but, according to 
skaller and Eric, I was wrong.

On Saturday 14 August 2004 08:55, skaller wrote:
> ...
> It is possible to make < a chain operator instead,
> ...

I see. You don't just make (x + y) an expression in the grammar but a whole 
new rule "sum" which contains (x + y) or (x + sum) and has the precendence of 
"+"?

So I want to take all comparison operators "'a -> 'a -> bool" and make a rule 
"inequality" for a (x op1 y) or (x op1 comparison) chain "operator" which, 
say, builds a list of operand and operators? Then you could do "x0 <= x < 
x1". Woohoo!

Would this have to be a conflict in the grammar with "a<b<c" parsed as 
"(a<b)<c"?

> > 4. Could that be added to OCaml? ;-)
>
> Not without breaking existing code...

Right, because somebody somewhere is bound to have done the equivalent of 
"2<5<false" in their OCaml code. But does "2<5<false" have defined behaviour?

> > 5. Is it productive to think in terms of coercing lex and yacc into doing
> > as much of the work as possible
>
> I personally think you should do the opposite -- let lex/yacc
> do the least possible work since they're fairly rigid.
> You may need to fiddle with your grammar to get the language
> you want -- and it is better if that has the minimum
> impact on your semantic logic. IMHO.

But making more use of lex and yacc is good because they detect conflicts or 
ambiguities? Not great in the sense that reduce-reduce conflicts can be a 
nightmare to get rid of though (especially if you don't own the grammar), but 
that's the trade-off.

Cheers,
Jon.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14 20:19               ` Jon Harrop
@ 2004-08-14 20:55                 ` Brian Hurt
  2004-08-14 20:57                   ` Marcin 'Qrczak' Kowalczyk
  2004-08-15  1:26                   ` Jon Harrop
  2004-08-14 22:13                 ` skaller
  1 sibling, 2 replies; 27+ messages in thread
From: Brian Hurt @ 2004-08-14 20:55 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sat, 14 Aug 2004, Jon Harrop wrote:

> On Saturday 14 August 2004 04:33, Brian Hurt wrote:
> > > 3. If so, is the fact that most languages disallow "a<b<c" due to this?
> > 
> > No.  "a<b<c" is parsed the same way as "a+b+c".
> 
> Sorry, I should have been more specific. With left- or right- or 
> non-associative, commuting, 'a->'a->'a operators (like + and *) you can get 
> away with parsing that way, e.g. "a+b+c" as:
> 
> either  (a+b)+c  or  a+(b+c)
> 
> But you can't do this with comparison 'a->'a->bool operators because it forces 
> you to deviate from conventional mathematical meaning, e.g. you get a type 
> error in OCaml on the "3" in "1<2<3" because it parses as "(1<2)<3" which 
> evaluates to "true<3" which just doesn't make any sense.
> 

The syntax of a language doesn't enforce a given meaning on the language 
being parsed.  "Colorless green ideas sleep furiously" is a syntactically 
correct English sentence, even if it is utterly meaningless.

The AST  of a<b<c has to be one of two ways:

       <                <
      / \              / \
    a    <      or    <   c
        / \          / \
       b   c        a   b

i.e. a < (b < c) or (a < b) < c.  What the meaning of these two 
expressions are is entirely up to the compiler- more spefically, up to the 
parts which are not lex or yacc based.

Although this does bring up one interesting question- is a<b<c 
syntactically different than (a<b)<c?  Generally, languages want to 
consider "extra" parenthesis to be harmless.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14 20:55                 ` Brian Hurt
@ 2004-08-14 20:57                   ` Marcin 'Qrczak' Kowalczyk
  2004-08-14 22:15                     ` skaller
  2004-08-15  1:26                   ` Jon Harrop
  1 sibling, 1 reply; 27+ messages in thread
From: Marcin 'Qrczak' Kowalczyk @ 2004-08-14 20:57 UTC (permalink / raw)
  To: caml-list

W liście z sob, 14-08-2004, godz. 15:55 -0500, Brian Hurt napisał:

> The AST  of a<b<c has to be one of two ways:
> 
>        <                <
>       / \              / \
>     a    <      or    <   c
>         / \          / \
>        b   c        a   b

No, it can also be an error, which is IMHO a better choice.
Unless it's made equivalent to a < b && b < c, like in Python.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14 20:19               ` Jon Harrop
  2004-08-14 20:55                 ` Brian Hurt
@ 2004-08-14 22:13                 ` skaller
  1 sibling, 0 replies; 27+ messages in thread
From: skaller @ 2004-08-14 22:13 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sun, 2004-08-15 at 06:19, Jon Harrop wrote:
> On Saturday 14 August 2004 04:33, Brian Hurt wrote:

> 
> I see. You don't just make (x + y) an expression in the grammar but a whole 

> new rule "sum" which contains (x + y) or (x + sum) and has the precendence of 
> "+"?

The actual grammar production is like:

sum: difference sum | difference

which looks right associative. But it isn't, because the real
thing is:

sum: difference sum { $1 :: $2 } | difference { [$1] }

which you can see constructs a list -- I can interpret that
as either left or right associative, or as an n-ary operator.
I have cheated the parser by NOT building the parse tree
you'd expect, but a list :)

> So I want to take all comparison operators "'a -> 'a -> bool" and make a rule 
> "inequality" for a (x op1 y) or (x op1 comparison) chain "operator" which, 
> say, builds a list of operand and operators? Then you could do "x0 <= x < 
> x1". Woohoo!
> 
> Would this have to be a conflict in the grammar with "a<b<c" parsed as 
> "(a<b)<c"?

Of course, you have to decide whether

	a < b < c

means

	(a < b) < c

or

	a < n && b < c

You can actually parse it and generate a list either way,
and make the decision later. However you can't easily
defer the decision to type analysis .. because the typing
usually would depend on already knowing the precedence.

you *could* do a trial typing, and if it failed, try
the other precedence -- it would work, but heck it would
surely confuse the programmer as well as the compiler :)

That's why i use

	a < b &< c &< d &< e

in Felix. This notation doesn't conflict with
the usual precedence for <.

> > > 4. Could that be added to OCaml? ;-)
> >
> > Not without breaking existing code...
> 
> Right, because somebody somewhere is bound to have done the equivalent of 
> "2<5<false" in their OCaml code. But does "2<5<false" have defined behaviour?

Recall Ocaml has a polymorphic comparison operator .. it has to
work on basic types like bool too.

Exercise: it actually has a real meaning in terms of boolean
operators (assume false < true). Hint: do the truth table :)

> But making more use of lex and yacc is good because they detect conflicts or 
> ambiguities? 

Except that often they're unwanted, and an artefact of the
grammar you have chosen, rather than the language you want
to parse. As you said: there is a tradeoff.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14 20:57                   ` Marcin 'Qrczak' Kowalczyk
@ 2004-08-14 22:15                     ` skaller
  0 siblings, 0 replies; 27+ messages in thread
From: skaller @ 2004-08-14 22:15 UTC (permalink / raw)
  To: Marcin 'Qrczak' Kowalczyk; +Cc: caml-list

On Sun, 2004-08-15 at 06:57, Marcin 'Qrczak' Kowalczyk wrote:
> W liście z sob, 14-08-2004, godz. 15:55 -0500, Brian Hurt napisał:
> 
> > The AST  of a<b<c has to be one of two ways:
> > 
> >        <                <
> >       / \              / \
> >     a    <      or    <   c
> >         / \          / \
> >        b   c        a   b
> 
> No, it can also be an error, which is IMHO a better choice.
> Unless it's made equivalent to a < b && b < c, like in Python.

Not in C++ where < can be overloaded. In that case,
 
	(a < b) < c

could be quite meaningful :)
 
-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-14 20:55                 ` Brian Hurt
  2004-08-14 20:57                   ` Marcin 'Qrczak' Kowalczyk
@ 2004-08-15  1:26                   ` Jon Harrop
  2004-08-15  8:24                     ` skaller
  2004-08-15 15:39                     ` Brian Hurt
  1 sibling, 2 replies; 27+ messages in thread
From: Jon Harrop @ 2004-08-15  1:26 UTC (permalink / raw)
  To: Ocaml Mailing List

On Saturday 14 August 2004 21:55, Brian Hurt wrote:
> The syntax of a language doesn't enforce a given meaning on the language
> being parsed.  "Colorless green ideas sleep furiously" is a syntactically
> correct English sentence, even if it is utterly meaningless.

I think I see what you mean. ;-)

> The AST  of a<b<c has to be one of two ways:
>
>        <                <
>       / \              / \
>     a    <      or    <   c
>         / \          / \
>        b   c        a   b
>
> i.e. a < (b < c) or (a < b) < c.

I believe you are unnecessarily constraining the AST to be a binary tree. What 
is wrong with an n-ary tree:

type ast = ... | Less of ast list | ...

Less [a; b; c]

I think Skaller referred to the implementation of a parser for this as a 
"chain operator", which I understand to be a way non-associative operators 
may be parsed to create a node in the AST with a list of operands, rather 
than the usual pair of operands.

So the AST becomes an n-ary tree, in order to represent n-ary operators. I'd 
have assumed that pattern matches were implemented in this way but a previous 
post by Pierre suggested otherwise, because patterns have "no notion of 
evaluation semantics" so a pattern match is not really a (2n+1)-ary operator.

> What the meaning of these two 
> expressions are is entirely up to the compiler- more spefically, up to the
> parts which are not lex or yacc based.

I'm not sure again. :-)

If you coerce the parsing into using a single, binary AST then I believe you 
lose the ability to distinguish between Less[a; b; c] and Less[Less[a; b]; c] 
etc. You have to create different binary AST node types, e.g. 
InequalityLess(InequalityLess(a, b), c) to mean Less[a; b; c].

> Although this does bring up one interesting question- is a<b<c
> syntactically different than (a<b)<c?

In conventional mathematical notation, I would say yes. Although the latter is 
only meaningful if you've defined a comparison over booleans (e.g. OCaml's 
true>false), which you probably wouldn't...

> Generally, languages want to consider "extra" parenthesis to be harmless.

Yes, so, as you say, this boils down to "should a<b<c and (a<b)<c be 
semantically different?".

Cheers,
Jon.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-15  1:26                   ` Jon Harrop
@ 2004-08-15  8:24                     ` skaller
  2004-08-15 15:39                     ` Brian Hurt
  1 sibling, 0 replies; 27+ messages in thread
From: skaller @ 2004-08-15  8:24 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sun, 2004-08-15 at 11:26, Jon Harrop wrote:

> I believe you are unnecessarily constraining the AST to be a binary tree. What 
> is wrong with an n-ary tree:
> 
> type ast = ... | Less of ast list | ...
> 
> Less [a; b; c]

Indeed: for operator * in the type sublanguage of Ocaml
that is mandatory.

> I think Skaller referred to the implementation of a parser for this as a 
> "chain operator", which I understand to be a way non-associative operators 
> may be parsed to create a node in the AST with a list of operands, rather 
> than the usual pair of operands.

Yes, but the phrase 'non-associative' has a different meaning
than what I called 'chain operator'. Non-associative means a binary
operator that *requires brackets* for example in TeX

	x ^ y ^ z

is an *error* because ^ is neither left nor right associative
but non-associative. On the other hand in 'mathematics' + is
actually associative -- it isn't left or right associative,
its associative: 

	a + b + c

means the same as BOTH

	(a + b) + c

and

	a + (b + c)

This means the compiler could choose three distint parses:

	(a) left assoc
	(b) right assoc
	(c) chain operator

and the user would not be allowed to depend on the
implementation chosen. Few languages do that,
but it would be useful because it allows superior
optimisation -- for example constant folding both

	1 + 2 + three
	one + 2 + 3

is impossible with left or right assoc because you can't
tell from the parse tree if the client used brackets to
override the usual precedence -- you can ignore that
if you happen to know the semantics of course: 
for ints, you might. But it would be risky for
floats.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-15  1:26                   ` Jon Harrop
  2004-08-15  8:24                     ` skaller
@ 2004-08-15 15:39                     ` Brian Hurt
  2004-08-15 16:54                       ` Jon Harrop
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Hurt @ 2004-08-15 15:39 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml Mailing List

On Sun, 15 Aug 2004, Jon Harrop wrote:

> I believe you are unnecessarily constraining the AST to be a binary
> tree. 

No, I was trying to make a graphical point.

> I think Skaller referred to the implementation of a parser for this as a 
> "chain operator", which I understand to be a way non-associative operators 
> may be parsed to create a node in the AST with a list of operands, rather 
> than the usual pair of operands.

Yep.  But notice where the trick was- it wasn't in the yacc grammar, it 
was in the Ocaml code.  He parsed the sequence as:
     <
    / \
   a   <
      / \
     b   c

Forget I said AST.  How are the yacc rules applied?

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Caml-list] CFG's and OCaml
  2004-08-15 15:39                     ` Brian Hurt
@ 2004-08-15 16:54                       ` Jon Harrop
  0 siblings, 0 replies; 27+ messages in thread
From: Jon Harrop @ 2004-08-15 16:54 UTC (permalink / raw)
  To: Ocaml Mailing List

On Sunday 15 August 2004 16:39, Brian Hurt wrote:
> Forget I said AST.  How are the yacc rules applied?
> ...

Right, yes, I see what you mean. So we're talking about using LR(1), which is 
necessarily a binary tree (albeit of function calls and not AST), to build up 
non-binary-tree data structures.

This kind of goes back to what I said originally about doing as much work as 
possible within LALR(1) and using post-processing to do the rest. Although, 
in this case, I think yacc will enforce correctness, e.g. even "a<b>c<d>e" is 
correct etc., so the post-processing doesn't need to do any checking here.

As Skaller pointed out off list, the ability to write "a<=b<c" has advantages 
in terms of short-circuit evaluation as well...

Cheers,
Jon.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Caml-list] CFG's and OCaml
@ 2004-08-12 19:15 David McClain
  0 siblings, 0 replies; 27+ messages in thread
From: David McClain @ 2004-08-12 19:15 UTC (permalink / raw)
  To: caml-list

Heh! I just dug out Andrew Appel's book from the closet and began 
reading it anew.

Lo and behold, he shows how grammars that invite reduce/reduce 
conflicts are best handled by opening up the universe of sytactically 
recognizable sentences -- well beyond anything meaningful for the 
language at hand -- and then using a later semantic analysis stage to 
detect and report the error.

So this explains why the parser.ml for OCaml has all those apparently 
illegal accepting clauses in it. The parser recognizes all kinds of 
wild things that would be patently invalid OCaml code.

I began life in this field more than 30 years ago using a simple 
language called Forth to control a large 100 inch telescope in 
Wyoming... I learned many "bad habits" in programming -- most notably 
to write code that works when fed correct input for its intended 
purpose -- and to hell with what happens in any other case. I watched 
and learned from the practicioners of this language what I came to call 
"Kamakazi Programming Style".

Now after spending the last 30 years trying hard to overcome these 
initial misguidings, I find once again, that this is the appropriate 
solution -- at least in the parsing stage.

Very interesting...

David McClain
Senior Corporate Scientist
Avisere, Inc.

david.mcclain@avisere.com
+1.520.390.7738 (USA)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2004-08-15 16:57 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-13 14:04 [Caml-list] CFG's and OCaml David McClain
2004-08-13 15:05 ` Damien Doligez
2004-08-13 15:26   ` David McClain
2004-08-13 16:12     ` Damien Doligez
2004-08-13 15:28   ` David McClain
2004-08-13 15:49 ` Brian Hurt
2004-08-13 16:04   ` David McClain
2004-08-13 16:29     ` Brian Hurt
2004-08-13 16:42       ` Xavier Leroy
2004-08-13 17:18         ` Ken Rose
2004-08-13 18:55         ` Brian Hurt
2004-08-14  0:25           ` Jon Harrop
2004-08-14  0:57             ` Erik de Castro Lopo
2004-08-14  8:52               ` Alan Schmitt
2004-08-14  3:33             ` Brian Hurt
2004-08-14  7:55             ` skaller
2004-08-14 20:19               ` Jon Harrop
2004-08-14 20:55                 ` Brian Hurt
2004-08-14 20:57                   ` Marcin 'Qrczak' Kowalczyk
2004-08-14 22:15                     ` skaller
2004-08-15  1:26                   ` Jon Harrop
2004-08-15  8:24                     ` skaller
2004-08-15 15:39                     ` Brian Hurt
2004-08-15 16:54                       ` Jon Harrop
2004-08-14 22:13                 ` skaller
2004-08-13 16:58     ` Paul Snively
  -- strict thread matches above, loose matches on Subject: below --
2004-08-12 19:15 David McClain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).