caml-list - the Caml user's mailing list
* yacc style
@ 2005-01-27 20:17 Chris King
  2005-01-27 21:39 ` [Caml-list] " Erik de Castro Lopo
       [not found] ` <200501272252.43720.jon@jdh30.plus.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Chris King @ 2005-01-27 20:17 UTC
  To: O'Caml Mailing List

I'm defining a grammar which, when parsed, returns an imperative
structure (i.e. one which can't easily be created in a functional
style).  ocamlyacc is making this difficult for me, since there's no
clean way to pass to the main parser function a structure on which to
operate.  Right now I'm getting around this by defining global
references in the header of the parser and setting them before each
call to the parser main function.  Is it considered better style to
instead have the parser return a parse tree, and then use that to
generate the imperative structure, or is there a more direct way to do
what I want to do?
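
A minimal sketch of the workaround described above, for reference. All
names here (`target`, `parse_main`, `parse_into`) are hypothetical, and
`parse_main` is only a stand-in for an ocamlyacc-generated entry point; a
real one would take a lexer function and a lexbuf. It assumes OCaml 4.08
or later for `Fun.protect`.

```ocaml
(* Stand-in for a reference declared in the header of the grammar file. *)
let target : (string, int) Hashtbl.t option ref = ref None

(* Stand-in for the generated parser entry point (hypothetical). Its
   "semantic actions" write into whatever table !target points at. *)
let parse_main (input : (string * int) list) : unit =
  match !target with
  | None -> failwith "parser called without a target"
  | Some tbl -> List.iter (fun (k, v) -> Hashtbl.replace tbl k v) input

(* Set the global for the duration of one parse, and always reset it,
   even if the parser raises. *)
let parse_into tbl input =
  target := Some tbl;
  Fun.protect ~finally:(fun () -> target := None) (fun () -> parse_main input)

let () =
  let tbl = Hashtbl.create 16 in
  parse_into tbl [ ("x", 1); ("y", 2) ];
  assert (Hashtbl.find tbl "y" = 2)
```

Wrapping the set-and-call step in one function at least keeps the global
from leaking between calls, but the parser is still not reentrant.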

Thanks!



* Re: [Caml-list] yacc style
  2005-01-27 20:17 yacc style Chris King
@ 2005-01-27 21:39 ` Erik de Castro Lopo
  2005-01-28  1:14   ` skaller
       [not found] ` <200501272252.43720.jon@jdh30.plus.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Erik de Castro Lopo @ 2005-01-27 21:39 UTC
  To: caml-list

On Thu, 27 Jan 2005 15:17:13 -0500
Chris King <colanderman@gmail.com> wrote:

> Is it considered better style to
> instead have the parser return a parse tree, and then use that to
> generate the imperative structure, 

Yes, normally the parser generates a parse tree which is then
passed to the semantic analyser for semantic checking.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"... a discussion of C++'s strengths and flaws always sounds like an
argument about whether one should face north or east when one is
sacrificing one's goat to the rain god." -- Thant Tessman



* Re: [Caml-list] yacc style
       [not found] ` <200501272252.43720.jon@jdh30.plus.com>
@ 2005-01-28  0:41   ` Chris King
  0 siblings, 0 replies; 7+ messages in thread
From: Chris King @ 2005-01-28  0:41 UTC
  To: Jon Harrop; +Cc: O'Caml Mailing List

On Fri, 28 Jan 2005 08:39:56 +1100, Erik de Castro Lopo
<ocaml-erikd@mega-nerd.com> wrote:
> Yes, normally the parser generates a parse tree which is then
> passed to the semantic analyser for semantic checking.

Okay, sounds good -- thanks!

On Thu, 27 Jan 2005 22:52:43 +0000, Jon Harrop <jon@jdh30.plus.com> wrote:
> May I ask what makes it difficult to use a functional style in your case?
> Perhaps it isn't as difficult as you think...

It's for performance reasons, mostly: the structure consists of a hash
table (to store user symbol definitions) and an array (for the actual
data).  It's easy enough to get the data in the form of a parse tree
and massage it into these structures afterwards; I was just wondering
if there was a more direct way.
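
A rough sketch of that two-step approach, under assumptions: the `item`
type and the `build` function are hypothetical, since the actual grammar
is not shown in this thread. The parser would return a plain list of
items, and one pass afterwards fills the hash table and the array.

```ocaml
(* Hypothetical parse tree returned by the parser. *)
type item =
  | Define of string * int  (* user symbol definition: name and value *)
  | Datum of int            (* literal data word *)
  | Ref of string           (* data word taken from an earlier definition *)

(* Walk the tree once, filling a symbol table and a data array.
   Raises Not_found if a Ref names a symbol that is not yet defined. *)
let build (items : item list) : (string, int) Hashtbl.t * int array =
  let symbols = Hashtbl.create 16 in
  let data = ref [] in
  List.iter
    (function
      | Define (name, v) -> Hashtbl.replace symbols name v
      | Datum v -> data := v :: !data
      | Ref name -> data := Hashtbl.find symbols name :: !data)
    items;
  (symbols, Array.of_list (List.rev !data))

let () =
  let symbols, data = build [ Define ("x", 7); Datum 1; Ref "x"; Datum 2 ] in
  assert (Hashtbl.find symbols "x" = 7);
  assert (data = [| 1; 7; 2 |])
```

The extra pass is linear in the size of the tree, so the cost over
building the structures directly is probably small.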



* Re: [Caml-list] yacc style
  2005-01-27 21:39 ` [Caml-list] " Erik de Castro Lopo
@ 2005-01-28  1:14   ` skaller
  2005-01-28  2:28     ` Erik de Castro Lopo
  0 siblings, 1 reply; 7+ messages in thread
From: skaller @ 2005-01-28  1:14 UTC
  To: Erik de Castro Lopo; +Cc: caml-list

On Fri, 2005-01-28 at 08:39, Erik de Castro Lopo wrote:
> On Thu, 27 Jan 2005 15:17:13 -0500
> Chris King <colanderman@gmail.com> wrote:
> 
> > Is it considered better style to
> > instead have the parser return a parse tree, and then use that to
> > generate the imperative structure, 
> 
> Yes, normally the parser generates a parse tree which is then
> passed to the semantic analyser for semantic checking.

Unfortunately this is useless in the common case
of needing to parse C. It would surely be nice
to be able to pass an argument to ocamlyacc,
as can now be done for ocamllex.

Strangely, in this case the ideal place to add
on the typedef table would be the lexbuf,
which *is* passed to the parser .. but the client
can't get at it, and it isn't extensible.

The reason this isn't done right seems to be that
yacc/lex are used to bootstrap OCaml and so fixing
them would destabilise the bootstrap.

FYI: in Felix, the parser and lexer are built-in
to the language, and lexical scoping is supported:
whilst you cannot pass in an extra argument to the parser,
you *can* write action code that depends on its
environment, which need not be the top level.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





* Re: [Caml-list] yacc style
  2005-01-28  1:14   ` skaller
@ 2005-01-28  2:28     ` Erik de Castro Lopo
  2005-01-28  4:30       ` skaller
  2005-01-28  9:04       ` Jean-Christophe Filliatre
  0 siblings, 2 replies; 7+ messages in thread
From: Erik de Castro Lopo @ 2005-01-28  2:28 UTC
  To: caml-list

On 28 Jan 2005 12:14:39 +1100
skaller <skaller@users.sourceforge.net> wrote:

> On Fri, 2005-01-28 at 08:39, Erik de Castro Lopo wrote:
> >
> > Yes, normally the parser generates a parse tree which is then
> > passed to the semantic analyser for semantic checking.
> 
> Unfortunately this is useless in the common case
> of needing to parse C.

I'm happy to take your word for it John, but I'd like to know
why.

> It would surely be nice
> to be able to pass an argument to ocamlyacc,
> as can now be done for ocamllex.

Could you give an example?

> Strangely in this case the ideal place to add
> on the typedef table would be the lexbuf,

Ok, so this lets you know at lex time whether an identifier is
a user type, avoiding ugly parser hacks to work around the fact
that identifier X is actually a user-defined type.
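
One way to picture this, as a sketch only: an ocamlyacc entry point takes
the lexer as an ordinary function of type `Lexing.lexbuf -> token`, so a
closure that captures the typedef table can retag identifiers before the
parser sees them. The `token` type and the names `with_typedefs` and
`replay` are hypothetical; `replay` just stands in for an
ocamllex-generated lexer.

```ocaml
(* Hypothetical token type; a real one is generated from the grammar. *)
type token = IDENT of string | TYPE_NAME of string | STAR | EOF

(* Wrap a lexer function so identifiers found in [typedefs] are retagged.
   The result has the same type as the lexer it wraps. *)
let with_typedefs (typedefs : (string, unit) Hashtbl.t)
    (raw : Lexing.lexbuf -> token) : Lexing.lexbuf -> token =
 fun lexbuf ->
  match raw lexbuf with
  | IDENT s when Hashtbl.mem typedefs s -> TYPE_NAME s
  | tok -> tok

(* Stand-in for a generated lexer: replays a fixed list of tokens and
   ignores the lexbuf. *)
let replay (toks : token list) : Lexing.lexbuf -> token =
  let rest = ref toks in
  fun _lexbuf ->
    match !rest with
    | [] -> EOF
    | t :: tl ->
        rest := tl;
        t

let () =
  let typedefs = Hashtbl.create 8 in
  Hashtbl.replace typedefs "t" ();
  let lexer = with_typedefs typedefs (replay [ IDENT "t"; STAR; IDENT "b" ]) in
  let lexbuf = Lexing.from_string "" in
  assert (lexer lexbuf = TYPE_NAME "t");
  assert (lexer lexbuf = STAR);
  assert (lexer lexbuf = IDENT "b")
```

The table still has to be updated from somewhere while parsing, which is
where the trouble with parser arguments comes back.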

Am I on the right track here?

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"life is too long to know C++ well" -- Erik Naggum



* Re: [Caml-list] yacc style
  2005-01-28  2:28     ` Erik de Castro Lopo
@ 2005-01-28  4:30       ` skaller
  2005-01-28  9:04       ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 7+ messages in thread
From: skaller @ 2005-01-28  4:30 UTC
  To: Erik de Castro Lopo; +Cc: caml-list

On Fri, 2005-01-28 at 13:28, Erik de Castro Lopo wrote:
> On 28 Jan 2005 12:14:39 +1100
> skaller <skaller@users.sourceforge.net> wrote:
> 
> > On Fri, 2005-01-28 at 08:39, Erik de Castro Lopo wrote:
> > >
> > > Yes, normally the parser generates a parse tree which is then
> > > passed to the semantic analyser for semantic checking.
> > 
> > Unfortunately this is useless in the common case
> > of needing to parse C.
> 
> I'm happy to take your word for it John, but I'd like to know
> why.

Sure: there are several contexts, but one is that:

	(X)(Y)(Z)

could mean either

	((X)(Y))(Z)

or

	(X)((Y)(Z))

depending on whether Y is a type, in which case (Y)(Z) is a cast,
or an expression, in which case (Y)(Z) is a function application.
The precedences of casts and function applications in C are different.

Function calls bind more tightly than casts, so (int)(f)(x) means
cast the result of f(x) to an int, whereas (g)(a)(b) 
means apply g(a) to b .. assuming g,a,b are not typenames .. :)

Similarly, in

	(f)(x,y,z)

the comma is an argument separator if f is a function; if f is a
typename it's a cast, and (x,y,z) is a comma expression with value 'z'... :)

I guess there is more, and C++ is worse, but this is enough
to be as confused as a one token lookahead context free
parser would be .. :)
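
The (int)(f)(x) versus (g)(a)(b) readings can be sketched in a few lines
of OCaml. This is only an illustration under assumptions: the `expr` type
and the `group` function are hypothetical, the chain is given as a list of
already-parenthesised names, and the argument names are assumed not to be
typenames.

```ocaml
(* Hypothetical mini-AST for chains such as (X)(Y)(Z). *)
type expr =
  | Name of string
  | Call of expr * expr    (* function applied to one argument *)
  | Cast of string * expr  (* (type) operand *)

(* Group a non-empty chain of parenthesised names: a leading typename
   starts a cast whose operand is the rest of the chain; otherwise the
   first name is a function applied to each following name in turn. *)
let rec group (is_type : string -> bool) (names : string list) : expr =
  match names with
  | [] -> invalid_arg "group: empty chain"
  | [ x ] -> Name x
  | t :: rest when is_type t -> Cast (t, group is_type rest)
  | f :: args -> List.fold_left (fun acc a -> Call (acc, Name a)) (Name f) args

let () =
  let is_type s = s = "int" in
  (* (int)(f)(x): cast the result of f(x) to int *)
  assert (group is_type [ "int"; "f"; "x" ]
          = Cast ("int", Call (Name "f", Name "x")));
  (* (g)(a)(b): apply g(a) to b *)
  assert (group is_type [ "g"; "a"; "b" ]
          = Call (Call (Name "g", Name "a"), Name "b"))
```

The same token sequence gives two differently shaped trees, and the only
input that decides between them is the `is_type` lookup.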

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





* Re: [Caml-list] yacc style
  2005-01-28  2:28     ` Erik de Castro Lopo
  2005-01-28  4:30       ` skaller
@ 2005-01-28  9:04       ` Jean-Christophe Filliatre
  1 sibling, 0 replies; 7+ messages in thread
From: Jean-Christophe Filliatre @ 2005-01-28  9:04 UTC
  To: Erik de Castro Lopo; +Cc: caml-list


Erik de Castro Lopo writes:
 > skaller <skaller@users.sourceforge.net> wrote:
 > > On Fri, 2005-01-28 at 08:39, Erik de Castro Lopo wrote:
 > > >
 > > > Yes, normally the parser generates a parse tree which is then
 > > > passed to the semantic analyser for semantic checking.
 > > 
 > > Unfortunately this is useless in the common case
 > > of needing to parse C.
 > 
 > Could you give an example?

When parsing C, the lexer must produce different tokens for variable
identifiers and type identifiers; otherwise you may misinterpret
things like "a * b" (is it the declaration of a pointer b or a
multiplication?) or casts. The following piece of code illustrates
the difficulty:

======================================================================
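int a, b;                    /* a and b are variables */
typedef int t, u;            /* t and u are type names */
void f1() { a * b; }         /* multiplication */
void f2() { t * u; }         /* declares a local pointer u, hiding type u */
void f3() { t * b; }         /* declares a local pointer b, hiding variable b */
void f4() { int t; t * b; }  /* t is now a variable, so a multiplication */
void f5(t u, unsigned t) {   /* u has type t; t is an unsigned parameter */
  switch ( t ) {
  case 0: if ( u )
    default: return;
  }
}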
======================================================================

The solution is to have the parser modify the lexer's state while
parsing. This is quite ugly in practice. The CIL framework includes a
full C parser written in OCaml, where you can find one possible way
of handling this issue; see http://manju.cs.berkeley.edu/cil/
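
To give an idea of the state involved, here is a minimal sketch of a
scoped typedef table. It is not taken from CIL; the names `env`,
`enter_scope`, `leave_scope`, `declare` and `is_type_name` are
hypothetical. The parser actions would call `declare` and the scope
functions, and the lexer would call `is_type_name` on every identifier.

```ocaml
(* Each scope maps a name to true if it is a typedef name and to false if
   it is an ordinary identifier. The head of the list is the innermost
   scope. *)
type env = (string, bool) Hashtbl.t list ref

let create () : env = ref [ Hashtbl.create 16 ]

let enter_scope (env : env) = env := Hashtbl.create 16 :: !env

let leave_scope (env : env) =
  match !env with
  | _ :: (_ :: _ as outer) -> env := outer
  | _ -> invalid_arg "leave_scope: cannot leave the global scope"

(* Called from parser actions once a declaration has been recognised. *)
let declare (env : env) ~is_typedef name =
  match !env with
  | scope :: _ -> Hashtbl.replace scope name is_typedef
  | [] -> assert false

(* Called by the lexer for every identifier: the innermost declaration
   wins, so a local variable can hide an outer typedef. *)
let is_type_name (env : env) name =
  let rec lookup = function
    | [] -> false
    | scope :: outer ->
        (match Hashtbl.find_opt scope name with
         | Some b -> b
         | None -> lookup outer)
  in
  lookup !env

let () =
  let env = create () in
  declare env ~is_typedef:true "t";     (* typedef int t, u; *)
  assert (is_type_name env "t");
  enter_scope env;                      (* entering the body of f4 *)
  declare env ~is_typedef:false "t";    (* int t; *)
  assert (not (is_type_name env "t"));  (* so t * b is a multiplication *)
  leave_scope env;
  assert (is_type_name env "t")
```

The ugly part is timing: the declaration must be registered before the
lexer is asked for the next identifier, which constrains where the
actions can sit in the grammar.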

Hope this helps,
-- 
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)



