caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* menhir
@ 2007-04-28 10:32 skaller
  2007-04-28 16:50 ` [Caml-list] menhir Francois Pottier
  0 siblings, 1 reply; 20+ messages in thread
From: skaller @ 2007-04-28 10:32 UTC (permalink / raw)
  To: caml-list

Just a note I just built the Felix parser with Menhir.
First, it detected some duplicate definitions Ocamlyacc didn't! Good!

Second, I got a "rather a lot" of states have end-of-stream conflicts.
What's that about?

Third the generated ml file was 4.5 Meg. 
Ocamlopt on amd64 hung for so long I almost posted a bug report
for Ocamlopt, but finally it finished. This was a 95% CPU, 25% memory
job, so no paging. I'd guess it took 100x times longer than
compilation of the ocamlyacc file (which is just a bunch of numbers :)
I didn't measure it .. no biggie for me now I know, but my
box is a LOT faster than some of the boxes my product gets built on.

After that, Felix built ok, and the parser worked for
'pure' code. However it failed when Felix preprocessor
syntax extensions were used (which is 90% of all programs).

Now, most of the system data transport for this is properly
built so it can't cause any problems. The one thing which 
is hacked is the pushback detection.

Basically: when Ocamlyacc reduces a production, it sometimes
ends on the last token, and sometimes it overshoots by 1.

My grammar uses a system like:

exprx:
  expr expr_terminator { $1,$2 }

statementsx:
  | statement_aster statements_terminator { $1, $2 }

This is saying: a 'special expression' exprx is an expression
PLUS one of the tokens which will solidly terminate an 
expression AND NOT ITSELF BE OVERSHOT by the parser.

In other words when exprx is parsed the reduction must
leave the next token unread.

The semantics used are: when the exprx is processed the
action arranges to push the terminator token back
into the token stream.

Perhaps because Menhir is LR(1) not LALR(1), this technique
is failing. Or Menhir may simply be looking ahead further
than required in the token stream.

Whichever way, I am depending on this particular implementation
detail of Ocamlyacc, and Mehir is using a different implementation.

Any suggestions how to 'fix' this?

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2007-05-03  8:44 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-28 10:32 menhir skaller
2007-04-28 16:50 ` [Caml-list] menhir Francois Pottier
2007-04-28 19:47   ` Markus Mottl
2007-04-28 21:15     ` Jon Harrop
2007-04-29  4:43   ` skaller
2007-04-29  7:27     ` Christophe Raffalli
2007-05-01 15:57     ` Francois Pottier
2007-05-01 17:11       ` skaller
2007-05-01 17:34         ` Francois Pottier
2007-05-01 23:42           ` skaller
2007-05-02  5:38             ` Francois Pottier
2007-05-02  5:50               ` Francois Pottier
2007-05-02  8:41               ` skaller
2007-05-02 12:30                 ` Francois Pottier
2007-05-02 16:29                   ` skaller
2007-05-02 18:35                     ` Francois Pottier
2007-05-03  1:30                       ` skaller
2007-05-03  8:43                         ` Joel Reymont
2007-05-01 17:15       ` skaller
2007-05-01 17:31         ` Francois Pottier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).