caml-list - the Caml user's mailing list
* Error messages with dypgen
@ 2007-05-18  7:20 Joel Reymont
  2007-05-18  8:42 ` [Caml-list] " skaller
  0 siblings, 1 reply; 4+ messages in thread
From: Joel Reymont @ 2007-05-18  7:20 UTC (permalink / raw)
  To: OCaml List

I understand that dypgen throws an exception when a syntax error is  
found.

How do I get it to produce line/column numbers when that happens?

It would be rather groovy if the exception carried  
symbol_start_pos, symbol_end_pos, rhs_start_pos, rhs_end_pos from the  
dyp record since this information is available at the time that the  
exception is raised. A placeholder for a textual message would also  
be very helpful.

A simple error function built into dypgen could then take a message  
and raise the syntax error exception with all the required info.


	Thanks, Joel

--
http://wagerlabs.com/







* Re: [Caml-list] Error messages with dypgen
  2007-05-18  7:20 Error messages with dypgen Joel Reymont
@ 2007-05-18  8:42 ` skaller
  2007-05-18 11:36   ` Joel Reymont
  0 siblings, 1 reply; 4+ messages in thread
From: skaller @ 2007-05-18  8:42 UTC (permalink / raw)
  To: Joel Reymont; +Cc: OCaml List

On Fri, 2007-05-18 at 08:20 +0100, Joel Reymont wrote:
> I understand that dypgen throws an exception when a syntax error is  
> found.
> 
> How do I get it to produce line/column numbers when that happens?
> 
> It would be rather groovy if the exception carried  
> symbol_start_pos, symbol_end_pos, rhs_start_pos, rhs_end_pos from the  
> dyp record since this information is available at the time that the  
> exception is raised. 

No, it isn't. Dypgen uses lexbufs for compatibility with the 
broken Ocamlyacc interface. Dypgen lets you use ulex or
another lexer as well. The type of the error thrown by the
automaton should not be polluted by positional information
that has no reasonable standard specification.

If you want this information, you can look it up yourself
in the lexbuf. The parser has no business at all examining
the lexbuf: the lexbuf belongs to the lexer.
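
A minimal sketch of looking the position up in the lexbuf, assuming a
standard Lexing.lexbuf whose lexer calls Lexing.new_line at each
newline (otherwise the line number stays at 1). The names
position_of_lexbuf, Located_error and with_location are hypothetical
and not part of dypgen; the parse argument stands for whatever parser
entry point is in use.

```ocaml
(* Hypothetical helper: 1-based (line, column) where the current lexeme starts. *)
let position_of_lexbuf (lexbuf : Lexing.lexbuf) : int * int =
  let p = Lexing.lexeme_start_p lexbuf in
  (p.Lexing.pos_lnum, p.Lexing.pos_cnum - p.Lexing.pos_bol + 1)

(* Hypothetical exception: message, line, column. *)
exception Located_error of string * int * int

(* Run [parse] on [lexbuf]; if it raises, re-raise with the position
   taken from the lexbuf at the moment of failure. *)
let with_location parse lexbuf =
  try parse lexbuf with
  | Located_error _ as e -> raise e
  | exn ->
    let line, col = position_of_lexbuf lexbuf in
    raise (Located_error (Printexc.to_string exn, line, col))
```

The caller, not the parser, decides how positions are read, so the
same wrapper could be rewritten for a lexer that does not use lexbufs.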

> A simple error function built into dypgen could then take a message  
> and raise the syntax error exception with all the required info.

An error function is a good idea, except that the Ocamlyacc-style
interface is broken: there's no way to pass the function in, so it
would have to be global.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net



* Re: [Caml-list] Error messages with dypgen
  2007-05-18  8:42 ` [Caml-list] " skaller
@ 2007-05-18 11:36   ` Joel Reymont
  2007-05-18 13:48     ` skaller
  0 siblings, 1 reply; 4+ messages in thread
From: Joel Reymont @ 2007-05-18 11:36 UTC (permalink / raw)
  To: skaller; +Cc: OCaml List, Emmanuel Onzon

John,

There should at least be a textual message embedded in the exception.

This is what I have right now:

input_declarations:
   | INPUT COLON input_decs { `InputDecls (List.rev $3) }
   | INPUT COLON input_decs error {
       parser_error "Missing semicolon" $startpos($4) $endpos($4)
     }
   | INPUT COLON error {
       parser_error "Error after INPUT:" $startpos($3) $endpos($3)
     }
   | INPUT error {
       parser_error "Missing ':' after INPUT" $startpos($2) $endpos($2)
     }

I clearly know why the error is happening here. Positions  
notwithstanding, how do I rewrite this for dypgen?

	Thanks, Joel

On May 18, 2007, at 9:42 AM, skaller wrote:

> No, it isn't. Dypgen uses lexbufs for compatibility with the
> broken Ocamlyacc interface. Dypgen lets you use ulex or
> another lexer as well. The type of the error thrown by the
> automaton should not be polluted by positional information
> that has no reasonable standard specification.

--
http://wagerlabs.com/







* Re: [Caml-list] Error messages with dypgen
  2007-05-18 11:36   ` Joel Reymont
@ 2007-05-18 13:48     ` skaller
  0 siblings, 0 replies; 4+ messages in thread
From: skaller @ 2007-05-18 13:48 UTC (permalink / raw)
  To: Joel Reymont; +Cc: OCaml List, Emmanuel Onzon

On Fri, 2007-05-18 at 12:36 +0100, Joel Reymont wrote:
> John,
> 
> There should at least be a textual message embedded in the exception.
> 
> This is what I have right now:
> 
> input_declarations:
>    | INPUT COLON input_decs { `InputDecls (List.rev $3) }
>    | INPUT COLON input_decs error {
>        parser_error "Missing semicolon" $startpos($4) $endpos($4)
>      }
>    | INPUT COLON error {
>        parser_error "Error after INPUT:" $startpos($3) $endpos($3)
>      }
>    | INPUT error {
>        parser_error "Missing ':' after INPUT" $startpos($2) $endpos($2)
>      }
> 
> I clearly know why the error is happening here. Positions  
> notwithstanding, how do I rewrite this for dypgen?

There are two issues here. 

First, the above code will
"work" in dypgen already, assuming you supply a 
parser_error function. This code would not work for me,
because my lexbuf is a dummy: the lexer has already run
and made a list of tokens, and the lexer function is bound
to the list. It ignores, totally, the lexbuf. In my system
the positional information is stored in every token.

So my extant parser is an example that 'proves' that
dypgen *must not* standardise the format of source
reference information and certainly must not raise
a syntax error exception encoding that information.

One solution to this may be to use an abstract type
and a functor to make the 'source' information 
parametric.
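
A rough sketch of that idea, under the assumption that the user
supplies the source-location type. The names SOURCE, Make_error,
Syntax_error and Line_col are invented for illustration; nothing here
is dypgen's actual interface.

```ocaml
(* The user describes their own notion of source location. *)
module type SOURCE = sig
  type t
  val to_string : t -> string
end

(* The error machinery is generated for whatever location type is given. *)
module Make_error (S : SOURCE) = struct
  exception Syntax_error of string * S.t

  (* Raise a syntax error carrying a message and a user-defined location. *)
  let error msg loc = raise (Syntax_error (msg, loc))

  (* Render an error raised by [error]; other exceptions yield None. *)
  let describe = function
    | Syntax_error (msg, loc) -> Some (S.to_string loc ^ ": " ^ msg)
    | _ -> None
end

(* One possible instance: plain line/column pairs. *)
module Line_col = struct
  type t = { line : int; col : int }
  let to_string { line; col } = Printf.sprintf "line %d, column %d" line col
end

module E = Make_error (Line_col)
```

A parser whose tokens carry their own positions would instantiate
Make_error with that token position type instead of Line_col.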

Second: this style of error handling CANNOT work with a GLR
parser, because GLR parsers can simultaneously try multiple
alternatives. The only time you can be sure you have an error
is at a 'cut point', that is, a point where all threads 
join, and none of them proceed.

A conclusion: dypgen may need to be modified so that there
is a way to 'return' an error. At present you can 
raise Giveup to indicate a parse thread failed,
however your technique above is to *successfully* 
parse an error.
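
A toy model of the point about parse threads, not a description of
dypgen's internals. The Giveup exception below is a local stand-in
declared for the example, and run_alternatives, as_int and as_len are
hypothetical names. An error is reported only when every alternative
has given up.

```ocaml
(* Local stand-in for a "this parse thread failed" signal. *)
exception Giveup

(* Try every alternative on the same input. A single failure is not an
   error; only the case where no alternative survives is. *)
let run_alternatives (alts : (string -> 'a) list) (input : string)
  : ('a list, string) result =
  let survivors =
    List.filter_map
      (fun alt -> try Some (alt input) with Giveup -> None)
      alts
  in
  match survivors with
  | [] -> Error "syntax error: every alternative gave up"
  | results -> Ok results

(* Two illustrative alternatives. *)
let as_int s =
  match int_of_string_opt s with Some n -> n | None -> raise Giveup

let as_len s =
  if String.length s > 0 then String.length s else raise Giveup
```

With these definitions, the input "ab" makes as_int give up while
as_len survives, so no error is reported; only the empty string fails
both.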

A more advanced conclusion: the Ocamlyacc parser interface
is seriously broken, and should be supported only
for compatibility.

There are two proper parser interfaces, IMHO:
one for input iterators (mutable streams) and
one for forward iterators (functional streams).

The mutable interface looks like:

	lexer: state -> info
	get_loc: info -> srcloc
	get_token: info -> token

The functional interface looks like:

	lexer: state -> state * info

instead. With this interface, backtracking to
an old 'state' value is possible.
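
One way the functional interface might be written down as an OCaml
signature, with a small instance over a list of words. FORWARD_LEXER
and List_lexer are illustrative names, and using a token index as the
srcloc is an assumption made only to keep the example short.

```ocaml
(* Functional (forward-iterator) lexer: stepping returns a new state
   and leaves the old one usable. *)
module type FORWARD_LEXER = sig
  type state
  type info
  type srcloc
  type token
  val lexer : state -> state * info
  val get_loc : info -> srcloc
  val get_token : info -> token
end

(* Illustrative instance: the "source" is a list of words. *)
module List_lexer = struct
  type token = Word of string | Eof
  type srcloc = int                       (* token index, for illustration *)
  type info = token * srcloc
  type state = { rest : string list; index : int }

  let lexer st =
    match st.rest with
    | [] -> (st, (Eof, st.index))
    | w :: rest -> ({ rest; index = st.index + 1 }, (Word w, st.index))

  let get_loc (_, loc) = loc
  let get_token (tok, _) = tok
end

(* Compile-time check that the instance fits the signature. *)
module Check : FORWARD_LEXER = List_lexer
```

Because lexer never mutates its argument, keeping an old state value
around is all that backtracking requires.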

Input and forward iterators are interconvertible.

A forward iterator can be made into an input
iterator by simply using a reference to the state,
that is, use a state variable to record the current
state.
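
A short sketch of that conversion. The name input_of_forward is
hypothetical, and the mutable state is hidden inside a closure rather
than passed explicitly, which is one of several ways to do it.

```ocaml
(* Wrap a functional step function so that each call advances a hidden
   reference to the current state and returns the next info. *)
let input_of_forward (step : 'state -> 'state * 'info) (start : 'state)
  : unit -> 'info =
  let current = ref start in
  fun () ->
    let next, info = step !current in
    current := next;
    info

(* Illustration: a "lexer" whose state is a counter. *)
let next_number = input_of_forward (fun n -> (n + 1, n)) 0
```

Each call to next_number () yields the following integer, starting
from 0; the old states are no longer reachable, so backtracking is
lost.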

An input iterator can be converted to a forward
iterator by 'buffering' tokens in a list. Doing
this efficiently is slightly tricky, that is,
only buffering enough tokens to satisfy a possible
backtrack (usually done with cut points).
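
A sketch of the buffering direction using lazy, memoised cells. The
names stream, forward_of_input and step are hypothetical. This version
does not implement explicit cut points; it relies on the garbage
collector to reclaim cells that no saved state refers to any more.

```ocaml
(* A memoised stream: each cell is computed at most once. *)
type 'info stream = Cons of 'info * 'info stream Lazy.t

(* Turn a mutable source into a functional one. The source is read only
   when a cell is first forced, and cells can only be reached in order. *)
let forward_of_input (next : unit -> 'info) : 'info stream Lazy.t =
  let rec make () = lazy (let info = next () in Cons (info, make ())) in
  make ()

(* Functional step: returns the following state and the current info. *)
let step (s : 'info stream Lazy.t) : 'info stream Lazy.t * 'info =
  let (Cons (info, rest)) = Lazy.force s in
  (rest, info)
```

Stepping twice from the same saved state returns the same info without
reading the underlying source again.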

In both these interfaces the srcloc type is supplied
by the user. Ideally, the token type would be too.
In that case another function is needed:

	get_token_code: token -> int

which is what the parser uses: that's the tag
of a variant constructor or whatever.
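
For a user-supplied token type, such a function could be as simple as
the following. The token constructors are made up for the example,
loosely following the INPUT and COLON tokens from the grammar quoted
above, and the numbering is arbitrary.

```ocaml
(* Hypothetical user-defined token type. *)
type token = Input | Colon | Ident of string | Eof

(* The parser only needs a small integer per kind of token; any payload
   such as the identifier's text is ignored here. *)
let get_token_code : token -> int = function
  | Input -> 0
  | Colon -> 1
  | Ident _ -> 2
  | Eof -> 3
```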

These interfaces should be standardised for ALL
parsers so we have 'plugin' ability. Of course,
the semantics may depend on the kind of parser
and grammar.

Ocamlyacc itself could be easily modified to fit
this design by the lexer simply returning its lexbuf
with the token.

In summary: the key problem with what you want
to do is that it makes no sense semantically.
You want to return information that the parser
cannot in principle obtain. The fact it appears
visible is actually a design bug in Ocamlyacc
which has been duplicated by Dypgen in compatibility
mode.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


