From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA16367 for caml-redistribution; Fri, 31 Jan 1997 08:40:56 +0100 (MET)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA25227 for <caml-list@pauillac.inria.fr>; Thu, 30 Jan 1997 10:55:21 +0100 (MET)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.7.6/8.7.3) with ESMTP id KAA07721; Thu, 30 Jan 1997 10:55:20 +0100 (MET)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id KAA25223; Thu, 30 Jan 1997 10:55:20 +0100 (MET)
From: Xavier Leroy <Xavier.Leroy@inria.fr>
Message-Id: <199701300955.KAA25223@pauillac.inria.fr>
Subject: Re: ergonomie du compilateur
In-Reply-To: <199701211254.NAA13357@ithif18.inf.tu-dresden.de> from Hendrik Tews at "Jan 21, 97 01:54:46 pm"
To: tews@tcs.inf.tu-dresden.de (Hendrik Tews)
Date: Thu, 30 Jan 1997 10:55:19 +0100 (MET)
Cc: caml-list@inria.fr
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: weis

> - There is no error recovery in the compiler. Before I start the
> first compile I have at least 5 syntax errors in the file. In
> order to find them I have to go through at least 5 edit-compile
> cycles, which I find very annoying.

My experience with ML compilers that attempt error recovery is that
they usually get it wrong and report many spurious syntax errors after
the first actual error, which is even more annoying than stopping at
the first error. Perhaps that's because of the ML syntax, which
has fewer obvious resynchronization points than, say, C syntax.

Also, as an implementor, I must say that I find parsing extremely
boring and prefer to concentrate on semantic issues. But all the usual
Yacc error recovery machinery is there, and any work on the OCaml
parser is welcome.

> - In case of a syntax error, the compiler never reports what it
> was expecting.

My understanding is that this can be done on a case by case basis,
using error productions. But I'm not sure it can be done in a
systematic way by reporting all tokens that the LALR tables are
expecting at that point. My recollection is that various Yacc
optimizations make this information (the set of expected tokens) hard
to recover from the transition tables.

> - I don't understand the behavior of "ocamlc -i" if an type error
> occurs. Especially in this case, where any information about the
> inferred types would be useful, it just prints nothing.

The "-i" option means ``typecheck the whole compilation unit, then
print out the inferred signature''. If the typing fails, nothing is
printed. Remember that a compilation unit, in OCaml, is parsed as a
whole, then typechecked as a whole, etc. There is no such thing as a
"phrase" or "toplevel definition" in a compilation unit.

In Caml Light there is actually a notion of "phrase" in a compilation
unit, and the "-i" option reports types on a phrase by phrase
basis. This is better for understanding type errors, but can alos lead
to inconsistent typings being printed. Consider:

        let r = ref []
        let f x = ... r := [1] ...

camlc -i reports

        value r : '_a list ref
        value f : ... -> ...

which is not right (the type of r is actually int list ref), while
ocamlc -i reports the correct types

        val r : int list ref
        val f : ... -> ...

Regards,

- Xavier Leroy