From: Boris Yakobowski
Date: Wed, 5 Jan 2011 15:12:19 +0100
To: The Caml Mailing List
In-Reply-To: <20110104203152.GA3828@yquem.inria.fr>
Subject: Re: [Caml-list] Array.make exception and parser

On Tue, Jan 4, 2011 at 9:31 PM, Francois Pottier wrote:
> It is true that ocamlyacc (and Menhir) offer essentially no support for
> explaining parse errors. (The "error" pseudo-token, inherited from yacc, is
> supposed to help, but in my opinion its use pollutes the grammar and makes it
> uncontrollable.) Nevertheless, as underlined by Yitzhak, I don't think there
> is a deep reason why LR parsers must be bad at explaining errors. In
> principle, upon detecting an error, an LR parser could easily dump the stack,
> which corresponds to the sentence (composed of terminal and non-terminal
> symbols) that has been recognized so far.

I think the stack would be useless for the user: too long, and impossible to
understand without the grammar. It would be barely better for the writer of
the grammar, as he would need to recognize the parsing state to produce an
intelligible error report.

I think the error token is a good idea that just went too far. Its ability to
shift and reduce allows writing parsers that recover from syntax errors, but
we hardly do that nowadays. Instead, using the error token causes bogus
shift/reduce conflicts...

What I propose is the following: still use the error token, but do not allow
reduction. Instead, productions that contain the error token would only be
allowed to raise exceptions. This way, the parse errors are caught inside the
grammar, as they should be, but do not pollute the parsing itself.

> It could also display the set of
> look-ahead tokens that would *not* have caused an error in the current
> state.
> (Come to think of it, this is a feature that I would like to add to
> Menhir, if only time was not so much of the essence!)

This would be incredibly useful (provided the mly writer uses sane names for
their tokens, or ideally with some further cooperation from the lexer).

Finally, a remark on the parse errors returned by the OCaml compiler. Like
many of us, I find them very frustrating. However, the fault does not lie only
in the parsing technology. The OCaml grammar is much too ambiguous for its own
good (no difference between toplevel lets and inner ones, no delimiters for
ifs and matches, etc.). As a result, the compiler often reports the error far
away from the place where the mistake actually is. Camlp4 explains what
syntactic entity it expected when it finds a parse error, but this only works
if the error is detected at the right place. Language designers would be well
advised to keep that in mind when they design their language.

(BTW: a link to a changelog on the homepage of Menhir would be great. And on
http://yquem.inria.fr/cgi-bin/mailman/listinfo/menhir-list, the link to the
archives is broken.)

Cheers,

--
Boris
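
P.S. To make the "set of look-ahead tokens that would not have caused an
error" idea a bit more concrete, here is a minimal sketch in plain OCaml. It
is only an illustration under several assumptions: the parser is a
hand-written recursive-descent one, not an LR parser, and the expected sets
are written out by hand, whereas a generator such as Menhir would have to
compute them from the automaton. The token type and the names Syntax_error,
fail, parse and describe are all made up for this example.

```ocaml
(* Hypothetical token type for a tiny arithmetic language. *)
type token = INT of int | PLUS | LPAREN | RPAREN | EOF

(* Printable token names; "sane names" are what make the report readable. *)
let name = function
  | INT _ -> "INT"
  | PLUS -> "PLUS"
  | LPAREN -> "LPAREN"
  | RPAREN -> "RPAREN"
  | EOF -> "EOF"

(* Carries the offending token and the names of the tokens that would have
   been accepted at that point. *)
exception Syntax_error of token * string list

(* Report the first remaining token; an exhausted input counts as EOF. *)
let fail toks expected =
  let seen = match toks with t :: _ -> t | [] -> EOF in
  raise (Syntax_error (seen, expected))

(* Grammar: expr ::= atom (PLUS atom)*    atom ::= INT | LPAREN expr RPAREN *)
let rec expr toks =
  let v, rest = atom toks in
  sum v rest

and sum acc = function
  | PLUS :: rest ->
      let v, rest = atom rest in
      sum (acc + v) rest
  | rest -> (acc, rest)

and atom = function
  | INT n :: rest -> (n, rest)
  | LPAREN :: rest ->
      (match expr rest with
       | v, RPAREN :: rest -> (v, rest)
       (* Inside parentheses, after an expression: only PLUS or RPAREN fit. *)
       | _, rest -> fail rest ["PLUS"; "RPAREN"])
  (* At the start of an atom: only INT or LPAREN fit. *)
  | toks -> fail toks ["INT"; "LPAREN"]

(* Parse a whole token list and return the value of the expression. *)
let parse toks =
  match expr toks with
  | v, ([] | EOF :: _) -> v
  | _, rest -> fail rest ["PLUS"; "EOF"]

(* Turn the result, or the error, into a one-line message. *)
let describe toks =
  try Printf.sprintf "ok: %d" (parse toks)
  with Syntax_error (t, expected) ->
    Printf.sprintf "syntax error at %s, expected one of: %s"
      (name t) (String.concat ", " expected)

let () = print_endline (describe [LPAREN; INT 1; INT 2; EOF])
(* prints "syntax error at INT, expected one of: PLUS, RPAREN" *)
```

Even on this toy grammar the message is only as good as the token names, and
it is only helpful when the error is detected close to the actual mistake.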