On Thu, May 31, 2012 at 8:59 AM, Alain Frisch <alain@frisch.fr> wrote:
On 05/30/2012 04:14 PM, Hongbo Zhang wrote:
1. Why is camlp4 buggy?
The main buggy part is its parsing technology.

I don't consider that the main problem with Camlp4 is that it is buggy, but rather that (i) it is overly complex for the benefits it delivers,
and (ii) it is actually not such a great idea to change the concrete syntax.

Hi Alain,
   Thanks for your message.
   Some opinions below (feel free to correct me if I am wrong).
   Why do you think it's overly complex? Apart from the internal parsing mechanism, the rest is not complex, IMO. The revised syntax also has other benefits: writing an error-recovering parser for it is straightforward, and it is friendlier to IDEs.

Some examples of useless complexity:

 - A custom notion of AST. Why not simply use the OCaml one?  (Extended with new nodes, such as quotations.)

The simple answer is that you cannot, or at least not easily, once you want to support quotation and *antiquotation* *everywhere*. Just take a look at how ugly Template Haskell programs are.
 - The use of concrete syntax for manipulating the AST.  The developer needs to understand not only the new AST, but also how it is reflected exactly by the concrete syntax quotations (and this is non trivial), and where anti-quotations are allowed, etc. What's wrong with normal pattern matching and expression building with the standard AST?  It might be a little bit more verbose, but it's so much simpler to understand.

The answer is again that you cannot. There are some ambiguities that make it impossible to support quotations
and antiquotations. Currently Camlp4 supports quotations and antiquotations for the revised syntax in
all branches except Ast.TyDcl.

I agree it would be useful to write a simple quotation expander for Parsetree.structure_item (with limited antiquotation support) in Camlp4 and to add another hook for Parsetree.
 - A different syntax (the revised one).  I understand the benefits of this new syntax, but it seems kind of crazy to have a "low-level" tool implemented in (and encouraging) a syntax different from the core system.

 - A complicated bootstrapping cycle (partly a consequence of the fact that Camlp4 is itself written in a custom syntax).  That's mostly for OCaml maintainers, but in the past, it has slowed down development in a non-negligible way.


2. About the proposal.
There are mainly 2 pieces. About the hook
Parsetree.structure -> Parsetree.structure: given that camlp4 already
imports Parsetree, it's really trivial to
add another hook after camlp4astdump2ocamlast.

I've absolutely no doubt that Camlp4 can be extended to be at least as powerful as this "Parsetree rewriting" proposal.  What's important is that this proposal is so simple that it can be implemented in a few dozen lines of code in the core compiler. We should not create a dependency on a complex tool for problems which can be solved with something so simple.
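
To make the shape of such a hook concrete, here is a minimal sketch. It uses a toy AST rather than the real Parsetree (whose types are much larger), and the names `register`, `apply_rewriters` and `fold_structure` are hypothetical, chosen only for illustration; nothing here is taken from the compiler.

```ocaml
(* Toy stand-in for a parsetree; NOT the real Parsetree definition. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Let of string * expr * expr

type structure_item = Value of string * expr
type structure = structure_item list

(* A rewriter is an ordinary function from structure to structure. *)
type rewriter = structure -> structure

(* Hypothetical hook: registered rewriters run, in order, after parsing. *)
let rewriters : rewriter list ref = ref []
let register (f : rewriter) = rewriters := !rewriters @ [f]
let apply_rewriters (s : structure) : structure =
  List.fold_left (fun acc f -> f acc) s !rewriters

(* Example rewriter written with plain pattern matching and plain
   constructors: fold additions of integer constants. *)
let rec fold_expr = function
  | Add (a, b) ->
    (match fold_expr a, fold_expr b with
     | Int x, Int y -> Int (x + y)
     | a', b' -> Add (a', b'))
  | Let (x, e1, e2) -> Let (x, fold_expr e1, fold_expr e2)
  | (Int _ | Var _) as e -> e

let fold_structure : rewriter =
  List.map (fun (Value (name, e)) -> Value (name, fold_expr e))

let () = register fold_structure
```

With this sketch, `apply_rewriters [Value ("x", Add (Int 1, Add (Int 2, Int 3)))]` yields a structure whose single item binds `x` to `Int 6`. The rewriter needs no quotation syntax, only the type definition.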


It's still nontrivial to write a robust Parsetree.structure ->
Parsetree.structure function; it would be nice if we could provide a quotation
syntax for Parsetree.types.

I believe the opposite: it's simpler to write a robust AST->AST rewriting function if you work directly on the "real" AST definition, rather than a slightly different one with a custom syntax.  Just as an example, consider a left-hand side like:

 | <:pat< $p1$ | $p2$ | $p3$ >> -> ...

Will it capture both (p1|p2)|p3 and p1|(p2|p3)?  Or only one of them?
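
For comparison, a sketch of the same question asked directly on constructors. The `pat` type below is a toy with illustrative constructor names, not the Parsetree or Camlp4 definition; the point is only that the two nestings are visibly different values, so the author of a match has to decide which ones to handle.

```ocaml
(* Toy pattern AST; constructor names are illustrative only. *)
type pat =
  | PVar of string
  | POr of pat * pat

(* The two ways a three-way alternative can be nested. *)
let left_nested  = POr (POr (PVar "a", PVar "b"), PVar "c")  (* (a|b)|c *)
let right_nested = POr (PVar "a", POr (PVar "b", PVar "c"))  (* a|(b|c) *)

(* Matching one shape explicitly does not match the other. *)
let is_left_nested = function
  | POr (POr (_, _), _) -> true
  | _ -> false

(* Flattening treats both nestings uniformly. *)
let rec alternatives = function
  | POr (p, q) -> alternatives p @ alternatives q
  | p -> [p]
```

Here `is_left_nested` accepts `left_nested` and rejects `right_nested`, while `alternatives` returns the same three-element list for both.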
Another example: controlling precisely locations introduced in the AST fragments created with quotations is quite tricky.
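
A sketch of what explicit control over locations looks like when nodes are built by hand. The record layout, `dummy_loc`, `mk` and `succ_of` are all assumptions made for this illustration and do not mirror the real Location or Parsetree modules.

```ocaml
(* Toy location and AST types; hypothetical, for illustration only. *)
type loc = { line : int; col : int }

(* Hypothetical marker for "no source location". *)
let dummy_loc = { line = -1; col = -1 }

type expr = { desc : expr_desc; loc : loc }
and expr_desc =
  | Int of int
  | Add of expr * expr

(* Node constructor: the location is an explicit, optional argument. *)
let mk ?(loc = dummy_loc) desc = { desc; loc }

(* Rewrite [e] into [e + 1]; every generated node reuses [e]'s location,
   so an error in the generated code would point at the original source. *)
let succ_of (e : expr) : expr =
  mk ~loc:e.loc (Add (e, mk ~loc:e.loc (Int 1)))
```

Every new node states where its location comes from, so there is nothing to guess about what a quotation would have filled in.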



Alain



--
-- Bob