From: Gabriel Kerneis
To: caml-list@yquem.inria.fr
Cc: Nicolas.Pouillard@inria.fr
Subject: [Camlp4] Antiquotation and unescaped strings
Date: Mon, 18 Feb 2008 15:34:12 +0100
Message-ID: <20080218143412.GA4411@kerneis.info>

Hi,

I've got a problem with antiquotations and strings in camlp4.

I've got a parser (xmlp4, used in Ocsigen) written with camlp4 which has
to deal with strings. This parser provides a quotation mechanism to let
the user input XML in a natural way within an OCaml program.

Basically, the problem is the following expression (in the parser's
code):

    | PCData(s) -> [ <:expr< XHTML.EncodedPCData $str:s$ >> ]

where s is a (potentially unescaped) string.

Here is a typical output:

    [XML.EncodedPCDATA " "bar \" ]

corresponding to the following input file:

    <:xml< "bar\ >>

(assuming that this quotation is parsed as PCData, of course).

As you can see, the content of s ends up in the generated string literal
without any escaping. It took me some time to find out what was
happening (real-life examples being much more intricate).

Of course, I would like to get:

    [XML.EncodedPCDATA "\"bar \\" ]

instead, and of course, I could obtain it through:

    | PCData(s) -> [<:expr< XHTML.EncodedPCData $str:String.escaped s$ >>]

But I wonder whether this should be considered a camlp4 bug (or a
possible enhancement). When I write << EncodedPCData $str:s$ >>, I
expect to get something consistent, not some kind of "let's evaluate s
and copy/paste it blindly there".

Shall I submit a bug report? Or is there something I got wrong in my
code (which is very likely)?

Anyway, if this is intended behavior, the explanation should be put in
BIG BOLD RED letters on the camlp4 wiki, on the quotation page. This
would help a lot (plus I would like to understand what exactly is going
on there).

NB: I simplified the examples for the sake of clarity. You can find the
relevant code here:
http://ocsigen.org/trac/browser/xmlp4/newocaml/xhtmlparser.ml

You may also want to have a look at:
http://ocsigen.org/trac/browser/xmlp4/newocaml/xmllexer.mll
and
http://ocsigen.org/trac/browser/xmlp4/newocaml/xhtmlsyntax.ml

I'd welcome any remark regarding this code.

Kind regards,
-- 
Gabriel Kerneis
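
The difference between $str:s$ and $str:String.escaped s$ described above
can be reproduced without camlp4. The sketch below is only an
illustration under stated assumptions: literal_raw and literal_escaped
are hypothetical helpers (they are not part of camlp4 or xmlp4) that
imitate, with plain string concatenation, how the content of s would end
up between double quotes in the generated source. The whitespace around
the PCData content is left out for brevity.

```ocaml
(* Hypothetical helpers, not part of camlp4 or xmlp4. They only imitate,
   with string concatenation, how s lands in the generated source. *)

(* Paste s between double quotes without escaping it. This is the
   behaviour that $str:s$ appears to have in the report. *)
let literal_raw s = "\"" ^ s ^ "\""

(* Escape s first, as in $str:String.escaped s$. *)
let literal_escaped s = "\"" ^ String.escaped s ^ "\""

(* PCData content from the example: a double quote, then bar, then a
   single backslash. *)
let s = "\"bar\\"

let () =
  print_endline (literal_raw s);
  print_endline (literal_escaped s)
```

The first line printed is `""bar\"`, which is not a well-formed OCaml
string literal; the second is `"\"bar\\"`, which is well-formed and
denotes the original content of s.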