From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.7 required=5.0 tests=AWL,RCVD_IN_BL_SPAMCOP_NET autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id CFD21BC0B for ; Wed, 29 Nov 2006 00:07:42 +0100 (CET) Received: from 42.mail-out.ovh.net (42.mail-out.ovh.net [213.251.189.42]) by concorde.inria.fr (8.13.6/8.13.6) with SMTP id kASN7gjP026593 for ; Wed, 29 Nov 2006 00:07:42 +0100 Received: (qmail 12 invoked by uid 503); 28 Nov 2006 23:07:43 -0000 Received: from b6.ovh.net (HELO mail240.ha.ovh.net) (213.186.33.56) by 42.mail-out.ovh.net with SMTP; 28 Nov 2006 23:07:43 -0000 Received: from b0.ovh.net (HELO queue-out) (213.186.33.50) by b0.ovh.net with SMTP; 28 Nov 2006 23:07:42 -0000 Received: from vil93-4-82-227-140-227.fbx.proxad.net (HELO ?192.168.1.2?) (lists%philippewang.info@82.227.140.227) by ns0.ovh.net with SMTP; 28 Nov 2006 23:07:39 -0000 Message-ID: <456CC136.4080905@philippewang.info> Date: Wed, 29 Nov 2006 00:07:34 +0100 From: Philippe Wang User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Till Varoquaux Cc: caml-list@inria.fr Subject: Re: [Caml-list] About the O'Reilly book on the web References: <45688DAE.7010309@ccr.jussieu.fr> <456AAABE.5020405@irisa.fr> <456CA3B7.1020508@philippewang.info> <9d3ec8300611281433q5509ccby65937fd4384f5a25@mail.gmail.com> In-Reply-To: <9d3ec8300611281433q5509ccby65937fd4384f5a25@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Ovh-Remote: 82.227.140.227 (vil93-4-82-227-140-227.fbx.proxad.net) X-Ovh-Local: 213.186.33.20 (ns0.ovh.net) X-Miltered: at concorde with ID 456CC13E.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; o'reilly:01 regexp:01 syntax:01 clashes:01 -warn-error:01 regexp:01 ocaml:01 backslash:01 backslash:01 compiler:01 runtime:01 syntax:01 coq:01 ocaml:01 syntaxe:01 Hello, > Although I do agree the problem seems a little more complicated: > we are used to a more or less standard regexp syntax where special > chars can be escaped by \, this obviously clashes with escaping > characters in string if we pass strings to the function defining the > regular exceptions.... > I would recommend treating all warnings as errors: > -warn-error A > to avoid such conflicts. I am not used to using regexp with Caml... (although I write some scripts with OCaml instead of using Bash or Perl, or even Php, sometimes...) But writing systematically the double backslash for a single regexp backslash and a quadruple backslash for a backslashed backslash... Well I wouldn't do it! > As far as I'm concerned I find the problem to be more complicated: > regular expressions are not syntaxily checked nor are they typed > checked when specified through strings. Some languages intergrate them > as first class values thus allowing these verifications. Last semester, with some friends we wrote a Caml compiler (that keeps the type informations at runtime), and one idea (which we did not implement because of some lack of time) was first class regexp, a bit like in Perl, while mixing it with the match-with-like syntax... (If only we could have as much time as we want or need...) It could be something "usable", like : matchr (* the matchr keyword is an example *) "some string" with | "PLOP 42 - \([0-9]\)" -> Int (int_of_string $1) | "PLOP 43 - \(.*\)" -> $1 Then you can close it like in Coq with an "end" keyword, or just keep the OCaml syntaxe... (implicit closing, whatever) Well, of course, bad thing could be that $ is a character for infix operators (but whatever it's not so important) Then the question is still "What do we want OCaml to be?" Do we want to make regexps easy to use with OCaml ? And are we ready to make OCaml bigger ("just") for that ? > Another > solution would be to build them using an Ocaml recursive sum type. > Although this would solve the syntax problem it would make regexp very > tedious to write. A library offering both options can be found at: > http://www.lri.fr/~marche/regexp/ It looks really ... "not funny" I would say! > Ideally one would want to precompile regular expression from strings > to actual constructed types using a preprocessor (e.g. camlp4). It > seems Francois Potier was one of the first to try such an approach: > [http://caml.inria.fr/pub/ml-archives/caml-list/2001/07/30b327c7c4b0fa5ace86dbf258e2c5d1.en.html] > > I'm pretty sure this has been done in other libraries (regexp-pp for > instance). Actual type-checking might prove a little harder to get > working. Actually I don't really see the types problems... If everything in return with $n has type string, then there is no matter... We can also easily detect that $4 does not exist for a regexp such as "\(PLOP[4-2]\) 42.*" :-p (or decide that $4 would be an empty string, but it would seem a bit dirty) Anyways, it would probably be more "a good thing" than "a bad thing"... Philippe > P.S.:Je confirme: j'ai bien recu ton mail ;-)... PS : Je n'ai toujours pas compris comment ça marche... Parce que le mail qui est passé est celui qui a été envoyé avec la mauvaise adresse o_O Peut-être que je comprendrai un jour prochain... (donc je retente avec la "mauvaise adresse" puisque ça semble mieux fonctionner)