From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 33A81BB9C for ; Wed, 23 Nov 2005 21:56:31 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jANKuUN3016532 for ; Wed, 23 Nov 2005 21:56:30 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id VAA10350 for ; Wed, 23 Nov 2005 21:56:29 +0100 (MET) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jANKuS8G016527; Wed, 23 Nov 2005 21:56:29 +0100 Received: from frontend1.internal (mysql-sessions.internal [10.202.2.149]) by frontend1.messagingengine.com (Postfix) with ESMTP id 52D21D173BB; Wed, 23 Nov 2005 15:56:24 -0500 (EST) Received: from frontend2.messagingengine.com ([10.202.2.151]) by frontend1.internal (MEProxy); Wed, 23 Nov 2005 15:56:24 -0500 X-Sasl-enc: 5fK/Iebixr76qvIgZ9YQmJ7YIP+hh1DGl970O+sBaeNB 1132779383 Received: from munge (burnham.ljcrf.edu [192.231.106.2]) by frontend2.messagingengine.com (Postfix) with ESMTP id 33B04571434; Wed, 23 Nov 2005 15:56:17 -0500 (EST) Date: Wed, 23 Nov 2005 12:56:04 -0800 (PST) From: Martin Jambon X-X-Sender: martin@munge To: Luc Maranget Cc: Christophe Raffalli , sejourne_kevin , caml-list Subject: Re: [Caml-list] Request for complete pattern matching In-Reply-To: <20051123183134.GB6446@yquem.inria.fr> Message-ID: References: <43839F1A.2080909@univ-savoie.fr> <43842069.3070700@yahoo.fr> <43848103.9060504@univ-savoie.fr> <20051123183134.GB6446@yquem.inria.fr> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Miltered: at nez-perce with ID 4384D77E.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 4384D77C.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 maranget:01 replacing:01 en':01 camlp:01 compilation:01 doable:01 matchings:01 camlp:01 extensively:01 computations:01 endline:01 endline:01 regexps:01 ocaml:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO autolearn=disabled version=3.0.3 On Wed, 23 Nov 2005, Luc Maranget wrote: >> >> No this is not a problem of test. I want to be able to write >> >> match e with >> (0 <= 0, g) -> g >> | (f, 0 <= 0) -> f >> | (f, g) -> fun x -> f x + g x >> > Well, I understand better and (as you may have guessed yourself!) I do > not incline to adopt the idea. > > However provided you have some 'break' construct to transfert control to > next matching, you can perhaps compile your construct by syntactic > transformation. > > My idea is to transform your patterns into > normal ones, replacing p <= e1 e2 ... en by fresh variables (say v) > and then to match 'v e1 ... en' against p. > > Here you have : > > match e with > | (v1, g) -> (match v1 0 with 0 -> g |_ -> break) > | (f, v2) -> (match v2 0 with 0 -> f |_ -> break) > | f, g -> fun x -> f x + g x > > At a little additional cost in complexity, > you can replace 'break' (which does not exists) by exceptions as follows > > let x = e in > try (match x with > | (v1, g) -> (match v1 0 with 0 -> g |_ -> raise Exit) > | _ -> raise Exit) > with Exit -> > try (match x with > | (f, v2) -> (match v2 0 with 0 -> f |_ -> raise Exit) > | _ -> raise Exit) > with Exit -> > (match x with f, g -> fun x -> f x + g x) > > > I am not familiar enough with Camlp4, but I have the feeling > that some purely syntactic compilation of your construct is doable. > Since, only from the presence of <=, > can you introduce the extra matchings and try .. with Exit) > I am not saying it is easy, just that it looks feasible. That's basically what I did for Micmatch (http://martin.jambon.free.fr/micmatch.html). It was not so simple to implement with Camlp4, but it works. Exceptions are used extensively as you describe. So it is actually possible to insert arbitrary computations into ML patterns! Of course the complexity is not O(size of the pattern) anymore. For instance, you can write: (* toto.ml *) try while true do match read_line () with / upper / | / _* "." eos / -> print_endline "looks like a sentence" | "." | / ("bye"~ space*)+ / -> print_endline "Bye!"; exit 0 | _ -> print_endline "???" done with End_of_file -> () Notes: - the stuff between slashes are regexps - "." and the last _ are regular OCaml patterns - regexps are replaced by an identifier which is matched after the arrow using library functions, then it is decided whether to jump to the next case or to execute the user-given expression. Here is a session using the silly program above: $ micmatch toto.ml Hello looks like a sentence ho ho ho. looks like a sentence !@#$%^& ??? goodbye ??? bye bye Bye! $ > Typing and warnings are yet another issue! Warnings regarding incomplete match cases can be suppressed by adding 'when true' guards and superfluous catch-all cases. Tail-call optimizations are preserved by this transformation: (* f may raise an exception Exn, but not g which is tail-recursive. So we don't write this: *) let rec g arg = ... try f x; g y with Exn -> h z (* but that: *) let rec g arg = ... (try f x; fun () -> g y with Exn -> fun () -> h z) () You can run "camlp4o pr_o.cmo -I `ocamlfind query micmatch_pcre` pa_micmatch_pcre.cma toto.ml" to see the final ocaml code. So it is definitely possible to create a generic camlp4 library which allows the insertion of custom tests/bindings into ocaml patterns. In addition to string/regexp matching, the library could be used for the following extensions: - 'when' tests appearing inside of the pattern - evaluation and matching of lazy data - regular expressions over anything (lists, arrays, trees, ...) - views (as far as I understand the concept) - matching objects using method calls as record fields - other? I have no plan of implementing this myself, but it might be a good project for a student... Martin -- Martin Jambon, PhD http://martin.jambon.free.fr Store and share your bioinformatics tips at http://wikiomics.org