From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q0LIED8V013531 for ; Sat, 21 Jan 2012 19:14:13 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AsoBAEP/Gk/RVdQ2kGdsb2JhbABCngYBkAkIIgEBAQEJCQ0HFAQhgWkFHQITcAEHXRIBBQEWHwgahSaCK5gpgl8KnDiJCoMcBIg7jF2OET2EHQ X-IronPort-AV: E=Sophos;i="4.71,548,1320620400"; d="scan'208";a="128363277" Received: from mail-vw0-f54.google.com ([209.85.212.54]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Jan 2012 19:14:07 +0100 Received: by vbbey12 with SMTP id ey12so1816797vbb.27 for ; Sat, 21 Jan 2012 10:14:06 -0800 (PST) MIME-Version: 1.0 Received: by 10.52.66.166 with SMTP id g6mr1289817vdt.34.1327169646470; Sat, 21 Jan 2012 10:14:06 -0800 (PST) Received: by 10.52.186.70 with HTTP; Sat, 21 Jan 2012 10:14:06 -0800 (PST) Date: Sat, 21 Jan 2012 13:14:06 -0500 Message-ID: From: Pierre Chopin To: caml-list@inria.fr X-Gm-Message-State: ALoCoQlzczbQcQlN4B1hOStzhRH3YaU44T/BP+d4mFvUteCw0p8lrx0M4w8V6NP1uIvZXwyvA+GR Content-Type: multipart/alternative; boundary=20cf307f309255a55704b70dc364 Subject: [Caml-list] pattern matching on integer intervals

--20cf307f309255a55704b70dc364
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I am trying to do pattern matching on Unicode characters, represented by integers. I would like to do something like this:

 let f c =
 match c with
   0xff .. 0xfff -> foo

I know we can pattern match over char intervals, but it doesn't seem to be the case for int intervals. So I have two questions:
 Is there a better way of doing what I am doing, and why is it possible to pattern match over char intervals and not int intervals?

--
Pierre Chopin,
Chief Technology Officer and co-founder
punchup LLC
pierre@punchup.com

--20cf307f309255a55704b70dc364--

From mboxrd@z Thu Jan 1 00:00:00 1970 X-Sympa-To: caml-list@inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q0LIU9fL013929 for ; Sat, 21 Jan 2012 19:30:09 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgABAP0DG0/RVda2kGdsb2JhbABChQmpBwgiAQEBAQkJDQcUBCGBcgEBAQQSAg8EGQE4AQMMAQUFCw0CAiYCAiISAQUBHAYTIqJYCosigzeEKIkwAgULgSSJYYEWBIg7jF2OET2BUIJN X-IronPort-AV: E=Sophos;i="4.71,548,1320620400"; d="scan'208";a="128363968" Received: from mail-tul01m020-f182.google.com ([209.85.214.182]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Jan 2012 19:30:04 +0100 Received: by obcwo16 with SMTP id wo16so3319186obc.27 for ; Sat, 21 Jan 2012 10:30:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=jSKkzMiXrGYwtdU1R6l2vZK1GBDUb9g07EQ4rAr1E5c=; b=KpoSDtP3ODSwAaiZvbsECoJTHaWkRKn85dqYGn57GXqJ06OdoxooEpfifi6jXC4JUW w/yFYCmEDUX8Q+PYfw8n2WE3XH2JfYIMS3I/fRQSUcvgmDOCGg50Bc8PIwCRkYCkdJw9 555qnksr9U6/A+bFwqZnZW/43PENpC6koS0/k= MIME-Version: 1.0 Received: by 10.182.38.70 with SMTP id e6mr2311167obk.13.1327170603037; Sat, 21 Jan 2012 10:30:03 -0800 (PST) Sender: till.varoquaux@gmail.com Received: by 10.182.28.198 with HTTP; Sat, 21 Jan 2012 10:30:02 -0800 (PST) In-Reply-To: References: Date: Sat, 21 Jan 2012 13:30:02 -0500 X-Google-Sender-Auth: jR79WzbpY4E1ThNEQkEPohi3lwA Message-ID: From: Till Varoquaux To: Pierre Chopin Cc: caml-list@inria.fr Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id q0LIU9fL013929 X-Validation-by: till@pps.jussieu.fr Subject: Re: [Caml-list] pattern matching on integer intervals

I remember wondering the same thing, and the answer can be found in the OCaml parser: char intervals are expanded at parse time to the full list of characters. This would blow up if you were to use int, int64, etc.

HTH,
Till

On Sat, Jan 21, 2012 at 1:14 PM, Pierre Chopin wrote:
> Hi,
>
> I am trying to do pattern matching on Unicode characters, represented by
> integers. I would like to do something like this:
>
>  let f c =
>  match c with
>    0xff .. 0xfff -> foo
>
> I know we can pattern match over char intervals, but it doesn't seem to be the
> case for int intervals. So I have two questions:
>  Is there a better way of doing what I am doing, and why is it possible to
> pattern match over char intervals and not int intervals?
>
> --
> Pierre Chopin,
> Chief Technology Officer and co-founder
> punchup LLC
> pierre@punchup.com
>

From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q0LIYWlQ014172 for ; Sat, 21 Jan 2012 19:34:32 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgYBAP0DG09KfVI0imdsb2JhbABCrhAIIgEBAQoJDQcSBiGBcgEBAQQSAhMZARsdAQMMBgULDS4iAREBBQEcBhMiolgKi2qCb4QoP4hxAgULjBsElRiOET2EAA X-IronPort-AV: E=Sophos;i="4.71,548,1320620400"; d="scan'208";a="140879885" Received: from mail-ww0-f52.google.com ([74.125.82.52]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Jan 2012 19:34:27 +0100 Received: by wgbdq12 with SMTP id dq12so1956692wgb.9 for ; Sat, 21 Jan 2012 10:34:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=dY8hk7Qtjn4YBgzDQ6OQrGaj/emjTi0bLM82xqKfBms=; b=CvI4VnHfbyeo8B17AYzJ0sPRThpuwi4MlzrGlRByis2hfFWacSLg3E0vwto3oHt2PM fsFWTkVg0B/jFTI4vmU/LandT+fCRsi6NJRUB2DLWzP4sZgjMIXTuV4VAl1DxyjIsA9E 1J+FENirYco2nnXpGkeeTZsBiooILmgRekbRM= Received: 
by 10.180.93.168 with SMTP id cv8mr4914554wib.2.1327170867202; Sat, 21 Jan 2012 10:34:27 -0800 (PST) MIME-Version: 1.0 Received: by 10.227.198.147 with HTTP; Sat, 21 Jan 2012 10:33:46 -0800 (PST) In-Reply-To: References: From: Gabriel Scherer Date: Sat, 21 Jan 2012 19:33:46 +0100 Message-ID: To: Pierre Chopin Cc: caml-list@inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id q0LIYWlQ014172 Subject: Re: [Caml-list] pattern matching on integer intervals

I believe your best bet is the 'if .. then .. else' chain. You may also use 'when' clauses in pattern matching, but those don't scale too well and are best avoided if they are the only content of your match: pattern matching is good for structural deconstruction and environment binding, but can be confusing and useless when you are using none of those aspects.

On Sat, Jan 21, 2012 at 7:14 PM, Pierre Chopin wrote:
> Hi,
>
> I am trying to do pattern matching on Unicode characters, represented by
> integers. I would like to do something like this:
>
>  let f c =
>  match c with
>    0xff .. 0xfff -> foo
>
> I know we can pattern match over char intervals, but it doesn't seem to be the
> case for int intervals. So I have two questions:
>  Is there a better way of doing what I am doing, and why is it possible to
> pattern match over char intervals and not int intervals?
>
> --
> Pierre Chopin,
> Chief Technology Officer and co-founder
> punchup LLC
> pierre@punchup.com
>

From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q0LJkCtQ015135 for ; Sat, 21 Jan 2012 20:46:12 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgYBAJEVG0/RVaC2kGdsb2JhbABCgw2rBwgiAQEBAQkJDQcUBCGBcgEBAQQSAhMRCAEbHAIDDAYFCw0JFg8JAwIBAgEREQEFARwTCAEBHqJ5Cotqgm+EHz+IcQIFC4h/gxwEiDuMXYVWgTeHBD2EHQ X-IronPort-AV: E=Sophos;i="4.71,548,1320620400"; d="scan'208";a="128366682" Received: from mail-gy0-f182.google.com ([209.85.160.182]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Jan 2012 20:46:07 +0100 Received: by ghy10 with SMTP id 10so944313ghy.27 for ; Sat, 21 Jan 2012 11:46:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; bh=5hJRSWCehjYAev4z67fs7DerJkBlkGQv0Z1V5FkRKBQ=; b=CSDZFeZf6SNICeRql/C7VMaX+0F867hcCGTu7yzy+/XMx0uaGoJMtXQ2dhEpOfkvt8 6Udwk5pkvKd5E73OewLep/1N6hmrhsBlD0PluQyIQp1WHI54UR7Rzt12N9udGrDGFvur n6bDuJo7EXqS03SYGAs4sqzTF9O1MqAfG4Dzc= Received: by 10.236.173.40 with SMTP id u28mr4169848yhl.15.1327175165880; Sat, 21 Jan 2012 11:46:05 -0800 (PST) Received: from [192.168.1.65] (99-121-78-10.lightspeed.lnngmi.sbcglobal.net. 
[99.121.78.10]) by mx.google.com with ESMTPS id a24sm19698175ana.13.2012.01.21.11.46.04 (version=SSLv3 cipher=OTHER); Sat, 21 Jan 2012 11:46:05 -0800 (PST) Message-ID: <4F1B15FB.1030900@gmail.com> Date: Sat, 21 Jan 2012 14:46:03 -0500 From: Edgar Friendly User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20111124 Thunderbird/8.0 MIME-Version: 1.0 To: caml-list@inria.fr References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Caml-list] pattern matching on integer intervals

On 01/21/2012 01:14 PM, Pierre Chopin wrote:
> Hi,
>
> I am trying to do pattern matching on Unicode characters, represented by
> integers. I would like to do something like this:
>
> let f c =
> match c with
> 0xff .. 0xfff -> foo
>
> I know we can pattern match over char intervals, but it doesn't seem to be
> the case for int intervals. So I have two questions:
> Is there a better way of doing what I am doing, and why is it possible
> to pattern match over char intervals and not int intervals?

The author of Camomile wrote IMap and ISet to efficiently store maps/sets over a large domain where large ranges share the same value. You may be able to use these DIET trees from Camomile or from Batteries to accomplish something along the lines of what you want.

E.
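
For reference, here is a minimal OCaml sketch of the two workarounds suggested in this thread (a 'when' guard and an 'if .. then .. else' chain), next to the char-interval form that the language does accept. The names in_range, f_guard, f_if and is_lower are made up for illustration, and the string results merely stand in for the original foo; treat it as one possible way to write this, not as the approach any of the posters had in mind.

```ocaml
(* Sketch only: [in_range], [f_guard], [f_if] and [is_lower] are
   hypothetical names, and the strings stand in for the original [foo]. *)

(* Inclusive range test: true when lo <= c <= hi. *)
let in_range lo hi c = lo <= c && c <= hi

(* 'when' guard version of the match from the question. *)
let f_guard c =
  match c with
  | c when in_range 0xff 0xfff c -> "in range"
  | _ -> "other"

(* Equivalent 'if .. then .. else' version. *)
let f_if c =
  if in_range 0xff 0xfff c then "in range" else "other"

(* For contrast: interval patterns are accepted on char literals. *)
let is_lower = function
  | 'a' .. 'z' -> true
  | _ -> false
```

With a single interval the two versions are interchangeable; as more ranges are added, the 'if' chain or a small table of (lo, hi) bounds tends to stay easier to read than a long list of guarded cases.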