caml-list - the Caml user's mailing list
From: Luc Maranget <luc.maranget@inria.fr>
To: alex@baretta.com (Alessandro Baretta)
Cc: garrigue@kurims.kyoto-u.ac.jp (Jacques Garrigue),
	caml-list@inria.fr (Ocaml)
Subject: Re: [Caml-list] Again on pattern matching and strings
Date: Thu, 24 Oct 2002 15:24:26 +0200 (MET DST)
Message-ID: <200210241324.PAA0000015374@beaune.inria.fr>
In-Reply-To: <3DB7E9B5.8020406@baretta.com> from "Alessandro Baretta" at oct 24, 2002 02:38:13

> 
> 
> 
> Jacques Garrigue wrote:
> 
> > That message was about polymorphic variants, which are encoded as
> > integers, and for which pattern-matching is a decision tree.
> > 
> > However, if you look in bytecomp/matching.ml, you will see that string
> > patterns are just checked sequentially (the ordering is not used).
> > Moreover, the match compiler seems to be clever enough to compile
> > properly the above style:...
> 
> Very strange. I thought the Ocaml compiler should 
> precalculate the branch of pattern matching to be taken, and 
> then jump, thereby avoiding sequential checking. I'm sorry 
> for my mistake.

If you are interested in the code generated for pattern matching (PM),
I would suggest that you have a look at the code produced after
pattern-matching compilation (option -dlambda) before looking at the
compiler sources.

The issue is not really PM but rather switches: how to compile
a search in an ordered list of constants?

To sum it up for strings: strings are atoms to the PM compiler, which
never looks inside them. It only compares one string against another,
and for equality only. The match compiler takes no advantage of the
known pattern strings, nor of the existence of a lexical ordering on
strings. In fact, many << optimizations >> are possible here; none is
performed.
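
To illustrate, here is a minimal sketch. The function names and the
strings are invented for the example, and the second function is only
a hand-written approximation of the behaviour described above, not the
compiler's actual output.

```ocaml
(* Hypothetical example: a match over string constants. *)
let color_code s =
  match s with
  | "red" -> 1
  | "green" -> 2
  | "blue" -> 3
  | _ -> 0

(* Roughly what sequential checking amounts to: one equality test per
   pattern, tried in order, with no use of string ordering or of
   common prefixes. *)
let color_code_seq s =
  if s = "red" then 1
  else if s = "green" then 2
  else if s = "blue" then 3
  else 0
```

Compiling such a file with ocamlc -dlambda -c shows what the match
compiler really produced for color_code.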


For ``constants'' from other datatypes (that is, at the compiler
level, for machine integers) the switch compiler performs many
optimizations. Basically, the compiler mixes tests against constants
(= i, < i, etc., and a special x in [i1..i2] test) with
jump tables.
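
As a sketch of the kind of source that gives the switch compiler
something to work with (both functions are made up for the example,
and which mix of tests and jump tables is actually chosen is up to the
compiler; -dlambda will tell):

```ocaml
(* Hypothetical examples. Characters and integers are both machine
   integers at the level where switches are compiled. *)

(* Ranges and scattered constants: a natural candidate for
   comparisons and interval tests. *)
let classify c =
  match c with
  | '0' .. '9' -> "digit"
  | 'a' .. 'z' | 'A' .. 'Z' -> "letter"
  | ' ' | '\t' | '\n' -> "blank"
  | _ -> "other"

(* A dense run of constants: a natural candidate for a jump table. *)
let day_name n =
  match n with
  | 0 -> "Mon"
  | 1 -> "Tue"
  | 2 -> "Wed"
  | 3 -> "Thu"
  | 4 -> "Fri"
  | 5 -> "Sat"
  | 6 -> "Sun"
  | _ -> "?"
```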

Strings hence remain ``the ghost in the machine'' as regards switch
efficiency.

All this can be explained by history and by the search for a compromise
between compiler complexity on the one hand, and code efficiency on
the other.


If you want efficient search in a set of strings, PM is not the
solution; a library solution is provided by Hashtbl or Map.
More efficient solutions can be obtained by writing your own code, or
are provided by third-party libraries.
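
The library solution can be sketched as follows. Hashtbl and Map.Make
are the standard library modules; the keyword list and the function
names are assumptions made for the example.

```ocaml
(* Hypothetical keyword table, used to fill both structures. *)
let keywords = [ ("let", 1); ("match", 2); ("with", 3) ]

(* Hash table: lookup in expected constant time. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 16
let () = List.iter (fun (k, v) -> Hashtbl.replace table k v) keywords

let lookup_hashtbl s =
  try Some (Hashtbl.find table s) with Not_found -> None

(* Balanced tree keyed on strings: logarithmic lookup that relies on
   the ordering of strings, which the match compiler ignores. *)
module StringMap = Map.Make (String)

let map =
  List.fold_left
    (fun m (k, v) -> StringMap.add k v m)
    StringMap.empty keywords

let lookup_map s =
  try Some (StringMap.find s map) with Not_found -> None
```

Both lookups return None for an unknown string, which plays the role
of the final wildcard case of a match.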

As to your original problem, I cannot resist proposing a quick and
dirty solution that uses cpp while still keeping meaningful line numbers.

yourfile.ml:
#define S1 "...."
#define S2 "xxxx"

...

let f x = match x with
| S1 -> ...
| S2 -> ...


Makefile:
CPP=cpp -E -P
#Some alternatives...
#CPP=/lib/cpp -E -P  (Old fashioned Unix)
#CPP=gcc -E -P -x c  (If you have gcc)
 ....

yourfile.cmo: yourfile.ml
	ocamlc -pp '${CPP}' -c yourfile.ml



Of course, this is quite dirty, and the previous messages gave
hints at much cleaner solutions. In particular, learning how to use
camlp4 is a worthy investment.



> Alex Baretta

--Luc Maranget
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


