caml-list - the Caml user's mailing list
From: Yutaka OIWA <oiwa@yl.is.s.u-tokyo.ac.jp>
To: skaller@users.sourceforge.net
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] [ann] Regexp library supporting binding for * and +'s
Date: Mon, 20 Sep 2004 15:54:48 +0900
Message-ID: <vfibrg1bfbb.fsf@tuba.is.s.u-tokyo.ac.jp>
In-Reply-To: <1095640712.2580.292.camel@pelican.wigram> (skaller's message of "20 Sep 2004 10:38:33 +1000")

>> On 20 Sep 2004 10:38:33 +1000, skaller <skaller@users.sourceforge.net> said:

skaller> On Mon, 2004-09-20 at 06:41, Yutaka OIWA wrote:
>> I plan to construct a neat syntax sugar over this library 
>> and build a next-generation version of Regexp/OCaml library.
>> Any comments are welcome.

skaller> Can you explain why/how Pcre is being used?

The reason is simply convenience for the current implementation.
PCRE is stable, has enough features (e.g. an unlimited number of captures,
non-capturing groups, and many helper functions and runtime features),
and performs well. I do not intend to implement an automata engine
myself, at least in the near future.

However, as you can see in the README of Regexp-OCaml (main version), my
future plans include supporting backends other than PCRE/OCaml.
Having its own regexp parser and limiting the regexp syntax to strictly
regular languages are provisions for that possible future.
When OCaml 3.07 was released, I seriously considered supporting
the standard Str module, but unfortunately the current Str lacks some of
the features required by the current Regexp/OCaml implementation.
Anyway, the backend is the backend, and the frontend is the frontend. Period.
They can be highly independent once designed that way, and my interests
are mainly in the frontend part. I highly appreciate support from
people working on the backend part.

Multilingualization is one of the current high-priority to-do items.
At least one user has asked me to support EUC-JP patterns,
and you might be the second person :-)
I am considering how to support M17N features: it may depend on the
underlying backend (e.g. Camomile?), or it may be supported solely in the
frontend layer, by encoding multibyte handling into the regexps.
This trick is used in the Japanese port of the Perl interpreter for MS-DOS,
and in (at least) one of the Japanese-handling modules for Perl5.
# As you can imagine, just using the M17N features of the underlying library is
# not sufficient: the internal regexp parser must also be modified to accept
# multibyte-encoded regular expressions. This is one of the reasons that
# the current Regexp/OCaml does not support the UTF8 option of PCRE/OCaml.
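
To illustrate the frontend-only approach, here is a minimal sketch in OCaml.
It is not code from Regexp/OCaml. It assumes a backend that matches byte by
byte and accepts PCRE-style `\xHH` escapes and `(?:...)` groups; the names
`euc_jp_char` and `expand_dot` are hypothetical. Each unescaped `.` is
rewritten into a byte-level alternation that matches exactly one EUC-JP
character, so the backend itself needs no multibyte support.

```ocaml
(* Byte-level pattern for exactly one EUC-JP character:
   - one ASCII byte (0x00-0x7f),
   - a two-byte JIS X 0208 character (both bytes in 0xa1-0xfe),
   - 0x8e followed by one half-width katakana byte (0xa1-0xdf),
   - 0x8f followed by a two-byte JIS X 0212 character.
   Assumption: the backend understands \xHH escapes and (?:...) groups. *)
let euc_jp_char =
  "(?:[\\x00-\\x7f]|[\\xa1-\\xfe][\\xa1-\\xfe]"
  ^ "|\\x8e[\\xa1-\\xdf]|\\x8f[\\xa1-\\xfe][\\xa1-\\xfe])"

(* Hypothetical helper: replace every unescaped '.' in [pat] with
   [euc_jp_char]. Backslash escapes are copied through unchanged.
   Simplification: a '.' inside a character class is not treated specially. *)
let expand_dot pat =
  let n = String.length pat in
  let buf = Buffer.create (2 * n) in
  let rec go i =
    if i < n then
      match pat.[i] with
      | '\\' when i + 1 < n ->
        Buffer.add_char buf '\\';
        Buffer.add_char buf pat.[i + 1];
        go (i + 2)
      | '.' ->
        Buffer.add_string buf euc_jp_char;
        go (i + 1)
      | c ->
        Buffer.add_char buf c;
        go (i + 1)
  in
  go 0;
  Buffer.contents buf
```

A complete translator would also have to rewrite character classes that
contain multibyte characters, which is why the internal regexp parser has to
understand the encoding.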

For supporting list-binding of Kleene stars, I am very interested in
richer backends which support such features.  Alain Frisch's recent
posting interested me.  There is also a talk with a related title at
ICFP04, although I have not yet read the paper.
However, I also feel that the backend is not currently a show-stopper:
it would certainly be better to have such backends, but the feature can be
emulated without them, as I have shown in the combinators.  I can wait a
while for theoretical/practical progress. The current problem is mainly the
frontend: there are many language-design problems once we introduce nested
bindings. I have already had discussions with some people at ICFP04, and I
hope for more.
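
As an illustration of how list-binding under a Kleene star can be emulated
without backend support, a minimal sketch in OCaml follows. It is not the
combinator interface referred to above nor any Regexp/OCaml API: the type
`matcher` and the functions `many1`, `chr`, `<*` and `star` are all
hypothetical, and `star` is greedy with no backtracking. The point is only
that each iteration of the star contributes one element to a list.

```ocaml
(* A matcher tries to match at [pos]; on success it returns the bound
   value and the position just after the match. *)
type 'a matcher = string -> int -> ('a * int) option

(* One or more characters satisfying [pred]; binds the matched text. *)
let many1 (pred : char -> bool) : string matcher = fun s pos ->
  let n = String.length s in
  let rec stop i = if i < n && pred s.[i] then stop (i + 1) else i in
  let j = stop pos in
  if j > pos then Some (String.sub s pos (j - pos), j) else None

(* A single literal character; binds nothing. *)
let chr (c : char) : unit matcher = fun s pos ->
  if pos < String.length s && s.[pos] = c then Some ((), pos + 1) else None

(* Sequence two matchers and keep only the binding of the first. *)
let ( <* ) (p : 'a matcher) (q : 'b matcher) : 'a matcher = fun s pos ->
  match p s pos with
  | None -> None
  | Some (x, pos1) ->
    (match q s pos1 with
     | None -> None
     | Some (_, pos2) -> Some (x, pos2))

(* Kleene star: repeat [p] as long as it succeeds and consumes input,
   collecting the binding of every iteration in order. Always succeeds. *)
let star (p : 'a matcher) : 'a list matcher = fun s pos ->
  let rec loop acc pos =
    match p s pos with
    | Some (x, pos1) when pos1 > pos -> loop (x :: acc) pos1
    | _ -> Some (List.rev acc, pos)
  in
  loop [] pos

let is_digit c = '0' <= c && c <= '9'

(* Zero or more repetitions of "digits followed by a comma",
   binding the digits of each repetition as a list element. *)
let numbers : string list matcher = star (many1 is_digit <* chr ',')
```

Even in this small form, the result type changes from `string` to
`string list` as soon as the binding sits under a star, which hints at the
language-design questions that nested bindings raise in the frontend.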

-- 
Yutaka Oiwa              Yonezawa Lab., Dept. of Computer Science,
      Graduate School of Information Sci. & Tech., Univ. of Tokyo.
                    <oiwa@yl.is.s.u-tokyo.ac.jp>, <yutaka@oiwa.jp>
PGP fingerprint = C9 8D 5C B8 86 ED D8 07  EA 59 34 D8 F4 65 53 61


