caml-list - the Caml user's mailing list
* [Caml-list] RE: a regular expression library
@ 2001-09-25 17:12 Jerome Vouillon
  2001-09-25 18:40 ` Miles Egan
  2001-09-25 19:03 ` Markus Mottl
  0 siblings, 2 replies; 6+ messages in thread
From: Jerome Vouillon @ 2001-09-25 17:12 UTC (permalink / raw)
  To: caml-list


Hello,

I've started to write a regular expression library.  It supports
several styles of regular expressions:
- Perl-style regular expressions;
- Posix extended regular expressions;
- Emacs-style regular expressions;
- Shell-style file globbing.
It is also possible to build regular expressions by combining simpler
regular expressions.
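
A minimal sketch of what "combining simpler regular expressions" can look like in O'Caml.  The type and function names below (re, opt, plus, str, matches) are assumptions made for illustration, not the library's actual API, and the matcher uses Brzozowski derivatives only because they fit in a few lines, not because the library works this way.

```ocaml
(* Hypothetical combinator sketch; these names are NOT the library's API. *)
type re =
  | Empty              (* matches no string at all *)
  | Eps                (* matches only the empty string *)
  | Char of char       (* matches exactly one given character *)
  | Seq of re * re     (* first expression, then the second *)
  | Alt of re * re     (* either expression *)
  | Star of re         (* zero or more repetitions *)

(* Derived combinators, built purely from the primitives above. *)
let opt r = Alt (Eps, r)
let plus r = Seq (r, Star r)

(* Sequence the characters of [s] in order. *)
let str s =
  let rec go i =
    if i = String.length s then Eps else Seq (Char s.[i], go (i + 1))
  in
  go 0

(* [nullable r] is true when [r] accepts the empty string. *)
let rec nullable = function
  | Empty | Char _ -> false
  | Eps | Star _ -> true
  | Seq (a, b) -> nullable a && nullable b
  | Alt (a, b) -> nullable a || nullable b

(* Brzozowski derivative: the expression matching whatever may follow [c]. *)
let rec deriv c = function
  | Empty | Eps -> Empty
  | Char d -> if c = d then Eps else Empty
  | Seq (a, b) ->
    let left = Seq (deriv c a, b) in
    if nullable a then Alt (left, deriv c b) else left
  | Alt (a, b) -> Alt (deriv c a, deriv c b)
  | Star a as r -> Seq (deriv c a, r)

(* Whole-string match: take one derivative per input character. *)
let matches r s =
  let cur = ref r in
  String.iter (fun c -> cur := deriv c !cur) s;
  nullable !cur

(* The pattern "aa?b", assembled from smaller pieces. *)
let aa_opt_b = Seq (Char 'a', Seq (opt (Char 'a'), Char 'b'))
```

With these definitions, matches aa_opt_b "aab" holds and matches aa_opt_b "aaab" does not.  The sketch never simplifies intermediate expressions, so it is only suitable for short inputs.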

The library is still under development, but it is already quite
usable.  The most notable missing features are back-references
and look-ahead/look-behind assertions.

I would greatly appreciate your comments about the library (and, in
particular, about its API).  Contributions and bug reports are also
welcome.

The library can be downloaded from http://sourceforge.net/projects/libre/

The library seems to be pretty fast when compiled to native code.
Here are some timing results (Pentium III, 500 MHz):
* Scanning a 1 MB string containing only 'a's, except for the last
  character which is a 'b', searching for the pattern "aa?b"
  (repeated 100 times).
    - RE: 2.6s
    - PCRE: 68s
* Regular expression example from http://www.bagley.org/~doug/shootout/
    - RE: 0.43s
    - PCRE: 3.68s
(The library is much slower when compiled to bytecode though, as it
 is entirely written in O'Caml.  I plan to rewrite the critical
 sections of the code in C.)
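
For anyone wanting to reproduce the shape of the first measurement, here is a rough harness sketch.  It assumes "1 MB" means 2^20 bytes, and search_aa_opt_b is a hypothetical hand-written stand-in for the matcher; a real comparison would call the library under test in its place, so the timings it prints say nothing about RE or PCRE.

```ocaml
(* Build the benchmark input: n - 1 'a' characters followed by one 'b'. *)
let make_input n = String.make (n - 1) 'a' ^ "b"

(* Hypothetical stand-in matcher: start of the leftmost match of "aa?b",
   found by a direct scan.  Replace with a call to the library under test. *)
let search_aa_opt_b s =
  let n = String.length s in
  let rec go i =
    if i >= n then None
    else if s.[i] = 'a' && i + 1 < n && s.[i + 1] = 'b' then Some i
    else if s.[i] = 'a' && i + 2 < n && s.[i + 1] = 'a' && s.[i + 2] = 'b'
    then Some i
    else go (i + 1)
  in
  go 0

(* CPU time, in seconds, taken by [runs] calls of [f]. *)
let time_runs ~runs f =
  let start = Sys.time () in
  for _i = 1 to runs do
    ignore (f ())
  done;
  Sys.time () -. start

(* Defaults mirror the description above: a 1 MB input searched 100 times. *)
let run_benchmark ?(size = 1024 * 1024) ?(runs = 100) () =
  let input = make_input size in
  time_runs ~runs (fun () -> search_aa_opt_b input)
```

Calling run_benchmark () returns the elapsed CPU time for the full-size run; the optional ~size and ~runs arguments allow a quicker smoke test.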

-- Jerome
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr



* Re: [Caml-list] RE: a regular expression library
  2001-09-25 17:12 [Caml-list] RE: a regular expression library Jerome Vouillon
@ 2001-09-25 18:40 ` Miles Egan
  2001-09-25 19:03 ` Markus Mottl
  1 sibling, 0 replies; 6+ messages in thread
From: Miles Egan @ 2001-09-25 18:40 UTC (permalink / raw)
  To: Jerome Vouillon; +Cc: caml-list

On Tue, Sep 25, 2001 at 01:12:29PM -0400, Jerome Vouillon wrote:
> 
> Hello,
> 
> I've started to write a regular expression library.  It supports
> several styles of regular expressions:
> - Perl-style regular expressions;
> - Posix extended regular expressions;
> - Emacs-style regular expressions;
> - Shell-style file globbing
> It is also possible to build regular expressions by combining simpler
> regular expressions.

This is great news!  A native ocaml library would be the best solution and
also another example of the power of the language.

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com

* Re: [Caml-list] RE: a regular expression library
  2001-09-25 17:12 [Caml-list] RE: a regular expression library Jerome Vouillon
  2001-09-25 18:40 ` Miles Egan
@ 2001-09-25 19:03 ` Markus Mottl
  2001-09-25 21:22   ` [Caml-list] calling native from bytecode (was RE: a regular expression library) Chris Hecker
  1 sibling, 1 reply; 6+ messages in thread
From: Markus Mottl @ 2001-09-25 19:03 UTC (permalink / raw)
  To: Jerome Vouillon; +Cc: caml-list

On Tue, 25 Sep 2001, Jerome Vouillon wrote:
> I've started to write a regular expression library.

Cool! :-)

Finally, OCaml could get a reasonable regexp-library!

> (The library is much slower when compiled to bytecode though, as it
>  is entirely written in O'Caml.  I plan to rewrite the critical
>  sections of the code in C.)

It would be really nice if there were ways to call OCaml-native code
from OCaml-byte code. This question has popped up in the past, but it's
not an easy thing to do due to issues with the runtime:

  http://caml.inria.fr/archives/200108/msg00026.html

Any news in this respect? A toplevel that could run a high-performance,
OCaml-native code string matching engine would give a terrific scripting
environment!

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

* Re: [Caml-list] calling native from bytecode (was RE: a regular expression library)
  2001-09-25 19:03 ` Markus Mottl
@ 2001-09-25 21:22   ` Chris Hecker
  2001-09-25 22:40     ` [Caml-list] calling native from bytecode Dave Mason
  2001-11-09 15:09     ` [Caml-list] avoiding native call from bytecode issue via dynamic linking Jeff Henrikson
  0 siblings, 2 replies; 6+ messages in thread
From: Chris Hecker @ 2001-09-25 21:22 UTC (permalink / raw)
  To: Markus Mottl, Jerome Vouillon; +Cc: caml-list


>> (The library is much slower when compiled to bytecode though, as it
>>  is entirely written in O'Caml.  I plan to rewrite the critical
>>  sections of the code in C.)
>It would be really nice if there were ways to call OCaml-native code
>from OCaml-byte code. This question has popped up in the past, but it's
>not an easy thing to do due to issues with the runtime:
>  http://caml.inria.fr/archives/200108/msg00026.html
>Any news in this respect? A toplevel that could run a high-performance,
>OCaml-native code string matching engine would give a terrific scripting
>environment!

I was going to reply to that same quote and reiterate my desire to link native and bytecode!  :) The message you've linked to is from the thread I started; here's the beginning: http://caml.inria.fr/archives/200108/msg00008.html

It's such a shame that we're pushing ocaml code into C because of this limitation.  This seems incredibly ironic and sad to me.

I started looking into it seriously, but I got discouraged after Xavier's discussion of the gc differences and after I looked at the runtime code (lots of #ifdefs and even separate files in some cases).

It seems like it would be a _ton_ of work if you tried to do it in a backwards-compatible way.  If this were a higher priority, it seems like you could do it if you punted on bytecode backwards compatibility and made some big changes.  I'd love to do this, but it looks like a major undertaking, not the quick hack I originally thought it might be.

Chris



* Re: [Caml-list] calling native from bytecode
  2001-09-25 21:22   ` [Caml-list] calling native from bytecode (was RE: a regular expression library) Chris Hecker
@ 2001-09-25 22:40     ` Dave Mason
  2001-11-09 15:09     ` [Caml-list] avoiding native call from bytecode issue via dynamic linking Jeff Henrikson
  1 sibling, 0 replies; 6+ messages in thread
From: Dave Mason @ 2001-09-25 22:40 UTC (permalink / raw)
  To: caml-list; +Cc: Chris Hecker, Markus Mottl, Jerome Vouillon

>>>>> On Tue, 25 Sep 2001 14:22:09 -0700, Chris Hecker <checker@d6.com> said:

> I was going to reply to that same quote and reiterate my desire to
> link native and bytecode!  :)

I hate me-too mailings on lists, but... :-)

I honestly think this is one of the most important issues with ocaml
at the moment.  The native code compiler is good enough that we
frequently see examples of it beating C code (sometimes partly because
of having to jump the barrier between ocaml and C, but still...).  It
is a huge shame that we can't use the resulting code in the
interpreter.

I wonder how much of the runtime library would be sped up (and made
more robust) if it were coded in ocaml instead of C.

Please, please, please!

Unfortunately, I don't have the time right now, or a grad student with
the aptitude, or I'd do it.  It would be *very* cool to have ocaml all
the way down to the metal!

../Dave

* [Caml-list] avoiding native call from bytecode issue via dynamic linking
  2001-09-25 21:22   ` [Caml-list] calling native from bytecode (was RE: a regular expression library) Chris Hecker
  2001-09-25 22:40     ` [Caml-list] calling native from bytecode Dave Mason
@ 2001-11-09 15:09     ` Jeff Henrikson
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Henrikson @ 2001-11-09 15:09 UTC (permalink / raw)
  To: caml-list; +Cc: Chris Hecker, markus, Jerome Vouillon

Has anybody considered sidestepping the native/bytecode compatibility issue in favor of an all-native toplevel?  That is, one which
takes an expression, compiles it to native code, and loads it back into code space?  For example, write it out to a shared lib,
either one for each expression or (more likely, for efficiency) one for some reasonably sized history of expressions.  Then dlopen
the lib and dlsym the symbols into place.
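
A rough sketch of the compile half of that idea, under several assumptions: the helper names (unit_source, compile_command, compile_phrase) and the phrase_N file naming are hypothetical, only int-valued expressions are handled, and it relies on an ocamlopt that accepts the -shared option to emit a .cmxs plugin.

```ocaml
(* Wrap a toplevel phrase in a compilation unit.  Printing stands in for
   handing the value back to the toplevel; only int-valued expressions are
   handled, since a real design would need a type-aware printer. *)
let unit_source expr =
  Printf.sprintf "let () = Printf.printf \"- : int = %%d\\n\" (%s)\n" expr

(* Shell command asking ocamlopt to build a shared plugin from [src]. *)
let compile_command ~src ~out =
  Printf.sprintf "ocamlopt -shared -o %s %s"
    (Filename.quote out) (Filename.quote src)

(* Write the unit to [dir]/phrase_<index>.ml and compile it.  Returns the
   path of the plugin on success, or None if the compiler reported an error. *)
let compile_phrase ~dir ~index expr =
  let base = Filename.concat dir (Printf.sprintf "phrase_%d" index) in
  let src = base ^ ".ml" and out = base ^ ".cmxs" in
  let oc = open_out src in
  output_string oc (unit_source expr);
  close_out oc;
  if Sys.command (compile_command ~src ~out) = 0 then Some out else None
```

The load half is left out because it needs the dynlink library linked into a native program: the path returned by compile_phrase would be passed to Dynlink.loadfile, which runs the plugin's initialisation code and so prints the result.  One plugin per phrase is the simplest of the two layouts mentioned above; batching a history of expressions into one lib would need symbol bookkeeping that this sketch does not attempt.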

Once upon a time when I briefly tried to work with SML/NJ, I saw little temp files being written to disk all the time when I was
evaluating expressions from an sml session in emacs.  I presumed their purpose was getting code to cross the data/code barrier.  I
don't think they were standard shared libs under Windows or Linux because they were often only a few hundred bytes, too small to
be a real shared lib.


Jeff Henrikson


Markus Mottl [markus@mail4.ai.univie.ac.at] on 9/25/01
>It would be really nice if there were ways to call OCaml-native code
>from OCaml-byte code. This question has popped up in the past, but it's
>not an easy thing to do due to issues with the runtime:
>  http://caml.inria.fr/archives/200108/msg00026.html
>Any news in this respect? A toplevel that could run a high-performance,
>OCaml-native code string matching engine would give a terrific scripting
>environment!

Chris Hecker [checker@d6.com] on 9/25/01:
> I was going to reply to that same quote and reiterate my desire to link native and bytecode!  :) That was the thread I
> started you've linked to, here's the beginning: http://caml.inria.fr/archives/200108/msg00008.html.
>
> It's such a shame that we're pushing ocaml code into C because of this limitation.  This seems incredibly ironic and sad to me.







