caml-list - the Caml user's mailing list
From: Gerd Stolpmann <gerd@gerd-stolpmann.de>
To: Markus Mottl <mottl@miss.wu-wien.ac.at>
Cc: Miles Egan <miles@caddr.com>, OCAML <caml-list@inria.fr>
Subject: Re: features of PCRE-OCaml
Date: Sat, 9 Dec 2000 14:12:13 +0100
Message-ID: <0012091437390B.00625@ice>
In-Reply-To: <20001209040315.D26367@miss.wu-wien.ac.at>

On Sat, 09 Dec 2000, Markus Mottl wrote:
>Gerd Stolpmann wrote on Friday, 8 December 2000:
>> There are two functions making it easy: enter_blocking_section and
>> leave_blocking_section. For example, the stub for the read syscall of the Unix
>> library:
>
>Ok, I have found an article by Xavier on these functions:
>
>  http://caml.inria.fr/archives/199905/msg00035.html
>
>So if I am not mistaken, a function that calls the GC or allocates memory
>on the OCaml-heap cannot be considered reentrant even if its semantics
>is otherwise referentially transparent. This means that just "tagging"
>a function as "pure" is no guarantee that it won't mess up the runtime
>when e.g. calling the GC concurrently - right?

For example, it must never happen that one thread is in the middle of
initializing memory when it is interrupted by another thread that allocates
memory and calls the GC. The GC relies on the precondition that memory is
always initialized.

"Reentrancy" is an abstract property of the function interface; it does not
hold at lower implementation levels, because (heap) memory is nothing but a
large global variable implicitly shared by every piece of code.

>In other terms I can put those functions around the largest section of
>C-code that doesn't interfere with the OCaml-runtime system - then I
>should be safe.

I think so.
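
To make the discipline concrete, here is a minimal sketch in plain C. It does
not use the real runtime: master_lock_held, model_enter_blocking_section and
model_leave_blocking_section are hypothetical stand-ins for the master lock and
for enter_blocking_section / leave_blocking_section, and count_byte_unlocked is
an invented example function. The point is only the ordering: copy what you
need out of the Caml heap while you still hold the lock, release it around the
pure C work, and re-acquire it before touching the heap again.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the master lock: 1 while "this thread" holds it.
   (Assumption for illustration; the real lock lives in the runtime.) */
static int master_lock_held = 1;

/* Hypothetical model of enter_blocking_section: give up the lock. */
static void model_enter_blocking_section(void) {
    assert(master_lock_held);
    master_lock_held = 0;
}

/* Hypothetical model of leave_blocking_section: take the lock back. */
static void model_leave_blocking_section(void) {
    assert(!master_lock_held);
    master_lock_held = 1;
}

/* Counts occurrences of needle in a buffer imagined to live on the Caml
   heap. Returns (size_t)-1 if the temporary copy cannot be allocated. */
static size_t count_byte_unlocked(const char *heap_text, size_t len,
                                  char needle) {
    /* Phase 1, lock held: copy the data out of the heap, because other
       threads may run the GC once the lock is released. */
    assert(master_lock_held);
    char *copy = malloc(len ? len : 1);
    if (copy == NULL)
        return (size_t)-1;
    memcpy(copy, heap_text, len);

    /* Phase 2, lock released: only C data is touched in here. */
    model_enter_blocking_section();
    size_t hits = 0;
    for (size_t i = 0; i < len; i++)
        if (copy[i] == needle)
            hits++;
    model_leave_blocking_section();

    /* Phase 3, lock held again: safe to talk to the runtime. */
    free(copy);
    return hits;
}
```

Note that the copy in phase 1 is itself a cost that grows with the input, so
it belongs in any measurement of whether releasing the lock pays off.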

>The only question now is: would it really pay for pattern matching in the
>PCRE? I have taken a look at the implementation of these functions and on
>their use, but have only found cases where some function really blocks for
>either an indefinite (e.g. read) or at least potentially very long amount
>of time (e.g. gethostbyaddr, which might need to contact a nameserver).
>
>Without threads we won't benefit, anyway, and if we use threads, there
>is a small overhead associated with calling these functions. Pattern
>matching may not take enough time in the average case to justify
>this. Does anyone have experience or suggestions on when using these
>functions is advisable?

I would say it depends on the problem size. For example, when searching in a
long text it is definitely worthwhile to release the master lock.

The more interesting case is the average text processing program with many
invocations of the PCRE engine on average-sized problems. The question is
whether the sum of all invocations is large enough for the effect to be
measurable. Ideally, I can imagine a two-processor system in which one CPU
executes only Caml code and the other only regexps. From the Caml CPU's point
of view, the PCRE calls are (ideally) cost-free, because they are delegated to
the other CPU. However, there is a synchronization overhead, and nothing is
won if the Caml CPU must spend more time on synchronization than it would
spend executing the regexp itself.
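
The break-even argument can be written down as a small cost model. This is
only a sketch of the idealized two-CPU picture, not a measurement: the
call_profile struct and the function names are invented for illustration, both
costs are in the same arbitrary time unit, and it assumes the Caml CPU always
has other useful work to do while the regexp runs elsewhere.

```c
#include <stdbool.h>

/* Per-call costs in one arbitrary time unit (hypothetical names). */
typedef struct {
    double match_cost; /* time one regexp call takes to execute */
    double sync_cost;  /* time to release and re-acquire the master lock */
} call_profile;

/* Offloading pays only if synchronizing is cheaper than matching inline. */
bool offloading_pays(call_profile p) {
    return p.sync_cost < p.match_cost;
}

/* Time saved on the Caml CPU over a number of calls; negative means
   the synchronization overhead made things worse. */
double total_saving(call_profile p, unsigned long calls) {
    return (double)calls * (p.match_cost - p.sync_cost);
}
```

With match_cost = 2 and sync_cost = 5, every call loses 3 units; with
match_cost = 500 and sync_cost = 5, every call gains 495. Where real PCRE
calls fall between these extremes is exactly what an experiment would have to
show.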

I think it is worth an experiment.

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 100             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------