caml-list - the Caml user's mailing list
From: skaller <skaller@users.sourceforge.net>
To: Oliver Bandel <oliver@first.in-berlin.de>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] mboxlib reloaded ;-)
Date: Sat, 28 Apr 2007 23:49:51 +1000
Message-ID: <1177768191.11923.24.camel@rosella.wigram>
In-Reply-To: <20070428114450.GI363@first.in-berlin.de>

On Sat, 2007-04-28 at 13:44 +0200, Oliver Bandel wrote:
> On Sat, Apr 28, 2007 at 12:54:53PM +0200, Gabriel Kerneis wrote:
> > On Sat, 28 Apr 2007 12:47:47 +0200, Oliver Bandel
> > <oliver@first.in-berlin.de> wrote:
> > > > You should check the size (number of states) of the generated
> > > > lexer.
> > > 
> > > How?
> > 
> > It's printed out by ocamllex when you run it on your .mll file.
> > Regards,
> 
> Ah, ok. :)
> 
> 
> 18 states, 261 transitions, table size 1152 bytes.
> 
> Does not look very huge ;-)

Lol, no, it is tiny. You are probably right: too many calls
and too much copying of data. AFAIK OCaml channels also
add an extra buffer layer (is that right?), so there's even
more copying.

Still, although OCaml may generate more code than C,
if your code is reasonably tight it should be cached
and be fast: function calls are actually quite cheap.

Here's an idea: you said:

"For the about 100MB mbox there are 2.5 * 10^6 calls to
Buffer.add_string for the header and 1.6 * 10^6 calls
to Buffer.add_string for the body, 2.6 * 10^6 calls to the
function Lexing.engine, ..."

How about NOT storing the body text? Instead, just store
the integer file offset of the first byte and the length.
Not sure what your application is doing;
perhaps that would work for you?
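
A minimal sketch of that idea, not taken from mboxlib: it assumes
LF line endings, messages that start with a "From " line, and a
body that begins after the first blank line. The names span,
is_from_line, index_bodies and read_body are hypothetical. One pass
over the file records only (offset, length) per body; the text is
read back on demand.

```ocaml
(* Location of one message body inside the mbox file. *)
type span = { offset : int; length : int }

(* Hypothetical helper: does this line start a new message? *)
let is_from_line line =
  String.length line >= 5 && String.sub line 0 5 = "From "

(* Scan the file once and return the body span of every message
   that has a body. No body text is copied or stored. *)
let index_bodies path =
  let ic = open_in_bin path in
  let spans = ref [] in
  let body_start = ref None in   (* Some off once the headers end *)
  let in_message = ref false in
  (* Close the body that is currently open, if any, at end_off. *)
  let close_body end_off =
    match !body_start with
    | Some off ->
      spans := { offset = off; length = end_off - off } :: !spans;
      body_start := None
    | None -> ()
  in
  (try
     while true do
       let line_start = pos_in ic in
       let line = input_line ic in
       if is_from_line line then begin
         close_body line_start;
         in_message := true
       end else if !in_message && !body_start = None && line = "" then
         (* First blank line after the headers: the body starts here. *)
         body_start := Some (pos_in ic)
     done
   with End_of_file -> close_body (pos_in ic));
  close_in ic;
  List.rev !spans

(* Fetch the text of one body only when it is actually needed. *)
let read_body path { offset; length } =
  let ic = open_in_bin path in
  seek_in ic offset;
  let s = really_input_string ic length in
  close_in ic;
  s
```

Whether this helps depends on how often the bodies are actually
read afterwards; if every body is needed anyway, the copying just
moves to read_body.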

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

