caml-list - the Caml user's mailing list
From: Michal Moskal <malekith@pld-linux.org>
To: Pierre Weis <pierre.weis@inria.fr>
Cc: caml-list@pauillac.inria.fr
Subject: scanf (Re: [Caml-list] what is the functional way to solve this problem?)
Date: Wed, 8 Oct 2003 23:22:23 +0200
Message-ID: <20031008212223.GA23554@roke.freak>
In-Reply-To: <200310082034.WAA29610@pauillac.inria.fr>

On Wed, Oct 08, 2003 at 10:34:40PM +0200, Pierre Weis wrote:
> [...]
> > Hm, with your version it's only 3 times slower:
> [...]
> > However, your code is not correct:
> > >         Scanf.bscanf Scanf.Scanning.stdib " %d %d %s"
> > ----------------------------------------------------^^
> > 
> > It should be [^\n], as filenames can contain spaces. In this case:
> 
> I did not know the complete specification :)

Neither did I ;-) I just generated filesystem data from my /usr/share
and there were a few files with spaces in their names. It is simply more
reasonable to assume filenames can contain spaces than to assume they
can't. OTOH filenames can contain \n as well... But we are talking about
scanf efficiency, so the original problem becomes somewhat irrelevant :-)

> However, your proposed patch is rather inefficient:
> 
> > It should be [^\n], as filenames can contain spaces. In this case:
> ---------------^^^^^
> 
> You should prefer %s@\n, if efficiency is a concern (which seems to be the
> case :)

I admit I didn't go through the scanf manual that thoroughly. And there
seems to be no indication of the relative efficiency of the two
solutions. But as I understand it, the [] thing would involve creating a
flag array for all 256 characters on each and every scanf call, which
has to be slow. Maybe some caching (for cases where [] really needs to
be used)?
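
For reference, the two formats under discussion can be put side by side.
This is only a minimal sketch, assuming a current OCaml standard library
rather than the 2003 release being discussed; the sample line and the
names with_charset and with_indication are made up for illustration. It
shows that both formats accept a name containing spaces, and says nothing
about their relative speed.

```ocaml
(* Hypothetical sample line: two integers, then a name with a space. *)
let line = "4096 1 my file.txt\n"

(* Character-set conversion: read everything that is not a newline. *)
let with_charset s =
  Scanf.sscanf s " %d %d %[^\n]" (fun a b name -> (a, b, name))

(* String conversion with a scanning indication: %s stops at the
   newline delimiter instead of at the first blank. *)
let with_indication s =
  Scanf.sscanf s " %d %d %s@\n" (fun a b name -> (a, b, name))

let () =
  let (a, b, name) = with_charset line in
  Printf.printf "%d %d %S\n" a b name;
  (* Both formats should agree on this input. *)
  assert (with_charset line = with_indication line)
```

To compare timings, the same two formats can be passed to Scanf.bscanf on
a real input file.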

> > Well, it's 15 times slower.
> 
> Could you please let me have access to the data to be able to test the
> code without bothering you ?

Sure, but along with stats.ml I posted mkd.ml (from make-data). You can
run it on your /usr or /usr/share redirecting input to file. If you need
*my* test case, please drop me an email.

> Thank you very much indeed for your testbed case, since you exhibit
> situations where there is room for improvement in the implementation
> of scanf (or at least there is the necessity to explain what are the
> efficient pattern constructs for Scanf).

I generally do not use Scanf. This is because I use OCaml mostly for
compilers, preprocessors etc., which need more sophisticated lexing. It
is however nice when using OCaml for simple and/or scripting tasks.
I have used Scanf mostly for programming contests, where the input data
is formatted as a sequence of numbers or simple strings, and found it
to be fast enough (you know, this is a programming contest, they time
you, so you have to be fast). But I used only %d and maybe %s, mainly
through:

  let get_int () = Scanf.scanf " %d" (fun x -> x)

Which was more natural to me (I still haven't converted myself to
all-functional-style-I'm-using-only-Haskell :-)
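
As a rough illustration of that contest-style usage, here is a minimal
sketch that reads a count followed by that many integers. It assumes a
current OCaml standard library; read_ints and the sample input are
hypothetical, and get_int takes an explicit scanning buffer (unlike the
one-liner above, which reads stdin) so the example runs without any
input file.

```ocaml
(* Read one integer, skipping any leading blanks or newlines. *)
let get_int ib = Scanf.bscanf ib " %d" (fun x -> x)

(* Read a count n, then n integers, preserving input order. *)
let read_ints ib =
  let n = get_int ib in
  let rec loop i acc =
    if i = n then List.rev acc
    else
      let x = get_int ib in
      loop (i + 1) (x :: acc)
  in
  loop 0 []

let () =
  (* Made-up input: a count of 3, then three numbers over two lines. *)
  let ib = Scanf.Scanning.from_string "3\n10 20\n30\n" in
  let xs = read_ints ib in
  Printf.printf "sum = %d\n" (List.fold_left (+) 0 xs)
```

Passing Scanf.Scanning.stdin instead of the string buffer gives the
stdin behaviour of the original one-liner.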

-- 
: Michal Moskal :: http://www.kernel.pl/~malekith : GCS {C,UL}++++$ a? !tv
: When in doubt, use brute force. -- Ken Thompson : {E-,w}-- {b++,e}>+++ h



