caml-list - the Caml user's mailing list
* [Caml-list] Performance-cost of ^
@ 2003-03-26  9:52 Oliver Bandel
  2003-03-26 10:03 ` Basile STARYNKEVITCH
  2003-03-26 12:28 ` Damien Doligez
  0 siblings, 2 replies; 5+ messages in thread
From: Oliver Bandel @ 2003-03-26  9:52 UTC
  To: caml-list

Hello,

I'm reading in a file line by line.
For some operations I need it as
one long string.

How can I achieve this efficiently?
Is it OK to use ^ on a list of lines
(with List.fold_right or List.fold_left)?

Or is it better to read in the file
with the help of the Buffer module
and split it into lines later,
if operations on lines are necessary?

TIA,
   Oliver


-- 
Word Processors: Stupid and Inefficient

           http://www.ecn.wfu.edu/~cottrell/wp.html

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* [Caml-list] Performance-cost of ^
  2003-03-26  9:52 [Caml-list] Performance-cost of ^ Oliver Bandel
@ 2003-03-26 10:03 ` Basile STARYNKEVITCH
  2003-03-26 12:28 ` Damien Doligez
  1 sibling, 0 replies; 5+ messages in thread
From: Basile STARYNKEVITCH @ 2003-03-26 10:03 UTC
  To: Oliver Bandel; +Cc: caml-list

>>>>> "Oliver" == Oliver Bandel <oliver@first.in-berlin.de> writes:

    Oliver> Hello, I'm reading in a file linewise.  For some
    Oliver> operations I need it as one long string. [...]

Assuming a Unix (or Linux) platform, I suggest using
Bigarray.char and Bigarray.Array1.map_file, which maps the file as an
array of chars.


-- 

Basile STARYNKEVITCH         http://starynkevitch.net/Basile/ 
email: basile<at>starynkevitch<dot>net 
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France


* Re: [Caml-list] Performance-cost of ^
  2003-03-26  9:52 [Caml-list] Performance-cost of ^ Oliver Bandel
  2003-03-26 10:03 ` Basile STARYNKEVITCH
@ 2003-03-26 12:28 ` Damien Doligez
  2003-03-28 16:21   ` Christian Lindig
  1 sibling, 1 reply; 5+ messages in thread
From: Damien Doligez @ 2003-03-26 12:28 UTC
  To: caml-list

On Wednesday, March 26, 2003, at 10:52 AM, Oliver Bandel wrote:

> I'm reading in a file line by line.
> For some operations I need it as
> one long string.
>
> How can I achieve this efficiently?
> Is it OK to use ^ on a list of lines
> (with List.fold_right or List.fold_left)?

You should use String.concat.
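
A minimal sketch of that suggestion, assuming the lines are first collected
into a list with input_line. The names read_lines and join_lines are
hypothetical helpers, not part of the standard library, and the exception
branch in match needs OCaml 4.02 or later:

```ocaml
(* Read all lines from a channel into a list, in file order.
   input_line strips the trailing newline from each line. *)
let read_lines ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  loop []

(* Join the lines into one string, putting a newline between
   consecutive lines. String.concat computes the total length first,
   so each character is copied only once. *)
let join_lines lines = String.concat "\n" lines
```

Note that join_lines does not add a newline after the last line; append
one if the result has to match the file byte for byte.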

-- Damien


* Re: [Caml-list] Performance-cost of ^
  2003-03-26 12:28 ` Damien Doligez
@ 2003-03-28 16:21   ` Christian Lindig
  2003-03-30 10:20     ` Xavier Leroy
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Lindig @ 2003-03-28 16:21 UTC
  To: oliver; +Cc: Caml Mailing List

On Wed, Mar 26, 2003 at 01:28:20PM +0100, Damien Doligez wrote:
> On Wednesday, March 26, 2003, at 10:52 AM, Oliver Bandel wrote:
> >I'm reading in a file line by line.  For some operations I need it as
> >one long string.
> >
> >How can I achieve this efficiently?  Is it OK to use ^ on a list of
> >lines (with List.fold_right or List.fold_left)?
> 
> You should use String.concat.

What about using the Buffer module? It sounds like it was especially
designed to build up long strings.

-- Christian

-- 
Christian Lindig         http://www.eecs.harvard.edu/~lindig/


* Re: [Caml-list] Performance-cost of ^
  2003-03-28 16:21   ` Christian Lindig
@ 2003-03-30 10:20     ` Xavier Leroy
  0 siblings, 0 replies; 5+ messages in thread
From: Xavier Leroy @ 2003-03-30 10:20 UTC
  To: Christian Lindig, oliver, Caml Mailing List

> > Oliver Bandel wrote:
> > >I'm reading in a file line by line.  For some operations I need it as
> > >one long string.
> > >
> > >How can I achieve this efficiently?  Is it OK to use ^ on a list of
> > >lines (with List.fold_right or List.fold_left)?

It's inefficient for large files (quadratic time).
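
To see where the quadratic cost comes from, here is a rough cost model
(only a sketch: it counts copied characters and ignores allocation and GC
work). Each ^ allocates a fresh string and copies both operands, so a fold
re-copies the accumulated prefix at every step. The function names are
hypothetical:

```ocaml
(* Characters copied by List.fold_left (fun acc l -> acc ^ l) "" lines:
   every step builds a new string of length (acc + line) and copies
   all of it. *)
let copied_by_fold lines =
  let _, copied =
    List.fold_left
      (fun (acc_len, copied) line ->
        let new_len = acc_len + String.length line in
        (new_len, copied + new_len))
      (0, 0) lines
  in
  copied

(* Characters copied by String.concat "" lines: the result is sized
   once and each character is copied once. *)
let copied_by_concat lines =
  List.fold_left (fun total line -> total + String.length line) 0 lines
```

For 1000 lines of 10 characters each, this model gives 5,005,000 copied
characters for the fold against 10,000 for String.concat. A fold_right
with ^ has the same total cost, only mirrored.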

Christian Lindig wrote:
> Damien Doligez wrote:
> > You should use String.concat.
> 
> What about using the Buffer module? It sounds like it was especially
> designed to build up long strings.

Yes, Buffer will work fine here, with about the same efficiency as
String.concat.
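
A sketch of the Buffer variant, assuming the input is still read line by
line with input_line (read_all_lines is a hypothetical name, and the
exception branch in match needs OCaml 4.02 or later):

```ocaml
(* Accumulate every line of the channel in a Buffer and return the
   whole contents as one string. *)
let read_all_lines ic =
  let b = Buffer.create 4096 in
  let rec loop () =
    match input_line ic with
    | line ->
        Buffer.add_string b line;
        (* input_line drops the newline, so put one back *)
        Buffer.add_char b '\n';
        loop ()
    | exception End_of_file -> Buffer.contents b
  in
  loop ()
```

The Buffer grows geometrically, so appending stays linear overall. One
caveat of this sketch: it always ends the result with a newline, even if
the last line of the file had none.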

If the file is a regular file and isn't expected to change during
reading, the simplest and most efficient solution is:

   let ic = open_in_bin filename in
   let len = in_channel_length ic in
   let s = String.create len in
   really_input ic s 0 len;
   close_in ic;
   s
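
In later OCaml versions strings became immutable, so really_input no
longer fills a string in place. A sketch of the same idea for those
versions, assuming OCaml 4.08 or later (for Fun.protect) and using
really_input_string from the standard library; read_file is a
hypothetical name:

```ocaml
(* Read a regular file into a string in one go. The channel is closed
   even if reading raises. *)
let read_file filename =
  let ic = open_in_bin filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      let len = in_channel_length ic in
      really_input_string ic len)
```

As with the original, this assumes a regular file whose size does not
change while it is being read.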

In other circumstances (e.g. reading from a pipe or socket, or wanting to
read the file as a text file rather than a binary file), consider the
following solution:

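  (* ic is assumed to be an already-open input channel *)
  let b = Buffer.create 1024 in
  let s = String.create 1024 in   (* scratch string filled by input *)
  let rec read_channel ic =
    let n = input ic s 0 1024 in
    (* input returns 0 at end of file *)
    if n > 0 then begin Buffer.add_substring b s 0 n; read_channel ic end
  in
    read_channel ic; Buffer.contents b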
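
For later OCaml versions, where input fills a bytes value rather than a
string, the same chunked loop might look like the sketch below. It is
wrapped in a function taking the channel as an argument, which is my
assumption about how it would be used, and it needs OCaml 4.02 or later
for the Bytes module:

```ocaml
(* Read everything available on a channel, 1024 bytes at a time.
   Works for pipes and sockets, where the length is not known upfront. *)
let read_channel ic =
  let chunk_size = 1024 in
  let b = Buffer.create chunk_size in
  let chunk = Bytes.create chunk_size in
  let rec loop () =
    (* input returns the number of bytes read, 0 at end of file *)
    let n = input ic chunk 0 chunk_size in
    if n > 0 then begin
      Buffer.add_subbytes b chunk 0 n;
      loop ()
    end
  in
  loop ();
  Buffer.contents b
```

Open the channel with open_in for text mode or open_in_bin for binary
mode; the loop itself is the same in both cases.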

Many ways to skin a cat, I'm afraid.

- Xavier Leroy

