caml-list - the Caml user's mailing list
From: Jon Kleiser <jon.kleiser@ceres.no>
To: "caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] Create Array of floats from string, surprise
Date: Thu, 27 Apr 2017 14:00:21 +0000
Message-ID: <001F890B-024A-4A94-89BC-59A010038850@mail.uio.no>
In-Reply-To: <CAPFanBEwLh5L9pOmo=Ub3AKV+A9Vo0WNiGrmPRYGQmpeJb6S6g@mail.gmail.com>

Hi Gabriel,

I have now figured out how to use Scanf.bscanf as you suggested. The disappointing and rather surprising result, however, is that my program using ‘Scanf.bscanf’ is significantly slower than the earlier one based on ‘String.split_on_char’ and ‘List.iteri’: about 43.5 secs vs. 17.6 secs.
Thanks anyway. I feel I have learned quite a bit of OCaml by doing this.
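
For reference, the two approaches compared above could look roughly like the sketch below. This is not the actual program that was timed (it was not posted in this message); the function names vector_of_line_split and vector_of_line_scanf are made up for illustration, and dims is passed as an argument rather than read from a global reference. The Scanf variant skips the first word with " %s" instead of the "%s@ " format discussed further down in the thread.

```ocaml
(* Split-based variant: cut the line on spaces, drop the first word,
   and convert the remaining words with float_of_string. *)
let vector_of_line_split dims line =
  let words =
    List.filter (fun s -> s <> "") (String.split_on_char ' ' line) in
  match words with
  | [] -> invalid_arg "vector_of_line_split: empty line"
  | _label :: fields ->
    let vec = Array.make dims 0.0 in
    (* Ignore any extra fields beyond dims. *)
    List.iteri
      (fun i s -> if i < dims then vec.(i) <- float_of_string s)
      fields;
    vec

(* Scanf-based variant: scan the line as a string buffer, skip the
   first word, then read dims floats separated by whitespace. *)
let vector_of_line_scanf dims line =
  let scanbuf = Scanf.Scanning.from_string line in
  Scanf.bscanf scanbuf " %s" ignore;
  Array.init dims (fun _ -> Scanf.bscanf scanbuf " %f" (fun x -> x))
```

Both functions should return the same array for a well-formed line such as "word 0.5 -1.25 2.0" with dims set to 3, so either one can be timed against the same input file.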

/Jon


> On 26. Apr, 2017, at 17:26, Gabriel Scherer <gabriel.scherer@gmail.com> wrote:
> 
> > Can this Stream reading make use of the scanf to read floats (and other words)?
> 
> Not really (although you can make do with Scanf.Scanning.from_function : (unit -> char) -> Scanning.in_channel; https://caml.inria.fr/pub/docs/manual-ocaml/libref/Scanf.Scanning.html ).
> 
> If counting the line number is important to you, it makes sense to keep using input_line instead of scanning " %f" directly on the channel (as this may skip arbitrarily many newlines). You can still use Scanf to scan each line as a string:
> 
>   let line = input_line channel in
>   let scanbuf = Scanf.Scanning.from_string line in
>   incr line_number;
>   Scanf.bscanf scanbuf "%s@ " ignore;
>   let vec = Array.init !dims (fun _ -> Scanf.bscanf scanbuf " %f" (fun x -> x)) in
>   ...
> 
> (The format "%s@c" means "scan a string up to, but excluding, the character c", so "%s@ " consumes the first word.)
> 
> On Wed, Apr 26, 2017 at 10:05 AM, Jon Kleiser <jon.kleiser@ceres.no> wrote:
> Thanks a lot, Gabriel, for your idea about using the “word by word” method. So far I have used the Stream way of file reading:
> 
> let line_stream_of_channel channel =
>   Stream.from
>     (fun _ -> try Some (input_line channel) with End_of_file -> None)
> 
> Can this Stream reading make use of the scanf to read floats (and other words)? If not, I may leave the Stream way.
> 
> I would also like to have access to the current number of lines received, to be able to report that so-and-so was found at line number x. So far I have not found out how to count the lines while reading from a Stream.
> 
> /Jon
> 
> 
> > On 26. Apr, 2017, at 15:41, Gabriel Scherer <gabriel.scherer@gmail.com> wrote:
> >
> > It looks like you read a line from an input channel and now want to split it on its spaces. It is also possible to read the input channel word by word in the first place, and for this the semantics of spaces in a scanf format is very useful: a single space ignores all whitespace. So
> >
> > let read_float () =
> >   Scanf.scanf " %f" (fun x -> x)
> >
> > will ignore any whitespace and then expect a floating-point number, read it and return it. (This reads from standard input; to read from arbitrary channels, see Scanf.bscanf and the Scanf.Scanning module.)
> >
> > On Wed, Apr 26, 2017 at 6:48 AM, Jon Kleiser <jon.kleiser@ceres.no> wrote:
> > Hi,
> >
> > I am quite new to OCaml, and I am looking for the most efficient way to make an Array of floats from a string. My solution so far looks like this, where dims is a global variable specifying the size of the Arrays (typically 300):
> >
> > let make_vector vec_strings =
> >   let vec = Array.make !dims 0.0 in
> >   List.iteri (fun i str -> vec.(i) <- float_of_string str) vec_strings;
> >   vec   (* return the filled array; List.iteri itself returns unit *)
> >
> > let process_line line =
> >   let parts = Str.split (Str.regexp " ") line in
> >   make_vector (List.tl parts)   (* skipping first element which is not a float *)
> >
> > /Jon
> 


