caml-list - the Caml user's mailing list
* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
       [not found] <Pine.LNX.4.44.0209300427310.1961-100000@home.oyster.ru>
@ 2002-09-30  1:19 ` Alessandro Baretta
  0 siblings, 0 replies; 10+ messages in thread
From: Alessandro Baretta @ 2002-09-30  1:19 UTC (permalink / raw)
  To: malc, Ocaml



malc wrote:
> On Sun, 29 Sep 2002, Alessandro Baretta wrote:
> 
> 
>>malc wrote:
>>
>>>I might be way off here, but shouldnt it be:
>>>
>>>let input_file_as_string file =
>>>  let chanin = open_in_bin file in
>>>  let len = in_channel_length chanin in
>>>  let s = String.create len in
>>>  really_input chanin s 0 len;
>>>  close_in chanin;
>>>  s
>>
>># in_channel_length stdin;;
>>Exception: Sys_error "Illegal seek".
> 
> 
> Yes, but the function should be called: input_channel[stream]_as_string, 
> no?
> 

Scratch... Scratch... I'm not sure what you mean.

My point is just that in_channel_length is only meaningful 
for channels connected to regular files. Otherwise, it 
raises Sys_error. Maxence Guesdon's version handles this 
case correctly, since it assumes that a channel is at the 
EOF if-and-only-if reading from it raises End_of_file.

If you mean that the name of the function seems to imply that 
its parameter is a regular file, I disagree with you. In 
UNIX, anything you can do I/O on is a _file_ descriptor. In 
C, buffered I/O is done through FILE* variables. One simply 
has to write code that correctly handles the different 
kinds of files that exist in the UNIX world. Some you can 
seek on; others you cannot, and all you can do is 
continue to read until there is nothing left to read -- that 
is, until read(fdesc, buf, max) returns 0 when max > 0.
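
A minimal sketch of that idea in OCaml, assuming a reasonably recent standard library (really_input_string, Buffer.add_channel). The name read_all is made up for illustration and does not come from this thread. It uses in_channel_length when the channel is seekable and otherwise just reads until End_of_file:

```ocaml
(* Read everything that is left on [ic].
   Assumption: [ic] was opened in binary mode, so that the length
   reported for a regular file matches the number of bytes read. *)
let read_all ic =
  match in_channel_length ic with
  | len ->
      (* Seekable channel: read exactly the bytes between the
         current position and the end. *)
      really_input_string ic (len - pos_in ic)
  | exception Sys_error _ ->
      (* Pipes and similar channels cannot be measured: read in
         chunks until End_of_file.  Buffer.add_channel keeps the
         bytes of a final short chunk before raising. *)
      let buf = Buffer.create 4096 in
      (try
         while true do
           Buffer.add_channel buf ic 4096
         done
       with End_of_file -> ());
      Buffer.contents buf
```

Some special files report a length that does not match their contents, so the chunked branch alone is the more conservative choice when the kind of file is unknown.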

Alex

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27 15:07   ` Maxence Guesdon
  2002-09-27 17:06     ` Kontra, Gergely
@ 2002-09-27 22:27     ` malc
  1 sibling, 0 replies; 10+ messages in thread
From: malc @ 2002-09-27 22:27 UTC (permalink / raw)
  To: Maxence Guesdon; +Cc: Florian Hars, kgergely, caml-list

On Fri, 27 Sep 2002, Maxence Guesdon wrote:

> If you want to read a whole file as a string, you can use the following function:
> 
> let input_file_as_string file =
>   let chanin = open_in_bin file in
>   let len = 1024 in
>   let s = String.create len in
>   let buf = Buffer.create len in
>   let rec iter () =
>     try
>       let n = input chanin s 0 len in
>       if n = 0 then
>         ()
>       else
>         (
>          Buffer.add_substring buf s 0 n;
>          iter ()
>         )
>     with
>       End_of_file -> ()
>   in
>   iter ();
>   close_in chanin;
>   Buffer.contents buf
> 

I might be way off here, but shouldn't it be:

let input_file_as_string file =
  let chanin = open_in_bin file in
  let len = in_channel_length chanin in
  let s = String.create len in
  really_input chanin s 0 len;
  close_in chanin;
  s

Even if your variant ran faster (and it doesn't), it suffers from
Buffer's resizing policy, meaning it will bail out before
String's maximal capacity is reached.

-- 
mailto:malc@pulsesoft.com


* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27 17:06     ` Kontra, Gergely
@ 2002-09-27 18:22       ` Tim Freeman
  0 siblings, 0 replies; 10+ messages in thread
From: Tim Freeman @ 2002-09-27 18:22 UTC (permalink / raw)
  To: kgergely; +Cc: maxence.guesdon, florian, caml-list

>Well, but what about memory usage?
>I just miss an iterator, or sg. like that, which reads a line, processes
>it, reads next line...

Is there a reason not to use streams?

I'm not clear on the current limitations of the streams.ml that comes
with OCaml (logged as bug 1289), but the improved version at
http://www.fungible.com fully implements all of the operations that
can get through type checking and can read 10MB/s on my 600 MHz
Pentium III.

It ought to be easy to write a function that takes a stream of
characters and returns a stream of lines.
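
A hedged sketch of such a function. It uses the standard library's Seq type rather than the stream library discussed above (a deliberate substitution, since Seq needs no extra package), and the name lines_of_chars is invented here:

```ocaml
(* Turn a lazy sequence of characters into a lazy sequence of lines.
   A fresh buffer is created each time a line starts, so forcing the
   same node twice does not duplicate characters. *)
let lines_of_chars (chars : char Seq.t) : string Seq.t =
  let rec line buf chars =
    match chars () with
    | Seq.Nil ->
        (* End of input: emit a final unterminated line, if any. *)
        if Buffer.length buf = 0 then Seq.Nil
        else Seq.Cons (Buffer.contents buf, Seq.empty)
    | Seq.Cons ('\n', rest) -> Seq.Cons (Buffer.contents buf, start rest)
    | Seq.Cons (c, rest) ->
        Buffer.add_char buf c;
        line buf rest
  and start chars () = line (Buffer.create 80) chars in
  start chars
```

For example, List.of_seq (lines_of_chars (String.to_seq "a\n\nb")) evaluates to ["a"; ""; "b"].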

-- 
Tim Freeman       
tim@fungible.com
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D  7180 76DF FE00 34B1 5C78 

* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27 15:07   ` Maxence Guesdon
@ 2002-09-27 17:06     ` Kontra, Gergely
  2002-09-27 18:22       ` Tim Freeman
  2002-09-27 22:27     ` malc
  1 sibling, 1 reply; 10+ messages in thread
From: Kontra, Gergely @ 2002-09-27 17:06 UTC (permalink / raw)
  To: Maxence Guesdon; +Cc: Florian Hars, caml-list

>>   > One point was ocaml is really obscure is the file reading: it needed to
>>   > wrap the line-reading in an infinite loop, and this loop is ended via an
>>   > exception. I think End_of_file is not an exception, it is just an event.
>
>What next ? int_of_string "foo" returning None ?
No, that wouldn't be a good idea.
But writing int_of_string "foo" is a bad thing, while reaching the
end of a file is not a design problem. Is there a function to test for
eof (without consuming a line)?

>If you want to read a whole file as a string, you can use the
>following function:
[snip]
Well, but what about memory usage?
I just miss an iterator, or something like that, which reads a line,
processes it, reads the next line...
I just want to write cleaner code, which e.g. in Ruby is simply (so intuitive)

IO.foreach("filename.ext") {|line|
	# process line
}

and I don't like infinite loops.
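
For comparison, a rough OCaml counterpart of IO.foreach can be written in a few lines. This is only a sketch: foreach_line is a hypothetical name, not a standard library function, and Fun.protect assumes OCaml 4.08 or later.

```ocaml
(* Call [f] on every line of the file named [filename], then close
   the file, even if [f] raises. *)
let foreach_line filename f =
  let ic = open_in filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      let rec loop () =
        match input_line ic with
        | line ->
            f line;
            loop ()
        | exception End_of_file -> ()
      in
      loop ())
```

Usage then reads much like the Ruby version: foreach_line "filename.ext" (fun line -> print_endline line). Only one line is held in memory at a time.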

Gergo

+-[Kontra, Gergely @ Budapest University of Technology and Economics]-+
|         Email: kgergely@mcl.hu,  kgergely@turul.eet.bme.hu          |
|  URL:   turul.eet.bme.hu/~kgergely    Mobile: (+36 20) 356 9656     |
+-------"Olyan langesz vagyok, hogy poroltoval kellene jarnom!"-------+
.
Magyar php mirror es magyar php dokumentacio: http://hu.php.net




* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27 14:50 ` Florian Hars
@ 2002-09-27 15:07   ` Maxence Guesdon
  2002-09-27 17:06     ` Kontra, Gergely
  2002-09-27 22:27     ` malc
  0 siblings, 2 replies; 10+ messages in thread
From: Maxence Guesdon @ 2002-09-27 15:07 UTC (permalink / raw)
  To: Florian Hars; +Cc: kgergely, caml-list


> Kontra, Gergely wrote:
>   > One point was ocaml is really obscure is the file reading: it needed to
>   > wrap the line-reading in an infinite loop, and this loop is ended via an
>   > exception. I think End_of_file is not an exception, it is just an event.

What next ? int_of_string "foo" returning None ?

It is an exceptional case: your function tries to read a line, but there is nothing to read, so it cannot return the expected result, and an exception is raised. This is the classic and right way to do it.

If you want to read a whole file as a string, you can use the following function:

let input_file_as_string file =
  let chanin = open_in_bin file in
  let len = 1024 in
  let s = String.create len in
  let buf = Buffer.create len in
  let rec iter () =
    try
      let n = input chanin s 0 len in
      if n = 0 then
        ()
      else
        (
         Buffer.add_substring buf s 0 n;
         iter ()
        )
    with
      End_of_file -> ()
  in
  iter ();
  close_in chanin;
  Buffer.contents buf
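
With current compilers, where strings are immutable and input fills a bytes value, the same function needs small changes. A hedged port, assuming OCaml 4.08 or later for Fun.protect:

```ocaml
(* Read a whole file into a string, 1024 bytes at a time. *)
let input_file_as_string file =
  let chanin = open_in_bin file in
  Fun.protect
    ~finally:(fun () -> close_in_noerr chanin)
    (fun () ->
      let len = 1024 in
      let chunk = Bytes.create len in
      let buf = Buffer.create len in
      let rec iter () =
        (* [input] returns 0 at end of file rather than raising,
           so no exception handler is needed here. *)
        let n = input chanin chunk 0 len in
        if n > 0 then begin
          Buffer.add_subbytes buf chunk 0 n;
          iter ()
        end
      in
      iter ();
      Buffer.contents buf)
```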


-- 
Maxence Guesdon

* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-26 15:50 Kontra, Gergely
  2002-09-27  8:01 ` Maxence Guesdon
@ 2002-09-27 14:50 ` Florian Hars
  2002-09-27 15:07   ` Maxence Guesdon
  1 sibling, 1 reply; 10+ messages in thread
From: Florian Hars @ 2002-09-27 14:50 UTC (permalink / raw)
  To: Kontra, Gergely; +Cc: caml-list

Kontra, Gergely wrote:
  > One point was ocaml is really obscure is the file reading: it needed to
  > wrap the line-reading in an infinite loop, and this loop is ended via an
  > exception. I think End_of_file is not an exception, it is just an event.

For processing files a line at a time, I once decided to write a module
Textfile with the two functions

val iter : (string -> unit) -> in_channel -> unit
val fold : ('a -> string -> 'a) -> 'a -> in_channel -> 'a

and forget about whatever ugly things happen in there.
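
One possible implementation behind that signature, as a sketch only (the original module is not shown in this thread, so the body below is an assumption):

```ocaml
module Textfile : sig
  val iter : (string -> unit) -> in_channel -> unit
  val fold : ('a -> string -> 'a) -> 'a -> in_channel -> 'a
end = struct
  (* The only place where End_of_file is caught. *)
  let input_line_opt ic =
    try Some (input_line ic) with End_of_file -> None

  (* Tail-recursive left fold over the remaining lines of [ic]. *)
  let rec fold f acc ic =
    match input_line_opt ic with
    | None -> acc
    | Some line -> fold f (f acc line) ic

  let iter f ic = fold (fun () line -> f line) () ic
end
```

Counting lines, for instance, becomes Textfile.fold (fun n _ -> n + 1) 0 ic.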

Yours, Florian.




* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27  8:14   ` Xavier Leroy
@ 2002-09-27 10:29     ` Kontra, Gergely
  0 siblings, 0 replies; 10+ messages in thread
From: Kontra, Gergely @ 2002-09-27 10:29 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Maxence Guesdon, caml-list

>> > a function, which checks, whether a string can match a regexp, and
>> > return true or false. Search_forward, again, I think should be return an
>> > option, and not trowing an exception.
>> See Str.string_match : regexp -> string -> int -> bool
[...]

>        Str.string_match (Str.regexp ".*\\.html$") filename 0
>and
>        try
>          ignore(Str.search_forward (Str.regexp "\\.html$") filename 0);
>          true
>        with Not_found ->
>          false

Thanks a lot, it was a useful post!
But isn't this functionality needed so often that it is worth wrapping
it in a function?
In many cases it is enough to test whether a string contains another
string; is there a function for that?
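
A plain substring test is short enough to write by hand. The following is a sketch using only the core String module; the name contains and its labelled argument are chosen here for illustration:

```ocaml
(* [contains ~sub s] is true when [sub] occurs somewhere in [s].
   Naive scan: tries every starting position in turn. *)
let contains ~sub s =
  let n = String.length s and m = String.length sub in
  let rec at i = i + m <= n && (String.sub s i m = sub || at (i + 1)) in
  at 0
```

For example, contains ~sub:".xml" "index.xml" is true, and the empty string is contained in every string.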

Gergo

* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-27  8:01 ` Maxence Guesdon
@ 2002-09-27  8:14   ` Xavier Leroy
  2002-09-27 10:29     ` Kontra, Gergely
  0 siblings, 1 reply; 10+ messages in thread
From: Xavier Leroy @ 2002-09-27  8:14 UTC (permalink / raw)
  To: Maxence Guesdon; +Cc: Kontra, Gergely, caml-list

> > Another annoying problem (maybe it is in the docs...), but I cannot find
> > a function, which checks, whether a string can match a regexp, and
> > return true or false. Search_forward, again, I think should be return an
> > option, and not trowing an exception.
> 
> See Str.string_match : regexp -> string -> int -> bool

In an attempt to prevent one more round of e-mails on this topic, let
me just add that string_match performs "anchored matching" (matching
the RE at the given location in the string) while search_forward
performs "unanchored matching" (matching at any location).  However,
the latter can be turned into the former by prefixing the regexp with ".*".
Hence, the following are equivalent:

        Str.string_match (Str.regexp ".*\\.html$") filename 0
and
        try
          ignore(Str.search_forward (Str.regexp "\\.html$") filename 0);
          true
        with Not_found ->
          false

This said, a much cleaner solution is

        Filename.check_suffix filename ".html"

Not only is it shorter, but under Windows it will perform
case-insensitive matching, like it should.
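
As a small illustration of the Filename functions, here is a sketch that swaps one file-name suffix for another. The helper swap_suffix is hypothetical; only Filename.check_suffix and Filename.chop_suffix come from the standard library.

```ocaml
(* Replace the suffix [from] of [name] by [into]; names that do not
   end in [from] are returned unchanged. *)
let swap_suffix ~from ~into name =
  if Filename.check_suffix name from then
    Filename.chop_suffix name from ^ into
  else name
```

With it, swap_suffix ~from:".xml" ~into:".html" "page.xml" gives "page.html".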

- Xavier Leroy

* Re: [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
  2002-09-26 15:50 Kontra, Gergely
@ 2002-09-27  8:01 ` Maxence Guesdon
  2002-09-27  8:14   ` Xavier Leroy
  2002-09-27 14:50 ` Florian Hars
  1 sibling, 1 reply; 10+ messages in thread
From: Maxence Guesdon @ 2002-09-27  8:01 UTC (permalink / raw)
  To: Kontra, Gergely; +Cc: caml-list


> Another annoying problem (maybe it is in the docs...), but I cannot find
> a function, which checks, whether a string can match a regexp, and
> return true or false. Search_forward, again, I think should be return an
> option, and not trowing an exception.

See Str.string_match : regexp -> string -> int -> bool
 
For the rest, I won't go into another syntax flame war :-)

-- 
Maxence Guesdon

* [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs
@ 2002-09-26 15:50 Kontra, Gergely
  2002-09-27  8:01 ` Maxence Guesdon
  2002-09-27 14:50 ` Florian Hars
  0 siblings, 2 replies; 10+ messages in thread
From: Kontra, Gergely @ 2002-09-26 15:50 UTC (permalink / raw)
  To: caml-list

Hi!

First, sorry for my previous post; I didn't read the whole thread before
posting.

Yesterday (half the night) I spent my time writing an extremely simple
program, which recursively changes the text .xml to .html (actually the
links, but this is a very simple program, so I didn't parse the html) in
all html files in the given directory (and in the subdirectories).
Before that, I wrote the same in Ruby. To understand OCaml better, I put
the corresponding statements in the same row of a table.

One point where OCaml is really obscure is file reading: I needed to
wrap the line-reading in an infinite loop, and this loop is ended via an
exception. I think End_of_file is not an exception, it is just an event.

I'm sure this whole thing can be solved with a nice lazy list with an
apply method, and OCaml should follow this way.
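
A lazy list of lines can be sketched with the standard Seq type (available since OCaml 4.07, so not something that existed when this was posted). The function name lines_of_channel is an assumption made for this example:

```ocaml
(* Lazy sequence of the remaining lines of [ic].  Each line is read
   only when the consumer asks for it.  The sequence is ephemeral:
   traversing it a second time continues from wherever the channel
   currently is. *)
let rec lines_of_channel ic () =
  match input_line ic with
  | line -> Seq.Cons (line, lines_of_channel ic)
  | exception End_of_file -> Seq.Nil
```

Applying a function to every line is then Seq.iter print_endline (lines_of_channel ic), with no explicit loop or handler at the call site.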

Suggestions? Implementations? Will it be in the next release?

Another annoying problem (maybe it is in the docs...): I cannot find
a function which checks whether a string matches a regexp and
returns true or false. Search_forward, again, I think should return an
option, and not throw an exception.

BTW, what about the mix of labelled and unlabelled functions? As matters
stand, it looks a bit chaotic; I think it is the beginning of adapting
the libraries to the new label functionality.

But the thing I spent most of my time on was the actual OCaml
syntax. I cannot get a feel for the precedences, or where to put ;-s and ;;-s.

If anyone can (constructively) criticise my programming style, please
send me your suggestions!

The program mentioned above:
open Unix
open Str

let rec replace what =
  if (stat what).st_kind = S_DIR then begin
    print_endline (what ^ " is a directory, so entering...");
    let dir = opendir what in
    (* readdir signals the end of the directory with End_of_file *)
    (try
       while true do
         let item = readdir dir in
         print_endline ("Considering replacing in " ^ item);
         match item with
         | "." | ".." -> ()
         | _ -> replace (Filename.concat what item)
       done
     with End_of_file -> ());
    closedir dir
  end else if not (Filename.check_suffix what ".html") then
    print_endline (what ^ " is not a html file, so skipping")
  else begin
    let tempname = what ^ ".bak" in
    print_endline ("Renaming " ^ what ^ " to " ^ tempname);
    Sys.rename what tempname;
    let w = open_out what in
    let r = open_in tempname in
    let xml = regexp "\\.xml" in
    (* copy line by line, rewriting .xml to .html, until End_of_file *)
    (try
       while true do
         output_string w (global_replace xml ".html" (input_line r) ^ "\n")
       done
     with End_of_file -> ());
    close_in r;
    close_out w;
    Sys.remove tempname
  end;;

replace "tutorial" (* well, it is hard-coded, but enough for me *)

Gergő


Thread overview: 10+ messages
     [not found] <Pine.LNX.4.44.0209300427310.1961-100000@home.oyster.ru>
2002-09-30  1:19 ` [Caml-list] (Sorry for my last post) lazyness, exceptions?, ocaml syntax rule-of-thumbs Alessandro Baretta
2002-09-26 15:50 Kontra, Gergely
2002-09-27  8:01 ` Maxence Guesdon
2002-09-27  8:14   ` Xavier Leroy
2002-09-27 10:29     ` Kontra, Gergely
2002-09-27 14:50 ` Florian Hars
2002-09-27 15:07   ` Maxence Guesdon
2002-09-27 17:06     ` Kontra, Gergely
2002-09-27 18:22       ` Tim Freeman
2002-09-27 22:27     ` malc
