From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA31834; Tue, 27 Apr 2004 18:18:27 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id SAA32099 for <caml-list@pauillac.inria.fr>; Tue, 27 Apr 2004 18:18:25 +0200 (MET DST)
Received: from postfix3-1.free.fr (postfix3-1.free.fr [213.228.0.44])
	by concorde.inria.fr (8.12.10/8.12.10) with ESMTP id i3RGIOYM015281
	for <caml-list@inria.fr>; Tue, 27 Apr 2004 18:18:25 +0200
Received: from warp (lns-th2-5f-81-56-197-80.adsl.proxad.net [81.56.197.80])
	by postfix3-1.free.fr (Postfix) with SMTP
	id E8D06C418C; Tue, 27 Apr 2004 18:18:23 +0200 (CEST)
Message-ID: <016501c42c73$24e64b30$ef01a8c0@warp>
From: "Nicolas Cannasse" <warplayer@free.fr>
To: "Yamagata Yoriyuki" <yoriyuki@mbg.ocn.ne.jp>
Cc: <caml-list@inria.fr>
References: <015701c42b9a$00065730$ef01a8c0@warp><20040427.002643.78700697.yoriyuki@mbg.ocn.ne.jp><016401c42bc4$b6438840$19b0e152@warp> <20040428.004358.45522587.yoriyuki@mbg.ocn.ne.jp>
Subject: Re: [Caml-list] Re: Common IO structure
Date: Tue, 27 Apr 2004 18:17:32 +0200
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1409
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409
X-Miltered: at concorde by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")!
X-Loop: caml-list@inria.fr
X-Spam: no; 0.00; cannasse:01 warplayer:01 caml-list:01 extlib:01 ocamlnet:01 wrappers:01 extlib:01 wrappers:01 camomile:01 ocamlnet:01 gerd:01 runtime:01 slower:01 consistenly:01 behave:01 
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

> > So ? That's exactly what we're talking about it there : making a choice.
And
> > that include naming of course. I don't say that the name we choosed for
> > ExtLib IO are better, it's just that "reading" and "writing" on an IO
seems
> > natural to me.
>
> get/put are used in Camomie already, and input/output are used in
> ocamlnet.  OO wrappers of Extlib IO are only recent addition, so
> changing Extlib is more natural.  In this case, what Extlib should do is
> just changing OO wrappers.  I do not think you need to change IO
> module itself.

As someone told, read/write concepts are used in most of other languages
(including Java, C, and many others). I agree ExtLib's IO are recent and
should not dictate how other libraries methods should be named. But since
we're standardizing things, Camomile , Ocamlnet and Extlib will have all to
rewrite some code in order to be intercompatible. Since we need to do that,
let's make a choice not based on what's already written, but on what's is
best for the end user. A slashdot poll would help here :)
But don't worry, if you and Gerd agree on some naming, ExtLib IO will
follow. I just want that the read/write naming be taken care as much as
other naming possibilities.

> > > Another problem is that it is not minimal
> > > enough.  For character converters, it is impossible to predict how
> > > many characters will be available, for example.  And requiring "pos",
> > > "nread", "nwrite" seems arbitrary for me.  They are somtimes useful
> > > and improvement, but not necessary.
> >
> > That's true, I agree with you but on the last point : they are necessary
in
> > order to get good performances. Concerning "available", it returns None
if
> > no data available. "pos" might throw an exception as well when
unavailable
> > (looks like pos and available should have same behavior here).
>
> My philosophy is to make the type informative as far as possible.  If
> some method does not work, I would rather remove the method, and
> notify this fact to a user in the compile time (not in the runtime).  A
> user, or the library developer can provide wrappers if necessary.
>
> Philoshophy aside, I do not see how pos and available improve
> performance.  pos certainly decreases performance.  Anyways disks are
> much slower than CPU, so arguing small performance benefit is
> nonsense.  Since there are many possible "improvements" (seek, unget,
> length, destination addresses), it would be better to stick the
> algebraically minimal specification.
>
> Note that I do not oppose extension as such.  I oppose making them as
> the standard.


I agree on dropping pos and available from the standard. ExtLib will deal
consistenly without them.


> > And nread/nwrite can simply call n times read/write. That means that
> > any library can put default implementation for additional "not
> > minimal" constructs : they will behave poorly (writing a string char
> > by char) but will interface well with other IO that are supporting
> > them correctly. If implementing efficently nread/nwrite require
> > additionnal effort, then let's implement a default behavior and
> > implement it better later. Having theses functions make room for
> > future improvements, which is not done with minimal IO.
>
> Choosing a type of buffers is not trivial (except char, in this case I
> propose using stirng, of courst).  For exmaple, what is for a Unicode
> channel?  UTF8/UTF16/UTF32 strings, array, DynArray.t, all have their
> own advantage.  And if someone uses UTF8, another uses UTF16 and so
> on, then there is not much point of having standard.
>
> Of course we would make "nread/nwrite" use a default buffer type
> (maybe list, as you do Extlib), but I doubt that such "filler" methods
> do any good.

They'll maybe not - in the Unicode case, but they'll definilty help for
other IO.
Concerning the channels, I'm against having 4 classes instead of 2. If the
user need to write both chars and strings, he will need to carry two objects
instead of one.
My proposal is based on the following :

class input = object
     method read : char
     method nread  : int -> string
     method close_in : unit
end

class output = object
     method write : char
     method nwrite : string
     method close_out : unit
end

and since it's so easy, let's add some polymorphism to get more general and
more powerful IO objects :

class ['a,'b] input = object
    method read : 'a
    method nread :  int -> 'b
    method close_in : unit
end

class ['a,'b,'c] output = object
    method write : 'a
    method nwrite : 'b
    method close_out : 'c
end

Having bi-polymorphism over an IO is really powerful : you can handle
buffered read/write like this or polymorphic writes (two ways of reading,
two ways of writing).
An example from the ExtLib :

val input_bits : (char,'a) input -> (bool,int) input
val output_bits : (char,'a,'b) output -> (bool,(int * int),'b) output

enable you to read bit-by-bit over a channel :

let ch = input_bits .... in
let b = IO.read ch in (* read a bit as a boolean *)
let n = IO.nread ch 5 in (* read 5 bits as an integer *)
...

let ch = output_bits ... in
IO.write ch true; (* write a bit *)
IO.write ch false;
IO.nwrite ch (5,31); (* write 31 using 5 bits *)

It's not only an extension : putting that in the core interface enable a
wide kind of IO.
I'm interested in your thinking about that.

Best Regards,
Nicolas Cannasse

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners