From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id GAA21472; Tue, 30 Sep 2003 06:35:26 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id GAA28570 for <caml-list@pauillac.inria.fr>; Tue, 30 Sep 2003 06:35:24 +0200 (MET DST)
Received: from herd.plethora.net (herd.plethora.net [205.166.146.1])
	by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h8U4ZIH02062;
	Tue, 30 Sep 2003 06:35:18 +0200 (MET DST)
Received: from bhurt.plethora.net (bhurt.plethora.net [205.166.146.49])
	by herd.plethora.net (8.11.6/8.10.1) with ESMTP id h8U4ZEC15251;
	Mon, 29 Sep 2003 23:35:14 -0500 (CDT)
Date: Mon, 29 Sep 2003 23:40:21 -0500 (CDT)
From: Brian Hurt <bhurt@spnz.org>
X-X-Sender: bhurt@localhost.localdomain
To: Pierre Weis <pierre.weis@inria.fr>
cc: jeanmarc.eber@lexifi.com, <martin_jambon@emailuser.net>,
        <caml-list@inria.fr>
Subject: Re: [Caml-list] Printing text with holes
In-Reply-To: <200309292048.WAA16827@pauillac.inria.fr>
Message-ID: <Pine.LNX.4.44.0309292258310.3771-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Loop: caml-list@inria.fr
X-Spam: no; 0.00; caml-list:01 pierre:01 weis:01 printf:01 printf:01 well-typed:01 mucking:01 textual:01 seperated:01 java's:01 model:01 constants:01 escapes:01 floats:01 ignoring:01 
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

On Mon, 29 Sep 2003, Pierre Weis wrote:

> > On Mon, 29 Sep 2003, Jean-Marc EBER wrote:
> > 
> > > With "old" printf approach, you write something like:
> > > let _ =
> > >    printf "Hello %s %s! It is %i o'clock.\n" first_name last_name time
> > 
> > It also allows you to do:
> > 
> > let _ = printf "Hello %s %s! It is %i o'clock.\n" name time
> 
> This is not well-typed and the error is properly spotted by the compiler:

Because the compiler understands printf strings as a special case.  I do 
not know of a way I could write printf in standard Ocaml.  You need 
specialized compiler support.  Which makes it very difficult to extend, at 
best.  You basically end up mucking about in p4.

At heart, printf is doing two seperate jobs.  First, it's formating binary 
data (integers, floating points, etc) into there textual/formatted 
representations.  And second, they are outputing said formatted strings.  
These are two completely different jobs which should be done by different 
blocks of code.  Java started recognizing this when they seperated the 
formatting objects from the i/o objects.  And for all of it's limitations, 
and Java I/O has it's limitations, I think it hit upon a better solution 
than printf.  Of course, Java's I/O model isn't radical or new, as any 
Unix shell programmer will tell you.

> Absolutely. This exactly similar to what the compiler does when
> reading string constants in order to properly look for escapes, or
> when it reads strings of digits to decide if it means an integer or a
> floating point number.

Printf is at a completely different level of complexity than simply 
escaping strings.  Escaping strings (and differentiating integers from 
floats) can be done entirely in the lexer.

I've actually written printf implementations.  Even ignoring "amusing" 
complications like locales and floating point numbers, they are not simple 
beasts.  

> You are absolutely right: good old C functions have not automatically
> good old interfaces (for a counterexample just consider ioctl, malloc
> and free, compared to their void equivalent in Caml, or longjump and
> setjump compared with the nice try raise constructs). Conversely, we
> must admit that good old C functions do not necessarily have a bad
> interface, isn't it ?  (May I mention as an example the basic
> arithmetic operators ?)

There is very little in the standard C libraries I would consider "good" 
outside of C.  Primarily because (IMHO) there is very little in the 
standard C libraries.  Nor is C particularly noted for it's ease of 
manipulation of strings and I/O.  About all I ever hear anyone say about C 
and text manipulation is that it's better than FORTRAN, which qualifies as 
damning with faint praise.  Now, the Unix command line, *there* is an 
environment good for text processing.  Good enough that it's been stolen 
in one way or another at least twice (Java and Perl- Perl doing a more 
thorough job of stealing, while Java stole the basic concepts).

Here's how it would work- you would have three categories of objects.  
The first category would be sources and sinks.  A source is an object with
a read function, a sink is an object with a write function.  These objects 
would represent files and strings and network sockets etc.  The second 
category is filters.  A filter looks to the outside just like a source or 
a sink, but instead of being a final destination it just does something to 
the data and then passes it along.  A classic example of a filter is an 
encryption object- looks just like a sink, but when you write data to it, 
it encrypts the data and then writes it in turn to another sink (which may 
be another filter).  The third category of objects is formating 
objects or functions.  These take binary data and turn them into string 
representations.  A classic example here is string_of_int.

Note that since I'm working within the language, all the parts are easily 
extensible and replaceable.

Brian


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners