From mboxrd@z Thu Jan  1 00:00:00 1970
To: 9fans@cse.psu.edu
From: "Douglas A. Gwyn" <DAGwyn@null.net>
Message-ID: <3B978E5A.262D53E9@null.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <20010905191944.AC31319A04@mail.cse.psu.edu>
Subject: Re: [9fans] weird print(2) problems...
Date: Thu,  6 Sep 2001 16:18:35 +0000
Topicbox-Message-UUID: ea769910-eac9-11e9-9e20-41e7f4b1d025

Russ Cox wrote:
> > print() shouldn't call write(), as there's nothing to write.
> then {echo ''} shouldn't print a newline, as there's nothing to echo.

No, the semantics of "echo" are traditionally defined simply as
printing its arguments separated by spaces and followed by a newline.
So even "echo" with no additional arguments must print a newline.
(Unless, of course, told to suppress the newline, as with the -n
option.)  Echo with a 0-length string argument should print the
0-length string argument, separated from adjacent arguments in
the output by spaces, then NL.  "echo a '' '' b" should print
"a<space><space><space>b<newline>".

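To make that rule concrete, here is a minimal C sketch of it.  The
name echo_format, its buffer-based interface, and the suppress_nl
flag are assumptions made for illustration, not any actual echo
source.  argv here holds only the arguments to be echoed, without
the program name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the output of a traditional echo in buf (capacity cap):
 * the arguments joined by single spaces, then a newline unless
 * suppress_nl is nonzero.  Returns the number of bytes produced,
 * or -1 if buf is too small.  echo_format is a hypothetical name.
 */
long
echo_format(char *buf, size_t cap, int argc, char **argv, int suppress_nl)
{
	size_t n = 0;
	int i;

	assert(buf != NULL);
	for (i = 0; i < argc; i++) {
		size_t len = strlen(argv[i]);

		if (i > 0) {
			/* one separator between adjacent arguments, even empty ones */
			if (n + 1 > cap)
				return -1;
			buf[n++] = ' ';
		}
		if (n + len > cap)
			return -1;
		memcpy(buf + n, argv[i], len);
		n += len;
	}
	if (!suppress_nl) {
		/* the newline is emitted even when there are no arguments */
		if (n + 1 > cap)
			return -1;
		buf[n++] = '\n';
	}
	return (long)n;
}
```

There is no special case for empty arguments or for an empty
argument list; both fall out of the same loop.
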
The problem with "higher-level" library functions such as print()
is that, generally, there is no clear specification relating their
invocation and when they call low-level functions such as write().

> there's no obvious answer except that we shouldn't try to "fix" it.

write() with buffer length 0 should of course write an empty data
packet.  Whether a subsequent read will see the packet boundaries
depends on several things, and whether a 0-length packet is
interpreted as "end of file" depends on several *other* things.

We ran into all this on early versions of UNIX, where a ^D on a
terminal input actually just delimited (sent along from raw to
canonical queue) the current input, same as a new-line input,
except of course without the embedded newline character.  The
UNIX convention was that read() returning 0 characters, as it
would for a terminal *iff* a delimiter immediately followed a
previous delimiter (newline or ^D-activated), was interpreted
as end of file.  This was consistent with reading characters
beyond the last one present in a static (disk) file.  On pipes,
it required preservation of written record lengths so a 0-length
record could be detected by read().
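
The delimiter convention can be sketched in a few lines of C.  This
is a simulation under stated assumptions, not code from any UNIX
terminal driver: canon_deliveries is a hypothetical name, ^D is taken
to be the byte 004, and every read() is assumed large enough to take
a whole delivery.

```c
#include <assert.h>
#include <stddef.h>

enum { CTRL_D = 4 };	/* assumed value of the ^D character */

/*
 * Scan raw terminal input and record the length each read() would
 * return under the convention described above: newline and ^D both
 * deliver the pending input, but only the newline stays in the data.
 * Returns the number of deliveries stored in lens (at most maxlens).
 */
size_t
canon_deliveries(const char *raw, size_t nraw, size_t *lens, size_t maxlens)
{
	size_t pending = 0, nd = 0, i;

	assert(raw != NULL && lens != NULL);
	for (i = 0; i < nraw && nd < maxlens; i++) {
		if (raw[i] == '\n') {
			lens[nd++] = pending + 1;	/* newline is part of the data */
			pending = 0;
		} else if (raw[i] == CTRL_D) {
			lens[nd++] = pending;	/* 0 if right after a delimiter */
			pending = 0;
		} else {
			pending++;
		}
	}
	return nd;
}
```

A ^D typed directly after a newline or another ^D yields a recorded
length of 0, which is the case a reader would take as end of file.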

Ultimately, many flavors of I/O on UNIX-based systems grew more
complicated, and record lengths were harder to preserve.  EOF
was therefore sometimes indicated by out-of-band information
instead of by a 0-length data packet.  However, the 0-length
idea has considerable appeal, and whatever problems there are
seem to be due to assuming semantics for an "end of file"
indication that are more heavyweight ("sticky") than the
condition sometimes indicates.  I would say that a "solution"
would be this: when a source (1) has no more data, (2) has already
reported 0-length for end of data, and (3) can *never* have any
more data, read() should return a distinct "attempted read past
end of data" error instead of the less informative 0-length
buffer.  That would not affect applications that interpret a
0-length return from read() as EOF, but would allow apps to
do the "natural" thing with empty data packets (i.e. process no
data) instead of having to assume they mean permanent EOF.
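
A rough C simulation of that proposal follows.  Everything named in
it (struct source, src_read, src_drain, SRC_EPASTEND) is hypothetical;
no existing read() behaves this way.  The source is modeled as a
fixed queue of packets that can never grow, so conditions (1) and (3)
both hold once the queue is drained.  A packet longer than the
caller's buffer is simply truncated in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error code for "attempted read past end of data". */
enum { SRC_EPASTEND = -2 };

struct source {
	const char **pkts;	/* queued packets; "" is an empty packet */
	size_t npkts;		/* number of queued packets */
	size_t next;		/* index of the next packet to deliver */
	int eof_reported;	/* 0-length end-of-data return already given */
};

/* Deliver one packet per call, preserving packet boundaries. */
long
src_read(struct source *s, char *buf, size_t cap)
{
	assert(s != NULL && buf != NULL);
	if (s->next < s->npkts) {
		const char *p = s->pkts[s->next++];
		size_t len = strlen(p);

		if (len > cap)
			len = cap;	/* sketch only: excess is dropped */
		memcpy(buf, p, len);
		return (long)len;	/* 0 here means an empty packet */
	}
	if (!s->eof_reported) {
		s->eof_reported = 1;
		return 0;		/* first report of end of data */
	}
	return SRC_EPASTEND;		/* nothing more can ever arrive */
}

/* Read until the distinct error, treating 0 as "no data this time". */
long
src_drain(struct source *s)
{
	char buf[16];
	long total = 0, n;

	while ((n = src_read(s, buf, sizeof buf)) >= 0)
		total += n;
	return total;
}
```

A reader that stops at the first 0 return behaves exactly as it does
today, while a loop like src_drain passes over empty packets and
stops only at the distinct error.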

>                     The Plan 9 and the Echo

The parable is silly.  The traditional semantics of "echo" have
a simple, regular specification, which is not true of an "echo"
that outputs nothing when supplied with no non-option arguments.