The Unix Heritage Society mailing list
* [TUHS] fstat(2) on pipes?
@ 2016-08-15 17:47 Norman Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Norman Wilson @ 2016-08-15 17:47 UTC


I remember once, long ago--probably in the early 1980s--writing
a program that expected fstat on a pipe to return the amount of
data buffered in the pipe.  It worked on the system on which
I wrote the code.  Then I tried it on another, related but
different UNIX, and it didn't work.  So if POSIX/SUS don't
prescribe a standard, I don't think one should pretend there
is one, and (as I learned back then) it's unwise to depend
on the result, except I think it's fair not to expect fstat
to fail on any valid file descriptor.

I'm pretty sure that in 7/e and earlier, fstat on a pipe
reported a regular file with zero links.  There was a reason
for this: the kernel in fact allocated an i-node from a
designated pipe device (pipedev) file system, usually the
root.  So the excuse that `there's no i-node' was just wrong.

In last-generation Research systems, when pipes were streams
(and en passant became full duplex, which caused no trouble
at all but simplified life elsewhere--I think I was the one
who realized that meant we didn't need pseudo-ttys any more),
the system allocated a pair of in-core i-nodes when a pipe
was created.  As long as such an i-node cannot be accidentally
confused with one belonging to any disk file system, this
causes no trouble at all, and since it is possible to have
more than one disk file system this is trivially possible
just by reserving a device number.  (In fact by then our
in-core i-nodes were marked with a file system type as well,
and pipes just became their own file system.)  stat always
returned size 0 for (Research) stream pipes, partly because
nobody cared enough, partly because the implementation of
streams didn't keep an exact count of all the buffered data
all along the stream, just a rough one sufficient for flow
control.  Besides, with a full-duplex pipe, which direction's
data should be counted?

Returning to the original question, I'd suggest that:
-- fstat(fd) where fd is a pipe should succeed
-- the file should be reported to have zero links,
since that is the case for a pipe (unless a named pipe,
but if you support those you probably have something
else to stat anyway)
-- the file type should be IFIFO if that type exists
in xv6 (which it wouldn't were it a real emulation of
6/e, but I gather that's not the goal), IFREG otherwise
-- permissions probably don't matter much, but for
sanity's sake should be some reasonable constant.
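
The four suggestions above could be sketched roughly as follows. This is
only an illustration under stated assumptions: pipe_stat_sketch is a made-up
name, the code uses the ordinary user-space struct stat rather than anything
from the xv6 sources, and 0600 is just one arbitrary "reasonable constant"
for the permissions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper, not taken from xv6: fill a stat buffer for an
 * unnamed pipe along the lines suggested above. */
int pipe_stat_sketch(struct stat *st)
{
    assert(st != NULL);
    memset(st, 0, sizeof *st);

#ifdef S_IFIFO
    st->st_mode = S_IFIFO | 0600;   /* FIFO type plus a constant permission (assumed value) */
#else
    st->st_mode = S_IFREG | 0600;   /* fall back to a regular file if there is no FIFO type */
#endif
    st->st_nlink = 0;               /* an unnamed pipe has no directory entries */
    st->st_size = 0;                /* left at zero: the size is not portable anyway */

    return 0;                       /* fstat on a valid descriptor should not fail */
}
```

A caller would pass in a struct stat and then test the result with
S_ISFIFO(st.st_mode), which is the one thing most programs are likely to
check.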

Norman Wilson
Toronto ON



* [TUHS] fstat(2) on pipes?
  2016-08-15  0:11 Warren Toomey
  2016-08-15  0:41 ` Dave Horsfall
@ 2016-08-15 17:53 ` Clem Cole
  1 sibling, 0 replies; 10+ messages in thread
From: Clem Cole @ 2016-08-15 17:53 UTC


Yet Another Example of UNIX A != UNIX B

IIRC from the /usr/group and later POSIX discussions, the only thing that
is for sure on the stat structure with a PIPE is that it's marked as such.

That said, I just grabbed my copy of the SVID (Vol 1 pages 126-127)

st_size "For ordinary files, this field is the address of the end of file.
  For pipes and FIFO's, this field is the count of the data currently in
the file.   For block-special & char special, this field is undefined."

As for st_ino and st_dev -- the SVID says the ino "uniquely identifies the
file in a given file system," and dev "uniquely identifies the file system
that contains the file."

It further states: "The pair of fields st_ino and st_dev uniquely
identifies ordinary files."     And then later says "No other significance
is associated with this value."


So..... clearly returning an error is wrong.   I don't think the Linux
scheme hurts anything....


On Sun, Aug 14, 2016 at 8:11 PM, Warren Toomey <wkt at tuhs.org> wrote:

> All, sorry this is slightly off-topic. I'm trying to
> find out what fstat(2) returns when the file descriptor
> is a pipe. The POSIX/Open Group documentation doesn't
> really specify what should be returned. Does anybody have
> any pointers?
>
> Thanks, Warren
>
> P.S. Why? xv6 has fstat() but returns an error if the
> file descriptor isn't associated with an i-node. I'm
> trying to work out if/how to fix it.
>



* [TUHS] fstat(2) on pipes?
  2016-08-15 15:14 ` Random832
@ 2016-08-15 16:56   ` Michael Kjörling
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Kjörling @ 2016-08-15 16:56 UTC



On 15 Aug 2016 11:14 -0400, from random832 at fastmail.com (Random832):
> On Mon, Aug 15, 2016, at 10:04, Noel Chiappa wrote:
>> But back to the original topic, it sounds like there's a huge amount
>> of variance in the semantics of doing fstat() on a pipe. V6 doesn't
>> special-case it in any way, but it sounds as if other systems do.
> 
> I expect that the single important thing, the only thing that most
> applications will rely on, is it returning successfully and indicating
> that the file type is fifo.

On Linux/glibc, based on the fstat(2) man page, it looks like the size
field is undefined for a FIFO:

> The st_size field gives the size of the file (if it is a regular
> file or a symbolic link) in bytes. The size of a symbolic link is
> the length of the pathname it contains, without a terminating null
> byte.

The mode field is used to hold the type of file:

> The following POSIX macros are defined to check the file type using
> the st_mode field:
> 
> ...
>     S_ISFIFO(m) FIFO (named pipe)?
> ...
> 
> The following flags are defined for the st_mode field:
> ...
>     S_IFIFO    0010000   FIFO
> ...

The above from Debian Wheezy.
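
To see what a given system actually puts in st_size, something like the
following C sketch could be used. The function name pipe_reported_size is
made up for illustration; only pipe, write, fstat and close are standard
calls. It writes a short message into a fresh pipe and hands back whatever
st_size the read end reports. Since the value is unspecified, the result may
be the byte count, zero, or something else, and should not be relied upon.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write msg into a new pipe, then store the st_size
 * that fstat reports for the read end in *size_out.
 * msg should be short (well under the pipe capacity) so write() cannot block.
 * Returns 0 on success, -1 if any call fails. */
int pipe_reported_size(const char *msg, long long *size_out)
{
    int fds[2];
    struct stat st;
    size_t len = strlen(msg);
    int rc = -1;

    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], msg, len) == (ssize_t)len && fstat(fds[0], &st) == 0) {
        *size_out = (long long)st.st_size;  /* system-dependent value */
        rc = 0;
    }

    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Calling pipe_reported_size("hello", &n) and printing n on a few different
systems is a quick way to compare their behaviour.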

-- 
Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se
                 “People who think they know everything really annoy
                 those of us who know we don’t.” (Bjarne Stroustrup)



* [TUHS] fstat(2) on pipes?
  2016-08-15 14:04 Noel Chiappa
@ 2016-08-15 15:14 ` Random832
  2016-08-15 16:56   ` Michael Kjörling
  0 siblings, 1 reply; 10+ messages in thread
From: Random832 @ 2016-08-15 15:14 UTC


On Mon, Aug 15, 2016, at 10:04, Noel Chiappa wrote:
> But back to the original topic, it sounds like there's a huge amount
> of variance in the semantics of doing fstat() on a pipe. V6 doesn't
> special-case it in any way, but it sounds as if other systems do.

I expect that the single important thing, the only thing that most
applications will rely on, is it returning successfully and indicating
that the file type is fifo. If your version of xv6 supports file
permissions and if pipes are one-way it may be worthwhile to indicate
which end of the pipe it is.

In the standard: the use of the size field is explicitly unspecified for
pipes - for any file type other than regular files, symbolic links, and
shared/typed memory objects. Other than that, it's clear from the
standard that it's intended to succeed and report a sensible file type
for non-filesystem objects like pipes, shared memory objects, and
sockets. However, there's no discussion of what, if anything, belongs in
the dev/inode*, permissions, nlink, and timestamps.

On Linux: st_dev is a device number specific to pipes and st_ino is a
unique inode number. I haven't tested the timestamps thoroughly (my test
only covered instantaneously opening and statting a pipe), but they are
valid timestamps rather than being 0 or -1 or some garbage value.
st_uid/gid are [probably, haven't tested complicated cases] the user
that created it, st_nlink is 1, and the permissions are set to
[user-only] the read or write mode the pipe is opened in.

*Though, the standard's light on the meaning of device identifiers in
the first place, and what it does say could easily be read as demanding
a unique device/inode pair regardless of the nonexistence of a physical
device, which naturally leads to the solution that I observed on Linux
and that someone else mentioned on OSX.
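
A small user-space check of the points above can be sketched in C. This is
only an illustration: report_pipe_end and pipe_ends_are_fifo are made-up
names, and only pipe, fstat, close and S_ISFIFO are standard POSIX. The
printed device, inode, link count and mode values are whatever the local
system chooses; the only thing the sketch treats as dependable is that
fstat succeeds and reports a FIFO.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: print the fields discussed above for one pipe end. */
void report_pipe_end(const char *label, const struct stat *st)
{
    printf("%s: dev=%llu ino=%llu nlink=%llu mode=%o\n",
           label,
           (unsigned long long)st->st_dev,
           (unsigned long long)st->st_ino,
           (unsigned long long)st->st_nlink,
           (unsigned)(st->st_mode & 07777));
}

/* Returns 1 if fstat succeeds on both ends and reports a FIFO,
 * 0 if it does not, -1 if the pipe could not be created. */
int pipe_ends_are_fifo(void)
{
    int fds[2];
    struct stat rd, wr;
    int ok;

    if (pipe(fds) != 0)
        return -1;

    ok = fstat(fds[0], &rd) == 0 && fstat(fds[1], &wr) == 0 &&
         S_ISFIFO(rd.st_mode) && S_ISFIFO(wr.st_mode);
    if (ok) {
        report_pipe_end("read end ", &rd);
        report_pipe_end("write end", &wr);
    }

    close(fds[0]);
    close(fds[1]);
    return ok ? 1 : 0;
}
```

Comparing the two printed lines shows whether a system distinguishes the
read end from the write end in the permission bits.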



* [TUHS] fstat(2) on pipes?
@ 2016-08-15 14:04 Noel Chiappa
  2016-08-15 15:14 ` Random832
  0 siblings, 1 reply; 10+ messages in thread
From: Noel Chiappa @ 2016-08-15 14:04 UTC


    > From: Warren Toomey

    > xv6 is a Unix-like OS written for teaching purposes. 

I'm fairly well-aware of Xv6; I too am planning to use it in a project.

But back to the original topic, it sounds like there's a huge amount of
variance in the semantics of doing fstat() on a pipe. V6 doesn't special-case
it in any way, but it sounds as if other systems do.

What V6 does (to complete the list) is grow the temporary file being used to
buffer the pipe contents up to a certain maximum size, whereupon it halts the
writer, and waits for the reader to catch up - at which point it truncates
the file, and adjusts the read and write pointers back to 0. So fstat() on
V6, which doesn't special-case pipes in any way for fstat(), apparently
returns 'waiting_to_be_read' plus 'already_read'.
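
The V6 behaviour just described can be modelled with a toy C sketch. This is
not the V6 kernel code: the struct and function names are invented for
illustration, the 4096-byte cap is taken from the "4KB" figure mentioned
elsewhere in this thread, and resetting as soon as the reader catches up is
an assumption about when the truncation happens. A real writer would sleep
when the file is full; here the write is simply cut short.

```c
#include <assert.h>
#include <stddef.h>

#define PIPSIZ_SKETCH 4096   /* assumed cap, from the "4KB" quoted in this thread */

/* Hypothetical model of a pipe backed by a temporary file. */
struct v6pipe {
    size_t i_size;   /* length of the temporary file, i.e. the write pointer */
    size_t rd_off;   /* read pointer */
};

/* Append up to n bytes; returns how many fit before the cap is reached. */
size_t v6pipe_write(struct v6pipe *p, size_t n)
{
    size_t room = PIPSIZ_SKETCH - p->i_size;

    if (n > room)
        n = room;
    p->i_size += n;
    return n;
}

/* Consume up to n bytes; once the reader has caught up, truncate the
 * file and move both pointers back to 0. */
size_t v6pipe_read(struct v6pipe *p, size_t n)
{
    size_t avail;

    assert(p->rd_off <= p->i_size);
    avail = p->i_size - p->rd_off;
    if (n > avail)
        n = avail;
    p->rd_off += n;
    if (p->rd_off == p->i_size) {
        p->i_size = 0;
        p->rd_off = 0;
    }
    return n;
}

/* What an fstat that does not special-case pipes would report: the file
 * length, i.e. 'waiting_to_be_read' plus 'already_read'. */
size_t v6pipe_fstat_size(const struct v6pipe *p)
{
    return p->i_size;
}
```

In this model, writing 100 bytes and reading 40 still leaves the reported
size at 100; it only drops to 0 once the remaining 60 have been read.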


    >>> xv6 has fstat() but returns an error if the file descriptor isn't
    >>> associated with an i-node.

    >> ?? All pipe file descriptors should have an inode?

To answer my own question, after a quick look at the Xv6 sources (on my
desktop ;-); it turns out that Xv6 handles pipes completely differently;
instead of borrowing an inode, they have special 'pipe' structures.  Hence the
error return in fstat() on Xv6. (That difference also limits the amount of
buffered data in a pipe to 512 bytes. So don't expect high throughput from a
pipe on Xv6! :-)

So I guess you get to pick which semantics you want fstat() on a pipe to have
there: V6's, V7's (see below), or something else! :-)


    > 7th Ed seems to return the amount of free space in the pipe, if I read
    > the code correctly:

I'm not sure of that (see below), but I think it would make more sense to
return the amount of un-read data (which is what I think it does do), as the
closest semantics to fstat() on a file.

It might also make sense to return the amount of free space (to a writer), and
the amount of data available to read (to a reader), since those are the
numbers users will care about. (Although then fstat() on the write side of a
pipe will have semantics which are inconsistent with fstat() on files. And if
the user code knows the maximum amount of buffering in a pipe, it could work
out the available write space from that, and the amount currently un-read.)

    > fstat()
    > {
    >    ...
    >    /* Call stat1() with the current offset in the pipe */
    >   stat1(fp->f_inode, uap->sb, fp->f_flag&FPIPE? fp->f_un.f_offset: 0);
    > }
    > stat1()
    > {
    >   ...
    >    ds.st_size = ip->i_size - pipeadj;

I'm too lazy to go read the code (even though I already have it :-), but V7
seems to usually be very similar to V6. So, what I suspect this code does is
pass the expression:

  ((fp->f_flag & FPIPE) ? fp->f_un.f_offset : 0)

as 'pipeadj' (to account for the amount that's already been read), and then
returns (ip->i_size - pipeadj), i.e. the amount remaining un-read, as the
size.
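
That reading of the arithmetic can be written out as a stand-alone C sketch.
It is not the V7 source: the two structs are cut-down stand-ins invented for
illustration, FPIPE_SKETCH is a placeholder flag bit, and the function
collapses the fstat()/stat1() pair quoted above into one step.

```c
#include <assert.h>

#define FPIPE_SKETCH 04   /* placeholder flag bit, for illustration only */

/* Hypothetical cut-down stand-ins for the kernel's file and inode structs. */
struct file_sketch {
    int  f_flag;     /* open-file flags */
    long f_offset;   /* for a pipe: how much has already been read */
};

struct inode_sketch {
    long i_size;     /* total bytes written into the backing file */
};

/* Size as the quoted code would compute it: for a pipe, subtract the
 * already-read offset, leaving the amount still un-read; for anything
 * else, the adjustment is 0 and the plain file size is returned. */
long stat_size_sketch(const struct file_sketch *fp,
                      const struct inode_sketch *ip)
{
    long pipeadj = (fp->f_flag & FPIPE_SKETCH) ? fp->f_offset : 0;

    assert(pipeadj <= ip->i_size);
    return ip->i_size - pipeadj;
}
```

With 100 bytes written and 30 already read, the pipe case yields 70, which
is the un-read amount rather than the free space.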

	Noel



* [TUHS] fstat(2) on pipes?
  2016-08-15  0:27 Noel Chiappa
@ 2016-08-15  0:59 ` Warren Toomey
  0 siblings, 0 replies; 10+ messages in thread
From: Warren Toomey @ 2016-08-15  0:59 UTC


Warren wrote:
>     > xv6 has fstat() but returns an error if the file descriptor isn't
>     > associated with an i-node.
 
On Sun, Aug 14, 2016 at 08:27:11PM -0400, Noel Chiappa wrote:
> ?? All pipe file descriptors should have an inode?

xv6 is a Unix-like OS written for teaching purposes. I'm
making changes to give it a decent runtime environment. URLs:

https://pdos.csail.mit.edu/6.828/2014/xv6.html
https://github.com/DoctorWkt/xv6-freebsd

and it comes with its own Lions-style commentary:
https://pdos.csail.mit.edu/6.828/2014/xv6/book-rev8.pdf

Cheers, Warren



* [TUHS] fstat(2) on pipes?
  2016-08-15  0:41 ` Dave Horsfall
@ 2016-08-15  0:54   ` Warren Toomey
  0 siblings, 0 replies; 10+ messages in thread
From: Warren Toomey @ 2016-08-15  0:54 UTC


On Mon, Aug 15, 2016 at 10:41:02AM +1000, Dave Horsfall wrote:
> Probably not much use to you, but back in Ed6 I did modify it to return
> the amount of data in the pipe.

7th Ed seems to return the amount of free space in the pipe, if I read
the code correctly:

fstat()
{
   ...
   /* Call stat1() with the current offset in the pipe */
   stat1(fp->f_inode, uap->sb, fp->f_flag&FPIPE? fp->f_un.f_offset: 0);
}

stat1()
{
   ...
   ds.st_size = ip->i_size - pipeadj;
}

Cheers, Warren



* [TUHS] fstat(2) on pipes?
  2016-08-15  0:11 Warren Toomey
@ 2016-08-15  0:41 ` Dave Horsfall
  2016-08-15  0:54   ` Warren Toomey
  2016-08-15 17:53 ` Clem Cole
  1 sibling, 1 reply; 10+ messages in thread
From: Dave Horsfall @ 2016-08-15  0:41 UTC


On Mon, 15 Aug 2016, Warren Toomey wrote:

> All, sorry this is slightly off-topic. I'm trying to find out what 
> fstat(2) returns when the file descriptor is a pipe. The POSIX/Open 
> Group documentation doesn't really specify what should be returned. Does 
> anybody have any pointers?

I always thought it was undefined, but my Mac says:

BUGS
     Applying fstat to a socket (and thus to a pipe) returns a zero'd buffer,
     except for the blocksize field, and a unique device and inode number.

And my FreeBSD box is the same; I haven't checked my Penguins.

> P.S. Why? xv6 has fstat() but returns an error if the file descriptor 
> isn't associated with an i-node. I'm trying to work out if/how to fix 
> it.

Probably not much use to you, but back in Ed6 I did modify it to return
the amount of data in the pipe.

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] fstat(2) on pipes?
@ 2016-08-15  0:27 Noel Chiappa
  2016-08-15  0:59 ` Warren Toomey
  0 siblings, 1 reply; 10+ messages in thread
From: Noel Chiappa @ 2016-08-15  0:27 UTC


    > From: Warren Toomey

    > I'm trying to find out what fstat(2) returns when the file descriptor
    > is a pipe.

In V6, it returns information about the file (inode) used as a temporary
storage area for data which has been written into the pipe, but not yet read;
i.e. it's an un-named file with a length which varies between 0 and 4KB.

    > xv6 has fstat() but returns an error if the file descriptor isn't
    > associated with an i-node.

?? All pipe file descriptors should have an inode?

	Noel



* [TUHS] fstat(2) on pipes?
@ 2016-08-15  0:11 Warren Toomey
  2016-08-15  0:41 ` Dave Horsfall
  2016-08-15 17:53 ` Clem Cole
  0 siblings, 2 replies; 10+ messages in thread
From: Warren Toomey @ 2016-08-15  0:11 UTC


All, sorry this is slightly off-topic. I'm trying to
find out what fstat(2) returns when the file descriptor
is a pipe. The POSIX/Open Group documentation doesn't
really specify what should be returned. Does anybody have
any pointers?

Thanks, Warren

P.S. Why? xv6 has fstat() but returns an error if the
file descriptor isn't associated with an i-node. I'm
trying to work out if/how to fix it.


