The Unix Heritage Society mailing list
* [TUHS] Etymology of the open file table?
@ 2016-03-22  2:07 Dan Cross
  2016-03-22  2:23 ` Marc Rochkind
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Cross @ 2016-03-22  2:07 UTC (permalink / raw)


This came up today at work; what's the origin of the open file table? The
suggestion was made that, instead, a ref-counted data structure could be
allocated at open() time to serve the same purpose, and that a table of
open files was superfluous. My guess was that this made it (relatively)
easy to look up what files referred to a particular device?

* [TUHS] Etymology of the open file table?
  2016-03-22  2:07 [TUHS] Etymology of the open file table? Dan Cross
@ 2016-03-22  2:23 ` Marc Rochkind
  2016-03-22  2:28   ` Larry McVoy
  2016-03-22  3:02   ` Dan Cross
  0 siblings, 2 replies; 10+ messages in thread
From: Marc Rochkind @ 2016-03-22  2:23 UTC (permalink / raw)


A ref-counted data structure organized how, for what language? Integers are
really easy to work with.

(Perhaps I misunderstood your post.)

On Mon, Mar 21, 2016 at 8:07 PM, Dan Cross <crossd at gmail.com> wrote:

> This came up today at work; what's the origin of the open file table? The
> suggestion was made that, instead, a ref-counted data structure could be
> allocated at open() time to serve the same purpose, and that a table of
> open files was superfluous. My guess was that this made it (relatively)
> easy to look up what files referred to a particular device?
>

* [TUHS] Etymology of the open file table?
  2016-03-22  2:23 ` Marc Rochkind
@ 2016-03-22  2:28   ` Larry McVoy
  2016-03-22  2:56     ` Warren Toomey
  2016-03-22  3:02     ` Dan Cross
  2016-03-22  3:02   ` Dan Cross
  1 sibling, 2 replies; 10+ messages in thread
From: Larry McVoy @ 2016-03-22  2:28 UTC (permalink / raw)


So if you think about it you need two levels, the fd that is per open
that knows the offset, and the fd to the object in question.  The file
table is the latter.

On Mon, Mar 21, 2016 at 08:23:43PM -0600, Marc Rochkind wrote:
> A ref-counted data structure organized how, for what language? Integers are
> really easy to work with.
> 
> (Perhaps I misunderstood your post.)
> 
> On Mon, Mar 21, 2016 at 8:07 PM, Dan Cross <crossd at gmail.com> wrote:
> 
> > This came up today at work; what's the origin of the open file table? The
> > suggestion was made that, instead, a ref-counted data structure could be
> > allocated at open() time to serve the same purpose, and that a table of
> > open files was superfluous. My guess was that this made it (relatively)
> > easy to look up what files referred to a particular device?
> >

-- 
---
Larry McVoy            	     lm at mcvoy.com             http://www.mcvoy.com/lm 



* [TUHS] Etymology of the open file table?
  2016-03-22  2:28   ` Larry McVoy
@ 2016-03-22  2:56     ` Warren Toomey
  2016-03-22  3:02     ` Dan Cross
  1 sibling, 0 replies; 10+ messages in thread
From: Warren Toomey @ 2016-03-22  2:56 UTC (permalink / raw)


On Mon, Mar 21, 2016 at 07:28:08PM -0700, Larry McVoy wrote:
> So if you think about it you need two levels, the fd that is per open
> that knows the offset, and the fd to the object in question.  The file
> table is the latter.

Dennis notes this issue in his Evolution paper:
https://www.bell-labs.com/usr/dmr/www/hist.html

just above the IO Redirection heading.

Cheers, Warren



* [TUHS] Etymology of the open file table?
  2016-03-22  2:23 ` Marc Rochkind
  2016-03-22  2:28   ` Larry McVoy
@ 2016-03-22  3:02   ` Dan Cross
  2016-03-22  4:00     ` John Cowan
  1 sibling, 1 reply; 10+ messages in thread
From: Dan Cross @ 2016-03-22  3:02 UTC (permalink / raw)


On Mon, Mar 21, 2016 at 10:23 PM, Marc Rochkind <rochkind at basepath.com>
wrote:

> A ref-counted data structure organized how, for what language? Integers
> are really easy to work with.
>
> (Perhaps I misunderstood your post.)
>

Sorry, let me try and clarify.

As I understand things: At the process level there exists an array of
pointers to file structures indexed by file descriptor; a file descriptor
is thus in some senses a per-process proxy for a richer data structure.
Those file structures are collected into a single, global table. The
question is why this latter table? One could rather imagine an
implementation where open() allocates (e.g., via malloc()) a new 'struct
file' that contains as a structure field an 'int refcnt' that is
incremented when a descriptor is dup()'d or as a side-effect of a fork(),
and is decremented as a result of a close(); when 'refcnt' drops to zero,
the structure could be freed with e.g. 'mfree'. What is the benefit of
'struct file file[];'?
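
The alternative described above can be sketched roughly as follows. This is a user-space illustration only, not code from any Unix kernel: the sketch_* function names, the NOFILE value and the 'offset' field are assumptions, and the C library's calloc()/free() stand in for whatever allocator a kernel would use.

```c
#include <assert.h>
#include <stdlib.h>

#define NOFILE 20   /* per-process descriptor slots; value is illustrative */

struct file {
    int  refcnt;    /* number of descriptors referring to this open file */
    long offset;    /* shared read/write position (assumed field) */
};

struct proc {
    struct file *ofile[NOFILE];   /* indexed by file descriptor */
};

/* Return the lowest free descriptor slot, or -1 if all are in use. */
static int fd_alloc(const struct proc *p)
{
    for (int fd = 0; fd < NOFILE; fd++)
        if (p->ofile[fd] == NULL)
            return fd;
    return -1;
}

/* Allocate a fresh 'struct file' on the heap; no global table involved. */
int sketch_open(struct proc *p)
{
    int fd = fd_alloc(p);
    if (fd < 0)
        return -1;
    struct file *fp = calloc(1, sizeof *fp);
    if (fp == NULL)
        return -1;
    fp->refcnt = 1;
    p->ofile[fd] = fp;
    return fd;
}

/* A second descriptor shares the same structure and bumps the count. */
int sketch_dup(struct proc *p, int fd)
{
    if (fd < 0 || fd >= NOFILE || p->ofile[fd] == NULL)
        return -1;
    int nfd = fd_alloc(p);
    if (nfd < 0)
        return -1;
    p->ofile[nfd] = p->ofile[fd];
    p->ofile[nfd]->refcnt++;
    return nfd;
}

/* Drop one reference; free the structure when the last one goes away. */
int sketch_close(struct proc *p, int fd)
{
    if (fd < 0 || fd >= NOFILE || p->ofile[fd] == NULL)
        return -1;
    struct file *fp = p->ofile[fd];
    p->ofile[fd] = NULL;
    assert(fp->refcnt > 0);
    if (--fp->refcnt == 0)
        free(fp);
    return 0;
}
```

A fork() would walk the parent's ofile[] array and increment each refcnt in the same way sketch_dup() does for a single slot.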

To give a concrete example, consider 7th Edition Unix. sys/h/file.h
contains the definition of 'struct file', which already includes 'char
f_count' which is documented as a 'reference count.' This is incremented as
the result of fork() (really, in newproc() in sys/sys/slp.c) and dup()
(sys/sys/sys3.c), or when a 'struct file' is allocated (sys/sys/fio.c).
It's decremented when a file is closed; the ref count is also used to
handle releasing inodes and so forth in closef() (sys/sys/fio.c); there's
some minor use in the pipe code. But falloc() always iterates over the
global 'file' (declared as 'extern struct file file[];' in sys/h/file.h,
defined in the generated output of the 'mkconf' command in sys/conf; e.g.
sys/conf/c.c).
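
For comparison, the table-scanning pattern looks roughly like this. It is a simplified sketch of the idea, not the 7th Edition source: the *_sketch function names, the NFILE value and the f_offset field are assumptions, and the real code also deals with inodes, flags and the per-process descriptor array.

```c
#include <assert.h>
#include <stddef.h>

#define NFILE 8     /* table size; chosen for illustration only */

struct file {
    char f_count;   /* reference count; 0 means the slot is free */
    long f_offset;  /* read/write position (assumed field name) */
};

/* Statically allocated and zero-initialized: every slot starts free. */
struct file file[NFILE];

/* Linear scan for a free slot; returns NULL when the table is full. */
struct file *falloc_sketch(void)
{
    for (struct file *fp = &file[0]; fp < &file[NFILE]; fp++) {
        if (fp->f_count == 0) {
            fp->f_count = 1;
            fp->f_offset = 0;
            return fp;
        }
    }
    return NULL;
}

/* Drop one reference; a count of 0 makes the slot reusable. */
void closef_sketch(struct file *fp)
{
    assert(fp != NULL);
    if (fp->f_count > 0)
        fp->f_count--;
}
```

There is no allocator and nothing to free: releasing a slot is just the count reaching zero, and the table size bounds the number of open files system-wide.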

The question is, why the global table named 'file'? Sure, it naturally
bounds the total number of open files; is that the primary reason? Was it
just expedient? Were there any other uses that made a global array
particularly attractive as a design approach? I suppose the same question
could be asked about the proc table, buffer structs, etc.

        - Dan C.


* [TUHS] Etymology of the open file table?
  2016-03-22  2:28   ` Larry McVoy
  2016-03-22  2:56     ` Warren Toomey
@ 2016-03-22  3:02     ` Dan Cross
  1 sibling, 0 replies; 10+ messages in thread
From: Dan Cross @ 2016-03-22  3:02 UTC (permalink / raw)


On Mon, Mar 21, 2016 at 10:28 PM, Larry McVoy <lm at mcvoy.com> wrote:

> So if you think about it you need two levels, the fd that is per open
> that knows the offset, and the fd to the object in question.  The file
> table is the latter.
>

Sure, but why does the second thing necessarily have to be a table? Does it?

        - Dan C.


* [TUHS] Etymology of the open file table?
  2016-03-22  3:02   ` Dan Cross
@ 2016-03-22  4:00     ` John Cowan
  2016-03-22  4:11       ` Warner Losh
  0 siblings, 1 reply; 10+ messages in thread
From: John Cowan @ 2016-03-22  4:00 UTC (permalink / raw)


Dan Cross scripsit:

> Those file structures are collected into a single, global table. The
> question is why this latter table? One could rather imagine an
> implementation where open() allocates (e.g., via malloc()) a new 'struct
> file' that contains as a structure field an 'int refcnt' that is
> incremented when a descriptor is dup()'d or as a side-effect of a fork(),
> and is decremented as a result of a close(); when 'refcnt' drops to zero,
> the structure could be freed with e.g. 'mfree'. What is the benefit of
> 'struct file file[];'?

Sure you could, but it would be more complex, slower, and less robust.
"When in doubt, use brute force."  --ken

-- 
John Cowan          http://www.ccil.org/~cowan        cowan at ccil.org
He made the Legislature meet at one-horse tank-towns out in the alfalfa
belt, so that hardly nobody could get there and most of the leaders
would stay home and let him go to work and do things as he pleased.
    --H.L. Mencken's translation of the Declaration of Independence



* [TUHS] Etymology of the open file table?
  2016-03-22  4:00     ` John Cowan
@ 2016-03-22  4:11       ` Warner Losh
  2016-03-23 19:48         ` Dan Cross
  0 siblings, 1 reply; 10+ messages in thread
From: Warner Losh @ 2016-03-22  4:11 UTC (permalink / raw)



> On Mar 21, 2016, at 10:00 PM, John Cowan <cowan at mercury.ccil.org> wrote:
> 
> Dan Cross scripsit:
> 
>> Those file structures are collected into a single, global table. The
>> question is why this latter table? One could rather imagine an
>> implementation where open() allocates (e.g., via malloc()) a new 'struct
>> file' that contains as a structure field an 'int refcnt' that is
>> incremented when a descriptor is dup()'d or as a side-effect of a fork(),
>> and is decremented as a result of a close(); when 'refcnt' drops to zero,
>> the structure could be freed with e.g. 'mfree'. What is the benefit of
>> 'struct file file[];'?
> 
> Sure you could, but it would be more complex, slower, and less robust.
> "When in doubt, use brute force."  --ken

And hard-coded limits, like the filesystem table, were all over the
place in early OSes, mostly to cope with memory sharing on tiny
RAM systems where it was better to just statically allocate things
at compile time. This made the code simpler (and smaller) which
made it both faster and allowed one to pack more functionality into
the system. It was rare that you’d have so much memory you could
take advantage of dynamic allocation. If you used all your file descriptors
that were statically compiled into the kernel, chances are you wouldn’t
have the address space to hold enough RAM to source and sink
data to the files in question, nor deal with the connections between
the file table and the file system.

Dynamic allocation, and moving away from static limits, only came
about later, as memory sizes grew. It was this dynamic that made
Ken’s advice such a win in the hardware of the day.

Warner

* [TUHS] Etymology of the open file table?
  2016-03-22  4:11       ` Warner Losh
@ 2016-03-23 19:48         ` Dan Cross
  2016-03-23 20:17           ` John Cowan
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Cross @ 2016-03-23 19:48 UTC (permalink / raw)



Thanks, all. I kind of figured it was something like that....

On Tue, Mar 22, 2016 at 12:11 AM, Warner Losh <imp at bsdimp.com> wrote:

>
> > On Mar 21, 2016, at 10:00 PM, John Cowan <cowan at mercury.ccil.org> wrote:
> >
> > Dan Cross scripsit:
> >
> >> Those file structures are collected into a single, global table. The
> >> question is why this latter table? One could rather imagine an
> >> implementation where open() allocates (e.g., via malloc()) a new 'struct
> >> file' that contains as a structure field an 'int refcnt' that is
> >> incremented when a descriptor is dup()'d or as a side-effect of a
> fork(),
> >> and is decremented as a result of a close(); when 'refcnt' drops to
> zero,
> >> the structure could be freed with e.g. 'mfree'. What is the benefit of
> >> 'struct file file[];'?
> >
> > Sure you could, but it would be more complex, slower, and less robust.
> > "When in doubt, use brute force."  --ken
>
> And hard-coded limits, like the filesystem table, were all over the
> place in early OSes, mostly to cope with memory sharing on tiny
> RAM systems where it was better to just statically allocate things
> at compile time. This made the code simpler (and smaller) which
> made it both faster and allowed one to pack more functionality into
> the system. It was rare that you’d have so much memory you could
> take advantage of dynamic allocation. If you used all your file descriptors
> that were statically compiled into the kernel, chances are you wouldn’t
> have the address space to hold enough RAM to source and sink
> data to the files in question, nor deal with the connections between
> the file table and the file system.
>
> Dynamic allocation, and moving away from static limits, only came
> about later, as memory sizes grew. It was this dynamic that made
> Ken’s advice such a win in the hardware of the day.
>
> Warner
>

* [TUHS] Etymology of the open file table?
  2016-03-23 19:48         ` Dan Cross
@ 2016-03-23 20:17           ` John Cowan
  0 siblings, 0 replies; 10+ messages in thread
From: John Cowan @ 2016-03-23 20:17 UTC (permalink / raw)


Dan Cross scripsit:

> Thanks, all. I kind of figured it was something like that....

For general information on CTSS, the grandparent of Unix, see
<http://www.multicians.org/thvv/7094.html>.
Unfortunately, all it says about RUNCOM is:

    Louis Pouzin also invented RUNCOM for CTSS. This facility,
    the direct ancestor of the Unix shell script, allowed users
    to create a file-system file of commands to be executed, with
    parameter substitution. Louis also produced a design for the
    Multics shell, ancestor of the Unix shell.

That's a great site for everything Multics-related, and has a lot of
ancestral stuff that we've mostly heard about from the Bell Labs side.

-- 
John Cowan          http://www.ccil.org/~cowan        cowan at ccil.org
I am he that buries his friends alive and drowns them and draws them
alive again from the water. I came from the end of a bag, but no bag
went over me.  I am the friend of bears and the guest of eagles. I am
Ringwinner and Luckwearer; and I am Barrel-rider.  --Bilbo to Smaug

