9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] ls question
@ 2004-03-18 21:40 David Tolpin
  2004-03-18 21:58 ` Russ Cox
                   ` (2 more replies)
  0 siblings, 3 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-18 21:40 UTC (permalink / raw)
  To: 9fans


cpu% bind -b tmp tmp
cpu% ls tmp
sam.err
sam.err
cpu% ls tmp/sam.err
tmp/sam.err
cpu%

I understand why it is so, but is it how it should be?


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 21:40 [9fans] ls question David Tolpin
@ 2004-03-18 21:58 ` Russ Cox
  2004-03-18 22:05   ` Russ Cox
  2004-03-18 21:59 ` [9fans] ls question David Presotto
  2004-03-18 22:05 ` matt
  2 siblings, 1 reply; 96+ messages in thread
From: Russ Cox @ 2004-03-18 21:58 UTC (permalink / raw)
  To: 9fans

David Tolpin wrote:

>cpu% bind -b tmp tmp
>cpu% ls tmp
>sam.err
>sam.err
>cpu% ls tmp/sam.err
>tmp/sam.err
>cpu%
>
>I understand why it is so, but is it how it should be?
>  
>

Yes.  Try this.

mkdir a b c d
 >a/f
 >b/f
bind a c
bind -a b c
bind a d
bind -b b d
ls -ln c
ls -ln d
ls -l c/f
ls -l d/f

Russ



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 21:40 [9fans] ls question David Tolpin
  2004-03-18 21:58 ` Russ Cox
@ 2004-03-18 21:59 ` David Presotto
  2004-03-18 22:05 ` matt
  2 siblings, 0 replies; 96+ messages in thread
From: David Presotto @ 2004-03-18 21:59 UTC (permalink / raw)
  To: 9fans

yes


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 22:05 ` matt
@ 2004-03-18 22:04   ` David Tolpin
  2004-03-18 22:08   ` boyd, rounin
  1 sibling, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-18 22:04 UTC (permalink / raw)
  To: 9fans

> what did you expect it to be ?

I would expect it to be

cpu% bind -b tmp tmp
cpu% lc -l tmp
sam.err
sam.err
cpu% lc -l tmp/sam.*
sam.err
sam.err
cpu% lc -l tmp/sam.err
sam.err
sam.err


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 21:40 [9fans] ls question David Tolpin
  2004-03-18 21:58 ` Russ Cox
  2004-03-18 21:59 ` [9fans] ls question David Presotto
@ 2004-03-18 22:05 ` matt
  2004-03-18 22:04   ` David Tolpin
  2004-03-18 22:08   ` boyd, rounin
  2 siblings, 2 replies; 96+ messages in thread
From: matt @ 2004-03-18 22:05 UTC (permalink / raw)
  To: 9fans


what did you expect it to be ?

if you understand it, why the question?

ls -q will give you the qids too, to see if the files are identical

directories are just files so effectively ls is doing

for (directory in bindlist)
	cat $directory | format_directory_output
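
The loop above can be sketched in C. This is only a toy model, not the kernel's or ls's code: NameDir and unionread are hypothetical names, and a directory is reduced to an array of entry names. The point it shows is that reading a union directory concatenates the members' entries in bind order and suppresses nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a directory is just an array of entry names. */
typedef struct {
	const char **names;
	size_t n;
} NameDir;

/*
 * Read a union directory in this model: copy the entries of each member,
 * in bind order, into out. Duplicate names are not suppressed.
 * Returns the number of entries written (at most max).
 */
size_t
unionread(const NameDir *members, size_t nmembers, const char **out, size_t max)
{
	size_t i, j, k = 0;

	assert(members != NULL || nmembers == 0);
	for(i = 0; i < nmembers; i++)
		for(j = 0; j < members[i].n && k < max; j++)
			out[k++] = members[i].names[j];
	return k;
}
```

With tmp bound before itself, the union has two members that are the same directory, so unionread returns sam.err twice, as in the original transcript.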


m


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 21:58 ` Russ Cox
@ 2004-03-18 22:05   ` Russ Cox
  2004-03-18 23:00     ` [9fans] ls, rc question David Tolpin
  0 siblings, 1 reply; 96+ messages in thread
From: Russ Cox @ 2004-03-18 22:05 UTC (permalink / raw)
  To: 9fans


> ls -ln c
> ls -ln d


this was supposed to be ls -lnq.  sigh.




^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls question
  2004-03-18 22:05 ` matt
  2004-03-18 22:04   ` David Tolpin
@ 2004-03-18 22:08   ` boyd, rounin
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-18 22:08 UTC (permalink / raw)
  To: 9fans

la question



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-18 22:05   ` Russ Cox
@ 2004-03-18 23:00     ` David Tolpin
  2004-03-18 23:31       ` [9fans] dirread David Tolpin
  2004-03-19  3:41       ` [9fans] ls, rc question rsc
  0 siblings, 2 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-18 23:00 UTC (permalink / raw)
  To: 9fans


I am not asking why there are two file names in the directory listing. I understand it.

I think that the behaviours of rc and ls are confusing. I would expect either rc's globbing
to match only one path name, or ls to display all files.

man rc says 'A pattern is replaced by a list of arguments, one
for each path name matched'. Not for each directory entry. For
each path name. There is one path name /dev/user for two files.

cpu% ls -q /dev|grep user
(0000000000000015 0 00) /dev/user
(000000000000000c 0 00) /dev/user

Globbing expands this to two elements:

cpu% echo /dev/user*
/dev/user /dev/user

I think this is not what the manual says. 

Then,

cpu% ls -q /dev/user*
(0000000000000015 0 00) /dev/user
(0000000000000015 0 00) /dev/user

I understand why it is so. But I think that consistency is sacrificed
for simplicity. Either rc should do what the manual says and return
one match for /dev/user* (so that /dev/user* and /dev/user are the same
thing), or ls -q /dev/user should recursively list all directory
entries matching the path name.

David Tolpin


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] dirread
  2004-03-18 23:00     ` [9fans] ls, rc question David Tolpin
@ 2004-03-18 23:31       ` David Tolpin
  2004-03-18 23:49         ` ron minnich
                           ` (2 more replies)
  2004-03-19  3:41       ` [9fans] ls, rc question rsc
  1 sibling, 3 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-18 23:31 UTC (permalink / raw)
  To: 9fans


man 2 dirread

:...
:The data returned by a read(2) on a directory is a set of complete directory
:entries in a machine-independent format, exactly equivalent to the result of
:a stat(2) on each file or subdirectory in the directory.

But it is not so. stat(2) on each file in the directory, when
there are multiple files with identical names, will return
data for the first entry only.

What am I missing?

David


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] dirread
  2004-03-18 23:31       ` [9fans] dirread David Tolpin
@ 2004-03-18 23:49         ` ron minnich
  2004-03-19  0:14         ` boyd, rounin
  2004-03-19  3:38         ` rsc
  2 siblings, 0 replies; 96+ messages in thread
From: ron minnich @ 2004-03-18 23:49 UTC (permalink / raw)
  To: 9fans

On Fri, 19 Mar 2004, David Tolpin wrote:

> 
> man 2 dirread
> 
> :...
> :The data returned by a read(2) on a directory is a set of complete directory
> :entries in a machine-independent format, exactly equivalent to the result of
> :a stat(2) on each file or subdirectory in the directory.
> 
> But it is not so. stat(2) on each file in the directory, when
> there are multiple files with identical names, will return
> data for the first entry only.

Let's see how much more I can get wrong. 

Pretend you have a union of /tmp and /tmp2 all bound onto /tmp. /tmp and 
/tmp2 have a file a in them. 

You read the dir and you get the union. So that sort of makes sense. 
You'll see /tmp/a twice.

You stat the entry, and the pathname resolution rules apply, so even if 
there are multiple /tmp/a files, the stat of /tmp/a always resolves to 
"the one on top", thus you only get that stat info.
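
A minimal C sketch of that resolution rule, under loud assumptions: Entry, EntDir and unionstat are hypothetical names, the qid field only stands in for the qid path that ls -q prints, and the real kernel walks mount tables rather than arrays. It shows only that a lookup by name stops at the first member that has the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
	const char *name;
	unsigned long qid;	/* stands in for the qid path shown by ls -q */
} Entry;

typedef struct {
	const Entry *ents;
	size_t n;
} EntDir;

/*
 * Resolve name the way a stat by path behaves in this model:
 * members are searched in bind order and the first match wins.
 * Returns NULL if no member has the name.
 */
const Entry*
unionstat(const EntDir *members, size_t nmembers, const char *name)
{
	size_t i, j;

	assert(name != NULL);
	for(i = 0; i < nmembers; i++)
		for(j = 0; j < members[i].n; j++)
			if(strcmp(members[i].ents[j].name, name) == 0)
				return &members[i].ents[j];
	return NULL;
}
```

A directory read would report both entries named a, but unionstat on a only ever returns the one from the first member; the second stays in the listing and out of reach by name.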

ron



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] dirread
  2004-03-18 23:31       ` [9fans] dirread David Tolpin
  2004-03-18 23:49         ` ron minnich
@ 2004-03-19  0:14         ` boyd, rounin
  2004-03-19  3:38         ` rsc
  2 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  0:14 UTC (permalink / raw)
  To: 9fans

> What am I missing?

lots ...

http://plan9.bell-labs.com/sys/doc/lexnames.html



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] dirread
  2004-03-18 23:31       ` [9fans] dirread David Tolpin
  2004-03-18 23:49         ` ron minnich
  2004-03-19  0:14         ` boyd, rounin
@ 2004-03-19  3:38         ` rsc
  2 siblings, 0 replies; 96+ messages in thread
From: rsc @ 2004-03-19  3:38 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 216 bytes --]

the data for the file you cannot get at is
exactly equivalent to the result of stat(2)
if only you could get at it, except that your
name space is keeping you from doing that.
but it's still there.  ;-)

russ


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-18 23:00     ` [9fans] ls, rc question David Tolpin
  2004-03-18 23:31       ` [9fans] dirread David Tolpin
@ 2004-03-19  3:41       ` rsc
  2004-03-19  5:32         ` David Tolpin
  1 sibling, 1 reply; 96+ messages in thread
From: rsc @ 2004-03-19  3:41 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 355 bytes --]

if you want, you can think of this as a bug in read(2):
if a file is covered up, it should be completely hidden.
it turns out that actually implementing this is quite hard,
that it doesn't really break much (you've found about
all of it!) to leave it as is, and that it results in things
like /bin/ls giving interesting information about the union.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  3:41       ` [9fans] ls, rc question rsc
@ 2004-03-19  5:32         ` David Tolpin
  2004-03-19  5:45           ` boyd, rounin
                             ` (2 more replies)
  0 siblings, 3 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  5:32 UTC (permalink / raw)
  To: 9fans

> if you want, you can think of this as a bug in read(2):

In fact, I think both read(2) and stat(2) are OK.

The bug (or just inexact wording) is in the manual for read(2).

But then either globbing should work differently (and looking
at the code in rc, i think it is easy to make it work as with union);
or the manual for rc should be changed as well.

Besides, why is bind(,,MBEFORE|MAFTER) said to create a 'union'
then? It is very confusing too. It does not. It creates a list, 
and name-oriented calls ([a-z]*stat, open) are restricted to only retrieve 
the first element in the list. Other calls can still traverse the list.


> that it doesn't really break much (you've found about
> all of it!) to leave it as is, and that it results in things
> like /bin/ls giving interesting information about the union.

Do you think that rc globbing returning as many identical words
as there are directory entries with the same name is in any
way useful too?


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  5:32         ` David Tolpin
@ 2004-03-19  5:45           ` boyd, rounin
  2004-03-19  5:50           ` ron minnich
  2004-03-19  7:01           ` [9fans] ls, rc question Micah Stetson
  2 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  5:45 UTC (permalink / raw)
  To: 9fans

> From: "David Tolpin" <dvd@davidashen.net>

got a problem with simple graphs?



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  5:32         ` David Tolpin
  2004-03-19  5:45           ` boyd, rounin
@ 2004-03-19  5:50           ` ron minnich
  2004-03-19  6:45             ` boyd, rounin
  2004-03-19  9:07             ` Charles Forsyth
  2004-03-19  7:01           ` [9fans] ls, rc question Micah Stetson
  2 siblings, 2 replies; 96+ messages in thread
From: ron minnich @ 2004-03-19  5:50 UTC (permalink / raw)
  To: 9fans


Well, I guess it's confusing the way some union stuff works, but once you
start trying to do things with unions to make them less weird, it can get
ugly fast, see this from a bsd list:

"  Perhaps one situation would be, modify a lower-layer file so that the
modified file now exists in the upper layer, with the original still in
the lower layer.  Then try to delete the file. The upper-layer version
will be deleted, but we don't want the lower-layer version to now start
showing up, because we 'delete'd it.  We also don't want to delete the
lower-layer version, because it's a union filesystem, and that's one of
the features of a union mount -- the lower layer is not allowed to change
(much).  So, cover up the lower-layer version with 'whiteout' in the upper
layer so that it becomes not visible to the union-mount, but still exists
via non-union-mount access. "

Also see rm -W on bsd. 

Or how about this: "Running find(1) over a union tree has the side-effect 
of creating a tree of shadow directories in the upper layer."

BSD does do this: "In this example, /b is layered over an empty directory
/mnt, then /a is layered over the new view of /mnt.  Duplicate names are
suppressed so that only one occurrence of `x' (and also `.' and `..')  
will appear." (see
http://www.usenix.org/publications/library/proceedings/neworl/full_papers/mckusick.a)

Seems once you start to try to get smart there's no end to it.

I've had it put to me that union mounts would so hopelessly confuse users
that they can't be used. The first time I showed them to a quite smart CS
person she blanched a bit -- "how can anybody use that given you don't
really know what those two files are?" And here I was showing it off --
"see how you can build kernels outside the kernel tree". The only reaction
was: "yeah but having two pccpu files show up in ls is hopelessly
confusing. " I had to disagree :-)

The thing is, reads from a union directory do in fact give you the union. 
But what would you do for stat etc? I'm not sure you could do much
different from what has been done without really making the kernel pretty
dirty (see above).

ron



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  5:50           ` ron minnich
@ 2004-03-19  6:45             ` boyd, rounin
  2004-03-19  9:07             ` Charles Forsyth
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  6:45 UTC (permalink / raw)
  To: 9fans

yeah, you could get yourself into a real bind ...



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  5:32         ` David Tolpin
  2004-03-19  5:45           ` boyd, rounin
  2004-03-19  5:50           ` ron minnich
@ 2004-03-19  7:01           ` Micah Stetson
  2004-03-19  7:57             ` [9fans] ls, rc question -- proposed change to rc/glob.c David Tolpin
  2 siblings, 1 reply; 96+ messages in thread
From: Micah Stetson @ 2004-03-19  7:01 UTC (permalink / raw)
  To: 9fans

> Do you think that rc globbing returning as many identical words
> as there are directory entries with the same name is in any
> way useful too?

Of course!  Let's suppose you have a multipage document
that you want to print multiple copies of, but you want
all the copies of a single page to print together.  Well,
just split the document into one file per page and then
run 'bind -a . .' followed by 'lp *'.  How cool is that?
This example should make it clear that lp -c is a violation
of the tools approach.

Micah

(For the more serious among us, this is all 'tongue in cheek'.)



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  7:01           ` [9fans] ls, rc question Micah Stetson
@ 2004-03-19  7:57             ` David Tolpin
  2004-03-19  8:13               ` Rob Pike
  0 siblings, 1 reply; 96+ messages in thread
From: David Tolpin @ 2004-03-19  7:57 UTC (permalink / raw)
  To: 9fans

> > Do you think that rc globbing returning as many identical words
> > as there are directory entries with the same name is in any
> > way useful too?
>
> Of course!  Let's suppose you have a multipage document
> that you want to print multiple copies of, but you want
> all the copies of a single page to print together.  Well,

Since I don't have a printer, here is a proposed change to rc/glob.c

cpu% diff /sys/src/cmd/rc/glob.c glob.c
31c31
<       word *a;
---
>       word *a, *b;
37c37,47
<       for(a = left,n = 0;a!=right;a = a->next,n++) a->word = list[n];
---
>       for(a = left,n = 0;a!=right;n++){
>               if(a->next!=right && globcmp(list+n,list+n+1)==0){
>                       b = a->next;
>                       a->next = b->next;
>                       efree((char *)list[n]);
>                       efree((char *)b);
>               }else{
>                       a->word = list[n];
>                       a = a->next;
>               }
>       }

Since rc sorts the results of globbing expansion, it makes perfect
sense to keep only one occurrence of each path name. It brings
rc into line with the manual; besides, repeated names
cannot be used to access the files they are generated for, so
they should be omitted.
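
The idea behind the change, stripped of rc's data structures, can be sketched as follows. This is an illustration and not the patch itself: uniqsorted is a hypothetical helper that works on a plain array of strings instead of rc's word list, and it frees nothing. It relies on the same property the diff does: after sorting, equal names are adjacent, so one pass comparing neighbours is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collapse runs of equal strings in a sorted array, in place.
 * Only neighbours are compared, so the input must already be sorted.
 * Returns the new length.
 */
size_t
uniqsorted(const char **list, size_t n)
{
	size_t i, k;

	assert(list != NULL || n == 0);
	if(n == 0)
		return 0;
	k = 1;	/* list[0..k-1] holds the distinct names kept so far */
	for(i = 1; i < n; i++)
		if(strcmp(list[i], list[k-1]) != 0)
			list[k++] = list[i];
	return k;
}
```

Applied to the two-element expansion of /dev/user* shown earlier, uniqsorted returns 1 and leaves a single /dev/user in the list.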

I am not supplying it as a patch because I actually want to know
opinions on whether it makes sense.

David Tolpin



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  7:57             ` [9fans] ls, rc question -- proposed change to rc/glob.c David Tolpin
@ 2004-03-19  8:13               ` Rob Pike
  2004-03-19  8:18                 ` David Tolpin
                                   ` (3 more replies)
  0 siblings, 4 replies; 96+ messages in thread
From: Rob Pike @ 2004-03-19  8:13 UTC (permalink / raw)
  To: 9fans

> Since I don't have a printer, here is a proposed change to rc/glob.c

i don't like changing the shell to mask kernel behavior. it leads
to surprises.  consider unix shells that `fix' cd in a way that it
differs from the chdir system call.

i say leave it.  the current behavior doesn't break much and as
rsc says it can even be informative.

-rob



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:13               ` Rob Pike
@ 2004-03-19  8:18                 ` David Tolpin
  2004-03-19  8:24                   ` David Tolpin
  2004-03-19  8:27                   ` Rob Pike
  2004-03-19  8:31                 ` Richard Miller
                                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  8:18 UTC (permalink / raw)
  To: 9fans

> > Since I don't have a printer, here is a proposed change to rc/glob.c
>
> i don't like changing the shell to mask kernel behavior. it leads
> to surprises.  consider unix shells that `fix' cd in a way that it
> differs from the chdir system call.

1. There is no kernel behavior in glob.  It is just a pass over
a list, with results sorted after the pass.

2. The changes I propose make it conforming to the manual.

> i say leave it.  the current behavior doesn't break much and as
> rsc says it can even be informative.

It cannot. ls -q /dev/user* will not give information about 
two different user files. It will display the first /dev/user twice. 

There is a difference between

ls /dev |grep user

ls /dev/user*

ls /dev/user

While bringing two elements is informative in the first case (and I
do not propose to change it), in the second case it is not. It cannot
be used.

David


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:18                 ` David Tolpin
@ 2004-03-19  8:24                   ` David Tolpin
  2004-03-19  8:27                   ` Rob Pike
  1 sibling, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  8:24 UTC (permalink / raw)
  To: 9fans

If 'echo /dev/user*' returns a two-element list (if there are two files there),
then 'echo /dev/user' should return a two-element list too, if one cares
about kernel semantics in shell behavior.

If, however, it is just globbing, both should return one element.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:18                 ` David Tolpin
  2004-03-19  8:24                   ` David Tolpin
@ 2004-03-19  8:27                   ` Rob Pike
  2004-03-19  8:52                     ` David Tolpin
  2004-03-19  9:16                     ` Richard Miller
  1 sibling, 2 replies; 96+ messages in thread
From: Rob Pike @ 2004-03-19  8:27 UTC (permalink / raw)
  To: 9fans



> 1. There is no kernel behavior in glob.  It is just a pass over
> a list, with results sorted after the pass.

that's sophistry.  the data is generated by a read system call.
you're proposing making the shell differ from every other
program that reads directories.  that seems like a bad idea.

this is not a bug, it's a feature. i'd rather fix the manual than
add special code to applications.

-rob



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:13               ` Rob Pike
  2004-03-19  8:18                 ` David Tolpin
@ 2004-03-19  8:31                 ` Richard Miller
  2004-03-19  8:47                   ` Geoff Collyer
  2004-03-19  9:07                   ` Rob Pike
  2004-03-19  8:35                 ` boyd, rounin
  2004-03-19 14:19                 ` Russ Cox
  3 siblings, 2 replies; 96+ messages in thread
From: Richard Miller @ 2004-03-19  8:31 UTC (permalink / raw)
  To: 9fans

> i don't like changing the shell to mask kernel behavior. it leads
> to surprises.  consider unix shells that `fix' cd in a way that it
> differs from the chdir system call.

But filename pattern matching in the shell is already different from
just reading a directory in the kernel.  By saying 'ls /tmp/*.c'
instead of 'ls /tmp' you are asking for a selection of names.  The
question is how many times to select a name which matches more than
one file.

As David has pointed out, the rc manual at present unambiguously says
"A pattern is replaced by a list of arguments, one for each path
name matched" -- not "one for each file matched".  So if rc isn't
changed, the manual does need to be corrected.

But I would vote for changing rc.  Although I am (after some years
of experience) sufficiently used to unions that I don't find the
behaviour surprising, I do find it irritating: for example, when
doing 'grep XXX *.c' in a union directory produces duplicate output
for files which occur in more than one underlying directory.

-- Richard



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:13               ` Rob Pike
  2004-03-19  8:18                 ` David Tolpin
  2004-03-19  8:31                 ` Richard Miller
@ 2004-03-19  8:35                 ` boyd, rounin
  2004-03-19 14:19                 ` Russ Cox
  3 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  8:35 UTC (permalink / raw)
  To: 9fans

> i say leave it.  the current behavior doesn't break much and as
> rsc says it can even be informative.

yup, Leave it Alone -- Living Colour



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:31                 ` Richard Miller
@ 2004-03-19  8:47                   ` Geoff Collyer
  2004-03-19  9:07                   ` Rob Pike
  1 sibling, 0 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-19  8:47 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 395 bytes --]

But in the case of a union directory with duplicate names, potentially
representing distinct files, there are multiple names, not all of them
distinct, in the directory.  Since rc(1) doesn't say ``one for each
distinct [or unique] path name matched'', one could argue that neither
code nor manual needs change.

If it's really a problem,

	fn nodup {ls -d $*|uniq}
	pr `{nodup *} | lp


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:27                   ` Rob Pike
@ 2004-03-19  8:52                     ` David Tolpin
  2004-03-19  9:16                     ` Richard Miller
  1 sibling, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  8:52 UTC (permalink / raw)
  To: 9fans

> > 1. There is no kernel behavior in glob.  It is just a pass over
> > a list, with results sorted after the pass.
>
> that's sophistry.  the data is generated by a read system call.
> you're proposing making the shell differ from every other
> program that reads directories.  that seems like a bad idea.

This is wrong. 

The data is generated by glob/globsort. The 'read' system call (in the case
of Plan9) is buried deep in system-dependent implementations for
Unix, Plan9 and win32, each using native calls of the underlying
operating system.

The semantics of globbing is (correctly) defined in 
a system-independent manner,  stating that each globbing result
is delivered once.

>
> this is not a bug, it's a feature. i'd rather fix the manual than
> add special code to applications.

Suppose I want to edit glob.c in rc. I copy glob.c to $home/src/rc/.
and then delete the definition of globsort from the file.  (This is
what I actually do when I edit system sources).

cpu% mkdir $home/src/rc
cpu% cp /sys/src/cmd/rc/glob.c $home/src/rc/.
cpu% bind -bc $home/src/rc /sys/src/cmd/rc
cpu% cd /sys/src/cmd/rc
cpu% mk
cpu% sam glob.c # delete definition of globsort
cpu% grep '^globsort' *.c
glob.c:globsort(word *left, word *right)
cpu% grep -n '^globsort' glob.c
cpu%

This is a bug.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  5:50           ` ron minnich
  2004-03-19  6:45             ` boyd, rounin
@ 2004-03-19  9:07             ` Charles Forsyth
  2004-03-19  9:24               ` Richard Miller
  2004-03-19  9:38               ` [9fans] Bind, look, everything is duplicated David Tolpin
  1 sibling, 2 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19  9:07 UTC (permalink / raw)
  To: 9fans

>>The thing is, reads from a union directory do in fact give you the union. 

i think the argument is provoked by the assumption that it's a union of sets, preventing duplicates,
but the system happens to work with bags, not sets, where union can (must) allow duplicates.
i'd agree it's sometimes annoying but it's both explicable and understandable,
and (i'd say) not theoretically indefensible.   at a practical level, making the duplicates
visible happens to make it easy to illustrate (bind a directory to itself:
look, everything in it is duplicated).



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:31                 ` Richard Miller
  2004-03-19  8:47                   ` Geoff Collyer
@ 2004-03-19  9:07                   ` Rob Pike
  2004-03-19  9:34                     ` David Tolpin
                                       ` (2 more replies)
  1 sibling, 3 replies; 96+ messages in thread
From: Rob Pike @ 2004-03-19  9:07 UTC (permalink / raw)
  To: 9fans

are you planning to change the other shells too? what about every other program
that reads directories?  what does '*' really mean? you're making assumptions
about what's right and i don't think it's clear what's right and when things aren't
clear i say leave them alone until they clear up. after something like 15 years
they still haven't cleared up.

i'm not saying the behavior is right, but i'm not admitting it's wrong, either.
moreover, i can explain everything that's going on by applying a very simple
model that is clear from the documentation. that's worth a lot.

-rob



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:27                   ` Rob Pike
  2004-03-19  8:52                     ` David Tolpin
@ 2004-03-19  9:16                     ` Richard Miller
  2004-03-19  9:29                       ` boyd, rounin
  2004-03-19  9:41                       ` Geoff Collyer
  1 sibling, 2 replies; 96+ messages in thread
From: Richard Miller @ 2004-03-19  9:16 UTC (permalink / raw)
  To: 9fans

> you're proposing making the shell differ from every other
> program that reads directories.  that seems like a bad idea.

The shell is the only program which applies a regular expression
match to the directory entries as it reads them.  If this is a
bad idea, it has a respectable history:

"Putting this expansion mechanism into the shell has several advantages:
the code only appears once, so no space is wasted and commands in general
need take no special action; the algorithm is certain to be applied
uniformly."  (D.M.Ritchie, "Unix Time-Sharing System: A Retrospective",
BSTJ July-Aug 1978)



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:07             ` Charles Forsyth
@ 2004-03-19  9:24               ` Richard Miller
  2004-03-19  9:33                 ` boyd, rounin
  2004-03-19 10:03                 ` Charles Forsyth
  2004-03-19  9:38               ` [9fans] Bind, look, everything is duplicated David Tolpin
  1 sibling, 2 replies; 96+ messages in thread
From: Richard Miller @ 2004-03-19  9:24 UTC (permalink / raw)
  To: 9fans

>>>The thing is, reads from a union directory do in fact give you the union. 
> 
> i think the argument is provoked by the assumption that it's a union of sets, preventing duplicates,
> but the system happens to work with bags, not sets, where union can (must) allow duplicates.

Sorry, David is correct on this point.  Sets and bags are both
unordered, and union is therefore commutative.  But directories are
ordered -- the ordering is essential to resolve the ambiguity of
duplicate names.  So the accurate mathematical model is a sequence
(aka list), and bind is not union of sets or bags, but append of
sequences -- a non-commutative operator, hence the need to specify
MBEFORE or MAFTER.
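
The sequence model described above can be sketched in a few lines. This is a toy model in Python, not Plan 9 code: `bind_before`, `bind_after` and `lookup` are hypothetical names, and a directory is assumed to be an ordered list of (name, origin) pairs.

```python
# Toy model: a directory is an ordered list of (name, origin) entries.
# bind_before / bind_after / lookup are illustrative names, not Plan 9 APIs.

def bind_before(new, old):
    """Model of bind with MBEFORE: the new entries precede the old ones."""
    return list(new) + list(old)

def bind_after(new, old):
    """Model of bind with MAFTER: the new entries follow the old ones."""
    return list(old) + list(new)

def lookup(union, name):
    """Resolve a name to the first matching entry, as a walk would."""
    for entry in union:
        if entry[0] == name:
            return entry
    return None

a = [("f", "a")]
b = [("f", "b")]

c = bind_after(b, a)   # roughly: bind a c; bind -a b c
d = bind_before(b, a)  # roughly: bind a d; bind -b b d

# Both results hold two entries named f, but in a different order,
# so the same name resolves to a different file in each.
print(c, lookup(c, "f"))
print(d, lookup(d, "f"))
```

Because the two results differ only in order, the operation behaves like list concatenation rather than a set or bag union, which is the point being argued.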

-- Richard



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:16                     ` Richard Miller
@ 2004-03-19  9:29                       ` boyd, rounin
  2004-03-19  9:41                       ` Geoff Collyer
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  9:29 UTC (permalink / raw)
  To: 9fans

> From: "Richard Miller" <rm@hamnavoe.com>

Violation of protocol can result in permanent data loss -- RX02 manual




^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:24               ` Richard Miller
@ 2004-03-19  9:33                 ` boyd, rounin
  2004-03-19  9:39                   ` Richard Miller
  2004-03-19 10:11                   ` Richard Miller
  2004-03-19 10:03                 ` Charles Forsyth
  1 sibling, 2 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19  9:33 UTC (permalink / raw)
  To: 9fans

> Sorry, David is correct on this point.

BULLSHIT

hiding things will only wind up in nasty surprises, further down the track
...



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:07                   ` Rob Pike
@ 2004-03-19  9:34                     ` David Tolpin
  2004-03-19  9:52                     ` Scott Schwartz
  2004-03-19 14:13                     ` Russ Cox
  2 siblings, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  9:34 UTC (permalink / raw)
  To: 9fans

> are you planning to change the other shells too? 

Yes, for all that require it.  Are there other documented
shells for Plan9?

> what about every other program that reads directories?  

No, I do not. I am not proposing to change the behavior of dirread().

> what does '*' really mean? 

The meaning is defined in the manual for rc.

> you're making assumptions
> about what's right and i don't think it's clear what's right 

I just read the manual.

> i'm not saying the behavior is right, but i'm not admitting it's wrong, either.
> moreover, i can explain everything that's going on by applying a very simple
> model that is clear from the documentation. that's worth a lot.

For every problem there always exists a simple, easy to understand,
incorrect solution.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* [9fans] Bind, look, everything is duplicated
  2004-03-19  9:07             ` Charles Forsyth
  2004-03-19  9:24               ` Richard Miller
@ 2004-03-19  9:38               ` David Tolpin
  1 sibling, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19  9:38 UTC (permalink / raw)
  To: 9fans

> visible happens to make it easy to illustrate (bind a directory to itself:
> look, everything in it is duplicated).

What I am proposing does not change this behaviour. Bind a directory
to itself, look (ls), everything is still duplicated. 

I do not propose a change in the behaviour of read().

I do propose a change that fixes a bug in rc's globbing routine.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:33                 ` boyd, rounin
@ 2004-03-19  9:39                   ` Richard Miller
  2004-03-19  9:46                     ` Geoff Collyer
  2004-03-19 10:11                   ` Richard Miller
  1 sibling, 1 reply; 96+ messages in thread
From: Richard Miller @ 2004-03-19  9:39 UTC (permalink / raw)
  To: 9fans

> BULLSHIT

A soundly reasoned mathematical argument, very impressive.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:16                     ` Richard Miller
  2004-03-19  9:29                       ` boyd, rounin
@ 2004-03-19  9:41                       ` Geoff Collyer
  2004-03-19 10:09                         ` boyd, rounin
  2004-03-19 10:50                         ` Geoff Collyer
  1 sibling, 2 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-19  9:41 UTC (permalink / raw)
  To: 9fans

But dmr's comment is talking about wildcard expansion ("filename
generation" in sh-speak), not directory-reading.  And his argument
does make sense in that context.  At minimum, you'd need to modify rc
and ape/sh to suppress duplicates.

Programs other than the shells read directories without expanding
wildcards, and if not also changed would behave inconsistently with
the shells.  They are at least: acme, bitsy/keyboard, cron, du,
exportfs, faces, gzip, listen, mk, mkfs, mothra, replica programs,
srvold9p, tar, vac, and winwatch.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:39                   ` Richard Miller
@ 2004-03-19  9:46                     ` Geoff Collyer
  0 siblings, 0 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-19  9:46 UTC (permalink / raw)
  To: 9fans

Mathematics is a powerful tool, but not appropriate for solving all
problems.  I think that the things that made Unix and now Plan 9 most
appealing are matters of judgement and taste.  Obviously hard
technical work has also gone into those systems, but I don't believe
that that is enough to produce a compelling operating system.  I
wouldn't want to try to justify the creation or use of pipes or 9p or
namespaces or user-mode file servers solely using mathematics.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:07                   ` Rob Pike
  2004-03-19  9:34                     ` David Tolpin
@ 2004-03-19  9:52                     ` Scott Schwartz
  2004-03-19 14:42                       ` ron minnich
  2004-03-19 14:13                     ` Russ Cox
  2 siblings, 1 reply; 96+ messages in thread
From: Scott Schwartz @ 2004-03-19  9:52 UTC (permalink / raw)
  To: 9fans

> are you planning to change the other shells too?

I think having a standard glob routine that all shells (and other
programs!) used would be desirable.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:24               ` Richard Miller
  2004-03-19  9:33                 ` boyd, rounin
@ 2004-03-19 10:03                 ` Charles Forsyth
  1 sibling, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 10:03 UTC (permalink / raw)
  To: 9fans

>>duplicate names.  So the accurate mathematical model is a sequence
>>(aka list), and bind is not union of sets or bags, but append of

you're right.  i was trying too hard to justify the term `union' (or not hard enough,
depending on how you look at it).  i've seen it argued elsewhere that union
directories ought to have used another term, for much these reasons.
still, they seem to work whatever you call them.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:41                       ` Geoff Collyer
@ 2004-03-19 10:09                         ` boyd, rounin
  2004-03-19 10:50                         ` Geoff Collyer
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19 10:09 UTC (permalink / raw)
  To: 9fans

> Programs other than the shells read directories without expanding
> wildcards, and if not also changed would behave inconsistently with
> the shells.

roll on VMS ...



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19  9:33                 ` boyd, rounin
  2004-03-19  9:39                   ` Richard Miller
@ 2004-03-19 10:11                   ` Richard Miller
  2004-03-19 10:42                     ` Charles Forsyth
  1 sibling, 1 reply; 96+ messages in thread
From: Richard Miller @ 2004-03-19 10:11 UTC (permalink / raw)
  To: 9fans

I said:
> Sorry, David is correct on this point.

<boyd@insultant.net> said:
> BULLSHIT

By "this point", I meant the observation that bind(..MAFTER|MBEFORE)
doesn't produce a union in the mathematical sense, but a list.  If
someone finds a reference anywhere in the mathematical literature to a
union operator which is non-commutative, I will happily retract my claim.

<geoff@collyer.net> said:
> Mathematics is a powerful tool, but not appropriate for solving all
> problems. ...

Agreed.  ("A mathematician, a physicist, and an engineer checked into
a hotel one night ... " No, you've all heard that one already.)

But when we use mathematical metaphors to explain the behaviour of
engineering artefacts, choosing a not-quite-right metaphor can
cause confusion.  As in this case.

-- Richard



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question
  2004-03-19 10:11                   ` Richard Miller
@ 2004-03-19 10:42                     ` Charles Forsyth
  0 siblings, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 10:42 UTC (permalink / raw)
  To: 9fans

>>By "this point", I meant the observation that bind(..MAFTER|MBEFORE)
>>doesn't produce a union in the mathematical sense, but a list.  If
>>someone finds a reference anywhere in the mathematical literature to a
>>union operator which is non-commutative, I will happily retract my claim.

a non-commutative union does occur when working with certain constructions
on trees (funnily enough) [for much the same reason that addition and
multiplication aren't always commutative] but i haven't got a good reference.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:41                       ` Geoff Collyer
  2004-03-19 10:09                         ` boyd, rounin
@ 2004-03-19 10:50                         ` Geoff Collyer
  2004-03-19 11:12                           ` David Tolpin
  2004-03-22 22:56                           ` rog
  1 sibling, 2 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-19 10:50 UTC (permalink / raw)
  To: 9fans

I thought my list of directory-reading programs was too short; I
forgot to grep for dirreadall also (not having ls on the list should
have been a tip-off).  It should also include: aux/depend, diff, ftpd,
history, kbmap, ls, mkpaqfs, netstat, news, pptpd, ps, rm, scp.  Not
too surprising on a system built on file servers.

The problem isn't so much breaking things as the inconsistent
behaviour that would result from suppressing duplicates only in some
programs.  This would be especially confusing to new users.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 10:50                         ` Geoff Collyer
@ 2004-03-19 11:12                           ` David Tolpin
  2004-03-19 12:31                             ` Charles Forsyth
  2004-03-19 13:59                             ` David Presotto
  2004-03-22 22:56                           ` rog
  1 sibling, 2 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19 11:12 UTC (permalink / raw)
  To: 9fans

> From 9fans-admin@cse.psu.edu  Fri Mar 19 14:50:31 2004
> To: 9fans@cse.psu.edu
> Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
> From: Geoff Collyer <geoff@collyer.net>
> Content-Type: text/plain; charset="US-ASCII"
> Date: Fri, 19 Mar 2004 02:50:50 -0800
>
> I thought my list of directory-reading programs was too short; I
> forgot to grep for dirreadall also (not having ls on the list should
> have been a tip-off).  It should also include: aux/depend, diff, ftpd,
> history, kbmap, ls, mkpaqfs, netstat, news, pptpd, ps, rm, scp.  Not
> too surprising on a system built on file servers.

Which of these programs uses globbing?


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 11:12                           ` David Tolpin
@ 2004-03-19 12:31                             ` Charles Forsyth
  2004-03-19 12:53                               ` boyd, rounin
  2004-03-19 13:59                             ` David Presotto
  1 sibling, 1 reply; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 12:31 UTC (permalink / raw)
  To: 9fans

i'm neutral about the whole thing: i think it's fine as it is,
but wouldn't fret too much if glob changed.  as it happens, i discovered only last year
that roger changed Inferno's Filepat (rhymes with cow pat) four
years ago to return a result that had the names sorted and unique
(it just required invoking an option in a lower-level routine).
i probably wouldn't have done it myself (partly for reasons below), but
on a list of changes that would end the world, it's fairly low on my list,
and i doubt it would cause much more trouble in Plan 9.
in practice, though, i find it makes very little difference, and i use both systems of course.

one reason might be that (in my experience) union mounts aren't much used in the small,
so there aren't a lot of them, and those there are tend to be used
to set up a larger structure (eg, /bin, /dev, /net, service directories) or to provide limited,
judicious replacement of one thing by another.  thus, except when
initially experimenting with the mechanism on seeing the system for the first time,
or testing things after changing the implementation, i rarely interact with union
directories through the user interface or scripts to the extent that i need worry in practice
about whether pattern matching returns duplicates or not for lp (say).
thus, i don't really care too much about the duplication effect.

furthermore, if worrying about pattern matching returning duplicates for a directory,
i'd have thought in practice it might be equally troublesome for the use of lp (say) that
	lp a*
would no longer produce duplicates, which is splendid, but
	lp a* *a* q/a*.[ch] q/*.[ch]
would of course still produce duplicates in general (as a simple example to
illustrate the point, there are probably more compelling ones).
you could therefore make good use of a nodup function (as geoff suggested) in general anyhow.
(its surrounding `{} limits the scope of the nodup operation.)
thus, you could just as well use it for the special case of duplicate names
possibly produced by a union directory.
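
A minimal sketch of the nodup idea, in Python rather than rc. The names `expand` and `nodup` and the sample file names are assumptions for illustration only; the point is that overlapping patterns repeat words even when no union directory is involved.

```python
from fnmatch import fnmatchcase

def expand(patterns, names):
    """Expand each pattern against a list of names, concatenating results in order."""
    words = []
    for pat in patterns:
        words.extend(n for n in names if fnmatchcase(n, pat))
    return words

def nodup(words):
    """Drop repeated words, keeping the first occurrence and the original order."""
    seen = set()
    out = []
    for w in words:
        if w not in seen:
            seen.add(w)
            out.append(w)
    return out

# An ordinary directory with no duplicate names at all.
names = ["a.c", "ab.h", "ba"]

# Two overlapping patterns still yield repeated words.
words = expand(["a*", "*a*"], names)
print(words)         # ['a.c', 'ab.h', 'a.c', 'ab.h', 'ba']
print(nodup(words))  # ['a.c', 'ab.h', 'ba']
```

Applied to the whole argument list, the same filter would also cover names repeated by a union directory, without any change to the pattern matcher itself.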

i don't think union mount is particularly good at creating variant directory
structures for software development, because of the white-out and other problems
already mentioned; for that i suspect you want a different definition of union,
more like union of whole directory trees, and it seems both better and possible
to supply that through a user-level file server that can be precisely configured,
rather than putting too much load on the existing simple bind/union operations
in the kernel that are quite effective within their own scope.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 12:31                             ` Charles Forsyth
@ 2004-03-19 12:53                               ` boyd, rounin
  0 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19 12:53 UTC (permalink / raw)
  To: 9fans

cheating with metacharacters will get you into woe,
union mounts or not.  type ls first.

the amount of code required to fix this peculiar, but
extremely powerful, end case flies in the face of
good taste, good design and minimalism.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 11:12                           ` David Tolpin
  2004-03-19 12:31                             ` Charles Forsyth
@ 2004-03-19 13:59                             ` David Presotto
  2004-03-19 14:44                               ` David Tolpin
  2004-03-19 20:31                               ` Geoff Collyer
  1 sibling, 2 replies; 96+ messages in thread
From: David Presotto @ 2004-03-19 13:59 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 52 bytes --]

None, they just walk through the directory they get.

[-- Attachment #2: Type: message/rfc822, Size: 2733 bytes --]

From: David Tolpin <dvd@davidashen.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
Date: Fri, 19 Mar 2004 15:12:54 +0400 (AMT)
Message-ID: <200403191112.i2JBCsTK000660@adat.davidashen.net>

> From 9fans-admin@cse.psu.edu  Fri Mar 19 14:50:31 2004
> To: 9fans@cse.psu.edu
> Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
> From: Geoff Collyer <geoff@collyer.net>
> Content-Type: text/plain; charset="US-ASCII"
> Date: Fri, 19 Mar 2004 02:50:50 -0800
>
> I thought my list of directory-reading programs was too short; I
> forgot to grep for dirreadall also (not having ls on the list should
> have been a tip-off).  It should also include: aux/depend, diff, ftpd,
> history, kbmap, ls, mkpaqfs, netstat, news, pptpd, ps, rm, scp.  Not
> too surprising on a system built on file servers.

Which of these programs uses globbing?

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:07                   ` Rob Pike
  2004-03-19  9:34                     ` David Tolpin
  2004-03-19  9:52                     ` Scott Schwartz
@ 2004-03-19 14:13                     ` Russ Cox
  2004-03-19 14:37                       ` David Tolpin
  2 siblings, 1 reply; 96+ messages in thread
From: Russ Cox @ 2004-03-19 14:13 UTC (permalink / raw)
  To: 9fans

Rob Pike wrote:

> are you planning to change the other shells too? what about every other program
> that reads directories?  what does '*' really mean? you're making assumptions
> about what's right and i don't think it's clear what's right and when things aren't
> clear i say leave them alone until they clear up. after something like 15 years
> they still haven't cleared up.
>
> i'm not saying the behavior is right, but i'm not admitting it's wrong, either.
> moreover, i can explain everything that's going on by applying a very simple
> model that is clear from the documentation. that's worth a lot.


continuing this...

note how much more complicated the model gets if rc and ls
(or perhaps dirreadall) start removing duplicates from the list.
right now, it took you no time at all to find the existence of duplicates,
and then once you realize they exist, you notice them everywhere.

if the bulk of directory-processing programs remove duplicates
for themselves, then some day you run across a program that
doesn't (either out of neglect or because it doesn't fit the way the
program does directory processing).  now you know duplicates
exist again.  but nothing makes sense -- why haven't you noticed
before?  why do some programs show them and others not?
the explanation isn't simple or regular or predictable -- some do, some don't,
depending on their code.

i'd rather live in the first world, where all programs agree.
the alternate universe, where illusions come and go depending on
the program, where shortcomings of the kernel are fixed by hacking
around them in user space, it's a terrible way to live.  if you want
modern unix, you know where to find it.

the clearest example i can think of in modern unix where this
happens is the handling of .. by shells in order to make symlinks
look not like symlinks, while the kernel is still broken, as rob points
out in http://plan9.bell-labs.com/sys/doc/lexnames.pdf.  there are
plenty of others.

russ


p.s. amusingly enough, now we've filled in all the context necessary
to read boyd's first post in this thread, where he posted that url
and nothing else!






^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  8:13               ` Rob Pike
                                   ` (2 preceding siblings ...)
  2004-03-19  8:35                 ` boyd, rounin
@ 2004-03-19 14:19                 ` Russ Cox
  3 siblings, 0 replies; 96+ messages in thread
From: Russ Cox @ 2004-03-19 14:19 UTC (permalink / raw)
  To: 9fans

Rob Pike wrote:

>> Since I don't have a printer, here is a proposed change to rc/glob.c
>
>
> i don't like changing the shell to mask kernel behavior. it leads
> to surprises.  consider unix shells that `fix' cd in a way that it
> differs from the chdir system call.


<mutters about mail reader showing messages in wrong order...>




^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 14:13                     ` Russ Cox
@ 2004-03-19 14:37                       ` David Tolpin
  0 siblings, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19 14:37 UTC (permalink / raw)
  To: 9fans

>
> continuing this...
>
> note how much more complicated the model gets if rc and ls
> (or perhaps dirreadall) start removing duplicates from the list.
> right now, it took you no time at all to find the existence of duplicates,
> and then once you realize they exist, you notice them everywhere.

I didn't propose that ls remove duplicates from the list. ls should not.
rc should remove them from the list of globbing results. It is written in the manual.
>
> if the bulk of directory-processing programs remove duplicates

globbing is not directory processing.

> i'd rather live in the first world, where all programs agree.

echo x

and

echo x*

will bring different results for a directory with two entries
named x . The program does not agree with itself. Globbing has
little in common with directory reading.
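
The difference between the two commands can be modelled roughly as follows. This is a Python sketch, not rc's glob.c: `glob_words` is a hypothetical helper, and the listing is assumed to be what a read of a union directory with two entries named x returns.

```python
from fnmatch import fnmatchcase

def glob_words(word, listing):
    """Model of globbing one word against a directory listing.

    A word without metacharacters is passed through unchanged.
    A pattern expands to one word per matching entry, and a pattern
    that matches nothing stands for itself.
    """
    if not any(ch in word for ch in "*?["):
        return [word]
    matches = sorted(n for n in listing if fnmatchcase(n, word))
    return matches if matches else [word]

# A union directory that happens to contain two entries named x.
listing = ["x", "x", "y"]

print(glob_words("x", listing))   # ['x']
print(glob_words("x*", listing))  # ['x', 'x']
```

The literal word never consults the directory, while the pattern produces one word per entry read, so the two results differ whenever the listing holds repeated names.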

> out in http://plan9.bell-labs.com/sys/doc/lexnames.pdf.  there are
> plenty of others.
>
> p.s. amusingly enough, now we've filled in all the context necessary
> to read boyd's first post in this thread, where he posted that url
> and nothing else!

I had read that article, I do know the problem with symlinks,
and this problem is not related to the one I'm discussing.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19  9:52                     ` Scott Schwartz
@ 2004-03-19 14:42                       ` ron minnich
  2004-03-19 16:18                         ` 9nut
  2004-03-19 19:42                         ` boyd, rounin
  0 siblings, 2 replies; 96+ messages in thread
From: ron minnich @ 2004-03-19 14:42 UTC (permalink / raw)
  To: 9fans

On Fri, 19 Mar 2004, Scott Schwartz wrote:

> > are you planning to change the other shells too?
> 
> I think having a standard glob routine that all shells (and other
> programs!) used would be desirable.


maybe a tool! we could call it glob. The shell could call it so that we 
got standardized globbing.

I recommend putting it in /etc

ron
p.s. Yes, :-)



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 13:59                             ` David Presotto
@ 2004-03-19 14:44                               ` David Tolpin
  2004-03-19 17:57                                 ` Russ Cox
  2004-03-19 20:31                               ` Geoff Collyer
  1 sibling, 1 reply; 96+ messages in thread
From: David Tolpin @ 2004-03-19 14:44 UTC (permalink / raw)
  To: 9fans

>
> None, they just walk through the directory they get.

That's why I think those examples are not relevant. dirread semantics
stays intact. globbing semantics should correspond to its definition.

ls -l *.c

and 

ls -l

give different results now, without changes. The first form
will list some entries twice. The second form will list each
entry once, with some entries having the same name. 

I'm proposing to make the first form bring what it should. Namely,
to bring every entry once. How displaying the first occurrence of
a file as many times as there are entries with the same name makes
it similar or consistent with the second form, I really don't get.

They are confusing now. The cause of the confusion is a bug in globbing.
The bug should be fixed.
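
The two forms can be compared with a toy model. This is Python, not Plan 9 code: the entries, the owners, and the helper names `read_directory`, `resolve` and `list_by_name` are all made up for illustration, and a walk is assumed to reach only the first entry with a given name.

```python
# Toy model: a union directory is an ordered list of (name, owner) entries.
union = [("main.c", "alice"), ("main.c", "bob"), ("util.c", "bob")]

def read_directory(entries):
    """Like 'ls -l': report every entry that a directory read returns."""
    return list(entries)

def resolve(entries, name):
    """Like walking to a name: only the first entry with that name is reachable."""
    for entry in entries:
        if entry[0] == name:
            return entry
    raise FileNotFoundError(name)

def list_by_name(entries, names):
    """Like 'ls -l *.c': the shell passes names and each one is resolved separately."""
    return [resolve(entries, n) for n in names]

# What the pattern expands to: one word per entry read, duplicates included.
globbed = [name for name, _ in union if name.endswith(".c")]

print(read_directory(union))         # each entry once, two share a name
print(list_by_name(union, globbed))  # the first main.c appears twice
```

In this model the second call never shows bob's main.c at all, which is the mismatch between the two forms described above.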


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:18                         ` 9nut
@ 2004-03-19 15:34                           ` david presotto
  2004-03-19 15:43                             ` ron minnich
                                               ` (3 more replies)
  2004-03-19 15:53                           ` lucio
  1 sibling, 4 replies; 96+ messages in thread
From: david presotto @ 2004-03-19 15:34 UTC (permalink / raw)
  To: 9fans

I would be most disturbed if 'ls' and 'ls *' returned different numbers
of entries.

The fact that 'ls -l' and 'ls -l *' show different properties bothers me
less.  I see it as a failure of ls and not globbing.  However, requiring
ls to read the whole directory every time seems silly.

I wouldn't mind if the kernel threw out the unreachable entries when
you read a directory.  However, I don't think it's worth the effort or
the likely bugs.  This is what I would change, if anything, since it is
the source of the problem and 'fixing' it in rc just makes rc and globbing
in general odd.



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:34                           ` david presotto
@ 2004-03-19 15:43                             ` ron minnich
  2004-03-19 16:00                               ` Charles Forsyth
  2004-03-19 16:02                             ` Charles Forsyth
                                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 96+ messages in thread
From: ron minnich @ 2004-03-19 15:43 UTC (permalink / raw)
  To: 9fans

On Fri, 19 Mar 2004, david presotto wrote:

> I wouldn't mind if the kernel threw out the unreachable entries when you
> read a directory.  However, I don't think its worth the effort or the
> likely bugs.  This is what I would change, if anything since it is the
> source of the problem and 'fixing' it in rc just makes rc and globbing in
> general odd.

in the mode of lnfs, for people who care, make a uniquefs that enforces 
bsd-like behavior?

ron



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:18                         ` 9nut
  2004-03-19 15:34                           ` david presotto
@ 2004-03-19 15:53                           ` lucio
  2004-03-19 16:01                             ` Charles Forsyth
  2004-03-19 16:08                             ` andrey mirtchovski
  1 sibling, 2 replies; 96+ messages in thread
From: lucio @ 2004-03-19 15:53 UTC (permalink / raw)
  To: 9fans

> If you get this joke, you may qualify for a Senior discount at Denny's. ☺

MKS tools.  Unless there was prior art that I don't know about.

I have a lot to thank Mortice Kern for.

++L



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:43                             ` ron minnich
@ 2004-03-19 16:00                               ` Charles Forsyth
  0 siblings, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 16:00 UTC (permalink / raw)
  To: 9fans

>>in the mode of lnfs, for people who care, make a uniquefs that enforces 
>>bsd-like behavior?

that was what i was alluding to earlier, although the uniqueness didn't seem to me to be
the main thing, it's having the effect of `recursive bind' (taking unions over whole
trees), white-out, and similar things that would be useful when
compiling variant versions, including experimental ones,
in the style of ClearCase (but without the need for a dedicated administrator!).



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:53                           ` lucio
@ 2004-03-19 16:01                             ` Charles Forsyth
  2004-03-19 16:08                             ` andrey mirtchovski
  1 sibling, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 16:01 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 31 bytes --]

sorry, you must be under age.

[-- Attachment #2: Type: message/rfc822, Size: 2203 bytes --]

From: lucio@proxima.alt.za
To: 9fans@cse.psu.edu
Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
Date: Fri, 19 Mar 2004 17:53:22 +0200
Message-ID: <b61261937c345ff6347ff41960bb239f@proxima.alt.za>

> If you get this joke, you may qualify for a Senior discount at Denny's. ☺

MKS tools.  Unless there was prior art that I don't know about.

I have a lot to thank Mortice Kern for.

++L

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:34                           ` david presotto
  2004-03-19 15:43                             ` ron minnich
@ 2004-03-19 16:02                             ` Charles Forsyth
  2004-03-19 16:23                               ` David Presotto
  2004-03-19 16:34                             ` Richard Miller
  2004-03-19 16:34                             ` a
  3 siblings, 1 reply; 96+ messages in thread
From: Charles Forsyth @ 2004-03-19 16:02 UTC (permalink / raw)
  To: 9fans

>>I wouldn't mind if the kernel threw out the unreachable entries when
>>you read a directory.  However, I don't think its worth the effort or

the trouble with that is that it's sometimes quite important to see
that there are duplicates and what they are, if only to diagnose
an unexpected effect caused by a union, so i think you'd need some way to
give an equivalent effect, as i think rsc mentioned earlier.

roger's away so i'll give his observation that the result of ls -l on a union
directory (or reading a union directory and sorting it)
will be less useful or even misleading unless the sort is stable.
if you can rely on that, you know the attributes of the files of the same name below in
the union appear in the correct order (which allows quick checks on the state of the union).
of course, /proc/$pid/ns will give some of that for the
directories involved, but when it is a particular file or files
that you worry about, {ls -l | grep}
seems a natural thing to use to check their attributes.
as it happens, i don't think Plan 9's sort is stable:

term% ls -l /bin|grep 8c
--rwxrwxr-x M 9 forsyth inferno    303305 Dec 10 18:11 /bin/8c
--rwxrwxr-x M 9 sys     sys    299674 Feb 17 03:59 /bin/8c
term% ls -l /bin/8c
--rwxrwxr-x M 9 sys sys 299674 Feb 17 03:59 /bin/8c
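
Why stability matters here can be shown with a small model. This is a Python sketch under assumed data: the entries and their union order are invented to mirror the listing above, with the sys copy of 8c taken to be first in the union because that is the one the name resolves to.

```python
# Toy model, not Plan 9 code. Entries are (name, owner) in union order,
# so the first entry with a given name is the one a walk reaches.
union = [("8c", "sys"), ("8a", "sys"), ("8c", "forsyth")]

# Python's sorted() is guaranteed stable: entries with equal keys keep
# their original relative order, so the reachable 8c stays first.
stable = sorted(union, key=lambda e: e[0])

# Sorting on the whole tuple breaks ties by owner instead, which loses
# the union order; this stands in for any sort that is not stable.
reordered = sorted(union)

print(stable)     # [('8a', 'sys'), ('8c', 'sys'), ('8c', 'forsyth')]
print(reordered)  # [('8a', 'sys'), ('8c', 'forsyth'), ('8c', 'sys')]
```

With the stable sort, the first line shown for a repeated name is always the reachable file; with the reordering sort, the listing matches the misleading order in the transcript.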



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:53                           ` lucio
  2004-03-19 16:01                             ` Charles Forsyth
@ 2004-03-19 16:08                             ` andrey mirtchovski
  2004-03-19 16:12                               ` ron minnich
  2004-03-19 16:22                               ` lucio
  1 sibling, 2 replies; 96+ messages in thread
From: andrey mirtchovski @ 2004-03-19 16:08 UTC (permalink / raw)
  To: 9fans

>> If you get this joke, you may qualify for a Senior discount at Denny's. ☺
> 
> MKS tools.  Unless there was prior art that I don't know about.
> 
> I have a lot to thank Mortice Kern for.

Ron showed me his 6th Ed. Manual day to prove that glob was indeed a
binary, a dæmon running on the system.

andrey




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:08                             ` andrey mirtchovski
@ 2004-03-19 16:12                               ` ron minnich
  2004-03-19 16:22                               ` lucio
  1 sibling, 0 replies; 96+ messages in thread
From: ron minnich @ 2004-03-19 16:12 UTC (permalink / raw)
  To: 9fans

On Fri, 19 Mar 2004, andrey mirtchovski wrote:

> Ron showed me his 6th Ed. Manual day to prove that glob was indeed a
> binary, a dæmon running on the system.

not a daemon, but yeah a binary.

ron




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 14:42                       ` ron minnich
@ 2004-03-19 16:18                         ` 9nut
  2004-03-19 15:34                           ` david presotto
  2004-03-19 15:53                           ` lucio
  2004-03-19 19:42                         ` boyd, rounin
  1 sibling, 2 replies; 96+ messages in thread
From: 9nut @ 2004-03-19 16:18 UTC (permalink / raw)
  To: 9fans

> maybe a tool! we could call it glob. The shell could call it so that we 
> got standardized globbing.
> 
> I recommend putting it in /etc
> 
> ron
> p.s. Yes, :-)

If you get this joke, you may qualify for a Senior discount at Denny's. ☺




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:08                             ` andrey mirtchovski
  2004-03-19 16:12                               ` ron minnich
@ 2004-03-19 16:22                               ` lucio
  1 sibling, 0 replies; 96+ messages in thread
From: lucio @ 2004-03-19 16:22 UTC (permalink / raw)
  To: 9fans

> Ron showed me his 6th Ed. Manual day to prove that glob was indeed a
> binary, a dæmon running on the system.
> 
You mean a predecessor of the Plan 9 file server concept?  <grin>

++L




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:02                             ` Charles Forsyth
@ 2004-03-19 16:23                               ` David Presotto
  0 siblings, 0 replies; 96+ messages in thread
From: David Presotto @ 2004-03-19 16:23 UTC (permalink / raw)
  To: 9fans

The number of times I've reliably used the duplicates in a
listing to see a namespace problem has been 0.  Ns is the
only tool I've ever found useful there.



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:34                           ` david presotto
  2004-03-19 15:43                             ` ron minnich
  2004-03-19 16:02                             ` Charles Forsyth
@ 2004-03-19 16:34                             ` Richard Miller
  2004-03-19 16:47                               ` a
                                                 ` (2 more replies)
  2004-03-19 16:34                             ` a
  3 siblings, 3 replies; 96+ messages in thread
From: Richard Miller @ 2004-03-19 16:34 UTC (permalink / raw)
  To: 9fans

> I would be most disturbed if 'ls' and 'ls *' returned different numbers
> of entries.

term% cd /
term% ls|wc
     40      40     202
term% ls *|wc
   1056    1056   10972




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 15:34                           ` david presotto
                                               ` (2 preceding siblings ...)
  2004-03-19 16:34                             ` Richard Miller
@ 2004-03-19 16:34                             ` a
  3 siblings, 0 replies; 96+ messages in thread
From: a @ 2004-03-19 16:34 UTC (permalink / raw)
  To: 9fans

/ I would be most disturbed if 'ls' and 'ls *' returned
// different numbers of entries.

i was totally with David T. on this one until i read this. now i
see that the choice is really that either 'ls' and 'ls *' can agree,
-*or*- 'ls /tmp/foo' and 'ls /tmp/foo*' can agree. changing globbing
as proposed will make the latter true and the former false, which,
as presotto notes, is likely to be much more confusing. the fix
really would have to be a kernel-level change, and that's a much
bigger deal.

fwiw, it was a decent suggestion on a tricky point, and many 9fans
are still confusing what david was suggesting/asking for with a
dirread change which he *wasn't* asking for.
ア



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:34                             ` Richard Miller
@ 2004-03-19 16:47                               ` a
  2004-03-19 16:52                                 ` Richard Miller
  2004-03-19 17:15                               ` David Presotto
  2004-03-21 20:47                               ` rog
  2 siblings, 1 reply; 96+ messages in thread
From: a @ 2004-03-19 16:47 UTC (permalink / raw)
  To: 9fans

// ...ls|wc...

different cause and different context. there ls
is being instructed to do different things. i
think presotto had an implied "in a dir with only
plain files" or something similar.
ア



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:47                               ` a
@ 2004-03-19 16:52                                 ` Richard Miller
  0 siblings, 0 replies; 96+ messages in thread
From: Richard Miller @ 2004-03-19 16:52 UTC (permalink / raw)
  To: 9fans

> i
> think presotto had an implied "in a dir with only
> plain files" or something similar.

... and I hope he realised I was only teasing.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:34                             ` Richard Miller
  2004-03-19 16:47                               ` a
@ 2004-03-19 17:15                               ` David Presotto
  2004-03-21 20:47                               ` rog
  2 siblings, 0 replies; 96+ messages in thread
From: David Presotto @ 2004-03-19 17:15 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 31 bytes --]

come on, you know what I meant.

[-- Attachment #2: Type: message/rfc822, Size: 1938 bytes --]

From: Richard Miller <rm@hamnavoe.com>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
Date: Fri, 19 Mar 2004 16:34:12 0000
Message-ID: <240892daccaec2131635fbd4917479de@hamnavoe.com>

> I would be most disturbed if 'ls' and 'ls *' returned different numbers
> of entries.

term% cd /
term% ls|wc
     40      40     202
term% ls *|wc
   1056    1056   10972


* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 14:44                               ` David Tolpin
@ 2004-03-19 17:57                                 ` Russ Cox
  2004-03-19 18:04                                   ` David Tolpin
  0 siblings, 1 reply; 96+ messages in thread
From: Russ Cox @ 2004-03-19 17:57 UTC (permalink / raw)
  To: 9fans

David Tolpin wrote:

>I'm proposing to make the first form bring what it should. Namely,
>to bring every entry once. How displaying the first occurrence of
>a file as many times as there are entries with the same name makes
>it similar or consistent with the second form, I really don't get.
>
>They are confusing now. The cause of the confusion is a bug in globbing.
>The bug should be fixed.
>  
>

Actually the cause of the confusion is not agreed upon,
as evidenced by the continuation of this discussion!

I'm perfectly happy with things as they are now, so that:

echo `{ls /dev | grep mouse}
echo /dev/*mouse*

produce the same output.  If you want to fix something, figure
out how to fix the kernel.  If 9P servers were required to sort
their directory entries then the kernel could just toss entries
easily during a merge.  Not that I'm proposing this.
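
A minimal sketch of that merge, assuming both directories hand back
their names already sorted and with no duplicates inside either list.
The name mergeunique and the front/back parameters are made up for
illustration, and this is portable C rather than kernel code.

```c
#include <string.h>

/*
 * Merge two name lists that are each already sorted (strcmp order)
 * into out, keeping only the first occurrence of every name.
 * front stands for the directory bound before, so its entry wins
 * when both lists contain the same name.
 * out must have room for nfront+nback pointers.
 * Returns the number of names written to out.
 */
int
mergeunique(char **front, int nfront, char **back, int nback, char **out)
{
	int i, j, k, c;

	i = j = k = 0;
	while(i < nfront && j < nback){
		c = strcmp(front[i], back[j]);
		if(c < 0)
			out[k++] = front[i++];
		else if(c > 0)
			out[k++] = back[j++];
		else{
			out[k++] = front[i++];	/* visible entry is kept */
			j++;			/* shadowed entry is tossed */
		}
	}
	/* at most one of the lists still has names left */
	while(i < nfront)
		out[k++] = front[i++];
	while(j < nback)
		out[k++] = back[j++];
	return k;
}
```

The point of the sorted precondition is that the merge needs one pass
and no buffering beyond the current entry of each list, which is what
would make tossing entries cheap.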

Russ




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 17:57                                 ` Russ Cox
@ 2004-03-19 18:04                                   ` David Tolpin
  0 siblings, 0 replies; 96+ messages in thread
From: David Tolpin @ 2004-03-19 18:04 UTC (permalink / raw)
  To: 9fans

> I'm perfectly happy with things as they are now

I'm convinced. Many thanks for opinions.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 14:42                       ` ron minnich
  2004-03-19 16:18                         ` 9nut
@ 2004-03-19 19:42                         ` boyd, rounin
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-19 19:42 UTC (permalink / raw)
  To: 9fans

From: "ron minnich" <rminnich@lanl.gov>
> maybe a tool! we could call it glob. The shell could call it so that we 
> got standardized globbing.

it'll never work, guv.  no chance, zero, zip, nada ...

;)




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 13:59                             ` David Presotto
  2004-03-19 14:44                               ` David Tolpin
@ 2004-03-19 20:31                               ` Geoff Collyer
  1 sibling, 0 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-19 20:31 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 114 bytes --]

> None, they just walk through the directory they get.

Actually, ftpd, mkfs and replica (pull at least) glob.

[-- Attachment #2: Type: message/rfc822, Size: 4874 bytes --]

[-- Attachment #2.1.1: Type: text/plain, Size: 52 bytes --]

None, they just walk through the directory they get.

[-- Attachment #2.1.2: Type: message/rfc822, Size: 2733 bytes --]

From: David Tolpin <dvd@davidashen.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
Date: Fri, 19 Mar 2004 15:12:54 +0400 (AMT)
Message-ID: <200403191112.i2JBCsTK000660@adat.davidashen.net>

> From 9fans-admin@cse.psu.edu  Fri Mar 19 14:50:31 2004
> To: 9fans@cse.psu.edu
> Subject: Re: [9fans] ls, rc question -- proposed change to rc/glob.c
> From: Geoff Collyer <geoff@collyer.net>
> Content-Type: text/plain; charset="US-ASCII"
> Date: Fri, 19 Mar 2004 02:50:50 -0800
>
> I thought my list of directory-reading programs was too short; I
> forgot to grep for dirreadall also (not having ls on the list should
> have been a tip-off).  It should also include: aux/depend, diff, ftpd,
> history, kbmap, ls, mkpaqfs, netstat, news, pptpd, ps, rm, scp.  Not
> too surprising on a system built on file servers.

Which of these programs uses globbing?


* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 16:34                             ` Richard Miller
  2004-03-19 16:47                               ` a
  2004-03-19 17:15                               ` David Presotto
@ 2004-03-21 20:47                               ` rog
  2004-03-21 20:50                                 ` boyd, rounin
                                                   ` (2 more replies)
  2 siblings, 3 replies; 96+ messages in thread
From: rog @ 2004-03-21 20:47 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 3518 bytes --]

for programs that require read-write access to a namespace,
requirements are different, as charles points out.

however, most tools do not write to the namespace.

for me, one of plan 9's fundamental strengths is that while a tool is
often written to access a predefined portion of the namespace, the
actual namespace that it sees can be arranged at will for the tool in
question.

thus if a program accesses files in directory /d, i think it's
reasonable to expect that if i bind another directory, say /b, before
/d, i can expect the program to treat /d exactly as if i had really
replaced the files therein with files from /b.

the program shouldn't have to *care* whether it's dealing with a union
directory or not - that should be solely the concern of whoever's
arranging the namespace.

i think the difference between:

	bind /dev/null /bin/ls

and

	mkdir /tmp/x
	> /tmp/x/ls
	bind /dev/null /tmp/x/ls
	bind -b /tmp/x /bin

should be invisible, particularly to shell scripts, which have no
concept of sequentially reading a directory, and which are often short
and trivial, but for which the above invariants are still important.

russ:
> if a file is covered up, it should be completely hidden.
> it turns out that actually implementing this is quite hard,

i.e.  the visibility of multiple names in a directory is an
implementation issue.  it's something that's easily solved at user
level (no problems with arbitrary buffering there, and programs can
eliminate duplicates in their own way if they like as, for example,
du(1), mk(1), cron(1), listen(1) do).

i think it should be made as easy as possible to write programs that
behave in a namespace-agnostic way.  i think that if a version of
dirreadall was available that eliminated duplicates, most (all?)
programs would wish to use it.

i think it's particularly silly that

	wc */*.[ch]

should give me erroneous results in a union directory,
and that

	echo /bin/ape/*

should be different from

	echo /bin/ape*/*

when there's only one accessible directory starting "ape" in /bin, and
the shell's sorting the globbed output anyway.

i think eliminating duplicates in two core places would break nothing,
but make parts of the system work more smoothly together.

it's one less thing to think about.

geoff points out many programs that use dirread.  a random inspection
turns up many programs that already do their own elimination of
duplicates (in different ways), and others that behave badly in the
face of duplicate directory entries (e.g.  diff).

this seems to me to point towards a systematic problem, which should
be solved centrally.

towards this end, i've attached a stable sort routine, along the same
lines as qsort(2), which can optionally eliminate duplicates.  i've
tested it to a degree, but not greatly; i haven't done any performance
analysis of it, but i doubt that's a limiting factor in this case.

i envisage ls(1) being changed to use this routine (with an option to
show/suppress duplicate entries), and perhaps a version of dirreadall
that sorts the directory entries alphabetically and eliminates
duplicates, to make it easy to replace it in existing programs.


PS it's interesting to note that ls(1) does make some sort of an
effort to do a stable sort:

	if(i == 0)
		i = (a<b? -1 : 1);

but it's simple minded (in fact, i'm a little bit surprised it doesn't
kill qsort).  a sequence field in the NDir structure could fix it.
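
a sketch of the sequence-field idea, in portable C rather than Plan 9 C.
Ent is a hypothetical stand-in for NDir and seq is the assumed extra
field; sortents and entcmp are likewise made-up names.  the address
comparison above is not a consistent ordering once qsort starts moving
elements around, whereas the read order recorded up front is.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ls's NDir structure. */
typedef struct Ent Ent;
struct Ent {
	char *name;
	int seq;	/* position at which the entry was read */
	long length;	/* any other attribute; here just a file size */
};

static int
entcmp(const void *va, const void *vb)
{
	const Ent *a = va, *b = vb;
	int c;

	c = strcmp(a->name, b->name);
	if(c != 0)
		return c;
	/* equal names: keep them in the order they were read */
	return (a->seq > b->seq) - (a->seq < b->seq);
}

/* Sort n entries by name; entries with equal names keep their read order. */
void
sortents(Ent *e, int n)
{
	int i;

	for(i = 0; i < n; i++)
		e[i].seq = i;	/* record read order before anything moves */
	qsort(e, n, sizeof e[0], entcmp);
}
```

with the tie broken on seq no two entries ever compare equal, so the
result is the same whatever qsort does internally.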

[-- Attachment #2: mergesort.c --]
[-- Type: text/plain, Size: 1025 bytes --]

#include <u.h>
#include <libc.h>
	
int
mergesort1(int uniq, char *a, int r, int width, int (*compare)(void*, void*), char *b)
{
	int m, n0, n1, i, j, k, e, c;
	if(r <= width)
		return r;
	m = ((r/width-1)/2 + 1) * width;
	n0 = mergesort1(uniq, a, m, width, compare, b);
	n1 = mergesort1(uniq, a+m, r-m, width, compare, b);
	memcpy(b, a, n0);
	memcpy(b+n0, a+m, n1);
	e = n0+n1;
	i = 0;
	j = n0;
	k = 0;
	for(; i < n0 && j < e; k += width){
		c = (*compare)(b+i, b+j);
		if(c > 0){
			memcpy(a+k, b+j, width);
			j += width;
		}else{
			memcpy(a+k, b+i, width);
			i += width;
			if(c == 0 && uniq)
				j += width;
		}
	}
	if(i < n0){
		memcpy(a+k, b+i, n0-i);
		k += n0-i;
	}else if(j < e){
		memcpy(a+k, b+j, e-j);
		k += e-j;
	}
	return k;
}

int
mergesort(int uniq, void *a, long n, long width, int (*compare)(void*, void*))
{
	void *b;
	n *= width;
	b = malloc(n);
	if(b == nil)
		return -1;
	n = mergesort1(uniq, a, n, width, compare, b);
	free(b);
	return n / width;
}


* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 20:47                               ` rog
@ 2004-03-21 20:50                                 ` boyd, rounin
  2004-03-21 21:53                                 ` ron minnich
  2004-03-22 10:09                                 ` Douglas A. Gwyn
  2 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-21 20:50 UTC (permalink / raw)
  To: 9fans

presto:  ls == ls *

QED




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 20:47                               ` rog
  2004-03-21 20:50                                 ` boyd, rounin
@ 2004-03-21 21:53                                 ` ron minnich
  2004-03-21 22:05                                   ` Charles Forsyth
  2004-03-21 23:29                                   ` Enache Adrian
  2004-03-22 10:09                                 ` Douglas A. Gwyn
  2 siblings, 2 replies; 96+ messages in thread
From: ron minnich @ 2004-03-21 21:53 UTC (permalink / raw)
  To: 9fans

assume lsfilter takes a stream of plan 9 stat structs and turns them into
an 'ls'-like listing

bind -b /usr/rminnich/bin /bin

lsfilter</bin

ls /bin

should these yield the same results? If you don't care, then the libc.a 
approach is fine. If you do want the same results, I think you may be 
stuck doing the uniqueness tricks in the kernel a la BSD, which to me 
anyway is unattractive.

You end up picking your inconsistencies. Right now the inconsistencies are 
due to a very simple implementation which acts the same all the time (to 
me anyway). 

I'm not arguing either way, just wondering.

I also have to wonder about the perceived inconsistent behavior of union
directories and the hacks that people keep developing (BSD) or proposing
(this list) to deal with those inconsistencies. Could this imply that my
CS friend was right and we've got to find another way to get what unions
get us now? I.e. as much as we love them, does the very idea of unions
have a fundamental flaw that requires us to think up something newer and
cleverer? I can't imagine what that could be. Just wondering.

ron




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 21:53                                 ` ron minnich
@ 2004-03-21 22:05                                   ` Charles Forsyth
  2004-03-21 23:29                                   ` Enache Adrian
  1 sibling, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-21 22:05 UTC (permalink / raw)
  To: 9fans

>>I also have to wonder about the perceived inconsistent behavior of union
>>directories and the hacks that people keep developing (BSD) or proposing
>>(this list) to deal with those inconsistencies. Could this imply that my

i think it's only a problem when one expects them to do too much:
recursive combination of two directories is an example (yet that's
usually what's needed in general for versioning, i think).
they are an effective way of doing what they usually do now:
build up directories of services and other resources
from many sources, local and remote, including substituting
one for another.   the naming seems consistent to me.
if a name appears in a directory, i can use that name.
whether it appears once or twice, doesn't matter.

the arguments seem to focus on whether with a change here or there
they could be got to do a little more work (eg, by allowing * to
iterate exactly once over all services, protocols, say).
i don't think it's that essential in practice.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 21:53                                 ` ron minnich
  2004-03-21 22:05                                   ` Charles Forsyth
@ 2004-03-21 23:29                                   ` Enache Adrian
  2004-03-22  1:30                                     ` boyd, rounin
  1 sibling, 1 reply; 96+ messages in thread
From: Enache Adrian @ 2004-03-21 23:29 UTC (permalink / raw)
  To: 9fans

On Sun, Mar 21, 2004 a.d., ron minnich wrote:
> should these yield the same results? If you don't care, then the libc.a 
> approach is fine. If you do want the same results, I think you may be 
> stuck doing the uniqueness tricks in the kernel a la BSD, which to me 
> anyway is unattractive.

I'm afraid the uniqueness tricks are done in library/userland in BSD.
(the McKusick paper referred to in lexnames.html already tells it).
Just try to read a "-o union" mounted directory with open/getdirentries.

The hack is in libc/gen/opendir.c. Quoting from there:
                /*
                 * The strategy here is to read all the directory
                 * entries into a buffer, sort the buffer, and
                 * remove duplicate entries by setting the inode
                 * number to zero.
                 */
Only bloating the kernel with a sort/uniq filter could be worse than that.
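
For comparison, the sort-then-drop-duplicates strategy from that comment
can be sketched in a few lines of portable C. This is not the BSD code:
uniqnames is a hypothetical helper that compacts an array of names
instead of zeroing inode numbers, and it assumes the array has already
been sorted so that equal names are adjacent with the visible one first.

```c
#include <string.h>

/*
 * names must already be sorted so that equal names are adjacent and
 * the visible entry comes first in each run (a stable sort gives this).
 * Compacts the array in place, keeping the first name of every run.
 * Returns the number of names kept.
 */
int
uniqnames(char **names, int n)
{
	int i, k;

	if(n <= 0)
		return 0;
	k = 1;		/* names[0] is always kept */
	for(i = 1; i < n; i++)
		if(strcmp(names[i], names[k-1]) != 0)
			names[k++] = names[i];
	return k;
}
```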

Regards,
Adi



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 23:29                                   ` Enache Adrian
@ 2004-03-22  1:30                                     ` boyd, rounin
  0 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-22  1:30 UTC (permalink / raw)
  To: 9fans

> The hack is in libc/gen/opendir.c. Quoting from there:
>                 /*
>                  * The strategy here is to read all the directory
>                  * entries into a buffer, sort the buffer, and
>                  * remove duplicate entries by setting the inode
>                  * number to zero.
>                  */
> Only bloating the kernel with a sort/uniq filter could be worse than that.

pure bsd.

i can't see the point in adding more code to handle end cases.

large N% of the time it's not a problem, so Leave It Alone:

    http://www.lyricsdepot.com/living-colour/leave-it-alone.html




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-21 20:47                               ` rog
  2004-03-21 20:50                                 ` boyd, rounin
  2004-03-21 21:53                                 ` ron minnich
@ 2004-03-22 10:09                                 ` Douglas A. Gwyn
  2004-03-22 10:49                                   ` Charles Forsyth
  2 siblings, 1 reply; 96+ messages in thread
From: Douglas A. Gwyn @ 2004-03-22 10:09 UTC (permalink / raw)
  To: 9fans

rog@vitanuova.com wrote:
> the program shouldn't have to *care* whether it's dealing with a union
> directory or not - that should be solely the concern of whoever's
> arranging the namespace.

Absolutely.  However, I don't mind there being some obscure way to 
examine what is mounted in more detail, so long as it is not the 
facility usually used by apps.  (Consider Unix treatment of symbolic 
links; you usually want to follow them but not always, thus lstat(2) vs. 
stat(2).)



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 10:09                                 ` Douglas A. Gwyn
@ 2004-03-22 10:49                                   ` Charles Forsyth
  2004-03-22 12:15                                     ` boyd, rounin
  0 siblings, 1 reply; 96+ messages in thread
From: Charles Forsyth @ 2004-03-22 10:49 UTC (permalink / raw)
  To: 9fans

>>facility usually used by apps.  (Consider Unix treatment of symbolic 
>>links; you usually want to follow them but not always, thus lstat(2) vs. 
>>stat(2).)

it's a good example of how that approach breaks down:
there isn't any way to tell from the outside how any given application
will deal with symbolic links.  for instance,
some versions of ls follow them (unless given -L), others do not,
following different reasonable rationales.

if i tar up a tree, am i interested in following the links (ie, i want
the tree's contents) or recording them (ie, i want the tree's
structure)?  it might depend on the link!  fortunately tar has lots of
options, but the default is to copy the structure (-h follows all
symbolic links).

generally, the approach seems to be that directory-traversing programs
and programs such as mv do not follow the links, but programs that
open files naturally follow the links.

so cp follows the links.

unless it's cp -R!  then there are three options -H, -L, and -P
to (attempt to) control handling of symbolic links, distinguishing
between names on the command line that happen to be links,
and names in the structure that are links.

in some ways, it's rather worse with symbolic links.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 10:49                                   ` Charles Forsyth
@ 2004-03-22 12:15                                     ` boyd, rounin
  2004-03-22 18:23                                       ` Derek Fawcus
  0 siblings, 1 reply; 96+ messages in thread
From: boyd, rounin @ 2004-03-22 12:15 UTC (permalink / raw)
  To: 9fans

> it's a good example of how that approach breaks down:
> there isn't any way to tell from the outside how any given application
> will deal with symbolic links.  for instance,
> some versions of ls follow them (unless given -L), others do not,
> following different reasonable rationales.

symlinks really mess things up.  i think the only time i use
them is to point at the current version of the file or directory
and they only ever link to a 'file' in the same directory.

the (revolting) output of ls makes it very obvious what the
link points to.  i could use hard links, but sorting through
inums is no fun.

a case in point is the lunix sam Make.* files.  i symlink the
right one(s) to Makefile.  i think it's a clean approach, but
i don't like symlinks because they add far too much complexity,
another system call, more bits and in practice they are
grossly abused.






* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 12:15                                     ` boyd, rounin
@ 2004-03-22 18:23                                       ` Derek Fawcus
  2004-03-23  0:06                                         ` boyd, rounin
  0 siblings, 1 reply; 96+ messages in thread
From: Derek Fawcus @ 2004-03-22 18:23 UTC (permalink / raw)
  To: 9fans

On Mon, Mar 22, 2004 at 01:15:19PM +0100, boyd, rounin wrote:
> 
> a case in point is the lunix sam Make.* files.  i symlink the
> right one(s) to Makefile.

make -f 



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-19 10:50                         ` Geoff Collyer
  2004-03-19 11:12                           ` David Tolpin
@ 2004-03-22 22:56                           ` rog
  2004-03-22 23:19                             ` Scott Schwartz
                                               ` (3 more replies)
  1 sibling, 4 replies; 96+ messages in thread
From: rog @ 2004-03-22 22:56 UTC (permalink / raw)
  To: 9fans

i'm in a minority here, so i'll say this and then shut up.

i think the symlink comparison is not really valid, as a symbolic link
is a potentially useful entity in its own right (hence tools that can
look at them in both ways) whereas a duplicate name signifies nothing
beyond the fact that the directory might be a union directory; there's
nothing useful that can be done with the name.

what this discussion boils down to is epitomised by boyd's:

> large N% of the time it's not a problem, so Leave It Alone:

basically union directories are used hardly at all, and when they are,
it's generally only in "special purpose" places, such as /bin, /cron,
etc.  of course, they're crucial in the places where they are used,
but it's not really the general mechanism that one gets the impression
of when reading the documentation (as one quickly realises when trying
to do unusual things with it, such as bind onto directories in
/n/dump).

i have a feeling almost nothing would break if the kernel was changed
to disallow reading of union directories completely...

> the arguments seem to focus on whether with a change here or there
> they could be got to do a little more work (eg, by allowing * to
> iterate exactly once over all services, protocols, say).
> i don't think it's that essential in practice.

not essential.

it'd just be nice.

for what it's worth, i had a look through the programs that
geoff mentioned, categorising how they react to union directories.

no (or unlikely) adverse consequences:
	pptpd
		controls own namespace
	bitsy/keyboard
	faces
	ps
	winwatch
		displays duplicates (but union unlikely)
	aux/depend
	exportfs
		reflects duplicates in exported namespace
	cron
	du
	aux/listen
	mk
		do their own duplicate elimination
	history
		could produce erroneous results on snap (but union unlikely)

displays duplicate entries/info:
	acme
	news
		display duplicate entries
	netstat
		displays duplicate info

inefficient:
	mkfs
	gzip/zip
	scp
		produces duplicate copy of file
	vac
		stores duplicate directory entries

questionable:
	rm
		removes all entries, including hidden ones

erroneous:
	ip/ftpd
	ls
		displays duplicates, with non-stable reordering of entries.
	diff
		reports spurious extra entries
	tar
		doesn't work properly (but probably for reasons
		unrelated to union directory reading; try {tar c /bin | tar t})

unknown:
	replica/revproto
		not sure.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 22:56                           ` rog
@ 2004-03-22 23:19                             ` Scott Schwartz
  2004-03-22 23:50                             ` Charles Forsyth
                                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 96+ messages in thread
From: Scott Schwartz @ 2004-03-22 23:19 UTC (permalink / raw)
  To: 9fans

Rog writes:
| i'm in a minority here, so i'll say this and then shut up.
 
Just to give you some moral support, I think you've made
a lot of sense.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 22:56                           ` rog
  2004-03-22 23:19                             ` Scott Schwartz
@ 2004-03-22 23:50                             ` Charles Forsyth
  2004-03-23  0:28                               ` rog
  2004-03-23  0:49                             ` Charles Forsyth
  2004-03-23 11:13                             ` a
  3 siblings, 1 reply; 96+ messages in thread
From: Charles Forsyth @ 2004-03-22 23:50 UTC (permalink / raw)
  To: 9fans

>i have a feeling almost nothing would break if the kernel was changed
>to disallow reading of union directories completely...

but of course it would: i couldn't ls it at all.  i couldn't traverse the hierarchy at all.
you should spend more time in the Windows or OS worlds
where such restrictions are (were) commonplace!




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 18:23                                       ` Derek Fawcus
@ 2004-03-23  0:06                                         ` boyd, rounin
  0 siblings, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-23  0:06 UTC (permalink / raw)
  To: 9fans

> make -f

try that in the top dir?

is it maintainable?

i just wanna cd and type make/mk and have the job done.
i save 3 chars and a lotta time poking around in files.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 23:50                             ` Charles Forsyth
@ 2004-03-23  0:28                               ` rog
  2004-03-23  0:40                                 ` Charles Forsyth
  0 siblings, 1 reply; 96+ messages in thread
From: rog @ 2004-03-23  0:28 UTC (permalink / raw)
  To: 9fans

> i couldn't traverse the hierarchy at all.

you could walk into it, which is all most things do operationally...

ls is nice ("look, here's a union directory and this is what's in it")
but i reckon almost none of the tools that are used by the system in
its day to day running actually read union directories.

union directory reading is most useful interactively.  most
non-trivial uses require elimination of duplicates.  for example, even
du(1) gets things wrong, i now realise.
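
to make the du point concrete, here is a toy model in python (not
plan 9 code).  it assumes a union read returns every entry of every
member directory in bind order, and that a naive tool sums whatever it
is handed.  the names read_union, naive_total and reachable_total are
made up for the illustration.

```python
# Toy model, not Plan 9 code: a union directory is an ordered list of
# member directories, each mapping a file name to its size in bytes.

def read_union(members):
    """Yield (name, size) for every entry of every member, in bind order."""
    for member in members:
        for name, size in member.items():
            yield name, size

def naive_total(members):
    """Sum every entry the read returns, duplicates included."""
    return sum(size for _, size in read_union(members))

def reachable_total(members):
    """Sum only the entry a walk would reach: the first one per name."""
    first = {}
    for name, size in read_union(members):
        first.setdefault(name, size)  # later duplicates are ignored
    return sum(first.values())

union = [{"x": 100, "y": 10}, {"x": 5000}]
print(naive_total(union))      # 5110
print(reachable_total(union))  # 110
```

the second "x" is hidden behind the first, so counting it overstates
what is actually reachable through the union.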

> >>	rm
> >>		removes all entries, including hidden ones
> 
> it did years ago, but i don't think it does now.

it does.




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-23  0:28                               ` rog
@ 2004-03-23  0:40                                 ` Charles Forsyth
  0 siblings, 0 replies; 96+ messages in thread
From: Charles Forsyth @ 2004-03-23  0:40 UTC (permalink / raw)
  To: 9fans

> >>	rm
> >>		removes all entries, including hidden ones
> 
> it did years ago, but i don't think it does now.

>it does.[?]

term% mkdir a
term% mkdir b
term% >a/x
term% >b/x
term% bind -b b a
term% ls a
a/x
a/x
term% rm a/x
term% ls a
a/x

i'm fairly sure it once had some code to do that but it doesn't now.

rm -r will remove them all -- but that's exactly what it's
supposed to do: empty the directory, surely!
indeed, it's interesting that here's an example where the underlying
mechanism seems to encourage just the right behaviour!

i also note that ls a showed two x's the first time,
then i removed one, leaving one, which seems good accounting
to me ...  in fact, with the `show only one name' approach,
surely it would be the Ruby principle of least surprise
to leave `a' empty so rm should revert to its `questionable' behaviour!
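
the accounting in that transcript can be sketched with a toy model
(python, not plan 9 code).  it assumes a name is resolved in the first
member directory that contains it, and that rm removes only that one
file.  ls, rm and the list called union are hypothetical names for the
sketch, and it prints bare names rather than a/x.

```python
# Toy model, not Plan 9 code: a union is an ordered list of member
# directories (sets of names); the front of the list is searched first.

def ls(union):
    """List every entry of every member, so duplicates show up twice."""
    return [name for member in union for name in sorted(member)]

def rm(union, name):
    """Remove the name from the first member that has it, and only there."""
    for member in union:
        if name in member:
            member.remove(name)
            return
    raise FileNotFoundError(name)

b = {"x"}
a = {"x"}
union = [b, a]    # like bind -b b a: b is searched before a

print(ls(union))  # ['x', 'x']
rm(union, "x")
print(ls(union))  # ['x']
```

under those assumptions one rm takes away one x, and the copy that was
hidden behind it becomes the one you reach.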



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 22:56                           ` rog
  2004-03-22 23:19                             ` Scott Schwartz
  2004-03-22 23:50                             ` Charles Forsyth
@ 2004-03-23  0:49                             ` Charles Forsyth
  2004-03-23  2:12                               ` ron minnich
  2004-03-23 11:13                             ` a
  3 siblings, 1 reply; 96+ messages in thread
From: Charles Forsyth @ 2004-03-23  0:49 UTC (permalink / raw)
  To: 9fans

come to think of it, if i expect {rm -r a} to empty (and delete) `a'
might i not expect {rm a/*} and especially {rm -r a/*}
to remove all removable names from `a'?



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-23  0:49                             ` Charles Forsyth
@ 2004-03-23  2:12                               ` ron minnich
  2004-03-23  2:16                                 ` boyd, rounin
  2004-03-23  3:15                                 ` rog
  0 siblings, 2 replies; 96+ messages in thread
From: ron minnich @ 2004-03-23  2:12 UTC (permalink / raw)
  To: 9fans

On Tue, 23 Mar 2004, Charles Forsyth wrote:

> come to think of it, if i expect {rm -r a} to empty (and delete) `a'
> might i not expect {rm a/*} and especially {rm -r a/*}
> to remove all removable names from `a'?

seems right. There's a lot of subtleties to the union thing. I think that 
Plan 9's decision (present it to the users in its entirety) is the right 
one.

ron




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-23  2:12                               ` ron minnich
@ 2004-03-23  2:16                                 ` boyd, rounin
  2004-03-23  3:15                                 ` rog
  1 sibling, 0 replies; 96+ messages in thread
From: boyd, rounin @ 2004-03-23  2:16 UTC (permalink / raw)
  To: 9fans

> seems right. There's a lot of subtleties to the union thing. I think that 
> Plan 9's decision (present it to the users in its entirety) is the right 
> one.

yup




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-23  2:12                               ` ron minnich
  2004-03-23  2:16                                 ` boyd, rounin
@ 2004-03-23  3:15                                 ` rog
  1 sibling, 0 replies; 96+ messages in thread
From: rog @ 2004-03-23  3:15 UTC (permalink / raw)
  To: 9fans

> > come to think of it, if i expect {rm -r a} to empty (and delete) `a'
> > might i not expect {rm a/*} and especially {rm -r a/*}
> > to remove all removable names from `a'?
> 
> seems right. There's a lot of subtleties to the union thing. I think that 
> Plan 9's decision (present it to the users in its entirety) is the right 
> one.

it's a trade-off.  some things work better, some things work worse.

having said that, the above is the only occasion i've seen that makes
use of the duplicate names in any kind of a sensible manner (but
remove (and rename) is a strange operation in a union directory anyway
- remove("x") does not necessarily mean that "x" goes away).  i
excluded write operations originally, and marked the rm behaviour as
"questionable" because i believe it can be argued both ways.  union
directories themselves cannot be removed, so one's on fairly shaky
ground trying to rm -r them...

most tools read a directory in order to enumerate its accessible
items.  in that context, seeing unique names only is more correct.
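
a minimal sketch of what "unique names only" could mean for a tool
that enumerates a directory (python, purely illustrative; unique_names
is a made-up helper, not anything in the plan 9 libraries): keep the
first occurrence of each name, in the order the read returned them.

```python
def unique_names(entries):
    """Return names in read order, keeping only the first of each."""
    seen = set()
    result = []
    for name in entries:
        if name not in seen:  # a repeat is hidden by the earlier entry
            seen.add(name)
            result.append(name)
    return result

print(unique_names(["sam.err", "sam.err", "notes"]))  # ['sam.err', 'notes']
```

keeping the first occurrence matches the entry a walk would reach under
the usual first-match rule, which is the sense of "accessible" above.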




* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-22 22:56                           ` rog
                                               ` (2 preceding siblings ...)
  2004-03-23  0:49                             ` Charles Forsyth
@ 2004-03-23 11:13                             ` a
  2004-03-23 11:47                               ` Geoff Collyer
  3 siblings, 1 reply; 96+ messages in thread
From: a @ 2004-03-23 11:13 UTC (permalink / raw)
  To: 9fans

// ...union directories are used hardly at all, and when they are,
// it's generally only in "special purpose" places...

for what it's worth, the lab i was in at BL did a fairly complete
overlay on top of 1127's file server with our own tree. as i recall
our /lib/namespace approached 350 lines. we were pushing hard on
what the kernel was expecting that file to do, and did run into
issues on occasion, but for the most part it just worked, and was
very nice to use in practice. obviously, we would've been SOL had
the kernel not allowed reading union directories.

if i remember correctly, one user even had a fairly complicated
additional overlay of his own as his home directory was split
between our own and 1127's file servers (and he managed to be a
huge user on both ;-).
ア



* Re: [9fans] ls, rc question -- proposed change to rc/glob.c
  2004-03-23 11:13                             ` a
@ 2004-03-23 11:47                               ` Geoff Collyer
  0 siblings, 0 replies; 96+ messages in thread
From: Geoff Collyer @ 2004-03-23 11:47 UTC (permalink / raw)
  To: 9fans

The place where our massive overlay didn't work too well was building
kernels from 1127 sources with a few of our own source files.  The
overlay file server that Russ and I wrote attempted to solve that but
also required a minor change to mk.  Unfortunately we wrote it using
the first public version of lib9p, which has since changed quite a
bit.  I'd like to take another stab at it once I've finished my
current round of file servers.

Actually most of my files remained on 1127's file servers, where they
had started out.  I had relatively few files on our departmental file
server, though obviously the overlay worked well enough that Anthony
couldn't tell. ☺  I felt a little bad about hogging 1127's optical
storage, but then Ken had ~160GB of compressed music on ours by the
end, which relieved my guilt somewhat.




* RE: [9fans] ls, rc question
@ 2004-03-19 10:26 Tiit Lankots
  0 siblings, 0 replies; 96+ messages in thread
From: Tiit Lankots @ 2004-03-19 10:26 UTC (permalink / raw)
  To: 9fans

As Rob said, the current behaviour is good in the sense that it's
simple and predictable. However, it might also be nice to glob
match _names_, not paths. Right now, I fail to see how that would
break anything, e.g. acme.



end of thread, other threads:[~2004-03-23 11:47 UTC | newest]

Thread overview: 96+ messages
2004-03-18 21:40 [9fans] ls question David Tolpin
2004-03-18 21:58 ` Russ Cox
2004-03-18 22:05   ` Russ Cox
2004-03-18 23:00     ` [9fans] ls, rc question David Tolpin
2004-03-18 23:31       ` [9fans] dirread David Tolpin
2004-03-18 23:49         ` ron minnich
2004-03-19  0:14         ` boyd, rounin
2004-03-19  3:38         ` rsc
2004-03-19  3:41       ` [9fans] ls, rc question rsc
2004-03-19  5:32         ` David Tolpin
2004-03-19  5:45           ` boyd, rounin
2004-03-19  5:50           ` ron minnich
2004-03-19  6:45             ` boyd, rounin
2004-03-19  9:07             ` Charles Forsyth
2004-03-19  9:24               ` Richard Miller
2004-03-19  9:33                 ` boyd, rounin
2004-03-19  9:39                   ` Richard Miller
2004-03-19  9:46                     ` Geoff Collyer
2004-03-19 10:11                   ` Richard Miller
2004-03-19 10:42                     ` Charles Forsyth
2004-03-19 10:03                 ` Charles Forsyth
2004-03-19  9:38               ` [9fans] Bind, look, everything is duplicated David Tolpin
2004-03-19  7:01           ` [9fans] ls, rc question Micah Stetson
2004-03-19  7:57             ` [9fans] ls, rc question -- proposed change to rc/glob.c David Tolpin
2004-03-19  8:13               ` Rob Pike
2004-03-19  8:18                 ` David Tolpin
2004-03-19  8:24                   ` David Tolpin
2004-03-19  8:27                   ` Rob Pike
2004-03-19  8:52                     ` David Tolpin
2004-03-19  9:16                     ` Richard Miller
2004-03-19  9:29                       ` boyd, rounin
2004-03-19  9:41                       ` Geoff Collyer
2004-03-19 10:09                         ` boyd, rounin
2004-03-19 10:50                         ` Geoff Collyer
2004-03-19 11:12                           ` David Tolpin
2004-03-19 12:31                             ` Charles Forsyth
2004-03-19 12:53                               ` boyd, rounin
2004-03-19 13:59                             ` David Presotto
2004-03-19 14:44                               ` David Tolpin
2004-03-19 17:57                                 ` Russ Cox
2004-03-19 18:04                                   ` David Tolpin
2004-03-19 20:31                               ` Geoff Collyer
2004-03-22 22:56                           ` rog
2004-03-22 23:19                             ` Scott Schwartz
2004-03-22 23:50                             ` Charles Forsyth
2004-03-23  0:28                               ` rog
2004-03-23  0:40                                 ` Charles Forsyth
2004-03-23  0:49                             ` Charles Forsyth
2004-03-23  2:12                               ` ron minnich
2004-03-23  2:16                                 ` boyd, rounin
2004-03-23  3:15                                 ` rog
2004-03-23 11:13                             ` a
2004-03-23 11:47                               ` Geoff Collyer
2004-03-19  8:31                 ` Richard Miller
2004-03-19  8:47                   ` Geoff Collyer
2004-03-19  9:07                   ` Rob Pike
2004-03-19  9:34                     ` David Tolpin
2004-03-19  9:52                     ` Scott Schwartz
2004-03-19 14:42                       ` ron minnich
2004-03-19 16:18                         ` 9nut
2004-03-19 15:34                           ` david presotto
2004-03-19 15:43                             ` ron minnich
2004-03-19 16:00                               ` Charles Forsyth
2004-03-19 16:02                             ` Charles Forsyth
2004-03-19 16:23                               ` David Presotto
2004-03-19 16:34                             ` Richard Miller
2004-03-19 16:47                               ` a
2004-03-19 16:52                                 ` Richard Miller
2004-03-19 17:15                               ` David Presotto
2004-03-21 20:47                               ` rog
2004-03-21 20:50                                 ` boyd, rounin
2004-03-21 21:53                                 ` ron minnich
2004-03-21 22:05                                   ` Charles Forsyth
2004-03-21 23:29                                   ` Enache Adrian
2004-03-22  1:30                                     ` boyd, rounin
2004-03-22 10:09                                 ` Douglas A. Gwyn
2004-03-22 10:49                                   ` Charles Forsyth
2004-03-22 12:15                                     ` boyd, rounin
2004-03-22 18:23                                       ` Derek Fawcus
2004-03-23  0:06                                         ` boyd, rounin
2004-03-19 16:34                             ` a
2004-03-19 15:53                           ` lucio
2004-03-19 16:01                             ` Charles Forsyth
2004-03-19 16:08                             ` andrey mirtchovski
2004-03-19 16:12                               ` ron minnich
2004-03-19 16:22                               ` lucio
2004-03-19 19:42                         ` boyd, rounin
2004-03-19 14:13                     ` Russ Cox
2004-03-19 14:37                       ` David Tolpin
2004-03-19  8:35                 ` boyd, rounin
2004-03-19 14:19                 ` Russ Cox
2004-03-18 21:59 ` [9fans] ls question David Presotto
2004-03-18 22:05 ` matt
2004-03-18 22:04   ` David Tolpin
2004-03-18 22:08   ` boyd, rounin
2004-03-19 10:26 [9fans] ls, rc question Tiit Lankots
