zsh-workers
* glob qualifier '-' doesn't work correctly on dangling symlinks
@ 2020-04-11 15:15 Vincent Lefevre
  2020-04-11 17:34 ` Stephane Chazelas
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-11 15:15 UTC (permalink / raw)
  To: zsh-workers

The glob qualifier '-' doesn't work correctly on dangling symlinks.

I had reported the following bug in Debian in 2008:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=510038

where I said:

For instance:

$ zsh <<EOF
set -ex
echo $ZSH_VERSION
mkdir globtest-dir
cd globtest-dir
touch file1
chmod 644 file1
ln -s file1 file2
ln -s file0 file3
ls -l file*
ls -l file*(-W)
EOF

gives:

+zsh:2> echo 4.3.6
4.3.6
+zsh:3> mkdir globtest-dir
+mkdir:0> mkdir globtest-dir
+zsh:4> cd globtest-dir
+zsh:5> touch file1
+zsh:6> chmod 644 file1
+zsh:7> ln -s file1 file2
+ln:0> ln -s file1 file2
+zsh:8> ln -s file0 file3
+ln:0> ln -s file0 file3
+zsh:9> ls -l file1 file2 file3
-rw-r--r-- 1 lefevre lefevre 0 2008-12-28 22:34:28 file1
lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file2 -> file1
lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file3 -> file0
+zsh:10> ls -l file3
lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file3 -> file0

file*(-W) should have no matches.

(note that Mac OS X was not affected at that time).

This still occurs in zsh 5.8.

I've looked at the code, and it seems that zsh ignores stat errors
(such as ENOENT) in this case, which is bad.

However, "echo file0(W)" is handled correctly.

zira% echo file0(W)
zsh: no matches found: file0(W)

Thus the issue concerns only the glob qualifier '-' on symbolic links.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 15:15 glob qualifier '-' doesn't work correctly on dangling symlinks Vincent Lefevre
@ 2020-04-11 17:34 ` Stephane Chazelas
  2020-04-11 19:17   ` Vincent Lefevre
  2020-04-12 12:48   ` Peter Stephenson
  0 siblings, 2 replies; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-11 17:34 UTC (permalink / raw)
  To: zsh-workers

2020-04-11 17:15:11 +0200, Vincent Lefevre:
[...]
> +zsh:10> ls -l file3
> lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file3 -> file0
> 
> file*(-W) should have no matches.
[...]

It is not really documented but kind of implied that on broken
symlinks, after -, we're still looking at the symlink instead of
the target (there's no target for us to look at anyway).

The manual has:

>      ls -ld -- *(-@)
> 
> lists all broken symbolic links, and

That's consistent with GNU find's -xtype l

Or:

find -L . -perm -2

-- 
Stephane

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 17:34 ` Stephane Chazelas
@ 2020-04-11 19:17   ` Vincent Lefevre
  2020-04-11 20:37     ` Stephane Chazelas
  2020-04-12 12:48   ` Peter Stephenson
  1 sibling, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-11 19:17 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-11 18:34:50 +0100, Stephane Chazelas wrote:
> 2020-04-11 17:15:11 +0200, Vincent Lefevre:
> [...]
> > +zsh:10> ls -l file3
> > lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file3 -> file0
> > 
> > file*(-W) should have no matches.
> [...]
> 
> It is not really documented but kind of implied that on broken
> symlinks, after -, we're still looking at the symlink instead of
> the target (there's no target for us to look at anyway).

This is documented differently:

    -     toggles between making the qualifiers work on symbolic links
          (the default) and the files they point to

There isn't a different condition whether the target exists or not.

> The manual has:
> 
> >      ls -ld -- *(-@)
> > 
> > lists all broken symbolic links, and
> 
> That's consistent with GNU find's -xtype l

But the behavior is not consistent with the stat system call, with
the GNU stat utility (when using the --dereference option), and with
zsh/stat.
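
The difference between the two views can be reproduced without zsh.
The following is only a sketch in portable shell, assuming that
"mktemp -d" is available; the scratch directory and the variable names
are made up for illustration. "[ -L ]" looks at the link itself, while
"[ -e ]" resolves it the way stat() does:

```shell
# Sketch: how link-level and target-level tests see a dangling symlink.
# The directory and variable names are hypothetical.
dir=$(mktemp -d) || exit 1
ln -s file0 "$dir/file3"       # file0 does not exist, so file3 dangles

# [ -L ] examines the link itself (lstat-like view).
if [ -L "$dir/file3" ]; then link_view=symlink; else link_view=other; fi
# [ -e ] follows the link (stat-like view) and fails on a missing target.
if [ -e "$dir/file3" ]; then target_view=exists; else target_view=missing; fi

echo "link view: $link_view, target view: $target_view"
rm -rf "$dir"
```

This prints "link view: symlink, target view: missing": the link
exists as an object, but nothing can be said about its target.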

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 19:17   ` Vincent Lefevre
@ 2020-04-11 20:37     ` Stephane Chazelas
  2020-04-11 23:48       ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-11 20:37 UTC (permalink / raw)
  To: zsh-workers

2020-04-11 21:17:11 +0200, Vincent Lefevre:
[...]
> > That's consistent with GNU find's -xtype l
> 
> But the behavior is not consistent with the stat system call, with
> the GNU stat utility (when using the --dereference option), and with
> zsh/stat.
[...]

But ls/stat report information, and find/globs find files.

find -xtype l and *(-@) are common, documented idioms. If only
for that, I don't think the behaviour should be changed.

And it's not clear what the better behaviour would be.

If that broken link should not be matched by *(-W), should it be
matched by *(-^W)? Why? Or should that fail the glob (cause the
shell process to exit)? What about for *(-e:code:)?

Here, you can always work around the problem with *(-W^@).
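
As an illustration only, the workaround can be tried on a scratch
directory like the one from the original report. This sketch assumes
that zsh is installed and is started with -f so that no startup files
interfere; file4 and link4 are names added here for the example, and
whether the first glob lists file3 depends on the mode the platform
gives to symlinks:

```shell
# Sketch, assuming zsh is on PATH; names other than file1/file3 are made up.
dir=$(mktemp -d) || exit 1
(
  cd "$dir" || exit 1
  touch file1 file4
  chmod 644 file1           # not world-writable
  chmod 666 file4           # world-writable
  ln -s file4 link4         # valid link to a world-writable file
  ln -s file0 file3         # dangling link, as in the report
)

# Follow links and keep world-writable entries; a dangling link is
# judged by its own mode here, so it may show up.
follow_w=$(cd "$dir" && zsh -f -c 'print -r -- *(N-W)')
# Same, but also reject whatever is still a symlink after following,
# i.e. links whose target could not be examined.
follow_w_not_link=$(cd "$dir" && zsh -f -c 'print -r -- *(N-W^@)')

echo "*(-W):   $follow_w"
echo "*(-W^@): $follow_w_not_link"
rm -rf "$dir"
```

The N qualifier is only there so that an empty result does not abort
the command with "no matches found".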

IMO, the current behaviour, though not ideal is probably the
best you can get (and again, it's consistent with "find"'s).

-- 
Stephane

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 20:37     ` Stephane Chazelas
@ 2020-04-11 23:48       ` Vincent Lefevre
  2020-04-12  1:21         ` Daniel Shahaf
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-11 23:48 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-11 21:37:14 +0100, Stephane Chazelas wrote:
> 2020-04-11 21:17:11 +0200, Vincent Lefevre:
> [...]
> > > That's consistent with GNU find's -xtype l
> > 
> > But the behavior is not consistent with the stat system call, with
> > the GNU stat utility (when using the --dereference option), and with
> > zsh/stat.
> [...]
> 
> But ls/stat report information, and find/globs find files.

Globs find files based on reported information.

> find -xtype l and *(-@) are common, documented idioms. If only
> for that, I don't think the behaviour should be changed.
> 
> And it's not clear what the better behaviour would be.
> 
> If that broken link should not be matched by *(-W), should it be
> matched by *(-^W)? Why?

No, just like file0(^W) gives "zsh: no match" (as file0 does not exist).

> Or should that fail the glob (cause the shell process to exit)?

The shell process should not exit (except with "set -e").
The behavior should be: replace the link by the target, then
apply the glob qualifiers.

> What about for *(-e:code:)?

Since file0(e:foo:) does not make zsh execute foo, *(-e:code:) should
execute the code only on existing (and accessible) targets.

> Here, you can always work around the problem with *(-W^@).

But file3(-e:foo:^@) will execute "foo". That's not equivalent.
I suppose that the form *(-^@^W) is the generic way to do it.

However, currently, the behavior does not match the documentation.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 23:48       ` Vincent Lefevre
@ 2020-04-12  1:21         ` Daniel Shahaf
  2020-04-12  2:17           ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Shahaf @ 2020-04-12  1:21 UTC (permalink / raw)
  To: zsh-workers

Vincent Lefevre wrote on Sun, 12 Apr 2020 01:48 +0200:
> On 2020-04-11 21:37:14 +0100, Stephane Chazelas wrote:
> > find -xtype l and *(-@) are common, documented idioms. If only
> > for that, I don't think the behaviour should be changed.
> > 
> > And it's not clear what the better behaviour would be.
> > 
> > If that broken link should not be matched by *(-W), should it be
> > matched by *(-^W)? Why?  
> 
> No, just like file0(^W) gives "zsh: no match" (as file0 does not exist).
> 
> > Or should that fail the glob (cause the shell process to exit)?  
> 
> The shell process should not exit (except with "set -e").
> The behavior should be: replace the link by the target, then
> apply the glob qualifiers.
> 

To be explicit, then, the proposal is that «brokensymlink(-W)» and
«brokensymlink(-^W)» should both trigger the "no match" error?  (I.e.,
the target of a broken symlink is neither writable nor not writable.)

What should «brokensymlink(-)» do?

What would be the glob qualifier syntax for broken symlinks?

Cheers,

Daniel
(ENOTIME, so not expressing an opinion and not diving into the various
points being made; just wanted to clarify a few points.)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12  1:21         ` Daniel Shahaf
@ 2020-04-12  2:17           ` Vincent Lefevre
  2020-04-12  7:09             ` Stephane Chazelas
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-12  2:17 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-12 01:21:55 +0000, Daniel Shahaf wrote:
> To be explicit, then, the proposal is that «brokensymlink(-W)» and
> «brokensymlink(-^W)» should both trigger the "no match" error?  (I.e.,
> the target of a broken symlink is neither writable nor not writable.)

Yes.

> What should «brokensymlink(-)» do?

Since no filtering is requested on the non-existing target of
the symlink, this could give "brokensymlink", i.e. ignore the
fact that "-" was used.

However, one could decide that the use of "-" will immediately
filter out broken symlinks.

> What would be the glob qualifier syntax for broken symlinks?

There would need to be something for that. But even currently, there
are things one cannot do with glob qualifiers, such as one does
not have a way to know the reason why a symlink is broken, which
can be important when one is interested in broken symlinks.

zira% ln -s /does-not-exist s1
zira% ln -s /root/foo s2
zira% ls -L s*
ls: cannot access 's1': No such file or directory
ls: cannot access 's2': Permission denied

But with glob qualifiers, there does not seem to be a way to
distinguish these two cases.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12  2:17           ` Vincent Lefevre
@ 2020-04-12  7:09             ` Stephane Chazelas
  2020-04-12 14:25               ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-12  7:09 UTC (permalink / raw)
  To: zsh-workers

2020-04-12 04:17:22 +0200, Vincent Lefevre:
[...]
> > What would be the glob qualifier syntax for broken symlinks?
> 
> There would need to be something for that. But even currently, there
> are things one cannot do with glob qualifiers, such as one does
> not have a way to know the reason why a symlink is broken, which
> can be important when one is interested in broken symlinks.
> 
> zira% ln -s /does-not-exist s1
> zira% ln -s /root/foo s2
> zira% ls -L s*
> ls: cannot access 's1': No such file or directory
> ls: cannot access 's2': Permission denied
> 
> But with glob qualifiers, there does not seem to be a way to
> distinguish these two cases.
[...]

There's:

$ zmodload zsh/system
$ ls -ld -- *(e[ERRNO=0]-e['[[ $errnos[ERRNO] = EACCES ]]'])
lrwxrwxrwx 1 chazelas chazelas 9 Apr 12 07:34 s2 -> /root/foo
$ ls -ld -- *(e[ERRNO=0]-e['[[ $errnos[ERRNO] = ENOENT ]]'])
lrwxrwxrwx 1 chazelas chazelas 15 Apr 12 07:34 s1 -> /does-not-exist

(the ERRNO=0 may not be necessary).

Note:

$ find -L . -perm -o=w
./s1
find: ‘./s2’: Permission denied

But again, *(-@) for broken symlinks is documented and widely
used, we can't break that.

So if we change "-" to exclude broken symlinks, we'd need to
special case -@. What's the scope of what should be special
cased? *(-@e['((count++))']) should probably still work as well
for instance.

How about: *(-e['((n++))']@['((brokenlinks++))'])?

And *(-@m-1) (broken links created in the last 24 hours, though
I'd expect one to write *(m-1-@) instead here)

Note that for "find -L", zsh's current behaviour is required by
POSIX (at least for links whose target can be determined not to
exist):

     -L
	  Cause the file information and file type evaluated for
	  each symbolic link encountered as a path operand on
	  the command line or encountered during the traversal
	  of a file hierarchy to be those of the file referenced
	  by the link, and not the link itself. If the
	                                        ^^^^^^
	  referenced file does not exist, the file information
	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  and type shall be for the link itself.
	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
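
The rule quoted above can be observed with a small experiment. This is
a sketch only, assuming a find that implements -L as POSIX describes
and that "mktemp -d" is available; the file names mirror the ones from
the original report:

```shell
# Sketch: with -L, only links whose target does not exist keep type "l".
dir=$(mktemp -d) || exit 1
touch "$dir/file1"
ln -s file1 "$dir/file2"      # valid link: reported with the target's type
ln -s file0 "$dir/file3"      # dangling link: reported as the link itself

broken=$(cd "$dir" && find -L . -type l)
echo "$broken"
rm -rf "$dir"
```

Only ./file3 is printed, which is the find counterpart of *(-@).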

I find some variation in behaviour though:

$ ln -s /etc/passwd/foo s3
$ gfind . -follow -perm -o=w
gfind: ‘./s3’: Not a directory
./s3
./s1
gfind: ‘./s2’: Permission denied
$ busybox find . -follow -perm -o=w
find: ./s3: Not a directory
./s1
find: ./s2: Permission denied
$ find_su3 . -follow -perm -o=w
find_su3: cannot follow symbolic link ./s3: Not a directory
find_su3: cannot follow symbolic link ./s1: No such file or directory
find_su3: cannot follow symbolic link ./s2: Permission denied

(the latter from the heirloom toolchest being not POSIX compliant)

In any case, in */*(W), or ***/*(W) or **/*(W), the cases where
directories are not readable or searchable, or symlink targets
not accessible are always silently ignored (as opposed to the
find equivalents).

-- 
Stephane

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-11 17:34 ` Stephane Chazelas
  2020-04-11 19:17   ` Vincent Lefevre
@ 2020-04-12 12:48   ` Peter Stephenson
  2020-04-12 14:31     ` Vincent Lefevre
  1 sibling, 1 reply; 24+ messages in thread
From: Peter Stephenson @ 2020-04-12 12:48 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1141 bytes --]

On Sat, 2020-04-11 at 18:34 +0100, Stephane Chazelas wrote:
> 2020-04-11 17:15:11 +0200, Vincent Lefevre:
> [...]
> > +zsh:10> ls -l file3
> > lrwxrwxrwx 1 lefevre lefevre 5 2008-12-28 22:34:28 file3 -> file0
> > 
> > file*(-W) should have no matches.
> 
> [...]
> 
> It is not really documented but kind of implied that on broken
> symlinks, after -, we're still looking at the symlink instead of
> the target (there's no target for us to look at anyway).
> 
> The manual has:
> 
> >      ls -ld -- *(-@)
> > 
> > lists all broken symbolic links, and

Yes, it's already implicit and useful that this is how it works; I use
it myself.  We should document it better.  The current form is ambiguous
--- "the file it refers to" is meaningless if it doesn't refer to a
file.

We certainly can't change this at this stage, but that wouldn't stop us
adding something with alternative behaviour.

For convenience, the proposed wording in the attached is

item(tt(-))(
toggles between making the qualifiers work on symbolic links (the
default) and the files they point to, if any; a broken symbolic link
is treated as a file in its own right
)

pws

[-- Attachment #2: broken_symlinks.dif --]
[-- Type: text/x-patch, Size: 486 bytes --]

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 2a66ab997..fd1f1ca3b 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2837,7 +2837,8 @@ negates all qualifiers following it
 )
 item(tt(-))(
 toggles between making the qualifiers work on symbolic links (the
-default) and the files they point to
+default) and the files they point to, if any; a broken symbolic link
+is treated as a file in its own right
 )
 item(tt(M))(
 sets the tt(MARK_DIRS) option for the current pattern

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12  7:09             ` Stephane Chazelas
@ 2020-04-12 14:25               ` Vincent Lefevre
  2020-04-12 17:34                 ` Stephane Chazelas
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-12 14:25 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-12 08:09:30 +0100, Stephane Chazelas wrote:
> 2020-04-12 04:17:22 +0200, Vincent Lefevre:
> > zira% ln -s /does-not-exist s1
> > zira% ln -s /root/foo s2
> > zira% ls -L s*
> > ls: cannot access 's1': No such file or directory
> > ls: cannot access 's2': Permission denied
> > 
> > But with glob qualifiers, there does not seem to be a way to
> > distinguish these two cases.
> [...]
> 
> There's:
> 
> $ zmodload zsh/system
> $ ls -ld -- *(e[ERRNO=0]-e['[[ $errnos[ERRNO] = EACCES ]]'])
> lrwxrwxrwx 1 chazelas chazelas 9 Apr 12 07:34 s2 -> /root/foo
> $ ls -ld -- *(e[ERRNO=0]-e['[[ $errnos[ERRNO] = ENOENT ]]'])
> lrwxrwxrwx 1 chazelas chazelas 15 Apr 12 07:34 s1 -> /does-not-exist
> 
> (the ERRNO=0 may not be necessary).

Well, I implicitly meant with simple glob qualifiers. Otherwise,
when allowing 'e' (to run arbitrary code), one can do almost
anything based on available information.

> Note:
> 
> $ find -L . -perm -o=w
> ./s1
> find: ‘./s2’: Permission denied
> 
> But again, *(-@) for broken symlinks is documented and widely
> used, we can't break that.

But widely used for what purpose exactly?

For instance, if the goal is to list dangling symlinks only, then
"permission denied" cases (EACCES) would yield false positives, and
existing code may be broken. So, perhaps '-' should still be kept
for dangling symlinks, but its behavior might need to be changed to
match the currently expected behavior.

And what about less common errors such as ENOMEM?

> So if we change "-" to exclude broken symlinks, we'd need to
> special case -@. What's the scope of what should be special
> cased? *(-@e['((count++))']) should probably still work as well
> for instance.
> 
> How about: *(-e['((n++))']@['((brokenlinks++))'])?
> 
> And *(-@m-1) (broken links created in the last 24 hours, though
> I'd expect one to write *(m-1-@) instead here)
> 
> Note that for "find -L", zsh's current behaviour is required by
> POSIX (at least for links whose target can be determined not to
> exist):
> 
>      -L
> 	  Cause the file information and file type evaluated for
> 	  each symbolic link encountered as a path operand on
> 	  the command line or encountered during the traversal
> 	  of a file hierarchy to be those of the file referenced
> 	  by the link, and not the link itself. If the
> 	                                        ^^^^^^
> 	  referenced file does not exist, the file information
> 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	  and type shall be for the link itself.
> 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

At least, that's explicit, unambiguous documentation.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 12:48   ` Peter Stephenson
@ 2020-04-12 14:31     ` Vincent Lefevre
  2020-04-12 15:49       ` Peter Stephenson
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-12 14:31 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-12 13:48:43 +0100, Peter Stephenson wrote:
> For convenience, the proposed wording in the attached is
> 
> item(tt(-))(
> toggles between making the qualifiers work on symbolic links (the
> default) and the files they point to, if any; a broken symbolic link
> is treated as a file in its own right
> )

The term "broken symbolic link" should properly be defined.
It seems that the zsh code ignores all stat() errors, so that
this may be very surprising, if not dangerous. Imagine a script
whose goal is to remove all dangling symlinks, but could remove
valid ones due to undetected errors as not reported by zsh...

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 14:31     ` Vincent Lefevre
@ 2020-04-12 15:49       ` Peter Stephenson
  2020-04-12 23:07         ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Stephenson @ 2020-04-12 15:49 UTC (permalink / raw)
  To: zsh-workers

On Sun, 2020-04-12 at 16:31 +0200, Vincent Lefevre wrote:
> On 2020-04-12 13:48:43 +0100, Peter Stephenson wrote:
> > For convenience, the proposed wording in the attached is
> > 
> > item(tt(-))(
> > toggles between making the qualifiers work on symbolic links (the
> > default) and the files they point to, if any; a broken symbolic link
> > is treated as a file in its own right
> > )
> 
> The term "broken symbolic link" should properly be defined.
> It seems that the zsh code ignores all stat() errors, so that
> this may be very surprising, if not dangerous. Imagine a script
> whose goal is to remove all dangling symlinks, but could remove
> valid ones due to undetected errors as not reported by zsh...

Right, you're saying we don't know the reason it failed without looking
explicitly at the error, and should be clear about that... how about...

item(tt(-))(
toggles between making the qualifiers work on symbolic links (the
default) and the files they point to, if any; any symbolic link for
whose target the `tt(stat)' system call fails (whatever the cause of the
failure) is treated as a file in its own right
)

I don't think we can necessarily tell the reason (as opposed to the error
code), can we?  EPERM might refer to a surrounding directory?

pws


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 14:25               ` Vincent Lefevre
@ 2020-04-12 17:34                 ` Stephane Chazelas
  2020-04-12 23:38                   ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-12 17:34 UTC (permalink / raw)
  To: zsh-workers

2020-04-12 16:25:44 +0200, Vincent Lefevre:
[...]
> > 	  by the link, and not the link itself. If the
> > 	                                        ^^^^^^
> > 	  referenced file does not exist, the file information
> > 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 	  and type shall be for the link itself.
> > 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> At least, that's explicit, unambiguous documentation.
[...]

Well, as seen in the varying interpretations made by the various
implementations, it is not that clear. How do you determine
whether a file exists or not? What does "exist" mean?

If ENOENT or ENOTDIR is returned upon stat(), you can tell
there's no file at that path, but it's not so clear with ELOOP,
EACCES, EOVERFLOW, ENOMEM for instance, where we can't tell for
sure the file doesn't "exist". It didn't exist for the caller at
the time, in that the caller couldn't access it, but it may exist
in the "absolute", or for a caller with different credentials or in
a different namespace (which could also apply to ENOENT/ENOTDIR in
that case).

That was discussed on the austin-group mailing list not so long
ago wrt glob(3)

https://www.mail-archive.com/austin-group-l@opengroup.org/msg04496.html
https://www.austingroupbugs.net/view.php?id=1275

-- 
Stephane

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 15:49       ` Peter Stephenson
@ 2020-04-12 23:07         ` Vincent Lefevre
  0 siblings, 0 replies; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-12 23:07 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-12 16:49:45 +0100, Peter Stephenson wrote:
> Right, you're saying we don't know the reason it failed without looking
> explicitly at the error, and should be clear about that... how about...
> 
> item(tt(-))(
> toggles between making the qualifiers work on symbolic links (the
> default) and the files they point to, if any; any symbolic link for
> whose target the `tt(stat)' system call fails (whatever the cause of the
> failure) is treated as a file in its own right
> )

OK.

> I don't think we can necessarily tell the reason (as opposed to the error
> code), can we?  EPERM might refer to a surrounding directory?

There's no EPERM. Did you mean EACCES?

[EACCES]
    Search permission is denied for a component of the path prefix.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/stat.html

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 17:34                 ` Stephane Chazelas
@ 2020-04-12 23:38                   ` Vincent Lefevre
  2020-04-13 14:22                     ` Stephane Chazelas
  0 siblings, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-12 23:38 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-12 18:34:48 +0100, Stephane Chazelas wrote:
> 2020-04-12 16:25:44 +0200, Vincent Lefevre:
> [...]
> > > 	  by the link, and not the link itself. If the
> > > 	                                        ^^^^^^
> > > 	  referenced file does not exist, the file information
> > > 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > 	  and type shall be for the link itself.
> > > 	  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > At least, that's explicit, unambiguous documentation.
> [...]
> 
> Well, as seen in the varying interpretations made by the various
> implementations, it is not that clear. How do you determine
> whether a file exists or not? What does "exist" mean?

You necessarily know it, or that's an error case (permission denied
or whatever). The behavior may not be clear in case of error, but
one can expect at least a non-zero exit status:

  EXIT STATUS

    The following exit values shall be returned:

     0
        All path operands were traversed successfully.
    >0
        An error occurred.

As a comparison, for "test -e"

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html

POSIX does not say "true if file exists" but conditions the result
to the pathname resolution, so that the result is false if any error
occurs, even if the file actually exists:

  -e  pathname
    True if pathname resolves to an existing directory entry.
    False if pathname cannot be resolved.

BTW, I don't know how zsh behaves on "[[ -e pathname ]]" in case of
error other than ENOENT in the pathname resolution, but this should
be documented (and ditto for the other conditional expressions).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-12 23:38                   ` Vincent Lefevre
@ 2020-04-13 14:22                     ` Stephane Chazelas
  2020-04-13 15:00                       ` Bart Schaefer
  2020-04-13 21:41                       ` Vincent Lefevre
  0 siblings, 2 replies; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-13 14:22 UTC (permalink / raw)
  To: zsh-workers

2020-04-13 01:38:45 +0200, Vincent Lefevre:
[...]
> > Well, as seen in the varying interpretations made by the various
> > implementations, it is not that clear. How do you determine
> > whether a file exists or not? What does "exist" mean?
> 
> You necessarily know it, or that's an error case (permission denied
> or whatever). The behavior may not be clear in case of error, but
> one can expect at least a non-zero exit status:
[...]

Are you saying that EACCES is an "error" but ENOENT is not? How
about ENOTDIR, ELOOP? None of /etc/passwd/foo, /etc/pesswd/foo,
symloop/foo or /root/foo exist on my system, a stat() on a
symlink to those would return ENOTDIR, ENOENT, ELOOP, EACCES. On
which one should find -L return with a non-zero exit status?
Which one(s) should find -L . -type l (or find . -xtype l)
print?

[...]
>   -e  pathname
>     True if pathname resolves to an existing directory entry.
>     False if pathname cannot be resolved.
> 
> BTW, I don't know how zsh behaves on "[[ -e pathname ]]" in case of
> error other than ENOENT in the pathname resolution, but this should
> be documented (and ditto for the other conditional expressions).
[...]

The mention of "directory entry" is misleading here. It's really
about a "file" more than a "directory entry" as stat() gets you
to the inode.

Maybe:

 -e  pathname
   True if pathname can be determined to resolve to an existing
   file (of any type). False otherwise. Note that since this
   applies after symlink resolution, that will return false for
   existing but broken symlinks. Use [[ -e /path/to/file || -L
   /path/to/file ]] to account for that (requires search access
   to the directory).

But that's probably going to confuse the reader more than help
them.
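
For what it is worth, the "-e ... || -L ..." test suggested above also
works with the portable [ ... ] form. The sketch below wraps it in a
helper; path_present is a hypothetical name chosen for this example and
is not part of zsh or POSIX, and "mktemp -d" is assumed to be available:

```shell
# Hypothetical helper: succeeds when the path names an existing file of
# any type, or a symlink whose target cannot be resolved.
path_present() {
  [ -e "$1" ] || [ -L "$1" ]
}

dir=$(mktemp -d) || exit 1
touch "$dir/real"
ln -s missing-target "$dir/dangling"

for name in real dangling absent; do
  if path_present "$dir/$name"; then
    echo "$name: present"
  else
    echo "$name: not present"
  fi
done
rm -rf "$dir"
```

Here "real" and "dangling" are reported as present and "absent" is
not, whereas a plain [ -e ] would also reject "dangling".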

To confuse them even more, we could also mention
(){(($#))} (#I)/path/to/file(|)(N) to really check for file
being an entry in /path/to (requires read access to the
directory and extendedglob for (#I) used to cancel nocaseglob
in case it's enabled).

IIRC the "test" operator for that was initially "-a" (for
"accessible"?), which was a bit more accurate (or maybe it was a
SYSV-sh vs ksh thing, I can't remember).

See also
https://stackoverflow.com/questions/638975/how-do-i-tell-if-a-regular-file-does-not-exist-in-bash/40046642#40046642

-- 
Stephane

* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-13 14:22                     ` Stephane Chazelas
@ 2020-04-13 15:00                       ` Bart Schaefer
  2020-04-13 21:41                       ` Vincent Lefevre
  1 sibling, 0 replies; 24+ messages in thread
From: Bart Schaefer @ 2020-04-13 15:00 UTC (permalink / raw)
  To: zsh-workers

On Mon, Apr 13, 2020 at 7:23 AM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> [...]
> >   -e  pathname
> >     True if pathname resolves to an existing directory entry.
> >     False if pathname cannot be resolved.
> >
> > BTW, I don't know how zsh behaves on "[[ -e pathname ]]" in case of
> > error other than ENOENT in the pathname resolution, but this should
> > be documented (and ditto for the other conditional expressions).
> [...]
>
> The mention of "directory entry" is misleading here. It's really
> about a "file" more than a "directory entry" as stat() gets you
> to the inode.

The problem is that "file" implies "not a directory", but a symlink
can refer to either.  And many users don't know the concept of "inode"
as separate from "directory entry".


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-13 14:22                     ` Stephane Chazelas
  2020-04-13 15:00                       ` Bart Schaefer
@ 2020-04-13 21:41                       ` Vincent Lefevre
  2020-04-14  6:18                         ` Stephane Chazelas
  1 sibling, 1 reply; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-13 21:41 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-13 15:22:57 +0100, Stephane Chazelas wrote:
> 2020-04-13 01:38:45 +0200, Vincent Lefevre:
> [...]
> > > Well, as seen in the varying interpretations made by the various
> > > implementations, it is not that clear. How do you determine
> > > whether a file exists or not? What does "exist" mean?
> > 
> > You necessarily know it, or that's an error case (permission denied
> > or whatever). The behavior may not be clear in case of error, but
> > one can expect at least a non-zero exit status:
> [...]
> 
> Are you saying that EACCES is an "error" but ENOENT is not?

Yes, because getting EACCES implies that the system could not
determine whether the file exists or not. With ENOENT, the system
could determine that the file does not exist.

> How about ENOTDIR, ELOOP? None of /etc/passwd/foo, /etc/pesswd/foo,
> symloop/foo or /root/foo exist on my system, a stat() on a symlink
> to those would return ENOTDIR, ENOENT, ELOOP, EACCES. On which one
> should find -L return with a non-zero exit status?

ENOTDIR: in the symlink resolution, the system got a path prefix that
is not a directory, and this implies that the file cannot exist. Thus
this should not be regarded as an error (since the target file does
not exist, the file information and type is for the link itself).

ENOENT: Not an error, as this means that the target file does not
exist; so the file information and type is for the link itself.

ELOOP: This can imply two possibilities: either there is really a
loop, so that the symlink does not resolve to a file, or the number
of redirections is too large, but in this case, I would say that
this also means that the symlink does not resolve to a file because
a probably fixed system limit has been reached. Therefore it should
be fine to regard this as "not an error", like ENOTDIR and ENOENT.
I think that regarding this as an error would be bad because even
with sufficient permissions (e.g. as root) and resources, this could
make "find" fail every time, thus would not be very helpful.

EACCES: an error since the system could not determine whether
/root/foo exists or not, due to the lack of permissions.

ENOMEM would also be an error, assuming that this could occur on
any symlink.
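
As a rough sketch, the classification proposed above could be written as a lookup table. The function name classify_stat_errno is hypothetical, and since a shell script has no direct access to the errno of a failed stat(), it takes the symbolic name as an argument. Treating every unlisted code as an error is an assumption of this sketch.

```shell
# Hypothetical helper: map a symbolic errno name from symlink resolution
# to "absent" (the target is known not to exist) or "error" (existence
# could not be determined).
classify_stat_errno() {
    case $1 in
        ENOENT|ENOTDIR|ELOOP)
            echo absent ;;     # the symlink does not resolve to a file
        *)
            echo error ;;      # EACCES, ENOMEM, and (assumed) anything else
    esac
}
```

With this table, a link to /etc/passwd/foo (ENOTDIR) counts as a broken symlink, while a link to /root/foo (EACCES for an unprivileged user) would produce an error message and a non-zero exit status.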

> Which one(s) should find -L . -type l (or find . -xtype l)
> print?

/etc/passwd/foo
/etc/pesswd/foo
symloop/foo

(and I would expect an error message for /root/foo, such as
"Permission denied", in addition to a non-zero exit status).

> [...]
> >   -e  pathname
> >     True if pathname resolves to an existing directory entry.
> >     False if pathname cannot be resolved.
> > 
> > BTW, I don't know how zsh behaves on "[[ -e pathname ]]" in case of
> > error other than ENOENT in the pathname resolution, but this should
> > be documented (and ditto for the other conditional expressions).
> [...]
> 
> The mention of "directory entry" is misleading here. It's really
> about a "file" more than a "directory entry" as stat() gets you
> to the inode.

Bart replied. In any case, the inode here will necessarily correspond
to a directory entry (it will not be an orphaned inode), and with the
symlink resolution algorithm, you can also determine the directory in
question. So, nothing wrong here.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-13 21:41                       ` Vincent Lefevre
@ 2020-04-14  6:18                         ` Stephane Chazelas
  2020-04-14 12:02                           ` Daniel Shahaf
  2020-04-14 17:59                           ` Vincent Lefevre
  0 siblings, 2 replies; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-14  6:18 UTC (permalink / raw)
  To: zsh-workers

2020-04-13 23:41:49 +0200, Vincent Lefevre:
[...]
> > Which one(s) should find -L . -type l (or find . -xtype l)
> > print?
> 
> /etc/passwd/foo
> /etc/pesswd/foo
> symloop/foo
> 
> (and I would expect an error message for /root/foo, such as
> "Permission denied", in addition to a non-zero exit status).

So not that "unambiguous" after all. I could not find a single
find implementation that agrees with your interpretation (not
that it means that your interpretation is better or worse).

GNU find for instance only prints /etc/pesswd/foo and
/etc/passwd/foo (but outputs an error for the latter) and
returns non-zero for anything but /etc/pesswd/foo.
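
To reproduce the comparison, the four cases can be set up roughly like this. It is a sketch under assumptions: make_testlinks and the link names are hypothetical, /etc/passwd is assumed to be a regular file, /etc/pesswd is assumed not to exist, and /root is assumed to be unsearchable for the user running the test.

```shell
# Hypothetical helper: create one symlink per failure mode inside "$1".
# Each link is named after the errno that stat() is expected to return.
make_testlinks() {
    dir=$1
    ln -s symloop         "$dir/symloop" &&  # link to itself, used for ELOOP
    ln -s /etc/passwd/foo "$dir/enotdir" &&  # path prefix is not a directory
    ln -s /etc/pesswd/foo "$dir/enoent"  &&  # target does not exist
    ln -s symloop/foo     "$dir/eloop"   &&  # resolution goes through the loop
    ln -s /root/foo       "$dir/eacces"      # no search permission (non-root)
}
```

After make_testlinks "$dir", running find -L "$dir" -type l followed by echo $? shows which links a given find implementation reports and whether it exits with a non-zero status. The self-referencing symloop link lives in the same directory, so it will be examined too.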

What should the outcome be for ESYS123 error code?

To me, the best approach is zsh's where *(-@) reports *all*
broken links, broken meaning "whose target cannot be resolved".

> > [...]
> > >   -e  pathname
> > >     True if pathname resolves to an existing directory entry.
> > >     False if pathname cannot be resolved.
> > > 
> > > BTW, I don't know how zsh behaves on "[[ -e pathname ]]" in case of
> > > error other than ENOENT in the pathname resolution, but this should
> > > be documented (and ditto for the other conditional expressions).
> > [...]
> > 
> > The mention of "directory entry" is misleading here. It's really
> > about a "file" more than a "directory entry" as stat() gets you
> > to the inode.
> 
> Bart replied. In any case, the inode here will necessarily correspond
> to a directory entry (it will not be an orphaned inode), and with the
> symlink resolution algorithm, you can also determine the directory in
> question. So, nothing wrong here.
[...]

An example:

# ls -la
total 1
drwxr-xr-x 2 root root 2 Aug 15  2018 ./
drwxr-xr-x 5 root root 5 Mar 18  2019 ../
# [[ -e .zfs ]] && echo yes
yes

No .zfs directory entry, but [[ -e .zfs ]] still returns true.
On ZFS filesystems, the root of each dataset has a hidden
"virtual" .zfs directory that "exists" but not as a directory
entry. That's not unique to ZFS: NetApp filesystems and several
FUSE-based ones behave the same way.

And there's:

$ ls -a 1
.  ..  file
$ ls -ld 1
dr--r--r-- 2 chazelas chazelas 3 Apr 14 07:07 1
$ [[ -e 1/file ]] || echo no
no
$ (){(($#))} (#I)1/file(|)(N) && echo yes
yes

That directory does have a "file" entry but [[ -e 1/file ]] does
not report it (and there's a symmetric problem for a=x
directories which don't have entries which the user can see but
for which [[ -e ... ]] finds entries).

There's also the case of case-insensitive or Unicode-normalizing
file systems.

-- 
Stephane


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-14  6:18                         ` Stephane Chazelas
@ 2020-04-14 12:02                           ` Daniel Shahaf
  2020-04-14 12:38                             ` Stephane Chazelas
  2020-04-14 17:59                           ` Vincent Lefevre
  1 sibling, 1 reply; 24+ messages in thread
From: Daniel Shahaf @ 2020-04-14 12:02 UTC (permalink / raw)
  To: zsh-workers

Stephane Chazelas wrote on Tue, 14 Apr 2020 07:18 +0100:
> 2020-04-13 23:41:49 +0200, Vincent Lefevre:
> [...]
> > > Which one(s) should find -L . -type l (or find . -xtype l)
> > > print?  
> > 
> > /etc/passwd/foo
> > /etc/pesswd/foo
> > symloop/foo
> > 
> > (and I would expect an error message for /root/foo, such as
> > "Permission denied", in addition to a non-zero exit status).  
> 
> So not that "unambiguous" after all. I could not find a single
> find implementation that agrees with your interpretation (not
> that it means that your intepretation is better or worse).
> 
> GNU find for instance only prints /etc/pesswd/foo and
> /etc/passwd/foo (but outputs an error for the latter) and
> returns non-zero for anything but /etc/pesswd/foo.
> 
> What should the outcome be for ESYS123 error code?
> 
> To me, the best approach is zsh's where *(-@) reports *all*
> broken links, broken meaning "whose target cannot be resolved".

Counter-argument: since an ENOMEM during symlink resolution causes
«(-@)» to presume the symlink is broken, the zsh language is
non-deterministic: what «intact-symlink(N-@)» will expand to will
depend on whether there is enough memory at runtime.

Shouldn't an ENOMEM during expansion of «intact-symlink(N-@)» result in
an error?  "In the face of ambiguity, refuse the temptation to guess."

That is: I think there's a qualitative difference between ENOENT and ENOMEM.

I'm not sure what to do about unknown error codes.

Cheers,

Daniel


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-14 12:02                           ` Daniel Shahaf
@ 2020-04-14 12:38                             ` Stephane Chazelas
  2020-04-15  0:44                               ` Daniel Shahaf
  0 siblings, 1 reply; 24+ messages in thread
From: Stephane Chazelas @ 2020-04-14 12:38 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: zsh-workers

2020-04-14 12:02:41 +0000, Daniel Shahaf:
[...]
> Counter-argument: since an ENOMEM during symlink resolution causes
> «(-@)» to presume the symlink is broken, the zsh language is
> non-deterministic: what «intact-symlink(N-@)» will expand to will
> depend on whether there is enough memory at runtime.
> 
> Shouldn't an ENOMEM during expansion of «intact-symlink(N-@)» result in
> an error?  "In the face of ambiguity, refuse the temptation to guess."
> 
> That is: I think there's a qualitative difference between ENOENT and ENOMEM.
[...]

ENOMEM/EFAULT/EINVAL are the kind of pathological errors for
which I'd say you can't do much more than best effort.

If you run out of kernel memory to the point that stat() in your
shell fails, or if bits start to randomly flip causing
EFAULT/EINVAL, more things are going to fail and an inaccurate
glob expansion is probably the least of your concerns.

And what can you do? Check errno after each and every system
call? And what's the list of errno values we should consider?
And how should we handle them? How likely is it to happen?
What's the worst that can happen if it's not handled "properly"?
Can that be exploited?

Say you're the sysadmin that needs to fix that low kernel memory
situation, it's probably better to keep your shell running as
long as possible. That's a narrow ridge to walk on there. I
agree that letting the user know of that pathological condition
is useful, but we need to be careful the cure be not worse than
the disease.

In any case, I would imagine this kind of consideration has
already been discussed at length here and elsewhere, and
I'm no expert at all on that.

-- 
Stephane


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-14  6:18                         ` Stephane Chazelas
  2020-04-14 12:02                           ` Daniel Shahaf
@ 2020-04-14 17:59                           ` Vincent Lefevre
  1 sibling, 0 replies; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-14 17:59 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-14 07:18:16 +0100, Stephane Chazelas wrote:
> 2020-04-13 23:41:49 +0200, Vincent Lefevre:
> [...]
> > > Which one(s) should find -L . -type l (or find . -xtype l)
> > > print?
> > 
> > /etc/passwd/foo
> > /etc/pesswd/foo
> > symloop/foo
> > 
> > (and I would expect an error message for /root/foo, such as
> > "Permission denied", in addition to a non-zero exit status).
> 
> So not that "unambiguous" after all.

Note that "unambiguous" does not necessarily mean that the
result is known.

> I could not find a single
> find implementation that agrees with your interpretation (not
> that it means that your intepretation is better or worse).
> 
> GNU find for instance only prints /etc/pesswd/foo and
> /etc/passwd/foo (but outputs an error for the latter) and
> returns non-zero for anything but /etc/pesswd/foo.

I've said that for ELOOP (here, symloop/foo), it should be fine to
regard this as "not an error". This does not mean that the opposite
is necessarily wrong. IMHO, this is just a bad choice.

IMHO, the fact that it returns non-zero for /etc/passwd/foo is a
bug.

> What should the outcome be for ESYS123 error code?

It seems non-standard, thus an error (unless the implementation
knows what it means and the consequence on the existence of the
file).

Note that in any case, an error is always possible, even when not
expected. For instance, it could be due to a network issue in case
of NFS, and more generally some hardware failure. Scripts must be
able to handle errors at any time.

> To me, the best approach is zsh's where *(-@) reports *all*
> broken links, broken meaning "whose target cannot be resolved".

Since zsh regards "permission denied" errors as non-matching (for
instance, on my machine, /r*/* expands to objects under /run, with
no errors, even though the /root directory is not accessible),
I think it is fine that here, EACCES errors (permission denied) be
regarded as non-existing. But IMHO, "serious" errors such as ENOMEM
should be reported as such with globbing (in general, not just for
symlink resolution), i.e. zsh should not execute the command and
should report an error instead, like with a "bad pattern" error.

[...]
> An example:
> 
> # ls -la
> total 1
> drwxr-xr-x 2 root root 2 Aug 15  2018 ./
> drwxr-xr-x 5 root root 5 Mar 18  2019 ../
> # [[ -e .zfs ]] && echo yes
> yes
> 
> No .zfs directory entry, but [[ -e .zfs ]] still returns true.
> On ZFS filesystems, the root of each dataset has a hidden
> "virtual" .zfs directory that "exists" but not as a directory
> entry. That's not unique to ZFS, netapp FSs and several
> fuse-based ones are in that case.

Is this POSIX compliant? If it is, I would say that it is a bug.

> And there's:
> 
> $ ls -a 1
> .  ..  file
> $ ls -ld 1
> dr--r--r-- 2 chazelas chazelas 3 Apr 14 07:07 1
> $ [[ -e 1/file ]] || echo no
> no
> $ (){(($#))} (#I)1/file(|)(N) && echo yes
> yes
> 
> That directory does have a "file" entry but [[ -e 1/file ]] does
> not report it (and there's a symmetric problem for a=x
> directories which don't have entries which the user can see but
> for which [[ -e ... ]] finds entries).

I would say that this is also a bug. However, perhaps not, since
how the resolution is done is not specified.

> There's also the case of case insensitive or unicode-normalizing
> file systems.

POSIX compliant?

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-14 12:38                             ` Stephane Chazelas
@ 2020-04-15  0:44                               ` Daniel Shahaf
  2020-04-15  9:17                                 ` Vincent Lefevre
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Shahaf @ 2020-04-15  0:44 UTC (permalink / raw)
  To: zsh-workers

Stephane Chazelas wrote on Tue, 14 Apr 2020 13:38 +0100:
> 2020-04-14 12:02:41 +0000, Daniel Shahaf:
> [...]
> > Counter-argument: since an ENOMEM during symlink resolution causes
> > «(-@)» to presume the symlink is broken, the zsh language is
> > non-deterministic: what «intact-symlink(N-@)» will expand to will
> > depend on whether there is enough memory at runtime.
> > 
> > Shouldn't an ENOMEM during expansion of «intact-symlink(N-@)» result in
> > an error?  "In the face of ambiguity, refuse the temptation to guess."
> > 
> > That is: I think there's a qualitative difference between ENOENT and ENOMEM.  
> [...]
> 
> ENOMEM/EFAULT/EINVAL are the kind of pathological errors for
> which I'd say you can't do much more than best effort.
> 
> If you run out of kernel memory to the point that stat() in your
> shell fails, or if bits start to randomly flip causing
> EFAULT/EINVAL, more things are going to fail and an inaccurate
> glob expansion is probably the least of your concerns.
> 

I'm sure you can imagine other transient failure modes, e.g., a network
error with a network-based filesystem.

> And what can you do? Check errno after each and every system
> call?

I am talking about a specific stat(2) syscall in glob.c.

> And what's the list of errno values we should consider?
> And how should we handle them?

If there is agreement that not all errno values should be treated the
same way, then we can move on to answering these questions.

> How likely is it to happen?

Filesystems do get corrupt or go offline from time to time.

> What's the worst that can happen if it's not handled "properly"?

Depends on how we handle it, obviously.  If we handle it by returning an
error and aborting the current command line, the worst that can happen
is that a command line (or script) would be aborted, whereas currently
it would silently continue execution with wrong data.

> Can that be exploited?

That's a question for the script author.

> Say you're the sysadmin that needs to fix that low kernel memory
> situation, it's probably better to keep your shell running as
> long as possible.

No one proposed to kill the shell.  An error in an interactive shell
just aborts the current command-line and drops the user back to the
prompt.

> That's a narrow ridge to walk on there. I agree that letting the user
> know of that pathological condition is useful, but we need to be
> careful the cure be not worse than the disease.
> 
> In any case, I would imagine this kind of consideration has
> already been discussed at length here and elsewhere. In anycase,
> I'm no expert at all on that.


* Re: glob qualifier '-' doesn't work correctly on dangling symlinks
  2020-04-15  0:44                               ` Daniel Shahaf
@ 2020-04-15  9:17                                 ` Vincent Lefevre
  0 siblings, 0 replies; 24+ messages in thread
From: Vincent Lefevre @ 2020-04-15  9:17 UTC (permalink / raw)
  To: zsh-workers

On 2020-04-15 00:44:03 +0000, Daniel Shahaf wrote:
> Stephane Chazelas wrote on Tue, 14 Apr 2020 13:38 +0100:
[Pathological errors in globbing]
> > What's the worst that can happen if it's not handled "properly"?
> 
> Depends on how we handle it, obviously.  If we handle it by returning an
> error and aborting the current command line, the worst that can happen
> is that a command line (or script) would be aborted, whereas currently
> it would silently continue execution with wrong data.

For instance, one can imagine a script that would fix permissions
based on a glob like *(W) before making the directory world-readable.
If the error is not reported, some files would be left world-writable
and an attack would be possible due to the directory becoming
world-readable. With an error, the script would be able to detect
the issue or abort (e.g. with "set -e").
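
A minimal sketch of such a script, written so that a failed check stops it before the directory is opened up. It deliberately uses find instead of the zsh glob *(W), because find reports files it could not examine through its exit status; publish_dir is a hypothetical name.

```shell
# Hypothetical sketch: remove world-write permission from everything under
# a directory, and make the directory world-readable only if that succeeded.
publish_dir() {
    dir=$1
    # find exits non-zero if it could not examine a file or if chmod failed.
    # Symlinks are skipped because chmod would follow them.
    find "$dir" ! -type l -perm -0002 -exec chmod o-w {} + || {
        echo "publish_dir: could not check all of $dir, leaving it as is" >&2
        return 1
    }
    chmod o+rx "$dir"
}
```

If the permission check fails for any reason, the function returns non-zero without touching the directory mode, so a caller running under "set -e" aborts at that point.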

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


end of thread, other threads:[~2020-04-15  9:18 UTC | newest]

Thread overview: 24+ messages
2020-04-11 15:15 glob qualifier '-' doesn't work correctly on dangling symlinks Vincent Lefevre
2020-04-11 17:34 ` Stephane Chazelas
2020-04-11 19:17   ` Vincent Lefevre
2020-04-11 20:37     ` Stephane Chazelas
2020-04-11 23:48       ` Vincent Lefevre
2020-04-12  1:21         ` Daniel Shahaf
2020-04-12  2:17           ` Vincent Lefevre
2020-04-12  7:09             ` Stephane Chazelas
2020-04-12 14:25               ` Vincent Lefevre
2020-04-12 17:34                 ` Stephane Chazelas
2020-04-12 23:38                   ` Vincent Lefevre
2020-04-13 14:22                     ` Stephane Chazelas
2020-04-13 15:00                       ` Bart Schaefer
2020-04-13 21:41                       ` Vincent Lefevre
2020-04-14  6:18                         ` Stephane Chazelas
2020-04-14 12:02                           ` Daniel Shahaf
2020-04-14 12:38                             ` Stephane Chazelas
2020-04-15  0:44                               ` Daniel Shahaf
2020-04-15  9:17                                 ` Vincent Lefevre
2020-04-14 17:59                           ` Vincent Lefevre
2020-04-12 12:48   ` Peter Stephenson
2020-04-12 14:31     ` Vincent Lefevre
2020-04-12 15:49       ` Peter Stephenson
2020-04-12 23:07         ` Vincent Lefevre
