zsh-workers
 help / color / mirror / code / Atom feed
* Command hashing/autocd bug & possible fixes
@ 2019-03-16 20:01 ` Charles Blake
  2019-03-18 10:01   ` Peter Stephenson
  0 siblings, 1 reply; 3+ messages in thread
From: Charles Blake @ 2019-03-16 20:01 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 5520 bytes --]

First the bug.  It happens on any OS.  It just requires there to be a
command in $PATH whose name collides with a root dir entry such as "usr",
"bin", "dev" or "opt".  The below creates a collision (assumes usual /dev
dir exists), does a setopt AUTOCD, witnesses that working, then failing,
then working again.  It should reproduce with a zsh -f (but should not
be sensitive to much besides unsetopt hashcmds):

  PATH=/tmp:$PATH; touch /tmp/dev; chmod +x /tmp/dev
  setopt AUTOCD
  /dev
  pwd
  whence /dev
  /dev          # errors out with permission denied
  unhash /dev
  /dev          # works again

If you do a 'hash -L | grep /dev' after the whence you can see what is
amiss is that /dev is being put in as a hash key mapping to /tmp//dev.
If you strace you can see that Zsh ends up trying to execve("/dev")
which fails with EPERM, hence the message.  unsetopt hashcmds also
works around/removes the problem.

Note that one might think with setopt PATHDIRS in effect that there
would be an ambiguity about what should happen.  /dev appended to an
element of PATH after all hits a real program, /tmp//dev, with the
usual Unix path collapse rules.  Real programs are supposed to take
priority over AUTOCD.  However, the man pages for PATHDIRS say it
should NOT work for commands that begin with "/", "./" or "../".
So, I think it should either do the AUTOCD or the docs are wrong.
I think the "/", "./", "../" is a simple, good rule and AUTOCD
should go through.

This is not entirely theoretical problem.  LLVM comes with a program
called "opt" & "/opt" is very common.  I don't know how popular AUTOCD
is, but I've always loved it.  A lot of typical root entries are not
implausible names for commands.  Finally, getting misleading permission
denied on an "attempted cd" to a root entry is..worrisome.


Now ideas for a fix.  About these I am less confident as there may be
a tangled web of concerns.

Since it makes little sense to call whence with a term that explicitly
blocks path search, one possible fix is inside Src/exec.c:findcmd():

diff --git a/Src/exec.c b/Src/exec.c
index 042ba065a..e6b20dcb6 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -834,7 +834,7 @@ findcmd(char *arg0, int docopy, int default_path)
        return NULL;
     }
     cn = (Cmdnam) cmdnamtab->getnode(cmdnamtab, arg0);
-    if (!cn && isset(HASHCMDS) && !isrelative(arg0))
+    if (!cn && isset(HASHCMDS) && !isrelative(arg0) && *arg0 != '/')
        cn = hashcmd(arg0, path);
     if ((int) strlen(arg0) > PATH_MAX)
        return NULL;

That tests out fine for my "whence" case, but besides bin_whence(),
findcmd() has 6 other call sites: Src/subst.c:equalsubstr(),
Src/Zle/zle_tricky.c:docomplete() Src/Zle/zle_tricky.c:expandcmdpath(),
Src/Zle/compctl.c:makecomplistpc(), Src/Zle/compctl.c:makecomplistcmd(),
and Src/Zle/compctl.c:addmatch().

It's possible those sites are also buggy OR possible they have reason
to put an absolute path into cmdnametab.  If they have a legitimate
need, findcmd() could grow an allowRooted parameter they could pass.
That's a more involved patch, obviously.

Another possible fix might be to try to stop hashcmd from entering
keys with a leading '/' (or accept an allowRooted parameter).  The
easiest way to do that..

diff --git a/Src/exec.c b/Src/exec.c
index 042ba065a..79ef83c1e 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -940,6 +940,8 @@ hashcmd(char *arg0, char **pp)
     char *s, buf[PATH_MAX+1];
     char **pq;

+    if (*arg0 == '/')
+        return NULL;
     for (; *pp; pp++)
        if (**pp == '/') {
            s = buf;

..also tests out fine for my "whence" example.  That sidesteps the
HASHDIRS optimization only for the (probably rare?) rooted path.
If that's viewed as a real problem, the lower isset(HASHDIRS) code
could be duplicated before the early return.  There are only a few
more call sites to hashcmd besides the 6 indirect calls already
mentioned, Src/utils.c:spckword(), Src/Zle/zle_tricky.c:docomplete(),
and Src/utils.c:execcmd_exec().

Chances seem good that the original concept of cmdnametab explicitly
did not want rooted paths in there, but that somewhere in those 9
other call sites someone cares to allow absolute paths for some reason
or another.  But I do think that once a rooted path is in there, if
it collides with a root directory entry the same name collision logic
will break autocd.

So, finally, it also seems possible to let rooted paths stay and have a
smarter isreallycom():

diff --git a/Src/exec.c b/Src/exec.c
index 042ba065a..701e8c2c7 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -906,6 +906,8 @@ isreallycom(Cmdnam cn)
     else if (!cn->u.name)
        return 0;
     else {
+        if (*cn->node.nam == '/')
+            return 0;
        strcpy(fullnam, *(cn->u.name));
        strcat(fullnam, "/");
        strcat(fullnam, cn->node.nam);

This lets "/dev" be put into cmdnametab, but the smarter isreallycmd()
prevents that from interfering with autocd.  We could also add an
"&& isset(AUTOCD) " to that nam == '/' condition, too.  isreallycmd() is
only called in once place the trycd clause of  execcmd_exec().  This
is the most conservative approach (and, yes, yes, whatever we go
with should all have some comment explaining itself a bit).

Does anyone have any firm opinions on what approach they would like
beyond the goes-without-saying "don't break stuff"?  Should rooted
paths be blocked from cmdnametab outright keeping it "clean"?  Or
should we just workaround their presence?  Both?  Some other approach
entirely?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Command hashing/autocd bug & possible fixes
  2019-03-16 20:01 ` Command hashing/autocd bug & possible fixes Charles Blake
@ 2019-03-18 10:01   ` Peter Stephenson
  2019-03-18 21:20     ` Charles Blake
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Stephenson @ 2019-03-18 10:01 UTC (permalink / raw)
  To: zsh-workers

On Sat, 2019-03-16 at 16:01 -0400, Charles Blake wrote:
> Another possible fix might be to try to stop hashcmd from entering
> keys with a leading '/' (or accept an allowRooted parameter).  The
> easiest way to do that..
> 
> diff --git a/Src/exec.c b/Src/exec.c
> index 042ba065a..79ef83c1e 100644
> --- a/Src/exec.c
> +++ b/Src/exec.c
> @@ -940,6 +940,8 @@ hashcmd(char *arg0, char **pp)
>      char *s, buf[PATH_MAX+1];
>      char **pq;
> 
> +    if (*arg0 == '/')
> +        return NULL;
>      for (; *pp; pp++)
>         if (**pp == '/') {
>             s = buf;

This certainly looks plausible.  I can't offhand think of a reason why a
hashed command would begin with a "/"; it's just asking for trouble.

We've been confusingly lax about this sort of thing in other places ---
for example, autoload of a function beginning with a "/" used to do path
look up under $fpath, though for a couple of years now has been treated
as an absolute path.

Unless anyone comes up with a counterexample I think I'll put this in.

Thanks for the work.
pws



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Command hashing/autocd bug & possible fixes
  2019-03-18 10:01   ` Peter Stephenson
@ 2019-03-18 21:20     ` Charles Blake
  0 siblings, 0 replies; 3+ messages in thread
From: Charles Blake @ 2019-03-18 21:20 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2954 bytes --]

>This certainly looks plausible.  I can't offhand think of a reason why a
>hashed command would begin with a "/"; it's just asking for trouble.

Cool.  I think stricter/cleaner cmdnamtab is safer long-term.  Not sure
any users will complain about 'hash' built-in no longer saving absolute
keys which is a consequence of your preferred resolution, but that's
only likely to be short-term, if it exists at all.

You likely want to document that in the man page or at least release
notes so that any 'hash' users aren't mystified.  Also, you may want
to double check other possible ways cmdnamtab can be poisoned.  greping
for addnode.cmdname, I think the only other spot is Modules/parameter.c
but parameter names cannot have dots and slashes anyway.

I haven't analyzed it -- the code is so twisty and FS state dependence
as well as all the setopt stuff is a lot of cases -- but it's at least
conceivable that user-inserted *relative* paths in cmdnamtab can break
something (maybe a feature other than AUTOCD or some future feature).
There may be no good reason to have explicitly relative paths starting
with "./" or "../" in there either.  You may also want to expand your
preferred solution to cover those.  PATH-relative entries can make
sense.  Eg., mh/inc for those old school enough to recall /usr/bin/mh
installations.

So, I'd stop short of a "block *any* '/' in cmdnamtab keys" rule.
Testing just the start of the string is faster anyway.  Presently,
neither the explicitly relative (which don't make sense to cache) nor
PATH-relative entries (which do make sense to cache) get in cmdnamtab
automatically unless I'm missing something.  They can be put manually
via 'hash', though, and PATH-relative profitably so (a dubiously small
profit, but still).  If they do cause trouble it's "likely" users know
they did it by hacking search with 'hash'..Probably very rare if it's
even a problem at all.  All famous last words, though.  Some under-
the-covers package might do it for them.

Indeed, running whence on absolute paths has to be rare, too.  I only
noticed it recently using zsh-syntax-highlighting which broke AUTOCD on
just a couple root dirents..mysteriously at first.  I didn't realize it
was z-sy-h for a while.  Then it took a short while to isolate it to
just entries in my $PATH.  Then I just disabled z-sy-h for *just* long
enough to forget why I disabled it.  Re-trying z-sy-h a few days ago
reminded me.  I figured I should report this time since last time I
forgot.  Matthew Martin quickly reduced the misbehavior to whence with
Daniel doing an easy subshell fix to avoid poisoning longer-term state.

Curious just how long this bug lurked, I tested old versions of Zsh.
I had to go all the way back to zsh2.1 from circa 1993 to find a
version *without* this bug.  zsh2.0 also worked.  I think 2.0 may have
been the first release with AUTOCD.  At least zsh-1.0 didn't have it.
So, well, it worked for a little while, anyway. ;-)

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-03-18 21:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20190316200339epcas3p2370800b4409cedfc025ef1c972334aa7@epcas3p2.samsung.com>
2019-03-16 20:01 ` Command hashing/autocd bug & possible fixes Charles Blake
2019-03-18 10:01   ` Peter Stephenson
2019-03-18 21:20     ` Charles Blake

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).