From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <zsh-workers-request@euclid.skiles.gatech.edu>
Received: (qmail 8950 invoked from network); 25 Dec 1996 02:39:00 -0000
Received: from euclid.skiles.gatech.edu (list@130.207.146.50)
  by coral.primenet.com.au with SMTP; 25 Dec 1996 02:39:00 -0000
Received: (from list@localhost) by euclid.skiles.gatech.edu (8.7.3/8.7.3) id VAA08192; Tue, 24 Dec 1996 21:33:58 -0500 (EST)
Resent-Date: Tue, 24 Dec 1996 21:33:58 -0500 (EST)
From: Zoltan Hidvegi <hzoli@cs.elte.hu>
Message-Id: <199612250228.DAA00980@hzoli.ppp.cs.elte.hu>
Subject: Re: new module
To: zefram@dcs.warwick.ac.uk (Zefram)
Date: Wed, 25 Dec 1996 03:28:12 +0100 (MET)
Cc: zsh-workers@math.gatech.edu
In-Reply-To: <4232.199612232001@stone.dcs.warwick.ac.uk> from Zefram at "Dec 23, 96 08:01:15 pm"
X-Mailer: ELM [version 2.4ME+ PL17 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Resent-Message-ID: <"EDGcW2.0.x_1.LA9mo"@euclid>
Resent-From: zsh-workers@math.gatech.edu
X-Mailing-List: <zsh-workers@math.gatech.edu> archive/latest/2636
X-Loop: zsh-workers@math.gatech.edu
Precedence: list
Resent-Sender: zsh-workers-request@math.gatech.edu

Here are some pieces from Zefram's files.c:

>       + static char buf[PATH_MAX * 2];
>       + static char rbuf[PATH_MAX];
[...]
>       + dorm(char *nam, char *arg, char *ops)
[...]
>       + 		if(arg != buf)
>       + 		    strcpy(buf, arg);

You see I do not like this kind of code.  It works but it makes the code
more difficult to maintain.  buf is a static variable and sometimes arg and
buf is the same.  It is very easy to forget about that and it can cause you
headaches to track down a problem caused by this.  And it is almost
impossible to debug this if you call an other function which also happens
to use buf.  In files.c I do not see what's against using automatic
variable buffers.

There is one more argument against static buffers besides the clarity of
the code.  Zsh code must be as re-entrant as possible as a signal handler
can be called at any time.  E.g. while you busily doing an rm -rf foo a
signal may come and it may call an other rm (that probably violates POSIX
btw.).

I know that static buffers are used in some other places.  The result of
unmeta is in a static buffer if the string contained a metafied character.
But in other parts of the code this static buffer used immediately after
the unmeta call.

>       + 		*pos++ = '/';
>       + 		while((fn = zreaddir(d, 1))) {
>       + 		    if(ztrlen(fn) > space) {
>       + 			pos[-1] = 0;
>       + 			zwarnnam(nam, "%s: %e", buf, ENAMETOOLONG);
>       + 			err = 1;
>       + 			continue;
>       + 		    }
>       + 		    strcpy(pos, fn);

There are two problems with this code.  First it cannot remove a directory
hierarchy if it has too deep subdirectories.  That's quite bad when you use
it to clean up directories.  A user who wants to save his files from
automatic deletion can hide it deeply in a subdirectory.

The other problem if that when such an rm runs from root's cron ro clean up
/tmp it can be exploited to remove anything on the filesystem.  Any
component of a long filename can be replaced with a symlink while rm is
working which can be used to delete any tree on the system.

The right way to do that is to change into the directory we want to remove,
and stat . to make sure that the chdir did no go over a symlink.

Note that it is a more general problem: find ... -exec rm {} \; scripts are
always dangerous and can be exploited using some symlink trickery.
Therefore an option to rm which disables symlinks would be very useful
(which would allow rm -l symlink but not rm -l symlink/file).  If
implemented correctly, rm -lf <some_complicated_zsh_glob_pattern> would be
a safe operation.  Sine rm is a builtin long argument list is no longer a
problem.

E.g. when you type rm foo/bar the right procedure is:

lstat(foo), if a real directory, chpwd(foo), stat(.) and compare with the
previous lstat(foo), if not the same fail.  And now you can delete bar.

And to rm /foo/bar you have to chpwd(/) first.
Of couse a shell-builtin implementation should save pwd first and go back
after the operation is completed.

Btw. I've just noticed that rm -r in files.c uses stat() instead of lstat()
hence it follows symlinks happily :-(.

Zoltan