zsh-workers
* PATCH: pathconf() again
@ 2000-08-04  7:02         ` Bart Schaefer
  2000-08-04 13:19           ` Clint Adams
  2000-08-05  6:45           ` PATCH: pathconf() again Wayne Davison
  0 siblings, 2 replies; 18+ messages in thread
From: Bart Schaefer @ 2000-08-04  7:02 UTC (permalink / raw)
  To: zsh-workers

There was a problem with the string passed to pathconf() from the files
module:  It might contain metacharacters, and thus be an invalid path.

Consequently, I rewrote the entire HAVE_PATHCONF patch from scratch, with
the exception of the innocuous configure.in bit.

This patch introduces a new function in compat.c, zpathmax(dir), which
returns -1 if the dir is too long (or otherwise invalid), or returns the
max path length otherwise.  It returns 0 (and sets errno = 0) if path
lengths are unlimited (see the comment), but this isn't used yet.  I'm
not sure the (strlen(dir) < pathmax) test can ever fail, but in case
the limit is for some bizarre reason set to zero ...

There's a macro version of zpathmax() in system.h for not-HAVE_PATHCONF.

I'm not sure how pathconf() treats a path that's exactly PATH_MAX long,
but I decided that since the argument is a directory name it's pretty
silly to create one to which no '/' can be appended; hence zpathmax()
gives ENAMETOOLONG in that case.

I then tore out all the #ifdef HAVE_PATHCONF and put in zpathmax() calls.
I'm tempted to add something to Etc/zsh-development-guide about isolating
#ifdefs in this way whenever possible.

I restrained myself (and was badly bruised in the process) from inserting
spaces before the left-parens in all the if() for() while() statements in
the files module.  I'm tempted to add something to zsh-development-guide
about this, too, but at the moment it'd sound much too acerbic.

Random questions:  Can someone explain how one is supposed to determine a
useful buffer size for e.g. readlink() if pathconf() returns `unlimited'?
For that matter, how does one even know what directory name to pass into
pathconf() in that case?

Index: Src/compat.c
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-3.1/Src/compat.c,v
retrieving revision 1.2
diff -u -r1.2 compat.c
--- compat.c	1999/12/02 17:16:06	1.2
+++ compat.c	2000/08/04 06:53:48
@@ -105,6 +105,43 @@
 #endif
 
 
+#ifdef HAVE_PATHCONF
+
+/* The documentation for pathconf() says something like:             *
+ *     The limit is returned, if one exists.  If the system  does    *
+ *     not  have  a  limit  for  the  requested  resource,  -1 is    *
+ *     returned, and errno is unchanged.  If there is  an  error,    *
+ *     -1  is returned, and errno is set to reflect the nature of    *
+ *     the error.                                                    *
+ *                                                                   *
+ * This is less useful than may be, as one must reset errno to 0 (or *
+ * some other flag value) in order to determine that the resource is *
+ * unlimited.  What use is leaving errno unchanged?  Instead, define *
+ * a wrapper that resets errno to 0 and returns 0 for "the system    *
+ * does not have a limit."                                           *
+ *                                                                   *
+ * This is replaced by a macro from system.h if not HAVE_PATHCONF.   */
+
+/**/
+mod_export long
+zpathmax(char *dir)
+{
+    long pathmax;
+    errno = 0;
+    if ((pathmax = pathconf(dir, _PC_PATH_MAX)) >= 0) {
+	if (strlen(dir) < pathmax)
+	    return pathmax;
+	else
+	    errno = ENAMETOOLONG;
+    }
+    if (errno)
+	return -1;
+    else
+	return 0; /* pathmax should be considered unlimited */
+}
+#endif
+
+
 /**/
 mod_export char *
 zgetdir(struct dirsav *d)
Index: Src/system.h
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-3.1/Src/system.h,v
retrieving revision 1.7
diff -u -r1.7 system.h
--- system.h	2000/05/18 17:19:38	1.7
+++ system.h	2000/08/04 06:49:07
@@ -194,8 +194,8 @@
 # define VARARR(X,Y,Z)	X *(Y) = (X *) alloca(sizeof(X) * (Z))
 #endif
 
-/* we should be getting this value from pathconf(_PC_PATH_MAX) */
-/* but this is too much trouble                                */
+/* we should handle unlimited sizes from pathconf(_PC_PATH_MAX) */
+/* but this is too much trouble                                 */
 #ifndef PATH_MAX
 # ifdef MAXPATHLEN
 #  define PATH_MAX MAXPATHLEN
@@ -203,6 +203,11 @@
    /* so we will just pick something */
 #  define PATH_MAX 1024
 # endif
+#endif
+#ifndef HAVE_PATHCONF
+# define zpathmax(X) ((long)((strlen(X) >= PATH_MAX) ? \
+			     ((errno = ENAMETOOLONG), -1) : \
+			     ((errno = 0), PATH_MAX))
 #endif
 
 /* we should be getting this value from sysconf(_SC_OPEN_MAX) */
Index: Src/Modules/files.c
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-3.1/Src/Modules/files.c,v
retrieving revision 1.9
diff -u -r1.9 files.c
--- files.c	2000/08/03 04:51:42	1.9
+++ files.c	2000/08/04 06:12:00
@@ -71,9 +71,6 @@
     mode_t oumask = umask(0);
     mode_t mode = 0777 & ~oumask;
     int err = 0;
-#ifdef HAVE_PATHCONF
-    int pathmax = 0;
-#endif
 
     umask(oumask);
     if(ops['m']) {
@@ -94,21 +91,11 @@
 
 	while(ptr > *args + (**args == '/') && *--ptr == '/')
 	    *ptr = 0;
-#ifdef HAVE_PATHCONF
-	errno = 0;
-	if(((pathmax = pathconf(*args,_PC_PATH_MAX)) == -1) && errno) {
-	  zwarnnam(nam, "%s: %e", *args, errno);
-	  err = 1;
-	  continue;
-	}
-	else if((ztrlen(*args) > pathmax - 1) && errno != -1) {
-#else
-	  if(ztrlen(*args) > PATH_MAX - 1) {
-#endif
-	    zwarnnam(nam, "%s: %e", *args, ENAMETOOLONG);
+	if(zpathmax(unmeta(*args)) < 0) {
+	    zwarnnam(nam, "%s: %e", *args, errno);
 	    err = 1;
 	    continue;
-	  }
+	}
 	if(ops['p']) {
 	    char *ptr = *args;
 
Index: Src/Modules/parameter.c
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-3.1/Src/Modules/parameter.c,v
retrieving revision 1.46
diff -u -r1.46 parameter.c
--- parameter.c	2000/08/03 14:49:44	1.46
+++ parameter.c	2000/08/04 06:03:22
@@ -1397,20 +1397,9 @@
 static void
 setpmnameddir(Param pm, char *value)
 {
-#ifdef HAVE_PATHCONF
-    int pathmax = 0;
-
     errno = 0;
-    pathmax = pathconf(value, _PC_PATH_MAX);
-    if ((pathmax == -1) && errno) {
-      zwarn("%s: %e", value, errno);
-    }
-    else if (!value || *value != '/' || ((strlen(value) >= pathmax) &&
-            pathmax != -1))
-#else
-    if (!value || *value != '/' || strlen(value) >= PATH_MAX)
-#endif
-	zwarn("invalid value: %s", value, 0);
+    if (!value || *value != '/' || zpathmax(value) < 0)
+	zwarn((errno ? "%s: %e" : "invalid value: %s"), value, errno);
     else
 	adduserdir(pm->nam, value, 0, 1);
     zsfree(value);
@@ -1432,9 +1421,6 @@
 {
     int i;
     HashNode hn, next, hd;
-#ifdef HAVE_PATHCONF
-    int pathmax = 0;
-#endif
 
     if (!ht)
 	return;
@@ -1457,19 +1443,9 @@
 	    v.arr = NULL;
 	    v.pm = (Param) hn;
 
-#ifdef HAVE_PATHCONF
 	    errno = 0;
-            if((((pathmax = pathconf(val, _PC_PATH_MAX)) == -1)) && errno)
-                zwarn("%s: %e", val, errno);
-            else
-#endif
-	    if (!(val = getstrvalue(&v)) || *val != '/' ||
-#ifdef HAVE_PATHCONF
-                ((strlen(val) >= pathmax)) && pathmax != -1)
-#else
-		strlen(val) >= PATH_MAX)
-#endif
-		zwarn("invalid value: %s", val, 0);
+	    if (!(val = getstrvalue(&v)) || *val != '/' || zpathmax(val) < 0)
+		zwarn((errno ? "%s: %e" : "invalid value: %s"), val, errno);
 	    else
 		adduserdir(hn->nam, val, 0, 1);
 	}

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



* Re: PATCH: pathconf() again
  2000-08-04  7:02         ` PATCH: pathconf() again Bart Schaefer
@ 2000-08-04 13:19           ` Clint Adams
  2000-08-04 18:15             ` Bart Schaefer
  2000-08-05  6:45           ` PATCH: pathconf() again Wayne Davison
  1 sibling, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-04 13:19 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> Random questions:  Can someone explain how one is supposed to determine a
> useful buffer size for e.g. readlink() if pathconf() returns `unlimited'?

What I was told is that one should malloc an arbitrary amount, say 512
bytes, then realloc to double the buffer size if it's too small.  Rinse
and repeat.
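
The allocate-then-double approach can be sketched with getcwd(), which
reports a too-small buffer through ERANGE.  This is only a minimal
sketch: grow_getcwd() is a hypothetical name, not an existing zsh
function, and the 512-byte starting size is the arbitrary figure
mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not part of zsh): return the current directory
 * in a malloc()ed buffer, or NULL with errno set. */
char *
grow_getcwd(void)
{
    size_t size = 512;          /* arbitrary starting guess */
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, size);

        if (!bigger) {
            free(buf);
            errno = ENOMEM;
            return NULL;
        }
        buf = bigger;
        if (getcwd(buf, size))
            return buf;         /* the name fit */
        if (errno != ERANGE) {  /* a real error, not "buffer too small" */
            int saved = errno;

            free(buf);
            errno = saved;
            return NULL;
        }
        size *= 2;              /* rinse and repeat */
    }
}
```

The caller owns the returned buffer and must free() it.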

> For that matter, how does one even know what directory name to pass into
> pathconf() in that case?

I assume you want the PATH_MAX of the filesystem where the link lives,
and not the psychically-determined filesystem to which it's pointing.



* PATCH: tail-dropping in files module mkdir
@ 2000-08-04 14:53 Clint Adams
  2000-08-04 15:17 ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-04 14:53 UTC (permalink / raw)
  To: zsh-workers

This should let mkdir work a little better.

In addition to the -p problem, I think that zpathmax needs to
be modified to do one of the following:

a) return the number from pathconf() so that it can be compared
with strlen of the full pathname with tail

b) take aforementioned strlen as an argument

BTW, I think pathconf does the "errno unchanged" bit because of
some prohibition on the library setting errno to 0 or setting
errno on success.

Index: Src/Modules/files.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Modules/files.c,v
retrieving revision 1.4
diff -u -r1.4 files.c
--- Src/Modules/files.c	2000/08/04 07:09:46	1.4
+++ Src/Modules/files.c	2000/08/04 14:43:36
@@ -71,6 +71,7 @@
     mode_t oumask = umask(0);
     mode_t mode = 0777 & ~oumask;
     int err = 0;
+    char *head;
 
     umask(oumask);
     if(ops['m']) {
@@ -91,8 +92,19 @@
 
 	while(ptr > *args + (**args == '/') && *--ptr == '/')
 	    *ptr = 0;
-	if(zpathmax(unmeta(*args)) < 0) {
-	    zwarnnam(nam, "%s: %e", *args, errno);
+
+/* Drop the tail so that pathconf receives a potentially valid pathname */
+	head = (char *) ztrdup(*args);
+	if ((ptr = strrchr(head, '/')))
+	    *ptr = 0;
+	else {
+/* Relative to current directory */
+	    *head = '.';
+	    *(head + 1) = '\0';
+	}
+
+	if(zpathmax(unmeta(head)) < 0) {
+	    zwarnnam(nam, "%s: %e", head, errno);
 	    err = 1;
 	    continue;
 	}
@@ -121,6 +133,8 @@
 	    }
 	} else
 	    err |= domkdir(nam, *args, mode, 0);
+
+	free(head);
     }
     return err;
 }



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-04 14:53 PATCH: tail-dropping in files module mkdir Clint Adams
@ 2000-08-04 15:17 ` Bart Schaefer
  2000-08-04 15:32   ` Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-04 15:17 UTC (permalink / raw)
  To: zsh-workers

On Aug 4, 10:53am, Clint Adams wrote:
} Subject: PATCH: tail-dropping in files module mkdir
}
} This should let mkdir work a little better.

But it's not necessary to do this in mkdir if we're going to change
zpathmax() to do it internally.

} In addition to the -p problem, I think that zpathmax needs to
} be modified to do one of the following:
} 
} a) return the number from pathconf() so that it can be compared
} with strlen of the full pathname with tail

Eh?  It does return the number from pathconf(), unless it's already been
exceeded:

    if ((pathmax = pathconf(dir, _PC_PATH_MAX)) >= 0) {
	if (strlen(dir) < pathmax)
	    return pathmax;
    ...

} BTW, I think pathconf does the "errno unchanged" bit because of
} some prohibition on the library setting errno to 0 or setting
} errno on success.

Ah, that makes some kind of sense.




* Re: PATCH: tail-dropping in files module mkdir
  2000-08-04 15:17 ` Bart Schaefer
@ 2000-08-04 15:32   ` Clint Adams
  2000-08-04 16:10     ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-04 15:32 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> But it's not necessary to do this in mkdir if we're going to change
> zpathmax() to do it internally.

How can we do that?  I mean, how will zpathmax know what parts to
lop off, if any?

> Eh?  It does return the number from pathconf(), unless it's already been
> exceeded:

Yes.  I was clearly suffering from some sort of disacuity.



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-04 15:32   ` Clint Adams
@ 2000-08-04 16:10     ` Bart Schaefer
  2000-08-05  0:40       ` Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-04 16:10 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 4, 11:32am, Clint Adams wrote:
} Subject: Re: PATCH: tail-dropping in files module mkdir
}
} > But it's not necessary to do this in mkdir if we're going to change
} > zpathmax() to do it internally.
} 
} How can we do that?  I mean, how will zpathmax know what parts to
} lop off, if any?

It can just keep lopping the tail as long as (errno == ENOENT ||
errno == ENOTDIR), I think.  There are some special cases involving paths
that contain "../" that I'm a bit worried about, but I think most of
those (and paths with lots of consecutive slashes) would fail zsh's
constant-PATH_MAX tests already in boundary cases, so probably nothing
will become broken that wasn't already.
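
A rough sketch of that tail-lopping loop, assuming it stops at the first
prefix for which pathconf() either succeeds or fails with something
other than ENOENT or ENOTDIR.  The name pathmax_of_prefix() is
hypothetical, and this is not the change that was eventually made to
zpathmax().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: query _PC_PATH_MAX for the longest prefix of
 * path that pathconf() accepts.  Returns the limit, 0 if there is no
 * limit, or -1 with errno set. */
long
pathmax_of_prefix(const char *path)
{
    char *copy = malloc(strlen(path) + 2);  /* always room for "." */
    long limit;
    int saved;

    if (!copy) {
        errno = ENOMEM;
        return -1;
    }
    strcpy(copy, path);
    for (;;) {
        char *slash;

        errno = 0;
        limit = pathconf(copy, _PC_PATH_MAX);
        if (limit >= 0 || (errno != ENOENT && errno != ENOTDIR))
            break;
        slash = strrchr(copy, '/');
        if (!slash) {
            if (strcmp(copy, ".") == 0)
                break;              /* nothing left to lop */
            strcpy(copy, ".");      /* relative to current directory */
        } else if (slash == copy) {
            if (copy[1] == '\0')
                break;              /* already at "/" */
            copy[1] = '\0';         /* fall back to the root */
        } else {
            *slash = '\0';          /* drop the last component */
        }
    }
    saved = errno;
    free(copy);
    errno = saved;
    if (limit >= 0)
        return limit;
    return errno ? -1 : 0;
}
```

The "../" and consecutive-slash cases mentioned above are not treated
specially here; the loop only shortens the string.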




* Re: PATCH: pathconf() again
  2000-08-04 13:19           ` Clint Adams
@ 2000-08-04 18:15             ` Bart Schaefer
  2000-08-05  0:52               ` Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-04 18:15 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 4,  9:19am, Clint Adams wrote:
> Subject: Re: PATCH: pathconf() again
> > Random questions:  Can someone explain how one is supposed to determine a
> > useful buffer size for e.g. readlink() if pathconf() returns `unlimited'?
> 
> What I was told is that one should malloc an arbitrary amount, say 512
> bytes, then realloc to double the buffer size if it's too small.  Rinse
> and repeat.

But ... readlink() doesn't have any provision to "read more of the link"
and doesn't tell you if it truncated what it did read.

In that specific instance, I suppose you could allocate space based on
the st_size returned by lstat(), but the general solution feels sloppy
at best.
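
For the readlink() case specifically, sizing the buffer from lstat()
might look like the sketch below.  readlink_sized() is a hypothetical
name, and the use of EAGAIN to report "the link changed under us" is an
arbitrary choice made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a symlink into a malloc()ed, NUL-terminated
 * string sized from lstat().  Returns NULL with errno set on failure. */
char *
readlink_sized(const char *path)
{
    struct stat st;
    char *buf;
    ssize_t len;

    if (lstat(path, &st) < 0)
        return NULL;
    if (!S_ISLNK(st.st_mode)) {
        errno = EINVAL;
        return NULL;
    }
    /* One spare byte to notice a link that grew, one for the NUL. */
    buf = malloc((size_t)st.st_size + 2);
    if (!buf) {
        errno = ENOMEM;
        return NULL;
    }
    len = readlink(path, buf, (size_t)st.st_size + 1);
    if (len < 0 || len > st.st_size) {
        /* error, or the link changed between lstat() and readlink() */
        int saved = len < 0 ? errno : EAGAIN;

        free(buf);
        errno = saved;
        return NULL;
    }
    buf[len] = '\0';            /* readlink() does not terminate */
    return buf;
}
```

Some filesystems report an st_size of 0 for symlinks, in which case
this sketch fails with EAGAIN and the caller would need another
strategy.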

> > For that matter, how does one even know what directory name to pass into
> > pathconf() in that case?
> 
> I assume you want the PATH_MAX of the filesystem where the link lives,
> and not the psychically-determined filesystem to which it's pointing.

Only if it's a relative rather than absolute link.



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-04 16:10     ` Bart Schaefer
@ 2000-08-05  0:40       ` Clint Adams
  2000-08-04  7:02         ` PATCH: pathconf() again Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-05  0:40 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> It can just keep lopping the tail as long as (errno == ENOENT ||
> errno == ENOTDIR), I think.  There are some special cases involving paths

I was assuming that there would be non-mkdir cases where this
would result in the wrong filesystem being selected, but I can't
think of such a situation.

> that contain "../" that I'm a bit worried about, but I think most of
> those (and paths with lots of consecutive slashes) would fail zsh's
> constant-PATH_MAX tests already in boundary cases, so probably nothing
> will become broken that wasn't already.

I suggest a compat.c wrapper around realpath().  Is subsuming
LGPL- or BSD-licensed code objectionable?



* Re: PATCH: pathconf() again
  2000-08-04 18:15             ` Bart Schaefer
@ 2000-08-05  0:52               ` Clint Adams
  2000-08-05  4:48                 ` PATCH: tail-dropping in files module mkdir Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-05  0:52 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> But ... readlink() doesn't have any provision to "read more of the link"
> and doesn't tell you if it truncated what it did read.

I didn't realize that readlink() was so unfriendly.  You could
memset() the buffer with 0 and then test to see if the final octet
of the buffer was still zero.  If it is, you have a complete,
zero-terminated string.  If not, realloc the buffer and try again.

I don't know if that's more or less ugly.
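
A variant of the retry idea that needs no memset(): readlink() returns
the number of bytes it stored, so a result that fills the whole buffer
may have been cut off, while anything shorter is complete.  This is a
sketch only; readlink_grow() is a hypothetical name and the starting
size of 64 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a symlink into a malloc()ed, NUL-terminated
 * string, doubling the buffer until the result fits with room to spare. */
char *
readlink_grow(const char *path)
{
    size_t size = 64;           /* arbitrary starting guess */
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, size);
        ssize_t len;

        if (!bigger) {
            free(buf);
            errno = ENOMEM;
            return NULL;
        }
        buf = bigger;
        len = readlink(path, buf, size);
        if (len < 0) {
            int saved = errno;

            free(buf);
            errno = saved;
            return NULL;
        }
        if ((size_t)len < size) {   /* spare room, so nothing was cut off */
            buf[len] = '\0';
            return buf;
        }
        size *= 2;                  /* possibly truncated: try again */
    }
}
```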

> Only if it's a relative rather than absolute link.

I don't understand the significance.



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-05  0:52               ` Clint Adams
@ 2000-08-05  4:48                 ` Bart Schaefer
  2000-08-07 18:04                   ` Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-05  4:48 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 4,  8:40pm, Clint Adams wrote:
} Subject: Re: PATCH: tail-dropping in files module mkdir
}
} > There are some special cases involving paths
} > that contain "../" that I'm a bit worried about, but I think most of
} > those (and paths with lots of consecutive slashes) would fail zsh's
} > constant-PATH_MAX tests already in boundary cases, so probably nothing
} > will become broken that wasn't already.
} 
} I suggest a compat.c wrapper around realpath().

You misunderstand the problem.  It isn't that we need the real path in
order to determine the value of pathmax, it's this sort of silliness:

/usr/local/../bin/../doc/../etc/../include/../lib/../local/blahblahblah

If the length of the "blahblahblah" part approaches pathmax, you get an
ENAMETOOLONG error even though you could chdir to each directory from
left to right and eventually reach a legitimate file.  Computing the
realpath() in such a case won't change anything.

There's one other problem with that sort of path; if you do

mkdir -p /usr/mountpoint/newdir1/../../newdir2/blahblahblah

then the pathmax is that of /usr/mountpoint, but newdir2/blahblahblah is
under /usr.  realpath() is going to fail with ENOTDIR in that case, so
again it doesn't help; and if newdir1 already exists, then pathconf()
itself will discover that /usr/mountpoint/newdir1/../.. refers to /usr.
Or at least I hope it will, or this is almost a waste of time.

On Aug 4,  8:52pm, Clint Adams wrote:
[About determining a buffer size for readlink()]
} 
} > Only if it's a relative rather than absolute link.
} 
} I don't understand the significance.

It would make sense that an absolute link from the root could be as long
as the longest path on the filesystem to which the link refers, no?

But a relative link has to be concatenated with the path to the directory
containing it in order to resolve the link, so it couldn't be longer than
the longest path on the filesystem of the containing directory.

However, I don't actually know how this works in the kernel and/or the FS
drivers.




* Re: PATCH: pathconf() again
  2000-08-04  7:02         ` PATCH: pathconf() again Bart Schaefer
  2000-08-04 13:19           ` Clint Adams
@ 2000-08-05  6:45           ` Wayne Davison
  1 sibling, 0 replies; 18+ messages in thread
From: Wayne Davison @ 2000-08-05  6:45 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On Fri, 4 Aug 2000, Bart Schaefer wrote:
> There's a macro version of zpathmax() in system.h for not-HAVE_PATHCONF.

...which is missing a closing paren.  I assume that it should have the
paren added at the end of the macro, but I'll let you double-check that
and check in a fix.  I made that change locally, and was able to compile
a working zsh.

..wayne..



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-05  4:48                 ` PATCH: tail-dropping in files module mkdir Bart Schaefer
@ 2000-08-07 18:04                   ` Clint Adams
  2000-08-07 20:39                     ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-07 18:04 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> You misunderstand the problem.  It isn't that we need the real path in
> order to determine the value of pathmax, it's this sort of silliness:
> 
> /usr/local/../bin/../doc/../etc/../include/../lib/../local/blahblahblah
> 
> If the length of the "blahblahblah" part approaches pathmax, you get an
> ENAMETOOLONG error even though you could chdir to each directory from
> left to right and eventually reach a legitimate file.  Computing the
> realpath() in such a case won't change anything.

Again I am confused.  Does _PC_PATH_MAX have any significance for
absolute paths?  Can someone check POSIX?

I'm now of the opinion that the zpathmax check should be removed
from bin_mkdir, except for mkdir -p, in which case not only zpathmax
should be checked, but also _PC_NAME_MAX for each path element, including
the tail.
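
The per-element check could be sketched as below.  As a simplification
it takes one name_max for the whole path, whereas a real check would
presumably ask pathconf(parent, _PC_NAME_MAX) for each parent
directory; first_long_component() is a hypothetical name.

```c
#include <string.h>

/* Hypothetical helper: return a pointer to the first component of path
 * that is longer than name_max bytes, or NULL if every component fits.
 * A negative name_max is treated as "no limit". */
const char *
first_long_component(const char *path, long name_max)
{
    const char *p = path;

    while (*p) {
        size_t len;

        while (*p == '/')       /* skip one or more separators */
            p++;
        len = strcspn(p, "/");  /* length of this component */
        if (name_max >= 0 && len > (size_t)name_max)
            return p;
        p += len;
    }
    return NULL;
}
```

With a limit of 5, "/usr/local/bin" passes, while for
"/usr/toolongname/x" the function points at "toolongname/x".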

> It would make sense that an absolute link from the root could be as long
> as the longest path on the filesystem to which the link refers, no?

Well, let's say you have a BIGFS mounted on /usr, where the
largest filename is 65,535 characters long.  / is a SMALLFS, which
has a maximum filename size of 255 characters, and the maximum
size of the destination of a symlink is 1023 characters.  Therefore,
the absolute link from the root could not be as long as the longest
path on the referent filesystem.  Obversely, the BIGFS can probably handle
symlinks to the root fs.



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-07 18:04                   ` Clint Adams
@ 2000-08-07 20:39                     ` Bart Schaefer
  2000-08-08 11:40                       ` Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-07 20:39 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 7,  2:04pm, Clint Adams wrote:
> Subject: Re: PATCH: tail-dropping in files module mkdir
> > You misunderstand the problem.  It isn't that we need the real path in
> > order to determine the value of pathmax, it's this sort of silliness:
> > 
> > /usr/local/../bin/../doc/../etc/../include/../lib/../local/blahblahblah
> > 
> > If the length of the "blahblahblah" part approaches pathmax, you get an
> > ENAMETOOLONG error even though you could chdir to each directory from
> > left to right and eventually reach a legitimate file.  Computing the
> > realpath() in such a case won't change anything.
> 
> Again I am confused.  Does _PC_PATH_MAX have any significance for
> absolute paths?  Can someone check POSIX?

It doesn't really matter whether it has any significance for absolute paths.
The primary use of the PATH_MAX constant in zsh is to determine the size of
a buffer to allocate for copying a path name.  Even if we use realpath() or
the equivalent to find the actual directory whose pathmax value we want, the
actual string that is going to be copied into the resulting buffer does not
change.

What this seems to imply is that we should always arbitrarily grow any
buffer that will hold a path name -- not even attempt to determine a
maximum size in advance -- and simply let the system calls fail when they
will.

> I'm now of the opinion that the zpathmax check should be removed
> from bin_mkdir, except for mkdir -p, in which case not only zpathmax
> should be checked, but also _PC_NAME_MAX for each path element, including
> the tail.

That's equivalent to my alternate suggestion of simply letting domkdir()
fail and examining errno to see whether we should break or continue as a
result.  The zpathmax() test is performed only so that the command does not
give up on the first too-long argument when multiple arguments are present.
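
A minimal sketch of the "let it fail and look at errno" alternative,
using plain mkdir() in place of zsh's domkdir().  mkdir_each() is a
hypothetical stand-in for the argument loop in bin_mkdir(), not the
actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the loop in bin_mkdir(): try every
 * argument, report each failure, and return 1 if any of them failed. */
int
mkdir_each(char **dirs, mode_t mode)
{
    int err = 0;

    for (; *dirs; dirs++) {
        if (mkdir(*dirs, mode) < 0) {
            /* ENAMETOOLONG, ENOENT and the rest all land here; report
             * the failure and keep going with the next argument. */
            fprintf(stderr, "mkdir: %s: %s\n", *dirs, strerror(errno));
            err = 1;
        }
    }
    return err;
}
```

No length is predicted in advance; a too-long argument is simply
reported like any other failure and the remaining arguments are still
processed.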

Incidentally, I found an HP-UX pathconf() manpage on the web, which says:

           4.   If path or fildes does not refer to a directory, pathconf()
                or fpathconf() returns -1 and sets errno to EINVAL.

So we may need to test EINVAL as well as ENOENT and ENOTDIR in zpathmax().
It goes on:

           5.   If path or fildes refers to a directory, the value returned
                is the maximum length of a relative path name when the
                specified directory is the working directory.

Does this imply that (zpathmax("/") - 4 == zpathmax("/tmp")) is possible?
If so, we're wrong ever to compare strlen(dir) to zpathmax(dir).
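
One way to probe that question on a given system is to query both
directories and compare.  The sketch below separates the three outcomes
of pathconf() (a limit, no limit, an error) using the errno idiom from
the original patch; query_path_max() and the enum are hypothetical
names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

enum limit_kind { LIMIT_ERROR = -1, LIMIT_NONE = 0, LIMIT_VALUE = 1 };

/* Hypothetical helper: classify the result of
 * pathconf(dir, _PC_PATH_MAX).  *limit is written only when the
 * return value is LIMIT_VALUE. */
enum limit_kind
query_path_max(const char *dir, long *limit)
{
    long value;

    errno = 0;                  /* so "unchanged" can be detected */
    value = pathconf(dir, _PC_PATH_MAX);
    if (value >= 0) {
        *limit = value;
        return LIMIT_VALUE;
    }
    return errno ? LIMIT_ERROR : LIMIT_NONE;
}
```

Calling this for "/" and for "/tmp" and printing the two limits would
show whether a particular implementation returns a constant or a
directory-dependent value; the sketch makes no assumption about which.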

> > It would make sense that an absolute link from the root could be as long
> > as the longest path on the filesystem to which the link refers, no?
> 
> Well, let's say you have a BIGFS mounted on on /usr, where the
> largest filename is 65,535 characters long.  / is a SMALLFS, which
> has a maximum filename size of 255 characters, and the maximum
> size of the destination of a symlink is 1023 characters.

You're assuming an implementation that restricts the size of a symlink to
the size of a local filesystem path.  But there doesn't need to be such
a relationship -- the implementation could examine the link component by
component until it finds the target filesystem, and then hand the rest of
the interpretation off to the target driver.  In that case there'd only
be a problem if you didn't cross into a new filesystem within the first
1023 bytes.  Traditional unix filesystems don't handle that, but so what?



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-07 20:39                     ` Bart Schaefer
@ 2000-08-08 11:40                       ` Clint Adams
  2000-08-08 21:47                         ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Clint Adams @ 2000-08-08 11:40 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> It doesn't really matter whether it has any significance for absolute paths.

It does when we're trying to predict whether or not the entirety of mkdir -p
will fail.

Of course, it seems as though other mkdir implementations will mkdir()
all elements up to the failure point and happily leave them there, so
we'd be compatible if we removed the check.

> The primary use of the PATH_MAX constant in zsh is to determine the size of
> a buffer to allocate for copying a path name.  Even if we use realpath() or
> the equivalent to find the actual directory whose pathmax vaue we want, the
> actual string that is going to be copied into the resulting buffer does not
> change.

Agreed.  This is why I only touched the two places where PATH_MAX wasn't
being used for buffers.
 
> What this seems to imply is that we should always arbitrarily grow any
> buffer that will hold a path name -- not even attempt to determine a
> maximum size in advance -- and simply let the system calls fail when they
> will.

I wonder if performance will suffer significantly from this.

>            5.   If path or fildes refers to a directory, the value returned
>                 is the maximum length of a relative path name when the
>                 specified directory is the working directory.
> 
> Does this imply that (zpathmax("/") - 4 == zpathmax("/tmp")) is possible?

I believe it does.

> If so, we're wrong ever to compare strlen(dir) to zpathmax(dir).

Not if 'dir' is a relative, multi-directory path being passed to
mkdir -p.  That could conceivably fail if strlen(dir) > zpathmax(dir).
It could also conceivably fail if any of the directories to be made
have strlen(dirpart) > pathconf(dirpart,_PC_NAME_MAX), or if the
library is answering incorrectly on behalf of the kernel or other
limiting entities.

> You're assuming an implementation that restricts the size of a symlink to

I was merely throwing out possibilities.

Anyway, I assume that we can throw out the pathmax checks in the files module.
I'd think we can do the same with the parameter module, since _PC_PATH_MAX
is essentially useless there too.

That just leaves buffers to be dynamically grown, AFAICT.



* Re: PATCH: tail-dropping in files module mkdir
  2000-08-08 11:40                       ` Clint Adams
@ 2000-08-08 21:47                         ` Bart Schaefer
  2000-08-09 14:25                           ` PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir) Clint Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2000-08-08 21:47 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 8,  7:40am, Clint Adams wrote:
>  
> >            5.   If path or fildes refers to a directory, the value returned
> >                 is the maximum length of a relative path name when the
> >                 specified directory is the working directory.
> > 
> > Does this imply that (zpathmax("/") - 4 == zpathmax("/tmp")) is possible?
> 
> I believe it does.
> 
> > If so, we're wrong ever to compare strlen(dir) to zpathmax(dir).
> 
> Not if 'dir' is a relative, multi-directory path being passed to
> mkdir -p.  That could conceivably fail if strlen(dir) > zpathmax(dir).

What I mean is:

The definition of pathconf(dir, _PC_PATH_MAX) is inherently incompatible
with the definition of the PATH_MAX constant.  The latter is the maximum
length of an absolute path from the root [*], the former approximates the
same constant minus the length of [the equivalent of] realpath(dir).

Suppose pathconf() returns a constant value == _POSIX_PATH_MAX no matter
what its directory argument is.  Comparing strlen(realpath(dir)) to that
constant tells you if the filesystem will ultimately reject the name, but
comparing strlen(dir) tells you nothing.

Now suppose pathconf() returns a differing amount depending on the length
of realpath(dir).  In this case comparing strlen(realpath(dir)) to that
value is entirely wrong, because pathconf() has already accounted for the
real path!  Even comparing strlen(dir) doesn't tell you anything, because
what the value reflects is how much *more* path space you have left after
you already have used whatever it takes to get to realpath(dir), whereas
strlen(dir) gives some fraction of the already-used path space.

If somebody with access to the POSIX spec can refute that last paragraph,
I'll be thrilled.

In the meantime, we have to detect the constant-valued pathconf() in order
to determine whether we should do a comparison, because comparing to a
context-dependent pathconf() value is wrong.

> > What this seems to imply is that we should always arbitrarily grow any
> > buffer that will hold a path name -- not even attempt to determine a
> > maximum size in advance -- and simply let the system calls fail when they
> > will.
> 
> I wonder if performance will suffer significantly from this.

Hard to say.  It's going to take a LOT of rewriting; it might be better
not even to pretend to support pathconf() until that rewriting is done.

[*] Or at least zsh has always treated it as if that's the definition.
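
For illustration, the grow-the-buffer approach discussed above might
look roughly like this for readlink().  This is a minimal sketch under
assumptions: readlink_grow() is a hypothetical name, the starting size
of 64 is arbitrary, and none of this is taken from the zsh source.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a symlink target into a malloc'd,
 * NUL-terminated buffer, doubling the buffer until the target fits.
 * Returns NULL with errno set on failure; the caller frees the result.
 */
char *
readlink_grow(const char *path)
{
    size_t size = 64;       /* arbitrary starting guess */
    char *buf = NULL;

    assert(path != NULL);

    for (;;) {
        char *tmp = realloc(buf, size);
        ssize_t n;

        if (tmp == NULL) {
            free(buf);
            errno = ENOMEM;
            return NULL;
        }
        buf = tmp;

        n = readlink(path, buf, size);
        if (n < 0) {
            int saved = errno;  /* let the system call decide what fails */
            free(buf);
            errno = saved;
            return NULL;
        }
        if ((size_t)n < size) {
            buf[n] = '\0';      /* readlink() does not terminate */
            return buf;
        }
        /* n == size: the target may have been truncated, so retry. */
        if (size > SIZE_MAX / 2) {
            free(buf);
            errno = ENAMETOOLONG;
            return NULL;
        }
        size *= 2;
    }
}
```

The cost is one extra system call per doubling, which only happens for
targets longer than the initial guess.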



* PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir)
  2000-08-08 21:47                         ` Bart Schaefer
@ 2000-08-09 14:25                           ` Clint Adams
  2000-08-09 17:07                             ` Bart Schaefer
  2000-08-09 17:51                             ` Bart Schaefer
  0 siblings, 2 replies; 18+ messages in thread
From: Clint Adams @ 2000-08-09 14:25 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

> The definition of pathconf(dir, _PC_PATH_MAX) is inherently incompatible
> with the definition of the PATH_MAX constant.  The latter is the maximum
> length of an absolute path from the root [*], the former approximates the
> same constant minus the length of [the equivalent of] realpath(dir).

Well, I've seen two or three manpages that list _PC_PATH_MAX as being
"relative path from cwd", and nothing that claims that PATH_MAX is anything
more meaningful than "maximum path length".

FreeBSD doesn't make the relative claim; it just says

     _PC_PATH_MAX
             The maximum number of bytes in a pathname.

But it will actually go to the directory in the first argument, determine
its filesystem, and ask the filesystem driver's pathconf() for a value.

As usual, I don't have access to POSIX (and it seems all the old drafts
that were available online have been recalled), but SUSv2 says
(http://www.unix-systems.org/single_unix_specification_v2/xsh/limits.h.html)

PATH_MAX
         Maximum number of bytes in a pathname, including the terminating
         null character. Minimum Acceptable Value: _POSIX_PATH_MAX

On the other hand, the SUSv2 pathconf() page
(http://www.unix-systems.org/single_unix_specification_v2/xsh/fpathconf.html),
which is "derived from POSIX.1-1988", links PATH_MAX with _PC_PATH_MAX, but
it also has the same notes we saw in either HP/UX or Solaris.  Those
would be

4. If path or fildes does not refer to a directory, it is unspecified whether
   an implementation supports an association of the variable name with the
   specified file.

5. If path or fildes refers to a directory, the value returned is the maximum
   length of a relative pathname when the specified directory is the working
   directory.

> Suppose pathconf() returns a constant value == _POSIX_PATH_MAX no matter
> what its directory argument is.  Comparing strlen(realpath(dir)) to that
> constant tells you if the filesystem will ultimately reject the name, but
> comparing strlen(dir) tells you nothing.
> 
> Now suppose pathconf() returns a differing amount depending on the length
> of realpath(dir).  In this case comparing strlen(realpath(dir)) to that
> value is entirely wrong, because pathconf() has already accounted for the
> real path!  Even comparing strlen(dir) doesn't tell you anything, because
> what the value reflects is how much *more* path space you have left after
> you already have used whatever it takes to get to realpath(dir), whereas
> strlen(dir) gives some fraction of the already-used path space.

Back to the pathconf() manpage from SUSv2, we have

|If the variable corresponding to name has no limit for the path or file
|descriptor, both pathconf() and fpathconf() return -1 without changing
|errno. If the implementation needs to use path to determine the value
|of name and the implementation does not support the association of name
|with the file specified by path, or if the process did not have
|appropriate privileges to query the file specified by path, or path does
|not exist, pathconf() returns -1 and errno is set to indicate the error. 

So it appears that the implementation has the option of caring about
the first argument or completely ignoring it.
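
The "-1 without changing errno" convention in that excerpt means a
caller has to clear errno first to tell "no limit" from a real error.
A minimal sketch, with a hypothetical helper name and enum that are not
from the zsh source:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical result codes for the sketch below. */
enum pc_result { PC_VALUE, PC_UNLIMITED, PC_ERROR };

/*
 * Query _PC_PATH_MAX for dir.  On PC_VALUE the limit is stored in *out.
 * PC_UNLIMITED means pathconf() returned -1 and left errno untouched;
 * PC_ERROR means it returned -1 and set errno.
 */
enum pc_result
query_path_max(const char *dir, long *out)
{
    long v;

    assert(dir != NULL && out != NULL);

    errno = 0;              /* must be cleared before the call */
    v = pathconf(dir, _PC_PATH_MAX);
    if (v != -1) {
        *out = v;
        return PC_VALUE;
    }
    return errno == 0 ? PC_UNLIMITED : PC_ERROR;
}
```

Whether a bad path ever yields PC_ERROR depends on the implementation,
since one that ignores the first argument never inspects it.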

> If somebody with access to the POSIX spec can refute that last paragraph,
> I'll be thrilled.

I hope they've made some clarifications since 1988.

> In the meantime, we have to detect the constant-valued pathconf() in order
> to determine whether we should do a comparison, because comparing to a
> context-dependent pathconf() value is wrong.

It could be an autoconf check if you can think of a path guaranteed to
be invalid.

> Hard to say.  It's going to take a LOT of rewriting; it might be better
> not even to pretend to support pathconf() until that rewriting is done.
> 
> [*] Or at least zsh has always treated it as if that's the definition.

It seems to me as though, at least according to what I understood of some
X/Open specs, it's intended as a maximum argument length to library functions.



* Re: PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir)
  2000-08-09 14:25                           ` PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir) Clint Adams
@ 2000-08-09 17:07                             ` Bart Schaefer
  2000-08-09 17:51                             ` Bart Schaefer
  1 sibling, 0 replies; 18+ messages in thread
From: Bart Schaefer @ 2000-08-09 17:07 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

On Aug 9, 10:25am, Clint Adams wrote:
}
} Well, I've seen two or three manpages that list _PC_PATH_MAX as being
} "relative path from cwd", and nothing that claims that PATH_MAX is anything
} more meaningful than "maximum path length".

I've posted something to comp.unix.programmer and comp.std.unix.  I'll
pass along any useful responses.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



* Re: PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir)
  2000-08-09 14:25                           ` PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir) Clint Adams
  2000-08-09 17:07                             ` Bart Schaefer
@ 2000-08-09 17:51                             ` Bart Schaefer
  1 sibling, 0 replies; 18+ messages in thread
From: Bart Schaefer @ 2000-08-09 17:51 UTC (permalink / raw)
  To: Clint Adams; +Cc: zsh-workers

} > The definition of pathconf(dir, _PC_PATH_MAX) is inherently incompatible
} > with the definition of the PATH_MAX constant.

I just found my 1991 copy of ORA's POSIX Programmer's Guide.  It says
specifically:

-----------------------

[...]

_PC_PATH_MAX   The maximum length of a relative pathname when this
               directory is the working directory; that is, the number
               of characters that may be appended to path and still have
               a valid pathname.

[...]

Notes:

    The value returned by _PC_PATH_MAX is not useful for allocating storage.
    Files with paths longer than _PC_PATH_MAX may exist.

-----------------------

So I think pathconf() is of approximately zero use to zsh, at least for
_PC_PATH_MAX, and we should just rip zpathmax() out again.  I apologize
for helping lead us down this rathole.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



end of thread, other threads:[~2000-08-09 17:51 UTC | newest]

Thread overview: 18+ messages
2000-08-04 14:53 PATCH: tail-dropping in files module mkdir Clint Adams
2000-08-04 15:17 ` Bart Schaefer
2000-08-04 15:32   ` Clint Adams
2000-08-04 16:10     ` Bart Schaefer
2000-08-05  0:40       ` Clint Adams
2000-08-04  7:02         ` PATCH: pathconf() again Bart Schaefer
2000-08-04 13:19           ` Clint Adams
2000-08-04 18:15             ` Bart Schaefer
2000-08-05  0:52               ` Clint Adams
2000-08-05  4:48                 ` PATCH: tail-dropping in files module mkdir Bart Schaefer
2000-08-07 18:04                   ` Clint Adams
2000-08-07 20:39                     ` Bart Schaefer
2000-08-08 11:40                       ` Clint Adams
2000-08-08 21:47                         ` Bart Schaefer
2000-08-09 14:25                           ` PATH_MAX vs. _PC_PATH_MAX vs. POSIX (was Re: PATCH: tail-dropping in files module mkdir) Clint Adams
2000-08-09 17:07                             ` Bart Schaefer
2000-08-09 17:51                             ` Bart Schaefer
2000-08-05  6:45           ` PATCH: pathconf() again Wayne Davison
