mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Denys Vlasenko <vda.linux@googlemail.com>
To: musl <musl@lists.openwall.com>, Rich Felker <dalias@libc.org>,
	 Waldemar Brodkorb <wbx@openadk.org>
Subject: Re: getopt() not exposing __optpos - shell needs it
Date: Mon, 28 Aug 2017 17:17:06 +0200	[thread overview]
Message-ID: <CAK1hOcOd0wAOEM7Weqp+zFtYtOFwQpAVidzvTOooSkZ0s7b8mw@mail.gmail.com> (raw)
In-Reply-To: <CAK1hOcNk95zZeEaWCX0irwZkAnixYjTsU=GMk8Z9153ZwE9rng@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 722 bytes --]

On Mon, Aug 28, 2017 at 12:18 PM, Denys Vlasenko
<vda.linux@googlemail.com> wrote:
> glibc/uclibc need more extensive changes, since they use a pointer to store
> the position. This is a test in busybox tree which fails miserably
> because of that:
>
> https://git.busybox.net/busybox/tree/shell/hush_test/hush-getopts/getopt_test_libc_bug.tests
> # This test can fail with libc with buggy getopt() implementation.
> # If getopt() wants to parse multi-option args (-abc),
> # it needs to remember a position within current arg.
> #
> # If this position is kept as a POINTER, not an offset,
> # and if argv[] ADDRESSES (not contents!) change, it blows up.


uclibc would need something like this patch (not tested at all):

[-- Attachment #2: z.diff --]
[-- Type: text/plain, Size: 12275 bytes --]

diff --git a/libc/sysdeps/linux/common/bits/getopt.h b/libc/sysdeps/linux/common/bits/getopt.h
index dababe0..c4f5ab7 100644
--- a/libc/sysdeps/linux/common/bits/getopt.h
+++ b/libc/sysdeps/linux/common/bits/getopt.h
@@ -48,6 +48,7 @@ extern char *optarg;
    how much of ARGV has been scanned so far.  */
 
 extern int optind;
+extern int __optpos; /* position within current "-abc" multiopt string */
 
 /* Callers store zero here to inhibit the error message `getopt' prints
    for unrecognized options.  */
diff --git a/libc/unistd/getopt.c b/libc/unistd/getopt.c
index 9d2df34..3c7e6c5 100644
--- a/libc/unistd/getopt.c
+++ b/libc/unistd/getopt.c
@@ -106,6 +106,8 @@ char *optarg;
 /* 1003.2 says this must be 1 before any call.  */
 int optind = 1;
 
+int __optpos = 0;
+
 /* Callers store zero here to inhibit the error message
    for unrecognized options.  */
 
@@ -261,7 +263,7 @@ _getopt_initialize (attribute_unused int argc, attribute_unused char *const *arg
 
   d->__first_nonopt = d->__last_nonopt = d->optind;
 
-  d->__nextchar = NULL;
+  d->optpos = 0;
 
   d->__posixly_correct = !!getenv ("POSIXLY_CORRECT");
 
@@ -325,7 +327,7 @@ _getopt_initialize (attribute_unused int argc, attribute_unused char *const *arg
    from each of the option elements.
 
    If `getopt' finds another option character, it returns that character,
-   updating `optind' and `nextchar' so that the next call to `getopt' can
+   updating `optind' and `optpos' so that the next call to `getopt' can
    resume the scan with the following option character or ARGV-element.
 
    If there are no more option characters, `getopt' returns -1.
@@ -376,6 +378,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 		    const struct option *longopts, int *longind,
 		    int long_only, struct _getopt_data *d)
 {
+  const char *nextchar;
   int print_errors = d->opterr;
   if (optstring[0] == ':')
     print_errors = 0;
@@ -405,9 +408,17 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 # define NONOPTION_P (argv[d->optind][0] != '-' || argv[d->optind][1] == '\0')
 #endif
 
-  if (d->__nextchar == NULL || *d->__nextchar == '\0')
+  nextchar = "";
+  if (d->optind < argc && d->optpos > 0)
+    /* Could be simply "nextchar = argv[d->optind] + d->optpos"
+     * but let's be paranoid about optpos pointing beyond nul byte.
+     */
+    nextchar = argv[d->optind] + strnlen(argv[d->optind], d->optpos);
+
+  if (*nextchar == '\0')
     {
       /* Advance to the next ARGV-element.  */
+      /* (d->optind already points to the next argv[i]) */
 
       /* Give FIRST_NONOPT & LAST_NONOPT rational values if OPTIND has been
 	 moved back by the user (who may also have changed the arguments).  */
@@ -480,8 +491,8 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
       /* We have found another option-ARGV-element.
 	 Skip the initial punctuation.  */
 
-      d->__nextchar = (argv[d->optind] + 1
-		  + (longopts != NULL && argv[d->optind][1] == '-'));
+      d->optpos = (1 + (longopts != NULL && argv[d->optind][1] == '-'));
+      nextchar = argv[d->optind] + d->optpos;
     }
 
   /* Decode the current option-ARGV-element.  */
@@ -512,15 +523,15 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
       int indfound = -1;
       int option_index;
 
-      for (nameend = d->__nextchar; *nameend && *nameend != '='; nameend++)
+      for (nameend = nextchar; *nameend && *nameend != '='; nameend++)
 	/* Do nothing.  */ ;
 
       /* Test all long options for either exact match
 	 or abbreviated matches.  */
       for (p = longopts, option_index = 0; p->name; p++, option_index++)
-	if (!strncmp (p->name, d->__nextchar, nameend - d->__nextchar))
+	if (!strncmp (p->name, nextchar, nameend - nextchar))
 	  {
-	    if ((unsigned int) (nameend - d->__nextchar)
+	    if ((unsigned int) (nameend - nextchar)
 		== (unsigned int) strlen (p->name))
 	      {
 		/* Exact match found.  */
@@ -570,7 +581,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 		       argv[0], argv[d->optind]);
 #endif
 	    }
-	  d->__nextchar += strlen (d->__nextchar);
+	  d->optpos = 0;
 	  d->optind++;
 	  d->optopt = 0;
 	  return '?';
@@ -643,7 +654,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 #endif
 		    }
 
-		  d->__nextchar += strlen (d->__nextchar);
+		  d->optpos = 0;
 
 		  d->optopt = pfound->val;
 		  return '?';
@@ -683,12 +694,12 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 			       argv[0], argv[d->optind - 1]);
 #endif
 		    }
-		  d->__nextchar += strlen (d->__nextchar);
+		  d->optpos = 0;
 		  d->optopt = pfound->val;
 		  return optstring[0] == ':' ? ':' : '?';
 		}
 	    }
-	  d->__nextchar += strlen (d->__nextchar);
+	  d->optpos = 0;
 	  if (longind != NULL)
 	    *longind = option_index;
 	  if (pfound->flag)
@@ -704,7 +715,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	 option, then it's an error.
 	 Otherwise interpret it as a short option.  */
       if (!long_only || argv[d->optind][1] == '-'
-	  || strchr (optstring, *d->__nextchar) == NULL)
+	  || strchr (optstring, *nextchar) == NULL)
 	{
 	  if (print_errors)
 	    {
@@ -718,10 +729,10 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 		  /* --option */
 #if defined _LIBC && defined USE_IN_LIBIO
 		  n = __asprintf (&buf, "%s: unrecognized option `--%s'\n",
-				  argv[0], d->__nextchar);
+				  argv[0], nextchar);
 #else
 		  fprintf (stderr, "%s: unrecognized option `--%s'\n",
-			   argv[0], d->__nextchar);
+			   argv[0], nextchar);
 #endif
 		}
 	      else
@@ -729,10 +740,10 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 		  /* +option or -option */
 #if defined _LIBC && defined USE_IN_LIBIO
 		  n = __asprintf (&buf, "%s: unrecognized option `%c%s'\n",
-				  argv[0], argv[d->optind][0], d->__nextchar);
+				  argv[0], argv[d->optind][0], nextchar);
 #else
 		  fprintf (stderr, "%s: unrecognized option `%c%s'\n",
-			   argv[0], argv[d->optind][0], d->__nextchar);
+			   argv[0], argv[d->optind][0], nextchar);
 #endif
 		}
 
@@ -753,7 +764,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 		}
 #endif
 	    }
-	  d->__nextchar = (char *) "";
+	  d->optpos = 0;
 	  d->optind++;
 	  d->optopt = 0;
 	  return '?';
@@ -763,11 +774,12 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
   /* Look at and handle the next short option-character.  */
 
   {
-    char c = *d->__nextchar++;
+    char c = *nextchar++;
     char *temp = strchr (optstring, c);
+    d->optpos++;
 
     /* Increment `optind' when we start to process its last character.  */
-    if (*d->__nextchar == '\0')
+    if (*nextchar == '\0')
       ++d->optind;
 
     if (temp == NULL || c == ':')
@@ -832,9 +844,9 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	int option_index;
 
 	/* This is an option that requires an argument.  */
-	if (*d->__nextchar != '\0')
+	if (*nextchar != '\0')
 	  {
-	    d->optarg = d->__nextchar;
+	    d->optarg = nextchar;
 	    /* If we end this ARGV-element by taking the rest as an arg,
 	       we must advance to the next element now.  */
 	    d->optind++;
@@ -879,20 +891,21 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	  /* We already incremented `d->optind' once;
 	     increment it again when taking next ARGV-elt as argument.  */
 	  d->optarg = argv[d->optind++];
+	d->optpos = 0;
 
 	/* optarg is now the argument, see if it's in the
 	   table of longopts.  */
 
-	for (d->__nextchar = nameend = d->optarg; *nameend && *nameend != '=';
+	for (nextchar = nameend = d->optarg; *nameend && *nameend != '=';
 	     nameend++)
 	  /* Do nothing.  */ ;
 
 	/* Test all long options for either exact match
 	   or abbreviated matches.  */
 	for (p = longopts, option_index = 0; p->name; p++, option_index++)
-	  if (!strncmp (p->name, d->__nextchar, nameend - d->__nextchar))
+	  if (!strncmp (p->name, nextchar, nameend - nextchar))
 	    {
-	      if ((unsigned int) (nameend - d->__nextchar) == strlen (p->name))
+	      if ((unsigned int) (nameend - nextchar) == strlen (p->name))
 		{
 		  /* Exact match found.  */
 		  pfound = p;
@@ -937,7 +950,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 			 argv[0], argv[d->optind]);
 #endif
 	      }
-	    d->__nextchar += strlen (d->__nextchar);
+	    d->optpos = 0;
 	    d->optind++;
 	    return '?';
 	  }
@@ -981,7 +994,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 #endif
 		      }
 
-		    d->__nextchar += strlen (d->__nextchar);
+		    d->optpos = 0;
 		    return '?';
 		  }
 	      }
@@ -1019,11 +1032,11 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 				 argv[0], argv[d->optind - 1]);
 #endif
 		      }
-		    d->__nextchar += strlen (d->__nextchar);
+		    d->optpos = 0;
 		    return optstring[0] == ':' ? ':' : '?';
 		  }
 	      }
-	    d->__nextchar += strlen (d->__nextchar);
+	    d->optpos = 0;
 	    if (longind != NULL)
 	      *longind = option_index;
 	    if (pfound->flag)
@@ -1033,7 +1046,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	      }
 	    return pfound->val;
 	  }
-	  d->__nextchar = NULL;
+	  d->optpos = 0;
 	  return 'W';	/* Let the application handle it.   */
       }
 #endif
@@ -1042,21 +1055,21 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	if (temp[2] == ':')
 	  {
 	    /* This is an option that accepts an argument optionally.  */
-	    if (*d->__nextchar != '\0')
+	    if (*nextchar != '\0')
 	      {
-		d->optarg = d->__nextchar;
+		d->optarg = nextchar;
 		d->optind++;
 	      }
 	    else
 	      d->optarg = NULL;
-	    d->__nextchar = NULL;
+	    d->optpos = 0;
 	  }
 	else
 	  {
 	    /* This is an option that requires an argument.  */
-	    if (*d->__nextchar != '\0')
+	    if (*nextchar != '\0')
 	      {
-		d->optarg = d->__nextchar;
+		d->optarg = nextchar;
 		/* If we end this ARGV-element by taking the rest as an arg,
 		   we must advance to the next element now.  */
 		d->optind++;
@@ -1101,7 +1114,7 @@ _getopt_internal_r (int argc, char *const *argv, const char *optstring,
 	      /* We already incremented `optind' once;
 		 increment it again when taking next ARGV-elt as argument.  */
 	      d->optarg = argv[d->optind++];
-	    d->__nextchar = NULL;
+	    d->optpos = 0;
 	  }
       }
     return c;
@@ -1115,12 +1128,14 @@ _getopt_internal (int argc, char *const *argv, const char *optstring,
   int result;
 
   getopt_data.optind = optind;
+  getopt_data.optpos = __optpos;
   getopt_data.opterr = opterr;
 
   result = _getopt_internal_r (argc, argv, optstring, longopts,
 			       longind, long_only, &getopt_data);
 
   optind = getopt_data.optind;
+  __optpos = getopt_data.optpos;
   optarg = getopt_data.optarg;
   optopt = getopt_data.optopt;
 
diff --git a/libc/unistd/getopt_int.h b/libc/unistd/getopt_int.h
index a70fa8b..4c03fe0 100644
--- a/libc/unistd/getopt_int.h
+++ b/libc/unistd/getopt_int.h
@@ -42,6 +42,13 @@ struct _getopt_data
      variables, except that they are used for the reentrant
      versions of getopt.  */
   int optind;
+  /* The next char to be scanned in the option-element
+     in which the last option character we returned was found.
+     This allows us to pick up the scan where we left off.
+
+     If this is zero, it means resume the scan
+     by advancing to the next ARGV-element.  */
+  int optpos;
   int opterr;
   char *optarg;
   smalluint optopt; /* we store characters here, a byte is enough */
@@ -84,14 +91,6 @@ struct _getopt_data
   /* If the POSIXLY_CORRECT environment variable is set.  */
   smallint __posixly_correct;
 
-  /* The next char to be scanned in the option-element
-     in which the last option character we returned was found.
-     This allows us to pick up the scan where we left off.
-
-     If this is zero, or a null string, it means resume the scan
-     by advancing to the next ARGV-element.  */
-  char *__nextchar;
-
 
   /* Handle permutation of arguments.  */
 

  reply	other threads:[~2017-08-28 15:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-28 10:18 Denys Vlasenko
2017-08-28 15:17 ` Denys Vlasenko [this message]
2017-08-28 15:28 ` Rich Felker
2017-08-29 11:32   ` Denys Vlasenko
2017-08-29 12:20     ` Rich Felker
2017-08-29 12:47       ` Denys Vlasenko
2017-08-29 13:07         ` Rich Felker
2017-08-29 16:47           ` Denys Vlasenko
2017-08-29 17:38             ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAK1hOcOd0wAOEM7Weqp+zFtYtOFwQpAVidzvTOooSkZ0s7b8mw@mail.gmail.com \
    --to=vda.linux@googlemail.com \
    --cc=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    --cc=wbx@openadk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).