* [musl] getopt_long() can corrupt argv when an argument for a short option is missing
From: Alexey Izbyshev @ 2023-05-25 7:53 UTC
To: musl
POSIX requires getopt() to set optind to argc + 1 in case of a missing
argument[1], and musl follows it. This bites getopt_long() (which reuses
getopt()) in two ways:
* getopt_long() moves argv[optind - 1] (NULL) when permuting argv to
make all options precede other arguments, essentially corrupting argv.
* even when permuting is not required, getopt_long() is both
incompatible with glibc (which doesn't increment optind past NULL) and
inconsistent with itself (for a long option with a missing argument,
musl doesn't increment optind past NULL either).
Example of the wrong NULL shifting:
#include <getopt.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    for (int i = 0; i < 2; i++) {
        int r = getopt_long(argc, argv, "o:", NULL, NULL);
        printf("r: %d\n", r);
        printf("optind: %d\n", optind);
        for (int i = 0; i <= argc; i++)
            printf("%d: '%s'\n", i, argv[i]);
    }
}
With glibc:
$ ./a.out arg -o
./a.out: option requires an argument -- 'o'
r: 63
optind: 3
0: './a.out'
1: 'arg'
2: '-o'
3: '(null)'
r: -1
optind: 2
0: './a.out'
1: '-o'
2: 'arg'
3: '(null)'
(Note that glibc permutes argv *before* parsing the next option, and
even before comparing optind and argc, so argv is still permuted on the
second invocation.)
With musl:
$ ./a.out arg -o
./a.out: option requires an argument: o
r: 63
optind: 3
0: './a.out'
1: '-o'
2: '(null)'
3: 'arg'
r: -1
optind: 3
0: './a.out'
1: '-o'
2: '(null)'
3: 'arg'
Maybe we could just skip permuting and adjust optind if we detected a
missing argument?
         resumed = optind;
         ret = __getopt_long_core(argc, argv, optstring, longopts, idx, longonly);
+        if (optind > argc)
+                return optind--, ret;
         if (resumed > skipped) {
On a subsequent invocation we won't permute, unlike glibc, but maybe
this is a good thing, given that such permutation makes it look like
there is no missing argument, essentially changing the command
semantics.
Alexey
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html
* Re: [musl] getopt_long() can corrupt argv when an argument for a short option is missing
From: Rich Felker @ 2023-05-25 13:25 UTC
To: musl
OK, this is indeed a mess. I think there's some inherent inconsistency
here, and in general the application should not be calling getopt*
again after a missing argument error, but argv[] should not be
clobbered and the application might semi-legitimately want to do
something with remaining non-option arguments.
Just leaving optind indexing the end of the argv array is probably not
nice. It loses all information about where non-option arguments
started.
I think there are two "kinda reasonable" options aside from what you
proposed:
1. We could leave optind where it was on invocation (so that it points
to the first non-option arg) and not do any permutation. This will
make subsequent calls to getopt_long repeat the same error over and
over, but if the caller does not attempt further calls, it would tell
the caller where the non-option args start. However, the final
option with its missing argument would also appear in this list.
2. We could permute the option with missing argument before the
remaining non-option args. I think this gives a final ordering
matching glibc, and lets the application see all of the non-option
args, without gratuitously including the option with missing arg.
However, it does produce a result that re-running getopt_long from
the start would misinterpret that option as having had an argument
(repurposing the first non-option arg as its arg). Since glibc does
this, though, apparently it's expected.
My leaning is to do option 2. I think it's as easy as getting rid of
the return part of your patch:
+ if (optind > argc)
+ optind--;
Rich
* Re: [musl] getopt_long() can corrupt argv when an argument for a short option is missing
From: Alexey Izbyshev @ 2023-05-25 14:42 UTC
To: musl
On 2023-05-25 16:25, Rich Felker wrote:
> OK, this is indeed a mess. I think there's some inherent inconsistency
> here, and in general the application should not be calling getopt*
> again after a missing argument error, but argv[] should not be
> clobbered and the application might semi-legitimately want to do
> something with remaining non-option arguments.
>
> Just leaving optind indexing the end of the argv array is probably not
> nice. It loses all information about where non-option arguments
> started.
>
> I think there are two "kinda reasonable" options aside from what you
> proposed:
>
> 1. We could leave optind where it was on invocation (so that it points
> to the first non-option arg) and not do any permutation. This will
> make subsequent calls to getopt_long repeat the same error over and
> over, but if the caller does not attempt further calls, would tell
> the caller the start of the non-option args. However, the final
> option with missing argument would also appear in this list.
>
IMO, while not unreasonable, this option would leave us incompatible
with glibc (which I assume to be the source of truth for getopt_long()).
Also, either the handling of long and short options would remain
inconsistent, or we'd have to change the former too, creating even more
incompatibility with glibc.
> 2. We could permute the option with missing argument before the
> remaining non-option args. I think this gives a final ordering
> matching glibc, and lets the application see all of the non-option
> args, without gratuitously including the option with missing arg.
> However, it does produce a result that re-running getopt_long from
> the start would misinterpret that option as having had an argument
> (repurposing the first non-option arg as its arg). Since glibc does
> this, though, apparently it's expected.
>
> My leaning is to do option 2. I think it's as easy as getting rid of
> the return part of your patch:
>
> + if (optind > argc)
> + optind--;
>
This is what I considered before switching to what I proposed. The reason
for the change is that I thought it more important to match glibc on
the getopt_long() invocation that reports a missing argument (and does
no reordering) than to mimic its subsequent reordering behavior, because
the application is unlikely to call getopt_long() again after the first
error.
However, in my patch I missed one thing: reordering would still be
performed in the same situation for long options (because "optind >
argc" is never true for them), so getopt_long() would remain inconsistent.
So, unless we want to stop doing reordering for both short and long
options to match glibc on the first getopt_long() call, I agree that
your proposal is better.
Thanks,
Alexey