mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: getopt() not exposing __optpos - shell needs it
Date: Tue, 29 Aug 2017 09:07:57 -0400
Message-ID: <20170829130757.GA1627@brightrain.aerifal.cx>
In-Reply-To: <CAK1hOcNhbaX3R7e42=iQX-LipaeWOP3hX5ZmNDNV6XCr_am2Vw@mail.gmail.com>

On Tue, Aug 29, 2017 at 02:47:13PM +0200, Denys Vlasenko wrote:
> On Tue, Aug 29, 2017 at 2:20 PM, Rich Felker <dalias@libc.org> wrote:
> >> >> When I try to do that (use getopt() to implement "getopts"), it hits a snag.
> >> >> Unlike normal getopt() usage in C programs, where it is called in a loop
> >> >> with the same argv[] array until parsing is finished,
> >> >> when it is used from "getopts", each successive call will (usually) have
> >> >> the same argv[] CONTENTS, but not the ADDRESSES.
> >> >> (The reason is in how shell works: it re-creates command arguments just before
> >> >> running a command, since there can be variable substitution, globbing, etc).
> >> >
> >> > First, some background out of the spec to establish what is supposed
> >> > to work and what's not:
> >> >
> >> >     If the application sets OPTIND to the value 1, a new set of
> >> >     parameters can be used: either the current positional parameters
> >> >     or new arg values. Any other attempt to invoke getopts multiple
> >> >     times in a single shell execution environment with parameters
> >> >     (positional parameters or arg operands) that are not the same in
> >> >     all invocations, or with an OPTIND value modified to be a value
> >> >     other than 1, produces unspecified results.
> >> >
> >> > What this means is that, when you use getopts(1), you need to either
> >> > use the exact same arguments (as you said, *string contents*, not
> >> > likely to be the same argv[] pointers) or reset it with OPTIND=1.
> >> >
> >> > It seems to me that the easiest, fully-portable fix is just the
> >> > obvious quadratic-time solution: on each run of getopts(1), reset
> >> > getopt(3) to the start and call it ++N times.
> >>
> >> This has several problems:
> >> It prints multiple messages "invalid option -q"
> >> when there are options which are not in optstring.
> >
> > opterr=0;
> >
> > Either leave it 0 and always do your own error printing, or set it
> > nonzero just before the last call (for the current option) so that
> > only that one prints an error.
> >
> >> It mangles optarg if an option without argument follows
> >> an option with an argument.
> >
> > Maybe I'm missing what you're trying to say, but all the state is
> > clobbered; I don't see how optarg is a problem specifically. You can
> > clear or set it to a sentinel value before the relevant call if you're
> > trying to determine if the call set it. Across other calls (not the
> > one for the current option) I don't see why it matters at all what
> > happens to it.
> 
> Yes, this can be done.
> 
> It gets increasingly ugly, though.
> 
> With these amounts of massaging around libc API design breakage,

Yes, the getopt API is horribly broken. It's all global state, with a
tiny portion of that state internal/inaccessible. It doesn't follow
that the solution is adding new extensions every time an application
hits an obstacle from the brokenness. The right direction for fixing
it on the libc side would be introduction (with consensus across
important implementations) of a getopt_r API or similar with no
global/internal state.

> "getopts" builtin code in hush is almost as big as simply
> reimplementing getopt(): ~500 versus ~750 bytes on x86.
> If I factor out ash getopts implementation and use it in both shells,
> I can probably even decrease code size.

I really don't think adding a single store to optarg is going to have
relevant effects on the size, so we're back to just talking about the
cost of the obvious quadratic solutions I discussed above. Yes, it may
turn out that just implementing getopts(1) without getopt(3) can be
smaller -- seems very likely if you can drop getopt(3) entirely from
the link, but less so if other code in busybox uses getopt(3) anyway
or if it's dynamically linked and thus not included in the binary anyway.
I don't know whether this makes sense for you; it's your call. One
hidden cost is that getopt does have a number of nasty corner cases
that have already been considered (or found as bugs and fixed) in
musl, but I don't know if any of them carry over to corner cases in
getopts(1); if not they're probably not relevant to you.

Rich

