* possible getopt stderr output changes
From: Rich Felker @ 2014-12-11 0:10 UTC
To: musl

The current getopt code uses some ugly write() sequences to generate
its output to stderr, and fails to support message translation. The
latter was an oversight when locale/translation support was added and
should absolutely be fixed. I'm not sure whether we should leave the
code using write() though or switch to fprintf.

The original motivation for write() was to avoid pulling in the printf
core and stdio in programs that use getopt but otherwise don't need
printf/stdio. However, the use of multiple write() calls splits the
messages up into multiple syscalls unnecessarily (increasing the
likelihood of getting output interleaved with other processes running
in parallel on the same stderr) and failure to use the stderr FILE
makes it so the output is not even atomic within the same process. I
don't think there's any formal requirement of atomicity here, but it
could be seen as a QoI issue.

Note that even converted to use fprintf, the code would still be
mildly ugly, since it would have to use multiple %s formats and locale
lookups to construct the message. This is because musl security policy
forbids use of translatable format strings in libc; instead,
translatable literals have to be used and processed by a fixed,
non-translated format string.

Thoughts on what color the bikeshed should be?

Rich
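The fixed-format policy described in this message can be illustrated with a minimal sketch. It assumes a hypothetical `translate()` lookup (a stand-in, not musl's internal API) and an illustrative message text; the point is only that the translated literal is passed as a `%s` argument while the format string itself never comes from a catalog.

```c
#include <stdio.h>

/* Hypothetical stand-in for a message-catalog lookup. A real libc would
 * consult the current locale; this sketch returns the input unchanged. */
static const char *translate(const char *msgid)
{
    return msgid;
}

/* The format string is a fixed, untranslated literal. Only the %s
 * argument comes from the catalog, so a broken or hostile translation
 * cannot introduce conversion specifiers. Returns fprintf's result. */
int report_bad_option(FILE *f, const char *prog, int opt)
{
    return fprintf(f, "%s: %s: %c\n", prog,
                   translate("unrecognized option"), opt);
}
```

A caller would invoke something like `report_bad_option(stderr, argv[0], optopt)`; the function name and message wording are assumptions made for the example.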
* Re: possible getopt stderr output changes
From: Laurent Bercot @ 2014-12-11 3:53 UTC
To: musl

On 11/12/2014 01:10, Rich Felker wrote:
> The current getopt code uses some ugly write() sequences to generate
> its output to stderr, and fails to support message translation. The
> latter was an oversight when locale/translation support was added and
> should absolutely be fixed. I'm not sure whether we should leave the
> code using write() though or switch to fprintf.

For what it's worth, I may use getopt() sometime, but I will never, ever
use stdio, which should burn in the deepest pits of Hell, and I'm being
nuanced here.
Please don't tie a reasonable interface to the flying kitchen sink
monster just because it's guilty of having to write stuff to stderr in
one particular case. It doesn't deserve that much punishment.

> printf/stdio. However, the use of multiple write() calls splits the
> messages up into multiple syscalls unnecessarily (increasing the
> likelihood of getting output interleaved with other processes running
> in parallel on the same stderr)

It is rare for getopt to return a parsing error when the program is
used without an interactive terminal: scripts are usually debugged
before they're daemonized. Most use cases of getopt writing to stderr
are interactive, so the likelihood of interleaving output is low.

That said, I'm all for buffering, but is there anything more to do
than print localized versions of "illegal option" and "option requires
an argument", with some locale-independent data prepended and appended?
Isn't it possible to compute the size of the final string in advance,
and build it in a temporary buffer on the stack, before writing?
It's simple buffering: neither stdio's formatting engine, nor its
FILE plate of noodles, are needed.

> Thoughts on what color the bikeshed should be?

I don't mind the color, but let's keep it SUV-free.

-- 
Laurent
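The manual buffering suggested here, assembling the whole message on the stack and issuing a single write(), might look roughly like the sketch below. `build_opt_error` and `emit_opt_error` are hypothetical names, the 256-byte buffer is an arbitrary assumption, and a message that does not fit is rejected rather than truncated.

```c
#include <string.h>
#include <unistd.h>

/* Append n bytes of src at offset *pos if they fit in a buffer of size
 * cap. Returns 1 on success, 0 if the piece would overflow. */
static int put(char *buf, size_t cap, size_t *pos, const char *src, size_t n)
{
    if (n > cap - *pos) return 0;
    memcpy(buf + *pos, src, n);
    *pos += n;
    return 1;
}

/* Build "prog: msg: c\n" in buf (no terminator).
 * Returns the message length, or 0 if it does not fit. */
size_t build_opt_error(char *buf, size_t cap,
                       const char *prog, const char *msg, char opt)
{
    size_t pos = 0;
    if (put(buf, cap, &pos, prog, strlen(prog)) &&
        put(buf, cap, &pos, ": ", 2) &&
        put(buf, cap, &pos, msg, strlen(msg)) &&
        put(buf, cap, &pos, ": ", 2) &&
        put(buf, cap, &pos, &opt, 1) &&
        put(buf, cap, &pos, "\n", 1))
        return pos;
    return 0;
}

/* Emit the message with one write() call.
 * Returns 0 on success, -1 if it did not fit or the write fell short. */
int emit_opt_error(int fd, const char *prog, const char *msg, char opt)
{
    char buf[256]; /* assumed bound; not derived from any real limit */
    size_t len = build_opt_error(buf, sizeof buf, prog, msg, opt);
    if (!len) return -1;
    return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
}
```

The fixed bound is exactly what breaks down once an unbounded component, such as a long option name, has to be included in the message.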
* Re: possible getopt stderr output changes
From: Rich Felker @ 2014-12-11 6:44 UTC
To: musl

On Thu, Dec 11, 2014 at 04:53:52AM +0100, Laurent Bercot wrote:
> On 11/12/2014 01:10, Rich Felker wrote:
> >The current getopt code uses some ugly write() sequences to generate
> >its output to stderr, and fails to support message translation. The
> >latter was an oversight when locale/translation support was added and
> >should absolutely be fixed. I'm not sure whether we should leave the
> >code using write() though or switch to fprintf.
>
> For what it's worth, I may use getopt() sometime, but I will never, ever
> use stdio, which should burn in the deepest pits of Hell, and I'm being
> nuanced here.

Is there a reason behind this? On my build, the printf core is ~6.5k
and the other parts of stdio you might be likely to pull in are under
2k. I'm happy to take your opinion into consideration but it would be
nice to have some rationale.

> Please don't tie a reasonable interface to the flying kitchen sink
> monster just because it's guilty of having to write stuff to stderr in
> one particular case. It doesn't deserve that much punishment.

Personally I find stdio a lot more reasonable than getopt. The latter
has ugly global state, including possibly hidden internal state with
no standard way to reset it. It works well enough for most things
(because you can pretend the global state is a sort of main-local
state), but it's a problem if you want to handle multiple virtual
command lines in the same process (things like busybox-type shell with
builtins, or a program handling input from network, GUI, etc. as
command lines to be parsed like options, etc.).

> >printf/stdio. However, the use of multiple write() calls splits the
> >messages up into multiple syscalls unnecessarily (increasing the
> >likelihood of getting output interleaved with other processes running
> >in parallel on the same stderr)
>
> It is rare for getopt to return a parsing error when the program is
> used without an interactive terminal: scripts are usually debugged
> before they're daemonized. Most use cases of getopt writing to stderr
> are interactive, so the likelihood of interleaving output is low.

This is certainly true.

> That said, I'm all for buffering, but is there anything more to do
> than print localized versions of "illegal option" and "option requires
> an argument", with some locale-independent data prepended and appended?
> Isn't it possible to compute the size of the final string in advance,
> and build it in a temporary buffer on the stack, before writing?
> It's simple buffering: neither stdio's formatting engine, nor its
> FILE plate of noodles, are needed.

For proper reporting of errors with long options (note: currently this
is not done right), at least one component of the message, the option
name, has unbounded size, so there's no simple way to generate the
whole message in a buffer. And even if we just did as much as we
could, the code for buffering would be ugly and increase code size by
at least a few hundred bytes I think. So this doesn't sound like much
of a win over just doing the current multiple-write() approach.

And yes you're right about the nature of the translatable portion and
locale-independent portion of the messages.

Rich
* Re: possible getopt stderr output changes
From: Laurent Bercot @ 2014-12-11 15:40 UTC
To: musl

On 11/12/2014 07:44, Rich Felker wrote:
> Is there a reason behind this? On my build, the printf core is ~6.5k
> and the other parts of stdio you might be likely to pull in are under
> 2k. I'm happy to take your opinion into consideration but it would be
> nice to have some rationale.

6.5k, or even 8.5k, is not much in the grand scale of things, but it's
about the ratio of useful pulled in code / total pulled in code, which
I like to be as close as possible to 1. And stdio tanks that ratio,
see below. The modest size of the printf code is a testimony to the
efficiency of the musl implementation, not to the sanity of the
interface.

> Personally I find stdio a lot more reasonable than getopt.

I dislike stdio for several reasons:

- The formatting engine is certainly convenient, but it is basically
  a runtime interpreter, which has to be entirely pulled in as soon as
  there's a format string, no matter how simple the formatting is.
  (Unless compilers perform specific static analysis on format strings
  to know which part of the interpreter they have to pull, but I doubt
  this is the case; gcc magically replaces printf(x) with puts(x) when
  x is devoid of format operations, and it is ugly enough as is.)
  That means I have to pull in the formatting code for floating point
  numbers, even if I only handle integers and strings; I have to pull in
  the code for edge cases of the specification, including the infamous
  "%n$" format, even if I never need it; I have to pull in varargs even
  if I only do very regular things with a fixed number of arguments.
  Most of the time I just want to print a string, a character, or an
  integer: being able to do this shouldn't add more than 2k to my
  executable, at most.

- The FILE interface is not by any measure suited to reliable I/O.
  When printf fails, there's no way to know how many bytes have been
  written to the descriptor. Same with fclose: if it fails, and the
  buffer was not empty, there's no way to know if everything was written.
  Having the same structure for buffered (stdout) and unbuffered (stderr)
  output is unnecessarily confusing; and don't get me started on buffered
  input, the details of which users have exactly zero control over. FILE
  is totally unusable for asynchronous I/O, which is 99% of what I do;
  it's just good enough to write error messages to stderr, where you don't
  need accurate reporting - in which case you can even do without stdio
  because stderr is unbuffered anyway.

stdio, like a lot of today's standards, is only there because it's
historical, and interface designers didn't know better at the time.
It being a widely used and established standard doesn't mean that
it's a good standard, by far.

> [getopt]
> has ugly global state, including possibly hidden internal state with
> no standard way to reset it. It works well enough for most things
> (because you can pretend the global state is a sort of main-local
> state), but it's a problem if you want to handle multiple virtual
> command lines in the same process

I agree, it's ugly; but global state is a known problem and it's
easy to fix. It's already been fixed for pwd/grp/netdb, for localtime,
and a lot of other interfaces; it's only a matter of time before some
kind of getopt_r() is standardized.

> For proper reporting of errors with long options (note: currently this
> is not done right), at least one component of the message, the option
> name, has unbounded size, so there's no simple way to generate the
> whole message in a buffer.

Ah, long options. I have no idea how feasible it is to keep getopt
and getopt_long as separated as possible, but I wouldn't mind at all
if getopt_long (but not getopt) relied on stdio. Because programs
using getopt_long are likely to already be using stdio anyway, and
this is probably GNU so no one cares about code size. :)

> So this doesn't sound like much
> of a win over just doing the current multiple-write() approach.

Since it mostly happens in the interactive case, avoiding multiple
writes is essentially an artistic consideration. I was just interested
in learning why you hadn't suggested manual buffering.

-- 
Laurent
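As a rough illustration of the argument that printing an integer does not need a format interpreter, a hand-rolled decimal formatter takes only a few lines. `fmt_uint` is a hypothetical helper written for this sketch; it is not taken from musl or skalibs.

```c
#include <stddef.h>

/* Write the decimal form of n into buf (no terminator).
 * Returns the number of bytes written, or 0 if cap is too small. */
size_t fmt_uint(char *buf, size_t cap, unsigned long n)
{
    char tmp[3 * sizeof n]; /* more than enough digits for unsigned long */
    size_t len = 0;

    /* Produce digits least-significant first. */
    do {
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n);

    if (len > cap) return 0;

    /* Copy them out in reading order. */
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i];
    return len;
}
```

The trade-off raised in the thread still applies: each such helper is small, but a program needing many conversions ends up reassembling pieces of what printf already provides.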
* stdio [de]merits discussion [Re: [musl] possible getopt stderr output changes]
From: Rich Felker @ 2014-12-11 17:51 UTC
To: musl

On Thu, Dec 11, 2014 at 04:40:15PM +0100, Laurent Bercot wrote:
> On 11/12/2014 07:44, Rich Felker wrote:
> >Is there a reason behind this? On my build, the printf core is ~6.5k
> >and the other parts of stdio you might be likely to pull in are under
> >2k. I'm happy to take your opinion into consideration but it would be
> >nice to have some rationale.
>
> 6.5k, or even 8.5k, is not much in the grand scale of things, but it's
> about the ratio of useful pulled in code / total pulled in code, which
> I like to be as close as possible to 1. And stdio tanks that ratio,
> see below. The modest size of the printf code is a testimony to the
> efficiency of the musl implementation, not to the sanity of the
> interface.
>
> >Personally I find stdio a lot more reasonable than getopt.
>
> I dislike stdio for several reasons:
>
> - The formatting engine is certainly convenient, but it is basically

I like it because in all but the tiniest programs, you end up needing
this kind of functionality, and whenever somebody rolls their own,
it's inevitably 10x to 100x larger and uglier than musl's printf core.

> a runtime interpreter, which has to be entirely pulled in as soon as
> there's a format string, no matter how simple the formatting is.
> (Unless compilers perform specific static analysis on format strings
> to know which part of the interpreter they have to pull, but I doubt
> this is the case; gcc magically replaces printf(x) with puts(x) when
> x is devoid of format operations, and it is ugly enough as is.)
> That means I have to pull in the formatting code for floating point
> numbers, even if I only handle integers and strings; I have to pull in
> the code for edge cases of the specification, including the infamous
> "%n$" format, even if I never need it; I have to pull in varargs even
> if I only do very regular things with a fixed number of arguments.
> Most of the time I just want to print a string, a character, or an
> integer: being able to do this shouldn't add more than 2k to my
> executable, at most.

Of all that, the only thing contributing non-trivial size is floating
point support.

> - The FILE interface is not by any measure suited to reliable I/O.

This is certainly true.

> When printf fails, there's no way to know how many bytes have been
> written to the descriptor.

For seekable files, ftello can tell you.

Generally I agree with this reasoning, that stdio is not the right
tool for working in-place on valuable files. But it's perfectly usable
for producing new output in cases where all write errors will simply
result in failing the whole "make a file" operation.

> Same with fclose: if it fails, and the
> buffer was not empty, there's no way to know if everything was written.

This is solved by fflush before fclose.

> Having the same structure for buffered (stdout) and unbuffered (stderr)
> output is unnecessarily confusing; and don't get me started on buffered
> input, the details of which users have exactly zero control over. FILE

Stdio read operations should not block unless more data is needed to
satisfy the actual request the application is making. If they do it's
an implementation bug.

Of course it's not usable with select/poll loops because you can't see
if there's data already in the buffer. GNU software (gnulib in
particular) likes to ignore this problem by poking at internals; we
gave them an alternate solution with musl a couple years back just to
avoid this. :(

> is totally unusable for asynchronous I/O, which is 99% of what I do;

For event-driven models, yes. For threaded models, it's quite usable
and IMO it simplifies code by a larger factor than the size it adds,
in cases where it's sufficient.

> it's just good enough to write error messages to stderr, where you don't
> need accurate reporting - in which case you can even do without stdio
> because stderr is unbuffered anyway.

The big thing it provides here is a standard point of synchronization
for error messages in multithreaded programs. Otherwise there would be
no lock for different library components to agree on to prevent
interleaved error output.

> stdio, like a lot of today's standards, is only there because it's
> historical, and interface designers didn't know better at the time.
> It being a widely used and established standard doesn't mean that
> it's a good standard, by far.

Yes and no. There are some things that could have been done better,
and some backwards-compatible additions that could be made to make it
a lot more useful, but I think stdio still largely succeeds in freeing
the programmer from having to spend lots of effort on IO code, for a
large class of useful programs (certainly not all, though!).

> >So this doesn't sound like much
> >of a win over just doing the current multiple-write() approach.
>
> Since it mostly happens in the interactive case, avoiding multiple
> writes is essentially an artistic consideration. I was just interested
> in learning why you hadn't suggested manual buffering.

I agree that it's essentially an artistic consideration.

Rich
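The "fflush before fclose" remark can be sketched as a small wrapper for the "fail the whole make-a-file operation" use case. `close_checked` is a hypothetical name; the sketch only reports whether everything reached the descriptor, not how much did.

```c
#include <stdio.h>

/* Flush and close f. Returns 0 only if no earlier write error was
 * recorded, the final flush succeeded, and the close itself succeeded;
 * otherwise returns -1. The stream is closed in every case. */
int close_checked(FILE *f)
{
    /* ferror() catches failures from earlier writes; fflush() pushes
     * out whatever is still buffered and reports a failure to do so. */
    int failed = ferror(f) || fflush(f) != 0;

    if (fclose(f) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

A caller producing a new file would typically remove the partial output when this returns -1, which matches the all-or-nothing usage described in the message.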
* Re: stdio [de]merits discussion [Re: [musl] possible getopt stderr output changes]
From: Laurent Bercot @ 2014-12-11 23:05 UTC
To: musl

On 11/12/2014 18:51, Rich Felker wrote:
> I like it because in all but the tiniest programs, you end up needing
> this kind of functionality, and whenever somebody rolls their own,
> it's inevitably 10x to 100x larger and uglier than musl's printf core.

You haven't tried skalibs. ;)
(Sure, calling format functions individually is far from being as
convenient, but the resulting code path is much shorter and there's
no bloat.)
I agree we need standards. I just wish the existing standards were
better, and I don't want to be forced to use them.

> Of all that, the only thing contributing non-trivial size is floating
> point support.

Yes, that's the main thing, but it's an important one: in system
programming, floating point operations are uncommon - someone who
cares about code size is probably not using floating points.

> For seekable files, ftello can tell you.

Same thing: system programming is more about pipes and sockets than
seekable files. In applications that write files, the interesting
logic is probably not in the I/O, and they don't care.

> But it's perfectly usable for producing new output in
> cases where all write errors will simply result in failing the whole
> "make a file" operation.

I agree.

> This is solved by fflush before fclose.

I'm surprised that you of all people say this. What if another thread
writes to the FILE between the fflush and the fclose? Granted, if the
situation arises, it's probably a programming error, but still, since
atomicity is a big thing for FILE, needing 2 operations instead of 1
doesn't scream good design.

> GNU software (gnulib in particular) likes to ignore this problem by poking
> at internals; we gave them an alternate solution with musl a couple
> years back just to avoid this. :(

Jesus. And you still argue that it's a usable interface, if people have
to hack internal implementation details to get a simple readability
notification working?

> For event-driven models, yes. For threaded models, it's quite usable
> and IMO it simplifies code by a larger factor than the size it adds,
> in cases where it's sufficient.

"If you can't write asynchronous code, use threads and write synchronous
code." :-Þ
I agree that threads are a good paradigm to have, but the choice of
which model to use should not be dictated by the indigence of available
interfaces.

> The big thing it provides here is a standard point of synchronization
> for error messages in multithreaded programs. Otherwise there would be
> no lock for different library components to agree on to prevent
> interleaved error output.

write() guarantees atomicity up to PIPE_BUF bytes. I have never seen
an stderr error message that was bigger than that.

> Yes and no. There are some things that could have been done better,
> and some backwards-compatible additions that could be made to make it
> a lot more useful, but I think stdio still largely succeeds in freeing
> the programmer from having to spend lots of effort on IO code, for a
> large class of useful programs (certainly not all, though!).

I agree it's good enough for Hello World and applications that just
need very basic I/O. What irks me is that stdio sets a potential barrier
to designing better I/O interfaces, and people who need reliable I/O
management often still contort themselves to use stdio, and the results
are ugly. See the aforementioned gnulib case.

-- 
Laurent
* Re: stdio [de]merits discussion [Re: [musl] possible getopt stderr output changes]
From: Rich Felker @ 2014-12-11 23:35 UTC
To: musl

On Fri, Dec 12, 2014 at 12:05:28AM +0100, Laurent Bercot wrote:
> On 11/12/2014 18:51, Rich Felker wrote:
> >I like it because in all but the tiniest programs, you end up needing
> >this kind of functionality, and whenever somebody rolls their own,
> >it's inevitably 10x to 100x larger and uglier than musl's printf core.
>
> You haven't tried skalibs. ;)

I'm not thinking of things I've tried using myself but rather things
I've seen in mainstream, deployed software. Hideous stuff like APR.
Let me know when you convince folks using APR, NSPR, etc. to switch to
skalibs. :-)

I suppose my use of the word "inevitably" was wrong, but I was trying
to talk about stuff I've seen happen in the wild rather than what's in
the realm of possibility.

> >This is solved by fflush before fclose.
>
> I'm surprised that you of all people say this. What if another thread
> writes to the FILE between the fflush and the fclose? Granted, if the
> situation arises, it's probably a programming error, but still, since
> atomicity is a big thing for FILE, needing 2 operations instead of 1
> doesn't scream good design.

In that case you have UB, since the write from another thread could
equally happen after the fclose (it's unsynchronized) resulting in use
of an invalid FILE*. So it's not an interesting case. The point being:
you can't close a FILE without ensuring that no other threads could
still be accessing it. In practice, aside from the standard streams,
it's really unusual to have simultaneous accesses to the same stream
from multiple threads anyway.

> >GNU software (gnulib in particular) likes to ignore this problem by poking
> >at internals; we gave them an alternate solution with musl a couple
> >years back just to avoid this. :(
>
> Jesus. And you still argue that it's a usable interface, if people have
> to hack internal implementation details to get a simple readability
> notification working?

No, I think this is a misuse of it for something it's not good for. I
don't advocate this kind of hackery at all. The only reason I went to
the effort to support it and negotiate a solution with the gnulib
people was that otherwise we risked them writing hacks to poke at the
internals, which are intentionally allowed to change between versions
of libc.so, thereby making binaries that crash and burn when libc.so
is upgraded. Given rate of musl adoption at the time, this would have
just made musl look bad, even if it was 100% their fault. It would
also have been a big mess for distros to fix.

> >For event-driven models, yes. For threaded models, it's quite usable
> >and IMO it simplifies code by a larger factor than the size it adds,
> >in cases where it's sufficient.
>
> "If you can't write asynchronous code, use threads and write synchronous
> code." :-Þ
> I agree that threads are a good paradigm to have, but the choice of
> which model to use should not be dictated by the indigence of available
> interfaces.

Agreed. I think stdio is a good choice for many things if you are
using threads though.

> >The big thing it provides here is a standard point of synchronization
> >for error messages in multithreaded programs. Otherwise there would be
> >no lock for different library components to agree on to prevent
> >interleaved error output.
>
> write() guarantees atomicity up to PIPE_BUF bytes. I have never seen
> an stderr error message that was bigger than that.

Only for pipes. Ordinary files are also required by POSIX to have
atomic write(), but Linux fails to deliver on this requirement, and
the only correct way for a kernel to deliver is by returning a short
write (which undermines the atomicity) when the full write would
sleep. Terminals have no atomicity at all.

Stdio cannot make the fd atomic, but it does make _arbitrarily large_
output atomic within a single process (i.e. assuming other processes
aren't writing to the file at the same time) simply by doing the
locking in userspace where it's not a DoS issue.

> >Yes and no. There are some things that could have been done better,
> >and some backwards-compatible additions that could be made to make it
> >a lot more useful, but I think stdio still largely succeeds in freeing
> >the programmer from having to spend lots of effort on IO code, for a
> >large class of useful programs (certainly not all, though!).
>
> I agree it's good enough for Hello World and applications that just
> need very basic I/O.

I think it also works fine for basically all programs that fit the
unix model of running as a "do one thing and do it well" utility that
reads input from stdin and/or one or more files and writes output to
one file (possibly stdout).

> What irks me is that stdio sets a potential barrier
> to designing better I/O interfaces, and people who need reliable I/O
> management often still contort themselves to use stdio, and the results
> are ugly. See the aforementioned gnulib case.

Perhaps I should look at how skalibs does it.

Rich
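Since write() may legitimately return short, code that bypasses stdio usually wraps it in a retry loop. The following is a minimal sketch; `write_all` is a hypothetical helper. It restores completeness of the output but not atomicity: another writer can still slip in between iterations.

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all len bytes are out.
 * Returns len on success, -1 on error (errno is set by write()). */
ssize_t write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    size_t left = len;

    while (left) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        p += n;                    /* short write: advance and go again */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

This is the per-descriptor counterpart of what the stdio lock provides per process: the loop guarantees the bytes arrive, while only a lock held across the whole sequence keeps other threads of the same process from interleaving.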
* Re: stdio [de]merits discussion [Re: [musl] possible getopt stderr output changes]
From: Morten Welinder @ 2014-12-12 2:33 UTC
To: musl

It sounds a bit like the main beef you have with printf is that it
pulls in floating point stuff. Absent that, it really isn't a lot of
code and it really doesn't eat much stack space.

So maybe what you are looking for is a way to compile away all
floating point support.

M.
* Re: possible getopt stderr output changes
From: Rich Felker @ 2014-12-11 22:07 UTC
To: musl

On Wed, Dec 10, 2014 at 07:10:32PM -0500, Rich Felker wrote:
> The current getopt code uses some ugly write() sequences to generate
> its output to stderr, and fails to support message translation. The
> latter was an oversight when locale/translation support was added and
> should absolutely be fixed. I'm not sure whether we should leave the
> code using write() though or switch to fprintf.

It's been pointed out on irc that POSIX requires ferror(stderr) to be
set if writing the message fails. However fwrite could still be used
instead of fprintf. If we need to use stdio at all, however, I'd lean
towards wanting to make the whole write atomic (i.e. hold the lock for
the whole time) which is more of a pain without fprintf. So basically
we're looking at:

fprintf:
PROS: smaller and simpler code in getopt.c, only one syscall
CONS: ~6.5k of additional code pulled in for static linking

fwrite:
PROS: minimal static linking deps
CONS: need to use flockfile (or implementation internals) for
atomicity if desired, and multiple writes (so no atomicity on the fd)

Rich
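The fwrite option listed above, with the stream lock held across all the pieces, could look roughly like this. It is a sketch of the trade-off described in the message, not musl's actual getopt code; `locked_opt_error` and the message layout are assumptions. Any failed piece leaves the stream's error indicator set, so `ferror()` reports it afterwards.

```c
#define _POSIX_C_SOURCE 200809L /* for flockfile/funlockfile */
#include <stdio.h>
#include <string.h>

/* Write "prog: msg: name\n" to f while holding the stream lock, so other
 * threads in the same process cannot interleave their own output.
 * Returns 0 on success, -1 if any piece failed. */
int locked_opt_error(FILE *f, const char *prog, const char *msg,
                     const char *name)
{
    size_t len = strlen(name); /* name may be arbitrarily long */
    int ok;

    flockfile(f);
    ok = fputs(prog, f) >= 0
      && fputs(": ", f) >= 0
      && fputs(msg, f) >= 0
      && fputs(": ", f) >= 0
      && fwrite(name, 1, len, f) == len
      && putc_unlocked('\n', f) != EOF;
    funlockfile(f);

    return ok ? 0 : -1;
}
```

On an unbuffered stream such as stderr each call may still become its own write(), which is the "no atomicity on the fd" drawback noted in the CONS line; the lock only serializes writers inside one process.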
* Re: possible getopt stderr output changes
  2014-12-11 22:07 ` possible getopt stderr output changes Rich Felker
@ 2014-12-13 0:02 ` Isaac Dunham
  2014-12-13 3:11 ` Rich Felker
  2014-12-19 21:49 ` Rich Felker
  1 sibling, 1 reply; 12+ messages in thread
From: Isaac Dunham @ 2014-12-13 0:02 UTC (permalink / raw)
  To: musl

On Thu, Dec 11, 2014 at 05:07:56PM -0500, Rich Felker wrote:
> On Wed, Dec 10, 2014 at 07:10:32PM -0500, Rich Felker wrote:
> > The current getopt code uses some ugly write() sequences to generate
> > its output to stderr, and fails to support message translation. The
> > latter was an oversight when locale/translation support was added and
> > should absolutely be fixed. I'm not sure whether we should leave the
> > code using write() though or switch to fprintf.
>
> It's been pointed out on irc that POSIX requires ferror(stderr) to be
> set if writing the message fails. However fwrite could still be used
> instead of fprintf. If we need to use stdio at all, however, I'd lean
> towards wanting to make the whole write atomic (i.e. hold the lock for
> the whole time) which is more of a pain without fprintf. So basically
> we're looking at:
>
> fprintf:
> PROS: smaller and simpler code in getopt.c, only one syscall
> CONS: ~6.5k of additional code pulled in for static linking
>
> fwrite:
> PROS: minimal static linking deps
> CONS: need to use flockfile (or implementation internals) for
> atomicity if desired, and multiple writes (so no atomicity on the fd)

I realize there's quality of implementation to be concerned about and
similar issues, but I'm really wondering:

How brain-damaged does code have to be to call getopt() from a thread,
*after* starting a second thread and beginning writes to stderr?
Is there any real-world use of this?

Thanks,
Isaac Dunham

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: possible getopt stderr output changes
  2014-12-13 0:02 ` Isaac Dunham
@ 2014-12-13 3:11 ` Rich Felker
  0 siblings, 0 replies; 12+ messages in thread
From: Rich Felker @ 2014-12-13 3:11 UTC (permalink / raw)
  To: musl

On Fri, Dec 12, 2014 at 04:02:24PM -0800, Isaac Dunham wrote:
> I realize there's quality of implementation to be concerned about and
> similar issues, but I'm really wondering:
>
> How brain-damaged does code have to be to call getopt() from a thread,
> *after* starting a second thread and beginning writes to stderr?
> Is there any real-world use of this?

The way it's most likely to happen is when something that runs early
in main, or from a global ctor, starts threads, and they encounter
errors to print to stderr. Generally I think any 'library level' use of
stderr (or other standard streams) is bad design and so in my opinion,
this shouldn't happen, but plenty of people disagree.

Rich

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: possible getopt stderr output changes
  2014-12-11 22:07 ` possible getopt stderr output changes Rich Felker
  2014-12-13 0:02 ` Isaac Dunham
@ 2014-12-19 21:49 ` Rich Felker
  1 sibling, 0 replies; 12+ messages in thread
From: Rich Felker @ 2014-12-19 21:49 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1946 bytes --]

On Thu, Dec 11, 2014 at 05:07:56PM -0500, Rich Felker wrote:
> On Wed, Dec 10, 2014 at 07:10:32PM -0500, Rich Felker wrote:
> > The current getopt code uses some ugly write() sequences to generate
> > its output to stderr, and fails to support message translation. The
> > latter was an oversight when locale/translation support was added and
> > should absolutely be fixed. I'm not sure whether we should leave the
> > code using write() though or switch to fprintf.
>
> It's been pointed out on irc that POSIX requires ferror(stderr) to be
> set if writing the message fails. However fwrite could still be used
> instead of fprintf. If we need to use stdio at all, however, I'd lean
> towards wanting to make the whole write atomic (i.e. hold the lock for
> the whole time) which is more of a pain without fprintf. So basically
> we're looking at:
>
> fprintf:
> PROS: smaller and simpler code in getopt.c, only one syscall
> CONS: ~6.5k of additional code pulled in for static linking
>
> fwrite:
> PROS: minimal static linking deps
> CONS: need to use flockfile (or implementation internals) for
> atomicity if desired, and multiple writes (so no atomicity on the fd)

I'm attaching a patch which allows either solution, and the approach
using fwrite is considerably uglier. Even using some stdio internals
directly rather than the public API, the resulting getopt.o is 176
bytes larger than the fprintf version, but I think the ugliness to
get the required semantics is the worst part. So I'm strongly leaning
towards just using fprintf.

The other viable alternative would be having an internal-use function
for printing a variadic list of strings atomically through stdio, but
I still think there's probably more value in keeping getopt.c
independent of musl internals as much as possible.

Adding support for translation of the messages is a separate step yet
to be done, but the code is set up to support it either way.

Rich

[-- Attachment #2: getopt_stdio.diff --]
[-- Type: text/plain, Size: 1837 bytes --]

diff --git a/src/misc/getopt.c b/src/misc/getopt.c
index e77e460..e2a309a 100644
--- a/src/misc/getopt.c
+++ b/src/misc/getopt.c
@@ -4,6 +4,7 @@
 #include <limits.h>
 #include <stdlib.h>
 #include "libc.h"
+#include "stdio_impl.h"
 
 char *optarg;
 int optind=1, opterr=1, optopt, __optpos, __optreset=0;
@@ -11,6 +12,21 @@ int optind=1, opterr=1, optopt, __optpos, __optreset=0;
 #define optpos __optpos
 weak_alias(__optreset, optreset);
 
+#if 0
+static void errmsg(const char *a, const char *b, int l, const char *c)
+{
+	FILE *f = stderr;
+	FLOCK(f);
+	size_t al = strlen(a), bl = strlen(b);
+	__fwritex((void *)a, al, f) == al &&
+	__fwritex((void *)b, bl, f) == bl &&
+	__fwritex((void *)c, l, f) == l &&
+	__fwritex((void *)"\n", 1, f);
+	FUNLOCK(f);
+}
+#define fprintf(fi, fo, a, b, l, c) errmsg(a, b, l, c)
+#endif
+
 int getopt(int argc, char * const argv[], const char *optstring)
 {
 	int i;
@@ -66,24 +82,18 @@ int getopt(int argc, char * const argv[], const char *optstring)
 	} while (l && d != c);
 
 	if (d != c) {
-		if (optstring[0] != ':' && opterr) {
-			write(2, argv[0], strlen(argv[0]));
-			write(2, ": illegal option: ", 18);
-			write(2, optchar, k);
-			write(2, "\n", 1);
-		}
+		if (optstring[0] != ':' && opterr)
+			fprintf(stderr, "%s%s%.*s\n", argv[0],
+				": illegal option: ", k, optchar);
 		return '?';
 	}
 	if (optstring[i] == ':') {
 		if (optstring[i+1] == ':') optarg = 0;
 		else if (optind >= argc) {
 			if (optstring[0] == ':') return ':';
-			if (opterr) {
-				write(2, argv[0], strlen(argv[0]));
-				write(2, ": option requires an argument: ", 31);
-				write(2, optchar, k);
-				write(2, "\n", 1);
-			}
+			if (opterr)
+				fprintf(stderr, "%s%s%.*s\n", argv[0],
+					": option requires an argument: ", k, optchar);
 			return '?';
 		}
 		if (optstring[i+1] != ':' || optpos) {

^ permalink raw reply [flat|nested] 12+ messages in thread
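The fixed "%s%s%.*s\n" format used in the patch can be tried in isolation with a small sketch. The function name format_opt_error is hypothetical and snprintf stands in for fprintf(stderr, ...) so the result can be inspected. The points it illustrates: the message text travels as a "%s" argument, never as the format string itself, and "%.*s" prints exactly k bytes of an option character that is not NUL-terminated at that length.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around the patch's format string. optchar
 * points into an argv string; only its first k bytes belong to the
 * offending option character (k may exceed 1 for multibyte input).
 * Returns what snprintf returns: the untruncated message length. */
int format_opt_error(char *buf, size_t cap, const char *prog,
                     const char *msg, const char *optchar, int k)
{
	assert(buf && cap > 0 && k >= 0);
	/* The format is a fixed literal; msg is ordinary string data. */
	return snprintf(buf, cap, "%s%s%.*s\n", prog, msg, k, optchar);
}
```

For example, with optchar pointing at "xyz" and k equal to 1, the buffer receives "prog: illegal option: x" followed by a newline. A translated message could later be substituted for the msg argument without touching the format string.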
end of thread

Thread overview: 12+ messages
2014-12-11 0:10 possible getopt stderr output changes Rich Felker
2014-12-11 3:53 ` Laurent Bercot
2014-12-11 6:44 ` Rich Felker
2014-12-11 15:40 ` Laurent Bercot
2014-12-11 17:51 ` stdio [de]merits discussion [Re: [musl] possible getopt stderr output changes] Rich Felker
2014-12-11 23:05 ` Laurent Bercot
2014-12-11 23:35 ` Rich Felker
2014-12-12 2:33 ` Morten Welinder
2014-12-11 22:07 ` possible getopt stderr output changes Rich Felker
2014-12-13 0:02 ` Isaac Dunham
2014-12-13 3:11 ` Rich Felker
2014-12-19 21:49 ` Rich Felker