* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 4:24 ` Jean-Marc Pigeon
@ 2015-04-23 5:08 ` Raphael Cohn
2015-04-23 12:29 ` Jean-Marc Pigeon
2015-04-23 9:23 ` Laurent Bercot
2015-04-23 10:10 ` Rich Felker
2 siblings, 1 reply; 19+ messages in thread
From: Raphael Cohn @ 2015-04-23 5:08 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 6488 bytes --]
Fail fast, fail early is always the right thing to do. I've seen far too
much code in my career that tries to 'carry on' after encountering an
unexpected condition. It's always the wrong thing to do, even for nuclear
power plants. Once you're in UB land, the rest of your reasoning about your
code is invalid. The best you can do is gracefully shutdown and capture as
much diagnostic info as is feasible. Expecting musl to 'cover up' should be
seen the same as in the real world - cover ups just make the problem worse.
Unexpected is not the same as expected but unpleasant to deal with eg
failures to open(). And so hence the arguments about checked and unchecked
exceptions and so on.
Oh and auto restarting daemons is almost always wrong. For exactly the same
reasoning - and frequently leads to data corruption.
On 23 Apr 2015 05:25, "Jean-Marc Pigeon" <jmp@safe.ca> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 04/22/2015 10:15 PM, Rich Felker wrote:
> > On Wed, Apr 22, 2015 at 09:26:57PM -0400, Jean-Marc Pigeon wrote:
> >>> I think the only safe conclusion is that the application is
> >>> incorrect and should ensure that setenv() is never called with
> >>> a NULL value.
> >>>
> >> Checked glibc, My understanding, it set something as "name=" in
> >> the environment, so the variable is present but value is "empty"i
> >> (top application to decide what to do). uclibc does something
> >> similar (as far I can tell looking at source code)..
> >>
> >>
> >> The application is not careful enough, but not incorrect as
> >> such.
> >
> > It's definitely incorrect. It's doing something that invokes
> > undefined behavior.
> >
> >> Note: we may have tons of applications with the same problem. if
> >> we keep musl setenv like that, musl will be seen as quite
> >> unreliable.
> >
> > No, actually glibc is fixing this bug (maybe they already did).
> > See the thread beginning here:
> >
> > https://sourceware.org/ml/libc-alpha/2015-03/threads.html#00449
> >
> > My understanding is that glibc is planning to do, or already does
> > in the latest version, exactly what musl is doing.
> >
> >> If this situation is indeed UB, there is 2 options for musl: 1)
> >> Swallow the problem nicely... as glibc and uclibc does. 2) Report
> >> an error.. EINVAL? (and document it in manual)
> >>
> >> Crashing at "libc" level is not an option.
> >
> > I can see how it might seem like that at first, but crashing is
> > actually the best possible behavior. Options 1 and 2 cover up a
> I strongly disagree, crashing is not an option for a tools as
> musl/libc.
>
> Think about this, you write an application working perfectly right,
> but 1 in 1000000 you reach something not trapped by low level and
> once in while the application (in production for month) just stop
> to work because "unexpected" within musl...
> (so someone will propose to set a cron to automatically restart this
> unreliable daemon, hmmm...)
>
> Far better to return "trouble" status, then it is to the application
> to decide what must be done in context, as ignore, override, bypass,
> crash, etc.
>
> A sensible policy in case of UB would be for such low level code to
> swallow the problem, (protect the hardware and keep the program
> running as much as possible).
>
> > potentially serious bug -- it's not clear what the application was
> > trying to do, most likely nobody even thought about what they were
> > trying to do, and even if they did have something in mind it's not
> > reliable or portable. The glibc wiki has some text taken from text
> > I wrote on the topic (copied from a stack overflow answer I gave)
> > here:
>
> As reported, the crashing application is hwclock, (util-linux-2.26),
> this a kind of code in the field for a very very long time, so the
> library (glibc and old libc) used for linux over the years defined an
> expected behavior to this "UB".
> As other occurrence of this could be present too in other
> program/application, crashing would make a bench of applications
> to be running hazard (you can have all kind situation with
> env variables, near unpredictable).
>
> Something worry me in comments I have seen in the proposed URL,
> IMHO purpose of musl/glibc is not to "find bugs by crashing", its
> purpose is to be a code "clean, lean, reliable, predictable" (as said
> above, "Protect the hardware, report problem, lets the above layer
> application decide what to do in case of problem").
>
> Crashing is not an option for code pertaining to musl/libc layer.
>
> (:-} why bother to return an error, just crash for all
> problems in open, close, write, etc. just bringing the crashing
> concept to the extreme :-}).
>
>
>
> >
> > https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
> >
> > Specifically it covers why returning an error is not a good idea.
> My experience (for a long time now) about writing complex daemon
> running for months/year, it is not that straightforward (may
> be for a simple application it is)
>
> >
> > Rich
> >
>
>
> - --
>
> A bientôt
> ===========================================================
> Jean-Marc Pigeon E-Mail: jmp@safe.ca
> SAFE Inc. Phone: (514) 493-4280
> Clement, 'a kiss solution' to get rid of SPAM (at last)
> Clement' Home base <"http://www.clement.safe.ca">
> ===========================================================
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.14 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQIcBAEBAgAGBQJVOHQOAAoJEAtgwJ4bBQU5bCUP/jxt74112bqFat0+CvFpqNG8
> sCEYEIpRcKIB7wWRLwQXQTQ0M82oUzo6bmpYtgorcwJWh5NIo/dFm2FvDW+/uiTs
> dtutt45jJN7tsq4BhE/z/jyD44vrVqYZ+gJXf3MFPWeTFx4Kd7aCq3dCDvvoT9lk
> Mp9KiiSY+WXfwE/a8CAiw6D+Ma2/iN4zTD1fUekXcZDgu0iMsieeF5uhZNT/L+62
> U8K2VdOoR1H731v3GzKFUiesPTiPNbASg0MDfKyYe5kCZMkijbHIx7fGb/VID6dv
> 01u3bNA8yTfj51MplGv3ddgAzYcOMBMrQ6IXBK4hQPm4wuGp3ELr6IiTepUfkc3y
> WYlNHCFqpgMthj2+nsMllwK+uZcbW36+QlYQ4FCuAMaqOjpA+Yr6qmoiOl5HFSMA
> FoqarDkUeH3jYWYBFq8aUFQdD8Esyj3mKXAc/Bw45vWGkBykTyJXMYGVh4idukFQ
> dkiYTM5rOB3H55dkjB9EnojAX+w0+VG+0H5xtb8fybtQYEr7SbsRO5fhTbmTr/1P
> Q6ojm8fhkoRT0JXC+94htj5rT/87QerVIu06YcozqOsqCKBm7HBkokkMTMtWofJ4
> sznm9/OtNqbYYTSdDQ7AVn1x0K8ycz5Aw302NOWd5NuSEPal+GuWP2mewhydlXAj
> 2+W/ox353kVOSWf0t2Ql
> =R+I2
> -----END PGP SIGNATURE-----
>
>
[-- Attachment #2: Type: text/html, Size: 7836 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 5:08 ` Raphael Cohn
@ 2015-04-23 12:29 ` Jean-Marc Pigeon
0 siblings, 0 replies; 19+ messages in thread
From: Jean-Marc Pigeon @ 2015-04-23 12:29 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 7045 bytes --]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 04/23/2015 01:08 AM, Raphael Cohn wrote:
> Fail fast, fail early is always the right thing to do. I've seen
> far too much code in my career that tries to 'carry on' after
> encountering an unexpected condition. It's always the wrong thing
> to do, even for nuclear power plants. Once you're in UB land, the
> rest of your reasoning about your code is invalid. The best you can
> do is gracefully shutdown and capture as much diagnostic info as is
> feasible. Expecting musl to 'cover up' should be seen the same as
> in the real world - cover ups just make the problem worse.
this is UB, man setenv(name, NULL, override) is/was not defining if
it was valid or not.
Code must be resilient.
Either report an error and have the documentation updated accordingly
(get rid of the UB, have all old running code taking care of the new
standard) or code the 'best' decision implementing code (done by glibc
or uclibc so fare).
>
> Unexpected is not the same as expected but unpleasant to deal with
> eg failures to open(). And so hence the arguments about checked
> and unchecked exceptions and so on.
>
> Oh and auto restarting daemons is almost always wrong. For exactly
> the same reasoning - and frequently leads to data corruption.
Fully agreed with you and this was exactly my point, but reality say
otherwise, in the field sometime it is seen as the only option :-{{{.
>
> On 23 Apr 2015 05:25, "Jean-Marc Pigeon" <jmp@safe.ca
> <mailto:jmp@safe.ca>> wrote:
>
> On 04/22/2015 10:15 PM, Rich Felker wrote:
>> On Wed, Apr 22, 2015 at 09:26:57PM -0400, Jean-Marc Pigeon
>> wrote:
>>>> I think the only safe conclusion is that the application is
>>>> incorrect and should ensure that setenv() is never called
>>>> with a NULL value.
>>>>
>>> Checked glibc, My understanding, it set something as "name="
>>> in the environment, so the variable is present but value is
>>> "empty"i (top application to decide what to do). uclibc does
>>> something similar (as far I can tell looking at source code)..
>>>
>>>
>>> The application is not careful enough, but not incorrect as
>>> such.
>
>> It's definitely incorrect. It's doing something that invokes
>> undefined behavior.
>
>>> Note: we may have tons of applications with the same problem.
>>> if we keep musl setenv like that, musl will be seen as quite
>>> unreliable.
>
>> No, actually glibc is fixing this bug (maybe they already did).
>> See the thread beginning here:
>
>> https://sourceware.org/ml/libc-alpha/2015-03/threads.html#00449
>
>> My understanding is that glibc is planning to do, or already
>> does in the latest version, exactly what musl is doing.
>
>>> If this situation is indeed UB, there is 2 options for musl:
>>> 1) Swallow the problem nicely... as glibc and uclibc does. 2)
>>> Report an error.. EINVAL? (and document it in manual)
>>>
>>> Crashing at "libc" level is not an option.
>
>> I can see how it might seem like that at first, but crashing is
>> actually the best possible behavior. Options 1 and 2 cover up a
> I strongly disagree, crashing is not an option for a tools as
> musl/libc.
>
> Think about this, you write an application working perfectly
> right, but 1 in 1000000 you reach something not trapped by low
> level and once in while the application (in production for month)
> just stop to work because "unexpected" within musl... (so someone
> will propose to set a cron to automatically restart this unreliable
> daemon, hmmm...)
>
> Far better to return "trouble" status, then it is to the
> application to decide what must be done in context, as ignore,
> override, bypass, crash, etc.
>
> A sensible policy in case of UB would be for such low level code
> to swallow the problem, (protect the hardware and keep the program
> running as much as possible).
>
>> potentially serious bug -- it's not clear what the application
>> was trying to do, most likely nobody even thought about what they
>> were trying to do, and even if they did have something in mind
>> it's not reliable or portable. The glibc wiki has some text taken
>> from text I wrote on the topic (copied from a stack overflow
>> answer I gave) here:
>
> As reported, the crashing application is hwclock,
> (util-linux-2.26), this a kind of code in the field for a very
> very long time, so the library (glibc and old libc) used for linux
> over the years defined an expected behavior to this "UB". As other
> occurrence of this could be present too in other
> program/application, crashing would make a bench of applications to
> be running hazard (you can have all kind situation with env
> variables, near unpredictable).
>
> Something worry me in comments I have seen in the proposed URL,
> IMHO purpose of musl/glibc is not to "find bugs by crashing", its
> purpose is to be a code "clean, lean, reliable, predictable" (as
> said above, "Protect the hardware, report problem, lets the above
> layer application decide what to do in case of problem").
>
> Crashing is not an option for code pertaining to musl/libc layer.
>
> (:-} why bother to return an error, just crash for all problems in
> open, close, write, etc. just bringing the crashing concept to the
> extreme :-}).
>
>
>
>
>
> https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
>
>
>> Specifically it covers why returning an error is not a good
>> idea.
> My experience (for a long time now) about writing complex daemon
> running for months/year, it is not that straightforward (may be for
> a simple application it is)
>
>
>> Rich
>
>
>
>
- --
A bientôt
===========================================================
Jean-Marc Pigeon E-Mail: jmp@safe.ca
SAFE Inc. Phone: (514) 493-4280
Clement, 'a kiss solution' to get rid of SPAM (at last)
Clement' Home base <"http://www.clement.safe.ca">
===========================================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBAgAGBQJVOOW5AAoJEAtgwJ4bBQU52jEQAJXCLoQyl0s4E9uQaAOsmA8x
TEuDRsRI0wn90QaW5GdwbpUDwzPqp9d33FXkxpnQljoGX71sH17AmXfe9L7h32pG
RQnlpRQy5zdxXYx1tioUvLfwfMW+PsqG5SwyzmNOKvoBMS4sC7FHrwJtYh9YZYSB
hO7C6tdS8b7hJoRwzW1/4KPJRak7Z3dTD2tZiMk+4VWtzs19HQ6nwMMjAHBIw4l3
EbcLVjdjCcFVtbWC8Wr/nJQcDoAenUXrj5qfxxazPjb1AJGa91k4eGbSS7FDFiqA
JT4tdpgYL4/+RkwzbjJhUJHXBisYJ29KBpq/YV3Yj/BOlSjT0z2yfu/dUlU37x/m
dDP535oty1jR4l3t73yumyiXP41c8tdwK22QAaaK+4Mh+1SdkA+Eh9rsZsl7cij0
UrvLTrc4sNjD+zIvEhkdyhCZ/4SVrXpVCpxHOXO2IIB/1CDTu2a63OaNnpctrrK9
VHsGTgAZgmwGkszXUhK8cIsw4FQNog1U2CwcwJBJhk0UGkwQOG4HXtECvqD9YjhQ
jIY2tFzEeG7ObwMT0AjK0KKd/aVKoCyCfNLjO3Pucww54/8aTQaWZ/mOj4fwsP01
YTrbAs2L4VDRtnHQru4THYIjKMK8P0zU991XqPhTeMGhJuP5x13cHzAoUhsw3HVh
q1Ki9yFz/20JUbK/KqRt
=VNMJ
-----END PGP SIGNATURE-----
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4242 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 4:24 ` Jean-Marc Pigeon
2015-04-23 5:08 ` Raphael Cohn
@ 2015-04-23 9:23 ` Laurent Bercot
2015-04-23 9:52 ` Raphael Cohn
2015-04-23 10:10 ` Rich Felker
2 siblings, 1 reply; 19+ messages in thread
From: Laurent Bercot @ 2015-04-23 9:23 UTC (permalink / raw)
To: musl
On 23/04/2015 06:24, Jean-Marc Pigeon wrote:
> Think about this, you write an application working perfectly right,
> but 1 in 1000000 you reach something not trapped by low level and
> once in while the application (in production for month) just stop
> to work because "unexpected" within musl...
And why do you think the problem exists in the first place ?
Because other libcs were defensive and failed to fail early, so the
bug was never discovered until now. Your application is not working
perfectly right - it is buggy, and it *should* fail. musl is giving
developers a gift that other libcs do not: it helps them debug.
> (so someone will propose to set a cron to automatically restart this
> unreliable daemon, hmmm...)
You want to be defensive, well, yeah, this is the place to be
defensive. Until the bug is found and fixed, at least the daemon is
kind of providing service.
Raphael says this behaviour is wrong for the same reason that
silently failing is wrong, but I disagree. First, restarting crashing
daemons is not silent at all, a crash is always a loud warning and
can hardly be ignored; and second, restarting a process is not
continuing it. A process can always be restarted from a clean state
and work in a predictable way until it trips the bug again, whereas
silently ignoring UB makes the process unpredictable for the rest of
its lifetime.
> Far better to return "trouble" status, then it is to the application
> to decide what must be done in context, as ignore, override, bypass,
> crash, etc.
What "trouble" status do you return when a function dereferences a
NULL pointer ? This is exactly what's happening here. Passing NULL
to setenv is as incorrect as dereferencing NULL, and should result
in the same behaviour.
> A sensible policy in case of UB would be for such low level code to
> swallow the problem, (protect the hardware and keep the program
> running as much as possible).
The language you want is Javascript, not C.
> As reported, the crashing application is hwclock, (util-linux-2.26),
> this a kind of code in the field for a very very long time, so the
> library (glibc and old libc) used for linux over the years defined an
> expected behavior to this "UB".
And this is why musl is so much better. If glibc and uclibc devs
hadn't been so complacent, the bug wouldn't have lived for so long.
> Crashing is not an option for code pertaining to musl/libc layer.
It definitely is. You don't want your program to crash ? Don't
invoke UB.
If you want to be "safe", you can ignore SIGSEGV at the start of
all your applications - it will be the exact same thing as what you
are asking. Your daemons will live longer, I guarantee it.
> (:-} why bother to return an error, just crash for all
> problems in open, close, write, etc. just bringing the crashing
> concept to the extreme :-}).
Straw man. You know as well as we do the difference between a
programming error and a run-time error.
> My experience (for a long time now) about writing complex daemon
> running for months/year, it is not that straightforward (may
> be for a simple application it is)
And mine is that it is. We're evens, now please let's stop bringing
up anecdotal evidence.
--
Laurent
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 9:23 ` Laurent Bercot
@ 2015-04-23 9:52 ` Raphael Cohn
2015-04-23 10:47 ` Laurent Bercot
0 siblings, 1 reply; 19+ messages in thread
From: Raphael Cohn @ 2015-04-23 9:52 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 4441 bytes --]
On 23 April 2015 at 10:23, Laurent Bercot <ska-dietlibc@skarnet.org> wrote:
> On 23/04/2015 06:24, Jean-Marc Pigeon wrote:
>
>> Think about this, you write an application working perfectly right,
>> but 1 in 1000000 you reach something not trapped by low level and
>> once in while the application (in production for month) just stop
>> to work because "unexpected" within musl...
>>
>
> And why do you think the problem exists in the first place ?
> Because other libcs were defensive and failed to fail early, so the
> bug was never discovered until now. Your application is not working
> perfectly right - it is buggy, and it *should* fail. musl is giving
> developers a gift that other libcs do not: it helps them debug.
>
>
> (so someone will propose to set a cron to automatically restart this
>> unreliable daemon, hmmm...)
>>
>
> You want to be defensive, well, yeah, this is the place to be
> defensive. Until the bug is found and fixed, at least the daemon is
> kind of providing service.
>
> Raphael says this behaviour is wrong for the same reason that
> silently failing is wrong, but I disagree. First, restarting crashing
> daemons is not silent at all, a crash is always a loud warning and
> can hardly be ignored; and second, restarting a process is not
> continuing it. A process can always be restarted from a clean state
> and work in a predictable way until it trips the bug again, whereas
> silently ignoring UB makes the process unpredictable for the rest of
> its lifetime.
Yes and no. Crashes are loud and noisy, and should immediately trigger
alerts, but without intimate knowledge of the application and the cause of
the fault, auto-restarting is risky. In my operational experience, it's
usually been a hack employed by incompetent sysadmins (no names, no pack
drill, but one large government dept comes to mind). If you have knowledge
of your daemon processes, then you could if:-
- you know they are idempotent or do not have persistent state (eg DNS
caches)
- they're essential system services (definitions might vary, but I'd have
ssh for geographically remote boxes here)
That said, stuff that has complex state really shouldn't be restarted
without *investigation* - message brokers, relational database titans,
cluster HA set ups, etc. The worst outage of my career was a terracotta
cluster that had suffered from a split brain. Restarting it naively caused
it to _delete_ the only remaining good state. Is this your 'clean state'
caveat above?
>
>
> Far better to return "trouble" status, then it is to the application
>> to decide what must be done in context, as ignore, override, bypass,
>> crash, etc.
>>
>
> What "trouble" status do you return when a function dereferences a
> NULL pointer ? This is exactly what's happening here. Passing NULL
> to setenv is as incorrect as dereferencing NULL, and should result
> in the same behaviour.
>
>
> A sensible policy in case of UB would be for such low level code to
>> swallow the problem, (protect the hardware and keep the program
>> running as much as possible).
>>
>
> The language you want is Javascript, not C.
>
>
> As reported, the crashing application is hwclock, (util-linux-2.26),
>> this a kind of code in the field for a very very long time, so the
>> library (glibc and old libc) used for linux over the years defined an
>> expected behavior to this "UB".
>>
>
> And this is why musl is so much better. If glibc and uclibc devs
> hadn't been so complacent, the bug wouldn't have lived for so long.
>
>
> Crashing is not an option for code pertaining to musl/libc layer.
>>
>
> It definitely is. You don't want your program to crash ? Don't
> invoke UB.
> If you want to be "safe", you can ignore SIGSEGV at the start of
> all your applications - it will be the exact same thing as what you
> are asking. Your daemons will live longer, I guarantee it.
>
>
> (:-} why bother to return an error, just crash for all
>> problems in open, close, write, etc. just bringing the crashing
>> concept to the extreme :-}).
>>
>
> Straw man. You know as well as we do the difference between a
> programming error and a run-time error.
>
>
> My experience (for a long time now) about writing complex daemon
>> running for months/year, it is not that straightforward (may
>> be for a simple application it is)
>>
>
> And mine is that it is. We're evens, now please let's stop bringing
> up anecdotal evidence.
>
> --
> Laurent
>
>
[-- Attachment #2: Type: text/html, Size: 6449 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 9:52 ` Raphael Cohn
@ 2015-04-23 10:47 ` Laurent Bercot
0 siblings, 0 replies; 19+ messages in thread
From: Laurent Bercot @ 2015-04-23 10:47 UTC (permalink / raw)
To: musl
On 23/04/2015 11:52, Raphael Cohn wrote:
> The worst outage of my
> career was a terracotta cluster that had suffered from a split brain.
> Restarting it naively caused it to _delete_ the only remaining good
> state. Is this your 'clean state' caveat above?
Yes it is. I think scripts that start complex services should always
make sure to check the state first. Ideally, a run script (as we name
them in the process supervision world) should be idempotent in all cases.
I realize it's not always easy to achieve, but I believe it is the goal
to aim for.
--
Laurent
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 4:24 ` Jean-Marc Pigeon
2015-04-23 5:08 ` Raphael Cohn
2015-04-23 9:23 ` Laurent Bercot
@ 2015-04-23 10:10 ` Rich Felker
2015-04-23 12:58 ` Jean-Marc Pigeon
2 siblings, 1 reply; 19+ messages in thread
From: Rich Felker @ 2015-04-23 10:10 UTC (permalink / raw)
To: musl
On Thu, Apr 23, 2015 at 12:24:46AM -0400, Jean-Marc Pigeon wrote:
> >> If this situation is indeed UB, there is 2 options for musl: 1)
> >> Swallow the problem nicely... as glibc and uclibc does. 2) Report
> >> an error.. EINVAL? (and document it in manual)
> >>
> >> Crashing at "libc" level is not an option.
> >
> > I can see how it might seem like that at first, but crashing is
> > actually the best possible behavior. Options 1 and 2 cover up a
> I strongly disagree, crashing is not an option for a tools as
> musl/libc.
>
> Think about this, you write an application working perfectly right,
> but 1 in 1000000 you reach something not trapped by low level and
> once in while the application (in production for month) just stop
> to work because "unexpected" within musl...
But that's not what's going on here, so it's a strawman. There is a
big difference in being robust against transient failures and ignoring
programming errors that are going to happen every single time the
program is run.
> (so someone will propose to set a cron to automatically restart this
> unreliable daemon, hmmm...)
>
> Far better to return "trouble" status, then it is to the application
> to decide what must be done in context, as ignore, override, bypass,
> crash, etc.
If the application is sophisticated enough to check for errors in the
return value of setenv, it's also sophisticated enough to check that
the value it's passing is a valid pointer. As we've seen in the case
of hwclock, it does neither. And that was part of the point of the
text I linked: programs which invoke UB in ways like this, empirically
speaking, almost always completely return value checking.
> > potentially serious bug -- it's not clear what the application was
> > trying to do, most likely nobody even thought about what they were
> > trying to do, and even if they did have something in mind it's not
> > reliable or portable. The glibc wiki has some text taken from text
> > I wrote on the topic (copied from a stack overflow answer I gave)
> > here:
>
> As reported, the crashing application is hwclock, (util-linux-2.26),
> this a kind of code in the field for a very very long time,
And it either crashes every time it's run (for a given configuration,
at least) or doesn't. If it does you know during early testing rather
than letting a bug slip through.
> so the
> library (glibc and old libc) used for linux over the years defined an
> expected behavior to this "UB".
No, that was merely a bug in glibc, not a feature.
> Something worry me in comments I have seen in the proposed URL,
> IMHO purpose of musl/glibc is not to "find bugs by crashing", its
> purpose is to be a code "clean, lean, reliable, predictable" (as said
> above, "Protect the hardware, report problem, lets the above layer
> application decide what to do in case of problem").
Part of protecting the system is avoiding any forward progress when
the application is known to be in an invalid/corrupt state due to UB.
> Crashing is not an option for code pertaining to musl/libc layer.
Crashing is inevitable on the vast majority of invalid programs.
setenv("TZ", (char *)0xdeadbeef, 1); will almost certainly crash, and
if it doesn't it will likely do something worse.
> (:-} why bother to return an error, just crash for all
> problems in open, close, write, etc. just bringing the crashing
> concept to the extreme :-}).
An error returned by open or write is not a consequence of any failure
by the programmer -- writing code with UB or otherwise. It's a
legitimate condition that can happen at runtime due to many possible
transient or permanent conditions like resource exhaustion,
non-existence of the file, permissions, etc.
You may notice that on many systems open fails with EFAULT when given
an invalid pointer rather than crashing. This is not particularly a
good thing. Consider code something like the following:
int foo(const char *fn) {
char buf[2];
strcpy(buf, "hello world, and goodbye");
int fd = open(fn ? fn : buf, O_RDONLY);
return fd < 0 ? -1 : fd;
}
Here fn is likely to be invalid at the time open is called due to the
buffer overflow in buf[]. When open ignores this and the program
continues running, it happily jumps to the clobbered return address.
Rich
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 10:10 ` Rich Felker
@ 2015-04-23 12:58 ` Jean-Marc Pigeon
2015-04-23 13:22 ` Szabolcs Nagy
0 siblings, 1 reply; 19+ messages in thread
From: Jean-Marc Pigeon @ 2015-04-23 12:58 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 2852 bytes --]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello...
[..]
>>
>> As reported, the crashing application is hwclock,
>> (util-linux-2.26), this a kind of code in the field for a very
>> very long time,
>
> And it either crashes every time it's run (for a given
> configuration, at least) or doesn't. If it does you know during
> early testing rather than letting a bug slip through.
>
>> so the library (glibc and old libc) used for linux over the years
>> defined an expected behavior to this "UB".
>
> No, that was merely a bug in glibc, not a feature.
Hmmm... glibc-2.21, setenv.c explicitly check the value NULL
condition, so situation is checked, you could object about
the way program handle it, but it is not a bug (situation
expected and addressed).
> Crashing is inevitable on the vast majority of invalid programs.
> setenv("TZ", (char *)0xdeadbeef, 1); will almost certainly crash,
> and if it doesn't it will likely do something worse.
Fully agreed, this is pure garbage, out of spec. library
will/must crash (protect the hardware).
>
>> (:-} why bother to return an error, just crash for all problems
>> in open, close, write, etc. just bringing the crashing concept to
>> the extreme :-}).
>
>
> int foo(const char *fn) { char buf[2]; strcpy(buf, "hello world,
> and goodbye");
In perfect world, strcpy should crash on the spot not going further.
this is clearly "out of spec" (a far fetch to be UB).
> int fd = open(fn ? fn : buf, O_RDONLY); return fd < 0 ? -1 : fd; }
- --
A bientôt
===========================================================
Jean-Marc Pigeon E-Mail: jmp@safe.ca
SAFE Inc. Phone: (514) 493-4280
Clement, 'a kiss solution' to get rid of SPAM (at last)
Clement' Home base <"http://www.clement.safe.ca">
===========================================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBAgAGBQJVOOyKAAoJEAtgwJ4bBQU5E4YP/1G+J2vij9VF34tfZycG3iHO
Gs6y2WYsvyVZK66WsnDOOoaBr5h6bJv8pga++6hu/DVPpbXfZpY1ysNKQ63hxP02
1fFA8h1UjDJ4BM/aXHGothjp2kIXVn5CzJKuf2KUrFiR59t9WJu4KDR6/MsbRXTx
MsnOS25x+UCo4yQExemTuANAofdrst5mDyXovBdEaEGuGiTZuZri9NvZ8BZGEd4Y
v8cX/+UyY3X7UrB1+AAXbMEDs+miibXIGBI6EPPme5sDeVTm3V4kmqgdPu3Q+D00
Mgq6D+uj7Y3Wg8yHfzlVw/wZBYh1KGJNjOXJhKjmBvquCa3SQADAWBPc109J1Iip
v6efkBy3b36uWT5xoRtX4Db8+jdUUiVLLj4zRBytgkLokK8imvLlIGa1oBIUYGN8
RzC7Ma0hgHl7Tqxa7awvOfoVgqTWxC6saLmzJ+N8hFy5yg6YJYr+yvLY8fLkRPQM
BdNu/YzwFb1gIKDyq+lPtCjvMZVHpfcZMYorcV22N2zyaW7UTqU7BqRLR2mY/ojh
Nf8mUZNEsfeSNNHANZa5gixXFfDIQ/8zkZbIbY3XItESjdJft0wBmLBk8t0IJ6nO
zexIcNM3NozIZotsL+L47cHoedrYSLp/v2HK5kJ/V638qRGKhxkz9IBbe1UMwOfV
0kG3h7pAHZ5iBchsiQ8G
=yBbf
-----END PGP SIGNATURE-----
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4242 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 12:58 ` Jean-Marc Pigeon
@ 2015-04-23 13:22 ` Szabolcs Nagy
2015-04-23 13:58 ` Jean-Marc Pigeon
0 siblings, 1 reply; 19+ messages in thread
From: Szabolcs Nagy @ 2015-04-23 13:22 UTC (permalink / raw)
To: musl
* Jean-Marc Pigeon <jmp@safe.ca> [2015-04-23 08:58:50 -0400]:
> >
> >> so the library (glibc and old libc) used for linux over the years
> >> defined an expected behavior to this "UB".
> >
> > No, that was merely a bug in glibc, not a feature.
>
> Hmmm... glibc-2.21, setenv.c explicitly check the value NULL
> condition, so situation is checked, you could object about
> the way program handle it, but it is not a bug (situation
> expected and addressed).
>
you are wrong, glibc actually corrupted then environ on NULL
argument and this was fixed recently
http://sourceware.org/git/?p=glibc.git;a=commit;h=03c1e456b079929a8290aeb4aadb05c0df73bfd2
stop mixing runtime failure with ub, that leads to nonsense
discussions.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 13:22 ` Szabolcs Nagy
@ 2015-04-23 13:58 ` Jean-Marc Pigeon
2015-04-23 14:26 ` Szabolcs Nagy
0 siblings, 1 reply; 19+ messages in thread
From: Jean-Marc Pigeon @ 2015-04-23 13:58 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 1910 bytes --]
On 04/23/2015 09:22 AM, Szabolcs Nagy wrote:
> * Jean-Marc Pigeon <jmp@safe.ca> [2015-04-23 08:58:50 -0400]:
>>>
>>>> so the library (glibc and old libc) used for linux over the
>>>> years defined an expected behavior to this "UB".
>>>
>>> No, that was merely a bug in glibc, not a feature.
>>
>> Hmmm... glibc-2.21, setenv.c explicitly check the value NULL
>> condition, so situation is checked, you could object about the
>> way program handle it, but it is not a bug (situation expected
>> and addressed).
>>
>
> you are wrong, glibc actually corrupted then environ on NULL
> argument and this was fixed recently
>
> http://sourceware.org/git/?p=glibc.git;a=commit;h=03c1e456b079929a8290aeb4aadb05c0df73bfd2
>
> stop mixing runtime failure with ub, that leads to nonsense
> discussions.
>
glibc-2.12-1.132.el6_5.2.x86_64
I do not confirm env corruption, try:
#include <stdio.h>
main()
{
#define ENVNAME "BIGRE"
(void) setenv(ENVNAME,getenv("MISSING_FROM_ENV"),1);
(void) fprintf(stdout,"1 env %s=<%s>\n",ENVNAME,getenv(ENVNAME));
(void) setenv(ENVNAME,"",1);
(void) fprintf(stdout,"2 env %s=<%s>\n",ENVNAME,getenv(ENVNAME));
(void) setenv(ENVNAME,"something",1);
(void) fprintf(stdout,"3 env %s=<%s>\n",ENVNAME,getenv(ENVNAME));
}
What I am trying to explain, you have something not
defined in spec, was resolved in a way (you may not like, I agree)
for EONs.
Now "you" decide, to resolve UB another way, you may
have tons of applications in jeopardy.
--
A bientôt
===========================================================
Jean-Marc Pigeon E-Mail: jmp@safe.ca
SAFE Inc. Phone: (514) 493-4280
Clement, 'a kiss solution' to get rid of SPAM (at last)
Clement' Home base <"http://www.clement.safe.ca">
===========================================================
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4242 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: setenv if value=NULL, what say standard? Bug?
2015-04-23 13:58 ` Jean-Marc Pigeon
@ 2015-04-23 14:26 ` Szabolcs Nagy
0 siblings, 0 replies; 19+ messages in thread
From: Szabolcs Nagy @ 2015-04-23 14:26 UTC (permalink / raw)
To: musl
* Jean-Marc Pigeon <jmp@safe.ca> [2015-04-23 09:58:28 -0400]:
> On 04/23/2015 09:22 AM, Szabolcs Nagy wrote:
> >
> > you are wrong, glibc actually corrupted then environ on NULL
> > argument and this was fixed recently
> >
> > http://sourceware.org/git/?p=glibc.git;a=commit;h=03c1e456b079929a8290aeb4aadb05c0df73bfd2
> >
> > stop mixing runtime failure with ub, that leads to nonsense
> > discussions.
> >
> glibc-2.12-1.132.el6_5.2.x86_64
> I do not confirm env corruption, try:
>
==6225== Invalid read of size 1
==6225== at 0x4C2BFC2: __GI_strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6225== by 0x4EA1F6B: puts (ioputs.c:37)
==6225== by 0x40059D: main (in /data/tmp/env/a.out)
>
> What I am trying to explain, you have something not
> defined in spec, was resolved in a way (you may not like, I agree)
> for EONs.
>
never worked
bsd always crashed
glibc used to generate invalid memory access which can lead to
arbitrarily bad behaviour, now it's fixed to crash
musl always crashed
> Now "you" decide, to resolve UB another way, you may
> have tons of applications in jeopardy.
>
^ permalink raw reply [flat|nested] 19+ messages in thread