mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] newlocale: Segmentation fault when locale input is NULL
@ 2021-10-06  9:31 Pablo Correa Gomez
  2021-10-06 12:29 ` Rich Felker
  2021-10-06 12:31 ` Quentin Rameau
  0 siblings, 2 replies; 4+ messages in thread
From: Pablo Correa Gomez @ 2021-10-06  9:31 UTC (permalink / raw)
  To: musl

Dear musl maintainers,

While doing some work in GNOME control center for postmarketos, we
bumped into a segmentation fault which is also present in GNOME in
Alpine[1].

After doing some degugging, I figured out that the reason is that,
through GNOME desktop[2], there is a call to newlocale, where they end
up calling it with a NULL argument.

newlocale(LC_CTYPE, NULL, (locale_t)0);

In this case, "name" is passed to __get_locale in
src/locale/newlocale.c:27 and then dereferenced in
src/locale/locale_map.c:43, causing a segmentation fault.

In the case of glibc, this is not an issue, as per the documentation[3]
they consider it an error:

       EINVAL locale is NULL.

Unfortunately, this is a difference in the implementation between glibc
and musl, maybe due to the fact that the standard[4] in not clear in
this point:


The newlocale() function may fail if:

[EINVAL]
    The locale argument is not a valid string pointer. 


My personal believe is that adding a NULL pointer check in musl is very
simple and might help not only GNOME desktop, but maybe also other
projects in the future. This is the reason why I brought the issue here
first instead of directly patching GNOME desktop. If you believe that
musl behaviour should remain the way it is, please let me know and I
will send MRs for upstream and Alpine's GNOME desktop. I am not
subscribed to the mailing list, so I would appreciate if I am CC'ed in
any response.

Best regards,
Pablo Correa Gómez.


[1] 
https://gitlab.com/postmarketOS/pmaports/-/merge_requests/2552#note_686956660
[2] https://gitlab.gnome.org/GNOME/gnome-desktop
[3] https://man7.org/linux/man-pages/man3/newlocale.3.html
[4] 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/newlocale.html


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] newlocale: Segmentation fault when locale input is NULL
  2021-10-06  9:31 [musl] newlocale: Segmentation fault when locale input is NULL Pablo Correa Gomez
@ 2021-10-06 12:29 ` Rich Felker
  2021-10-06 12:57   ` Pablo Correa Gomez
  2021-10-06 12:31 ` Quentin Rameau
  1 sibling, 1 reply; 4+ messages in thread
From: Rich Felker @ 2021-10-06 12:29 UTC (permalink / raw)
  To: Pablo Correa Gomez; +Cc: musl

On Wed, Oct 06, 2021 at 11:31:29AM +0200, Pablo Correa Gomez wrote:
> Dear musl maintainers,
> 
> While doing some work in GNOME control center for postmarketos, we
> bumped into a segmentation fault which is also present in GNOME in
> Alpine[1].
> 
> After doing some degugging, I figured out that the reason is that,
> through GNOME desktop[2], there is a call to newlocale, where they end
> up calling it with a NULL argument.
> 
> newlocale(LC_CTYPE, NULL, (locale_t)0);
> 
> In this case, "name" is passed to __get_locale in
> src/locale/newlocale.c:27 and then dereferenced in
> src/locale/locale_map.c:43, causing a segmentation fault.
> 
> In the case of glibc, this is not an issue, as per the documentation[3]
> they consider it an error:
> 
>        EINVAL locale is NULL.
> 
> Unfortunately, this is a difference in the implementation between glibc
> and musl, maybe due to the fact that the standard[4] in not clear in
> this point:
> 
> 
> The newlocale() function may fail if:
> 
> [EINVAL]
>     The locale argument is not a valid string pointer. 

This is specifically documented as a "may fail", not a "shall fail",
i.e. it's not guaranteed to happen. It comes from POSIX, and is an
instance of a weird pattern the committee tried to fix (but missed
some places it seems) where they wrote "may fail"s for conditions that
already have undefined behavior (here, use of an invalid pointer) in
which case EINVAL would already be allowed as a side effect of the UB
without any further specification. (The same thing created a lot of
confusion in the past about use of pthread_t values past the end of
their lifetime.)

> My personal believe is that adding a NULL pointer check in musl is very
> simple and might help not only GNOME desktop, but maybe also other
> projects in the future. This is the reason why I brought the issue here
> first instead of directly patching GNOME desktop. If you believe that
> musl behaviour should remain the way it is, please let me know and I
> will send MRs for upstream and Alpine's GNOME desktop. I am not
> subscribed to the mailing list, so I would appreciate if I am CC'ed in
> any response.

The musl behavior should remain the way it is. My text on the
rationale actually made it into the glibc wiki some years back:

https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers

    "The GNU C library considers it a QoI feature not to mask user
    bugs by detecting invalid pointers and returning EINVAL (unless
    the API is standardized and says it does that). If passing a bad
    pointer has undefined behavior, it is far more useful in the long
    run if it crashes quickly rather than diagnosing an error that is
    probably ignored by the flaky caller.

    If you're going to check for NULL pointer arguments where you have
    not entered into a contract to accept and interpret them, do so
    with an assert, not a conditional error return. This way the bugs
    in the caller will be immediately detected and can be fixed, and
    it makes it easy to disable the overhead in production builds. The
    assert can be valuable as code documentation. However, a segfault
    from dereferencing the NULL pointer is just as effective for
    debugging. If you return an error code to a caller which has
    already proven itself buggy, the most likely result is that the
    caller will ignore the error, and bad things will happen much
    later down the line when the original cause of the error has
    become difficult or impossible to track down. Why is it reasonable
    to assume the caller will ignore the error you return? Because the
    caller already ignored the error return of malloc or fopen or some
    other library-specific allocation function which returned NULL to
    indicate an error."

In particular, here it seems to have found a bug -- what could the
application have possibly meant by passing a null pointer there? Did
it actually intend the behavior of ""? Or of "C"? Or if the intent was
to have this mean "don't use a context-local locale", why pass the
pointer to newlocale and process the error (which could include a
number of other errors you'd certainly want to treat differently)
rather than checking for null and taking a different code path before
calling newlocale?

Rich

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] newlocale: Segmentation fault when locale input is NULL
  2021-10-06  9:31 [musl] newlocale: Segmentation fault when locale input is NULL Pablo Correa Gomez
  2021-10-06 12:29 ` Rich Felker
@ 2021-10-06 12:31 ` Quentin Rameau
  1 sibling, 0 replies; 4+ messages in thread
From: Quentin Rameau @ 2021-10-06 12:31 UTC (permalink / raw)
  To: Pablo Correa Gomez; +Cc: musl

On Wed, 06 Oct 2021 11:31:29 +0200
Pablo Correa Gomez <ablocorrea@hotmail.com> wrote:

> Dear musl maintainers,

Hi Pablo,

> newlocale(LC_CTYPE, NULL, (locale_t)0);
> 
> In this case, "name" is passed to __get_locale in
> src/locale/newlocale.c:27 and then dereferenced in
> src/locale/locale_map.c:43, causing a segmentation fault.
> 
> In the case of glibc, this is not an issue, as per the documentation[3]
> they consider it an error:
> 
>        EINVAL locale is NULL.
> 
> Unfortunately, this is a difference in the implementation between glibc
> and musl, maybe due to the fact that the standard[4] in not clear in
> this point:
> 
> 
> The newlocale() function may fail if:
> 
> [EINVAL]
>     The locale argument is not a valid string pointer. 
> 
> 
> My personal believe is that adding a NULL pointer check in musl is very
> simple and might help not only GNOME desktop, but maybe also other
> projects in the future.

I think the opposite, the standard is quite clear to the developper who
intends to write portable application, newlocale() may fail with an
invalid locale argument pointer. So check your pointer before calling
that function.

While changing musl behaviour would “fix” the GNOME issue, this will not
help GNOME developpers, or other projects, writing better portable code.

Just my two cents, not answering if musl behaviour should be changed or
not.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [musl] newlocale: Segmentation fault when locale input is NULL
  2021-10-06 12:29 ` Rich Felker
@ 2021-10-06 12:57   ` Pablo Correa Gomez
  0 siblings, 0 replies; 4+ messages in thread
From: Pablo Correa Gomez @ 2021-10-06 12:57 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Thank you very much to both for the detailed explanation. I really
appreciate you took the time to explain why musl behaviour should
remain the way it is. I will proceed and fix the bug in GNOME.

> In particular, here it seems to have found a bug -- what could the
> application have possibly meant by passing a null pointer there? Did
> it actually intend the behavior of ""? Or of "C"? Or if the intent
> was
> to have this mean "don't use a context-local locale", why pass the
> pointer to newlocale and process the error (which could include a
> number of other errors you'd certainly want to treat differently)
> rather than checking for null and taking a different code path before
> calling newlocale?

Very valid point. Definitely something worth to ask the maintainers.

Best and thank you again,
Pablo.

On Wed, 2021-10-06 at 08:29 -0400, Rich Felker wrote:
> On Wed, Oct 06, 2021 at 11:31:29AM +0200, Pablo Correa Gomez wrote:
> > Dear musl maintainers,
> > 
> > While doing some work in GNOME control center for postmarketos, we
> > bumped into a segmentation fault which is also present in GNOME in
> > Alpine[1].
> > 
> > After doing some degugging, I figured out that the reason is that,
> > through GNOME desktop[2], there is a call to newlocale, where they
> > end
> > up calling it with a NULL argument.
> > 
> > newlocale(LC_CTYPE, NULL, (locale_t)0);
> > 
> > In this case, "name" is passed to __get_locale in
> > src/locale/newlocale.c:27 and then dereferenced in
> > src/locale/locale_map.c:43, causing a segmentation fault.
> > 
> > In the case of glibc, this is not an issue, as per the
> > documentation[3]
> > they consider it an error:
> > 
> >        EINVAL locale is NULL.
> > 
> > Unfortunately, this is a difference in the implementation between
> > glibc
> > and musl, maybe due to the fact that the standard[4] in not clear
> > in
> > this point:
> > 
> > 
> > The newlocale() function may fail if:
> > 
> > [EINVAL]
> >     The locale argument is not a valid string pointer. 
> 
> This is specifically documented as a "may fail", not a "shall fail",
> i.e. it's not guaranteed to happen. It comes from POSIX, and is an
> instance of a weird pattern the committee tried to fix (but missed
> some places it seems) where they wrote "may fail"s for conditions
> that
> already have undefined behavior (here, use of an invalid pointer) in
> which case EINVAL would already be allowed as a side effect of the UB
> without any further specification. (The same thing created a lot of
> confusion in the past about use of pthread_t values past the end of
> their lifetime.)
> 
> > My personal believe is that adding a NULL pointer check in musl is
> > very
> > simple and might help not only GNOME desktop, but maybe also other
> > projects in the future. This is the reason why I brought the issue
> > here
> > first instead of directly patching GNOME desktop. If you believe
> > that
> > musl behaviour should remain the way it is, please let me know and
> > I
> > will send MRs for upstream and Alpine's GNOME desktop. I am not
> > subscribed to the mailing list, so I would appreciate if I am CC'ed
> > in
> > any response.
> 
> The musl behavior should remain the way it is. My text on the
> rationale actually made it into the glibc wiki some years back:
> 
> https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
> 
>     "The GNU C library considers it a QoI feature not to mask user
>     bugs by detecting invalid pointers and returning EINVAL (unless
>     the API is standardized and says it does that). If passing a bad
>     pointer has undefined behavior, it is far more useful in the long
>     run if it crashes quickly rather than diagnosing an error that is
>     probably ignored by the flaky caller.
> 
>     If you're going to check for NULL pointer arguments where you
> have
>     not entered into a contract to accept and interpret them, do so
>     with an assert, not a conditional error return. This way the bugs
>     in the caller will be immediately detected and can be fixed, and
>     it makes it easy to disable the overhead in production builds.
> The
>     assert can be valuable as code documentation. However, a segfault
>     from dereferencing the NULL pointer is just as effective for
>     debugging. If you return an error code to a caller which has
>     already proven itself buggy, the most likely result is that the
>     caller will ignore the error, and bad things will happen much
>     later down the line when the original cause of the error has
>     become difficult or impossible to track down. Why is it
> reasonable
>     to assume the caller will ignore the error you return? Because
> the
>     caller already ignored the error return of malloc or fopen or
> some
>     other library-specific allocation function which returned NULL to
>     indicate an error."
> 
> In particular, here it seems to have found a bug -- what could the
> application have possibly meant by passing a null pointer there? Did
> it actually intend the behavior of ""? Or of "C"? Or if the intent
> was
> to have this mean "don't use a context-local locale", why pass the
> pointer to newlocale and process the error (which could include a
> number of other errors you'd certainly want to treat differently)
> rather than checking for null and taking a different code path before
> calling newlocale?
> 
> Rich


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-10-06 14:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-06  9:31 [musl] newlocale: Segmentation fault when locale input is NULL Pablo Correa Gomez
2021-10-06 12:29 ` Rich Felker
2021-10-06 12:57   ` Pablo Correa Gomez
2021-10-06 12:31 ` Quentin Rameau

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).