From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 20262 invoked from network); 6 Oct 2021 12:29:54 -0000 Received: from mother.openwall.net (195.42.179.200) by inbox.vuxu.org with ESMTPUTF8; 6 Oct 2021 12:29:54 -0000 Received: (qmail 22039 invoked by uid 550); 6 Oct 2021 12:29:52 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 22018 invoked from network); 6 Oct 2021 12:29:51 -0000 Date: Wed, 6 Oct 2021 08:29:37 -0400 From: Rich Felker To: Pablo Correa Gomez Cc: musl@lists.openwall.com Message-ID: <20211006122936.GD2559@brightrain.aerifal.cx> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] newlocale: Segmentation fault when locale input is NULL On Wed, Oct 06, 2021 at 11:31:29AM +0200, Pablo Correa Gomez wrote: > Dear musl maintainers, > > While doing some work in GNOME control center for postmarketos, we > bumped into a segmentation fault which is also present in GNOME in > Alpine[1]. > > After doing some degugging, I figured out that the reason is that, > through GNOME desktop[2], there is a call to newlocale, where they end > up calling it with a NULL argument. > > newlocale(LC_CTYPE, NULL, (locale_t)0); > > In this case, "name" is passed to __get_locale in > src/locale/newlocale.c:27 and then dereferenced in > src/locale/locale_map.c:43, causing a segmentation fault. > > In the case of glibc, this is not an issue, as per the documentation[3] > they consider it an error: > > EINVAL locale is NULL. > > Unfortunately, this is a difference in the implementation between glibc > and musl, maybe due to the fact that the standard[4] in not clear in > this point: > > > The newlocale() function may fail if: > > [EINVAL] > The locale argument is not a valid string pointer. This is specifically documented as a "may fail", not a "shall fail", i.e. it's not guaranteed to happen. It comes from POSIX, and is an instance of a weird pattern the committee tried to fix (but missed some places it seems) where they wrote "may fail"s for conditions that already have undefined behavior (here, use of an invalid pointer) in which case EINVAL would already be allowed as a side effect of the UB without any further specification. (The same thing created a lot of confusion in the past about use of pthread_t values past the end of their lifetime.) > My personal believe is that adding a NULL pointer check in musl is very > simple and might help not only GNOME desktop, but maybe also other > projects in the future. This is the reason why I brought the issue here > first instead of directly patching GNOME desktop. If you believe that > musl behaviour should remain the way it is, please let me know and I > will send MRs for upstream and Alpine's GNOME desktop. I am not > subscribed to the mailing list, so I would appreciate if I am CC'ed in > any response. The musl behavior should remain the way it is. My text on the rationale actually made it into the glibc wiki some years back: https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers "The GNU C library considers it a QoI feature not to mask user bugs by detecting invalid pointers and returning EINVAL (unless the API is standardized and says it does that). If passing a bad pointer has undefined behavior, it is far more useful in the long run if it crashes quickly rather than diagnosing an error that is probably ignored by the flaky caller. If you're going to check for NULL pointer arguments where you have not entered into a contract to accept and interpret them, do so with an assert, not a conditional error return. This way the bugs in the caller will be immediately detected and can be fixed, and it makes it easy to disable the overhead in production builds. The assert can be valuable as code documentation. However, a segfault from dereferencing the NULL pointer is just as effective for debugging. If you return an error code to a caller which has already proven itself buggy, the most likely result is that the caller will ignore the error, and bad things will happen much later down the line when the original cause of the error has become difficult or impossible to track down. Why is it reasonable to assume the caller will ignore the error you return? Because the caller already ignored the error return of malloc or fopen or some other library-specific allocation function which returned NULL to indicate an error." In particular, here it seems to have found a bug -- what could the application have possibly meant by passing a null pointer there? Did it actually intend the behavior of ""? Or of "C"? Or if the intent was to have this mean "don't use a context-local locale", why pass the pointer to newlocale and process the error (which could include a number of other errors you'd certainly want to treat differently) rather than checking for null and taking a different code path before calling newlocale? Rich