From: Petr Pisar
To: musl@lists.openwall.com
Date: Sat, 11 May 2024 19:58:09 +0200
Subject: [musl] nl_langinfo(CODESET) does not match locale

When debugging test failures in libisds on Gentoo with musl, I found that
nl_langinfo(CODESET) does not match the current locale.

A reproducer:

    #include <langinfo.h>
    #include <locale.h>
    #include <stdio.h>

    int main(void) {
        /* Request a non-UTF-8 locale. */
        char *old_locale = setlocale(LC_ALL, "cs_CZ.ISO8859-2");
        if (old_locale == NULL) {
            perror("setlocale() set failed");
            return 1;
        }
        /* Query the locale that is now in effect. */
        old_locale = setlocale(LC_ALL, NULL);
        if (old_locale == NULL) {
            perror("setlocale() query failed");
            return 1;
        }
        printf("Current LC_ALL=%s\n", old_locale);
        printf("CODESET=%s\n", nl_langinfo(CODESET));
        return 0;
    }

    # gcc test.c && ./a.out
    Current LC_ALL=cs_CZ.ISO8859-2
    CODESET=UTF-8

While on glibc:

    $ gcc test.c && ./a.out
    Current LC_ALL=cs_CZ.ISO8859-2
    CODESET=ISO-8859-2

I can see that for the cs_CZ.UTF8 locale, nl_langinfo() correctly reports
UTF-8, and for the C locale it reports ASCII. However, for any other
character set it always returns UTF-8.

I found a notice that musl does not implement non-UTF-8 locales. If that is
true, then setlocale() for "cs_CZ.ISO8859-2" should fail instead of
accepting the locale.

I observe this behavior with musl-1.2.5.

-- 
Petr
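
As a possible application-side check for the mismatch described above, a program could compare the codeset it requested with what nl_langinfo(CODESET) reports. The following is only a minimal sketch. It assumes that ignoring case and punctuation is enough to equate spellings such as ISO8859-2 and ISO-8859-2. codeset_matches() and next_alnum() are hypothetical helpers introduced for illustration; they are not part of musl, glibc, or libisds.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the next alphanumeric character at or after *p, upper-cased, and
 * advance *p past it. Return 0 once `end` or the terminating NUL is reached. */
static int next_alnum(const char **p, const char *end)
{
    while (*p != end && **p != '\0') {
        unsigned char c = (unsigned char)**p;
        (*p)++;
        if (isalnum(c))
            return toupper(c);
    }
    return 0;
}

/* Hypothetical helper: compare the codeset part of a locale name
 * ("lang_TERRITORY.CODESET@modifier") with a codeset string such as the
 * one returned by nl_langinfo(CODESET).
 * Returns 1 on match, 0 on mismatch, -1 if the locale name has no codeset. */
int codeset_matches(const char *locale_name, const char *actual)
{
    const char *dot = strchr(locale_name, '.');
    if (dot == NULL)
        return -1;

    const char *want = dot + 1;
    const char *want_end = want + strcspn(want, "@"); /* drop any @modifier */
    const char *have = actual;
    const char *have_end = actual + strlen(actual);

    /* Walk both strings in lockstep, skipping '-', '_' and similar. */
    for (;;) {
        int a = next_alnum(&want, want_end);
        int b = next_alnum(&have, have_end);
        if (a != b)
            return 0;
        if (a == 0)
            return 1;
    }
}
```

After a successful setlocale(LC_ALL, "cs_CZ.ISO8859-2"), a caller could evaluate codeset_matches("cs_CZ.ISO8859-2", nl_langinfo(CODESET)) and treat a result of 0 as a sign that the requested character set was not actually honored.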