From: Érico Nogueira
To: "Szabolcs Nagy"
Cc: "Alexander Vitiuk"
Date: Sun, 01 Nov 2020 17:48:43 -0300
In-Reply-To: <20201101204002.GA1370092@port70.net>
Subject: Re: [musl] swprintf possible bug

On Sun Nov 1, 2020 at 6:40 PM -03, Szabolcs Nagy wrote:
> * Érico Nogueira [2020-11-01 17:17:49 -0300]:
> > On Sun Nov 1, 2020 at 6:06 PM -03, Alexander Vitiuk wrote:
> > > It seems, wsprintf() / wprintf() are not working in musl as expected, if
> > > uses with cyrillic:
> > >
> > > C testcase:
> > > #include <wchar.h>
> > > int main() {
> > >     wprintf(L"[hello]\n");
> > >     wprintf(L"[Привет]\n");
> > >     return 0;
> > > }
> > > on x86_64-linux-gnu prints:
> > > [hello]
> > > [Privet]
> > > and on x86_64-linux-musl prints:
> > > [hello]
> > > [
> > >
> > > There are other cases described:
> > > https://github.com/emscripten-core/emscripten/issues/11947
> >
> > For what it's worth, if this is a bug, it would seem to be in how musl
> > decides when to print characters (not the formatting functions
> > themselves), since the below program doesn't print anything:
> >
> > #include <stdio.h>
> > #include <wchar.h>
> >
> > int main() {
> >     fputws(L"[Привет Василий]\n", stdout);
> >     // I don't know if I'm accessing a wchar_t appropriately here
> >     fputwc(L"[Привет Василий]\n"[3], stdout);
> >     return 0;
> > }
> >
> > I tried tracing the execution from fputws, and not printing anything
> > seems to be caused by the return value of wcsrtombs().
>
> these functions return an error code..
>
> in this case they must return -1 and set errno to EILSEQ,
> since the selected multibyte encoding (LC_CTYPE=C) cannot
> represent the printed wide characters.
>
> i think the musl behaviour is correct, you can try adding
> setlocale(LC_CTYPE,"") at the start of main to make it work.

Thanks, that did fix it. For reference:

#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int main() {
    setlocale(LC_CTYPE, "");
    fputws(L"[Привет Василий]\n", stdout);
    fputwc(L"[Привет Василий]\n"[3], stdout);
    return 0;
}

I wonder what glibc's behavior is that it allows this, and how the
emscripten folks can work around the musl behavior as well. Which
environment variables could I set to control this, or is that not
possible?

Thanks,
Érico