From: Konstantin Isakov
Date: Mon, 24 May 2021 00:39:35 -0400
To: musl@lists.openwall.com
Subject: [musl] [BUG] swprintf() doesn't handle Unicode characters correctly

Hi,

The following program:

===================================
#include <stdio.h>
#include <wchar.h>

int main()
{
  wchar_t buf[ 32 ];

  swprintf( buf, sizeof( buf ) / sizeof( *buf ), L"ab\u00E1c" );

  for ( wchar_t * p = buf; *p; ++p )
    printf( "%u\n", ( unsigned ) *p );

  return 0;
}
===================================

With musl 1.2.2, this program produces the following output:
97
98

The expected output is:
97
98
225
99

With musl, only the first two characters ('a' and 'b') are processed, and the string ends at the non-ASCII character (U+00E1, an 'a' with acute accent) instead of including it and the last character, 'c'.

Please CC me when replying. Thanks!