mailing list of musl libc
From: Mikhail Kremnyov <mkremnyov@gmail.com>
To: musl@lists.openwall.com
Subject: Issues in mbsnrtowcs and wcsnrtombs
Date: Tue, 18 Jul 2017 23:05:29 +0300
Message-ID: <c23a73f5-34b4-99e8-786f-622ae42d41e8@gmail.com>

Hi,

It looks like there are some bugs in the implementations of mbsnrtowcs
and wcsnrtombs.
E.g. inside mbsnrtowcs there is this code:

    while ( s && wn && ( (n2=n/4)>=wn || n2>32 ) ) {
        if (n2>=wn) n2=wn;
        n -= n2;
        l = mbsrtowcs(ws, &s, n2, st);

Here "n" is the number of source bytes to convert and "n2" is the number
of wide chars that may be written to the destination, so it's incorrect to
subtract one from the other. And indeed a simple test shows that the
function doesn't work correctly if a long enough non-ASCII string is
passed to it. E.g.:

    const std::string origStr =
        u8"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ";
    const std::string srcStr = origStr + u8"їґіє";

    std::mbstate_t st = {};
    const char* srcPtr = &srcStr[0];
    std::wstring dest(srcStr.length() + 1, wchar_t(0));

    auto res = mbsnrtowcs(&dest[0], &srcPtr, origStr.length(),
                          dest.length(), &st);

    std::cout << "res = " << res << ", srcPtr = " << (void*)srcPtr
              << std::endl;

And the output is:
    res = 70, srcPtr = 0

Here mbsnrtowcs was told to convert only origStr.length() bytes, which
contain 66 2-byte characters, but it converted 70, stopping only when it
reached the terminating null character.
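
To make the arithmetic concrete, here is a minimal sketch of the byte
budget after one pass of the loop quoted above. It is not musl code: the
names ChunkAccounting and account_one_chunk are hypothetical, and it
assumes every character takes the same number of bytes and that the chunk
converts all n2 wide chars without hitting a null.

```cpp
#include <cstddef>

// Hypothetical illustration, not musl code: models the byte budget "n"
// after one pass of the quoted loop, for input where every character
// occupies bytes_per_char bytes (assumed to be at most 4).
struct ChunkAccounting {
    std::size_t budget_after_buggy;   // n after "n -= n2" (wide chars subtracted from bytes)
    std::size_t budget_after_correct; // n after subtracting the bytes actually consumed
};

ChunkAccounting account_one_chunk(std::size_t n, std::size_t wn,
                                  std::size_t bytes_per_char)
{
    std::size_t n2 = n / 4;                      // chunk size in wide chars
    if (n2 >= wn) n2 = wn;                       // clamp to destination space
    std::size_t consumed = n2 * bytes_per_char;  // bytes the chunk really reads
    return { n - n2, n - consumed };
}
```

For the test above (n = 132, wn = 141, 2 bytes per character) the chunk is
33 wide chars and reads 66 bytes, so 66 bytes should remain in the budget,
but "n -= n2" leaves 99. That surplus of 33 bytes is more than enough to
cover the four extra characters and the terminator.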

A similar problem happens with wcsnrtombs using a slightly longer string:

    std::wstring srcStr =
        L"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя";

    const wchar_t* srcPtr = &srcStr[0];
    std::mbstate_t st = {};
    std::string dest(srcStr.length() * 4 + 1, char(0));

    auto res = wcsnrtombs(&dest[0], &srcPtr, srcStr.length(),
                          dest.length(), &st);

    std::cout << "res = " << res << ", dest = " << dest << std::endl;

The output:
    res = 98, dest = абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНО
   
I.e. it only converted 49 characters instead of 99.
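
For comparison, here is a minimal sketch of the behaviour I would expect,
written as a character-at-a-time loop over wcrtomb. The name
ref_wcsnrtombs is hypothetical; this is only a reference to check results
against, not a proposed patch, and it assumes dst is non-null.

```cpp
#include <climits>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical reference helper, not musl code: converts at most wn wide
// characters from *src into at most n bytes at dst (dst must be non-null).
std::size_t ref_wcsnrtombs(char* dst, const wchar_t** src, std::size_t wn,
                           std::size_t n, std::mbstate_t* st)
{
    const wchar_t* ws = *src;
    std::size_t written = 0;
    while (wn > 0) {
        char buf[MB_LEN_MAX];
        std::mbstate_t saved = *st;
        std::size_t l = std::wcrtomb(buf, *ws, st);
        if (l == static_cast<std::size_t>(-1)) {  // unencodable character
            *src = ws;
            return l;
        }
        if (l > n - written) {                    // no room for the whole character
            *st = saved;
            break;
        }
        std::memcpy(dst + written, buf, l);
        if (*ws == L'\0') {                       // terminator stored, not counted
            *src = nullptr;
            return written + l - 1;
        }
        written += l;
        ++ws;
        --wn;
    }
    *src = ws;  // stopped on the wn or n limit: point at the next unconverted char
    return written;
}
```

Called with the same arguments as the wcsnrtombs test above, in a UTF-8
locale, this loop keeps going until all 99 wide characters are consumed or
the destination is full, since the two limits are tracked separately.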


Mikhail.



Thread overview:
2017-07-18 20:05 Mikhail Kremnyov
2017-08-09 17:57 ` Mikhail Kremnyov
2017-08-12  0:31   ` Rich Felker
2017-08-31 18:28 ` Rich Felker