mailing list of musl libc
 help / color / mirror / code / Atom feed
* Issues in mbsnrtowcs and wcsnrtombs
@ 2017-07-18 20:05 Mikhail Kremnyov
  2017-08-09 17:57 ` Mikhail Kremnyov
  2017-08-31 18:28 ` Rich Felker
  0 siblings, 2 replies; 4+ messages in thread
From: Mikhail Kremnyov @ 2017-07-18 20:05 UTC (permalink / raw)
  To: musl

Hi,

It looks like there are some bugs in the implementations of mbsnrtowcs
and wcsnrtombs.
E.g. inside mbsnrtowcs there is this code:

    while ( s && wn && ( (n2=n/4)>=wn || n2>32 ) ) {
        if (n2>=wn) n2=wn;
        n -= n2;
        l = mbsrtowcs(ws, &s, n2, st);

Here "n" is the number of source bytes to convert and "n2" is the number
of wide chars that may be put to the destination, so it's incorrect to
subtract one from another. And indeed a simple test shows that the
function doesn't work correctly if long enough non-ascii string is
passed to it. E.g.:

    const std::string origStr =
u8"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ";
    const std::string srcStr = origStr + u8"їґіє";

    std::mbstate_t st = {};
    const char* srcPtr = &srcStr[0];
    std::wstring dest(srcStr.length() + 1, wchar_t(0));

    auto res = mbsnrtowcs(&dest[0], &srcPtr, origStr.length(),
dest.length(), &st);

    std::cout << "res = " << res << ", srcPtr = " << (void*)srcPtr <<
std::endl;

And the output is:
    res = 70, srcPtr = 0

Here mbsnrtowcs was told to convert only "origStr.length()" number of
bytes, which contain 66 2-byte characters, but it converted 70, stopping
only after the zero char was met.

A similar problem happens with wcsnrtombs using a slightly longer string:

    std::wstring srcStr =
L"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя";

    const wchar_t* srcPtr = &srcStr[0];
    std::mbstate_t st = {};
    std::string dest(srcStr.length() * 4 + 1, char(0));

    auto res = wcsnrtombs(&dest[0], &srcPtr, srcStr.length(),
dest.length(), &st);

    std::cout << "res = " << res << ", dest = " << dest << std::endl;

The output:
    res = 98, dest = абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНО
   
I.e. it only converted 49 characters instead of 99.


Mikhail.



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-08-31 18:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-18 20:05 Issues in mbsnrtowcs and wcsnrtombs Mikhail Kremnyov
2017-08-09 17:57 ` Mikhail Kremnyov
2017-08-12  0:31   ` Rich Felker
2017-08-31 18:28 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).