> I'm not saying you need to wait....

1. That thread is hard for me to read, so I only glanced at it once. Thanks for your advice, I'll be more cautious next time! ;p

> Can I ask how .UTF-8 got in the locale name....

2. The '.UTF-8' is copied from glibc's locale table. I put it there; it's set by a normal user. When I looked into musl's source, I found that setting such a suffix is totally useless for musl: suffixes are meaningless. But in my view we should still stay compatible with glibc, since suffixes seem to be an unofficial but already standard way to ask libc for a particular charset.

> I don't think "it crashes on glibc"...

3. Really sorry, I forgot to run locale-gen before testing, which is why it segfaulted. It seems glibc only stripped '.GBK' at translation load time, and it showed me '»ỰѡÏî:'. In other words, it was using the real GBK charset!

That said, I agree with rejection conceptually: musl is UTF-8 only, and '.GBK' asks for GBK rather than UTF-8, so we should just reject it. But from a normal user's point of view, rewriting is nice because it avoids failure. And developers using musl should know that musl has no non-UTF-8 charsets rather than depending on libc for that, so I like the idea of rewriting. Or could we put the responsibility for setting the right LC_* on users? Not so friendly...

Users may want to validate the strings returned by setlocale(), so I think the best time to do the rewriting is at translation load time.

> Re: the original patch, it should probably...

4. Makes sense. I'm not a pro coder, and I hadn't thought about using strchr or strcmp! :)

With the idea above, I suggest using strchr to strip everything after the '.'. That works well, and we don't need to care about what comes after the '.', since whatever the user asked for, musl uses UTF-8.
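
Roughly what I have in mind is something like the sketch below. It is only an illustration, not a patch against musl: strip_codeset and its parameters are names I made up, and where it would be called from is still an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not musl code): copy the locale name `name` into
 * `out`, dropping the first '.' and everything after it.
 * Returns 0 on success, -1 if `out` is too small to hold the result. */
int strip_codeset(const char *name, char *out, size_t outsz)
{
    assert(name && out);

    /* strchr finds the first '.', which is where the codeset suffix starts. */
    const char *dot = strchr(name, '.');
    size_t n = dot ? (size_t)(dot - name) : strlen(name);

    if (n >= outsz) return -1;   /* need room for the terminating NUL */
    memcpy(out, name, n);
    out[n] = '\0';
    return 0;
}
```

So "zh_CN.UTF-8" and "zh_CN.GBK" would both end up as "zh_CN", and a name without a '.' is copied unchanged. One thing to keep in mind: this also drops an @modifier that follows the codeset (as in "de_DE.UTF-8@euro"), so a real version might want to keep that part.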

2017-01-30 0:37 GMT+08:00 Rich Felker <dalias@libc.org>:

On Sun, Jan 29, 2017 at 10:48:34PM +0800, He X wrote:
> btw, with 'p-> to q->', 'strip .UTF-8'(these two in the first thread), and
> these two patches, fcitx, chromium are working well.

Can I ask how .UTF-8 got in the locale name to begin with? Did you put
it there, or was it copied from another non-glibc system you logged in
from, or did chromium itself add it?

Re: the original patch, it should probably (depending on what we want
to do with other invalid encodings) either use strchr to find the
first '.' and strip everything after it, or something like:

if (loclen > 6 && !strcmp(locname+loclen-6, ".UTF-8"))

There's no reason to pull strstr in here.

Rich