* E02 failing on Alpine / musl libc
@ 2022-05-16 7:14 dana
2022-05-16 10:54 ` Peter Stephenson
2022-05-17 2:33 ` Jun T
0 siblings, 2 replies; 4+ messages in thread
From: dana @ 2022-05-16 7:14 UTC (permalink / raw)
To: Zsh hackers list
Earlier today an Alpine dev posted on IRC that an E02 test is failing for them
on 5.9:
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/34456
It's an issue with the way the function name ヌ is printed:
-$'\M-c\M-\C-C\M-\C-L' () {
+$'\udfe3\udf83\udf8c' () {
I assume it's to do with this:
> Starting with version 1.1.11, musl provides a special C locale where bytes
> 0x80-0xff are treated as abstract single-byte-character units with no actual
> character identity (they’re mapped into wchar_t values that occupy the
> Unicode surrogates range).
( https://wiki.musl-libc.org/functional-differences-from-glibc.html )
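As a rough illustration of that mapping (a sketch, not musl's actual code): assuming, as the diff above suggests, that each byte in 0x80-0xff becomes the code unit 0xdf00 | byte while ASCII bytes stay unchanged, the hypothetical helper musl_c_locale_codeunit() below models it.

```c
/* Hypothetical model of the mapping described in the musl note.
 * Assumption (inferred from the diff output, not taken from musl's
 * source): a byte in 0x80-0xff becomes 0xdf00 | byte, which falls in
 * the Unicode low-surrogate range; ASCII bytes map to themselves. */
static unsigned int musl_c_locale_codeunit(unsigned char byte)
{
    if (byte < 0x80)
        return byte;          /* plain ASCII is unchanged */
    return 0xdf00u | byte;    /* 0x80-0xff -> 0xdf80-0xdfff */
}
```

Feeding it the UTF-8 bytes of ヌ (0xe3 0x83 0x8c) gives 0xdfe3, 0xdf83 and 0xdf8c, which lines up with the '+' line of the diff.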
dana
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: E02 failing on Alpine / musl libc
2022-05-16 7:14 E02 failing on Alpine / musl libc dana
@ 2022-05-16 10:54 ` Peter Stephenson
2022-05-17 2:33 ` Jun T
1 sibling, 0 replies; 4+ messages in thread
From: Peter Stephenson @ 2022-05-16 10:54 UTC (permalink / raw)
To: Zsh hackers list

> On 16 May 2022 at 08:14 dana <dana@dana.is> wrote:
> Earlier today an Alpine dev posted on IRC that an E02 test is failing for them
> on 5.9:
>
> https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/34456
>
> It's an issue with the way the function name ヌ is printed:
>
> -$'\M-c\M-\C-C\M-\C-L' () {
> +$'\udfe3\udf83\udf8c' () {

This probably isn't a big deal for this test, which isn't even a multibyte
test; it's just to check we've got some consistent representation for strange
output with a meta bit. So arguably we could just pick a simpler string to
test, for an easier life.

The character output isn't right, though (try passing both versions to print),
so I suppose it's a real bug somewhere. The question is whether it's our job
to chase it.

pws
* Re: E02 failing on Alpine / musl libc
2022-05-16 7:14 E02 failing on Alpine / musl libc dana
2022-05-16 10:54 ` Peter Stephenson
@ 2022-05-17 2:33 ` Jun T
2022-05-19 3:27 ` dana
1 sibling, 1 reply; 4+ messages in thread
From: Jun T @ 2022-05-17 2:33 UTC (permalink / raw)
To: zsh-workers

> 2022/05/16 16:14, dana <dana@dana.is> wrote:
>
> I assume it's to do with this:
>
>> Starting with version 1.1.11, musl provides a special C locale where bytes
>> 0x80-0xff are treated as abstract single-byte-character units with no actual
>> character identity (they’re mapped into wchar_t values that occupy the
>> Unicode surrogates range).

I tried Alpine for the first time, and found that E02 and two other tests
(see below) failed due to this "special" C locale.

In this "special" C locale,

    str[0] = 0xXX;    /* any value in the range 0x80-0xff */
    mbrtowc(&wc, str, 1, &mbs);

sets wc to 0xdfXX (not just 0xXX). For example, if 0xXX is 0x83 then wc is
set to 0xdf83.

This is indeed "special", but it seems globbing etc. works without problem.
So I think we need/should not "fix" this, because 0xdfXX (or \udfXX) is the
correct representation in their "special" C locale.

If they want, they can just change (in their package) the expected outputs of
the tests to their correct values.

These are the two tests that fail due to the same reason:

./A03quoting.ztst: starting.
--- /tmp/zsh.ztst.13004/ztst.out
+++ /tmp/zsh.ztst.13004/ztst.tout
@@ -4,4 +4,4 @@
 16#4D
 16#42
 16#53
-16#DC
+16#DFDC
Test ./A03quoting.ztst failed: output differs from expected as shown above for:
  chars=$(print -r $'BS\\MBS\M-\\')
  for (( i = 1; i <= $#chars; i++ )); do
    char=$chars[$i]
    print $(( [#16] #char ))
  done
Was testing: $'-style quote with metafied backslash

./B03print.ztst: starting.
--- /tmp/zsh.ztst.20798/ztst.out
+++ /tmp/zsh.ztst.20798/ztst.tout
@@ -1 +1 @@
-f0
+dff0
Test ./B03print.ztst failed: output differs from expected as shown above for:
  printf '%x\n' $(printf '"\xf0')
Was testing: numeric value of high numbered character
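The mbrtowc() behaviour described in this message can be probed with a small C helper. This is only a sketch: probe_c_locale_byte() is a hypothetical name, and what it returns for high bytes depends on the libc (this thread reports 0xdfXX on musl; other libcs may reject the byte or map it differently).

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a single byte with mbrtowc() in the "C"
 * locale. Returns the resulting wchar_t value as a long, or -1 if the
 * locale cannot be set or the libc rejects the byte.
 * Note: setlocale() changes process-wide state. */
static long probe_c_locale_byte(unsigned char byte)
{
    char str[1];
    wchar_t wc = 0;
    mbstate_t mbs;
    size_t ret;

    if (setlocale(LC_CTYPE, "C") == NULL)
        return -1;

    memset(&mbs, 0, sizeof mbs);   /* start from the initial shift state */
    str[0] = (char)byte;

    ret = mbrtowc(&wc, str, 1, &mbs);
    if (ret == (size_t)-1 || ret == (size_t)-2)
        return -1;                 /* invalid or incomplete sequence */

    return (long)wc;
}
```

Calling probe_c_locale_byte(0x83) and printing the result in hex is one way to see whether a given system shows the 0xdf83 value mentioned in this message.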
* Re: E02 failing on Alpine / musl libc
2022-05-17 2:33 ` Jun T
@ 2022-05-19 3:27 ` dana
0 siblings, 0 replies; 4+ messages in thread
From: dana @ 2022-05-19 3:27 UTC (permalink / raw)
To: Jun T; +Cc: Zsh hackers list

On Mon 16 May 2022, at 21:33, Jun T wrote:
> So I think we need/should not "fix" this, because 0xdfXX (or \udfXX) is the
> correct representation in their "special" C locale.

I think i see the argument for not trying to do any 'special' accounting of
this locale in the shell. As far as the tests, i guess we are technically
making assumptions about the wchar values of non-'portable' characters that
POSIX says we can't actually make, but not making those assumptions seems
annoying.

For the E02 test in particular, as Peter says, it isn't a multi-byte test. If
there's not anything special about the code path for xtrace preservation
that's sensitive to weird function names, maybe that aspect of the test belongs
in B13, C04, or D07...?

Here is some additional context/history behind these failing tests, in case
anyone's ever looking for it later. Don't read this, you probably don't care:

The A03 and B03 tests that Jun mentioned here have been failing on musl since
at least zsh-5.5, probably longer (despite workers/48578 indicating that it'd
only started 'recently'), since the """special""" (lol) locale was introduced
to musl in August 2015, and made its way into Alpine very shortly afterwards.

The LC_ALL=C in the failing E02 test was introduced by me and Jun in
workers/45537+45550 to fix a similar issue i was seeing with the way the
function name ヌ was being printed by `which` on macOS Mojave. I bet i was
having this problem because i had explicitly set LC_CTYPE to a UTF-8 locale,
and Jun had not yet made the change in workers/49908 to have ztst reset that
back to C like it did with LANG and LC_ALL. It does now reset it with the
others, so the LC_ALL=C is probably superfluous in that respect.

However, if you don't have *any* LANG/LC_* variables set, on some systems,
including Alpine, where the 'implementation-defined default locale' is UTF-8,
you can get the same behaviour i was seeing, where `which` just prints ヌ back
out without any escaping.

I mention that because there are basically only two possibilities on a typical
musl system (either the 'special' POSIX locale or a UTF-8 one) and both of
them will cause the test to fail as written. And also because there might be
other systems that have a UTF-8 default locale where this test and others
could fail without an explicit LC_ALL=C, because ztst only resets the locale
to C if we're *not* using the default one (which i don't think i understand
the reasoning for).

dana