From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: cp437 issue with bad mapping at least for one char
Date: Tue, 21 Nov 2017 22:25:24 -0500
Message-ID: <20171122032524.GO1627@brightrain.aerifal.cx>
In-Reply-To: <85e-5a14e600-3-6a898d80@69999707>
On Wed, Nov 22, 2017 at 03:50:48AM +0100, Jacob Thrane Lund wrote:
>
> Hi musl devs,
>
> I experienced a test failing when building the latest version of gammu for Alpine Linux.
>
> After reporting the issue to the gammu developer the reached conclusion was the issue is with musl -
> https://github.com/gammu/gammu/issues/303#issuecomment-345258460
>
> I have checked the log for
> https://git.musl-libc.org/cgit/musl/commit/src/locale/codepages.h
> and Rich Felker pushed a commit 8 days ago. As of yet I have not had
> the chance to verify if this also resolves this issue. Dealing with
> charsets at this level is for me totally new territory..
>
> I was hoping you could confirm/deny if Rich’s commit indeed also resolves my issue?
It does. Here is how CP437 decodes, before:
Çüéâäàåç êëèïîìÄÅ ÉæÆôöòûù ÿÖÜ¢£¥₧ƒ ¡¢£¤¥¦§ ¨©ª«¬®¯ ░▒▓│┤╡╢╖ ╕╣║╗╝╜╛┐
└┴┬├─┼╞╟ ╚╔╩╦╠═╬╧ ╨╤╥╙╘╒╓╫ ╪┘┌█▄▌▐▀ αáΓπΣσæτ ΦΘΩδìφεï ðñ≥≤⌠⌡÷≈ °∙·√ü²■
and after:
Çüéâäàåç êëèïîìÄÅ ÉæÆôöòûù ÿÖÜ¢£¥₧ƒ áíóúñÑªº ¿⌐¬½¼¡«» ░▒▓│┤╡╢╖ ╕╣║╗╝╜╛┐
└┴┬├─┼╞╟ ╚╔╩╦╠═╬╧ ╨╤╥╙╘╒╓╫ ╪┘┌█▄▌▐▀ αßΓπΣσµτ ΦΘΩδ∞φε∩ ≡±≥≤⌠⌡÷≈ °∙·√ⁿ²■
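For an independent reference to compare the two dumps against, the upper half of CP437 can be decoded with another implementation. The following is a minimal sketch using Python's built-in cp437 codec, not anything from musl; the function name cp437_high_half is made up for illustration, and the grouping into runs of 8 characters is only an assumption meant to match the layout above.

```python
def cp437_high_half() -> str:
    """Decode bytes 0x80..0xFF as CP437 and group them in runs of 8.

    Uses Python's built-in "cp437" codec as a reference; this is an
    independent implementation, not musl's iconv.
    """
    raw = bytes(range(0x80, 0x100))
    text = raw.decode("cp437")
    # Split the 128 decoded characters into space-separated groups of 8.
    groups = [text[i:i + 8] for i in range(0, len(text), 8)]
    return " ".join(groups)


if __name__ == "__main__":
    print(cp437_high_half())
```

Byte 0xA1 is the slot to look at first: it should decode to í (U+00ED), which is the character the report was about.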
The problem (silently fixed) was twofold. The table generation code
for legacychars.h ignored entries in the Unicode charmap files that
used lowercase a-f in the hex, _and_ it omitted characters that
appeared in the same slot as their Unicode codepoint (in all the
ISO-8859 encodings containing í, it appears in "its own" slot), since
these previously got a special encoding. If not for the latter, í
would have been included in the legacychars.h map already, due to
being in Latin-1, where the charmap file used uppercase. Somehow,
when the character was missing from legacychars.h, the mapping tables
ended up containing nonsense.
Rich
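The lowercase-hex half of the problem described above is easy to reproduce in isolation. The sketch below is only an illustration and is not musl's table generation code: the two patterns, the parse helper, and the sample lines are all hypothetical, and the line format (byte value, Unicode value, comment) is an assumption about how such charmap files are laid out.

```python
import re

# Hypothetical pattern that accepts only uppercase hex digits; any entry
# written with lowercase a-f fails to match and is silently dropped.
STRICT = re.compile(r"^0x([0-9A-F]{2})\s+0x([0-9A-F]{4})")

# Same pattern, but accepting hex digits in either case.
LENIENT = re.compile(r"^0x([0-9A-Fa-f]{2})\s+0x([0-9A-Fa-f]{4})")


def parse(line, pattern):
    """Return (byte, codepoint) for a charmap line, or None if it is skipped."""
    m = pattern.match(line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


# Illustrative sample lines, written in lowercase hex.
SAMPLE = [
    "0xa1\t0x00ed\t#LATIN SMALL LETTER I WITH ACUTE",
    "0x41\t0x0041\t#LATIN CAPITAL LETTER A",
]

if __name__ == "__main__":
    for line in SAMPLE:
        print(parse(line, STRICT), parse(line, LENIENT))
```

With the strict pattern, the entry for í is skipped while the entry for A (which happens to contain no a-f digits) still parses, so the loss is easy to miss. The lenient pattern recovers both.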