mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: cp437 issue with bad mapping at least for one char
Date: Tue, 21 Nov 2017 22:25:24 -0500
Message-ID: <20171122032524.GO1627@brightrain.aerifal.cx>
In-Reply-To: <85e-5a14e600-3-6a898d80@69999707>

On Wed, Nov 22, 2017 at 03:50:48AM +0100, Jacob Thrane Lund wrote:
> 
> Hi musl devs,
> 
> I experienced a test failing when building the latest version of gammu for Alpine Linux.
> 
> After reporting the issue to the gammu developer the reached conclusion was the issue is with musl -
> https://github.com/gammu/gammu/issues/303#issuecomment-345258460
> 
> I have checked the log for
> https://git.musl-libc.org/cgit/musl/commit/src/locale/codepages.h
> and Rich Felker pushed a commit 8 days ago. As of yet I have not had
> the chance to verify if this also resolves this issue. Dealing with
> charsets at this level is for me totally new territory..
> 
> I was hoping you could confirm/deny if Rich’s commit indeed also resolves my issue?

It does. Here is how CP437 decodes, before:

Çüéâäàåç êëèïîìÄÅ ÉæÆôöòûù ÿÖÜ¢£¥₧ƒ  ¡¢£¤¥¦§ ¨©ª«¬­®¯ ░▒▓│┤╡╢╖ ╕╣║╗╝╜╛┐
└┴┬├─┼╞╟ ╚╔╩╦╠═╬╧ ╨╤╥╙╘╒╓╫ ╪┘┌█▄▌▐▀ αáΓπΣσæτ ΦΘΩδìφεï ðñ≥≤⌠⌡÷≈ °∙·√ü²■ 

and after:

Çüéâäàåç êëèïîìÄÅ ÉæÆôöòûù ÿÖÜ¢£¥₧ƒ áíóúñÑªº ¿⌐¬½¼¡«» ░▒▓│┤╡╢╖ ╕╣║╗╝╜╛┐
└┴┬├─┼╞╟ ╚╔╩╦╠═╬╧ ╨╤╥╙╘╒╓╫ ╪┘┌█▄▌▐▀ αßΓπΣσµτ ΦΘΩδ∞φε∩ ≡±≥≤⌠⌡÷≈ °∙·√ⁿ²■ 
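
For anyone wanting an independent cross-check of the "after" table, here
is a minimal sketch that decodes bytes 0x80..0xFF with Python's built-in
cp437 codec and prints them in the same groups of eight. It does not
exercise musl's iconv at all; the helper names cp437_upper_half and
grouped are made up for this illustration.

```python
def cp437_upper_half():
    """Decode bytes 0x80..0xFF with Python's own cp437 codec (not musl's iconv)."""
    return bytes(range(0x80, 0x100)).decode("cp437")


def grouped(text, size=8):
    """Split the decoded string into chunks of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


if __name__ == "__main__":
    groups = grouped(cp437_upper_half())
    # Two rows of eight groups each, matching the layout used above.
    print(" ".join(groups[:8]))
    print(" ".join(groups[8:]))
```

Byte 0xA1 is the second character of the fifth group, which is where í
should show up.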

The problem (silently fixed) was twofold. The table generation code for
legacychars.h ignored entries in the Unicode charmap files that used
lowercase a-f in the hex, _and_ it omitted characters that appear in
the same slot as their Unicode codepoint (in all the ISO-8859
encodings containing í, it appears in "its own" slot), since these
previously got a special encoding. If not for the latter, í would
already have been included in the legacychars.h map due to being in
Latin-1, where the charmap file uses uppercase.
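
To make the interaction of the two filters concrete, here is a rough
sketch of how they could combine to drop í. This is not the actual
generator; the function collect, the two regexes, and the sample charmap
lines (modeled on the tab-separated "0xNN 0xNNNN" layout of the Unicode
mapping files) are all assumptions made for illustration.

```python
import re

# Hypothetical parsers: one accepts only uppercase hex, one accepts both cases.
STRICT = re.compile(r"^0x([0-9A-F]{2})\s+0x([0-9A-F]{4})")
LENIENT = re.compile(r"^0x([0-9A-Fa-f]{2})\s+0x([0-9A-Fa-f]{4})")


def collect(lines, pattern, skip_identity):
    """Gather the Unicode codepoints that would end up in the shared map."""
    chars = set()
    for line in lines:
        m = pattern.match(line)
        if not m:
            continue  # line is silently ignored (the lowercase-hex case)
        byte, codepoint = int(m.group(1), 16), int(m.group(2), 16)
        if skip_identity and byte == codepoint:
            continue  # character sits in "its own" slot and is omitted
        chars.add(codepoint)
    return chars


# Assumed sample entries: uppercase hex for Latin-1, lowercase for CP437.
LATIN1 = ["0xED\t0x00ED\t# LATIN SMALL LETTER I WITH ACUTE"]
CP437 = ["0xa1\t0x00ed\t# LATIN SMALL LETTER I WITH ACUTE"]

buggy = collect(LATIN1 + CP437, STRICT, skip_identity=True)
fixed = collect(LATIN1 + CP437, LENIENT, skip_identity=False)

if __name__ == "__main__":
    print("buggy:", sorted(hex(c) for c in buggy))
    print("fixed:", sorted(hex(c) for c in fixed))
```

In this sketch, relaxing either filter on its own is enough to get
U+00ED back into the set; it only goes missing when both apply.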

Somehow when the character was missing in legacychars.h, the mapping
tables ended up containing nonsense.

Rich

