From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Decode 0x80 Euro for GBK
Date: Tue, 3 Mar 2020 15:45:00 -0500
Message-ID: <20200303204500.GO11469@brightrain.aerifal.cx>
In-Reply-To: <CAD66C+b0GdCV=J_euyycNCdoR87BpVe59Hk7ODMNen+Me3QPqQ@mail.gmail.com>
On Tue, Mar 03, 2020 at 04:09:54PM +0800, Mingye Wang wrote:
> Hi,
>
> Sorry for the inconvenience, but please check the attachment.
> --
> Mingye Wang (Artoria2e5)
> From 0451fe959a55cf19d17ca131d68825922e1357a4 Mon Sep 17 00:00:00 2001
> From: Mingye Wang <arthur200126@gmail.com>
> Date: Tue, 3 Mar 2020 15:56:15 +0800
> Subject: [PATCH] Decode 0x80 Euro for GBK
>
> Microsoft's cp936 has a Euro sign in its complete form, and it is the
> official IANA "GBK". Add it.
>
> Ref: https://encoding.spec.whatwg.org/#gbk-flag
> Ref: https://www.iana.org/assignments/charset-reg/GBK
> ---
> src/locale/iconv.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/locale/iconv.c b/src/locale/iconv.c
> index 3047c27b..d01342a2 100644
> --- a/src/locale/iconv.c
> +++ b/src/locale/iconv.c
> @@ -403,6 +403,11 @@ size_t iconv(iconv_t cd, char **restrict in, size_t *restrict inb, char **restri
>  			if (c < 128) break;
>  			if (c < 0xa1) goto ilseq;
>  		case GBK:
> +			// CP936 Euro. WHATWG tolerates it in GB18030, should we too?
> +			if (c == 128) {
> +				c = 0x20AC;
> +				break;
> +			}
>  		case GB18030:
>  			if (c < 128) break;
>  			c -= 0x81;
Does this mean GBK encodes the euro sign twice? Or is the normal
encoding of it only present in GB18030, not legacy GBK?
Rich
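
For readers following the quoted hunk, here is a minimal sketch of the lead-byte dispatch it modifies. It is not musl's code: the enum tags, the sentinel values and the function decode_lead are hypothetical names made up for illustration, and the sketch assumes that lead bytes outside 0x81..0xFE are rejected for GBK and GB18030. It only classifies the first byte and does no table lookup.

```c
/* Hypothetical charset tags; musl's iconv.c uses its own internal constants. */
enum charset { CS_GB2312, CS_GBK, CS_GB18030 };

/* Hypothetical sentinels, chosen outside the Unicode range. */
#define NEED_TRAIL 0xFFFFFFFEu  /* valid lead byte, more input needed */
#define ILSEQ      0xFFFFFFFFu  /* illegal sequence */

/* Classify one lead byte, mirroring the case fallthrough in the patch. */
static unsigned decode_lead(enum charset cs, unsigned char c)
{
	switch (cs) {
	case CS_GB2312:
		if (c < 128) return c;          /* ASCII passes through */
		if (c < 0xa1) return ILSEQ;     /* GB2312 lead bytes start at 0xA1 */
		/* fall through */
	case CS_GBK:
		/* The patched behavior: single byte 0x80 is the CP936 Euro. */
		if (c == 128) return 0x20AC;
		/* fall through */
	case CS_GB18030:
		if (c < 128) return c;
		/* Assumption: only 0x81..0xFE can start a multibyte sequence. */
		if (c < 0x81 || c > 0xfe) return ILSEQ;
		return NEED_TRAIL;
	}
	return ILSEQ;
}
```

Under these assumptions, decode_lead(CS_GBK, 0x80) yields 0x20AC, while the same byte under CS_GB18030 or CS_GB2312 is rejected. That asymmetry is what the comment in the patch asks about.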