ruby-dev (Japanese) list archive (unofficial mirror)
From: "JesseJohnson (Jesse Johnson) via ruby-dev" <ruby-dev@ml.ruby-lang.org>
To: ruby-dev@ml.ruby-lang.org
Cc: "JesseJohnson (Jesse Johnson)" <noreply@ruby-lang.org>
Subject: [ruby-dev:52057]  [Ruby master Bug#6351] transcode table generator does not support multi characters of Unicode
Date: Mon, 13 Nov 2023 18:35:52 +0000 (UTC)
Message-ID: <redmine.journal-105291.20231113183552.9@ruby-lang.org>
In-Reply-To: <redmine.issue-6351.20120424204139.9@ruby-lang.org>

Issue #6351 has been updated by JesseJohnson (Jesse Johnson).


@duerst Is this still an issue? If so, is there a test case?

----------------------------------------
Bug #6351: transcode table generator does not support multi characters of Unicode
https://bugs.ruby-lang.org/issues/6351#change-105291

* Author: usa (Usaku NAKAMURA)
* Status: Assigned
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: ruby 2.0.0dev (2012-04-24 trunk 35457)
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Filing this as a separate ticket. Taken from [ruby-dev:45576].

On 2012/04/24 17:11, "Martin J. Dürst" wrote:
> On 2012/04/24 17:02, U.Nakamura wrote:
>
>> As usual, it looks like the NetBSD data can be used.
>> That said, did transcode ever support anything outside
>> Unicode plane 0 (the BMP)?
>
> Of course it does :-)

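I looked into this a bit more. Characters outside the BMP have been
handled by transcode without any problem from the very beginning, but
the entries that currently trip us up are the following (excerpted from
http://x0213.org/codetable/euc-jis-2004-std.txt):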

0xA4F7	U+304B+309A	# 	[2000]
0xA4F8	U+304D+309A	# 	[2000]
0xA4F9	U+304F+309A	# 	[2000]
0xA4FA	U+3051+309A	# 	[2000]
0xA4FB	U+3053+309A	# 	[2000]

0xA5F7	U+30AB+309A	# 	[2000]
0xA5F8	U+30AD+309A	# 	[2000]
0xA5F9	U+30AF+309A	# 	[2000]
0xA5FA	U+30B1+309A	# 	[2000]
0xA5FB	U+30B3+309A	# 	[2000]
0xA5FC	U+30BB+309A	# 	[2000]
0xA5FD	U+30C4+309A	# 	[2000]
0xA5FE	U+30C8+309A	# 	[2000]

0xA6F8	U+31F7+309A	# 	[2000]

0xABC4	U+00E6+0300	# 	[2000]

0xABC8	U+0254+0300	# 	[2000]
0xABC9	U+0254+0301	# 	[2000]
0xABCA	U+028C+0300	# 	[2000]
0xABCB	U+028C+0301	# 	[2000]
0xABCC	U+0259+0300	# 	[2000]
0xABCD	U+0259+0301	# 	[2000]
0xABCE	U+025A+0300	# 	[2000]
0xABCF	U+025A+0301	# 	[2000]

0xABE5	U+02E9+02E5	# 	[2000]
0xABE6	U+02E5+02E9	# 	[2000]
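
For illustration, entries of this kind can be picked out of a table in
this format mechanically. The following is a minimal Ruby sketch and is
not part of the transcode table generator: parse_mapping_line, SAMPLE,
ENTRIES and MULTI are hypothetical names, and the U+3042 line is not in
the excerpt above but is added only as a single-code-point contrast.

```ruby
# Hypothetical helper: parse one line of a mapping table in the format
# excerpted above. Returns [byte_value, [codepoints]], or nil when the
# line is not a mapping line.
def parse_mapping_line(line)
  m = line.match(/\A0x(\h+)\s+U\+(\h+(?:\+\h+)*)/)
  return nil unless m
  [m[1].hex, m[2].split("+").map(&:hex)]
end

# Three lines from the excerpt, plus one ordinary single-code-point
# mapping (0xA4A2 => U+3042) added here for contrast.
SAMPLE = <<~TABLE
  0xA4F7 U+304B+309A # [2000]
  0xABC4 U+00E6+0300 # [2000]
  0xABE5 U+02E9+02E5 # [2000]
  0xA4A2 U+3042 # HIRAGANA LETTER A
TABLE

ENTRIES = SAMPLE.each_line.map { |l| parse_mapping_line(l) }.compact

# Keep only the entries whose Unicode side has more than one code point.
MULTI = ENTRIES.select { |_, cps| cps.size > 1 }

MULTI.each do |bytes, cps|
  utf8 = cps.pack("U*") # build the UTF-8 string for the sequence
  puts format("0x%04X -> %d code points, %d UTF-8 bytes",
              bytes, utf8.length, utf8.bytesize)
end
```

Each selected entry is one EUC-JIS-2004 character on the left but a
string of length 2 on the Unicode side.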

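In short, these are characters that are a single character in
JIS X 0213 but two characters in Unicode. Converting from EUC-JISX0213
to UTF-8 is not a problem, but the reverse direction currently fails.
windows-1258 has the same problem (in the opposite direction), so I have
long thought it would have to be fixed eventually, and this seems like a
good occasion to do so.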
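
To make the difficulty in the reverse direction concrete: when going
from Unicode to EUC-JISX0213, the converter cannot emit anything for
U+304B until it has checked whether U+309A follows. The following is a
minimal Ruby sketch of that longest-match lookup. It is only an
illustration under stated assumptions, not how Ruby's transcoder is
implemented: REVERSE and encode_codepoints are hypothetical names, the
two-code-point entry comes from the excerpt above, and the two
single-code-point entries are the ordinary mappings for U+304B and
U+3042, added for contrast.

```ruby
# Hypothetical reverse table: Unicode code point sequence => EUC byte value.
REVERSE = {
  [0x304B, 0x309A] => 0xA4F7, # from the excerpt above
  [0x304B]         => 0xA4AB, # plain HIRAGANA LETTER KA (added for contrast)
  [0x3042]         => 0xA4A2, # plain HIRAGANA LETTER A (added for contrast)
}.freeze

# Encode an array of code points by always trying the longest sequence
# present in the table first, then falling back to shorter ones.
def encode_codepoints(cps, table)
  max_len = table.keys.map(&:size).max
  out = []
  i = 0
  while i < cps.size
    len = [max_len, cps.size - i].min
    len -= 1 while len > 0 && !table.key?(cps[i, len])
    raise ArgumentError, format("no mapping for U+%04X", cps[i]) if len.zero?
    out << table[cps[i, len]]
    i += len
  end
  out
end

input = "\u304B\u309A\u3042\u304B".codepoints
puts encode_codepoints(input, REVERSE).map { |b| format("0x%04X", b) }.join(" ")
```

The first two code points are consumed together as one table entry,
while the trailing U+304B, which has nothing after it, falls back to its
own single-code-point mapping.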

Thank you in advance.    Martin.




-- 
https://bugs.ruby-lang.org/
