Date: Fri, 25 Jun 2021 09:39:55 +0000 (UTC)
From: duerst@it.aoyama.ac.jp
To: ruby-dev@ruby-lang.org
Subject: [ruby-dev:51069] [Ruby master Bug#12052] String#encode with xml option returns wrong result for totally non-ASCII-compatible encodings

Issue #12052 has been updated by duerst (Martin Dürst).

Status changed from Rejected to Open
Subject changed from String#encode with xml option returns wrong result to String#encode with xml option returns wrong result for totally non-ASCII-compatible encodings

Sorry to @jeremyevans0, but I have to disagree. This is a bug. We can disagree about how important it is to fix this bug, but it's a bug nevertheless. First, xml: :text works correctly in other encodings even if the source and destination encodings match.
```Ruby
" "<q&"
```

The bug is that we process UTF-16LE as if it consisted of 1-byte ASCII-based code units. I still have to identify exactly where and when that happens.

I have changed the subject to indicate what I understand is the extent of the problem.
By using "totally", I want to distinguish this from encodings such as Shift_JIS, which are also not as ASCII-compatible as, say, UTF-8, but still more so than UTF-16 (in its various variants).

----------------------------------------
Bug #12052: String#encode with xml option returns wrong result for totally non-ASCII-compatible encodings
https://bugs.ruby-lang.org/issues/12052#change-92645

* Author: nobu (Nobuyoshi Nakada)
* Status: Open
* Priority: Normal
* Assignee: akr (Akira Tanaka)
* Backport: 2.0.0: REQUIRED, 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED
----------------------------------------
Calling `String#encode` with the `xml:` option, from an ASCII-incompatible encoding to the same encoding, returns a wrong result.
It appears to convert the string as binary data.

```ruby
p "<\0>\0".encode("utf-16le", "utf-16le", xml: :text)
#=> "\u6C26\u3B74\u2600\u7467;"
```
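
The reported output can be reproduced by hand, which supports the reading that the escaping runs over single bytes rather than UTF-16LE code units. The following is only a sketch of that reasoning, not the code path inside Ruby's transcoder: the constant and variable names are made up for illustration, and the bytewise `gsub` merely stands in for whatever the `xml: :text` decorator does internally.

```ruby
# Hypothetical stand-in for the xml: :text escaping table.
ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# The four bytes 3C 00 3E 00, i.e. "<>" in UTF-16LE.
src = "<\0>\0".dup.force_encoding("UTF-16LE")

# Assumed faulty behaviour: escape each byte as if it were an ASCII character.
bytewise = src.b.gsub(/[&<>]/, ESCAPES)

# Relabel the escaped bytes as UTF-16LE, as the caller would see them.
misread = bytewise.dup.force_encoding("UTF-16LE")

# The string shown in the original report.
reported = "\u6C26\u3B74\u2600\u7467;".encode("UTF-16LE")
puts misread == reported  # true

# Escaping at the character level instead: go through UTF-8 and back.
char_level = src.encode("UTF-8").encode("UTF-8", xml: :text).encode("UTF-16LE")
puts char_level.encode("UTF-8")
```

The ten bytes of `&lt;\0&gt;\0` regroup into five 16-bit units, which gives exactly the string in the report. The round trip through UTF-8 is just one way to obtain a character-level result for comparison.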