Date: Mon, 28 Oct 2024 14:08:51 +0000 (UTC)
From: "javanthropus (Jeremy Bopp) via ruby-core"
To: ruby-core@ml.ruby-lang.org
Cc: "javanthropus (Jeremy Bopp)"
Reply-To: Ruby developers
Subject: [ruby-core:119633] [Ruby master Bug#20819] IO#readline does not process newlines correctly for non-ASCII compatible encodings

Issue #20819 has been reported by javanthropus (Jeremy Bopp).

----------------------------------------
Bug #20819: IO#readline does not process newlines correctly for non-ASCII compatible encodings
https://bugs.ruby-lang.org/issues/20819

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When not performing character conversion, IO#readline only processes newline characters as ASCII when reading paragraphs. However, when character conversion is involved, even when converting between 2 ASCII incompatible encodings, newline handling is correct.

```ruby
require "tempfile"

Tempfile.open(binmode: true) do |f|
  f.set_encoding("utf-16le")
  f.write("\n\n\n\nhello\n\nworld")
  f.rewind

  # No character conversion case.
  # Expecting "hello\n\n".encode(Encoding::UTF_16LE)
  f.readline("")
  # => "\0".force_encoding(Encoding::UTF_16LE) + "\n\n\nhello\n\nworld".encode(Encoding::UTF_16LE)

  f.set_encoding("utf-16le:utf-32le")
  f.rewind

  # Character conversion case.
  f.readline("")
  # => "hello\n\n".encode(Encoding::UTF_32LE)
end
```

In the failing case, a newline character appears in the first byte of the input due to the UTF-16LE encoding. This is discarded per the normal behavior of reading paragraphs, but the following null byte is not consumed as required to consume the entire newline character in UTF-16LE encoding. This leads to a leading and invalid null byte in the output of IO#readline. Furthermore, the newlines between "hello" and "world" are not seen as a pair of newline characters sufficient to end the first paragraph because they are not ASCII newlines and instead have a null byte between them.
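The byte-level picture behind this explanation can be sketched without touching IO at all. The snippet below is only an illustration of the byte layout, not a reproduction of IO#readline's internals; the variable names are made up for the example, and the "one byte consumed" step is an assumption inferred from the reported output.

```ruby
# Illustration only: inspects raw bytes; it does not call IO#readline.
data = "\n\n\n\nhello\n\nworld".encode(Encoding::UTF_16LE)
raw  = data.b # binary (ASCII-8BIT) copy of the same bytes

# Each UTF-16LE newline is the two-byte sequence 0x0A 0x00.
newline_bytes = "\n".encode(Encoding::UTF_16LE).bytes

# A byte-oriented search for two adjacent ASCII newlines finds nothing,
# because a null byte sits between every pair of 0x0A bytes.
ascii_match = raw.index("\n\n".b)

# Searching for the UTF-16LE encoded separator finds the paragraph boundary
# after "hello" (the search starts past the 8 bytes of leading newlines).
separator     = "\n\n".encode(Encoding::UTF_16LE).b
encoded_match = raw.index(separator, 8)

# Assumption: only the first byte (0x0A) is discarded as a leading newline.
# What remains matches the result shown in the report byte for byte.
after_one_byte = raw.byteslice(1, raw.bytesize - 1)
reported       = "\0".b + "\n\n\nhello\n\nworld".encode(Encoding::UTF_16LE).b

p newline_bytes              # => [10, 0]
p ascii_match                # => nil
p encoded_match              # => 18
p after_one_byte == reported # => true
```

Note that a byte-wise search like `raw.index(separator, 8)` is itself only safe here because the match happens to fall on a code unit boundary; a real fix would need to scan in units of the encoding's minimum character width.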