From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 06 Nov 2024 15:08:09 +0000 (UTC)
From: "javanthropus (Jeremy Bopp) via ruby-core"
To: ruby-core@ml.ruby-lang.org
Cc: "javanthropus (Jeremy Bopp)"
Reply-To: Ruby developers
Subject: [ruby-core:119773] [Ruby master Bug#20869] IO buffer handling is inconsistent when seeking
List-Id: Ruby developers
X-Redmine-Project: ruby-master
X-Redmine-Issue-Tracker: Bug
X-Redmine-Issue-Id: 20869
X-Redmine-Issue-Author: javanthropus
X-Redmine-Host: bugs.ruby-lang.org
Content-Type: text/plain; charset="us-ascii"

Issue #20869 has been updated by javanthropus (Jeremy Bopp).

I think your code change highlights another bug caused by the current behavior, where `IO#pos` can report negative values. Oddly, `IO#seek(0, :CUR)` still returns 0:

```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  f.ungetbyte(97)
  f.pos           # => -1
  f.seek(0, :CUR) # => 0
end
```

Note that `IO#pos` works correctly when used with `IO#ungetc` while transcoding, since that case causes an entirely different buffer to be used, one that the seeking functions currently ignore. As demonstrated in the issue description, though, that buffer is never cleared by the seeking functions.

Conceptually, it makes sense to me that the seeking functions should only care about bytes from the underlying stream, since that's what they operate on. They should ignore read buffer manipulation by `IO#ungetbyte` and `IO#ungetc`, since the data pushed by those methods has no relationship to the bytes of the stream.

What I don't see anywhere I've looked is a statement regarding how `IO#ungetbyte` and `IO#ungetc` *should* interact with seeking operations.
The existing specs and docs don't seem to cover those cases. It would be great to get clarification here before working on solutions.

While I think the best solution would be to disregard the bytes added by `IO#ungetbyte` and `IO#ungetc` and to clear the relevant buffers when seeking, I can imagine others may prefer to preserve the buffers. Maybe the solution is to leave the behavior deliberately undefined and just warn people against mixing these methods via documentation.

----------------------------------------
Bug #20869: IO buffer handling is inconsistent when seeking
https://bugs.ruby-lang.org/issues/20869#change-110439

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When performing any of the seek-based operations on IO (IO#seek, IO#pos=, or IO#rewind), the read buffer is inconsistently cleared:

```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetbyte as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetbyte(97)

  # Byte buffer will not be cleared
  f.seek(2, :SET)
  f.getbyte # => 97
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getbyte before #ungetbyte uses a
  # buffer that is not preserved when seeking
  f.getbyte
  f.ungetbyte(97)

  # Byte buffer will be cleared
  f.seek(2, :SET)
  f.getbyte # => 50
end
```

Similar behavior happens when reading characters:

```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetc as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetc('a')

  # Character buffer will not be cleared
  f.seek(2, :SET)
  f.getc # => 'a'
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getc before #ungetc uses a
  # buffer that is not preserved when seeking
  f.getc
  f.ungetc('a')

  # Character buffer will be cleared
  f.seek(2, :SET)
  f.getc # => '2'
end
```

When transcoding, however, the character buffer is never cleared when seeking:

```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.ungetc('a'.encode('utf-16le'))

  # Character buffer will not be cleared
  f.seek(2, :SET)
  f.getc # => 'a'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.getc
  f.ungetc('a'.encode('utf-16le'))

  # Character buffer will not be cleared
  f.seek(2, :SET)
  f.getc # => 'a'.encode('utf-16le')
end
```

I would expect the buffers to be cleared in all cases, except possibly when the seek operation doesn't actually move the file pointer, such as when calling IO#pos or IO#seek(0, :CUR). The inconsistent behavior demonstrated here is a problem regardless, though.

-- 
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
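
For anyone wanting to check how a particular Ruby build behaves, the two byte-oriented scenarios from the issue description could be condensed into a small probe along these lines. This is only a sketch: `byte_after_seek` and its `prime_read_buffer:` keyword are hypothetical names introduced for illustration and are not part of Ruby or of the issue. The script reports whatever it observes rather than assuming a result, since the outcome may differ between Ruby versions.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby or the issue): pushes a byte with
# IO#ungetbyte, seeks to an absolute offset, and returns the next byte read.
# When prime_read_buffer is true, one byte is read first so that the read
# buffer already exists before IO#ungetbyte is called.
def byte_after_seek(prime_read_buffer:)
  result = nil
  Tempfile.open do |f|
    f.write('0123456789')
    f.rewind

    f.getbyte if prime_read_buffer
    f.ungetbyte(97) # 97 is the byte value of 'a'
    f.seek(2, :SET)
    result = f.getbyte
  end
  result
end

results = {
  unget_first: byte_after_seek(prime_read_buffer: false),
  read_first: byte_after_seek(prime_read_buffer: true)
}

results.each do |scenario, byte|
  outcome =
    case byte
    when 97 then 'pushed byte survived the seek'
    when 50 then 'buffer was cleared, read "2" from the file'
    else 'unexpected value'
    end
  puts "#{scenario}: getbyte => #{byte.inspect} (#{outcome})"
end
```

If the two scenarios print different outcomes, the build shows the inconsistency described in the issue; if they agree, the buffers are being handled uniformly for this case.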