ruby-core@ruby-lang.org archive (unofficial mirror)
From: "nobu (Nobuyoshi Nakada) via ruby-core" <ruby-core@ml.ruby-lang.org>
To: ruby-core@ml.ruby-lang.org
Cc: "nobu (Nobuyoshi Nakada)" <noreply@ruby-lang.org>
Subject: [ruby-core:119807] [Ruby master Bug#20869] IO buffer handling is inconsistent when seeking
Date: Thu, 07 Nov 2024 10:46:09 +0000 (UTC)
Message-ID: <redmine.journal-110493.20241107104609.692@ruby-lang.org>
In-Reply-To: <redmine.issue-20869.20241105160701.692@ruby-lang.org>

Issue #20869 has been updated by nobu (Nobuyoshi Nakada).


We think the buffers and `Encoding::Converter`s should be discarded whenever the file position is changed.

----------------------------------------
Bug #20869: IO buffer handling is inconsistent when seeking
https://bugs.ruby-lang.org/issues/20869#change-110493

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When performing any of the seek-based operations on IO (IO#seek, IO#pos=, or IO#rewind), the read buffer is inconsistently cleared:

```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetbyte as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetbyte(97)
  # Byte buffer will not be cleared
  f.seek(2, :SET)

  f.getbyte       # => 97
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getbyte before #ungetbyte uses a
  # buffer that is not preserved when seeking
  f.getbyte
  f.ungetbyte(97)
  # Byte buffer will be cleared
  f.seek(2, :SET)

  f.getbyte       # => 50
end
```

Similar behavior happens when reading characters:
```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetc as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetc('a')
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getc before #ungetc uses a
  # buffer that is not preserved when seeking
  f.getc
  f.ungetc('a')
  # Character buffer will be cleared
  f.seek(2, :SET)

  f.getc       # => '2'
end
```

When transcoding, however, the character buffer is never cleared when seeking:
```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.ungetc('a'.encode('utf-16le'))
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.getc
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'.encode('utf-16le')
end
```

I would expect the buffers to be cleared in all cases, except possibly when the seek operation doesn't actually move the file pointer, such as when calling IO#pos or IO#seek(0, :CUR). Regardless, the inconsistent behavior demonstrated here is a problem.
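
To check whether a given Ruby build treats the two byte-oriented orderings above the same way, something like the following sketch may help. The helper name `byte_after_seek` and its keyword argument are hypothetical, introduced only for this illustration; the script simply reports what it observes and does not assume which result is correct.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby): returns the byte read after
# ungetbyte + seek, optionally priming the read buffer with getbyte first.
def byte_after_seek(prime_with_getbyte:)
  result = nil
  Tempfile.open do |f|
    f.write('0123456789')
    f.rewind

    f.getbyte if prime_with_getbyte
    f.ungetbyte(97)
    f.seek(2, :SET)

    result = f.getbyte
  end
  result
end

unprimed = byte_after_seek(prime_with_getbyte: false)
primed   = byte_after_seek(prime_with_getbyte: true)

# 97 ("a") means the pushed-back byte survived the seek;
# 50 ("2") means the buffer was cleared and the file was read at offset 2.
puts "ungetbyte first:         #{unprimed}"
puts "getbyte, then ungetbyte: #{primed}"
puts(unprimed == primed ? 'consistent' : 'inconsistent')
```

On the Ruby version reported in this issue, the two values are expected to differ (97 and 50), matching the first example above. If the buffers were always discarded on seek, both would presumably be 50.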



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

Thread overview: 12+ messages
2024-11-05 16:07 [ruby-core:119741] " javanthropus (Jeremy Bopp) via ruby-core
2024-11-05 17:52 ` [ruby-core:119748] " byroot (Jean Boussier) via ruby-core
2024-11-05 18:06 ` [ruby-core:119749] " byroot (Jean Boussier) via ruby-core
2024-11-06 15:08 ` [ruby-core:119773] " javanthropus (Jeremy Bopp) via ruby-core
2024-11-06 15:53 ` [ruby-core:119774] " byroot (Jean Boussier) via ruby-core
2024-11-06 16:58 ` [ruby-core:119777] " javanthropus (Jeremy Bopp) via ruby-core
2024-11-07 10:46 ` nobu (Nobuyoshi Nakada) via ruby-core [this message]
2024-11-07 12:16 ` [ruby-core:119809] " byroot (Jean Boussier) via ruby-core
2024-11-07 13:01 ` [ruby-core:119810] " javanthropus (Jeremy Bopp) via ruby-core
2024-11-08  4:28 ` [ruby-core:119832] " nobu (Nobuyoshi Nakada) via ruby-core
2024-11-08 13:16 ` [ruby-core:119843] " javanthropus (Jeremy Bopp) via ruby-core
2024-11-12 14:33 ` [ruby-core:119896] " javanthropus (Jeremy Bopp) via ruby-core
