ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:121271] [Ruby master Bug#21176] Regression in case-insensitive matching for single-byte encodings
@ 2025-03-09  5:31 ima1zumi (Mari Imaizumi) via ruby-core
  2025-03-11  9:05 ` [ruby-core:121290] " ima1zumi (Mari Imaizumi) via ruby-core
  0 siblings, 1 reply; 2+ messages in thread
From: ima1zumi (Mari Imaizumi) via ruby-core @ 2025-03-09  5:31 UTC
  To: ruby-core; +Cc: ima1zumi (Mari Imaizumi)

Issue #21176 has been reported by ima1zumi (Mari Imaizumi).

----------------------------------------
Bug #21176: Regression in case-insensitive matching for single-byte encodings
https://bugs.ruby-lang.org/issues/21176

* Author: ima1zumi (Mari Imaizumi)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-28T09:32:36Z master db4ea95219) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
In Ruby 3.4.2, case-insensitive regex matching (`/i`) worked as expected for single-byte encodings like ISO-8859-x.
However, in Ruby 3.5.0dev, characters such as `\u00F3` (ó) and `\u00D3` (Ó) are no longer treated as equivalent under case-insensitive matching, so the match fails.

The likely cause is #16145, which appears to have broken handling of bytes `0x80–0xFF` in single-byte encodings.
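As a small illustration of why the `0x80–0xFF` range matters here, the sketch below prints the byte values of the two characters in ISO-8859-1. The variable names are illustrative only and are not taken from the Ruby source; the snippet says nothing about where in the regex engine the regression lies.

```ruby
enc = Encoding::ISO_8859_1
lower = "\u00F3".encode(enc)  # ó
upper = "\u00D3".encode(enc)  # Ó

# In a single-byte encoding each character is exactly one byte.
lower_byte = lower.bytes.first
upper_byte = upper.bytes.first

printf("lower: 0x%02X, upper: 0x%02X\n", lower_byte, upper_byte)
# prints "lower: 0xF3, upper: 0xD3"

# Both bytes fall in the non-ASCII half of the encoding.
in_high_range = [lower_byte, upper_byte].all? { |b| (0x80..0xFF).cover?(b) }
puts "both in 0x80-0xFF: #{in_high_range}"
```

Both bytes are above `0x7F`, so any case folding that only handles ASCII bytes would not pair them.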

## Reproduction

```ruby
enc = Encoding::ISO_8859_1
o_acute_lower = "\u00F3".encode(enc)  # ó
o_acute_upper = "\u00D3".encode(enc)  # Ó

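# p (rather than puts) prints a failed match as "nil" and avoids the
# ambiguous-first-argument warning for a bare regex literal.
p(/[x#{o_acute_lower}]/i =~ "abc#{o_acute_upper}")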
```

- Ruby 3.4.2: outputs 3 (match successful)
- Ruby 3.5.0dev: outputs nil (match fails)
    - ruby 3.5.0dev (2025-02-28T09:32:36Z master db4ea95219) +PRISM [arm64-darwin24]
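
The reproduction above checks a single pair. A minimal sketch of a broader check, assuming the same character-class pattern shape, could loop over the uppercase letters of ISO-8859-1 and report which ones fail to match their lowercase counterpart under `/i`. The helper name `failing_upper_bytes` is hypothetical, and the result depends on the Ruby build, so no expected output is shown.

```ruby
enc = Encoding::ISO_8859_1

# Uppercase letters in ISO-8859-1 occupy 0xC0..0xDE, except 0xD7 (the
# multiplication sign); each lowercase counterpart is 0x20 higher.
upper_bytes = (0xC0..0xDE).reject { |b| b == 0xD7 }

# Returns the uppercase bytes that a case-insensitive character class
# built from the lowercase letter fails to match.
def failing_upper_bytes(enc, upper_bytes)
  upper_bytes.reject do |upper_byte|
    upper = upper_byte.chr(enc)
    lower = (upper_byte + 0x20).chr(enc)
    pattern = Regexp.new("[x#{lower}]", Regexp::IGNORECASE)
    pattern.match?("abc".encode(enc) + upper)
  end
end

failures = failing_upper_bytes(enc, upper_bytes)
if failures.empty?
  puts "all #{upper_bytes.size} pairs match under /i"
else
  puts "pairs that fail under /i: " + failures.map { |b| format("0x%02X", b) }.join(", ")
end
```

An empty list would correspond to the 3.4.2 behaviour described above; a non-empty list would show which bytes an affected build handles incorrectly.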

I will submit a PR to fix this.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/


* [ruby-core:121290] [Ruby master Bug#21176] Regression in case-insensitive matching for single-byte encodings
  2025-03-09  5:31 [ruby-core:121271] [Ruby master Bug#21176] Regression in case-insensitive matching for single-byte encodings ima1zumi (Mari Imaizumi) via ruby-core
@ 2025-03-11  9:05 ` ima1zumi (Mari Imaizumi) via ruby-core
  0 siblings, 0 replies; 2+ messages in thread
From: ima1zumi (Mari Imaizumi) via ruby-core @ 2025-03-11  9:05 UTC
  To: ruby-core; +Cc: ima1zumi (Mari Imaizumi)

Issue #21176 has been updated by ima1zumi (Mari Imaizumi).

Description updated


----------------------------------------
Bug #21176: Regression in case-insensitive matching for single-byte encodings
https://bugs.ruby-lang.org/issues/21176#change-112254

* Author: ima1zumi (Mari Imaizumi)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-28T09:32:36Z master db4ea95219) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
In Ruby 3.4.2, case-insensitive regex matching (`/i`) worked as expected for single-byte encodings like ISO-8859-x.
However, in Ruby 3.5.0dev, characters such as `\u00F3` (ó) and `\u00D3` (Ó) are no longer treated as equivalent under case-insensitive matching, so the match fails.

The likely cause is #16145, which appears to have broken handling of bytes `0x80–0xFF` in single-byte encodings.

## Reproduction

```ruby
enc = Encoding::ISO_8859_1
o_acute_lower = "\u00F3".encode(enc)  # ó
o_acute_upper = "\u00D3".encode(enc)  # Ó

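# p (rather than puts) prints a failed match as "nil" and avoids the
# ambiguous-first-argument warning for a bare regex literal.
p(/[x#{o_acute_lower}]/i =~ "abc#{o_acute_upper}")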
```

- Ruby 3.4.2: outputs 3 (match successful)
- Ruby 3.5.0dev: outputs nil (match fails)
    - ruby 3.5.0dev (2025-02-28T09:32:36Z master db4ea95219) +PRISM [arm64-darwin24]

I will submit a PR to fix this.

https://github.com/ruby/ruby/pull/12889




