ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
@ 2025-02-27 13:37 srbaker (Steven Baker) via ruby-core
  2025-02-27 13:59 ` [ruby-core:121194] " srbaker (Steven Baker) via ruby-core
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: srbaker (Steven Baker) via ruby-core @ 2025-02-27 13:37 UTC (permalink / raw)
  To: ruby-core; +Cc: srbaker (Steven Baker)

Issue #21161 has been reported by srbaker (Steven Baker).

----------------------------------------
Bug #21161: Crash when locale is set to Turkish tr_TR.UTF-8
https://bugs.ruby-lang.org/issues/21161

* Author: srbaker (Steven Baker)
* Status: Open
* ruby -v: ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
TL;DR this bug was reported in our tracker, and I'm pushing it upstream: https://bugzilla.opensuse.org/show_bug.cgi?id=1237861

When the locale is set to `tr_TR.UTF-8`, there is an encoding error.  It has been narrowed down specifically to setting `LC_CTYPE`.

To reproduce, simply run `LC_CTYPE=tr_TR.UTF-8 ruby -e "puts 42"`.

Example from a fresh 3.4.2 install:

``` shell
srbaker@geekopad:~> LC_CTYPE=tr_TR.UTF-8 ruby -e "puts 42"
/home/srbaker/.local/share/mise/installs/ruby/3.4.2/lib64/ruby/3.4.0/rubygems.rb:9:in 'Kernel#require': /home/srbaker/.local/share/mise/installs/ruby/3.4.2/lib64/ruby/3.4.0/x86_64-linux/rbconfig.rb:1: unknown or invalid encoding in the magic comment (ArgumentError)
> 1 | # encoding: ascii-8bit
    |             ^~~~~~~~~~
  2 | # frozen-string-literal: false
  3 | #

	from /home/srbaker/.local/share/mise/installs/ruby/3.4.2/lib64/ruby/3.4.0/rubygems.rb:9:in '<top (required)>'
	from <internal:gem_prelude>:2:in 'Kernel#require'
	from <internal:gem_prelude>:2:in '<internal:gem_prelude>'
```

This reproduces across multiple installs of Ruby: from our packages, and from local builds on both GNU/Linux and macOS.

It looks like it's related to case normalisation of the letter i, which in the Turkish locale appears to produce a lowercase i without a dot, so the string comparison on the encoding name fails.  Details in our bug linked above.
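
As a rough illustration of the dotless-i behaviour described above, Ruby's own `String#downcase` accepts a `:turkic` option that applies Turkish case rules. This is only a Ruby-level analogy under that assumption, not the C code path inside the parser, and the variable names are made up for the sketch:

```ruby
# Illustration only: Turkish case rules applied to the encoding name
# from the magic comment. This is an analogy, not the parser's code.
# The string is forced to UTF-8 so Unicode case rules apply.
name = "ASCII-8BIT".encode("UTF-8")

default_lower = name.downcase           # language-independent mapping
turkic_lower  = name.downcase(:turkic)  # Turkish rules: "I" becomes dotless "ı"

puts default_lower                  # ascii-8bit
puts turkic_lower                   # ascıı-8bıt
puts default_lower == turkic_lower  # false
```

Any case-insensitive match that relies on language-specific rules can therefore miss a name that is plain ASCII.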



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/


* [ruby-core:121194] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
  2025-02-27 13:37 [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8 srbaker (Steven Baker) via ruby-core
@ 2025-02-27 13:59 ` srbaker (Steven Baker) via ruby-core
  2025-02-27 14:31 ` [ruby-core:121195] " nobu (Nobuyoshi Nakada) via ruby-core
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: srbaker (Steven Baker) via ruby-core @ 2025-02-27 13:59 UTC (permalink / raw)
  To: ruby-core; +Cc: srbaker (Steven Baker)

Issue #21161 has been updated by srbaker (Steven Baker).


I have confirmed this does not affect any 3.x versions before 3.4.


* [ruby-core:121195] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
  2025-02-27 13:37 [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8 srbaker (Steven Baker) via ruby-core
  2025-02-27 13:59 ` [ruby-core:121194] " srbaker (Steven Baker) via ruby-core
@ 2025-02-27 14:31 ` nobu (Nobuyoshi Nakada) via ruby-core
  2025-02-27 14:32 ` [ruby-core:121196] " nobu (Nobuyoshi Nakada) via ruby-core
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-02-27 14:31 UTC (permalink / raw)
  To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)

Issue #21161 has been updated by nobu (Nobuyoshi Nakada).


Does it happen with `LC_CTYPE=tr_TR.UTF-8 ruby --parser=parse.y -e "puts 42"` too?

I think it is because `pm_strncasecmp` uses the system `tolower` function, which is locale-dependent.
We have to use our own locale-insensitive version, such as `st_locale_insensitive_strcasecmp`, as stated in include/ruby/internal/ctype.h.
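
A minimal sketch of what a locale-insensitive comparison looks like, written in Ruby rather than C for brevity. The method name `ascii_strncasecmp` is hypothetical; it is not the Prism or `st.c` implementation. It folds only the bytes `A`-`Z`, so no locale setting can change its result:

```ruby
# Hypothetical helper: compare at most the first n bytes of two strings,
# folding only ASCII A-Z to a-z. No locale-dependent call is involved.
# Returns 0 when equal, a non-zero byte difference otherwise.
def ascii_strncasecmp(a, b, n)
  fold = ->(byte) { byte.between?(0x41, 0x5A) ? byte + 0x20 : byte }

  n.times do |i|
    x = fold.call(a.getbyte(i) || 0)  # treat end of string as a NUL byte
    y = fold.call(b.getbyte(i) || 0)
    return x - y unless x == y
    break if x.zero?                  # both strings ended before n bytes
  end
  0
end

puts ascii_strncasecmp("ASCII-8BIT", "ascii-8bit", 10)  # 0
```

The design choice is to hard-code the ASCII range instead of asking the C library, which is why the result stays the same under `LC_CTYPE=tr_TR.UTF-8`.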


* [ruby-core:121196] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
  2025-02-27 13:37 [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8 srbaker (Steven Baker) via ruby-core
  2025-02-27 13:59 ` [ruby-core:121194] " srbaker (Steven Baker) via ruby-core
  2025-02-27 14:31 ` [ruby-core:121195] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-02-27 14:32 ` nobu (Nobuyoshi Nakada) via ruby-core
  2025-02-27 14:33 ` [ruby-core:121197] " srbaker (Steven Baker) via ruby-core
  2025-03-04  1:08 ` [ruby-core:121232] " ufuk (Ufuk Kayserilioglu) via ruby-core
  4 siblings, 0 replies; 6+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-02-27 14:32 UTC (permalink / raw)
  To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)

Issue #21161 has been updated by nobu (Nobuyoshi Nakada).

Assignee set to prism


* [ruby-core:121197] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
  2025-02-27 13:37 [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8 srbaker (Steven Baker) via ruby-core
                   ` (2 preceding siblings ...)
  2025-02-27 14:32 ` [ruby-core:121196] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-02-27 14:33 ` srbaker (Steven Baker) via ruby-core
  2025-03-04  1:08 ` [ruby-core:121232] " ufuk (Ufuk Kayserilioglu) via ruby-core
  4 siblings, 0 replies; 6+ messages in thread
From: srbaker (Steven Baker) via ruby-core @ 2025-02-27 14:33 UTC (permalink / raw)
  To: ruby-core; +Cc: srbaker (Steven Baker)

Issue #21161 has been updated by srbaker (Steven Baker).


Thanks for the quick response!

It does not happen with that command:

```
srbaker@geekopad:~> ruby -v
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [x86_64-linux]
srbaker@geekopad:~> LC_CTYPE=tr_TR.UTF-8 ruby --parser=parse.y -e "puts 42"
42
```


* [ruby-core:121232] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8
  2025-02-27 13:37 [ruby-core:121193] [Ruby master Bug#21161] Crash when locale is set to Turkish tr_TR.UTF-8 srbaker (Steven Baker) via ruby-core
                   ` (3 preceding siblings ...)
  2025-02-27 14:33 ` [ruby-core:121197] " srbaker (Steven Baker) via ruby-core
@ 2025-03-04  1:08 ` ufuk (Ufuk Kayserilioglu) via ruby-core
  4 siblings, 0 replies; 6+ messages in thread
From: ufuk (Ufuk Kayserilioglu) via ruby-core @ 2025-03-04  1:08 UTC (permalink / raw)
  To: ruby-core; +Cc: ufuk (Ufuk Kayserilioglu)

Issue #21161 has been updated by ufuk (Ufuk Kayserilioglu).

Backport changed from 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONTNEED, 3.4: REQUIRED to 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONTNEED, 3.4: DONE

ruby_3_4 commit:3d744a0a9436fbf7901c345055dd3d775b518361.

----------------------------------------
Bug #21161: Crash when locale is set to Turkish tr_TR.UTF-8
https://bugs.ruby-lang.org/issues/21161#change-112182

* Author: srbaker (Steven Baker)
* Status: Closed
* Assignee: prism
* ruby -v: ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [x86_64-linux]
* Backport: 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONTNEED, 3.4: DONE
