* [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
@ 2023-11-19 16:26 ippachi (Kazuya Hatanaka) via ruby-core
2023-11-21 9:46 ` [ruby-core:115433] " byroot (Jean Boussier) via ruby-core
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: ippachi (Kazuya Hatanaka) via ruby-core @ 2023-11-19 16:26 UTC (permalink / raw)
To: ruby-core; +Cc: ippachi (Kazuya Hatanaka)
Issue #20009 has been reported by ippachi (Kazuya Hatanaka).
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
## Reproduction code
```ruby
class Cクラス; end
Marshal.load(Marshal.dump(Cクラス))
```
## Actual result
```
<internal:marshal>:34:in `load': undefined class/module C\xE3\x82\xAF\xE3\x83\xA9\xE3\x82\xB9 (ArgumentError)
from marshal.rb:2:in `<main>'
```
## Expected result
Returns `Cクラス`
## Impacted area
An exception is raised in Rails under the following conditions:
* minitest is used with default settings
* Parallel execution with parallelize
* test class names contain non-ASCII characters
The default parallelization uses DRb, and Marshal is used inside DRb.
## Other
After trying various things, I thought I could fix it by making `rb_path_to_class` support strings containing non-ASCII characters, but I could not get any further than that.
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:115433] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
@ 2023-11-21 9:46 ` byroot (Jean Boussier) via ruby-core
2025-05-13 10:54 ` [ruby-core:122043] [Ruby " make_now_just (Hiroya Fujinami) via ruby-core
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2023-11-21 9:46 UTC (permalink / raw)
To: ruby-core; +Cc: byroot (Jean Boussier)
Issue #20009 has been updated by byroot (Jean Boussier).
I dug into this bug, and I'm not sure if it's possible to fix it.
Classes are serialized this way:
```c
case T_CLASS:
  w_byte(TYPE_CLASS, arg);
  {
      VALUE path = class2path(obj);
      w_bytes(RSTRING_PTR(path), RSTRING_LEN(path), arg);
      RB_GC_GUARD(path);
  }
  break;
```
We write the `TYPE_CLASS` prefix, and then write the bytes of the class name, without any encoding indication.
Then on `load`, we just read the bytes and try to look up the class:
```c
case TYPE_CLASS:
  {
      VALUE str = r_bytes(arg);
      v = path2class(str);
```
So on `load` we're looking for `"Cクラス".b.to_sym`, which doesn't match `:"Cクラス"`.
To fix this, we'd need to include the encoding in the format, but that would mean breaking backward and forward compatibility, which is a huge deal.
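The mismatch can be reproduced at the Ruby level without going through Marshal at all. The sketch below only illustrates the symbol comparison described above, not the C code path itself; the variable names are made up for the example.

```ruby
# Illustration only: the same bytes yield different symbols depending on encoding.
name = "Cクラス"                    # UTF-8 string, as in the reproduction code

utf8_sym   = name.to_sym           # the name that `class Cクラス` registers
binary_sym = name.b.to_sym         # what a lookup on the raw bytes amounts to

puts utf8_sym == binary_sym                    # false
puts binary_sym.encoding == Encoding::BINARY   # true

# Re-tagging the raw bytes as UTF-8 recovers the original symbol.
retagged_sym = name.b.force_encoding(Encoding::UTF_8).to_sym
puts retagged_sym == utf8_sym                  # true
```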
### Half-way solution
One possible half-way solution would be:
- Assume non-ASCII class names are UTF-8
- Raise on dump for class names that are not UTF-8 compatible.
It's far from ideal though.
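As a rough Ruby-level sketch of that half-way idea (not the actual `marshal.c` change): `resolve_class_path` and `utf8_compatible_name?` are hypothetical helpers invented for this illustration, and the raw bytes are built by hand rather than taken from a real dump.

```ruby
# Hypothetical helpers, not part of Marshal.
class Cクラス; end

# Load side: assume the raw class path bytes are UTF-8 and look the constant up.
def resolve_class_path(raw_bytes)
  name = raw_bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "class path is not valid UTF-8" unless name.valid_encoding?
  Object.const_get(name)
end

# Dump side: accept only names that can be represented as valid UTF-8.
def utf8_compatible_name?(name)
  name.encode(Encoding::UTF_8).valid_encoding?
rescue EncodingError
  false
end

raw = "Cクラス".b                 # the bytes as they would follow the 'c' prefix
p resolve_class_path(raw)
p utf8_compatible_name?("Cクラス")
p utf8_compatible_name?("\xFF".b)
```

The sketch ignores the compatibility question entirely: it says nothing about names that were dumped in some other encoding by an older Ruby.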
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-105364
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122043] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
2023-11-21 9:46 ` [ruby-core:115433] " byroot (Jean Boussier) via ruby-core
@ 2025-05-13 10:54 ` make_now_just (Hiroya Fujinami) via ruby-core
2025-05-13 11:47 ` [ruby-core:122048] " Eregon (Benoit Daloze) via ruby-core
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: make_now_just (Hiroya Fujinami) via ruby-core @ 2025-05-13 10:54 UTC (permalink / raw)
To: ruby-core; +Cc: make_now_just (Hiroya Fujinami)
Issue #20009 has been updated by make_now_just (Hiroya Fujinami).
In my opinion, we need to introduce a new format for dumping classes/modules correctly.
Marshal uses `c` and `m` (`TYPE_CLASS` and `TYPE_MODULE`) as format prefixes currently, so the format is the following:
```
| 1 byte | ... | ... |
| 'c'/'m' | path size | path name binary string |
```
This format carries no encoding information, which is what causes the bug.
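The layout above can be checked against a real dump. This is a minimal sketch that assumes a short, ASCII-only class path (so the size fits in a single byte); the variable names are only for illustration.

```ruby
dump = Marshal.dump(String)      # ASCII-only path, so no encoding question arises

major = dump.getbyte(0)          # Marshal format version, major part
minor = dump.getbyte(1)          # Marshal format version, minor part
type  = dump.byteslice(2, 1)     # "c" for a class, "m" for a module
size  = dump.getbyte(3) - 5      # lengths 1..122 are stored in one byte as n + 5
path  = dump.byteslice(4, size)  # raw path bytes; nothing records their encoding

p [major, minor]   # [4, 8]
p type             # "c"
p path             # "String"
```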
To solve this issue, the encoding information has to be added to the dump result, and introducing a new format seems a natural way to do that.
The new format would therefore look like this (using `K` as the new type prefix):
```
| 1 byte | ... |
| 'K' | a dump of the path symbol (`:`, `;`, or `I`) |
```
Such a format is already used for dumping other kinds of objects, e.g. `o` and `S` (`TYPE_OBJECT` and `TYPE_STRUCT`).
This idea does not break backward compatibility, but we need to increment `MARSHAL_MINOR`.
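For reference, a dumped symbol already carries its encoding, which is what a symbol-based path would reuse. The sketch below only inspects existing symbol dumps; the `K` prefix itself is a proposal in this thread and is not exercised here.

```ruby
ascii_dump = Marshal.dump(:String)
utf8_dump  = Marshal.dump(:"Cクラス")

p ascii_dump.byteslice(2, 1)          # ":"  plain symbol, no encoding marker
p utf8_dump.byteslice(2, 2)           # "I:" symbol wrapped with instance variables
p utf8_dump.end_with?(":\x06ET")      # true: the `E` => true pair marking UTF-8

restored = Marshal.load(utf8_dump)
p restored == :"Cクラス"               # true
p restored.encoding == Encoding::UTF_8 # true
```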
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113192
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122048] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
2023-11-21 9:46 ` [ruby-core:115433] " byroot (Jean Boussier) via ruby-core
2025-05-13 10:54 ` [ruby-core:122043] [Ruby " make_now_just (Hiroya Fujinami) via ruby-core
@ 2025-05-13 11:47 ` Eregon (Benoit Daloze) via ruby-core
2025-05-16 8:40 ` [ruby-core:122137] " nobu (Nobuyoshi Nakada) via ruby-core
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-05-13 11:47 UTC (permalink / raw)
To: ruby-core; +Cc: Eregon (Benoit Daloze)
Issue #20009 has been updated by Eregon (Benoit Daloze).
byroot (Jean Boussier) wrote in #note-1:
> ### Half-way solution
>
> Some possible half-way solution would be:
>
> * Assume non-ASCII class names are UTF-8
> * Raise on dump for class names with non-UTF8 compatible class names.
>
> It's far from ideal though.
I think this would actually be a pretty good solution: it seems very unlikely that anyone has class names which can't be encoded in UTF-8.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113198
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122137] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (2 preceding siblings ...)
2025-05-13 11:47 ` [ruby-core:122048] " Eregon (Benoit Daloze) via ruby-core
@ 2025-05-16 8:40 ` nobu (Nobuyoshi Nakada) via ruby-core
2025-05-16 10:35 ` [ruby-core:122139] " nobu (Nobuyoshi Nakada) via ruby-core
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-05-16 8:40 UTC (permalink / raw)
To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)
Issue #20009 has been updated by nobu (Nobuyoshi Nakada).
Currently, instance variables of a class/module are prohibited.
It may be possible to put the encoding information there.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113288
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122139] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (3 preceding siblings ...)
2025-05-16 8:40 ` [ruby-core:122137] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-05-16 10:35 ` nobu (Nobuyoshi Nakada) via ruby-core
2025-05-16 12:08 ` [ruby-core:122143] " Eregon (Benoit Daloze) via ruby-core
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-05-16 10:35 UTC (permalink / raw)
To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)
Issue #20009 has been updated by nobu (Nobuyoshi Nakada).
I made a patch which works well except for tests added by [ruby/spec@907cb35](https://github.com/ruby/spec/commit/907cb35e21).
Since the dumped strings in these tests were never loadable, I think those tests are actually useless.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113290
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122143] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (4 preceding siblings ...)
2025-05-16 10:35 ` [ruby-core:122139] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-05-16 12:08 ` Eregon (Benoit Daloze) via ruby-core
2025-05-16 12:10 ` [ruby-core:122144] " Eregon (Benoit Daloze) via ruby-core
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-05-16 12:08 UTC (permalink / raw)
To: ruby-core; +Cc: Eregon (Benoit Daloze)
Issue #20009 has been updated by Eregon (Benoit Daloze).
Mmh, but Marshal.dump+load of such non-7-bit modules/classes works on TruffleRuby, although it needs a tiny fix:
```patch
diff --git a/src/main/ruby/truffleruby/core/marshal.rb b/src/main/ruby/truffleruby/core/marshal.rb
index 102468e774..ea7469ea4a 100644
--- a/src/main/ruby/truffleruby/core/marshal.rb
+++ b/src/main/ruby/truffleruby/core/marshal.rb
@@ -786,19 +786,19 @@ module Marshal
     end
 
     def construct_class
-      obj = const_lookup(get_byte_sequence.to_sym, Class)
+      obj = const_lookup(get_byte_sequence.force_encoding(Encoding::UTF_8).to_sym, Class)
       store_unique_object obj
       obj
     end
 
     def construct_module
-      obj = const_lookup(get_byte_sequence.to_sym, Module)
+      obj = const_lookup(get_byte_sequence.force_encoding(Encoding::UTF_8).to_sym, Module)
       store_unique_object obj
       obj
     end
 
     def construct_old_module
-      obj = const_lookup(get_byte_sequence.to_sym)
+      obj = const_lookup(get_byte_sequence.force_encoding(Encoding::UTF_8).to_sym)
       store_unique_object obj
       obj
     end
```
```ruby
class MultibyteぁあぃいClass
end
source_object = MultibyteぁあぃいClass
p Marshal.dump(source_object)
p Marshal.load(Marshal.dump(source_object))
```
```
$ ruby -v marshal_class.rb
truffleruby 25.0.0-dev-a65bde3d, like ruby 3.3.7, Interpreted JVM [x86_64-linux]
"\x04\bc\x1FMultibyte\xE3\x81\x81\xE3\x81\x82\xE3\x81\x83\xE3\x81\x84Class"
MultibyteぁあぃいClass
```
I think that, at least when no encoding information is present, we should assume UTF-8, because it's by far the most common source encoding.
I see no value in looking up the name in BINARY encoding, as is done currently; such a constant wouldn't even print well.
(FWIW, TruffleRuby stores constant names as Java Strings, which carry no encoding information. I'm not convinced it's a good idea to allow, for example, two constants `É` in UTF-8 and ISO-8859-1 on the same module; it just seems like needless confusion. Non-7-bit BINARY-encoded constants seem no good either.)
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113294
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122144] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (5 preceding siblings ...)
2025-05-16 12:08 ` [ruby-core:122143] " Eregon (Benoit Daloze) via ruby-core
@ 2025-05-16 12:10 ` Eregon (Benoit Daloze) via ruby-core
2025-05-16 12:15 ` [ruby-core:122145] " Eregon (Benoit Daloze) via ruby-core
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-05-16 12:10 UTC (permalink / raw)
To: ruby-core; +Cc: Eregon (Benoit Daloze)
Issue #20009 has been updated by Eregon (Benoit Daloze).
So my suggestion would be:
* Interpret the serialized module/class name as UTF-8, not as BINARY (and, if it is 7-bit only, as US-ASCII, which is already the case).
* If the module/class name uses another encoding, we could either transcode it to UTF-8 or use nobu's trick of serializing the encoding name as a fake instance variable. Transcoding to UTF-8 seems simpler, and the only case where it wouldn't work is BINARY with non-7-bit bytes, which seems to be of no value anyway.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113295
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122145] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (6 preceding siblings ...)
2025-05-16 12:10 ` [ruby-core:122144] " Eregon (Benoit Daloze) via ruby-core
@ 2025-05-16 12:15 ` Eregon (Benoit Daloze) via ruby-core
2025-05-17 9:43 ` [ruby-core:122168] " nobu (Nobuyoshi Nakada) via ruby-core
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-05-16 12:15 UTC (permalink / raw)
To: ruby-core; +Cc: Eregon (Benoit Daloze)
Issue #20009 has been updated by Eregon (Benoit Daloze).
@nobu What happens when using your patch to Marshal.dump a class and then trying to load it on an older Ruby? Will it create an instance variable `E` or error?
I guess if there is no matching BINARY constant it would error anyway, but if there is one, it would either set that instance variable `E` or raise a different error.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113296
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122168] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (7 preceding siblings ...)
2025-05-16 12:15 ` [ruby-core:122145] " Eregon (Benoit Daloze) via ruby-core
@ 2025-05-17 9:43 ` nobu (Nobuyoshi Nakada) via ruby-core
2025-05-17 9:47 ` [ruby-core:122169] " nobu (Nobuyoshi Nakada) via ruby-core
2025-06-15 4:00 ` [ruby-core:122536] " nagachika (Tomoyuki Chikanaga) via ruby-core
10 siblings, 0 replies; 12+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-05-17 9:43 UTC (permalink / raw)
To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)
Issue #20009 has been updated by nobu (Nobuyoshi Nakada).
Eregon (Benoit Daloze) wrote in #note-8:
> @nobu What happens when using your patch to Marshal.dump a class and then trying to load it on an older Ruby? Will it create an instance variable `E` or error?
Just an error.
> I guess if there is no matching BINARY constant it would error anyway, but if there is it would set that instance variable `E` or different error.
If the matching BINARY class/module exists, the instance variable causes an error.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113317
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122169] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (8 preceding siblings ...)
2025-05-17 9:43 ` [ruby-core:122168] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-05-17 9:47 ` nobu (Nobuyoshi Nakada) via ruby-core
2025-06-15 4:00 ` [ruby-core:122536] " nagachika (Tomoyuki Chikanaga) via ruby-core
10 siblings, 0 replies; 12+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-05-17 9:47 UTC (permalink / raw)
To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)
Issue #20009 has been updated by nobu (Nobuyoshi Nakada).
Eregon (Benoit Daloze) wrote in #note-7:
> So my suggestion would be:
> * Interpret the serialized module/class name as UTF-8, not as BINARY. And of course if it's only 7-bit as US-ASCII (already the case).
> * If the module/class name uses another encoding, we could either transcode it to UTF-8, or use nobu's trick of serializing the encoding name as a fake instance variable. Transcoding to UTF-8 seems simpler, and the only case it wouldn't work is for BINARY with non-7-bit, which seems of no value anyway. EDIT: mmh but I suppose transcoding to UTF-8 wouldn't work on CRuby because the lookup would fail.
I'm not against interpreting the default encoding as UTF-8, but I don't think transcoding is intuitive, since symbols in different encodings are distinct even if they are identical once transcoded.
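A small sketch of that last point, using `É` purely as an example name (the variable names are illustrative):

```ruby
utf8_name   = "É"
latin1_name = utf8_name.encode(Encoding::ISO_8859_1)

# The text is the same once both sides are in UTF-8 ...
same_text = latin1_name.encode(Encoding::UTF_8) == utf8_name
# ... but the symbols made from the two strings are distinct objects.
same_symbol = utf8_name.to_sym == latin1_name.to_sym

p same_text     # true
p same_symbol   # false
```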
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113318
* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ruby-core:122536] [Ruby Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII
2023-11-19 16:26 [ruby-core:115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII ippachi (Kazuya Hatanaka) via ruby-core
` (9 preceding siblings ...)
2025-05-17 9:47 ` [ruby-core:122169] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-06-15 4:00 ` nagachika (Tomoyuki Chikanaga) via ruby-core
10 siblings, 0 replies; 12+ messages in thread
From: nagachika (Tomoyuki Chikanaga) via ruby-core @ 2025-06-15 4:00 UTC (permalink / raw)
To: ruby-core; +Cc: nagachika (Tomoyuki Chikanaga)
Issue #20009 has been updated by nagachika (Tomoyuki Chikanaga).
As I understand it, with commit:097d742a1ed53afb91e83aef01365d68b763357b, `Marshal.load` could raise an error on a byte stream that was dumped by an older version of Ruby. If this is right, I don't want to backport the change to the stable branches.
----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-113762
* Author: ippachi (Kazuya Hatanaka)
* Status: Closed
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.2: REQUIRED, 3.3: REQUIRED, 3.4: REQUIRED
----------------------------------------
^ permalink raw reply [flat|nested] 12+ messages in thread