* [ruby-core:120435] [Ruby master Bug#20990] Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
@ 2024-12-28 4:06 tompng (tomoya ishida) via ruby-core
2024-12-28 11:36 ` [ruby-core:120438] " tompng (tomoya ishida) via ruby-core
0 siblings, 1 reply; 2+ messages in thread
From: tompng (tomoya ishida) via ruby-core @ 2024-12-28 4:06 UTC (permalink / raw)
To: ruby-core; +Cc: tompng (tomoya ishida)
Issue #20990 has been reported by tompng (tomoya ishida).
----------------------------------------
Bug #20990: Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
https://bugs.ruby-lang.org/issues/20990
* Author: tompng (tomoya ishida)
* Status: Open
* ruby -v: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +MN [arm64-darwin22]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
IRB crashes when code is tokenized into tokens that contain an invalid byte sequence.
~~~ruby
Ripper.tokenize '"\C-\あ"'
#=> ["\"", "\\C-\\\xE3\x81", "\x82", "\""]
~~~
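To check whether a given Ruby build is affected, a minimal sketch like the following may help. The helper name `invalid_tokens` is hypothetical (it is not part of Ripper); it only relies on `Ripper.tokenize` and `String#valid_encoding?`. The result depends on the Ruby version, so no expected output is shown.

```ruby
require 'ripper'

# Hypothetical helper (not part of Ripper): returns the tokens whose bytes
# are not a valid sequence in the token's own encoding.
def invalid_tokens(source)
  Ripper.tokenize(source).reject(&:valid_encoding?)
end

src = '"\C-\あ"'

# Print every token, then only the broken ones. On an affected build the
# second list is non-empty because the multibyte character was split.
p Ripper.tokenize(src)
p invalid_tokens(src)
```

An empty second list would suggest the tokenizer no longer splits the multibyte character on that build.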
I think the error raised when evaluating `"\C-\あ"` should be `Invalid escape character syntax`, just as it is for `"\C-あ"`.
~~~
$ ./ruby --parser=parse.y -e '"\C-あ"'
-e:1: Invalid escape character syntax
"\C-あ"
$ ./ruby --parser=parse.y -e '"\C-\あ"'
-e:1: invalid multibyte char (UTF-8)
-e:1: invalid multibyte char (UTF-8)
./ruby: compile error (SyntaxError)
~~~
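The two messages can also be compared from within Ruby. This is only a sketch: the helper name `syntax_error_message` is hypothetical, and `RubyVM::InstructionSequence.compile` uses whichever parser the running Ruby defaults to, so the messages may differ from the `--parser=parse.y` output above.

```ruby
# Hypothetical helper: compile +source+ and return the SyntaxError message,
# or nil when the source compiles without a syntax error.
def syntax_error_message(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue SyntaxError => e
  e.message
end

# Compare the message for the plain multibyte character with the message
# for the backslash-escaped one.
['"\C-あ"', '"\C-\あ"'].each do |src|
  puts "#{src} => #{syntax_error_message(src).inspect}"
end
```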
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
* [ruby-core:120438] [Ruby master Bug#20990] Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
2024-12-28 4:06 [ruby-core:120435] [Ruby master Bug#20990] Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence tompng (tomoya ishida) via ruby-core
@ 2024-12-28 11:36 ` tompng (tomoya ishida) via ruby-core
0 siblings, 0 replies; 2+ messages in thread
From: tompng (tomoya ishida) via ruby-core @ 2024-12-28 11:36 UTC (permalink / raw)
To: ruby-core; +Cc: tompng (tomoya ishida)
Issue #20990 has been updated by tompng (tomoya ishida).
Pull request: https://github.com/ruby/ruby/pull/12484
----------------------------------------
Bug #20990: Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
https://bugs.ruby-lang.org/issues/20990#change-111214
* Author: tompng (tomoya ishida)
* Status: Open
* ruby -v: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +MN [arm64-darwin22]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------