* [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors
@ 2024-08-08 10:07 olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 3:31 ` [ruby-core:118813] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: olleolleolle (Olle Jonsson) via ruby-core @ 2024-08-08 10:07 UTC (permalink / raw)
To: ruby-core; +Cc: olleolleolle (Olle Jonsson)
Issue #20669 has been reported by olleolleolle (Olle Jonsson).
----------------------------------------
Feature #20669: Add error classes to differentiate Marshal ArgumentErrors
https://bugs.ruby-lang.org/issues/20669
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Currently, Marshal.load raises ArgumentErrors on failure.
In the memcache client Dalli, there is [a regular expression to check for a few of the ArgumentErrors that can be raised during Marshal.load](https://github.com/petergoldstein/dalli/blob/v3.2.8/lib/dalli/protocol/value_serializer.rb#L41).
In order to make it easier to work with, perhaps new individual error classes deriving from ArgumentError could be introduced in Marshal? Perhaps a MarshalError base class deriving from ArgumentError? Something backwards-compatible.
[Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:118813] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
@ 2024-08-09 3:31 ` kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-08-09 7:24 ` [ruby-core:118816] " olleolleolle (Olle Jonsson) via ruby-core
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core @ 2024-08-09 3:31 UTC (permalink / raw)
To: ruby-core; +Cc: kjtsanaktsidis (KJ Tsanaktsidis)
Issue #20669 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).
I agree it might be nice if Marshal had its own error class like `MarshalError` or some such. However, I don't really see why we need to _differentiate_ between different kinds of unmarshaling errors? Any one of these things from dalli ("marshal data too short", "incompatible marshal file format", ...) are in the realm of "a human needs to look at the data and figure out what went wrong", I would have thought?
I.e. it's useful to know that an exception did come from Marshal, but I don't think I quite see the value in knowing _what_ exception came from Marshal? Maybe I'm missing something obvious and you can enlighten me :)
----------------------------------------
Feature #20669: Add error classes to differentiate Marshal ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109371
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Currently, Marshal.load raises ArgumentErrors on failure.
In the memcache client Dalli, there is [a regular expression to check for a few of the ArgumentErrors that can be raised during Marshal.load](https://github.com/petergoldstein/dalli/blob/v3.2.8/lib/dalli/protocol/value_serializer.rb#L41).
In order to make it easier to work with, perhaps new individual error classes deriving from ArgumentError could be introduced in Marshal? Perhaps a MarshalError base class deriving from ArgumentError? Something backwards-compatible.
[Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:118816] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 3:31 ` [ruby-core:118813] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
@ 2024-08-09 7:24 ` olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 7:27 ` [ruby-core:118817] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: olleolleolle (Olle Jonsson) via ruby-core @ 2024-08-09 7:24 UTC (permalink / raw)
To: ruby-core; +Cc: olleolleolle (Olle Jonsson)
Issue #20669 has been updated by olleolleolle (Olle Jonsson).
kjtsanaktsidis (KJ Tsanaktsidis) wrote in #note-1:
> I agree it might be nice if Marshal had its own error class like `MarshalError` or some such. However, I don't really see why we need to _differentiate_ between different kinds of unmarshaling errors? Any one of these things from dalli ("marshal data too short", "incompatible marshal file format", ...) are in the realm of "a human needs to look at the data and figure out what went wrong", I would have thought?
>
> I.e. it's useful to know that an exception did come from Marshal, but I don't think I quite see the value in knowing _what_ exception came from Marshal? Maybe I'm missing something obvious and you can enlighten me :)
Thanks! Alright, that'd scope it down, making the patch smaller and easier to deal with. I will make an edit to my proposal to be about adding a "is the data you are trying to load broken?" ArgumentError subclass called MarshalError.
----------------------------------------
Feature #20669: Add error classes to differentiate Marshal ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109374
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Currently, Marshal.load raises ArgumentErrors on failure.
In the memcache client Dalli, there is [a regular expression to check for a few of the ArgumentErrors that can be raised during Marshal.load](https://github.com/petergoldstein/dalli/blob/v3.2.8/lib/dalli/protocol/value_serializer.rb#L41).
In order to make it easier to work with, perhaps new individual error classes deriving from ArgumentError could be introduced in Marshal? Perhaps a MarshalError base class deriving from ArgumentError? Something backwards-compatible.
[Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:118817] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 3:31 ` [ruby-core:118813] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-08-09 7:24 ` [ruby-core:118816] " olleolleolle (Olle Jonsson) via ruby-core
@ 2024-08-09 7:27 ` olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 15:41 ` [ruby-core:118821] " byroot (Jean Boussier) via ruby-core
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: olleolleolle (Olle Jonsson) via ruby-core @ 2024-08-09 7:27 UTC (permalink / raw)
To: ruby-core; +Cc: olleolleolle (Olle Jonsson)
Issue #20669 has been updated by olleolleolle (Olle Jonsson).
Description updated
Changed the proposal to be about one error class. Thanks, KJ!
----------------------------------------
Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109376
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Currently, Marshal.load raises ArgumentErrors on failure.
In the memcache client Dalli, there is [a regular expression to check for a few of the ArgumentErrors that can be raised during Marshal.load](https://github.com/petergoldstein/dalli/blob/v3.2.8/lib/dalli/protocol/value_serializer.rb#L41).
In order to make it easier to work with, perhaps a new error class deriving from ArgumentError could be introduced in Marshal? Say, Marshal::MarshalError deriving from ArgumentError? Something backwards-compatible.
[Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:118821] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
` (2 preceding siblings ...)
2024-08-09 7:27 ` [ruby-core:118817] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
@ 2024-08-09 15:41 ` byroot (Jean Boussier) via ruby-core
2024-08-09 19:31 ` [ruby-core:118822] " Eregon (Benoit Daloze) via ruby-core
2024-10-03 9:46 ` [ruby-core:119431] " olleolleolle (Olle Jonsson) via ruby-core
5 siblings, 0 replies; 7+ messages in thread
From: byroot (Jean Boussier) via ruby-core @ 2024-08-09 15:41 UTC (permalink / raw)
To: ruby-core; +Cc: byroot (Jean Boussier)
Issue #20669 has been updated by byroot (Jean Boussier).
> We may be able to create a sub-class of class LoadError < ArgumentError but it fundamentally seems wrong to me (should really be a RuntimeError).
There's little way around it, lots of code today expects `ArgumentError`, if you start raising something that doesn't inherit from `ArgumentError` you'll break that code.
Also I don't think it's wrong for it to be an argument error. You supply a string argument that isn't valid marshal data, so argument error make sense, just like how `Integer("abcd")` raises `ArgumentError`.
----------------------------------------
Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109387
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class.
## Background
There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g.
```
> Marshal.load("foobar")
<internal:marshal>:34:in `load': incompatible marshal file format (can't be read) (TypeError)
> Marshal.load(Marshal.dump(Object.new).slice(0, 10))
<internal:marshal>:34:in `load': marshal data too short (ArgumentError)
> MyThing = Struct.new(:name, :age)
=> MyThing
> Marshal.dump(MyThing.new("Alice", 20))
=> "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
(in a separate session)
> Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
<internal:marshal>:34:in `load': undefined class/module MyThing (ArgumentError)
```
When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`.
Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument.
Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72):
```ruby
# TODO: Some of these error messages need to be validated. It's not obvious
# that all of them are actually generated by the invoked code
# in current systems
# rubocop:disable Layout/LineLength
TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze
ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze
NAME_ERR_STR = 'uninitialized constant'
# rubocop:enable Layout/LineLength
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue TypeError => e
filter_type_error(e)
rescue ArgumentError => e
filter_argument_error(e)
rescue NameError => e
filter_name_error(e)
end
def filter_type_error(err)
raise err unless TYPE_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_argument_error(err)
raise err unless ARGUMENT_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_name_error(err)
raise err unless err.message.include?(NAME_ERR_STR)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
```
Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production.
## Proposal
Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`.
```ruby
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue Marshal::LoadError => error
raise UnmarshalError, "Unable to unmarshal value!"
end
```
One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`).
## See also
- [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
- https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar).
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:118822] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
` (3 preceding siblings ...)
2024-08-09 15:41 ` [ruby-core:118821] " byroot (Jean Boussier) via ruby-core
@ 2024-08-09 19:31 ` Eregon (Benoit Daloze) via ruby-core
2024-10-03 9:46 ` [ruby-core:119431] " olleolleolle (Olle Jonsson) via ruby-core
5 siblings, 0 replies; 7+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2024-08-09 19:31 UTC (permalink / raw)
To: ruby-core; +Cc: Eregon (Benoit Daloze)
Issue #20669 has been updated by Eregon (Benoit Daloze).
Isn't it fair to assume any Exception from calling `Marshal.load` is an error while de-marshaling the data?
I think it is.
If it's about `serializer.load(value)` and `serializer` not being always Marshal, that's easy to deal with: just do a `if serializer == Marshal` and `rescue` things differently there if you want to wrap in a `UnmarshalError`.
----------------------------------------
Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109388
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class.
## Background
There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g.
```
> Marshal.load("foobar")
<internal:marshal>:34:in `load': incompatible marshal file format (can't be read) (TypeError)
> Marshal.load(Marshal.dump(Object.new).slice(0, 10))
<internal:marshal>:34:in `load': marshal data too short (ArgumentError)
> MyThing = Struct.new(:name, :age)
=> MyThing
> Marshal.dump(MyThing.new("Alice", 20))
=> "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
(in a separate session)
> Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
<internal:marshal>:34:in `load': undefined class/module MyThing (ArgumentError)
```
When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`.
Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument.
Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72):
```ruby
# TODO: Some of these error messages need to be validated. It's not obvious
# that all of them are actually generated by the invoked code
# in current systems
# rubocop:disable Layout/LineLength
TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze
ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze
NAME_ERR_STR = 'uninitialized constant'
# rubocop:enable Layout/LineLength
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue TypeError => e
filter_type_error(e)
rescue ArgumentError => e
filter_argument_error(e)
rescue NameError => e
filter_name_error(e)
end
def filter_type_error(err)
raise err unless TYPE_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_argument_error(err)
raise err unless ARGUMENT_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_name_error(err)
raise err unless err.message.include?(NAME_ERR_STR)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
```
Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production.
## Proposal
Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`.
```ruby
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue Marshal::LoadError => error
raise UnmarshalError, "Unable to unmarshal value!"
end
```
One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`).
## See also
- [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
- https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar).
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
* [ruby-core:119431] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
` (4 preceding siblings ...)
2024-08-09 19:31 ` [ruby-core:118822] " Eregon (Benoit Daloze) via ruby-core
@ 2024-10-03 9:46 ` olleolleolle (Olle Jonsson) via ruby-core
5 siblings, 0 replies; 7+ messages in thread
From: olleolleolle (Olle Jonsson) via ruby-core @ 2024-10-03 9:46 UTC (permalink / raw)
To: ruby-core; +Cc: olleolleolle (Olle Jonsson)
Issue #20669 has been updated by olleolleolle (Olle Jonsson).
In the meantime, Dalli merged a change that sounds a lot like Eregon's comment. https://github.com/petergoldstein/dalli/pull/1011
My immediate needs are met, and I will close this feature request.
----------------------------------------
Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-110046
* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class.
## Background
There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g.
```
> Marshal.load("foobar")
<internal:marshal>:34:in `load': incompatible marshal file format (can't be read) (TypeError)
> Marshal.load(Marshal.dump(Object.new).slice(0, 10))
<internal:marshal>:34:in `load': marshal data too short (ArgumentError)
> MyThing = Struct.new(:name, :age)
=> MyThing
> Marshal.dump(MyThing.new("Alice", 20))
=> "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
(in a separate session)
> Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
<internal:marshal>:34:in `load': undefined class/module MyThing (ArgumentError)
```
When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`.
Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument.
Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72):
```ruby
# TODO: Some of these error messages need to be validated. It's not obvious
# that all of them are actually generated by the invoked code
# in current systems
# rubocop:disable Layout/LineLength
TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze
ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze
NAME_ERR_STR = 'uninitialized constant'
# rubocop:enable Layout/LineLength
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue TypeError => e
filter_type_error(e)
rescue ArgumentError => e
filter_argument_error(e)
rescue NameError => e
filter_name_error(e)
end
def filter_type_error(err)
raise err unless TYPE_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_argument_error(err)
raise err unless ARGUMENT_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_name_error(err)
raise err unless err.message.include?(NAME_ERR_STR)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
```
Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production.
## Proposal
Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`.
```ruby
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue Marshal::LoadError => error
raise UnmarshalError, "Unable to unmarshal value!"
end
```
One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`).
## See also
- [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
- https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar).
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-10-03 9:46 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-08 10:07 [ruby-core:118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 3:31 ` [ruby-core:118813] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-08-09 7:24 ` [ruby-core:118816] " olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 7:27 ` [ruby-core:118817] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors olleolleolle (Olle Jonsson) via ruby-core
2024-08-09 15:41 ` [ruby-core:118821] " byroot (Jean Boussier) via ruby-core
2024-08-09 19:31 ` [ruby-core:118822] " Eregon (Benoit Daloze) via ruby-core
2024-10-03 9:46 ` [ruby-core:119431] " olleolleolle (Olle Jonsson) via ruby-core
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).