ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
@ 2025-06-04 19:31 tenderlovemaking (Aaron Patterson) via ruby-core
  2025-06-04 19:43 ` [ruby-core:122412] " jeremyevans0 (Jeremy Evans) via ruby-core
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: tenderlovemaking (Aaron Patterson) via ruby-core @ 2025-06-04 19:31 UTC (permalink / raw)
  To: ruby-core; +Cc: tenderlovemaking (Aaron Patterson)

Issue #21396 has been reported by tenderlovemaking (Aaron Patterson).

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122412] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
@ 2025-06-04 19:43 ` jeremyevans0 (Jeremy Evans) via ruby-core
  2025-06-04 19:48 ` [ruby-core:122413] " tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jeremyevans0 (Jeremy Evans) via ruby-core @ 2025-06-04 19:43 UTC (permalink / raw)
  To: ruby-core; +Cc: jeremyevans0 (Jeremy Evans)

Issue #21396 has been updated by jeremyevans0 (Jeremy Evans).


This is not a bug, IMO.  Using underlying functions instead of calling methods was one of the deliberate design decisions for core Set (see #21216), and how other core collection classes work. `Array.new(1, true)` does not call `Array#[]=`, it calls `rb_ary_store`.  

If we want to do this for `Set#initialize` and `Set.[]` for backwards compatibility, we should be consistent and call methods instead of underlying functions for every case where the methods were called in stdlib set.  That will make it slower, though.

FWIW, in the code path you are using in stdlib Set, `Set#add` is not called directly, you are relying on `Set#merge` calling it.  So this is at least a request to have `Set#initialize` call `Set#merge` and to have `Set#merge` call `Set#add` for every element.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113592

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122413] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
  2025-06-04 19:43 ` [ruby-core:122412] " jeremyevans0 (Jeremy Evans) via ruby-core
@ 2025-06-04 19:48 ` tenderlovemaking (Aaron Patterson) via ruby-core
  2025-06-05  1:08 ` [ruby-core:122415] " ko1 (Koichi Sasada) via ruby-core
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: tenderlovemaking (Aaron Patterson) via ruby-core @ 2025-06-04 19:48 UTC (permalink / raw)
  To: ruby-core; +Cc: tenderlovemaking (Aaron Patterson)

Issue #21396 has been updated by tenderlovemaking (Aaron Patterson).


jeremyevans0 (Jeremy Evans) wrote in #note-2:
> This is not a bug, IMO.  Using underlying functions instead of calling methods was one of the deliberate design decisions for core Set (see #21216), and how other core collection classes work. `Array.new(1, true)` does not call `Array#[]=`, it calls `rb_ary_store`.  
> 
> If we want to do this for `Set#initialize` and `Set.[]` for backwards compatibility, we should be consistent and call methods instead of underlying functions for every case where the methods were called in stdlib set.  That will make it slower, though.
> 
> FWIW, in the code path you are using in stdlib Set, `Set#add` is not called directly, you are relying on `Set#merge` calling it.  So this is at least a request to have `Set#initialize` call `Set#merge` and to have `Set#merge` call `Set#add` for every element.

I don't have an opinion, really. But this change in behavior is breaking our existing code, and I suspect we're not the only ones.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113593

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122415] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
  2025-06-04 19:43 ` [ruby-core:122412] " jeremyevans0 (Jeremy Evans) via ruby-core
  2025-06-04 19:48 ` [ruby-core:122413] " tenderlovemaking (Aaron Patterson) via ruby-core
@ 2025-06-05  1:08 ` ko1 (Koichi Sasada) via ruby-core
  2025-06-05  8:34 ` [ruby-core:122433] " Eregon (Benoit Daloze) via ruby-core
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: ko1 (Koichi Sasada) via ruby-core @ 2025-06-05  1:08 UTC (permalink / raw)
  To: ruby-core; +Cc: ko1 (Koichi Sasada)

Issue #21396 has been updated by ko1 (Koichi Sasada).


How about to redfine `initialize` on subclass of `Set` to call `#add`?

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113596

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122433] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (2 preceding siblings ...)
  2025-06-05  1:08 ` [ruby-core:122415] " ko1 (Koichi Sasada) via ruby-core
@ 2025-06-05  8:34 ` Eregon (Benoit Daloze) via ruby-core
  2025-06-05 13:43 ` [ruby-core:122456] " knu (Akinori MUSHA) via ruby-core
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-06-05  8:34 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #21396 has been updated by Eregon (Benoit Daloze).


Is there any public code in some gem or so that relies on this? (the example is rather synthetic)

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113614

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122456] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (3 preceding siblings ...)
  2025-06-05  8:34 ` [ruby-core:122433] " Eregon (Benoit Daloze) via ruby-core
@ 2025-06-05 13:43 ` knu (Akinori MUSHA) via ruby-core
  2025-06-05 17:05 ` [ruby-core:122464] " tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: knu (Akinori MUSHA) via ruby-core @ 2025-06-05 13:43 UTC (permalink / raw)
  To: ruby-core; +Cc: knu (Akinori MUSHA)

Issue #21396 has been updated by knu (Akinori MUSHA).


I've always created custom variants of Set too, and I don't think it's rare to find these in in-house codebases.



----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113645

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122464] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (4 preceding siblings ...)
  2025-06-05 13:43 ` [ruby-core:122456] " knu (Akinori MUSHA) via ruby-core
@ 2025-06-05 17:05 ` tenderlovemaking (Aaron Patterson) via ruby-core
  2025-07-08 14:57 ` [ruby-core:122681] " knu (Akinori MUSHA) via ruby-core
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: tenderlovemaking (Aaron Patterson) via ruby-core @ 2025-06-05 17:05 UTC (permalink / raw)
  To: ruby-core; +Cc: tenderlovemaking (Aaron Patterson)

Issue #21396 has been updated by tenderlovemaking (Aaron Patterson).


ko1 (Koichi Sasada) wrote in #note-4:
> How about to redfine `initialize` on subclass of `Set` to call `#add`?

I think we could, but that means people have to change their code when upgrading.

Eregon (Benoit Daloze) wrote in #note-5:
> Is there any public code in some gem or so that relies on this? (the example is rather synthetic)

It's kind of hard to search for this on GitHub, but I was able to find two similar-ish examples:

[This one](https://github.com/alunny/confetti/blob/a10fab1180ab5a553bcda908834240ad277cd168/lib/typedset.rb#L13-L15) would not raise an exception in the same way, and [this one](https://github.com/bassnode/liner_notes/blob/26a899af271453c9ca7f95c11cee9e272189b456/lib/links.rb#L17-L21) could potentially be missing entries from its `@hash_lookup` instance variable.

Maybe we could publish a gem that just has a copy of the current `set.rb` file, but with `Set` renamed to `RbSet` or something. Then if people run in to issues they can use the gem and change the superclass to `RbSet`.  I think it would fix our case at work and probably the cases referenced above.  I agree with @jeremyevans0's points, but I'm sure this is going to cause friction for people upgrading and it would be nice if we can make it as easy as possible to upgrade.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113652

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122681] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (5 preceding siblings ...)
  2025-06-05 17:05 ` [ruby-core:122464] " tenderlovemaking (Aaron Patterson) via ruby-core
@ 2025-07-08 14:57 ` knu (Akinori MUSHA) via ruby-core
  2025-07-09  0:41 ` [ruby-core:122692] " jeremyevans0 (Jeremy Evans) via ruby-core
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: knu (Akinori MUSHA) via ruby-core @ 2025-07-08 14:57 UTC (permalink / raw)
  To: ruby-core; +Cc: knu (Akinori MUSHA)

Issue #21396 has been updated by knu (Akinori MUSHA).


Considering the feedback we've received about compatibility in the new experimental Set implementation, it may be in our best interest to revert to the pure-Ruby version.

If improving performance and reducing memory footprint remain crucial, one option would be to keep only the underlying data structure and a few core methods from set.c, while leaving the rest written in pure Ruby.

I also considered introducing a separate base class (RbSet, Set::Ruby, Set::Base, or whatever) for compatibility, but that approach feels neither elegant nor the Ruby way, and it risks becoming permanent technical debt.  Since 3.5 hasn't been finalized yet, I'm leaning toward making Set itself subclass-friendly again.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113961

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122692] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (6 preceding siblings ...)
  2025-07-08 14:57 ` [ruby-core:122681] " knu (Akinori MUSHA) via ruby-core
@ 2025-07-09  0:41 ` jeremyevans0 (Jeremy Evans) via ruby-core
  2025-07-09  4:57 ` [ruby-core:122693] " knu (Akinori MUSHA) via ruby-core
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jeremyevans0 (Jeremy Evans) via ruby-core @ 2025-07-09  0:41 UTC (permalink / raw)
  To: ruby-core; +Cc: jeremyevans0 (Jeremy Evans)

Issue #21396 has been updated by jeremyevans0 (Jeremy Evans).


knu (Akinori MUSHA) wrote in #note-8:
> Considering the feedback we've received about compatibility in the new experimental Set implementation, it may be in our best interest to revert to the pure-Ruby version.
> 
> If improving performance and reducing memory footprint remain crucial, one option would be to keep only the underlying data structure and a few core methods from set.c, while leaving the rest written in pure Ruby.

I don't think reverting to the pure Ruby version is in our best interests.  I do understand the backwards compatibility issues, though to be honest, all issues raised so far are cases where users were relying on implementation details (of the form method `a` calls method `b`).  I think the benefits of making Set similar to Array and Hash exceed the costs of the backwards compatibility issues.  It's not just performance and reduced memory usage, but also thread safety.

Consider `add?`

```ruby
  def add?(o)
    add(o) unless include?(o)
  end
```

Keeping backwards compatibility (`add?` calls `include?` and possibly `add`) here requires giving up thread safety.  With the pure Ruby implementation, `add?` can return a value that was already added to the set, if another thread concurrently added it to the set between the `include?` and `add` calls.  This type of thread safety issue can perhaps be accepted from a standard library, but should not be accepted for a core class. Since Ruby 3.2, Set has been closer to core class than standard library, due to the autoload.

As you mentioned, restoring backwards compatibility is possible while keeping the same optimized data structure. It's mostly a question of how far we want to go. If we want to support backwards compatibility for Set subclasses and method overrides, I think we should assess each situation on a case-by-case basis. I can implement the changes to have `Set#initialize` call `Set#add` (this issue) and have `Set.[]` call `Set#initialize` (#21375), if that is the behavior you prefer. We can officially support such method overriding by adding tests for the behavior, so users who override the methods are not relying on implementation details.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113970

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122693] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (7 preceding siblings ...)
  2025-07-09  0:41 ` [ruby-core:122692] " jeremyevans0 (Jeremy Evans) via ruby-core
@ 2025-07-09  4:57 ` knu (Akinori MUSHA) via ruby-core
  2025-07-09  6:35 ` [ruby-core:122694] " jeremyevans0 (Jeremy Evans) via ruby-core
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: knu (Akinori MUSHA) via ruby-core @ 2025-07-09  4:57 UTC (permalink / raw)
  To: ruby-core; +Cc: knu (Akinori MUSHA)

Issue #21396 has been updated by knu (Akinori MUSHA).


Jeremy, thanks for the reply.

Your point about thread-safety is well taken.  It is an important advantage.  As a possible compromise, we could keep the C backing and switch the behavior of methods when the class is subclassed in exchange for losing thread-safety.  It's just an idea and not the cleanest solution, though.

The way Set was written, the accompanying tests, and the existence of OrderedSet all make it clear that this behaviour was a deliberate design choice, so after those long-standing signals, I feel like it's hard to tell users that those were implementation details they shouldn't have relied on.


----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113971

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122694] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (8 preceding siblings ...)
  2025-07-09  4:57 ` [ruby-core:122693] " knu (Akinori MUSHA) via ruby-core
@ 2025-07-09  6:35 ` jeremyevans0 (Jeremy Evans) via ruby-core
  2025-07-09  8:12 ` [ruby-core:122695] " Eregon (Benoit Daloze) via ruby-core
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jeremyevans0 (Jeremy Evans) via ruby-core @ 2025-07-09  6:35 UTC (permalink / raw)
  To: ruby-core; +Cc: jeremyevans0 (Jeremy Evans)

Issue #21396 has been updated by jeremyevans0 (Jeremy Evans).


knu (Akinori MUSHA) wrote in #note-10:
> Jeremy, thanks for the reply.
> 
> Your point about thread-safety is well taken.  It is an important advantage.  As a possible compromise, we could keep the C backing and switch the behavior of methods when the class is subclassed in exchange for losing thread-safety.  It's just an idea and not the cleanest solution, though.

I think that is a reasonable compromise.  It keeps the advantages of core Set when not subclassed, and keeps backwards compatibility for subclasses (at least, subclasses that don't access `@hash`).  Please let me know if you would like me to work on that.

> The way Set was written, the accompanying tests, and the existence of OrderedSet all make it clear that this behaviour was a deliberate design choice, so after those long-standing signals, I feel like it's hard to tell users that those were implementation details they shouldn't have relied on.

Thank you for providing that context.  Since there were not explicit tests that methods should call other methods, my assumption was that the behavior was an implementation detail and not by design.  If we consider these to be deliberate design decisions and not implementation details, I recommend we add tests for them. I can handle that if you would like.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113972

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:122695] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (9 preceding siblings ...)
  2025-07-09  6:35 ` [ruby-core:122694] " jeremyevans0 (Jeremy Evans) via ruby-core
@ 2025-07-09  8:12 ` Eregon (Benoit Daloze) via ruby-core
  2025-11-17  9:03 ` [ruby-core:123824] " Eregon (Benoit Daloze) via ruby-core
  2025-11-18  3:20 ` [ruby-core:123840] " jeremyevans0 (Jeremy Evans) via ruby-core
  12 siblings, 0 replies; 14+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-07-09  8:12 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #21396 has been updated by Eregon (Benoit Daloze).


Regarding thread-safety, for `add?` the only thing is the return value where the old implementation might potentially return the Set even though the element was added concurrently.
It's still thread-safe in that it doesn't corrupt the Set or anything like that, and the given element is always added to the Set when the method returns.

> Since there were not explicit tests that methods should call other methods

There were a few ruby/spec specs which you disabled in https://github.com/ruby/ruby/pull/13074/files#diff-00365d65577dd7b3e357e99b895bb8e5a26d699c33a1afabc1a48cd91a8c5914
Those specs or at least part of those specs were added in https://github.com/ruby/spec/pull/629 to ensure proper interop with Set-like classes, as tested with SetSpecs::SetLike and used e.g. in the persistent-dmnd gem.
https://bugs.ruby-lang.org/issues/15240 is the related issue to make this kind of interop better defined and less hacky.
Probably interop with Set-like classes not inheriting from Set is too much of a ask, but it might be good to re-enable these specs and change `SetSpecs::SetLike` to inherit from `Set`.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-113973

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:123824] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (10 preceding siblings ...)
  2025-07-09  8:12 ` [ruby-core:122695] " Eregon (Benoit Daloze) via ruby-core
@ 2025-11-17  9:03 ` Eregon (Benoit Daloze) via ruby-core
  2025-11-18  3:20 ` [ruby-core:123840] " jeremyevans0 (Jeremy Evans) via ruby-core
  12 siblings, 0 replies; 14+ messages in thread
From: Eregon (Benoit Daloze) via ruby-core @ 2025-11-17  9:03 UTC (permalink / raw)
  To: ruby-core; +Cc: Eregon (Benoit Daloze)

Issue #21396 has been updated by Eregon (Benoit Daloze).


jeremyevans0 (Jeremy Evans) wrote in #note-11:
> knu (Akinori MUSHA) wrote in #note-10:
> > Jeremy, thanks for the reply.
> > 
> > Your point about thread-safety is well taken.  It is an important advantage.  As a possible compromise, we could keep the C backing and switch the behavior of methods when the class is subclassed in exchange for losing thread-safety.  It's just an idea and not the cleanest solution, though.
> 
> I think that is a reasonable compromise.  It keeps the advantages of core Set when not subclassed, and keeps backwards compatibility for subclasses (at least, subclasses that don't access `@hash`).  Please let me know if you would like me to work on that.

Yes, this sounds like a good way, it avoids breaking code and keeps performance optimal for Set.
The only downside is some extra complexity but only some conditions in a couple methods, no big deal.
And the thread-safety aspect is pretty minor as explained in my other comment (and obviously it wasn't guaranteed before so that's not breaking anything).

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-115227

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [ruby-core:123840] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in
  2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
                   ` (11 preceding siblings ...)
  2025-11-17  9:03 ` [ruby-core:123824] " Eregon (Benoit Daloze) via ruby-core
@ 2025-11-18  3:20 ` jeremyevans0 (Jeremy Evans) via ruby-core
  12 siblings, 0 replies; 14+ messages in thread
From: jeremyevans0 (Jeremy Evans) via ruby-core @ 2025-11-18  3:20 UTC (permalink / raw)
  To: ruby-core; +Cc: jeremyevans0 (Jeremy Evans)

Issue #21396 has been updated by jeremyevans0 (Jeremy Evans).


Eregon (Benoit Daloze) wrote in #note-13:
> jeremyevans0 (Jeremy Evans) wrote in #note-11:
> > knu (Akinori MUSHA) wrote in #note-10:
> > > Jeremy, thanks for the reply.
> > > 
> > > Your point about thread-safety is well taken.  It is an important advantage.  As a possible compromise, we could keep the C backing and switch the behavior of methods when the class is subclassed in exchange for losing thread-safety.  It's just an idea and not the cleanest solution, though.
> > 
> > I think that is a reasonable compromise.  It keeps the advantages of core Set when not subclassed, and keeps backwards compatibility for subclasses (at least, subclasses that don't access `@hash`).  Please let me know if you would like me to work on that.
> 
> Yes, this sounds like a good way, it avoids breaking code and keeps performance optimal for Set.
> The only downside is some extra complexity but only some conditions in a couple methods, no big deal.
> And the thread-safety aspect is pretty minor as explained in my other comment (and obviously it wasn't guaranteed before so that's not breaking anything).

I submitted a pull request that implements what @knu suggested: https://github.com/ruby/ruby/pull/15228

It allows subclasses to use the new core implementation and avoid the backwards compatible layer if they subclass from `Set::CoreSet`.

----------------------------------------
Bug #21396: Set#initialize should call Set#add on items passed in
https://bugs.ruby-lang.org/issues/21396#change-115239

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
```ruby
class Foo < Set
  def add(item) = super(item.bytesize)
end

x = Foo.new(["foo"])
p x
p x.include?(3)
```

On Ruby 3.4 the output is this:

```
> ruby -v test.rb
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
#<Foo: {3}>
true
```

On Ruby master the output is this:

```
> make run
./miniruby -I./lib -I. -I.ext/common  -r./arm64-darwin24-fake  ./test.rb 
#<Set: {"foo"}>
false
```

The bug is that `initialize` is not calling `add` for the elements passed in, so the subclass doesn't get a chance to change them.

I've sent a PR here: https://github.com/ruby/ruby/pull/13518



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-11-18  3:21 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-04 19:31 [ruby-core:122411] [Ruby Bug#21396] Set#initialize should call Set#add on items passed in tenderlovemaking (Aaron Patterson) via ruby-core
2025-06-04 19:43 ` [ruby-core:122412] " jeremyevans0 (Jeremy Evans) via ruby-core
2025-06-04 19:48 ` [ruby-core:122413] " tenderlovemaking (Aaron Patterson) via ruby-core
2025-06-05  1:08 ` [ruby-core:122415] " ko1 (Koichi Sasada) via ruby-core
2025-06-05  8:34 ` [ruby-core:122433] " Eregon (Benoit Daloze) via ruby-core
2025-06-05 13:43 ` [ruby-core:122456] " knu (Akinori MUSHA) via ruby-core
2025-06-05 17:05 ` [ruby-core:122464] " tenderlovemaking (Aaron Patterson) via ruby-core
2025-07-08 14:57 ` [ruby-core:122681] " knu (Akinori MUSHA) via ruby-core
2025-07-09  0:41 ` [ruby-core:122692] " jeremyevans0 (Jeremy Evans) via ruby-core
2025-07-09  4:57 ` [ruby-core:122693] " knu (Akinori MUSHA) via ruby-core
2025-07-09  6:35 ` [ruby-core:122694] " jeremyevans0 (Jeremy Evans) via ruby-core
2025-07-09  8:12 ` [ruby-core:122695] " Eregon (Benoit Daloze) via ruby-core
2025-11-17  9:03 ` [ruby-core:123824] " Eregon (Benoit Daloze) via ruby-core
2025-11-18  3:20 ` [ruby-core:123840] " jeremyevans0 (Jeremy Evans) via ruby-core

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).