ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
* [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
@ 2024-05-03 19:30 peterzhu2118 (Peter Zhu) via ruby-core
  2024-05-14  7:01 ` [ruby-core:117870] " peterzhu2118 (Peter Zhu) via ruby-core
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: peterzhu2118 (Peter Zhu) via ruby-core @ 2024-05-03 19:30 UTC (permalink / raw)
  To: ruby-core; +Cc: peterzhu2118 (Peter Zhu)

Issue #20470 has been reported by peterzhu2118 (Peter Zhu).

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:117870] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
@ 2024-05-14  7:01 ` peterzhu2118 (Peter Zhu) via ruby-core
  2024-05-17  0:55 ` [ruby-core:117902] " katei (Yuta Saito) via ruby-core
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: peterzhu2118 (Peter Zhu) via ruby-core @ 2024-05-14  7:01 UTC (permalink / raw)
  To: ruby-core; +Cc: peterzhu2118 (Peter Zhu)

Issue #20470 has been updated by peterzhu2118 (Peter Zhu).


I think the risk is fairly low, since the user has to compile Ruby with `--with-shared-gc` to enable this feature. Since they have to manually enable this feature at compile time, they should be aware of the possible security issues that comes with it.

If we want to completely mitigate this risk, we could instead use a command line argument rather than an environment variable.

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-108280

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:117902] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
  2024-05-14  7:01 ` [ruby-core:117870] " peterzhu2118 (Peter Zhu) via ruby-core
@ 2024-05-17  0:55 ` katei (Yuta Saito) via ruby-core
  2024-06-18 11:59 ` [ruby-core:118339] " wks (Kunshan Wang) via ruby-core
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: katei (Yuta Saito) via ruby-core @ 2024-05-17  0:55 UTC (permalink / raw)
  To: ruby-core; +Cc: katei (Yuta Saito)

Issue #20470 has been updated by katei (Yuta Saito).


+1 for extracting GC implementation of `gc.c` into a separate `gc_impl.c` file.

My motivation: Some of the use cases of Ruby on Wasm (e.g. edge computing platform) do not actually need to collect garbage objects during execution because such a process does not last so long. Splitting out the collector implementation allows us to have light-weight implementation like [no-op GC in JVM](https://openjdk.org/jeps/318). It will be a good fit for such use cases and also be helpful for performance analysis too.

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-108319

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:118339] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
  2024-05-14  7:01 ` [ruby-core:117870] " peterzhu2118 (Peter Zhu) via ruby-core
  2024-05-17  0:55 ` [ruby-core:117902] " katei (Yuta Saito) via ruby-core
@ 2024-06-18 11:59 ` wks (Kunshan Wang) via ruby-core
  2024-06-18 17:55 ` [ruby-core:118342] " peterzhu2118 (Peter Zhu) via ruby-core
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: wks (Kunshan Wang) via ruby-core @ 2024-06-18 11:59 UTC (permalink / raw)
  To: ruby-core; +Cc: wks (Kunshan Wang)

Issue #20470 has been updated by wks (Kunshan Wang).



I am working on the [mmtk-ruby](https://github.com/mmtk/mmtk-ruby/) project, and I think this patch is definitely on the right track.  This is similar to [JEP 304: Garbage Collector Interface](https://openjdk.org/jeps/304).  OpenJDK introduced a GC interface which made some other GC algorithms (including Epsilong, ZGC and Shenandoah) much easier to implement in OpenJDK.

In my opinion, whether the GC is extracted into a dedicated `.so` is secondary.  The most important thing is the *interface* between the VM and the GC.  A good VM-GC interface shall clearly define the responsibilities of the VM and the GC, helping the VM developers avoid making too much assumption about one particular GC implementation and hindering the adoption of other more advanced GC algorithms.

Depending on the programming language, the interface may include functions (including call-back functions), abstract classes, interfaces (as in Java), traits, and so on.  The GC and the VM each implements some functions, classes, etc. for the other party to use.  This patch defines the interface informally as the set of exported functions, and it doesn't even have a dedicated header file for `gc_impl.c`, or a "struct of function pointers" (which is roughly the counterpart of a C++ abstract class or a Java interface) for `gc_impl.c` to implement.  I believe this is just the first step of attempting to isolating the GC from the VM, and is not the final form of this patch.

# The contribution

The most important contribution of this patch is clarifying the responsibilities of the VM and the GC.  For example,

-   The GC is responsible for allocating the memory for the objects.  The VM function `newobj_of` calls `rb_gc_impl_new_obj` to allocate an object, and the GC takes care of details such as size classes (size pools) and mutator-local (ractor-local in Ruby) caches, if the particular GC implementation has the notion of size classes at all.  (FYI, bump-pointer allocators allocate objects of different sizes into the same contiguous region.)
-   The VM is responsible for root scanning and object scanning.  At the start of a GC, the GC calls `rb_gc_mark_roots` to identify roots.  When the GC visits an object, it calls `rb_gc_mark_children` to identify the children of an object, and calls `rb_gc_update_object_references` to forward the reference fields if it is a copying GC.  These functions were previously private functions in `gc.c`.  Now they are exported, and ready to be called by different GC implementations.

# "Binding": what the current `gc_impl.c` really is

The `gc_impl.c` in this patch is said to "contain the implementation of Ruby's GC".  As a developer of MMTk, I find that the current `gc_impl.c` not only contains the core of the GC (responsible for object allocation, marking, sweeping, compaction, ...), but also contains parts that are related to GC, but are also closely related to the VM.  That includes

-   The initialization of object header (flags and klass) and object fields (v1, v2, v3) and other metadata (including WB-protected flag) in `rb_gc_impl_new_obj`
    +   FYI, the MMTk core is only responsible for allocating an object of a given size/alignment.  The VM initializes the type metadata and fields.
-   The implementation of finalization, including "zombies" (including the linked list `objspace->heap_pages.deferred_final`), the `finalizer_table`, and the mechanism to *call* a function (`rb_gc_obj_free`) when an object is dead.
    +   FYI, the MMTk core only provides an API to (1) tell if an object is reachable, (2) forward a reference to a reachable object, and (3) resurrect an unreachable object (and its descendents.  The VM decides what to do when an object dies.
-   The implementation of the bidirectional object-to-ID mapping.
    +   FYI, the VM can implement it as a weak table, and the GC is only called to determine if a table entry points to a dead object.

**This is not a bad thing.**  Putting those VM-related things in `gc_impl.c` will **allow those things to be implemented in a way specific to a particular GC implementation**.  For example, if a GC implementation does not use lazy sweeping, then we can eagerly clear the entries of the `finalizer_table` that involve dead objects during a GC, instead of clearing them when the object is lazily swept (in `obj_free`).  As another example, if a GC implementation never moves any object, or if it is able to pin an object if its ID is observed, it can simply use the object's address as the ID instead of maintaining any hash tables.

In MMTk, we call such a component a "**VM binding**".  It is a bi-directional component that bridges the GC implementation (such as MMTk) and the VM (such as CRuby, OpenJDK, JikesRVM, etc.).  **We strongly recommend CRuby to use the term "binding", too.**  While MMTk calls it "VM binding", CRuby can call it "GC binding".

## Further splitting `gc_impl.c` into a binding and an actual implementation.

One possible improvement over this patch is further splitting the current `gc_impl.c` into two parts:

1.  The GC implementation that the Ruby VM can be completely agnostic of, and
2.  the bridge (binding) between the VM and the GC implementation.

The first part includes the code that allocates raw bytes from heap pages, manages bitmaps (such as the marking bits), does the marking, transitive closure, sweeping, compacting, etc.  The second part implements functions that are called by the VM, and wraps functions provided by the Ruby VM for the first part to call.

For example, `gc_mark_stacked_objects` implements the tracing loop that computes the transitive closure from roots.  It is an implementation detail of the GC, and the VM does not need to know it.  It belongs to the first part.

As another example, `rb_gc_impl_new_obj` is called by the VM (`newobj_of`), and it calls the low-level allocator (`newobj_alloc`) to allocate raw objects.  It also initializes Ruby fields.  Therefore, `rb_gc_impl_new_obj` is part of the binding.

As another example, the function `gc_mark_children` is called by `gc_mark_stacked_objects` from the GC.  It wraps the  `rb_gc_mark_children` provided by the VM, but also calls the GC-specific `gc_mark_set_parent`.  It sits between the GC and the VM, therefore belongs to the binding.

Ideally, if the actual GC implementation of CRuby is fully isolated, it can be used for other VMs like Lua, and we just need to make another binding for the Lua VM.

Conversely, when supporting other GC implementations in CRuby, such as MMTk and Lua's minimalist GC, we only need to develop the second part (the binding).  For MMTk, the first part (the actual GC implementation) is in the [mmtk-core](https://github.com/mmtk/mmtk-core/) written in Rust, and the second part (the binding) will be implemented by [mmtk-ruby](https://github.com/mmtk/mmtk-ruby/).  If we are curious enough to try to port Lua's GC into CRuby, we may just copy the source code into CRuby's source tree because it's so small (<2000 LOC) and it's written in C, too.

For performance reasons, a binding usually reimplements part of the GC in a VM-specific way (such as letting the JIT compiler emit the bump-pointer allocation fast path which theoretically belongs to the GC), and reimplements part of the VM in a GC-specific way (such as rewriting the object-scanning functions in Rust so that they can be inlined into MMTk's tracing loop, the single hottest loop during a GC).  I am expecting CRuby's GC binding to do the same for performance optimization, especially when using YJIT, in which case the overhead of GC will become more significant and the allocation fast paths will become more important once the mutator gets faster.  Evidently, *the GC interface should also involve the YJIT compiler* so that allocation fast paths and write barrier fast paths can be emitted in GC-specific machine instruction sequences.

# Problems

## Assumption of mark-sweep-compact

This patch also reveals problems in the GC interface.  The most obvious problem is the strong assumption of mark-sweep-compact GC algorithm.  There are `rb_gc_mark_children` and `rb_gc_update_object_references`.  The former is called during marking, and the second is called during compaction.  This approach is not suitable for evacuating GC algorithms, such as SemiSpace, Immix, etc.  When an evacuating GC visits a field, the object it points to is immediately moved (if this is the first time the object is visited), and the field is updated immediately to point to the new location.  As a workaround, we may call both `rb_gc_mark_children` and `rb_gc_update_object_references` when visiting each object.  And we may rewrite the object-scanning function in the implementation langauge of the GC (Rust for MMTk) for performance reasons.  But it is impractical for the GC binding to rewrite it for every type, especially complex types implemented in C (such as classes and iseqs), and it is impos
 sible to rewrite the "marking" and "compacting" functions written by third-party C extensions.

To solve the problem, we should stop abusing `rb_gc_mark_children` for implementing `rb_objspace_reachable_objects_from`, but instead make `rb_objspace_reachable_objects_from` a generic field-visiting function with a field-visitor callback that may potentially update fields, and use that to implement `rb_gc_mark_children`.  Such a generic field-visiting (and updating) function is what the GC really needs to scan an object.  Matthew Valentine-House has an experimental branch for this: https://github.com/eightbitraptor/ruby/tree/mvh-test-push-fpointer-to-ractor

## Limit of object size

Another problem worth mentioning is the limit of object size the GC can allocate.  The current GC implementation of CRuby cannot allocate objects larger than 640 bytes (the largest size pool), and that's why the VM still calls the function `rb_gc_size_allocatable_p` implemented in `gc_impl.c`.  Most other GC implementations, including the tiny GC implementation in Lua, don't have such limitation.  Such a limitation is also the source of the embedded-vs-heap design for many built-in types, including `T_OBJECT`, `T_ARRAY`, `T_STRING`, etc.  When switching to another GC implementation, CRuby cannot make the best use of the capability to allocate objects larger than 640 bytes unless the VM make changes accordingly.  For example, do not ask the GC if an object of a given size can be allocated (because the GC can allocate any size).  Instead, the VM should determine the optimal size of a String, Array or Object, either using predefined thresholds, or using profile data at run time.


----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-108845

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:118342] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
                   ` (2 preceding siblings ...)
  2024-06-18 11:59 ` [ruby-core:118339] " wks (Kunshan Wang) via ruby-core
@ 2024-06-18 17:55 ` peterzhu2118 (Peter Zhu) via ruby-core
  2025-01-29  8:08 ` [ruby-core:120818] " nobu (Nobuyoshi Nakada) via ruby-core
  2025-01-29 15:12 ` [ruby-core:120829] " peterzhu2118 (Peter Zhu) via ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: peterzhu2118 (Peter Zhu) via ruby-core @ 2024-06-18 17:55 UTC (permalink / raw)
  To: ruby-core; +Cc: peterzhu2118 (Peter Zhu)

Issue #20470 has been updated by peterzhu2118 (Peter Zhu).


Thank you for the comment @wks.

As you've mentioned, this is just the very beginning of this feature. There are shortcomings and shortcuts taken to make this first iteration simpler. If this feature is accepted, we plan on improving this feature in various ways including:

- Improving allocation so that initialization of the object's contents does not occur inside of the GC implementation, but rather done by the VM.
- Support allocation of arbitrary sized objects from the GC. Currently, many types (e.g. string, array, object, hashes) have assumptions about the upper and lower bound of object sizes. It might be a bit difficult to drop assumptions about lower bound due to performance reasons (it would be slower to check if a size can be allocated rather than just assuming that the size can be allocated), but we should at least drop assumptions about the upper bound.
- Remove the dependency on size classes in the VM. We currently need this because the object shape tree requires size classes to build the root shapes, so we need to change the implementation of object shapes to not depend on this.

There are many other suggestions in your comment, some of which may be non-trivial to implement, but we'll definitely keep them in mind and try to achieve as many as possible.

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-108849

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:120818] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
                   ` (3 preceding siblings ...)
  2024-06-18 17:55 ` [ruby-core:118342] " peterzhu2118 (Peter Zhu) via ruby-core
@ 2025-01-29  8:08 ` nobu (Nobuyoshi Nakada) via ruby-core
  2025-01-29 15:12 ` [ruby-core:120829] " peterzhu2118 (Peter Zhu) via ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: nobu (Nobuyoshi Nakada) via ruby-core @ 2025-01-29  8:08 UTC (permalink / raw)
  To: ruby-core; +Cc: nobu (Nobuyoshi Nakada)

Issue #20470 has been updated by nobu (Nobuyoshi Nakada).


This change still keeps `GC_DEBUG_STRESS_TO_CLASS` feature, but removed `GC.add_stress_to_class` method.
Is it intentional?

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-111686

* Author: peterzhu2118 (Peter Zhu)
* Status: Closed
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ruby-core:120829] [Ruby master Feature#20470] Extract Ruby's Garbage Collector
  2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
                   ` (4 preceding siblings ...)
  2025-01-29  8:08 ` [ruby-core:120818] " nobu (Nobuyoshi Nakada) via ruby-core
@ 2025-01-29 15:12 ` peterzhu2118 (Peter Zhu) via ruby-core
  5 siblings, 0 replies; 7+ messages in thread
From: peterzhu2118 (Peter Zhu) via ruby-core @ 2025-01-29 15:12 UTC (permalink / raw)
  To: ruby-core; +Cc: peterzhu2118 (Peter Zhu)

Issue #20470 has been updated by peterzhu2118 (Peter Zhu).


That was not intentional. I have fixed it in this PR: https://github.com/ruby/ruby/pull/12664

----------------------------------------
Feature #20470: Extract Ruby's Garbage Collector
https://bugs.ruby-lang.org/issues/20470#change-111696

* Author: peterzhu2118 (Peter Zhu)
* Status: Closed
----------------------------------------
# Extract Ruby's Garbage Collector

## Background

As described in [[Feature #20351]](https://bugs.ruby-lang.org/issues/20351), we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

## Implementation

GitHub PR: [10721](https://github.com/ruby/ruby/pull/10721)

In this patch, we have split the current `gc.c` file into two files: `gc.c` and `gc_impl.c`.

`gc.c` now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

`gc_impl.c` now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, `gc_impl.c` only uses public APIs in Ruby and a limited set of functions exposed in `gc.c`. This allows us to build `gc_impl.c` independently of Ruby and plug Ruby's GC into itself.

## Demonstration

After [checking out the branch](https://github.com/ruby/ruby/pull/10721), we can first configure with `--with-shared-gc`:

```bash
$ ./configure --with-shared-gc
...
$ make -j
...
```

Let's now change the slot size of the GC to 64 bytes:

```bash
$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c
```

We can compile `gc_impl.c` independently using the following commands for clang or gcc (you may have to change the last `-I` to match your architecture and platform):

```bash
$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c
```

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

```bash
$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}
```

We can now load our new GC using the `RUBY_GC_LIBRARY_PATH` environment variable (note that you may have to change the path to the DSO):

```bash
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
```

## Benchmark

Benchmarks were ran on commit [c78cebb](https://github.com/ruby/ruby/commit/c78cebb469fe56b45ee5daad16ae976b7760497c) on Ubuntu 22.04 using [yjit-bench](https://github.com/Shopify/yjit-bench/) on commit [cc5a76e](https://github.com/Shopify/yjit-bench/commit/cc5a76ef6240113650547bdfb260a87934208321).

Compiling gc_impl branch without `--with-shared-gc` (i.e. how the default Ruby is built), the benchmarks show little to no decrease in performance, with most of it being 0% to 1% slower:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

Compiling gc_impl branch with `--with-shared-gc` and loading Ruby's current GC using `RUBY_GC_LIBRARY_PATH`, the benchmarks are still fairly good with performance decrease of only around 1% to 2%:

```
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
```

## Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

## Future plans

- Refactor and improve `gc_impl.c`.
- Implement alternate GC implementations, such as the Epsilon GC and [MMTk](https://www.mmtk.io/) to prove that this API allows for alternate implementations of the GC.
- Reduce and improve the API of the GC implementation.
- Benchmark and improve performance of the DSO API.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-01-29 15:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-03 19:30 [ruby-core:117765] [Ruby master Feature#20470] Extract Ruby's Garbage Collector peterzhu2118 (Peter Zhu) via ruby-core
2024-05-14  7:01 ` [ruby-core:117870] " peterzhu2118 (Peter Zhu) via ruby-core
2024-05-17  0:55 ` [ruby-core:117902] " katei (Yuta Saito) via ruby-core
2024-06-18 11:59 ` [ruby-core:118339] " wks (Kunshan Wang) via ruby-core
2024-06-18 17:55 ` [ruby-core:118342] " peterzhu2118 (Peter Zhu) via ruby-core
2025-01-29  8:08 ` [ruby-core:120818] " nobu (Nobuyoshi Nakada) via ruby-core
2025-01-29 15:12 ` [ruby-core:120829] " peterzhu2118 (Peter Zhu) via ruby-core

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).