* [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
@ 2024-02-05 4:59 hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-05 5:55 ` [ruby-core:116582] " mame (Yusuke Endoh) via ruby-core
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-02-05 4:59 UTC
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been reported by hanazuki (Kasumi Hanazuki).
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237
* Author: hanazuki (Kasumi Hanazuki)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
## Background
[unshare(2)](https://man7.org/linux/man-pages/man2/unshare.2.html) is a Linux syscall that moves the calling process into a fresh execution context. With `unshare(CLONE_NEWUSER)` you can move a process into a new [user_namespaces(7)](https://man7.org/linux/man-pages/man7/user_namespaces.7.html), where the process gains full capabilities over the resources within the namespace. This is fundamental for Linux containers to achieve privilege separation. `unshare(CLONE_NEWUSER)` requires the calling process to be single-threaded (that is, no background threads may be running). It is therefore often invoked after `fork(2)`, because forking propagates only the calling thread to the child process.
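To make "which user namespace am I in" checkable from Ruby without shelling out to `ls`, here is a minimal sketch that reads the `/proc/self/ns/user` symlink directly. The helper names `parse_namespace_link` and `current_user_namespace` are hypothetical and not part of any Ruby or Linux API; the sketch assumes a Linux `/proc` and returns `nil` elsewhere.

```ruby
# Parse the target of a /proc/<pid>/ns/* symlink, e.g. "user:[4026531837]".
# Returns [type, inode], or nil if the text does not have that shape.
def parse_namespace_link(target)
  match = /\A(\w+):\[(\d+)\]\z/.match(target)
  match && [match[1], Integer(match[2])]
end

# Inode number identifying the user namespace of the current process,
# or nil when /proc/self/ns/user cannot be read (for example on non-Linux).
def current_user_namespace
  link = parse_namespace_link(File.readlink("/proc/self/ns/user"))
  link && link[1]
rescue SystemCallError, NotImplementedError
  nil
end

if $PROGRAM_NAME == __FILE__
  p current_user_namespace
end
```

Comparing the value before and after a successful `unshare(CLONE_NEWUSER)` should show two different inode numbers, which is the same information the `ls -l /proc/self/ns/user` calls in the reproducer print.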
## Problem
This becomes a problem because Ruby 3.3 on Linux runs a timer thread even in an application that uses only a single `Thread`. Because `Kernel#fork` spawns this thread in the child process before control returns to user code, there is no chance to call `unshare(CLONE_NEWUSER)` from Ruby.
The following snippet reproduces the problem. It forks and then shows the user namespace to which the child process belongs before and after calling unshare(2). It also lists the threads of the child process after forking.
```ruby
p(RUBY_DESCRIPTION:)
require 'fiddle/import'

module C
  extend Fiddle::Importer
  dlload 'libc.so.6'
  extern 'int unshare(int flags)'
  CLONE_NEWUSER = 0x10000000

  def self.raise_system_call_error
    raise SystemCallError.new(Fiddle.last_error)
  end
end

pid = fork do
  system("ps -O tid -T -p #$$")           # list the threads of the child process
  system("ls -l /proc/self/ns/user")      # user namespace before unshare
  if C.unshare(C::CLONE_NEWUSER) != 0
    C.raise_system_call_error # => EINVAL with Ruby 3.3
  end
  system("ls -l /proc/self/ns/user")      # user namespace after unshare
end
p Process.wait2(pid)
```
The program successfully changes the user namespace with Ruby 3.2, but it raises EINVAL with Ruby 3.3 and with master. The `ps` output shows that Ruby 3.3 and master each have two threads running after forking.
```
% rbenv shell 3.2 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585787 1585787 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
lrwxrwxrwx 1 nobody nogroup 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026532675]'
[1585787, #<Process::Status: pid 1585787 exit 0>]
% rbenv shell 3.3 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585849 1585849 S pts/12 00:00:00 ruby ./test.rb
1585849 1585851 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585849, #<Process::Status: pid 1585849 exit 1>]
% rbenv shell master && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585965 1585965 S pts/12 00:00:00 ruby ./test.rb
1585965 1585967 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585965, #<Process::Status: pid 1585965 exit 1>]
```
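The thread count that `ps -T` reports above can also be obtained from inside the child. The following is a rough sketch, assuming a Linux `/proc` where each entry under `/proc/self/task` corresponds to one native thread; `native_thread_count` and `native_thread_count_after_fork` are hypothetical helper names, not existing Ruby methods.

```ruby
# Number of kernel-level threads (tasks) in the current process, read from
# /proc/self/task. Returns nil when that directory is unavailable.
def native_thread_count
  Dir.children("/proc/self/task").size
rescue SystemCallError
  nil
end

# Fork, count the child's native threads, and report the count back to the
# parent through a pipe. Returns an Integer, or nil if /proc is unavailable.
def native_thread_count_after_fork
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts(native_thread_count.inspect)
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  line = reader.read.strip
  reader.close
  Process.wait(pid)
  line == "nil" ? nil : Integer(line)
end

if $PROGRAM_NAME == __FILE__
  p native_thread_count_after_fork
end
```

Going by the `ps` listings above, one would expect this to report a single thread on Ruby 3.2 and more than one on Ruby 3.3, but that expectation is inferred from the report rather than measured with this sketch.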
## Workaround
My workaround is to rebuild Ruby with `rb_thread_stop_timer_thread` and `rb_thread_start_timer_thread` exported, and to use a C extension that stops the timer thread before calling `unshare`. This does not seem robust, because the process cannot know when the kernel has reclaimed the terminated thread, and only after that point is the process considered single-threaded.
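```c
#define _GNU_SOURCE 1
#include <errno.h>
#include <sched.h>
#include <unistd.h> /* usleep */
#include <ruby/ruby.h>

/* Internal functions that the public headers do not declare. The prototypes
 * are assumed here; linking requires a Ruby rebuilt to export them. */
void rb_thread_stop_timer_thread(void);
void rb_thread_start_timer_thread(void);

static VALUE Unshare_s_unshare(VALUE _self, VALUE rflags) {
    int const flags = NUM2INT(rflags);
    rb_thread_stop_timer_thread();
    usleep(1000); // FIXME: It takes some time for the kernel to remove the stopped thread?
    int const ret = unshare(flags);
    int const err = errno; // save errno before restarting the timer thread can overwrite it
    rb_thread_start_timer_thread();
    if (ret != 0) rb_syserr_fail_str(err, rb_sprintf("unshare(%#x)", flags));
    return Qnil;
}

RUBY_FUNC_EXPORTED void
Init_unshare(void) {
    VALUE rb_mUnshare = rb_define_module("Unshare");
    rb_define_singleton_method(rb_mUnshare, "unshare", Unshare_s_unshare, 1);
    rb_define_const(rb_mUnshare, "CLONE_NEWUSER", INT2FIX(CLONE_NEWUSER));
}
```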
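One possible alternative to the fixed `usleep(1000)` is to poll until the kernel reports a single task, with a deadline so the wait cannot hang forever. The sketch below only illustrates that polling logic in Ruby. `wait_until_single_threaded` and its `count:` parameter are hypothetical, the default counter assumes a Linux `/proc/self/task`, and whether Ruby code can safely run while the timer thread is stopped is not established in this thread, so a real extension would likely need the equivalent loop in C.

```ruby
# Poll until the process has exactly one native thread or the timeout expires.
# `count` is any callable returning the current native thread count; the
# default reads /proc/self/task (Linux only, an assumption of this sketch).
# Returns true once a single thread is observed, false on timeout.
def wait_until_single_threaded(timeout: 1.0, interval: 0.001,
                               count: -> { Dir.children("/proc/self/task").size })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if count.call == 1
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end
```

Injecting the counter keeps the loop testable without a real second thread, and a `false` return lets the caller raise a clear error instead of letting `unshare` fail with EINVAL.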
## Questions
- Is this a limitation of Ruby?
- Is it safe (or even possible) to stop the timer thread during execution?
  - If so, can we export it as a public API?
  - But it may not be so useful for this problem, as explained in the workaround.
- Is it guaranteed that no other threads are running after a fork?
- Are there any better ways to solve this issue?
  - Can we somehow delay the start of the timer thread after forking, or hook into `fork` to run some code in the child process immediately after it spawns?
  - Can they be Ruby APIs instead of C APIs?
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
* [ruby-core:116582] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
@ 2024-02-05 5:55 ` mame (Yusuke Endoh) via ruby-core
2024-02-05 8:15 ` [ruby-core:116584] " hanazuki (Kasumi Hanazuki) via ruby-core
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: mame (Yusuke Endoh) via ruby-core @ 2024-02-05 5:55 UTC
To: ruby-core; +Cc: mame (Yusuke Endoh)
Issue #20237 has been updated by mame (Yusuke Endoh).
Status changed from Open to Assigned
Assignee set to ko1 (Koichi Sasada)
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106595
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116584] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-05 5:55 ` [ruby-core:116582] " mame (Yusuke Endoh) via ruby-core
@ 2024-02-05 8:15 ` hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-06 7:26 ` [ruby-core:116594] " hanazuki (Kasumi Hanazuki) via ruby-core
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-02-05 8:15 UTC
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been updated by hanazuki (Kasumi Hanazuki).
Another option would be to define a method like `fork_then_unshare(unshare_flags:, &block)` in a C extension. However, because you would usually want to set up and clean up your environment between fork and unshare, this C function could become huge and kill the flexibility of Ruby.
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106597
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116594] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-05 5:55 ` [ruby-core:116582] " mame (Yusuke Endoh) via ruby-core
2024-02-05 8:15 ` [ruby-core:116584] " hanazuki (Kasumi Hanazuki) via ruby-core
@ 2024-02-06 7:26 ` hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-06 11:06 ` [ruby-core:116599] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-02-06 7:26 UTC
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been updated by hanazuki (Kasumi Hanazuki).
hanazuki (Kasumi Hanazuki) wrote in #note-2:
> Another option would be to define something like `fork_then_unshare(unshare_flags:, &block)` method in C extension, but because you would usually want to set up and clean up your environment between fork and unshare, this C function could become huge and kill the flexibility of Ruby.
After some experiments, I found that this approach doesn't work with the current API. IIUC, the only official way for native extensions to properly fork the Ruby interpreter is to call `Process.fork` (a plain invocation of fork(2) followed by `rb_thread_atfork` seems to break something). Therefore, extensions have no more control than pure-Ruby code over how the process is forked. Specifically, they can't execute any additional code before the child process starts the background thread.
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106607
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116599] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (2 preceding siblings ...)
2024-02-06 7:26 ` [ruby-core:116594] " hanazuki (Kasumi Hanazuki) via ruby-core
@ 2024-02-06 11:06 ` kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-02-06 16:12 ` [ruby-core:116605] " hanazuki (Kasumi Hanazuki) via ruby-core
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core @ 2024-02-06 11:06 UTC
To: ruby-core; +Cc: kjtsanaktsidis (KJ Tsanaktsidis)
Issue #20237 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).
> or hook into fork to run some code in the child process immediately after it spawns
If your objective is "from a C extension, fork, set up the child process whilst it is still single-threaded, and then return to Ruby", you could possibly do this by registering a `pthread_atfork` function (and then unregistering it after you fork and after it runs, I suppose).
I've run into the same issue before (with PID namespaces, though, which have the same problem).
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106610
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
## Backgrounds
[unshare(2)](https://man7.org/linux/man-pages/man2/unshare.2.html) is a syscall in Linux to move the calling process into a fresh execution context. With `unshare(CLONE_NEWUSER)` you can move a process into a new [user_namespace(7)](https://man7.org/linux/man-pages/man7/user_namespaces.7.html), where the process gains the full capability on the resources within the namespace. This is fundamental for Linux containers to achieve privilege separation. `unshare(CLONE_NEWUSER)` requires the calling process to be single-threaded (or no background threads are running). So, it is often invoked after `fork(2)` as forking propagates only the calling thread to the child process.
## Problem
It becomes a problem that Ruby 3.3 on Linux uses timer threads even for a single-`Thread`ed application. Because `Kernel#fork` spawns a thread in the child process before the control returns to the user code, there is no chance to call `unshare(CLONE_NEWUSER)` in Ruby.
The following snippet is a reproducer of this problem. This program first forks and then shows the user namespace to which the process belongs before and after calling unshare(2). It also shows the threads of the child process after forking.
```ruby
p(RUBY_DESCRIPTION:)
require 'fiddle/import'
module C
extend Fiddle::Importer
dlload 'libc.so.6'
extern 'int unshare(int flags)'
CLONE_NEWUSER = 0x10000000
def self.raise_system_call_error
raise SystemCallError.new(Fiddle.last_error)
end
end
pid = fork do
system("ps -O tid -T -p #$$")
system("ls -l /proc/self/ns/user")
if C.unshare(C::CLONE_NEWUSER) != 0
C.raise_system_call_error # => EINVAL with Ruby 3.3
end
system("ls -l /proc/self/ns/user")
end
p Process.wait2(pid)
```
The program successfully changes the user namespace with Ruby 3.2, but it raises EINVAL with Ruby 3.3. You can see Ruby 3.3 has two threads running after forking.
```
% rbenv shell 3.2 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585787 1585787 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
lrwxrwxrwx 1 nobody nogroup 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026532675]'
[1585787, #<Process::Status: pid 1585787 exit 0>]
% rbenv shell 3.3 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585849 1585849 S pts/12 00:00:00 ruby ./test.rb
1585849 1585851 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585849, #<Process::Status: pid 1585849 exit 1>]
% rbenv shell master && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585965 1585965 S pts/12 00:00:00 ruby ./test.rb
1585965 1585967 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585965, #<Process::Status: pid 1585965 exit 1>]
```
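The thread count can also be read from inside the process instead of shelling out to `ps`. The following is only a sketch, assuming Linux with procfs mounted at `/proc`; `kernel_thread_count` and `child_thread_count` are hypothetical helper names that are not part of the reproducer.

```ruby
# Hypothetical helper: count the kernel threads (tasks) of the current process.
# Assumes Linux with procfs mounted at /proc; returns nil elsewhere.
def kernel_thread_count
  dir = "/proc/self/task"
  return nil unless File.directory?(dir)
  Dir.children(dir).size
end

# Fork a child and report, through a pipe, the count that the child observes.
def child_thread_count
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts(kernel_thread_count.inspect)
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  result = reader.read.strip
  reader.close
  Process.wait(pid)
  result == "nil" ? nil : Integer(result)
end

puts "parent: #{kernel_thread_count.inspect}, child: #{child_thread_count.inspect}"
```

Going by the `ps` output above, I would expect the child to report 1 on Ruby 3.2 and 2 on Ruby 3.3 and master, but I have not run this sketch against each version.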
## Workaround
My workaround is to rebuild Ruby with `rb_thread_stop_timer_thread` and `rb_thread_start_timer_thread` exported, and to use a C extension that stops the timer thread before calling `unshare`. This does not seem robust, because the process cannot know when the kernel has reclaimed the terminated thread, and only after that point is the process considered single-threaded.
```c
#define _GNU_SOURCE 1
#include <sched.h>
#include <unistd.h> /* usleep */
#include <ruby/ruby.h>

/* rb_thread_stop_timer_thread / rb_thread_start_timer_thread must be
 * exported by the rebuilt Ruby; they are not part of the public API. */
static VALUE Unshare_s_unshare(VALUE _self, VALUE rflags) {
  int const flags = NUM2INT(rflags);

  rb_thread_stop_timer_thread();
  usleep(1000); // FIXME: It takes some time for the kernel to remove the stopped thread?
  int const ret = unshare(flags);
  rb_thread_start_timer_thread();

  if(ret != 0) rb_sys_fail_str(rb_sprintf("unshare(%#x)", flags));
  return Qnil;
}

RUBY_FUNC_EXPORTED void
Init_unshare(void) {
  VALUE rb_mUnshare = rb_define_module("Unshare");
  rb_define_singleton_method(rb_mUnshare, "unshare", Unshare_s_unshare, 1);
  rb_define_const(rb_mUnshare, "CLONE_NEWUSER", INT2FIX(CLONE_NEWUSER));
}
```
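One alternative to the fixed `usleep` would be to poll the task list until only one task remains. The sketch below only illustrates that polling idea in Ruby; `wait_until_single_threaded` is a hypothetical helper, the `task_dir` parameter exists only to make it testable, and it is an assumption that a single entry in `/proc/self/task` coincides with what the kernel checks in `unshare`. In the actual workaround the same loop would have to live in the C extension, between stopping the timer thread and calling `unshare`.

```ruby
# Hypothetical helper: poll until exactly one task is listed, or give up.
# Returns true once a single task remains, false if the timeout expires.
def wait_until_single_threaded(timeout: 1.0, interval: 0.001, task_dir: "/proc/self/task")
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if Dir.children(task_dir).size == 1
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end
```

A bounded timeout keeps the caller from hanging if some other native thread is still alive; in that case `unshare(CLONE_NEWUSER)` would fail with EINVAL anyway.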
## Questions
- Is this a limitation of Ruby?
- Is it safe (or even possible) to stop the timer thread during execution?
  - If so, can we export it as a public API?
  - But it may not be so useful for this problem, as explained in the workaround.
- Is it guaranteed that no other threads are running after a fork?
- Are there any better ways to solve this issue?
  - Can we somehow delay the start of the timer thread after forking, or hook into `fork` to run some code in the child process immediately after it spawns?
  - Can they be a Ruby API instead of a C API?
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
^ permalink raw reply [flat|nested] 10+ messages in thread
* [ruby-core:116605] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (3 preceding siblings ...)
2024-02-06 11:06 ` [ruby-core:116599] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
@ 2024-02-06 16:12 ` hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-07 9:44 ` [ruby-core:116617] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-02-06 16:12 UTC (permalink / raw)
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been updated by hanazuki (Kasumi Hanazuki).
kjtsanaktsidis (KJ Tsanaktsidis) wrote in #note-4:
> > or hook into fork to run some code in the child process immediately after it spawns
>
> If your objective is "from a C extension, fork, set up the child process whilst it is still single threaded, and then return to Ruby".... you could possibly do this by registering a `pthread_atfork` function (and then unregistering it after you fork and after it runs, I suppose)
Thank you for your advice. Something like this seems to work:
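```cpp
#include <pthread.h>
#include <sched.h>
#include <functional>
#include <optional>
#include <ruby/ruby.h>

namespace {
  // Per-thread callback to run in the child right after fork(2),
  // before Ruby starts its timer thread.
  thread_local std::optional<std::function<void()>> atfork_init;

  // pthread_atfork child handler: run the pending callback once, then clear it.
  void atfork_child() {
    if(atfork_init) (*atfork_init)();
    atfork_init.reset();
  }

  // pthread_atfork parent handler: the callback is only meant for the child.
  void atfork_parent() {
    atfork_init.reset();
  }

  VALUE Namespace_s_fork(VALUE _self, VALUE opts) {
    // The computation of `flags` from `opts` is elided here.
    atfork_init = [flags = /*...*/]() {
      if(unshare(flags) != 0) {
        // TODO: handle error
      }
    };

    auto const pid = rb_funcall(rb_mProcess, rb_intern("_fork"), 0);
    if(FIX2INT(pid) == 0) {
      // Child: run the block (if any) and exit with its status.
      if(rb_block_given_p()) {
        int status;
        rb_protect(rb_yield, Qundef, &status);
        ruby_stop(status);
      }
      return Qnil;
    }
    return pid;
  }
}

extern "C" {
  RUBY_FUNC_EXPORTED void
  Init_namespace_ext() {
    if(pthread_atfork(nullptr, atfork_parent, atfork_child) != 0) {
      rb_sys_fail("pthread_atfork()");
    }
    auto rb_mNamespace = rb_define_module("Namespace");
    rb_define_singleton_method(rb_mNamespace, "fork", Namespace_s_fork, 1);
  }
}
```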
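For comparison, Ruby 3.1 and later let Ruby code hook fork by overriding `Process._fork`, the same method the extension above calls. Whether such a Ruby-level hook runs before the timer thread exists in the child can be checked empirically. The following is only a sketch, assuming Linux with `/proc` mounted; `ForkProbe` and `$tasks_after_fork` are hypothetical names.

```ruby
# Hypothetical probe: record how many kernel tasks exist in the child
# right after Process._fork returns there (Linux /proc assumed).
module ForkProbe
  def _fork
    pid = super
    if pid == 0
      dir = "/proc/self/task"
      $tasks_after_fork = File.directory?(dir) ? Dir.children(dir).size : nil
    end
    pid
  end
end
Process.singleton_class.prepend(ForkProbe)

# Fork once and send the recorded value back to the parent through a pipe.
reader, writer = IO.pipe
pid = fork do
  reader.close
  writer.puts($tasks_after_fork.inspect)
  writer.close
  exit!(0) # skip at_exit handlers inherited from the parent
end
writer.close
probe_result = reader.read.strip
reader.close
Process.wait(pid)
puts "tasks seen in child right after _fork: #{probe_result}"
```

If this prints a value greater than 1, the hook ran after the extra thread was created, and a `pthread_atfork` handler like the one above would still be needed.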
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106617
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116617] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (4 preceding siblings ...)
2024-02-06 16:12 ` [ruby-core:116605] " hanazuki (Kasumi Hanazuki) via ruby-core
@ 2024-02-07 9:44 ` kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-02-07 13:46 ` [ruby-core:116619] " hanazuki (Kasumi Hanazuki) via ruby-core
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core @ 2024-02-07 9:44 UTC (permalink / raw)
To: ruby-core; +Cc: kjtsanaktsidis (KJ Tsanaktsidis)
Issue #20237 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).
> It looks something like this works:
Won't win any awards for beauty perhaps, but gets the job done!
Is this enough to meet your needs? If so, I can close this out. Otherwise, perhaps a good next step would be to open a new feature request issue with a proposal of what interface, specifically, you want, and why?
From a stability perspective, I would think this is something that should work in future versions of Ruby (of course, I can't make any promises though). It's used in some pretty prevalent gems, like the grpc gem for instance (https://github.com/grpc/grpc/blob/038215b504b9027ac85527f5fdcd85c76b7e3a1f/src/core/lib/iomgr/fork_posix.cc#L117).
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106629
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116619] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (5 preceding siblings ...)
2024-02-07 9:44 ` [ruby-core:116617] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
@ 2024-02-07 13:46 ` hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-13 19:35 ` [ruby-core:116718] " ko1 (Koichi Sasada) via ruby-core
2024-09-16 6:42 ` [ruby-core:119209] " hanazuki (Kasumi Hanazuki) via ruby-core
8 siblings, 0 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-02-07 13:46 UTC (permalink / raw)
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been updated by hanazuki (Kasumi Hanazuki).
kjtsanaktsidis (KJ Tsanaktsidis) wrote in #note-6:
> Is this enough to meet your needs? If so, I can close this out.
Maybe, yes. It's unfortunate for me to rewrite a few lines of Ruby into 100~200 lines of C++, though.
I wonder whether this change is inevitable in return for the future with M:N threads.
Anyway, this is no longer my blocker. Thanks.
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106631
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:116718] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (6 preceding siblings ...)
2024-02-07 13:46 ` [ruby-core:116619] " hanazuki (Kasumi Hanazuki) via ruby-core
@ 2024-02-13 19:35 ` ko1 (Koichi Sasada) via ruby-core
2024-09-16 6:42 ` [ruby-core:119209] " hanazuki (Kasumi Hanazuki) via ruby-core
8 siblings, 0 replies; 10+ messages in thread
From: ko1 (Koichi Sasada) via ruby-core @ 2024-02-13 19:35 UTC (permalink / raw)
To: ruby-core; +Cc: ko1 (Koichi Sasada)
Issue #20237 has been updated by ko1 (Koichi Sasada).
Creating the timer thread lazily is on the task list, but I'm not sure when we can get to it.
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-106735
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
* [ruby-core:119209] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
` (7 preceding siblings ...)
2024-02-13 19:35 ` [ruby-core:116718] " ko1 (Koichi Sasada) via ruby-core
@ 2024-09-16 6:42 ` hanazuki (Kasumi Hanazuki) via ruby-core
8 siblings, 0 replies; 10+ messages in thread
From: hanazuki (Kasumi Hanazuki) via ruby-core @ 2024-09-16 6:42 UTC (permalink / raw)
To: ruby-core; +Cc: hanazuki (Kasumi Hanazuki)
Issue #20237 has been updated by hanazuki (Kasumi Hanazuki).
Thank you @ko1 for sharing the current situation. I'm fine with closing this ticket, as the behavior is due to a design decision and (AFAIK) the original behavior was never documented.
----------------------------------------
Bug #20237: Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread
https://bugs.ruby-lang.org/issues/20237#change-109789
* Author: hanazuki (Kasumi Hanazuki)
* Status: Assigned
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
## Background
[unshare(2)](https://man7.org/linux/man-pages/man2/unshare.2.html) is a Linux syscall that moves the calling process into a fresh execution context. With `unshare(CLONE_NEWUSER)` you can move a process into a new user namespace (see [user_namespaces(7)](https://man7.org/linux/man-pages/man7/user_namespaces.7.html)), where the process gains full capabilities over the resources within the namespace. This is fundamental for Linux containers to achieve privilege separation. `unshare(CLONE_NEWUSER)` requires the calling process to be single-threaded (that is, no background threads may be running). So it is often invoked right after `fork(2)`, because forking propagates only the calling thread to the child process.
## Problem
The problem is that Ruby 3.3 on Linux runs a timer thread even in an application that uses a single `Thread`. Because `Kernel#fork` spawns this thread in the child process before control returns to user code, there is no chance to call `unshare(CLONE_NEWUSER)` from Ruby.
The following snippet reproduces the problem. The program forks, lists the threads of the child process, and then shows the user namespace to which the child belongs before and after calling unshare(2).
```ruby
p(RUBY_DESCRIPTION:)
require 'fiddle/import'
module C
  extend Fiddle::Importer
  dlload 'libc.so.6'
  extern 'int unshare(int flags)'
  CLONE_NEWUSER = 0x10000000
  def self.raise_system_call_error
    raise SystemCallError.new(Fiddle.last_error)
  end
end
pid = fork do
  system("ps -O tid -T -p #$$")
  system("ls -l /proc/self/ns/user")
  if C.unshare(C::CLONE_NEWUSER) != 0
    C.raise_system_call_error # => EINVAL with Ruby 3.3
  end
  system("ls -l /proc/self/ns/user")
end
p Process.wait2(pid)
```
The program successfully changes the user namespace with Ruby 3.2, but it raises EINVAL with Ruby 3.3. You can see Ruby 3.3 has two threads running after forking.
```
% rbenv shell 3.2 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585787 1585787 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
lrwxrwxrwx 1 nobody nogroup 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026532675]'
[1585787, #<Process::Status: pid 1585787 exit 0>]
% rbenv shell 3.3 && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585849 1585849 S pts/12 00:00:00 ruby ./test.rb
1585849 1585851 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585849, #<Process::Status: pid 1585849 exit 1>]
% rbenv shell master && ruby ./test.rb
{:RUBY_DESCRIPTION=>"ruby 3.4.0dev (2024-02-04T16:05:02Z master 8bc6fff322) [x86_64-linux]"}
PID TID S TTY TIME COMMAND
1585965 1585965 S pts/12 00:00:00 ruby ./test.rb
1585965 1585967 S pts/12 00:00:00 ruby ./test.rb
lrwxrwxrwx 1 kasumi kasumi 0 Feb 5 02:25 /proc/self/ns/user -> 'user:[4026531837]'
./test.rb:10:in `raise_system_call_error': Invalid argument (Errno::EINVAL)
from ./test.rb:24:in `block in <main>'
from ./test.rb:19:in `fork'
from ./test.rb:19:in `<main>'
[1585965, #<Process::Status: pid 1585965 exit 1>]
```
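The thread listing that `ps -T` prints can also be read from inside the child. As a minimal, Linux-only sketch (the helper name `native_threads` is made up here and is not part of the original report), each entry under `/proc/self/task` is one kernel thread, and its `comm` file holds the thread name:

```ruby
# Linux-only sketch: list the kernel threads (tasks) of a process.
# `native_threads` is a hypothetical helper, not an existing Ruby API.
def native_threads(pid = "self")
  dir = "/proc/#{pid}/task"
  return [] unless Dir.exist?(dir) # no procfs: report nothing

  threads = Dir.children(dir).map do |tid|
    name = begin
      File.read(File.join(dir, tid, "comm")).strip
    rescue SystemCallError
      "?" # the thread exited between listing and reading
    end
    { tid: Integer(tid), name: name }
  end
  threads.sort_by { |t| t[:tid] }
end

pid = fork do
  # More than one line here means unshare(CLONE_NEWUSER) would be refused.
  native_threads.each { |t| puts "tid=#{t[:tid]} name=#{t[:name]}" }
end
Process.wait(pid)
```

The number of lines printed in the child should match the number of rows `ps -T` shows for it.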
## Workaround
My workaround is to rebuild Ruby with `rb_thread_stop_timer_thread` and `rb_thread_start_timer_thread` exported, and to use a C extension that stops the timer thread before calling `unshare`. This does not seem robust, because the process cannot tell when the kernel has reclaimed the terminated thread, and only after that is the process considered single-threaded.
```c
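#define _GNU_SOURCE 1
#include <errno.h>
#include <sched.h>
#include <unistd.h> // usleep
#include <ruby/ruby.h>

// Internal functions, not declared in the public headers. The prototypes are
// assumed to match Ruby's internals; this only links against a Ruby rebuilt
// to export them, as described above.
void rb_thread_stop_timer_thread(void);
void rb_thread_start_timer_thread(void);

static VALUE Unshare_s_unshare(VALUE _self, VALUE rflags) {
    int const flags = NUM2INT(rflags);
    rb_thread_stop_timer_thread();
    usleep(1000); // FIXME: It takes some time for the kernel to remove the stopped thread?
    int const ret = unshare(flags);
    int const err = errno; // save errno before restarting the timer thread can clobber it
    rb_thread_start_timer_thread();
    if (ret != 0) rb_syserr_fail_str(err, rb_sprintf("unshare(%#x)", flags));
    return Qnil;
}

RUBY_FUNC_EXPORTED void
Init_unshare(void) {
    VALUE rb_mUnshare = rb_define_module("Unshare");
    rb_define_singleton_method(rb_mUnshare, "unshare", Unshare_s_unshare, 1);
    rb_define_const(rb_mUnshare, "CLONE_NEWUSER", INT2FIX(CLONE_NEWUSER));
}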
```
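The fixed `usleep(1000)` could perhaps be replaced by polling with a deadline. The sketch below is written in Ruby only to show the logic. It assumes that a thread disappears from `/proc/self/task` once the kernel has reclaimed it, and that `unshare` would succeed from that point on; neither assumption is confirmed in this thread. `wait_until_single_threaded` is a hypothetical helper, and a real workaround would need the same loop inside the C extension, between stopping the timer thread and calling `unshare`.

```ruby
# Linux-only sketch: wait until the process has exactly one kernel thread.
# Returns true on success, false on timeout, nil when procfs is unavailable.
# `wait_until_single_threaded` is a hypothetical helper, not a Ruby API.
def wait_until_single_threaded(timeout: 1.0, interval: 0.001)
  dir = "/proc/self/task"
  return nil unless Dir.exist?(dir)

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    # Assumption: one entry left means only the calling thread remains.
    return true if Dir.children(dir).size == 1
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

p wait_until_single_threaded(timeout: 0.05)
```

On Ruby 3.3, calling this from plain Ruby would presumably time out, since the timer thread is still running; it is only meaningful after that thread has been asked to stop.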
## Questions
- Is this a limitation of Ruby?
- Is it safe (or even possible) to stop the timer thread during execution?
- If so, can we export it as the public API?
- But it may not be so useful for this problem, as explained in the workaround.
- Is it guaranteed that no other threads are running after a fork?
- Are there any better ways to solve this issue?
- Can we somehow delay the start of the timer thread after forking, or hook into `fork` to run some code in the child process immediately after it spawns?
- Can they be a Ruby API instead of a C API?
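
On the question about hooking into `fork`: Ruby 3.1 and later document `Process._fork` as an overridable method that `Kernel#fork` calls. A minimal sketch follows; the module name `ChildForkHook` and the `$fork_hook_ran` flag are illustrative only. Nothing here establishes that the child-side code runs before the timer thread is started, so this may well not be early enough for `unshare(CLONE_NEWUSER)`.

```ruby
# Sketch: run code in the child right after fork by overriding Process._fork
# (Ruby 3.1+). ChildForkHook and $fork_hook_ran are hypothetical names.
module ChildForkHook
  def _fork
    pid = super
    if pid == 0
      # Child side. Assumption: this may already be too late, because the
      # timer thread might have been started before `super` returned.
      $fork_hook_ran = true
    end
    pid
  end
end
Process.singleton_class.prepend(ChildForkHook)

# Report back from the child through a pipe whether the hook ran.
r, w = IO.pipe
pid = fork do
  r.close
  w.write($fork_hook_ran ? "hook ran" : "hook did not run")
  w.close
end
w.close
message = r.read
r.close
Process.wait(pid)
puts message
```

Combining this hook with a thread listing in the child would show whether the hook runs before or after the timer thread appears.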
Thread overview: 10+ messages
2024-02-05 4:59 [ruby-core:116581] [Ruby master Bug#20237] Unable to unshare(CLONE_NEWUSER) in Linux because of timer thread hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-05 5:55 ` [ruby-core:116582] " mame (Yusuke Endoh) via ruby-core
2024-02-05 8:15 ` [ruby-core:116584] " hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-06 7:26 ` [ruby-core:116594] " hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-06 11:06 ` [ruby-core:116599] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-02-06 16:12 ` [ruby-core:116605] " hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-07 9:44 ` [ruby-core:116617] " kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core
2024-02-07 13:46 ` [ruby-core:116619] " hanazuki (Kasumi Hanazuki) via ruby-core
2024-02-13 19:35 ` [ruby-core:116718] " ko1 (Koichi Sasada) via ruby-core
2024-09-16 6:42 ` [ruby-core:119209] " hanazuki (Kasumi Hanazuki) via ruby-core