ruby-core@ruby-lang.org archive (unofficial mirror)
* [ruby-core:123449] [Ruby Bug#21633] A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals.
@ 2025-10-09  4:42 ioquatix (Samuel Williams) via ruby-core
  2026-02-12  4:10 ` [ruby-core:124774] " ioquatix (Samuel Williams) via ruby-core
  0 siblings, 1 reply; 2+ messages in thread
From: ioquatix (Samuel Williams) via ruby-core @ 2025-10-09  4:42 UTC (permalink / raw)
  To: ruby-core; +Cc: ioquatix (Samuel Williams)

Issue #21633 has been reported by ioquatix (Samuel Williams).

----------------------------------------
Bug #21633: A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals.
https://bugs.ruby-lang.org/issues/21633

* Author: ioquatix (Samuel Williams)
* Status: Open
* Assignee: ioquatix (Samuel Williams)
* Backport: 3.3: REQUIRED, 3.4: REQUIRED
----------------------------------------
The gRPC gem calls `rb_thread_call_without_gvl` in a loop and does not exit that loop when an interrupt is delivered, if the fiber scheduler is using `Thread.handle_interrupt(::SignalException => :never)` to create a safe point for asynchronous signal handling.

While this may not be a bug in any single part of the system, the combination of these behaviours means gRPC can hang for a long time and ignore SIGINT / SIGTERM.

## gRPC Failure Analysis

From [`src/ruby/ext/grpc/rb_completion_queue.c`](https://github.com/samuel-williams-shopify/grpc/blob/debug/src/ruby/ext/grpc/rb_completion_queue.c):

```c
static void unblock_func(void* param) {
  next_call_stack* const next_call = (next_call_stack*)param;
  next_call->interrupted = 1;  // ← SIGINT causes this flag to be set
}

grpc_event rb_completion_queue_pluck(grpc_completion_queue* queue, void* tag,
                                     gpr_timespec deadline,
                                     const char* reason) {
  // ...
  do {
    next_call.interrupted = 0;  // ← Reset flag
    
    rb_thread_call_without_gvl(grpc_rb_completion_queue_pluck_no_gil,
                               (void*)&next_call, unblock_func,
                               (void*)&next_call);
    
    if (next_call.event.type != GRPC_QUEUE_TIMEOUT) break;
  } while (next_call.interrupted);  // ← The problem! If interrupted, LOOP AGAIN!
  
  return next_call.event;
}
```

The loop explicitly retries after interruption, making SIGINT/SIGTERM ineffective.
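
To make the retry behaviour concrete, here is a minimal stand-alone model of the loop's control flow. It is only a sketch: it does not use gRPC or the Ruby C API, the `model_*` types and `fake_blocking_call` are hypothetical stand-ins, and an interrupt is simulated by setting the flag directly rather than by a real signal.

```c
#include <assert.h>

/* Hypothetical stand-ins for the gRPC types; not the real definitions. */
typedef enum { MODEL_QUEUE_TIMEOUT, MODEL_OP_COMPLETE } model_event_type;

typedef struct {
  model_event_type event_type;
  int interrupted;
  int timeouts_remaining; /* how many interrupted timeouts to simulate */
} model_call;

/* Simulates one blocking wait. While timeouts remain, it behaves as if a
 * signal ran the unblock function and the wait returned without an event. */
static void fake_blocking_call(model_call *call) {
  if (call->timeouts_remaining > 0) {
    call->timeouts_remaining--;
    call->interrupted = 1;
    call->event_type = MODEL_QUEUE_TIMEOUT;
  } else {
    call->event_type = MODEL_OP_COMPLETE;
  }
}

/* Same control flow as the loop quoted above.
 * Returns how many times the loop waited before it exited. */
int model_pluck(int simulated_interrupts) {
  assert(simulated_interrupts >= 0);
  model_call call = { MODEL_QUEUE_TIMEOUT, 0, simulated_interrupts };
  int waits = 0;
  do {
    call.interrupted = 0;        /* reset flag, as in the original */
    fake_blocking_call(&call);
    waits++;
    if (call.event_type != MODEL_QUEUE_TIMEOUT) break;
  } while (call.interrupted);    /* an interrupt means "wait again" */
  return waits;
}
```

In this model, `model_pluck(3)` waits four times: three interrupted timeouts, each of which is retried, plus the wait that finally produces an event. No number of simulated interrupts ends the loop early; only an event does.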

This might be considered expected behaviour when `Thread.handle_interrupt` is used. However, the goal of `Thread.handle_interrupt` in the fiber scheduler is to create a safe point for signal handling, not to suppress signals completely. Since this loop never yields back to the scheduler, that safe point is never reached, and the loop can continue indefinitely.

As `rb_thread_call_without_gvl` invokes `vm_check_ints_blocking`, one solution is to yield to the scheduler when there are pending interrupts. This gives the scheduler a chance to handle the incoming SIGINT / SIGTERM signals at the safe point.
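
The effect of yielding can be sketched with a small stand-alone model. This illustrates the idea only and is not the code in the linked pull request: every name below (`model_scheduler`, `model_yield`, `model_blocking_call`, `model_retry_loop`) is hypothetical, and the `stop_requested` flag stands in for whatever the scheduler does once it handles the signal at its safe point.

```c
#include <assert.h>

/* Hypothetical model; none of these names exist in Ruby or gRPC. */
typedef struct {
  int pending_signals; /* delivered, but deferred until a safe point */
  int handled_signals; /* processed by the scheduler at a safe point */
  int stop_requested;  /* set once the scheduler has acted on a signal */
} model_scheduler;

/* Stand-in for yielding to the fiber scheduler: the safe point. */
static void model_yield(model_scheduler *s) {
  s->handled_signals += s->pending_signals;
  s->pending_signals = 0;
  s->stop_requested = 1;
}

/* Stand-in for a blocking call that is interrupted by one signal.
 * With yield_on_pending set, it yields when interrupts are pending. */
static int model_blocking_call(model_scheduler *s, int yield_on_pending) {
  s->pending_signals++;
  if (yield_on_pending && s->pending_signals > 0) model_yield(s);
  return s->stop_requested;
}

/* A retry loop around the blocking call. Returns the number of waits
 * before the loop stopped; max_waits caps the otherwise endless case. */
int model_retry_loop(int yield_on_pending, int max_waits) {
  assert(max_waits > 0);
  model_scheduler s = { 0, 0, 0 };
  int waits = 0;
  while (waits < max_waits) {
    waits++;
    if (model_blocking_call(&s, yield_on_pending)) break;
  }
  return waits;
}
```

With yielding enabled, `model_retry_loop(1, 100)` stops after the first wait because the pending signal is handled. Without it, `model_retry_loop(0, 100)` runs until the cap, which mirrors the hang described above.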

For a full reproduction of the issue using gRPC: https://github.com/samuel-williams-shopify/grpc-interrupt

For the proposed fix: https://github.com/ruby/ruby/pull/14700



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/


* [ruby-core:124774] [Ruby Bug#21633] A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals.
  2025-10-09  4:42 [ruby-core:123449] [Ruby Bug#21633] A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals ioquatix (Samuel Williams) via ruby-core
@ 2026-02-12  4:10 ` ioquatix (Samuel Williams) via ruby-core
  0 siblings, 0 replies; 2+ messages in thread
From: ioquatix (Samuel Williams) via ruby-core @ 2026-02-12  4:10 UTC (permalink / raw)
  To: ruby-core; +Cc: ioquatix (Samuel Williams)

Issue #21633 has been updated by ioquatix (Samuel Williams).

Status changed from Open to Closed

The fix was merged.

----------------------------------------
Bug #21633: A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals.
https://bugs.ruby-lang.org/issues/21633#change-116384

* Author: ioquatix (Samuel Williams)
* Status: Closed
* Assignee: ioquatix (Samuel Williams)
* Backport: 3.3: REQUIRED, 3.4: REQUIRED
----------------------------------------
