Thank you, Rich. I misunderstood what is done by the kernel and what is done by the library. I assumed the waiters bit was only used by the library, and that FUTEX_OWNER_DIED was a flag, not a value.
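
To check my corrected understanding, here is a minimal sketch of what I now believe happens to the futex word when the owner dies. It is only a user-space model, not kernel code. The helper name model_owner_death is hypothetical; the constants come from <linux/futex.h>.

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/futex.h> /* FUTEX_WAITERS, FUTEX_OWNER_DIED, FUTEX_TID_MASK */

/* Hypothetical helper (an assumption for illustration): returns the futex
 * word I expect to see after the thread `dead_tid` exits while the word is
 * on its robust list. *wake_sent reports whether a waiter would be woken. */
static uint32_t model_owner_death(uint32_t word, uint32_t dead_tid,
                                  bool *wake_sent)
{
    *wake_sent = false;

    /* Only words whose TID field names the dying thread are touched. */
    if ((word & FUTEX_TID_MASK) != dead_tid)
        return word;

    /* A wake is issued only if the waiters bit (bit 31) was already set. */
    *wake_sent = (word & FUTEX_WAITERS) != 0;

    /* The TID is dropped, the waiters bit is kept, OWNER_DIED is set. */
    return (word & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
}
```

If this model is right, it explains both things I saw: the word becomes FUTEX_OWNER_DIED with no tid, and there is no wake because my test never set the waiters bit.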

To clarify my use case and slightly correct the XY problem, I'm trying to build an extended mutex using either a combination of musl mutexes and musl condition variables, or futexes.

My extended mutex is a "deadman switch" of sorts that tracks the lifetime of on-device services. It doesn't need to be especially performant, since services don't come up and go down very often.

My mutex is similar to a robust process-shared mutex.

It supports standard functionality: lock, trylock, timedlock, unlock
It supports waiting for the mutex to lock without locking: wait_locked, timedwait_locked
It supports waiting for the mutex to unlock: wait_unlocked, timedwait_unlocked

(wait_locked returns a unique token that is passed to wait_unlocked to ensure the same service is tracked)

And the most annoying requirement is that my mutex must be cancelable.

I previously tried making the lock function use a condition variable in order for it to be cancelable, but I couldn't find a way to wake it when the owning process dies.

Now, I'm exploring using futex directly.
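
For reference, this is roughly the pair of raw wrappers I'm starting from. It is a minimal sketch: the names futex_wait and futex_wake are my own (neither glibc nor musl exports such wrappers), and it does not address cancellation at all. I'm deliberately not using FUTEX_PRIVATE_FLAG because the word lives in memory shared between processes.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Sleep while *addr == expected. Returns 0 when woken, or -1 with errno
 * set: EAGAIN if *addr != expected on entry, ETIMEDOUT if the relative
 * timeout expires, EINTR if interrupted by a signal.
 * A NULL timeout means wait indefinitely. */
static long futex_wait(uint32_t *addr, uint32_t expected,
                       const struct timespec *rel_timeout)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected,
                   rel_timeout, NULL, 0);
}

/* Wake up to `count` waiters sleeping on addr.
 * Returns the number of waiters actually woken. */
static long futex_wake(uint32_t *addr, int count)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
}
```

A waiter would still need to re-read the word after futex_wait returns, since a return of 0 does not by itself say why it was woken.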

Suggestions for how to write a cancelable mutex would be much appreciated :)

Thanks again!


On Thu, Apr 21, 2022 at 5:25 AM Rich Felker <dalias@libc.org> wrote:
>
> On Wed, Apr 20, 2022 at 07:56:17PM -0400, Rich Felker wrote:
> > On Wed, Apr 20, 2022 at 04:46:23PM -0700, Leonid Shamis wrote:
> > > Hey Musl Folks,
> > >
> > > I'm trying to understand the futex robust list and it's not quite working
> > > how I would expect it to from a reading of the man pages.
> > >
> > > In a minimal example, I'm getting the futex changed to FUTEX_OWNER_DIED
> > > instead of FUTEX_OWNER_DIED|tid and I'm not getting a FUTEX_WAKE event.
> > > Any idea why this might be?
> > >
> > > Happy to share the minimal example.
> >
> > Can you clarify if you're trying to use robust mutexes under musl or
> > roll your own thing using the kernel robust list directly? I don't
> > think I'm understanding what you're confused about. If the owner of a
> > robust mutex dies, its tid can no longer be in the futex word because
> > that tid is immediately available to be assigned to a new task, and
> > the next thread trying to lock the mutex would not be able to
> > distinguish whether it's owned by a new thread with that tid, or
> > available.
> >
> > So as far as the kernel is concerned, FUTEX_OWNER_DIED is more of a
> > value than a flag. It's what the futex word gets set to when the owner
> > dies with it held. In musl, we use FUTEX_OWNER_DIED as a flag as well
> > for a robust mutex whose old owner died and who has a new owner but
> > hasn't yet called pthread_mutex_consistent (in which case it would
> > become unrecoverable on unlock).
>
> Also: you only get the FUTEX_WAKE if the waiters bit (bit 31) was set
> when the owner died.