From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=-3.3 required=5.0 tests=MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL autolearn=ham
	autolearn_force=no version=3.4.4
Received: (qmail 6499 invoked from network); 10 Oct 2020 00:26:34 -0000
Received: from mother.openwall.net (195.42.179.200)
  by inbox.vuxu.org with ESMTPUTF8; 10 Oct 2020 00:26:34 -0000
Received: (qmail 19936 invoked by uid 550); 10 Oct 2020 00:26:26 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 19918 invoked from network); 10 Oct 2020 00:26:26 -0000
Date: Fri, 9 Oct 2020 20:26:13 -0400
From: Rich Felker <dalias@libc.org>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Carlos O'Donell via Libc-alpha <libc-alpha@sourceware.org>,
	musl@lists.openwall.com
Message-ID: <20201010002612.GC17637@brightrain.aerifal.cx>
References: <20200927141952.121047-1-carlos@redhat.com>
 <871rinm1fx.fsf@mid.deneb.enyo.de>
 <20200928234833.GC17637@brightrain.aerifal.cx>
 <87d025jcn0.fsf@mid.deneb.enyo.de>
 <20200929144207.GD17637@brightrain.aerifal.cx>
 <20201001023018.GL17637@brightrain.aerifal.cx>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20201001023018.GL17637@brightrain.aerifal.cx>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] Re: [PATCH] Make abort() AS-safe (Bug 26275).

On Wed, Sep 30, 2020 at 10:30:19PM -0400, Rich Felker wrote:
> On Tue, Sep 29, 2020 at 10:42:07AM -0400, Rich Felker wrote:
> > On Tue, Sep 29, 2020 at 08:54:59AM +0200, Florian Weimer wrote:
> > > * Rich Felker:
> > > 
> > > > Is there a reason to take the lock across fork rather than just
> > > > resetting it in the child? After seeing this I'm working on fixing the
> > > > same issue in musl and was about to take the lock, but realized ours
> > > > isn't actually protecting any userspace data state, just excluding
> > > > sigaction on SIGABRT during abort.
> > > 
> > > It's also necessary to stop the fork because the subprocess could
> > > otherwise observe the impossible SIG_DFL state.  In case the signal
> > > handler returns, the implementation needs to produce a termination
> > > status with SIGABRT as the termination signal, and the only way I can
> > > see to achieve that is to remove the signal handler and send the
> > > signal again.  This suggests that a lock in sigaction is needed as
> > > well.
> > 
> > Yes, in musl we already have the lock in sigaction -- that's the whole
> > point of the lock. To prevent other threads from fighting to change
> > the disposition back to SIG_IGN or a signal handler while abort is
> > trying to change it to SIG_DFL.
> > 
> > > But for the fork case, restting the lock in the new subprocess should
> > > be sufficient.
> > 
> > I don't follow. Do you mean taking the lock in the parent, but just
> > resetting it in the child? That should work but I don't see how it has
> > any advantage over just releasing it in the child.
> 
> OK, this is a lot worse than you thought:
> 
> Even without fork, execve and posix_spawn can also see the SIGABRT
> disposition change made by abort(), passing it on to a process that
> should have started with a disposition of SIG_IGN if you hit exactly
> the wrong spot in the race.
> 
> So, to fix this, these interfaces also have to take the abort lock,
> and to make it AS-safe (since execve is required to be), need to block
> all signals to take the lock. But execve can't leave signals blocked
> or the new process image would inherit that state. So it has to
> unblock them after taking the lock. But then a signal handler can
> interrupt between taking the lock and the execve syscall, making abort
> deadlock if called from the signal handler.
> 
> So how to solve this? Having the abort lock be recursive sounds like
> it helps (avoid the deadlock above), but then the signal handler that
> runs between taking the abort lock and making the execve syscall still
> delays abort by other threads for an unbounded length of time, and in
> fact it could even longjmp out, leaving a stale lock owner that
> prevents any other thread from ever calling abort. Ultimately this
> boils down to a general principle: you can't make AS-safe locks that
> allow arbitrary application code to run while they're held.
> 
> I really don't see any way out without giving abort a mechanism to
> "seize" other threads before changing the signal disposition. This
> could for example be done with the same mechanism used for
> multithreaded set*id (broadcast signal of an implementation-internal,
> unblockable signal) or maybe with some seccomp hacks on a recent
> enough kernel. Is there some better approach I'm missing??

There is something else I was missing, if we're willing to wrap signal
handlers -- something I've long suspected would be useful. The
primitive needed to solve this problem without a seize-all-threads
operation is restartable critical sections that restart if interrupted
by a signal, something like a "userspace EINTR". Here, any signal
arriving between execve taking the lock and making the execve syscall
would cause the lock to be released and the saved program counter to
revert to just before the lock was taken before the application's
signal handler runs.

How would such a thing be implemented? The wrapper for the signal
handler would examine TLS to observe that there's a restartable
critical section in progress, and modify the saved ucontext's
call-saved registers and uc_sigmask based on a sigjmp_buf saved before
the restartable section. It would then reverse the lock (probably via
a function pointer setup as part of the critical section) and
tail-call to the real signal handler. On return, the kernel then
automatically returns to just before the critical section, and the
lock is re-acquired before moving forward again.

Among other things, this allows implementing arbitrary "atomic unmask
signals and make syscall" operations (ala pselect) without need for
the kernel's help. It could also be used as the main ingredient of an
alternate thread cancellation design.

Is this a good idea? I don't know. But at least it means there seem to
be two (or more) possible solutions available.

Rich