From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13922
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Rich Felker <dalias@libc.org>
Newsgroups: gmane.linux.lib.musl.general
Subject: sigaltstack for implementation-internal signals?
Date: Tue, 5 Mar 2019 10:57:19 -0500
Message-ID: <20190305155719.GO23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226";
	logging-data="132555"; mail-complaints-to="usenet@blaine.gmane.org"
User-Agent: Mutt/1.5.21 (2010-09-15)
To: musl@lists.openwall.com
Original-X-From: musl-return-13938-gllmg-musl=m.gmane.org@lists.openwall.com Tue Mar 05 16:57:38 2019
Return-path: <musl-return-13938-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.89)
	(envelope-from <musl-return-13938-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1h1CRk-000YHI-5c
	for gllmg-musl@m.gmane.org; Tue, 05 Mar 2019 16:57:36 +0100
Original-Received: (qmail 28085 invoked by uid 550); 5 Mar 2019 15:57:33 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 28051 invoked from network); 5 Mar 2019 15:57:32 -0000
Content-Disposition: inline
Original-Sender: Rich Felker <dalias@aerifal.cx>
Xref: news.gmane.org gmane.linux.lib.musl.general:13922
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/13922>

It came up recently that some Erlang component is doing awful hacks
(see https://twitter.com/RichFelker/status/1099816400036773888) to try
to get implementation-internal signals to run on the alternate stack
set up by sigaltstack instead of the main stack. This desire makes some
sense, as they have tons of really tiny stacks for their
coroutine-like things that rapidly context-switch in userspace. And
since there's no portable or valid way to do this hack, it raised the
issue for me: should we just always deliver implementation-internal
signals on the alternate stack if one is set up?

Unfortunately I don't think this is possible/safe, for reasons
related to how the kernel's signal frame setup works. When a signal is
delivered and is to be handled on the alt stack, Linux checks whether
the current stack pointer is already on the alt stack. If so, it
decrements it normally; if not, it sets the stack pointer to the
beginning of the alt stack. The first case is needed in case a signal
interrupts a signal handler already running on the alt stack; if it
weren't handled that way, the second one would clobber the first one's
state, and upon return Bad Things would happen.
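
To make that concrete, here is a rough C model of the decision. It is
not the kernel's actual code: struct alt_stack and both function names
are made up for illustration, and it assumes a downward-growing stack,
so the "beginning" of the alt stack is its highest address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a registered alt stack: [base, base + size). */
struct alt_stack {
    uintptr_t base;
    size_t size;
};

/* Nonzero if sp currently points into the alt stack. */
static int on_alt_stack(const struct alt_stack *ss, uintptr_t sp)
{
    return sp > ss->base && sp - ss->base <= ss->size;
}

/* Lowest address of the signal frame for an SA_ONSTACK handler. */
static uintptr_t signal_frame_sp(const struct alt_stack *ss,
                                 uintptr_t sp, size_t frame_size)
{
    /* Model sanity check: a frame larger than the whole alt stack
     * can never fit on it. */
    assert(frame_size <= ss->size);

    /* Not on the alt stack yet: start from its beginning (top). */
    if (!on_alt_stack(ss, sp))
        sp = ss->base + ss->size;

    /* Either way, the frame is pushed below the chosen sp. */
    return sp - frame_size;
}
```

With sp on the main stack the frame lands at the top of the alt stack;
with sp already on the alt stack it lands just below sp, which is what
keeps a nested handler from overwriting the outer one.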

Unfortunately, this can go wrong. Suppose the application's signal
handler running on the alt stack changes the stack pointer to
something off the alt stack -- for example, using swapcontext, or some
awful split-stack hack. This is known to be "unsafe" in general, and
was the motivation for the (problematic with respect to POSIX, but
fixed in http://austingroupbugs.net/view.php?id=1187) addition of
SS_AUTODISARM. However, in principle it was "already safe" (without
SS_AUTODISARM) to use swapcontext with sigaltstack if the signal
handler and swapped-to context kept all SA_ONSTACK-flagged signals
blocked for their duration. An application could clearly arrange for
this; for example it's fairly natural if you only use one signal
handler that's SA_ONSTACK.
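
For reference, a minimal sketch of that "one SA_ONSTACK handler" setup
in C. The function and variable names, the choice of SIGUSR1, and the
64k stack size are all just assumptions for the example; the swapcontext
part is left out, and the handler only records whether it really ran on
the alt stack.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static char *alt_base;                    /* start of the alt stack */
static size_t alt_size;                   /* its size in bytes */
static volatile sig_atomic_t ran_on_alt;  /* set by the handler */

static void on_usr1(int sig)
{
    char probe;                           /* lives on the handler's stack */
    uintptr_t p = (uintptr_t)&probe;
    (void)sig;
    ran_on_alt = p >= (uintptr_t)alt_base &&
                 p < (uintptr_t)alt_base + alt_size;
}

/* Returns 1 if the handler ran on the alt stack, 0 if it did not,
 * and -1 if setup failed. */
static int run_single_onstack_handler(void)
{
    stack_t ss;
    struct sigaction sa;

    alt_size = 65536;                     /* arbitrary size for the example */
    assert(alt_size >= (size_t)MINSIGSTKSZ);
    alt_base = malloc(alt_size);          /* never freed: stays registered */
    if (!alt_base) return -1;

    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, 0) < 0) return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sa.sa_flags = SA_ONSTACK;             /* the only SA_ONSTACK handler */
    sigemptyset(&sa.sa_mask);             /* SIGUSR1 itself is still blocked
                                           * in the handler (no SA_NODEFER) */
    if (sigaction(SIGUSR1, &sa, 0) < 0) return -1;

    ran_on_alt = 0;
    if (raise(SIGUSR1) != 0) return -1;   /* handler runs before raise returns */
    return ran_on_alt;
}
```

Since SIGUSR1 is the only signal flagged SA_ONSTACK and it is blocked
while its own handler runs, nothing else the application installed can
be delivered onto the alt stack until that handler returns.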

If we add unblockable, implementation-internal signals that are
flagged SA_ONSTACK, however, this breaks; now even if an application
has taken the "proper precautions", those signals can be delivered in a
state where the alt stack is nonempty but the stack pointer doesn't
point into it, thereby causing its contents to get clobbered.
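
The failure can be shown with a toy model in C. Again this is only a
sketch under assumptions: struct alt_stack and the function names are
hypothetical, the stack grows down, and every signal frame is taken to
be the same size. It computes where two successive signal frames would
be placed and whether the second overlaps the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a registered alt stack: [base, base + size). */
struct alt_stack {
    uintptr_t base;
    size_t size;
};

/* Lowest address of a signal frame, given sp at delivery time. */
static uintptr_t frame_start(const struct alt_stack *ss,
                             uintptr_t sp, size_t frame_size)
{
    int on_alt = sp > ss->base && sp - ss->base <= ss->size;

    assert(frame_size <= ss->size);   /* model sanity check */
    if (!on_alt)
        sp = ss->base + ss->size;     /* restart from the top */
    return sp - frame_size;
}

/* Returns 1 if a second SA_ONSTACK signal, delivered while the first
 * handler's frame is still live and the stack pointer is sp_at_second,
 * would overlap that first frame. */
static int second_signal_clobbers(const struct alt_stack *ss,
                                  uintptr_t main_sp,
                                  uintptr_t sp_at_second,
                                  size_t frame_size)
{
    uintptr_t first = frame_start(ss, main_sp, frame_size);
    uintptr_t second = frame_start(ss, sp_at_second, frame_size);

    /* Standard interval-overlap test on [start, start + frame_size). */
    return second < first + frame_size && first < second + frame_size;
}
```

If sp_at_second is still inside the alt stack, below the first frame,
the result is 0. If the first handler has switched to some other stack
(swapcontext or similar), the second frame is placed at the top again
and the result is 1.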

Perhaps there's a chance that this just isn't supported/valid usage,
that "leaving the alt stack" should always be seen as abandoning it,
and that anything that worked before to preserve it was "just by
mistake/luck".

Rich