From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Edelsohn
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: possible bug in setjmp implementation for ppc64
Date: Tue, 1 Aug 2017 11:33:08 -0400
References: <1501520360.0.593167188853569@go.bunnymail.go> <20170731203007.GB1627@brightrain.aerifal.cx> <20170801051042.GA14914@dora.lan>
In-Reply-To: <20170801051042.GA14914@dora.lan>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Xref: news.gmane.org gmane.linux.lib.musl.general:11772

On Tue, Aug 1, 2017 at 1:10 AM, Bobby Bingham wrote:
> On Mon, Jul 31, 2017 at 04:30:07PM -0400, Rich Felker wrote:
>> On Mon, Jul 31, 2017 at 10:06:51PM +0200, felix.winkelmann@bevuta.com wrote:
>> > Hi!
>> >
>> > I think I may have come across a bug in musl on PPC64(le), and the folks
>> > on the #musl IRC channel directed me here. I'm not totally sure whether
>> > the problem is caused by my misunderstanding of C library functions or whether
>> > it is a plain bug in the musl implementation of setjmp(3).
>> >
>> > In our project[1] we use setjmp to establish a global trampoline
>> > and allocate small objects on the stack using alloca (see [2] for
>> > more information about the compilation strategy used).
>> > I was able to reduce
>> > the code that crashes to the following:
>> >
>> > ---
>> > #include <stdio.h>
>> > #include <stdlib.h>
>> > #include <string.h>
>> > #include <setjmp.h>
>> > #include <alloca.h>
>> >
>> > jmp_buf jb;
>> >
>> > int foo = 99;
>> > int c = 0;
>> >
>> > void bar()
>> > {
>> >   c++;
>> >   longjmp(jb, 1);
>> > }
>> >
>> > int main()
>> > {
>> >   setjmp(jb);
>> >   char *p = alloca(256);
>> >   memset(p, 0, 256);
>> >   printf("%d\n", foo);
>> >
>> >   if(c < 10) bar();
>> >
>> >   exit(0);
>> > }
>> > ---
>> >
>> > When executing the longjmp, the code that restores $r2 (TOC) after the call
>> > to setjmp reads invalid data, because the memset apparently clobbered
>> > the stack frame - i.e. the pointer returned by alloca points into a part
>> > of the stack frame that is still in use.
>> >
>> > I tried this on arm, x86_64 and ppc64 with glibc and it seems to work fine,
>> > but crashes when linked with musl (running Alpine Linux on a VM).
>> >
>> > If you need more information, please feel free to ask. You can also keep
>> > me CC'd, since I'd be interested in knowing more about the details.
>>
>> It looks to me like we have a bug here, but it's one where I or
>> someone else needs to read and understand the PPC64 ELFv2 ABI document
>> to fully understand what's going on and make a fix. I'll try to get to
>> it soon, or I'm happy if someone else wants to. I don't just want to
>> cargo-cult whatever glibc is doing, though; a fix should be
>> accompanied by an understanding of why it's right.
>
> I think I can explain what's happening.
>
> The TOC pointer is constant within a given dynamic module (the main
> executable or a library), but needs to be adjusted at cross-module
> calls. Each function has two entry points in the ELFv2 ABI. The entry
> point for intra-module calls can assume r2 is already set up correctly.
> The entry point for inter-module calls starts two instructions earlier
> and adjusts r2 before falling through to the intra-module entry point.
>
> Normally, r2 is supposed to be preserved across calls. For intra-module
> calls, there's no problem. For inter-module calls, the PLT stub saves
> the caller's r2 value to a slot in the caller's stack frame that's
> required to be reserved for it, at r1+24. The linker then inserts code
> in the caller to restore the value from the stack immediately after the
> call.
>
> So what's happening here is that the value of r2 that setjmp saves and
> that longjmp restores is the TOC pointer for libc, as set up by the PLT
> stub. It's not the value of r2 that the caller had. But that's
> normally fine -- after the second return from setjmp, the caller will
> restore its TOC pointer from the stack where it had been saved by the
> PLT stub when it originally called setjmp. But in this example, gcc
> decides to allocate the 256 bytes overtop the part of the stack where
> the setjmp PLT stub had saved the TOC pointer, so it gets clobbered.
>
> The problem is that static linking and dynamic linking need to work
> differently. With dynamic linking, we can fix this by changing setjmp
> to read the caller's TOC pointer from the reserved slot in the caller's
> stack frame, and longjmp to restore it to the stack instead of to r2.
>
> But with static linking, there's no PLT stub or code added by the linker
> to restore the TOC pointer from the stack, so we need to save/restore
> from/to r2, not the TOC slot in the caller's stack frame.
>
> I think this either requires having different versions of setjmp/longjmp
> for static and dynamic libc, or to increase the size of jmp_buf so we can
> always save/restore both r2 and the value on the stack, but this would
> be an ABI change.

The analysis is correct. Quoting my colleague:

"If glibc is built as a static library, the contents of r2 are saved in
the jmp_buf; but if glibc is built as a dynamic library, the contents of
the TOC save slot is saved in the jmp_buf.
Similarly, if glibc is built as a dynamic library, longjmp *updates* the
TOC save slot with the r2 value from the jmp_buf before returning."

GLIBC setjmp/longjmp code explicitly differs for shared and static
versions of the library. Musl libc needs equivalent functionality in
its implementation.

Thanks,
David
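
To make the shared/static difference described above concrete, here is a small portable C model of it. This is only a sketch under stated assumptions: it is not musl or glibc source, it runs on any host rather than touching real PPC64 registers, and every name in it (model_cpu, model_frame, model_jmpbuf, model_setjmp, model_longjmp, model_roundtrip, the two TOC constants) is hypothetical. The toc_save_slot field stands in for the caller's r1+24 slot, and setting it to zero stands in for the alloca + memset clobber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not musl or glibc code. */
enum toc_strategy { SAVE_R2, SAVE_TOC_SLOT };

struct model_cpu    { uint64_t r2; };            /* the TOC register          */
struct model_frame  { uint64_t toc_save_slot; }; /* stands in for r1+24       */
struct model_jmpbuf { uint64_t saved_toc; };     /* the one TOC word we track */

#define CALLER_TOC UINT64_C(0x1000) /* made-up TOC value of the executable */
#define LIBC_TOC   UINT64_C(0x2000) /* made-up TOC value of a shared libc  */

/* setjmp side: record either r2 or the caller's TOC save slot. */
static void model_setjmp(struct model_jmpbuf *jb, const struct model_cpu *cpu,
                         const struct model_frame *caller, enum toc_strategy s)
{
    jb->saved_toc = (s == SAVE_TOC_SLOT) ? caller->toc_save_slot : cpu->r2;
}

/* longjmp side: write the recorded value back to the same place. */
static void model_longjmp(const struct model_jmpbuf *jb, struct model_cpu *cpu,
                          struct model_frame *caller, enum toc_strategy s)
{
    if (s == SAVE_TOC_SLOT)
        caller->toc_save_slot = jb->saved_toc;
    else
        cpu->r2 = jb->saved_toc;
}

/* Replays the reported scenario and returns the r2 value the caller ends
 * up with after the second return from setjmp. via_plt models a call into
 * a shared libc; without it the call is a static, intra-module one. */
static uint64_t model_roundtrip(bool via_plt, enum toc_strategy s)
{
    struct model_cpu cpu = { CALLER_TOC };
    struct model_frame frame = { 0 };
    struct model_jmpbuf jb = { 0 };

    if (via_plt) {
        frame.toc_save_slot = cpu.r2; /* PLT stub saves the caller's r2   */
        cpu.r2 = LIBC_TOC;            /* global entry point sets libc TOC */
    }
    model_setjmp(&jb, &cpu, &frame, s);
    if (via_plt)
        cpu.r2 = frame.toc_save_slot; /* linker-inserted restore */

    frame.toc_save_slot = 0;          /* alloca + memset clobber the slot */

    if (via_plt)
        cpu.r2 = LIBC_TOC;            /* longjmp entered through the PLT
                                         (bar's own save slot not modeled) */
    model_longjmp(&jb, &cpu, &frame, s);
    if (via_plt)
        cpu.r2 = frame.toc_save_slot; /* restore after the second return */

    return cpu.r2;
}
```

In this model, model_roundtrip(true, SAVE_R2) returns 0, i.e. the caller reloads a clobbered TOC pointer, which corresponds to the reported crash. model_roundtrip(true, SAVE_TOC_SLOT) and model_roundtrip(false, SAVE_R2) both return 0x1000, the caller's original value, matching the shared and static behaviour quoted above.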