From: Szabolcs Nagy
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: setrlimit hangs the process
Date: Tue, 25 Sep 2018 17:36:05 +0200
Message-ID: <20180925153605.GF10209@port70.net>
References: <20180925141551.GE10209@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

* Rabbitstack [2018-09-25 16:54:37 +0200]:
> Sorry. Let me describe the problem in more detail.
>
> The process only hangs when launched without root privileges on the host
> (Arch Linux x64 with kernel 4.17.5-1) where Alpine docker container is
> running. Once with root privileges, it starts up correctly (but this is
> obvious since it doesn't hit setrlimit call).
> The odd side is that on other
> hosts it hangs even when started with root. No error messages so far.
> Strace output:
>
> $ sudo strace -p 9285
>
> futex(0x2cddfc0, FUTEX_WAIT_PRIVATE, 0, NULL
>
> $ sudo strace -f -p 9285
>
> .....
> [pid 9287] getdents64(10, /* 14 entries */, 2048) = 336
> [pid 9287] tgkill(9285, 9285, SIGRT_2) = 0
> [pid 9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
> {tv_sec=1537887068, tv_nsec=51442144}) = -1 ETIMEDOUT (Connection timed out)

it looks like musl tries to sync a setuid call across all threads
(which is necessary since the linux syscall only changes the uid
for the current thread instead of all threads so you can end up
with different privileges in the same address space which is
dangerous as well as non-posix conform setuid behaviour)

it's possible that the setuid syncing is somehow wrong in musl,
but it's more likely that there are threads that are not created
by the c runtime (but from go) and thus the sync cannot possibly
work.

so try to look for where set*id is called and ensure it is not
called or called before any threads are created (or at least
before any go threads are created)

note that syscall.Set*id from go does not work either, it does
not sync the threads (which is dangerously broken for a runtime
that's always multi-threaded).

> On Tue, Sep 25, 2018 at 4:15 PM Szabolcs Nagy wrote:
> > * Rabbitstack [2018-09-25 14:59:45 +0200]:
> > > I'm using the latest golang:alpine Docker image to produce a
> > > statically-linked Go binary. Even though I'm able to build the
> > > binary, when I run it the process gets stuck during ebpf program
> > > loading. I've investigated a bit and found the root cause is the
> > > call to setrlimit (this is the offending line
> > > https://github.com/iovisor/gobpf/blob/2e314be67b1854ad226f012f08a984e0e89b6da9/elf/elf.go#L105
> > > ).
> > > Are you aware of such behaviour in musl?
> > >
> >
> > well you could have described what goes wrong in more detail
> > (error message, strace output, target platform, are you root, ...)
> >
> > i assume you are not running this on mips (since there is no
> > alpine docker image for mips), which has the issue of
> > SYSCALL_RLIM_INFINITY != RLIM_INFINITY
> > the kernel side value is different from userspace so musl
> > has to translate the value which may go wrong.
> >
> > nor on x32 (which may have various issues with the raw
> > syscalls both in the go code and c code).
> >
> > increasing rlimit is not allowed by default, so you have to
> > ensure you have permissions, musl should have no special
> > behaviour with respect to RLIMIT_MEMLOCK, so it's more likely
> > that you just don't have bpf and setrlimit permissions.
> >
> > instead of using a complex system like go + c code + elf loader,
> > try a minimal c program to see if the bpf syscall succeeds at
> > all in your docker environment.