mailing list of musl libc
* Musl incompatibility with Docker and AWS's C5 class
@ 2018-03-15 13:37 Ryan Wilson-Perkin
  2018-03-15 14:52 ` Rich Felker
From: Ryan Wilson-Perkin @ 2018-03-15 13:37 UTC
  To: musl

Hey musl-devs,

Yesterday we tested out the new C5 instance class that AWS offers using our
Alpine-based images and discovered that we would get a segfault whenever we
ran `npm install`. Tracing the code, we found that it appeared to be caused
by node's "process.setuid" and "process.setgid" calls, either of which
would cause a segfault.

We're running Alpine containers inside Docker on EC2, and the smallest
thing I can provide to reproduce this issue would be to run the following
on a C5 EC2 instance:

docker run -it node:9-alpine sh -c "node -e 'process.setgid(0)'"

A core dump provided the following limited information:


Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/26' in core file.
#0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
29 src/thread/x86_64/syscall_cp.s: No such file or directory.
[Current thread is 1 (LWP 26)]
(gdb) bt
#0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1 0x00007fd6161eecd8 in __syscall_cp_c (nr=202, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:35
#2 0x00007fd6161ee2f5 in __timedwait_cp (addr=addr@entry=0x5612e9ebf820, val=val@entry=-1, clk=clk@entry=0, at=at@entry=0x0, priv=<optimized out>) at src/thread/__timedwait.c:31
#3 0x00007fd6161f0e2c in sem_timedwait (sem=0x5612e9ebf820, at=0x0) at src/thread/sem_timedwait.c:23
#4 0x00007fd615d7a5a4 in uv_sem_wait () from /usr/lib/libuv.so.1
#5 0x00005612e94dc00c in node::DebugSignalThreadMain(void*) ()
#6 0x00007fd6161ef665 in start (p=0x7fd616424ab0) at src/thread/pthread_create.c:145
#7 0x00007fd6161f13e4 in __clone () at src/thread/x86_64/clone.s:21
Backtrace stopped: frame did not save the PC

Ryan Wilson-Perkin | Software Engineer


* Re: Musl incompatibility with Docker and AWS's C5 class
  2018-03-15 13:37 Musl incompatibility with Docker and AWS's C5 class Ryan Wilson-Perkin
@ 2018-03-15 14:52 ` Rich Felker
From: Rich Felker @ 2018-03-15 14:52 UTC
  To: musl

On Thu, Mar 15, 2018 at 09:37:28AM -0400, Ryan Wilson-Perkin wrote:
> Hey musl-devs,
> 
> Yesterday we tested out the new C5 instance class that AWS offers using our
> Alpine-based images and discovered that we would get a segfault whenever we
> ran `npm install`. Tracing the code, we found that it appeared to be caused
> by node's "process.setuid" and "process.setgid" calls, either of which
> would cause a segfault.
> 
> We're running Alpine containers inside Docker on EC2, and the smallest
> thing I can provide to reproduce this issue would be to run the following
> on a C5 EC2 instance:
> 
> docker run -it node:9-alpine sh -c "node -e 'process.setgid(0)'"
> 
> A core dump provided the following limited information:
> 
> 
> Program terminated with signal SIGSEGV, Segmentation fault.
> warning: Unexpected size of section `.reg-xstate/26' in core file.
> #0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
> 29 src/thread/x86_64/syscall_cp.s: No such file or directory.
> [Current thread is 1 (LWP 26)]
> (gdb) bt
> #0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
> #1 0x00007fd6161eecd8 in __syscall_cp_c (nr=202, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>, z=0) at src/thread/pthread_cancel.c:35
> #2 0x00007fd6161ee2f5 in __timedwait_cp (addr=addr@entry=0x5612e9ebf820, val=val@entry=-1, clk=clk@entry=0, at=at@entry=0x0, priv=<optimized out>) at src/thread/__timedwait.c:31
> #3 0x00007fd6161f0e2c in sem_timedwait (sem=0x5612e9ebf820, at=0x0) at src/thread/sem_timedwait.c:23
> #4 0x00007fd615d7a5a4 in uv_sem_wait () from /usr/lib/libuv.so.1
> #5 0x00005612e94dc00c in node::DebugSignalThreadMain(void*) ()
> #6 0x00007fd6161ef665 in start (p=0x7fd616424ab0) at src/thread/pthread_create.c:145
> #7 0x00007fd6161f13e4 in __clone () at src/thread/x86_64/clone.s:21
> Backtrace stopped: frame did not save the PC

Changing uids/gids in a multithreaded process involves synchronizing
all the threads with a signal. Based on the information, my guess is
that the stack for at least one thread is barely large enough, and
when the signal arrives, creation of the signal frame (in the kernel)
overflows the stack and the kernel generates SIGSEGV for the process.

One approach to test if this is the case and mitigate it: LD_PRELOAD a
library that calls pthread_setattr_default_np from a constructor to
set a larger default thread stack size. If that turns out to be the
problem, the Alpine node package should probably be patched to
increase the stack size. We may also be increasing the default in musl
somewhat (from 80k to 128k or so) in the near future; if so it would
likely be enough to solve your problem here.
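
For anyone wanting to try the LD_PRELOAD test described above, a minimal
sketch of such a library might look like the following. The 256k value, the
macro name, and the function name are assumptions chosen for illustration;
they do not come from musl or from the Alpine node package. It relies on
pthread_setattr_default_np being available in the installed libc.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size for the experiment; pick whatever you want to test. */
#define PRELOAD_STACK_SIZE ((size_t)256 * 1024)

/* Runs when the dynamic linker loads this library, before main(). */
__attribute__((constructor))
static void raise_default_stack_size(void)
{
    pthread_attr_t attr;
    int err = pthread_attr_init(&attr);
    if (err != 0)
        return;

    /* Request a larger default stack for threads created without an
     * explicit stack size attribute. */
    err = pthread_attr_setstacksize(&attr, PRELOAD_STACK_SIZE);
    if (err == 0)
        err = pthread_setattr_default_np(&attr);
    if (err != 0)
        fprintf(stderr, "stack size preload: %s\n", strerror(err));

    pthread_attr_destroy(&attr);
}
```

Assuming the file is saved as stacksize.c (a hypothetical name), it could be
built inside the container with something like
`cc -shared -fPIC -o libstacksize.so stacksize.c` and tried with
`LD_PRELOAD=./libstacksize.so node -e 'process.setgid(0)'`. If the segfault
goes away, that supports the stack overflow guess; threads that node creates
with an explicit stack size would not be affected by this default.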

Rich

