mailing list of musl libc
* pthread_getattr_np() vs explicit runtime loader
@ 2015-09-20  6:39 u-wsnj
  2015-09-20 16:34 ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: u-wsnj @ 2015-09-20  6:39 UTC (permalink / raw)
  To: musl

Hello,

musl 1.1.8 on ia32 Linux, building gcc 5.2.0 succeeds.

Nevertheless a subset of the resulting executables segfault when run by
an explicit loader (which is the vital mode of operation in our setups).

They do not seem to segfault when using the implicit loader
which suggests the result depends on the memory mapping layout.

Moreover, the last syscalls seen before the crash are mremap(),
presumably reflecting that pthread_getattr_np() is involved.

It looks like (according to a discussion in mail archives) the logic
in this function makes assumptions which are not necessarily true when
using an explicit runtime loader.

Would you comment on whether this guess is correct and hopefully make
pthread_getattr_np() work even with the explicit loader?

The strace examples limited to mremap() follow.
The same files and libraries are used in both runs, and the loader
path given explicitly is the same one that is embedded in the executable.

-----------------------------------------------------------------
$ strace -e mremap \
  /..../<loader> --library-path ...<libs> /..../jv-convert --help
mremap(0xffffc000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffffb000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffffa000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff9000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff8000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff7000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff6000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff5000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff4000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff3000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff2000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff1000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff0000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffef000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffee000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffed000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffec000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffeb000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffea000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe9000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe8000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe7000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe6000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe5000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe4000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe3000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe2000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe1000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe0000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdf000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffde000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdd000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdc000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdb000, 4096, 8192, 0)       = -1 EFAULT (Bad address)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Process 30289 detached
-----------------------------------------------------------------
$ LD_LIBRARY_PATH=...<libs> strace -e mremap /..../jv-convert --help
mremap(0xffffc000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffffb000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffffa000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff9000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff8000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff7000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff6000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff5000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff4000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff3000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff2000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff1000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xffff0000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffef000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffee000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffed000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffec000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffeb000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffea000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe9000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe8000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe7000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe6000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe5000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe4000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe3000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe2000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe1000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffe0000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdf000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffde000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdd000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdc000, 4096, 8192, 0)       = -1 ENOMEM (Cannot allocate memory)
mremap(0xfffdb000, 4096, 8192, 0)       = -1 EFAULT (Bad address)
Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]

Convert from one encoding to another.

   --encoding FROM
   --from FROM        use FROM as source encoding name
   --to TO            use TO as target encoding name
   -i FILE            read from FILE
   -o FILE            print output to FILE
   --reverse          swap FROM and TO encodings
   --help             print this help, then exit
   --version          print version number, then exit

`-' as a file name argument can be used to refer to stdin or stdout.
Process 30291 detached
-----------------------------------------------------------------

Regards,
Rune
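
The two traces above are identical up to the final EFAULT: 33 probes at
descending page addresses fail with ENOMEM, and the 34th fails with EFAULT.
A probing loop of the following shape would produce such a trace. This is
a minimal sketch, not musl's source; the helper name
count_mapped_pages_below and its max_pages cap are hypothetical. It relies
on Linux mremap() behaviour: without MREMAP_MAYMOVE, growing a one-page
range fails with ENOMEM when the page above it is mapped, and with EFAULT
when the probed page itself is not mapped.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not musl's source): count the contiguous mapped
 * pages ending at the page that contains `start`, walking downward.
 */
static size_t count_mapped_pages_below(const void *start, size_t max_pages)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	uintptr_t top = (uintptr_t)start & ~(uintptr_t)(page - 1);
	size_t n = 1;	/* the page containing `start` is assumed mapped */

	assert(max_pages > 0);
	while (n < max_pages) {
		void *q = (void *)(top - n * page);

		/* flags == 0: no MREMAP_MAYMOVE, so growth must be in place. */
		if (mremap(q, page, 2 * page, 0) != MAP_FAILED)
			break;	/* unexpected success: stop probing */
		if (errno != ENOMEM)
			break;	/* EFAULT or other error: q is not mapped */
		n++;		/* ENOMEM: q is mapped, page above is in use */
	}
	return n;
}
```

Calling count_mapped_pages_below(&some_local, 4096) from the main thread
walks the stack mapping in the same way. Note that the loop itself only
reads mapping state; it does not depend on how the program was started.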




* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20  6:39 pthread_getattr_np() vs explicit runtime loader u-wsnj
@ 2015-09-20 16:34 ` Rich Felker
  2015-09-20 17:22   ` u-wsnj
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2015-09-20 16:34 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 08:39:09AM +0200, u-wsnj@aetey.se wrote:
> Hello,
> 
> musl 1.1.8 on ia32 Linux, building gcc 5.2.0 succeeds.
> 
> Nevertheless a subset of the resulting executables segfault when run by
> an explicit loader (which is the vital mode of operation in our setups).
> 
> They do not seem to segfault when using the implicit loader
> which suggests the result depends on the memory mapping layout.
> 
> Moreover, the last syscalls seen before the crash are mremap(),
> presumably reflecting that pthread_getattr_np() is involved.
> 
> It looks like (according to a discussion in mail archives) the logic
> in this function makes assumptions which are not necessarily true when
> using an explicit runtime loader.
> 
> Would you comment on whether this guess is correct and hopefully make
> pthread_getattr_np() work even with the explicit loader?

I reviewed the code and there are no assumptions about how the program
is loaded made there. And the original test program I used to test
pthread_getattr_np runs fine both normally and with an explicit loader
command. So I think the actual problem must be elsewhere, likely in
whatever the application is doing right after pthread_getattr_np.

What triggered the crash to start happening? Upgrading musl? Upgrading
gcc? Have you used gdb to get a backtrace and see where the program
actually crashes?

Rich



* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 16:34 ` Rich Felker
@ 2015-09-20 17:22   ` u-wsnj
  2015-09-20 18:27     ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: u-wsnj @ 2015-09-20 17:22 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 12:34:05PM -0400, Rich Felker wrote:
> On Sun, Sep 20, 2015 at 08:39:09AM +0200, u-wsnj@aetey.se wrote:
> > Would you comment on whether this guess is correct and hopefully make
> > pthread_getattr_np() work even with the explicit loader?
> 
> I reviewed the code and there are no assumptions about how the program
> is loaded made there. And the original test program I used to test
> pthread_getattr_np runs fine both normally and with an explicit loader
> command. So I think the actual problem must be elsewhere, likely in
> whatever the application is doing right after pthread_getattr_np.

Thanks for checking, sorry that the hypothesis seems to be wrong.

May I run a test with that program of yours?

> What triggered the crash to start happening? Upgrading musl? Upgrading

It is the behaviour of gcc 5. This was the case when I built 5.1.0 but
5.2.0 was supposed to be more compatible with musl, so I did not research
5.1.0. Now gcc 5.2.0 behaves identically in this respect.

> gcc? Have you used gdb to get a backtrace and see where the program
> actually crashes?

Not yet, going to. Rebuilding gcc with '-g', this takes some time.

Rune




* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 17:22   ` u-wsnj
@ 2015-09-20 18:27     ` Rich Felker
  2015-09-20 19:30       ` u-wsnj
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2015-09-20 18:27 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1546 bytes --]

On Sun, Sep 20, 2015 at 07:22:37PM +0200, u-wsnj@aetey.se wrote:
> On Sun, Sep 20, 2015 at 12:34:05PM -0400, Rich Felker wrote:
> > On Sun, Sep 20, 2015 at 08:39:09AM +0200, u-wsnj@aetey.se wrote:
> > > Would you comment on whether this guess is correct and hopefully make
> > > pthread_getattr_np() work even with the explicit loader?
> > 
> > I reviewed the code and there are no assumptions about how the program
> > is loaded made there. And the original test program I used to test
> > pthread_getattr_np runs fine both normally and with an explicit loader
> > command. So I think the actual problem must be elsewhere, likely in
> > whatever the application is doing right after pthread_getattr_np.
> 
> Thanks for checking, sorry that the hypothesis seems to be wrong.
> 
> May I run a test with that program of yours?

Test program attached. It's just a very basic functionality check.

> > What triggered the crash to start happening? Upgrading musl? Upgrading
> 
> It is the behaviour of gcc 5. This was the case when I built 5.1.0 but
> 5.2.0 was supposed to be more compatible with musl, so I did not research
> 5.1.0. Now gcc 5.2.0 behaves identically in this respect.

And both musl and the crashing app were compiled with gcc 5? If so the
problem could be on either side.

> > gcc? Have you used gdb to get a backtrace and see where the program
> > actually crashes?
> 
> Not yet, going to. Rebuilding gcc with '-g', this takes some time.

Unless gcc is the program crashing I don't see why you need to rebuild
gcc with -g...

Rich

[-- Attachment #2: getstack.c --]
[-- Type: text/plain, Size: 599 bytes --]

#define _GNU_SOURCE
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>

void *start(void *p)
{
	for (;;) pause();
}

void printstack(pthread_attr_t *a)
{
	void *base;
	size_t size;
	pthread_attr_getstack(a, &base, &size);
	printf("%p - %p (%zu)\n", base, (char *)base+size, size);
}

char buf[12345];

int main()
{
	pthread_t td;
	pthread_attr_t a;

	pthread_attr_init(&a);
	pthread_attr_setstack(&a, buf, sizeof buf);
	printf("buf = %p\n", buf);

	pthread_create(&td, &a, start, 0);
	pthread_getattr_np(td, &a);
	printstack(&a);
	pthread_getattr_np(pthread_self(), &a);
	printstack(&a);
}


* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 18:27     ` Rich Felker
@ 2015-09-20 19:30       ` u-wsnj
  2015-09-20 19:41         ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: u-wsnj @ 2015-09-20 19:30 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 02:27:28PM -0400, Rich Felker wrote:
> Test program attached. It's just a very basic functionality check.

Thanks.

I may be misinterpreting the code but I do not see where it tests
the condition
(http://man7.org/linux/man-pages/man3/pthread_getattr_np.3.html)
"Furthermore, if the stack address attribute was not set in the thread
attributes object used to create the thread, then the returned thread
attributes object will report the actual stack address that the
implementation selected for the thread."

It seems to be this case which coincides with the crash.

I looked among others at
 http://www.openwall.com/lists/musl/2013/03/31/5
and
 http://git.musl-libc.org/cgit/musl/commit/?id=5db951ef80cae8b627f95b995811bf916c069757

and still am unsure whether the assumptions hold while using
the explicit loader.

> > > gcc? Have you used gdb to get a backtrace and see where the program
> > > actually crashes?
> > 
> > Not yet, going to. Rebuilding gcc with '-g', this takes some time.
> 
> Unless gcc is the program crashing I don't see why you need to rebuild
> gcc with -g...

These _are_ several of the binaries of gcc-5.x which crash. It looks like
the ones which crash (java-related ones?) are using pthread_getattr_np()
while the others are not. I did not, however, check all of them
systematically.

You can easily test this if you have, say, a jv-convert binary from
gcc-5.2.0, dynamically linked with musl, and run this binary via the
explicit loader. Your environment and mine differ, but I would
not be surprised if the binary crashes for you too.

Rune
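
getstack.c prints the reported range for a caller-provided stack and for
the main thread, but does not check a thread created with default
attributes. A minimal sketch of such a check follows, assuming a Linux
libc that provides pthread_getattr_np(); the helper names stack_contains,
check_default_thread, and check_calling_thread are hypothetical. Each
check takes the address of a local variable in the thread in question and
tests whether it lies inside the reported stack range.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>

/* 1 if addr lies in the stack reported for t, 0 if not, -1 on error. */
static int stack_contains(pthread_t t, const void *addr)
{
	pthread_attr_t a;
	void *base;
	size_t size;
	int r;

	assert(addr != NULL);
	if (pthread_getattr_np(t, &a) != 0)
		return -1;
	r = pthread_attr_getstack(&a, &base, &size);
	pthread_attr_destroy(&a);
	if (r != 0)
		return -1;
	/* base is the lowest address of the reported stack. */
	return (uintptr_t)addr >= (uintptr_t)base &&
	       (uintptr_t)addr - (uintptr_t)base < size;
}

struct probe {
	sem_t ready, done;
	void *local;	/* address of a local variable in the new thread */
};

static void *start(void *arg)
{
	struct probe *p = arg;
	char marker;

	p->local = &marker;
	sem_post(&p->ready);
	while (sem_wait(&p->done) != 0)
		continue;	/* retry if interrupted by a signal */
	return NULL;
}

/* Thread created with default attributes (implementation-chosen stack). */
static int check_default_thread(void)
{
	struct probe p;
	pthread_t td;
	int r = -1;

	if (sem_init(&p.ready, 0, 0) != 0)
		return -1;
	if (sem_init(&p.done, 0, 0) != 0) {
		sem_destroy(&p.ready);
		return -1;
	}
	if (pthread_create(&td, NULL, start, &p) == 0) {
		while (sem_wait(&p.ready) != 0)
			continue;
		r = stack_contains(td, p.local);
		sem_post(&p.done);
		pthread_join(td, NULL);
	}
	sem_destroy(&p.done);
	sem_destroy(&p.ready);
	return r;
}

/* Calling thread; covers the main thread when called from main(). */
static int check_calling_thread(void)
{
	char marker;

	return stack_contains(pthread_self(), &marker);
}
```

Both functions are expected to return 1 on a working implementation.
Running the same binary once normally and once through the explicit
loader would show whether the reported ranges differ between the two.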




* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 19:30       ` u-wsnj
@ 2015-09-20 19:41         ` Rich Felker
  2015-09-21  7:57           ` u-wsnj
  2015-09-30 15:43           ` u-uy74
  0 siblings, 2 replies; 18+ messages in thread
From: Rich Felker @ 2015-09-20 19:41 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 09:30:33PM +0200, u-wsnj@aetey.se wrote:
> On Sun, Sep 20, 2015 at 02:27:28PM -0400, Rich Felker wrote:
> > Test program attached. It's just a very basic functionality check.
> 
> Thanks.
> 
> I may be misinterpreting the code but I do not see where it tests
> the condition
> (http://man7.org/linux/man-pages/man3/pthread_getattr_np.3.html)
> "Furthermore, if the stack address attribute was not set in the thread
> attributes object used to create the thread, then the returned thread
> attributes object will report the actual stack address that the
> implementation selected for the thread."
> 
> It seems to be this case which coincides with the crash.

I'm not sure what you mean. Except for the main thread, the t->stack
and t->stack_size fields store the correct values based on what was
used at pthread_create time. The distinct code paths for
caller-provided stack versus implementation-allocated stack already
took place at pthread_create time.

Moreover the case in your program is getting the stack for the main
thread, not for another thread, so the code you're asking about is not
even what's being executed.

> I looked among others at
>  http://www.openwall.com/lists/musl/2013/03/31/5
> and
>  http://git.musl-libc.org/cgit/musl/commit/?id=5db951ef80cae8b627f95b995811bf916c069757
> 
> and still am unsure whether the assumptions hold while using
> the explicit loader.

I don't see anywhere this code has any interaction whatsoever with how
the program was loaded. So I suspect plain old undefined behavior if
the crash depends on how it was loaded.

> > > > gcc? Have you used gdb to get a backtrace and see where the program
> > > > actually crashes?
> > > 
> > > Not yet, going to. Rebuilding gcc with '-g', this takes some time.
> > 
> > Unless gcc is the program crashing I don't see why you need to rebuild
> > gcc with -g...
> 
> These _are_ several of the binaries of gcc-5.x which crash. It looks like
> the ones which crash (java-related ones?) are using pthread_getattr_np()
> while the others are not. I did not, however, check all of them
> systematically.
> 
> You can easily test this if you have, say, a jv-convert binary from
> gcc-5.2.0, dynamically linked with musl, and run this binary via the
> explicit loader. Your environment and mine differ, but I would
> not be surprised if the binary crashes for you too.

I might get a chance to look later, but first thought: is jv-convert
using boehm gc? I ask because boehm is one of the main users (iirc) of
pthread_getattr_np and it's full of UB. It's possible that gcc 5 broke
some of the things it's doing, or that they were already broken but
didn't happen to crash before. I think boehm needs some patches to
work safely on musl but maybe not anymore.

Rich



* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 19:41         ` Rich Felker
@ 2015-09-21  7:57           ` u-wsnj
  2015-09-30 15:43           ` u-uy74
  1 sibling, 0 replies; 18+ messages in thread
From: u-wsnj @ 2015-09-21  7:57 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 03:41:32PM -0400, Rich Felker wrote:
> > These _are_ several of the binaries of gcc-5.x which crash. It looks like
> > the ones which crash (java-related ones?) are using pthread_getattr_np()
> > while the others are not. I did not, however, check all of them
> > systematically.
> > 
> > You can easily test this if you have, say, a jv-convert binary from
> > gcc-5.2.0, dynamically linked with musl, and run this binary via the
> > explicit loader. Your environment and mine differ, but I would
> > not be surprised if the binary crashes for you too.
> 
> I might get a chance to look later, but first thought: is jv-convert

Unfortunately I could not get any useful backtrace from gdb, this
would need more time.

I hope you can run a simple crash test with
 /<path-to-musl>/libc.so /<path-to>/jv-convert.

> using boehm gc? I ask because boehm is one of the main users (iirc) of

Libjava (and then presumably jv-convert too) involves boehm gc.

> pthread_getattr_np and it's full of UB. It's possible that gcc 5 broke
> some of the things it's doing, or that they were already broken but
> didn't happen to crash before. I think boehm needs some patches to
> work safely on musl but maybe not anymore.

boehm needs some patches to be buildable but this is about removing
harmful glibc-related include/define heuristics.

For the record, there are no similar crashes of jv-convert or other
binaries from an older gcc (4.2.3) built with the same musl.

Rune




* Re: pthread_getattr_np() vs explicit runtime loader
  2015-09-20 19:41         ` Rich Felker
  2015-09-21  7:57           ` u-wsnj
@ 2015-09-30 15:43           ` u-uy74
  2015-09-30 20:35             ` Update: [musl] " u-uy74
  1 sibling, 1 reply; 18+ messages in thread
From: u-uy74 @ 2015-09-30 15:43 UTC (permalink / raw)
  To: musl

On Sun, Sep 20, 2015 at 03:41:32PM -0400, Rich Felker wrote:
> I don't see anywhere this code has any interaction whatsoever with how
> the program was loaded. So I suspect plain old undefined behavior if
> the crash depends on how it was loaded.

> > You can easily test this if you have, say, a jv-convert binary from
> > gcc-5.2.0, dynamically linked with musl, and run this binary via the
> > explicit loader. Your environment and mine differ, but I would
> > not be surprised if the binary crashes for you too.

An update:

The observed crashes were both very consistent and confusing
for the debuggers which I tried. Then I had to put this on hold.

Now when I returned to testing, the crashes do not appear any longer.

This is most probably related to the fact that the host has been rebooted
meanwhile (to the same kernel, ~ 3.18.11).

Note that I saw the crashes earlier with binaries from gcc-5.1.0 too,
i.e. this was a consistent pattern over quite some time, with different
builds, on many occasions. No similar problems with other programs
during the same time, nor any problem if using the implicit loader.

So either this was an artifact of a "somehow specifically corrupt"
kernel or this is some assumption which blows up, given a certain state
(not necessarily corrupt) of the kernel. I believe more in the latter
(is there a contract about how/where the kernel shall allocate the
thread stacks?).

I still think that the crashes are caused by errors
while guessing the stack placement in pthread_getattr_np(),
simply because of the kernel doing something else than usual.

Unfortunately, in practical terms: no misbehaviour to analyze for
the moment.

Rune




* Update: [musl] pthread_getattr_np() vs explicit runtime loader
  2015-09-30 15:43           ` u-uy74
@ 2015-09-30 20:35             ` u-uy74
  2015-10-06 11:34               ` musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader u-uy74
  0 siblings, 1 reply; 18+ messages in thread
From: u-uy74 @ 2015-09-30 20:35 UTC (permalink / raw)
  To: musl

On Wed, Sep 30, 2015 at 05:43:37PM +0200, u-uy74@aetey.se wrote:
> So either this was an artifact of a "somehow specifically corrupt"
> kernel or this is some assumption which blows up, given a certain state
> (not necessarily corrupt) of the kernel. I believe more in the latter
> (is there a contract about how/where the kernel shall allocate the
> thread stacks?).
> 
> I still think that the crashes are caused by errors
> while guessing the stack placement in pthread_getattr_np(),
> simply because of the kernel doing something else than usual.
> 
> Unfortunately, in practical terms: no misbehaviour to analyze for
> the moment.

I can reproduce the problem and this looks like something
to fix or at least work around, either in gcc or in musl.

Running with the implicit loader works, but using the explicit one yields:

----------------------------------------------------------------
# cat /proc/sys/kernel/randomize_va_space
2

$ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]

Convert from one encoding to another.

   --encoding FROM
   --from FROM        use FROM as source encoding name
   --to TO            use TO as target encoding name
   -i FILE            read from FILE
   -o FILE            print output to FILE
   --reverse          swap FROM and TO encodings
   --help             print this help, then exit
   --version          print version number, then exit

`-' as a file name argument can be used to refer to stdin or stdout.

# echo 0 > /proc/sys/kernel/randomize_va_space

$ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
Segmentation fault
----------------------------------------------------------------

Would anybody try this and confirm or refute?

Rune
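
Since the reproduction above hinges on the value of
/proc/sys/kernel/randomize_va_space, a test program can record that value
alongside its result. A minimal sketch, assuming a Linux system with /proc
mounted; the helper name read_aslr_mode is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read an integer sysctl value such as
 * /proc/sys/kernel/randomize_va_space (0 = off, 1 and 2 = randomization
 * enabled). Returns -1 if the file cannot be opened or parsed.
 */
static int read_aslr_mode(const char *path)
{
	FILE *f;
	int v = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

Printing read_aslr_mode("/proc/sys/kernel/randomize_va_space") next to the
pass or fail result makes reports from different hosts easier to compare.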




* musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-09-30 20:35             ` Update: [musl] " u-uy74
@ 2015-10-06 11:34               ` u-uy74
  2015-10-06 14:36                 ` Isaac Dunham
  2015-10-06 17:07                 ` Rich Felker
  0 siblings, 2 replies; 18+ messages in thread
From: u-uy74 @ 2015-10-06 11:34 UTC (permalink / raw)
  To: musl

Either nobody cares or nobody has a gcc-5.x toolchain built with musl?
Wondering.

gcc-5 looks like a case important enough to care about.

Rune

On Wed, Sep 30, 2015 at 10:35:48PM +0200, u-uy74@aetey.se wrote:
> On Wed, Sep 30, 2015 at 05:43:37PM +0200, u-uy74@aetey.se wrote:
> > 
> > I still think that the crashes are caused by errors
> > while guessing the stack placement in pthread_getattr_np(),
> > simply because of the kernel doing something else than usual.
> 
> I can reproduce the problem and this looks like something
> to fix or at least work around, either in gcc or in musl.
> 
> Running with the implicit loader works, but using the explicit one yields:
> 
> ----------------------------------------------------------------
> # cat /proc/sys/kernel/randomize_va_space
> 2
> 
> $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]
> 
> Convert from one encoding to another.
> 
>    --encoding FROM
>    --from FROM        use FROM as source encoding name
>    --to TO            use TO as target encoding name
>    -i FILE            read from FILE
>    -o FILE            print output to FILE
>    --reverse          swap FROM and TO encodings
>    --help             print this help, then exit
>    --version          print version number, then exit
> 
> `-' as a file name argument can be used to refer to stdin or stdout.
> 
> # echo 0 > /proc/sys/kernel/randomize_va_space
> 
> $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> Segmentation fault
> ----------------------------------------------------------------
> 
> Would anybody try this and confirm or refute?
> 
> Rune




* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-06 11:34               ` musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader u-uy74
@ 2015-10-06 14:36                 ` Isaac Dunham
  2015-10-07  6:48                   ` u-uy74
  2015-10-06 17:07                 ` Rich Felker
  1 sibling, 1 reply; 18+ messages in thread
From: Isaac Dunham @ 2015-10-06 14:36 UTC (permalink / raw)
  To: musl

On Tue, Oct 06, 2015 at 01:34:51PM +0200, u-uy74@aetey.se wrote:
> On Wed, Sep 30, 2015 at 10:35:48PM +0200, u-uy74@aetey.se wrote:
> > On Wed, Sep 30, 2015 at 05:43:37PM +0200, u-uy74@aetey.se wrote:
> > > 
> > > I still think that the crashes are caused by errors
> > > while guessing the stack placement in pthread_getattr_np(),
> > > simply because of the kernel doing something else than usual.
> > 
> > I can reproduce the problem and this looks like something
> > to fix or at least work around, either in gcc or in musl.
> > 
> > Running with the implicit loader works, but using the explicit one yields:
> > 
> > ----------------------------------------------------------------
> > # cat /proc/sys/kernel/randomize_va_space
> > 2
> > 
> > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]
> > 
> > # echo 0 > /proc/sys/kernel/randomize_va_space
> > 
> > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > Segmentation fault
> > ----------------------------------------------------------------
> > 
> > Would anybody try this and confirm or refute?
> > 
> > Rune
> 
> Either nobody cares or nobody has a gcc-5.x toolchain built with musl?
> Wondering.

It's just that nobody cares about gcj, I think.
Now that Alpine has moved to GCC 5.2, I've tried it with the distro packages.
I've installed gcc-java;
with both randomize_va_space = 0 and 2, specifying an alternate path to the
default musl dynamic linker in the same way you did does not result in a
segfault.
Same goes with a local build of musl, using -Os.

HTH,
Isaac




* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-06 11:34               ` musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader u-uy74
  2015-10-06 14:36                 ` Isaac Dunham
@ 2015-10-06 17:07                 ` Rich Felker
  2015-10-07  7:27                   ` u-uy74
  1 sibling, 1 reply; 18+ messages in thread
From: Rich Felker @ 2015-10-06 17:07 UTC (permalink / raw)
  To: musl

On Tue, Oct 06, 2015 at 01:34:51PM +0200, u-uy74@aetey.se wrote:
> Either nobody cares or nobody has a gcc-5.x toolchain built with musl?
> Wondering.
> 
> gcc-5 looks like a case important enough to care.

It's not that I'm uninterested, just that there does not yet seem to
be any reason to believe it's a bug in musl or any easy test-case to
reproduce the problem, so I wouldn't even know where to get started...

I think you really need to find a way to use what debugging tools you
have to figure out what's going on and where the actual source of the
crash is.

Rich

> On Wed, Sep 30, 2015 at 10:35:48PM +0200, u-uy74@aetey.se wrote:
> > On Wed, Sep 30, 2015 at 05:43:37PM +0200, u-uy74@aetey.se wrote:
> > > 
> > > I still think that the crashes are caused by errors
> > > while guessing the stack placement in pthread_getattr_np(),
> > > simply because of the kernel doing something else than usual.
> > 
> > I can reproduce the problem and this looks like something
> > to fix or at least work around, either in gcc or in musl.
> > 
> > Running with the implicit loader works, but using the explicit one yields:
> > 
> > ----------------------------------------------------------------
> > # cat /proc/sys/kernel/randomize_va_space
> > 2
> > 
> > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]
> > 
> > Convert from one encoding to another.
> > 
> >    --encoding FROM
> >    --from FROM        use FROM as source encoding name
> >    --to TO            use TO as target encoding name
> >    -i FILE            read from FILE
> >    -o FILE            print output to FILE
> >    --reverse          swap FROM and TO encodings
> >    --help             print this help, then exit
> >    --version          print version number, then exit
> > 
> > `-' as a file name argument can be used to refer to stdin or stdout.
> > 
> > # echo 0 > /proc/sys/kernel/randomize_va_space
> > 
> > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > Segmentation fault
> > ----------------------------------------------------------------
> > 
> > Would anybody try this and confirm or refute?
> > 
> > Rune



* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-06 14:36                 ` Isaac Dunham
@ 2015-10-07  6:48                   ` u-uy74
  0 siblings, 0 replies; 18+ messages in thread
From: u-uy74 @ 2015-10-07  6:48 UTC (permalink / raw)
  To: musl

On Tue, Oct 06, 2015 at 07:36:54AM -0700, Isaac Dunham wrote:
> > > ----------------------------------------------------------------
> > > # cat /proc/sys/kernel/randomize_va_space
> > > 2
> > > 
> > > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > > Usage: jv-convert [OPTIONS] [INPUTFILE [OUTPUTFILE]]
> > > 
> > > # echo 0 > /proc/sys/kernel/randomize_va_space
> > > 
> > > $ /pathtomusllibc.so --library-path /pathtogcc-5libs /pathto/jv-convert --help
> > > Segmentation fault
> > > ----------------------------------------------------------------
> > > 
> > > Would anybody try this and confirm or refute?

> It's just that nobody cares about gcj, I think.

A good point, indeed.

> Now that Alpine has moved to GCC 5.2, I've tried it with the distro packages.
> I've installed gcc-java;
> with both randomize_va_space = 0 and 2, specifying an alternate path to the
> default musl dynamic linker in the same way you did does not result in a
> segfault.
> Same goes with a local build of musl, using -Os.

Thanks Isaac, appreciated.
Apparently the breakage is not as simple as it looked in my test here.

> HTH,

It does.

> Isaac

Rune




* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-06 17:07                 ` Rich Felker
@ 2015-10-07  7:27                   ` u-uy74
  2015-10-07  7:43                     ` Timo Teras
  0 siblings, 1 reply; 18+ messages in thread
From: u-uy74 @ 2015-10-07  7:27 UTC (permalink / raw)
  To: musl

On Tue, Oct 06, 2015 at 01:07:55PM -0400, Rich Felker wrote:
> It's not that I'm uninterested, just that there does not yet seem to
> be any reason to believe it's a bug in musl or any easy test-case to
> reproduce the problem, so I wouldn't even know where to get started...

That's why I looked for somebody to do a simple test (even though with
a "complex" application), to see how reproducible the problem is.

The crash (now I assume that it resides in gcc) depends apparently on
a combination of many variables.

> I think you really need to find a way to use what debugging tools you
> have to figure out what's going on and where the actual source of the
> crash is.

Pretty remarkably, neither my usual gdb nor Debian's current gdb was
able to make sense of the crashes. Probably the thread states became
messed up too badly.

Fortunately I no longer think that musl is the culprit, nor do I
actually need gcj; besides, I have a workaround. I will not pursue this
issue further.

Thanks for your feedback, sorry for the noise.

Rune




* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-07  7:27                   ` u-uy74
@ 2015-10-07  7:43                     ` Timo Teras
  2015-10-07 10:59                       ` u-uy74
  2015-10-08 16:48                       ` Rich Felker
  0 siblings, 2 replies; 18+ messages in thread
From: Timo Teras @ 2015-10-07  7:43 UTC (permalink / raw)
  To: u-uy74; +Cc: musl

On Wed, 7 Oct 2015 09:27:54 +0200
u-uy74@aetey.se wrote:

> > I think you really need to find a way to use what debugging tools
> > you have to figure out what's going on and where the actual source
> > of the crash is.  
> 
> Pretty remarkably, neither my usual gdb nor Debian's current gdb was
> able to make sense of the crashes. Probably the thread states became
> messed up too badly.
> 
> Fortunately I no longer think that musl is the culprit, nor do I
> actually need gcj; besides, I have a workaround. I will not pursue this
> issue further.
> 
> Thanks for your feedback, sorry for the noise.

gcj uses boehm-gc. Alpine has patches for gcc boehm-gc. We are also
patching gcc's gcj. You can see our full patch set at:
http://git.alpinelinux.org/cgit/aports/tree/main/gcc

Some of these may or may not fix the issue you have at hand. I am not
sure how your gcc/gcj is built.

/Timo



* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-07  7:43                     ` Timo Teras
@ 2015-10-07 10:59                       ` u-uy74
  2015-10-08 16:48                       ` Rich Felker
  1 sibling, 0 replies; 18+ messages in thread
From: u-uy74 @ 2015-10-07 10:59 UTC (permalink / raw)
  To: musl

On Wed, Oct 07, 2015 at 10:43:39AM +0300, Timo Teras wrote:
> gcj uses boehm-gc. Alpine has patches for gcc boehm-gc. We are also
> patching gcc's gcj. You can see our full patch set at:
> http://git.alpinelinux.org/cgit/aports/tree/main/gcc

I see now.

My musl-related tweaks to boehm-gc were insufficient,
missing the ones for
 boehm-gc/dyn_load.c
 boehm-gc/include/private/gcconfig.h

Testing with these fixes applied...
... 3 hours later the build is ready.

Now it does not crash (of course) in any situation where I test.

> Some of these may or may not fix the issue you have at hand. I am not
> sure how your gcc/gcj is built.

Thanks a lot Timo!

I overestimated the compatibility of gcc-5.2 with musl.

Apologies for unfounded doubts of musl correctness.

Rune




* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-07  7:43                     ` Timo Teras
  2015-10-07 10:59                       ` u-uy74
@ 2015-10-08 16:48                       ` Rich Felker
  2015-10-09  5:39                         ` Timo Teras
  1 sibling, 1 reply; 18+ messages in thread
From: Rich Felker @ 2015-10-08 16:48 UTC (permalink / raw)
  To: musl

On Wed, Oct 07, 2015 at 10:43:39AM +0300, Timo Teras wrote:
> On Wed, 7 Oct 2015 09:27:54 +0200
> u-uy74@aetey.se wrote:
> 
> > > I think you really need to find a way to use what debugging tools
> > > you have to figure out what's going on and where the actual source
> > > of the crash is.  
> > 
> > Pretty remarkably, neither my usual gdb nor Debian's current gdb was
> > able to make sense of the crashes. Probably the thread states became
> > messed up too badly.
> > 
> > Fortunately I no longer think that musl is the culprit, nor do I
> > actually need gcj; besides, I have a workaround. I will not pursue this
> > issue further.
> > 
> > Thanks for your feedback, sorry for the noise.
> 
> gcj uses boehm-gc. Alpine has patches for gcc boehm-gc. We are also
> patching gcc's gcj. You can see our full patch set at:
> http://git.alpinelinux.org/cgit/aports/tree/main/gcc
> 
> Some of these may or may not fix the issue you have at hand. I am not
> sure how your gcc/gcj is built.

Thank you very much for finding the cause of this. Do you know if
these patches have been submitted upstream to gcc and/or boehm?
Obviously assuming by default that __environ is the start of .data and
only doing a proper search on glibc is broken basically everywhere but
glibc. The dl_iterate_phdr stuff should probably use a configure
check; I think one already exists in gcc but the boehm-gc dir might
need its own.

Rich
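
[Note: the "proper search" mentioned above can be done with dl_iterate_phdr(),
which walks the program headers of every loaded object instead of guessing
where .data starts. The following is only a minimal sketch of that technique,
not the Alpine patch and not boehm-gc's actual code; the names visit_object
and count_writable_segments are hypothetical. It prints and counts the
writable PT_LOAD segments, which are the regions a collector would have to
treat as data roots.]

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Callback for dl_iterate_phdr(): called once per loaded object.
 * Prints every writable PT_LOAD segment and bumps the counter in *data.
 */
static int visit_object(struct dl_phdr_info *info, size_t size, void *data)
{
    size_t *count = data;
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(main program)";
    (void)size; /* unused: only fields present in every version are read */

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];

        /* Only loadable, writable segments can hold .data/.bss. */
        if (ph->p_type != PT_LOAD || !(ph->p_flags & PF_W))
            continue;

        /* Run-time address = load base of the object + segment vaddr. */
        uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
        uintptr_t end = start + (uintptr_t)ph->p_memsz;

        printf("%s: %p-%p\n", name, (void *)start, (void *)end);
        ++*count;
    }
    return 0; /* returning 0 continues the iteration */
}

/* Hypothetical helper: number of writable PT_LOAD segments in the process. */
size_t count_writable_segments(void)
{
    size_t count = 0;
    dl_iterate_phdr(visit_object, &count);
    return count;
}
```

[Because the addresses come from the program headers rather than from the
position of a symbol such as __environ, the result should not depend on
which libc is used or on how the loader was invoked.]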



* Re: musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader
  2015-10-08 16:48                       ` Rich Felker
@ 2015-10-09  5:39                         ` Timo Teras
  0 siblings, 0 replies; 18+ messages in thread
From: Timo Teras @ 2015-10-09  5:39 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Thu, 8 Oct 2015 12:48:25 -0400
Rich Felker <dalias@libc.org> wrote:

> On Wed, Oct 07, 2015 at 10:43:39AM +0300, Timo Teras wrote:
> > gcj uses boehm-gc. Alpine has patches for gcc boehm-gc. We are also
> > patching gcc's gcj. You can see our full patch set at:
> > http://git.alpinelinux.org/cgit/aports/tree/main/gcc
> > 
> > Some of these may or may not fix the issue you have at hand. I am not
> > sure how your gcc/gcj is built.
> 
> Thank you very much for finding the cause of this. Do you know if
> these patches have been submitted upstream to gcc and/or boehm?
> Obviously assuming by default that __environ is the start of .data and
> only doing a proper search on glibc is broken basically everywhere but
> glibc. The dl_iterate_phdr stuff should probably use a configure
> check; I think one already exists in gcc but the boehm-gc dir might
> need its own.

I think not. I did write most of those patches, but most of them are
not suitable for upstreaming as-is. Additional configury magic would be
needed.



Thread overview: 18+ messages
2015-09-20  6:39 pthread_getattr_np() vs explicit runtime loader u-wsnj
2015-09-20 16:34 ` Rich Felker
2015-09-20 17:22   ` u-wsnj
2015-09-20 18:27     ` Rich Felker
2015-09-20 19:30       ` u-wsnj
2015-09-20 19:41         ` Rich Felker
2015-09-21  7:57           ` u-wsnj
2015-09-30 15:43           ` u-uy74
2015-09-30 20:35             ` Update: [musl] " u-uy74
2015-10-06 11:34               ` musl bug or not, real or not? (Was: [musl] Update: [musl] pthread_getattr_np() vs explicit runtime) loader u-uy74
2015-10-06 14:36                 ` Isaac Dunham
2015-10-07  6:48                   ` u-uy74
2015-10-06 17:07                 ` Rich Felker
2015-10-07  7:27                   ` u-uy74
2015-10-07  7:43                     ` Timo Teras
2015-10-07 10:59                       ` u-uy74
2015-10-08 16:48                       ` Rich Felker
2015-10-09  5:39                         ` Timo Teras
