Great, thank you for the fix.
Will it be available only from version 1.1.21 onward, or will you also
backport it to older versions?

On Wed, 12 Dec 2018 at 01:36, Rich Felker <dalias@libc.org> wrote:

> On Mon, Dec 10, 2018 at 10:05:05AM +0100, Arkadiusz Sienkiewicz wrote:
> > Here are answers to some questions directed to me earlier:
> >
> > > Could you attach the log from "strace -f -o strace.log ~/aioWrite"?
> > Sorry, can't do that. strace is not installed and I don't have root access.
> > If this is still needed I will ask admin to add strace.
> >
> > > Do the other machines have the same kernel (4.15.0-20-generic)?
> > No, other machines use kernel 4.15.0-39-generic.
> >
> > > Have you tried running the binary built on a successful machine on
> > the problematic machine?
> >
> > Yes, same effect - segmentation fault. bt from gdb is identical too.
> >
> > > valgrind might also be a good idea.
> >
> > alpine-tmp-0:~$ strace -f ./aioWrite
> > -sh: strace: not found
> > alpine-tmp-0:~$ valgrind
> > valgrind            valgrind-di-server  valgrind-listener
> > alpine-tmp-0:~$ valgrind ./aioWrite
> > ==70339== Memcheck, a memory error detector
> > ==70339== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
> > ==70339== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
> > ==70339== Command: ./aioWrite
> > ==70339==
> > ==70339== Invalid free() / delete / delete[] / realloc()
> > ==70339==    at 0x4C92B0E: free (vg_replace_malloc.c:530)
> > ==70339==    by 0x4020248: reclaim_gaps (dynlink.c:478)
> > ==70339==    by 0x4020CD0: map_library (dynlink.c:674)
> > ==70339==    by 0x4021818: load_library (dynlink.c:980)
> > ==70339==    by 0x4022607: load_preload (dynlink.c:1075)
> > ==70339==    by 0x4022607: __dls3 (dynlink.c:1585)
> > ==70339==    by 0x4021EDB: __dls2 (dynlink.c:1389)
> > ==70339==    by 0x401FC8E: ??? (in /lib/ld-musl-x86_64.so.1)
> > ==70339==  Address 0x4e9a180 is in a rw- mapped file /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so segment
> > ==70339==
> > ==70339== Can't extend stack to 0x4087948 during signal delivery for thread 2:
> > ==70339==   no stack segment
> > ==70339==
> > ==70339== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> > ==70339==  Access not within mapped region at address 0x4087948
> > ==70339==    at 0x4016834: __syscall3 (syscall_arch.h:29)
> > ==70339==    by 0x4016834: __wake (pthread_impl.h:133)
> > ==70339==    by 0x4016834: cleanup (aio.c:154)
> > ==70339==    by 0x40167B0: io_thread_func (aio.c:255)
> > ==70339==    by 0x4054292: start (pthread_create.c:145)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==    by 0x4053071: ??? (clone.s:21)
> > ==70339==  If you believe this happened as a result of a stack
> > ==70339==  overflow in your program's main thread (unlikely but
> > ==70339==  possible), you can try to increase the size of the
> > ==70339==  main thread stack using the --main-stacksize= flag.
> > ==70339==  The main thread stack size used in this run was 8388608.
> > ==70339==
> > ==70339== HEAP SUMMARY:
> > ==70339==     in use at exit: 81,051 bytes in 9 blocks
> > ==70339==   total heap usage: 9 allocs, 3 frees, 81,051 bytes allocated
> > ==70339==
> > ==70339== LEAK SUMMARY:
> > ==70339==    definitely lost: 0 bytes in 0 blocks
> > ==70339==    indirectly lost: 0 bytes in 0 blocks
> > ==70339==      possibly lost: 0 bytes in 0 blocks
> > ==70339==    still reachable: 81,051 bytes in 9 blocks
> > ==70339==         suppressed: 0 bytes in 0 blocks
> > ==70339== Rerun with --leak-check=full to see details of leaked memory
> > ==70339==
> > ==70339== For counts of detected and suppressed errors, rerun with: -v
> > ==70339== ERROR SUMMARY: 3 errors from 1 contexts (suppressed: 0 from 0)
> > Killed
>
> Based on discussions in the other branches of this thread and on IRC,
> I'm reasonably sure the cause of your crash is that your combination
> of kernel and cpu model produces very large signal frames that
> overflow the stack on the io thread. I have committed a solution to
> the problem which I plan to push soon, along with some additional
> improvements in this area.
>
> Rich
>