Re: aio_cancel segmentation fault for in progress write requests

mailing list of musl libc

From: Arkadiusz Sienkiewicz <sienkiewiczarkadiusz@gmail.com>
To: musl@lists.openwall.com
Cc: dalias@libc.org
Subject: Re: aio_cancel segmentation fault for in progress write requests
Date: Mon, 10 Dec 2018 10:05:05 +0100
Message-ID: <CAO=yjR0DGYBjm4uN4ESE+vosP-div5qu9jkikPsZGpWbgzCm0g@mail.gmail.com>
In-Reply-To: <87in049em2.fsf@oldenburg2.str.redhat.com>

Here are answers to some questions directed to me earlier:

> Could you attach the log from "strace -f -o strace.log ~/aioWrite"?
Sorry, I can't do that: strace is not installed and I don't have root access.
If it is still needed, I will ask the admin to install strace.

> Do the other machines have the same kernel (4.15.0-20-generic)?
No, other machines use kernel 4.15.0-39-generic.

> Have you tried running the binary built on a successful machine on
> the problematic machine?

Yes, same effect: segmentation fault. The backtrace from gdb is identical too.

> valgrind might also be a good idea.

alpine-tmp-0:~$ strace -f ./aioWrite
-sh: strace: not found
alpine-tmp-0:~$ valgrind
valgrind            valgrind-di-server  valgrind-listener
alpine-tmp-0:~$ valgrind ./aioWrite
==70339== Memcheck, a memory error detector
==70339== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==70339== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==70339== Command: ./aioWrite
==70339==
==70339== Invalid free() / delete / delete[] / realloc()
==70339==    at 0x4C92B0E: free (vg_replace_malloc.c:530)
==70339==    by 0x4020248: reclaim_gaps (dynlink.c:478)
==70339==    by 0x4020CD0: map_library (dynlink.c:674)
==70339==    by 0x4021818: load_library (dynlink.c:980)
==70339==    by 0x4022607: load_preload (dynlink.c:1075)
==70339==    by 0x4022607: __dls3 (dynlink.c:1585)
==70339==    by 0x4021EDB: __dls2 (dynlink.c:1389)
==70339==    by 0x401FC8E: ??? (in /lib/ld-musl-x86_64.so.1)
==70339==  Address 0x4e9a180 is in a rw- mapped file /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so segment
==70339==
==70339== Can't extend stack to 0x4087948 during signal delivery for thread 2:
==70339==   no stack segment
==70339==
==70339== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==70339==  Access not within mapped region at address 0x4087948
==70339==    at 0x4016834: __syscall3 (syscall_arch.h:29)
==70339==    by 0x4016834: __wake (pthread_impl.h:133)
==70339==    by 0x4016834: cleanup (aio.c:154)
==70339==    by 0x40167B0: io_thread_func (aio.c:255)
==70339==    by 0x4054292: start (pthread_create.c:145)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==    by 0x4053071: ??? (clone.s:21)
==70339==  If you believe this happened as a result of a stack
==70339==  overflow in your program's main thread (unlikely but
==70339==  possible), you can try to increase the size of the
==70339==  main thread stack using the --main-stacksize= flag.
==70339==  The main thread stack size used in this run was 8388608.
==70339==
==70339== HEAP SUMMARY:
==70339==     in use at exit: 81,051 bytes in 9 blocks
==70339==   total heap usage: 9 allocs, 3 frees, 81,051 bytes allocated
==70339==
==70339== LEAK SUMMARY:
==70339==    definitely lost: 0 bytes in 0 blocks
==70339==    indirectly lost: 0 bytes in 0 blocks
==70339==      possibly lost: 0 bytes in 0 blocks
==70339==    still reachable: 81,051 bytes in 9 blocks
==70339==         suppressed: 0 bytes in 0 blocks
==70339== Rerun with --leak-check=full to see details of leaked memory
==70339==
==70339== For counts of detected and suppressed errors, rerun with: -v
==70339== ERROR SUMMARY: 3 errors from 1 contexts (suppressed: 0 from 0)
Killed
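
The valgrind report above says the stack could not be extended during
signal delivery on thread 2, which fits the theory in this thread that the
kernel's signal frame may not fit on a small thread stack. One rough way to
see how large that frame is on a given machine is to deliver a signal onto
an alternate stack and check how far below the top of that stack the
handler ends up running. The sketch below does that. It is not the test
referenced later in this thread; the names measure_signal_frame and
probe_handler, the use of SIGUSR1, and the 64 KiB scratch stack are all
assumptions made for illustration. It also assumes a downward-growing
stack, and the result includes the handler's own small frame, so treat it
as an approximate upper bound.

```c
#define _XOPEN_SOURCE 700   /* for sigaltstack() and stack_t */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Address of a local variable inside the handler, recorded on delivery. */
static volatile uintptr_t handler_sp;

/* Hypothetical probe handler: only notes where its own frame lives. */
static void probe_handler(int sig)
{
    volatile char marker = 0;
    (void)sig;
    handler_sp = (uintptr_t)&marker;
}

/*
 * Returns the approximate number of bytes used at the top of an alternate
 * signal stack by the kernel's signal frame plus the handler's own frame,
 * or 0 if the measurement could not be taken.
 */
size_t measure_signal_frame(void)
{
    const size_t size = 64 * 1024;   /* deliberately generous scratch stack */
    size_t used = 0;
    char *buf = malloc(size);
    if (!buf)
        return 0;

    stack_t ss = { .ss_sp = buf, .ss_size = size, .ss_flags = 0 };
    stack_t old_ss;
    if (sigaltstack(&ss, &old_ss) != 0) {
        free(buf);
        return 0;
    }

    struct sigaction sa = { 0 }, old_sa;
    sa.sa_handler = probe_handler;
    sa.sa_flags = SA_ONSTACK;        /* run the handler on the scratch stack */
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, &old_sa) == 0) {
        handler_sp = 0;
        raise(SIGUSR1);              /* handler has run once raise() returns */
        sigaction(SIGUSR1, &old_sa, NULL);

        uintptr_t base = (uintptr_t)buf, top = base + size;
        if (handler_sp > base && handler_sp < top)
            used = (size_t)(top - handler_sp);  /* assumes stack grows down */
    }

    sigaltstack(&old_ss, NULL);      /* restore previous setting before free */
    free(buf);
    assert(used <= size);
    return used;
}
```

Calling measure_signal_frame() from main and printing the result on both a
working and a failing machine would show whether the frame size differs
between them. The value depends on the CPU features the kernel saves, so no
particular number is implied here.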


On Sat, 8 Dec 2018 at 17:18, Florian Weimer <fweimer@redhat.com> wrote:

> * Rich Felker:
>
> > On Fri, Dec 07, 2018 at 09:06:18PM +0100, Florian Weimer wrote:
> >> * Rich Felker:
> >>
> >> > I don't think so. I'm concerned that it's a stack overflow, and that
> >> > somehow the kernel folks have managed to break the MINSIGSTKSZ ABI.
> >>
> >> Probably:
> >>
> >>   <https://sourceware.org/bugzilla/show_bug.cgi?id=20305>
> >>   <https://sourceware.org/bugzilla/show_bug.cgi?id=22636>
> >>
> >> It's a nasty CPU backwards compatibility problem.  Some of the
> >> suggestions I made to work around this are simply wrong; don't take them
> >> too seriously.
> >>
> >> Nowadays, the kernel has a way to disable the %zmm registers, but it
> >> unfortunately does not reduce the save area size.
> >
> > How large is the saved context with the %zmm junk? I measured just
> > ~768 bytes on normal x86_64 without it, and since 2048 is rounded up
> > to a whole page (4096), overflow should not happen until the signal
> > context is something like 3.5k (allowing ~512 bytes for TCB (~128) and
> > 2 simple call frames).
>
> I wrote a test to do some measurements:
>
>   <https://sourceware.org/ml/libc-alpha/2018-12/msg00271.html>
>
> The signal handler context is quite large on x86-64 with AVX-512F,
> indeed around 3.5 KiB.  It is even larger on ppc64 and ppc64el
> (~4.5 KiB), which I find somewhat surprising.
>
> The cancellation test also includes stack usage from the libgcc
> unwinder.  Its stack usage likely differs between versions, so I should
> have included that in the reported results.
>
> Thanks,
> Florian
>
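
The headroom estimate in the quoted discussion can be written out as a
small calculation: a 2048-byte stack request rounded up to a 4096-byte
page, minus roughly 512 bytes reserved for the TCB and two simple call
frames, leaves about 3.5 KiB for the signal frame. The sketch below only
restates that arithmetic. The function names are made up for illustration,
and the inputs are the figures quoted in this thread, not values read from
musl or the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of page; page must be a power of two. */
static size_t round_up_to_page(size_t n, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (n + page - 1) & ~(page - 1);
}

/*
 * Bytes left for a signal frame on a thread stack of `requested` bytes
 * once it is rounded up to whole pages and `reserved` bytes are set aside
 * for the TCB and ordinary call frames. Returns 0 if nothing is left.
 */
size_t signal_frame_headroom(size_t requested, size_t page, size_t reserved)
{
    size_t mapped = round_up_to_page(requested, page);
    return mapped > reserved ? mapped - reserved : 0;
}
```

With the quoted figures, signal_frame_headroom(2048, 4096, 512) gives 3584
bytes, i.e. 3.5 KiB. That is comfortably above the ~768 bytes measured on
plain x86_64, but essentially equal to the ~3.5 KiB reported for x86-64
with AVX-512F, so on such machines there would be little or no margin.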
