From: Arkadiusz Sienkiewicz
To: dalias@libc.org
Cc: musl@lists.openwall.com
Subject: Re: aio_cancel segmentation fault for in progress write requests
Date: Fri, 7 Dec 2018 17:04:07 +0100
In-Reply-To: <20181207154419.GD23599@brightrain.aerifal.cx>

OK, maybe the stack trace is misleading due to some problem in GDB. However, that doesn't explain why I get a segmentation fault when I run the test program without gdb. Also, commenting out the aio_cancel line "fixes" the segfault, so that function is the most probable culprit.

On Fri, 7 Dec 2018 at 16:44, Rich Felker <dalias@libc.org> wrote:
On Fri, Dec 07, 2018 at 01:52:31PM +0100, Arkadiusz Sienkiewicz wrote:
> Hi,
>
> I'm experiencing a segmentation fault when I invoke aio_cancel on a request
> that is in the EINPROGRESS state. This happens only with musl libc
> (version 1.1.12-r8) and only on one of the few computers I've tried it on
> (a dual Intel Xeon Gold 6128); please let me know if you need more
> information about that machine. Attached is a very simple program
> (aioWrite.cpp) that reproduces this problem.
>
> alpine-tmp-0:~$ ./aioWrite
> Segmentation fault (core dumped)
>
> The backtrace from gdb shows the problem is in aio_cancel.

This is not correct:

>
> (gdb) r
> Starting program: ~/aioWrite
> [New LWP 70321]
>
> Program received signal ?, Unknown signal.
> [Switching to LWP 70321]

This just shows that the aio thread received the cancellation request.
It's not a crash or a problem. However, gdb's reporting of it as
"Unknown signal" and inability to pass it on correctly indicates that
something is wrong with the gdb on your system. I've hit this issue a
lot but it works on some systems and I don't recall what the
cause/difference behind it is. We should work to figure that out and
get an appropriate fix in distros that are affected.


> #include <stdio.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <string.h>
> #include <errno.h>
> #include <stdlib.h>
> #include <aio.h>
>
> #define TNAME "aio_write/1-1.c"
>
> int main() {
>   char tmpfname[256];
>   #define BUF_SIZE 512512
>   char buf[BUF_SIZE];
>   char check[BUF_SIZE+1];
>   int fd;
>   struct aiocb aiocb;
>   int err;
>   int ret;
>
>   snprintf(tmpfname, sizeof(tmpfname), "pts_aio_write_1_1_%d", getpid());
>   unlink(tmpfname);
>   fd = open(tmpfname, O_CREAT | O_RDWR | O_EXCL, S_IRUSR | S_IWUSR);
>   if (fd == -1) {
>     printf(TNAME " Error at open(): %s\n", strerror(errno));
>     exit(1);
>   }
>
>   unlink(tmpfname);
>
>   memset(buf, 0xaa, BUF_SIZE);
>   memset(&aiocb, 0, sizeof(struct aiocb));
>   aiocb.aio_fildes = fd;
>   aiocb.aio_buf = buf;
>   aiocb.aio_nbytes = BUF_SIZE;
>
>   if (aio_write(&aiocb) == -1) {
>     printf(TNAME " Error at aio_write(): %s\n", strerror(errno));
>     close(fd);
>     exit(2);
>   }
>
>   int cancellationStatus = aio_cancel(fd, &aiocb);
>   printf (TNAME " cancelationStatus : %d\n", cancellationStatus);
>
>   /* Wait until completion */
>   while (aio_error (&aiocb) == EINPROGRESS);
>
>   close(fd);
>   printf ("Test PASSED\n");
>   return 0;
> }

I just tried this test and it works for me on 32-bit x86. I'll try
some other systems and see if I can reproduce the issue. It could be a
bug in the test but I didn't see anything obviously wrong with it.

Rich