From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12020 Path: news.gmane.org!.POSTED!not-for-mail From: Petr Skocik Newsgroups: gmane.linux.lib.musl.general Subject: Re: posix_spawn topics [was: Re: [musl] musl 1.1.17 released] Date: Fri, 20 Oct 2017 11:02:02 +0200 Message-ID: References: <20171019201337.GA335@brightrain.aerifal.cx> <20171020011716.GS1627@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="001a11417734f627de055bf6b9bb" X-Trace: blaine.gmane.org 1508490164 27773 195.159.176.226 (20 Oct 2017 09:02:44 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 20 Oct 2017 09:02:44 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-12033-gllmg-musl=m.gmane.org@lists.openwall.com Fri Oct 20 11:02:35 2017 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1e5TC0-0003eO-BH for gllmg-musl@m.gmane.org; Fri, 20 Oct 2017 11:02:12 +0200 Original-Received: (qmail 11774 invoked by uid 550); 20 Oct 2017 09:02:16 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 11753 invoked from network); 20 Oct 2017 09:02:15 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to; bh=B9MOQ3HL8tLov63/iKpEDWOU6KDoU0K5yCfgF9cxXmc=; b=HuMYQgAFxk3vI8fuz/qI5atF+WR3pVUTKjj492WAadCdht4IGS/Om9T0Ah49VVpZCg I+Rqdr+yJlnFKp/yiaesUWo+y616o1P88tfTrbGtT/mPFRszE5xyoNJobCEckCcNQQbx Tg3/TBjrOXzYdX5JnNbhTnuyI94uD4jLPsRkJ0i/yXHVhFAAGTvRI6n4DTwVQNNYDIpy f0amDQRLE1p80TuUYAx9lNOqEuNeKGZzMIVqoP5FlGET7UgeldT8/2PaMrgZz1NWKbzO VNoiCW73ra1wnRIRiBRhLaU7GCpoG0L2lditKAOZNaXVfRp4B1d0aonCqiM47C44YjW2 aHUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to; bh=B9MOQ3HL8tLov63/iKpEDWOU6KDoU0K5yCfgF9cxXmc=; b=mdI7R/C4rT52soDLED73nEap8AY/lFb53XFZB0Ql57ayVTs/OlUmlbjSv89VtzLiSP gTfw+Mf1StmKc5/sdcqBG4OxIgp/hYFJlrB3/pE6ef+1imxaaPCAjHm9CWpvPSFpdDpN xNAkQZsQLnlt5AWYOlSD0ONtm9Np23R8ENix6SIlYVLfm/fA/7K1rh1eRUqykygesu8Y kfxZaPA7Zsclz7jqCIK77S5s966cyjDTw35LbkfEZZgwz8JsiygBj9x0hswi54btoyz0 uHVwcarFj1VLW5OwTo0V81DsmWTqRd+YIgQ+SFowkolMklRo0BxCLoa6gdm08dzcMG34 s0Rg== X-Gm-Message-State: AMCzsaWGDyaa260nyH/59dfJkf2CdJVIkYv4OMJsAeFuxZdQXYiC+hFM cNMegw9EgRCH6eRBPbRPNiAKLA+Iyql5YTyJcvZKyw== X-Google-Smtp-Source: ABhQp+Q3YRmxtMbOouSThQJZm1Cbpw2aXGTTCccYfe+MRuFYtvQF0ReMi4zjpw3bdqptrBp+vrA6pVNeR2lw6Xk+rUA= X-Received: by 10.157.54.212 with SMTP id s20mr2668857otd.83.1508490123539; Fri, 20 Oct 2017 02:02:03 -0700 (PDT) Original-Sender: thorx89@gmail.com In-Reply-To: <20171020011716.GS1627@brightrain.aerifal.cx> X-Google-Sender-Auth: VW-ZhhYfy6a0upzd4tq0QomXKKQ Xref: news.gmane.org gmane.linux.lib.musl.general:12020 Archived-At: --001a11417734f627de055bf6b9bb Content-Type: multipart/alternative; boundary="001a11417734f627da055bf6b9b9" --001a11417734f627da055bf6b9b9 Content-Type: text/plain; charset="UTF-8" Thanks for the feedback. I was using it modified like this because I need the enhanced adddup2 semantics since I want to do all my opens in the parent for more fine-grained error reporting. So I only really care about clearing the FD_CLOEXEC flag in the FDOP_DUP2 case, and so I'm glad you're willing to accept the semantics change. (Attached is perhaps a mergeable patch.) On Fri, Oct 20, 2017 at 3:17 AM, Rich Felker wrote: > On Fri, Oct 20, 2017 at 02:11:35AM +0200, Petr Skocik wrote: > > Hi. > > Since posix_spawn is being discussed, I'd like to report a what I think > is > > a bug in the current implementation: > > Since the child is unmasking its signals only after it's performed its > file > > actions, the process may get blocked indefinitely (e.g., on a FIFO > > open) while being unresponsive to signals. > > > > Example program (with close to no error checking): > > > > > > #define _GNU_SOURCE > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > int main() > > { > > pid_t pid; > > mkfifo("fifo", 0640); > > posix_spawn_file_actions_t fa; > > posix_spawn_file_actions_init(&fa); > > posix_spawn_file_actions_addopen(&fa, 0, "fifo", O_RDONLY, 0); > > posix_spawnp(&pid, "tr", &fa, 0, (char *const[]){ "tr", "a-z", "A-Z", 0}, > > environ); > > > > //will get stuck here > > > > } > > > > > > It think the pthread_mask call should be moved above the file actions, > > which fixes this problem. > > It's actually rather ugly that this happens at all. Your "fix" is > presumably that ^C now works, which of course helps. But short of > terminals signals there's no legitimate way to kill the hung process > because its pid has not been returned yet. > > I'm not really sure what should happen here; it may call for an > interpretation request. > > > Additionally, as far as extensions to the current POSIX standard are > > concerned, adding the herein (http://austingroupbugs.net/view.php?id=411 > ) > > proposed change to *adddup2 semantics (clearing the FD_CLOEXEC flag on > the > > (given target in dup2 file actions where the source and target > > filedescriptor are identical) would be super nice (the reasoning for it > > should be in the link). > > I wasn't aware of that part of #411; I agree it should be adopted now > without waiting for a new standard. > > > Finally, I noticed the error passing code can be reduced to a simple > write > > to a volatile int made from the child (purely cosmetic). > > The pipe is there for a very important reason, and the reason is not > passing the error. It just happens that, since we already have to read > from it in the parent anyway, it makes a convenient channel to pass > the error on. Your modified version has serious memory corruption > because the parent continues running (and may return from posix_spawn) > after args->ret is set but before the child exits, thereby ending the > lifetime of the child's stack. Atomicity of closing of the > close-on-exec pipe with respect to exec is the property we're using > here to detect when exec has taken effect and memory is no longer > shared. > > An argument can be made that CLONE_VFORK mandates the schedudler > prevent this, but historically it did not always do this correctly, > especially when traced (and I believe it's still possible when tracing > to manually resume the parent while the child is still sharing > memory), and the intent is to be compatible with kernels where > CLONE_VFORK is a nop. The main reason CLONE_VFORK is used is actually > as a hint to qemu-userlevel to get it to emulate things right. > > > Attached are patches for these 3 things in case you wanted them. (I hope > > I'm not doing something stupid.) > > Otherwise no, not "stupid", but the patches are a lot larger/more > invasive than they need to be and don't follow coding style. > > > @@ -97,14 +108,14 @@ static int child(void *args_vp) > > __syscall(SYS_close, op->fd); > > break; > > case FDOP_DUP2: > > - if ((ret=__sys_dup2(op->srcfd, op->fd))<0) > > + if ((ret=dup3_or_set_cloexec(op->srcfd, > op->fd, 0))<0) > > goto fail; > > break; > > case FDOP_OPEN: > > fd = __sys_open(op->path, op->oflag, > op->mode); > > if ((ret=fd) < 0) goto fail; > > if (fd != op->fd) { > > - if ((ret=__sys_dup2(fd, op->fd))<0) > > + if ((ret=dup3_or_set_cloexec(fd, > op->fd, 0))<0) > > This (FDOP_OPEN) is not a case covered by the POSIX interpretation, > and using it here would actually be wrong if it were reachable, but > you're already inside "if (fd != op->fd)" and thus there's no way > fd==op->fd can be true. So the change in this line is just spurious. > There's then just one place where dup3_or_set_cloexec is called (the > previous case), and it doesn't make sense to have a function; rather > the conditional should just be inline in that case. > > Rich > --001a11417734f627da055bf6b9b9 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for the feedback. I was using it modified like this= because=C2=A0 I need the enhanced adddup2 semantics since I want to do all= my opens in the parent for more fine-grained error reporting.=C2=A0 So I o= nly really care about clearing the FD_CLOEXEC flag in the FDOP_DUP2 case, a= nd so I'm glad you're willing to accept the semantics change. (Atta= ched is perhaps a mergeable patch.)

On Fri, Oct 20, 2017 at 3:17 AM, Rich Felker <dalias= @libc.org> wrote:
On Fri, = Oct 20, 2017 at 02:11:35AM +0200, Petr Skocik wrote:
> Hi.
> Since posix_spawn is being discussed, I'd like to report a what I = think is
> a bug in the current implementation:
> Since the child is unmasking its signals only after it's performed= its file
> actions, the process may get blocked indefinitely (e.g., on a FIFO
> open) while being unresponsive to signals.
>
> Example program (with close to no error checking):
>
>
> #define _GNU_SOURCE
> #include <unistd.h>
> #include <spawn.h>
> #include <stdlib.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/wait.h>
> int main()
> {
> pid_t pid;
> mkfifo("fifo", 0640);
> posix_spawn_file_actions_t fa;
> posix_spawn_file_actions_init(&fa);
> posix_spawn_file_actions_addopen(&fa, 0, "fifo", O_= RDONLY, 0);
> posix_spawnp(&pid, "tr", &fa, 0, (char *const[]){ &q= uot;tr", "a-z", "A-Z", 0},
> environ);
>
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0//will get stuck here
>
> }
>
>
> It=C2=A0 think the pthread_mask call should be moved above the file ac= tions,
> which fixes this problem.

It's actually rather ugly that this happens at all. Your "fix"= ; is
presumably that ^C now works, which of course helps. But short of
terminals signals there's no legitimate way to kill the hung process because its pid has not been returned yet.

I'm not really sure what should happen here; it may call for an
interpretation request.

> Additionally, as far as extensions to the current POSIX standard are > concerned, adding the herein (http://austingroupbug= s.net/view.php?id=3D411)
> proposed change to *adddup2 semantics (clearing the FD_CLOEXEC flag on= the
>=C2=A0 (given target in dup2 file actions where the source and target > filedescriptor are identical) would be super nice (the reasoning for i= t
> should be in the link).

I wasn't aware of that part of #411; I agree it should be adopted now without waiting for a new standard.

> Finally, I noticed the error passing code can be reduced to a simple w= rite
> to a volatile int made from the child (purely cosmetic).

The pipe is there for a very important reason, and the reason is not
passing the error. It just happens that, since we already have to read
from it in the parent anyway, it makes a convenient channel to pass
the error on. Your modified version has serious memory corruption
because the parent continues running (and may return from posix_spawn)
after args->ret is set but before the child exits, thereby ending the lifetime of the child's stack. Atomicity of closing of the
close-on-exec pipe with respect to exec is the property we're using
here to detect when exec has taken effect and memory is no longer
shared.

An argument can be made that CLONE_VFORK mandates the schedudler
prevent this, but historically it did not always do this correctly,
especially when traced (and I believe it's still possible when tracing<= br> to manually resume the parent while the child is still sharing
memory), and the intent is to be compatible with kernels where
CLONE_VFORK is a nop. The main reason CLONE_VFORK is used is actually
as a hint to qemu-userlevel to get it to emulate things right.

> Attached are patches for these 3 things in case you wanted them. (I ho= pe
> I'm not doing something stupid.)

Otherwise no, not "stupid", but the patches are a lot larger/more=
invasive than they need to be and don't follow coding style.

> @@ -97,14 +108,14 @@ static int child(void *args_vp)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0__syscall(SYS_close, op->fd); >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break;
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0case FDOP_DUP2:
> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if ((ret=3D__sys_dup2(op->srcfd, op-&= gt;fd))<0)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if ((ret=3Ddup3_or_set_cloexec(op->srcfd, op->fd, 0))<0)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0goto f= ail;
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break;
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0case FDOP_OPEN:
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fd =3D __sys_open(op->path, op-= >oflag, op->mode);
>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if ((ret=3Dfd) < 0) goto fail;<= br> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (fd !=3D op->fd) {
> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if ((ret=3D_= _sys_dup2(fd, op->fd))<0)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if ((ret=3Dd= up3_or_set_cloexec(fd, op->fd, 0))<0)

This (FDOP_OPEN) is not a case covered by the POSIX interpretation,
and using it here would actually be wrong if it were reachable, but
you're already inside "if (fd !=3D op->fd)" and thus there= 's no way
fd=3D=3Dop->fd can be true. So the change in this line is just spurious.=
There's then just one place where dup3_or_set_cloexec is called (the previous case), and it doesn't make sense to have a function; rather the conditional should just be inline in that case.

Rich

--001a11417734f627da055bf6b9b9-- --001a11417734f627de055bf6b9bb Content-Type: text/x-patch; charset="US-ASCII"; name="0001-New-posix_spawn_file_actions_adddup2-semantics.patch" Content-Disposition: attachment; filename="0001-New-posix_spawn_file_actions_adddup2-semantics.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_j8zo15s80 RnJvbSA4ZTcyOTYzMmEwNDc0NmJhYTU2NWY0NWNmNWZkYWUyMjVkYjE0MjBiIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBQZXRyIFNrb2NpayA8Z21haWxAcHNrb2Npay5jb20+CkRhdGU6 IEZyaSwgMjAgT2N0IDIwMTcgMTA6NTA6MjQgKzAyMDAKU3ViamVjdDogW1BBVENIXSBOZXcgcG9z aXhfc3Bhd25fZmlsZV9hY3Rpb25zX2FkZGR1cDIgc2VtYW50aWNzCgpDbGVhciBGRF9DTE9FWEVD IG9uIHRoZSB0YXJnZXQgZmQgaW4gZHVwMiBmaWxlIGFjdGlvbnMgd2hlcmUgdGFyZ2V0IGZkID09 IHNyY19mZC4KUmVmOiBodHRwOi8vYXVzdGluZ3JvdXBidWdzLm5ldC92aWV3LnBocD9pZD00MTEK LS0tCiBzcmMvcHJvY2Vzcy9wb3NpeF9zcGF3bi5jIHwgMTEgKysrKysrKysrLS0KIDEgZmlsZSBj aGFuZ2VkLCA5IGluc2VydGlvbnMoKyksIDIgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvc3Jj L3Byb2Nlc3MvcG9zaXhfc3Bhd24uYyBiL3NyYy9wcm9jZXNzL3Bvc2l4X3NwYXduLmMKaW5kZXgg ZWE1ZDI5OTguLjcyMWQwNzdkIDEwMDY0NAotLS0gYS9zcmMvcHJvY2Vzcy9wb3NpeF9zcGF3bi5j CisrKyBiL3NyYy9wcm9jZXNzL3Bvc2l4X3NwYXduLmMKQEAgLTEwOSw4ICsxMDksMTUgQEAgc3Rh dGljIGludCBjaGlsZCh2b2lkICphcmdzX3ZwKQogCQkJCV9fc3lzY2FsbChTWVNfY2xvc2UsIG9w LT5mZCk7CiAJCQkJYnJlYWs7CiAJCQljYXNlIEZET1BfRFVQMjoKLQkJCQlpZiAoKHJldD1fX3N5 c19kdXAyKG9wLT5zcmNmZCwgb3AtPmZkKSk8MCkKLQkJCQkJZ290byBmYWlsOworCQkJCWlmIChv cC0+c3JjZmQgIT0gb3AtPmZkKXsKKwkJCQkJaWYgKChyZXQ9X19zeXNfZHVwMihvcC0+c3JjZmQs IG9wLT5mZCkpPDApCisJCQkJCQlnb3RvIGZhaWw7CisJCQkJfWVsc2V7CisJCQkJCWlmICgocmV0 PV9fc3lzY2FsbChTWVNfZmNudGwsIG9wLT5mZCwgRl9HRVRGRCkpPDApCisJCQkJCQlnb3RvIGZh aWw7CisJCQkJCWlmICgocmV0PV9fc3lzY2FsbChTWVNfZmNudGwsIG9wLT5mZCwgRl9TRVRGRCwg cmV0ICYgfkZEX0NMT0VYRUMpKQorCQkJCQkJZ290byBmYWlsOworCQkJCX0KIAkJCQlicmVhazsK IAkJCWNhc2UgRkRPUF9PUEVOOgogCQkJCWZkID0gX19zeXNfb3BlbihvcC0+cGF0aCwgb3AtPm9m bGFnLCBvcC0+bW9kZSk7Ci0tIAoyLjE0LjIKCg== --001a11417734f627de055bf6b9bb--