From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12016 Path: news.gmane.org!.POSTED!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: posix_spawn topics [was: Re: [musl] musl 1.1.17 released] Date: Thu, 19 Oct 2017 21:17:16 -0400 Message-ID: <20171020011716.GS1627@brightrain.aerifal.cx> References: <20171019201337.GA335@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1508462264 9780 195.159.176.226 (20 Oct 2017 01:17:44 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 20 Oct 2017 01:17:44 +0000 (UTC) User-Agent: Mutt/1.5.21 (2010-09-15) To: musl@lists.openwall.com Original-X-From: musl-return-12029-gllmg-musl=m.gmane.org@lists.openwall.com Fri Oct 20 03:17:39 2017 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1e5LwC-0008V4-DV for gllmg-musl@m.gmane.org; Fri, 20 Oct 2017 03:17:24 +0200 Original-Received: (qmail 20273 invoked by uid 550); 20 Oct 2017 01:17:29 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 20250 invoked from network); 20 Oct 2017 01:17:28 -0000 Content-Disposition: inline In-Reply-To: Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:12016 Archived-At: On Fri, Oct 20, 2017 at 02:11:35AM +0200, Petr Skocik wrote: > Hi. > Since posix_spawn is being discussed, I'd like to report a what I think is > a bug in the current implementation: > Since the child is unmasking its signals only after it's performed its file > actions, the process may get blocked indefinitely (e.g., on a FIFO > open) while being unresponsive to signals. > > Example program (with close to no error checking): > > > #define _GNU_SOURCE > #include > #include > #include > #include > #include > #include > #include > int main() > { > pid_t pid; > mkfifo("fifo", 0640); > posix_spawn_file_actions_t fa; > posix_spawn_file_actions_init(&fa); > posix_spawn_file_actions_addopen(&fa, 0, "fifo", O_RDONLY, 0); > posix_spawnp(&pid, "tr", &fa, 0, (char *const[]){ "tr", "a-z", "A-Z", 0}, > environ); > > //will get stuck here > > } > > > It think the pthread_mask call should be moved above the file actions, > which fixes this problem. It's actually rather ugly that this happens at all. Your "fix" is presumably that ^C now works, which of course helps. But short of terminals signals there's no legitimate way to kill the hung process because its pid has not been returned yet. I'm not really sure what should happen here; it may call for an interpretation request. > Additionally, as far as extensions to the current POSIX standard are > concerned, adding the herein (http://austingroupbugs.net/view.php?id=411) > proposed change to *adddup2 semantics (clearing the FD_CLOEXEC flag on the > (given target in dup2 file actions where the source and target > filedescriptor are identical) would be super nice (the reasoning for it > should be in the link). I wasn't aware of that part of #411; I agree it should be adopted now without waiting for a new standard. > Finally, I noticed the error passing code can be reduced to a simple write > to a volatile int made from the child (purely cosmetic). The pipe is there for a very important reason, and the reason is not passing the error. It just happens that, since we already have to read from it in the parent anyway, it makes a convenient channel to pass the error on. Your modified version has serious memory corruption because the parent continues running (and may return from posix_spawn) after args->ret is set but before the child exits, thereby ending the lifetime of the child's stack. Atomicity of closing of the close-on-exec pipe with respect to exec is the property we're using here to detect when exec has taken effect and memory is no longer shared. An argument can be made that CLONE_VFORK mandates the schedudler prevent this, but historically it did not always do this correctly, especially when traced (and I believe it's still possible when tracing to manually resume the parent while the child is still sharing memory), and the intent is to be compatible with kernels where CLONE_VFORK is a nop. The main reason CLONE_VFORK is used is actually as a hint to qemu-userlevel to get it to emulate things right. > Attached are patches for these 3 things in case you wanted them. (I hope > I'm not doing something stupid.) Otherwise no, not "stupid", but the patches are a lot larger/more invasive than they need to be and don't follow coding style. > @@ -97,14 +108,14 @@ static int child(void *args_vp) > __syscall(SYS_close, op->fd); > break; > case FDOP_DUP2: > - if ((ret=__sys_dup2(op->srcfd, op->fd))<0) > + if ((ret=dup3_or_set_cloexec(op->srcfd, op->fd, 0))<0) > goto fail; > break; > case FDOP_OPEN: > fd = __sys_open(op->path, op->oflag, op->mode); > if ((ret=fd) < 0) goto fail; > if (fd != op->fd) { > - if ((ret=__sys_dup2(fd, op->fd))<0) > + if ((ret=dup3_or_set_cloexec(fd, op->fd, 0))<0) This (FDOP_OPEN) is not a case covered by the POSIX interpretation, and using it here would actually be wrong if it were reachable, but you're already inside "if (fd != op->fd)" and thus there's no way fd==op->fd can be true. So the change in this line is just spurious. There's then just one place where dup3_or_set_cloexec is called (the previous case), and it doesn't make sense to have a function; rather the conditional should just be inline in that case. Rich