From mboxrd@z Thu Jan 1 00:00:00 1970 From: presotto@plan9.bell-labs.com Message-Id: <200008140430.AAA01256@cse.psu.edu> Date: Mon, 14 Aug 2000 00:30:41 -0400 To: 9fans@cse.psu.edu Subject: Re: [9fans] Potential race in syspipe() MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="upas-nqanixaqvsqmaqxsvfdfhfxfvj" Topicbox-Message-UUID: fadb3e4c-eac8-11e9-9e20-41e7f4b1d025 This is a multi-part message in MIME format. --upas-nqanixaqvsqmaqxsvfdfhfxfvj Content-Disposition: inline Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit If the user is doing a dup2 unsynchronized with the syspipe(), even if everything works, the results are wierd with both solutions. Both the pipe() and the dup2() can succeed, except the pipe() can be returning fd's that have nothing to do with a pipe. As you said, dup2 is an dangerous thing when the fd's are shared. However, you're right. Your solution yields a more explainable world in the unlikely event that newfd() fails in syspipe(). I put in a newfd2(). The exact code follows. I added the routine findfreefd() since I want to keep that piece of magic in one place. I noticed your snippet didn't check the need to growfd after getting i, perhaps a good reason for findfreefd(). /* * this assumes that the fgrp is locked */ int findfreefd(Fgrp *f, int start) { int fd; for(fd=start; fdnfd; fd++) if(f->fd[fd] == 0) break; if(fd >= f->nfd && growfd(f, fd) < 0) return -1; return fd; } int newfd(Chan *c) { int fd; Fgrp *f; f = up->fgrp; lock(f); fd = findfreefd(f, 0); if(fd < 0){ unlock(f); return -1; } if(fd > f->maxfd) f->maxfd = fd; f->fd[fd] = c; unlock(f); return fd; } int newfd2(int fd[2], Chan *c[2]) { Fgrp *f; f = up->fgrp; lock(f); fd[0] = findfreefd(f, 0); if(fd[0] < 0){ unlock(f); return -1; } fd[1] = findfreefd(f, fd[0]+1); if(fd[1] < 0){ unlock(f); return -1; } if(fd[1] > f->maxfd) f->maxfd = fd[1]; f->fd[fd[0]] = c[0]; f->fd[fd[1]] = c[1]; unlock(f); return 0; } ... long syspipe(ulong *arg) { int fd[2]; Chan *c[2]; Dev *d; validaddr(arg[0], 2*BY2WD, 1); evenaddr(arg[0]); d = devtab[devno('|', 0)]; c[0] = namec("#|", Atodir, 0, 0); c[1] = 0; fd[0] = -1; fd[1] = -1; if(waserror()){ cclose(c[0]); if(c[1]) cclose(c[1]); nexterror(); } c[1] = cclone(c[0], 0); if(walk(&c[0], "data", 1) < 0) error(Egreg); if(walk(&c[1], "data1", 1) < 0) error(Egreg); c[0] = d->open(c[0], ORDWR); c[1] = d->open(c[1], ORDWR); if(newfd2(fd, c) < 0) error(Enofd); poperror(); ((long*)arg[0])[0] = fd[0]; ((long*)arg[0])[1] = fd[1]; return 0; } As for create(), I'ld rather change the man page. Otherwise we'ld have to reserve an fd in some way before the open but not have that fd visible to other processes until it was assinged a channel or let it be step on-able by dup2... --upas-nqanixaqvsqmaqxsvfdfhfxfvj Content-Type: message/rfc822 Content-Disposition: inline Received: from plan9.cs.bell-labs.com ([135.104.9.2]) by plan9; Sun Aug 13 22:56:43 EDT 2000 Received: from cse.psu.edu ([130.203.3.50]) by plan9; Sun Aug 13 22:56:42 EDT 2000 Received: from localhost (majordom@localhost) by cse.psu.edu (8.8.8/8.8.8) with SMTP id WAA29743; Sun, 13 Aug 2000 22:42:29 -0400 (EDT) Received: by claven.cse.psu.edu (bulk_mailer v1.5); Sun, 13 Aug 2000 22:42:17 -0400 Received: (from majordom@localhost) by cse.psu.edu (8.8.8/8.8.8) id WAA29705 for 9fans-outgoing; Sun, 13 Aug 2000 22:42:12 -0400 (EDT) Received: from math.psu.edu (leibniz.math.psu.edu [146.186.130.2]) by cse.psu.edu (8.8.8/8.8.8) with ESMTP id WAA29701 for <9fans@cse.psu.edu>; Sun, 13 Aug 2000 22:42:07 -0400 (EDT) Received: from weyl.math.psu.edu (weyl.math.psu.edu [146.186.130.226]) by math.psu.edu (8.9.3/8.9.3) with ESMTP id WAA23756 for <9fans@cse.psu.edu>; Sun, 13 Aug 2000 22:42:06 -0400 (EDT) Received: from localhost (viro@localhost) by weyl.math.psu.edu (8.9.3/8.9.3) with ESMTP id WAA10400 for <9fans@cse.psu.edu>; Sun, 13 Aug 2000 22:42:05 -0400 (EDT) X-Authentication-Warning: weyl.math.psu.edu: viro owned process doing -bs Date: Sun, 13 Aug 2000 22:42:05 -0400 (EDT) From: Alexander Viro To: 9fans@cse.psu.edu Subject: Re: [9fans] Potential race in syspipe() In-Reply-To: <200008140147.VAA29002@cse.psu.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-9fans@cse.psu.edu Reply-To: 9fans@cse.psu.edu Precedence: bulk On Sun, 13 Aug 2000 presotto@plan9.bell-labs.com wrote: > You are right, it is rookie programming, probably mine. Your statement: > > _anything_ that goes into ->fd[] must be treated as final > as soon as you release the lock on ->fgrp > > is the crux of the matter. > > Your solution is reasonable. I would probably do something a little > dirtier to avoid duplicating the functionality of newfd(). It would be > just to make the waserror() a little smarter in syspipe(): > > if(waserror()){ > if(fd[0] >= 0) > fdclose(fd[0]) > else > cclose(c[0]); > if(fd[1] >= 0) > fdclose(fd[1]); > else if(c[1] != nil) > cclose(c[1]); > nexterror(); > } > > and leave the rest alone. Umm... You are handling that race to userland - effect of the race scenario will be Process A: pipe() returns -1 Process B: dup(foo, bar) returns bar, but fd[bar]==NULL OTOH the race _was_ in the userland from the very beginning, so it may be OK. Sigh... dup2() semantics seems to be ill-suited for systems that can share descriptor table between several processes. Probably Plan 9 was the first system that had to deal with that (IIRC Linux got clone() in '94 or '95 and *BSD got rfork() pretty recently). Were there any attempts to provide something, erm, less race-prone than dup2()? BTW, looks like create(2) (and creat() in APE) have an interesting misfeature/bug/whatever: suppose that file creation succeeds but newfd() fails. Then you are getting Enofd, but file is already created (and stays alive, indeed). For APE it's a bug - directly violates POSIX. For the native code... Hell knows, but it seems to be at least worth mentioning in the manpage. I realize very well that changing that would be painful - the best strategy is probably to ask for space at the very beginning and keep the number of pending allocations (i.e. make sure that newfd() never fails). Yuck. I'll probably try to see what this approach gives on our side, but I'm not sure that it will be any better than our current code (right now we are using an equivalent of falloc(9); since we keep a bitmap of allocated descriptors we can see if dup2() tries to stick its nose into the pending slot). newfd() equivalent had been tried some time ago, but back then it was rejected due to problems with creat(2). At least you don't have to deal with bloody SCM_RIGHTS file-passing... Al --upas-nqanixaqvsqmaqxsvfdfhfxfvj--