From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sun, 13 Aug 2000 22:42:05 -0400
From: Alexander Viro <viro@math.psu.edu>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] Potential race in syspipe()
In-Reply-To: <200008140147.VAA29002@cse.psu.edu>
Message-ID: <Pine.GSO.4.10.10008132154210.194-100000@weyl.math.psu.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Topicbox-Message-UUID: fad5ebea-eac8-11e9-9e20-41e7f4b1d025



On Sun, 13 Aug 2000 presotto@plan9.bell-labs.com wrote:

> You are right, it is rookie programming, probably mine.  Your statement:
> 
> 	_anything_ that goes into ->fd[] must be treated as final
> 	as soon as you release the lock on ->fgrp
> 
> is the crux of the matter.
> 
> Your solution is reasonable.  I would probably do something a little
> dirtier to avoid duplicating the functionality of newfd().  It would be
> just to make the waserror() a little smarter in syspipe():
> 
> 	if(waserror()){
> 		if(fd[0] >= 0)
> 			fdclose(fd[0])
> 		else
> 			cclose(c[0]);
> 		if(fd[1] >= 0)
> 			fdclose(fd[1]);
> 		else if(c[1] != nil)
> 			cclose(c[1]);
> 		nexterror();
> 	}
> 
> and leave the rest alone.

Umm... You are handling that race to userland - effect of the race
scenario will be
Process A: pipe() returns -1
Process B: dup(foo, bar) returns bar, but fd[bar]==NULL

OTOH the race _was_ in the userland from the very beginning, so it may be
OK. Sigh... dup2() semantics seems to be ill-suited for systems that can
share descriptor table between several processes. Probably Plan 9 was the
first system that had to deal with that (IIRC Linux got clone() in '94 or
'95 and *BSD got rfork() pretty recently). Were there any attempts to
provide something, erm, less race-prone than dup2()?

	BTW, looks like create(2) (and creat() in APE) have an interesting
misfeature/bug/whatever: suppose that file creation succeeds but newfd()
fails. Then you are getting Enofd, but file is already created (and stays
alive, indeed). For APE it's a bug - directly violates POSIX. For the
native code... Hell knows, but it seems to be at least worth mentioning in
the manpage.

	I realize very well that changing that would be painful - the best
strategy is probably to ask for space at the very beginning and keep the
number of pending allocations (i.e. make sure that newfd() never fails).
Yuck. I'll probably try to see what this approach gives on our side, but
I'm not sure that it will be any better than our current code (right now
we are using an equivalent of falloc(9); since we keep a bitmap of
allocated descriptors we can see if dup2() tries to stick its nose into
the pending slot). newfd() equivalent had been tried some time ago, but
back then it was rejected due to problems with creat(2).
	At least you don't have to deal with bloody SCM_RIGHTS
file-passing...

								Al