From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Sun, 4 Jan 2009 22:20:55 -0800
From: "Russ Cox"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
Subject: Re: [9fans] Why do we need syspipe() ?
In-Reply-To: <1231130372.11463.433.camel@goose.sun.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <1231045486.11463.245.camel@goose.sun.com> <1231130372.11463.433.camel@goose.sun.com>
Topicbox-Message-UUID: 799aaeba-ead4-11e9-9d60-3106f5b1d025

>> I don't believe you can write a race-free implementation of
>> the pipe system call using #|.
>
> Could you, please, elaborate on what particular race do you have
> in mind? Indeed, I ran into a problem with devpipe implementation,
> but it isn't a race, its a dreaded implicit ->attach that namec()
> does when it evaluates names with the first character being #.

The closest you can come in user space to implementing pipe is:

	int
	pipe(int *fd)
	{
		bind("#|", "/mnt", MREPL);
		fd[0] = open("/mnt/data", ORDWR);
		fd[1] = open("/mnt/data1", ORDWR);
		unmount("/mnt");
		return 0;
	}

but if there are multiple processes running pipe() in the same
name space, the binds will step on each other and the pipes
might get crossed. Even if not, maybe something else was
already mounted on /mnt (or whatever mount point you choose),
and now there's nothing there.

>> I also don't believe you can implement the dup system call
>> (remember, it has two arguments) using #d.
>
> Agreed. That's why I mentioned that a more feature-rich devdup
> is needed. Of course, now I've also discovered that the current
> implementation of devpipe is also not sufficient enough for me
> to be able to produce a 100% user-space version of pipe(2).

Sorry, I thought you were saying that #d was already more
feature rich than dup (it is, in a way, since it has the ctl files now).
I was trying to say that although that is true, it doesn't have
the dup features.

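To make the two-argument point concrete, here is a minimal sketch.
It is written in POSIX C rather than Plan 9 C (an assumption, made
only so that it compiles on a typical Unix system), with POSIX
dup2(oldfd, newfd) standing in for Plan 9's dup(oldfd, newfd).
The helper names dup_onto and same_open_file are hypothetical.

```c
/* POSIX analogue, not Plan 9 code: dup2() stands in for Plan 9's
 * two-argument dup(oldfd, newfd). Helper names are hypothetical. */
#include <fcntl.h>
#include <unistd.h>

/* Duplicate oldfd onto the exact slot newfd in the fd table.
 * Returns newfd on success, -1 on error. The slot number belongs
 * to the process's fd table, not to any file. */
int
dup_onto(int oldfd, int newfd)
{
	return dup2(oldfd, newfd);
}

/* Returns 1 if a and b share one open file (and so one offset),
 * 0 if they do not, -1 on error. Intended for seekable files. */
int
same_open_file(int a, int b)
{
	off_t before, after;

	before = lseek(b, 0, SEEK_CUR);
	if(before < 0)
		return -1;
	if(lseek(a, 1, SEEK_CUR) < 0)	/* advance a's offset by one */
		return -1;
	after = lseek(b, 0, SEEK_CUR);	/* did b's offset move too? */
	lseek(a, -1, SEEK_CUR);		/* restore a's offset */
	return after == before + 1;
}
```

With a seekable file open on fd, dup_onto(fd, 23) should return 23,
and same_open_file(fd, 23) should then report 1. The caller picks
the slot; open returns a descriptor number that the caller does not
choose.
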
There are some devices in Plan 9 that simply don't "virtualize",
because at a deep level they are tied to process state that
doesn't go through the file system. Dup manipulates the
file descriptor table, not files themselves. Pipe accesses files
that have no name in the file system.

The pid returned by getpid needs to match the pid returned
by the parent's fork; it really needs to be the process's
actual pid. For example, suppose a process wants to know its
own pid. If getpid read from /dev/pid instead of #c/pid,
then running "iostats rc -c 'echo $pid'" would show iostats's
pid, not rc's. What then if rc wants to send itself (or, more
likely, its note group) a note, or fiddle with one of its /proc
files? It would be manipulating iostats, not itself.

A write to devsrv is even more magical: when you write "23"
to #s/newfile, your process's fd 23 gets taken over by the
kernel. For this reason you can't use iostats on any program
that writes to /srv/newfile instead of #s/newfile--when the
program writes "23", the kernel sees the request come from
iostats instead of the original program, and it takes over the
wrong fd. (Most of those programs are 9P servers that fork
into the background, and iostats isn't too useful on those
anyway, so no one has bothered to address this.)

The tls and ssl devices, #a and #D, use the same trick,
so you can't interpose on traffic destined to them. Happily,
the libraries use #a and #D directly, so using iostats on
them simply misses that i/o rather than causing the program
to execute incorrectly.

You could add a special message to #d to make the dup
system call unnecessary, but it wouldn't be any cleaner
than having the system call, since you'd have to hard code
#d instead of using /fd, or else you'd have the same
problems as #s, #a, and #D already do.

The # device syntax is very useful to mean the kernel device
and none other in these situations. There's definitely
something unsatisfactory about it, but it works.

Russ