From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 13137 invoked from network); 14 Apr 2000 17:06:25 -0000
Received: from sunsite.auc.dk (130.225.51.30)
  by ns1.primenet.com.au with SMTP; 14 Apr 2000 17:06:25 -0000
Received: (qmail 8789 invoked by alias); 14 Apr 2000 17:06:17 -0000
Mailing-List: contact zsh-workers-help@sunsite.auc.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 10766
Received: (qmail 8779 invoked from network); 14 Apr 2000 17:06:16 -0000
Date: Fri, 14 Apr 2000 18:05:49 +0100
From: Peter Stephenson
Subject: FIFOs
To: zsh-workers@sunsite.auc.dk (Zsh hackers list)
Message-id: <0FT0004DUNHOXH@la-la.cambridgesiliconradio.com>
Content-transfer-encoding: 7BIT

Bart wrote:
> For guaranteed correct operation, we should remove the PATH_DEV_FD code
> from getproc() in exec.c, or (perhaps better) change it to be used only
> if mkfifo() is absent or fails.

This is easy (apart from the `or fails', which I haven't attempted to
implement --- unless that simply means `or the configure test for it
fails').

There are two issues here, however (without a patch, you'll have to
#undef PATH_DEV_FD in exec.c to see them). The first isn't too bad.

% echo <(echo foo)

Here the parent shell can, with the wind in the right direction, get back
and delete the file named by the <(...) before the child has had a chance
to open it (let alone call the code to fill it). There's no easy way to
synch this, since you end up with deadlock --- the child can't open the
fifo until there's a process reading it.

This has happened to me a few times. It looks pretty unlikely if you
stare at the code --- the open is only a few instructions later in the
child while the host is doing all the normal command processing first ---
but if you think about the scheduling of forked-off child processes on
heavily loaded machines (in this case SMP) maybe it's not so surprising.
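To illustrate the open semantics behind that deadlock: under POSIX, a
blocking open of a FIFO for writing waits until some process has it open
for reading, whereas the same open with O_NONBLOCK fails at once with
ENXIO. The sketch below uses that to probe for a reader. It is not taken
from exec.c, and fifo_has_reader() is a hypothetical name chosen for the
example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>   /* mkfifo(), for callers that create the FIFO */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of zsh.
 * Returns 1 if some process has the FIFO open for reading,
 *         0 if nobody does (open failed with ENXIO),
 *        -1 on any other error (e.g. ENOENT once the FIFO is deleted).
 * Side effect: when a reader exists, the write end is opened and closed
 * again, so a reader with no other writers will then see end-of-file.
 */
static int fifo_has_reader(const char *path)
{
    assert(path != NULL);

    /* O_NONBLOCK makes a write-only open fail instead of waiting. */
    int fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == ENXIO ? 0 : -1;
}
```

A caller would create the FIFO with mkfifo(path, 0600) and then call
fifo_has_reader(path). Because the probe itself opens the write end, it
only shows the semantics and is not a way to synchronise the parent and
child.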
One good reason not to worry about this is that if the process actually
opens the fifo, that's guaranteed not to happen, i.e.

% cat <(echo foo)

always works.

The second thing is a killer, at least without a rethink. In the case
first shown, where the fifo is never opened, but this time does still
exist, the zsh just hangs on for ever waiting for it and sits around
uselessly in the process table. The second remark above still applies,
but this time the failure is less benign.

Maybe somebody understands this better. Anyway, I haven't sent a patch
because of that. I suppose this is a system issue, since it's not obvious
to me why the read doesn't just fail when the fifo is deleted, at which
point there's no chance of anyone ever reading from it (this is Solaris
2.6). It would be reasonably safe to arrange for a timeout, but it would
have to be set up specially since poll() and select() won't work if we
haven't yet got an fd.

-- 
Peter Stephenson
Cambridge Silicon Radio, Unit 300, Science Park, Milton Road,
Cambridge, CB4 0XL, UK                     Tel: +44 (0)1223 392070
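One possible shape for a timeout that needs no fd for poll() or select()
is to retry a non-blocking open until a deadline passes. This is a
minimal sketch under POSIX rules only: it has not been tried against
Solaris 2.6 or the zsh source, and open_fifo_write_timeout() and its
10 ms retry step are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>   /* mkfifo(), for callers that create the FIFO */
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of zsh.
 * Try to open a FIFO for writing, giving up after roughly timeout_ms.
 * Returns a blocking-mode fd on success, or -1 with errno set:
 *   ETIMEDOUT  no reader appeared in time,
 *   ENOENT     the FIFO was deleted in the meantime (the path is looked
 *              up again on every attempt), or another open() error.
 */
static int open_fifo_write_timeout(const char *path, int timeout_ms)
{
    const int step_ms = 10;          /* assumed retry interval */
    int waited = 0;

    assert(path != NULL);
    for (;;) {
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        if (fd >= 0) {
            /* A reader exists: switch back to ordinary blocking writes. */
            int flags = fcntl(fd, F_GETFL);
            if (flags != -1)
                fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
            return fd;
        }
        if (errno != ENXIO)          /* ENXIO just means "no reader yet" */
            return -1;
        if (waited >= timeout_ms) {
            errno = ETIMEDOUT;
            return -1;
        }
        struct timespec ts = { 0, step_ms * 1000000L };
        nanosleep(&ts, NULL);
        waited += step_ms;
    }
}
```

With something like this the writing process would exit on its own, either
when the deadline passes or as soon as the fifo has been removed, rather
than sitting in the process table. The cost is a short polling loop where
a single blocking open() used to be.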