From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <0fee984c7b0eb63ea4f0ff13c7945c2c@coraid.com>
From: erik quanstrom
Date: Mon, 18 Dec 2006 15:00:51 -0500
To: rsc@swtch.com, 9fans@cse.psu.edu
Subject: Re: [9fans] wait hang
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Cc:
Topicbox-Message-UUID: f6645840-ead1-11e9-9d60-3106f5b1d025

On Mon Dec 18 14:27:31 EST 2006, rsc@swtch.com wrote:
> I'm not really convinced this part of rc ever worked right to
> begin with, so this is no big loss.

you're right.  this feature of rc doesn't work as expected,
and your example illustrates this well.  for example, if we try
20 wait records instead of 128:

	; sleep 1000000 &
	; pids=()
	; for(i in `{seq 20}){
		echo foo >/dev/null &
		pids=($pids $apid)
	}
	; for(p in $pids){
		echo $p
		wait $p
	}
	4627
	[hang]

since 20 is much less than 128, this can't be the kernel.
i think the problem is in the discarded wait messages here:

rc/plan9.c/^Waitfor
	int
	Waitfor(int pid, int)
	{
		thread *p;
		Waitmsg *w;
		char errbuf[ERRMAX];

		while((w = wait()) != nil){
			if(w->pid==pid){
				setstatus(w->msg);
				free(w);
				return 0;
			}
			for(p = runq->ret;p;p = p->ret)
				if(p->pid==w->pid){
					p->pid=-1;
					strcpy(p->status, w->msg);
				}
			free(w);
		}

		errstr(errbuf, sizeof errbuf);
		if(strcmp(errbuf, "interrupted")==0)
			return -1;
		return 0;
	}

any message for a pid other than the one requested is freed
(unless it matches a thread on runq->ret), so a later wait for
that pid can never see it.

byron's rc solves this problem by maintaining its own waitlist
in $apids.

	; for(i in 1 2 3 4) { echo fu>/dev/null&}
	31427
	31428
	31429
	31430
	; echo $apids
	31427 31428 31429 31430
	; echo $apids
	31427 31428 31429 31430
	; wait $apids(1)
	; echo $apids
	31428 31429 31430
	; wait $apids(3)
	; echo $apids

	; wait 31428
	;

i would think the right way to go may be either to remove
"wait $pid" from rc's vocabulary or to do something like byron's rc.

- erik
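
a rough sketch of the second option, keeping wait records that
arrive for the wrong pid instead of freeing them.  this is portable
c, not plan 9 c, and it is not code from either rc: Rec, saverec,
takerec, waitfor and fakewait are made-up names, and fakewait is an
assumed stand-in for wait() that hands back exits in a fixed order.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { MSGLEN = 64 };

/* simplified stand-in for Waitmsg: a pid and an exit string */
typedef struct Rec Rec;
struct Rec {
	int pid;
	char msg[MSGLEN];
	Rec *next;
};

Rec *saved;	/* records reaped while waiting for some other pid */

/* remember a record nobody has asked for yet */
void
saverec(int pid, const char *msg)
{
	Rec *r;

	r = malloc(sizeof *r);
	if(r == NULL)
		abort();
	r->pid = pid;
	snprintf(r->msg, sizeof r->msg, "%s", msg);
	r->next = saved;
	saved = r;
}

/* unlink the saved record for pid and copy out its message; 1 if found */
int
takerec(int pid, char *msg, size_t n)
{
	Rec **l, *r;

	for(l = &saved; (r = *l) != NULL; l = &r->next)
		if(r->pid == pid){
			*l = r->next;
			snprintf(msg, n, "%s", r->msg);
			free(r);
			return 1;
		}
	return 0;
}

/*
 * hypothetical replacement for the loop in Waitfor.  next plays the
 * role of wait(): it fills in the next exited child and returns 1,
 * or returns 0 when no children are left.
 */
int
waitfor(int pid, int (*next)(int*, char*, size_t), char *msg, size_t n)
{
	int p;
	char m[MSGLEN];

	assert(msg != NULL && n > 0);
	if(takerec(pid, msg, n))	/* already reaped earlier */
		return 0;
	while(next(&p, m, sizeof m)){
		if(p == pid){
			snprintf(msg, n, "%s", m);
			return 0;
		}
		saverec(p, m);		/* keep it instead of discarding */
	}
	return -1;
}

/* demo stand-in for wait(): children "exit" in this fixed order */
struct { int pid; const char *msg; } exits[] = {
	{ 103, "" },
	{ 101, "" },
	{ 102, "error" },
};
size_t nextexit;

int
fakewait(int *pid, char *msg, size_t n)
{
	if(nextexit >= sizeof exits / sizeof exits[0])
		return 0;
	*pid = exits[nextexit].pid;
	snprintf(msg, n, "%s", exits[nextexit].msg);
	nextexit++;
	return 1;
}
```

with this, waitfor(102, fakewait, ...) reaps 103 and 101 on the way
but saves them, so a later waitfor(103, ...) returns from the saved
list instead of blocking.  a real version would also have to bound
the list and decide when stale records may be dropped.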