> > i can reproduce it with this:
> >
> > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/
> >
> > 8c test.c
> > 8a int80.s
> > 8l test.8 int80.8
> > ./8.out
> >
> > 8.out 12490667: suicide: sys: trap: general protection violation
> > pc=0x00001333
>
> okay. it seems pretty clear from the code that you're dead meat
> if you receive a note while you're in the note handler. that is,
> up->notified = 1.

No! Notes are buffered in the up->note[] array. If you are in the
note handler, another process *can* send you further (NUser) notes
without doing any harm. If we are in the note handler
(up->notified == 1) and notify() gets hit, it will do nothing and
return 0; see:

/sys/src/9/pc/trap.c: notify()
...
	if(n->flag!=NUser && (up->notified || up->notify==0)){
		if(n->flag == NDebug)
			pprint("suicide: %s\n", n->msg);
		qunlock(&up->debug);
		pexit(n->msg, n->flag!=NDebug);
	}

	if(up->notified){
		qunlock(&up->debug);
		splhi();
		return 0;
	}
...

The problem is when we get an NDebug note *after* an NUser note.
After notify() has popped the first NUser note and put the process
into the user handler, the NDebug note becomes the next/first entry
(up->note[0]). Then any (indirect) call to notify() will kill us,
because it now thinks that the process caused some trap/fatal event
(up->note[0].flag != NUser) while handling the last note
(up->notified == 1). But this was *not* the case here! We just
trapped after some other process put a note in our queue.

The notify() code for detecting a trap in the note handler is fine,
I think. What's wrong is that the trap got queued after the NUser
note.

> it looks pretty clear that this is intentional.
> i don't see why one couldn't get 3-4 note before the note handler
> is called, however.
>
> given this, calling sleep() from the note handler is an especially
> bad idea.
>
> however, on a multiprocessor (or if you get scheduled by a clock
> tick on a up), you're still vulnerable. this is akin to hitting ^c
> twice quickly — and watching one's shell exit.
>
> it would be good to track down what's really going on in your
> vm. how many processors does plan 9 think it has?

just one :-)

> i did some looking to see if i could find any discussions on the
> implementation of notes and didn't find anything in my quick scan.
> it would be very interesting to have a little perspective from someone
> who was there.

I have done further experiments and changed postnote() in
/sys/src/9/port/proc.c from:

...
	if(flag != NUser && (p->notify == 0 || p->notified))
		p->nnote = 0;
...

to:

...
	if(flag != NUser)
		p->nnote = 0;
...

which lets the test case run without any suicides.

What it does is ensure (in a harsh way) that the trap always ends up
as the first entry in the up->note array, not only when the
destination process is currently inside the note handler, and no
matter what NUser notes we received before. A trap caused by a note
handler will still suicide the process, which is correct.

This is just a hack. It would be better to keep the other notes,
move the tail one step down, and then put the new note in the first
entry if its flag is != NUser.

What do you think?

> - erik

--
cinap
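
A minimal sketch of the alternative suggested above (keep the queued
notes, move them one step down, and put a non-NUser note in the first
entry). It is a standalone user-space model in portable C, not kernel
code. The names Notequeue and postnote_sketch are hypothetical; the
values of NNOTE and ERRMAX and the order of the flag enum are
assumptions; locking, notepending and the rest of postnote() are left
out.

```c
#include <assert.h>
#include <string.h>

enum { NNOTE = 5, ERRMAX = 128 };	/* assumed values, for illustration only */
enum { NUser, NExit, NDebug };		/* note flags; the order is an assumption */

typedef struct Note Note;
struct Note {
	char	msg[ERRMAX];
	int	flag;
};

/* hypothetical stand-in for the note fields of Proc */
typedef struct Notequeue Notequeue;
struct Notequeue {
	Note	note[NNOTE];
	int	nnote;
};

/*
 * Queue a note.  NUser notes are appended if there is room.
 * Any other note is put in note[0]; the notes already queued
 * move one step down, and the last one is dropped if the
 * queue was full.
 * Returns 1 if the new note was queued, 0 if it was dropped.
 */
int
postnote_sketch(Notequeue *q, const char *msg, int flag)
{
	Note *n;

	assert(q->nnote >= 0 && q->nnote <= NNOTE);
	if(flag == NUser){
		if(q->nnote >= NNOTE)
			return 0;
		n = &q->note[q->nnote++];
	}else{
		if(q->nnote == NNOTE)
			q->nnote--;	/* full: the last queued note is dropped */
		memmove(&q->note[1], &q->note[0], q->nnote*sizeof(Note));
		q->nnote++;
		n = &q->note[0];
	}
	strncpy(n->msg, msg, ERRMAX-1);
	n->msg[ERRMAX-1] = '\0';
	n->flag = flag;
	return 1;
}
```

With two NUser notes queued, posting an NDebug note through this
sketch leaves the NDebug note in note[0] and the two earlier notes
behind it in their original order, instead of discarding them.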