* [9fans] notes and traps @ 2008-08-28 13:21 Kernel Panic 2008-08-28 14:09 ` erik quanstrom 0 siblings, 1 reply; 13+ messages in thread From: Kernel Panic @ 2008-08-28 13:21 UTC (permalink / raw) To: 9fans I run into an interesting problem with linuxemu. The problem *seems* to be that traps can be enqueued in the process note array *after* some other note causing notify() to kill us. Please correct me if i miss(understand) something here. It just hit my mind after hours of late night debugging. The case: If some other process sends a note with postnote(). The kernel will enqueue the note in the target process. (The note is marked as Nuser) notestate of the target process: note[0] = {Nuser, "usernote"} nnote = 1 notified = 0 This note will be "delivered" when notify() is called. notify is called on: - after a syscall - after interrupt/trap handling (timer interrupt?) Now, in our case, the receiving process causes a trap. It enters the kernel from the vector table and /sys/src/9/pc/trap.c:trap() is called. trap() decodes the error and enqueues a note with the Ndebug flag. the target process note state now looks like this: note[0] = {Nuser, "usernote"} note[1] = {Ndebug, "sys: trap: some trap"} nnote = 2 notified = 0 trap() now calls notify() because nnote != 0. notify() delivers the "usernote" and sets notified = 1; we return to userspace with kexit() and we pop out in the note handler to process the "usernote". the note state looks now like this: note[0] = {Ndebug, "sys: trap: some trap"} nnote = 1 notified = 1 now strange things happen. if we enter the kernel for some reason (syscall, timer interrupt?) notify() will kill us: /* n is the address of the 0th entry from the note array */ if(n->flag!=NUser && (up->notified || up->notify==0)){ if(n->flag == NDebug) pprint("suicide: %s\n", n->msg); qunlock(&up->debug); pexit(n->msg, n->flag!=NDebug); } if we are lucky, we could do a noted() inside the notehandler and it will set up->notified to zero before some the timer interrupt arrives. What i need is that i get the Ndebug-note in the notehandler *first*. it doesnt make sense to me to queue traps. Something similar i found in postnote(): if(flag != NUser && (p->notify == 0 || p->notified)) p->nnote = 0; but obviously this only works if it has already notified the process. why? -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-28 13:21 [9fans] notes and traps Kernel Panic @ 2008-08-28 14:09 ` erik quanstrom 2008-08-28 15:28 ` Kernel Panic 2008-08-29 10:04 ` Kernel Panic 0 siblings, 2 replies; 13+ messages in thread From: erik quanstrom @ 2008-08-28 14:09 UTC (permalink / raw) To: 9fans it looks like you get the second trap while you are still in your notify handler, since i think this test (up->notified || up->notify==0) is for a proc in a notify handler getting a system trap (or a proc with no notify handler). it would be very interesting to know what the system trap is. it would also be interesting to know if you are seeing this randomly or if you can reliable reproduce this condition. - erik ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-28 14:09 ` erik quanstrom @ 2008-08-28 15:28 ` Kernel Panic 2008-08-29 10:04 ` Kernel Panic 1 sibling, 0 replies; 13+ messages in thread From: Kernel Panic @ 2008-08-28 15:28 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs erik quanstrom wrote: > it looks like you get the second trap while > you are still in your notify handler, since > i think this test > (up->notified || up->notify==0) > is for a proc in a notify handler getting a system > trap (or a proc with no notify handler). > right, the problem is, my note handler never sees the *real* trap. the message passed to the notehandler is the string posted from the other process. "sig" in the case of linuxemu. the ureg has the trap field set to 0xD. the pc is right on the INT 0x80 instruction., but we get the "sig", not "sys: trap: general protection ..." (Of course! because that was the context/event that brougth it into the kernel in the first place. But i think it picked the wrong note for delivery, and keeps the NDebug note pending.) If i ignore the "sig" and do a noted(NCONT); the real trap comes in. (most of the time) But if i do any syscall in the handler of "sig", we get killed (maybe caused by notify() detecting the condition: up->note[0].flag != NUser && up->notified while we process the "sig" note) > it would be very interesting to know what the system > trap is. > it would also be interesting to know if you are > seeing this randomly or if you can reliable reproduce > this condition. > I can reproduce it outside of linuxemu. I'm at work currently... I will provide a testcase. Many thanks for your response. > - erik > -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-28 14:09 ` erik quanstrom 2008-08-28 15:28 ` Kernel Panic @ 2008-08-29 10:04 ` Kernel Panic 2008-08-29 10:09 ` Kernel Panic 2008-08-29 18:26 ` erik quanstrom 1 sibling, 2 replies; 13+ messages in thread From: Kernel Panic @ 2008-08-29 10:04 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs erik quanstrom wrote: > it would also be interesting to know if you are > seeing this randomly or if you can reliable reproduce > this condition. > i can reproduce it with this: http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/ 8c test.c 8a int80.s 8l test.8 int80.8 ./8.out 8.out 12490667: suicide: sys: trap: general protection violation pc=0x00001333 the parent process loops and does "fake" syscalls while the child posts notes to it. > - erik > -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-29 10:04 ` Kernel Panic @ 2008-08-29 10:09 ` Kernel Panic 2008-08-29 18:26 ` erik quanstrom 1 sibling, 0 replies; 13+ messages in thread From: Kernel Panic @ 2008-08-29 10:09 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Kernel Panic wrote: > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/ > > 8c test.c > 8a int80.s > 8l test.8 int80.8 > ./8.out > > 8.out 12490667: suicide: sys: trap: general protection violation > pc=0x00001333 > > the parent process loops and does "fake" syscalls while > the child posts notes to it. arg... swap parent <-> child... but you'll see it in the code. -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-29 10:04 ` Kernel Panic 2008-08-29 10:09 ` Kernel Panic @ 2008-08-29 18:26 ` erik quanstrom 2008-08-30 2:18 ` cinap_lenrek 1 sibling, 1 reply; 13+ messages in thread From: erik quanstrom @ 2008-08-29 18:26 UTC (permalink / raw) To: 9fans > i can reproduce it with this: > > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/ > > 8c test.c > 8a int80.s > 8l test.8 int80.8 > ./8.out > > 8.out 12490667: suicide: sys: trap: general protection violation > pc=0x00001333 okay. it seems pretty clear from the code that you're dead meat if you receive a note while you're in the note handler. that is, up->notified = 1. it looks pretty clear that this is intentional. i don't see why one couldn't get 3-4 note before the note handler is called, however. given this, calling sleep() from the note handler is an especially bad idea. however, on a multiprocessor (or if you get scheduled by a clock tick on a up), you're still vulnerable. this is akin to hitting ^c twice quickly — and watching one's shell exit. it would be good to track down what's really going on in your vm. how many processors does plan 9 think it has? i did some looking to see if i could find any discussions on the implementation of notes and didn't find anything in my quick scan. it would be very interesting to have a little perspective from someone who was there. - erik ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-29 18:26 ` erik quanstrom @ 2008-08-30 2:18 ` cinap_lenrek 2008-08-30 3:38 ` erik quanstrom 0 siblings, 1 reply; 13+ messages in thread From: cinap_lenrek @ 2008-08-30 2:18 UTC (permalink / raw) To: 9fans [-- Attachment #1: Type: text/plain, Size: 3271 bytes --] > > i can reproduce it with this: > > > > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/ > > > > 8c test.c > > 8a int80.s > > 8l test.8 int80.8 > > ./8.out > > > > 8.out 12490667: suicide: sys: trap: general protection violation > > pc=0x00001333 > > okay. it seems pretty clear from the code that you're dead meat > if you receive a note while you're in the note handler. that is, > up->notified = 1. No! Notes are bufferd in the up->note[] array. If you are in the note handler, another process *can* send you further (NUser) notes without doing any harm. If we are in the note handler (up->notified == 1) and notify() gets hit, it will do nothing and return 0 see: /sys/src/9/pc/trap.c: notify() ... if(n->flag!=NUser && (up->notified || up->notify==0)){ if(n->flag == NDebug) pprint("suicide: %s\n", n->msg); qunlock(&up->debug); pexit(n->msg, n->flag!=NDebug); } if(up->notified){ qunlock(&up->debug); splhi(); return 0; } ... The problem is when we get a NDebug note *after* an NUser note. Then after notify() poped the first NUser note and putting the process into the user handler, the NDebug note will be the next/first (up->note[0]) and then, any (indirect) call to notify() will kill us because now it thinks while handling the last note (up->notified == 1) it caused some trap/fatal event (up->note[0].flag != NUser). but this was *not* the case here! We just traped after some other process put a note in our queue. The notify() code for detecting trap in note handler is fine i think. Whats wrong is that the trap got put after the NUser note. > it looks pretty clear that this is intentional. > i don't see why one couldn't get 3-4 note before the note handler > is called, however. > > given this, calling sleep() from the note handler is an especially > bad idea. > > however, on a multiprocessor (or if you get scheduled by a clock > tick on a up), you're still vulnerable. this is akin to hitting ^c > twice quickly — and watching one's shell exit. > > it would be good to track down what's really going on in your > vm. how many processors does plan 9 think it has? just one :-) > i did some looking to see if i could find any discussions on the > implementation of notes and didn't find anything in my quick scan. > it would be very interesting to have a little perspective from someone > who was there. I have done further experiments and changed postnote() in /sys/src/9/port/proc.c from: ... if(flag != NUser && (p->notify == 0 || p->notified)) p->nnote = 0; ... to: ... if(flag != NUser) p->nnote = 0; ... which lets the testcase run without any suicides. What it does is to ensure (in a harsh way) that not only if the destination process is currently inside the notehandler but always, the trap will end up as the first entry in the up->note array. so no matter what NUser-notes we received before. A trap caused by a note handler will still suicide the process which is correct. This is just a hack. It would be better to keep the other notes and move the tail one step down and then putting the new note on the first entry if its != NUser. What do you think? > - erik -- cinap [-- Attachment #2: Type: message/rfc822, Size: 3223 bytes --] From: erik quanstrom <quanstro@coraid.com> To: 9fans@9fans.net Subject: Re: [9fans] notes and traps Date: Fri, 29 Aug 2008 14:26:48 -0400 Message-ID: <daf3c79569266500f69d598cf6532941@coraid.com> > i can reproduce it with this: > > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/ > > 8c test.c > 8a int80.s > 8l test.8 int80.8 > ./8.out > > 8.out 12490667: suicide: sys: trap: general protection violation > pc=0x00001333 okay. it seems pretty clear from the code that you're dead meat if you receive a note while you're in the note handler. that is, up->notified = 1. it looks pretty clear that this is intentional. i don't see why one couldn't get 3-4 note before the note handler is called, however. given this, calling sleep() from the note handler is an especially bad idea. however, on a multiprocessor (or if you get scheduled by a clock tick on a up), you're still vulnerable. this is akin to hitting ^c twice quickly — and watching one's shell exit. it would be good to track down what's really going on in your vm. how many processors does plan 9 think it has? i did some looking to see if i could find any discussions on the implementation of notes and didn't find anything in my quick scan. it would be very interesting to have a little perspective from someone who was there. - erik ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 2:18 ` cinap_lenrek @ 2008-08-30 3:38 ` erik quanstrom 2008-08-30 5:48 ` cinap_lenrek 0 siblings, 1 reply; 13+ messages in thread From: erik quanstrom @ 2008-08-30 3:38 UTC (permalink / raw) To: cinap_lenrek, 9fans > No! Notes are bufferd in the up->note[] array. If you are in the note handler, > another process *can* send you further (NUser) notes without doing any harm. > > If we are in the note handler (up->notified == 1) and notify() gets hit, > it will do nothing and return 0 see: > > /sys/src/9/pc/trap.c: notify() > ... > if(n->flag!=NUser && (up->notified || up->notify==0)){ > if(n->flag == NDebug) > pprint("suicide: %s\n", n->msg); > qunlock(&up->debug); > pexit(n->msg, n->flag!=NDebug); > } > > if(up->notified){ > qunlock(&up->debug); > splhi(); > return 0; > } > ... > > The problem is when we get a NDebug note *after* an NUser note. Then i think you may be misunderstanding what i'm saying. the test for n->flag does not appear to be accidental. i am not quite sure why that test is there, but i go on the assumption that presotto knew what he was doing. if you're going to claim he was wrong, i think you'd better have good reasons. including explaining why it's okay to not terminate a process which has not handed a system note when it gets another. the example program is carefully constructed to abuse notes beyond reason. sleeping in a note handler seems like it's pushing the limits to me. having two threads sending notes to the same proc including one generating a constent stream of gpf's seems like it's a bit beyond what notes were intended for. if you remove the sleep from the note handler, all the notes are handled. i don't think by itself it's particularly compelling example. your vm example may be compelling. i don't understand it yet, though. so i can't say. i'm not arguing for broken system calls. but notes, like signals, tend to be convienent but fraught interfaces. maybe now that plan 9 has given in to multithreading, it's time to replace notes. - erik ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 3:38 ` erik quanstrom @ 2008-08-30 5:48 ` cinap_lenrek 2008-08-30 10:50 ` Steve Simon 2008-08-30 14:04 ` erik quanstrom 0 siblings, 2 replies; 13+ messages in thread From: cinap_lenrek @ 2008-08-30 5:48 UTC (permalink / raw) To: 9fans > i think you may be misunderstanding what i'm saying. > the test for n->flag does not appear to be accidental. > i am not quite sure why that test is there, but i go on > the assumption that presotto knew what he was doing. > if you're going to claim he was wrong, i think you'd better > have good reasons. I'm not claiming anyone to be wrong. I want to solve this problem. > including explaining why it's okay > to not terminate a process which has not handed a system > note when it gets another. which has not handled what? All it checks is if here is *any* note handled currently. And if the *next* note to be handled is a system note. If here are any other implications in the check i dont see here please tell. I also noted that i think this is the correct behavior in the previous mail. My patch is on the other side inpostnote(), where the note is put in the up->note[] array, not where it decides to kill the proc. > the example program is carefully constructed to abuse > notes beyond reason. Its the extract of what linuxemu needs to do. Its not just constructed to annoy anyone or prove someone wrong. > sleeping in a note handler seems > like it's pushing the limits to me. It doesnt matter if we sleep() here. You could do a write() or any other syscall instead. Or do the looping and wait for the timer interrupt. Being curious, why is sleep() in a note handler a bad idea? > having two threads > sending notes to the same proc including one generating > a constent stream of gpf's seems like it's a bit beyond what > notes were intended for. if you remove the sleep from the > note handler, all the notes are handled. Not the case for me... enabling the for loops results in the same crash here. (It takes a little bit longer than doing the sleep() thing, maybe you need to increase loop count to trigger it on a faster machine) > i don't think by itself > it's particularly compelling example. your vm example > may be compelling. i don't understand it yet, though. > so i can't say. > i'm not arguing for broken system calls. but notes, like > signals, tend to be convienent but fraught interfaces. > maybe now that plan 9 has given in to multithreading, > it's time to replace notes. Dont tell me... But i cant think of any other way to catch linux syscalls for now. > - erik -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 5:48 ` cinap_lenrek @ 2008-08-30 10:50 ` Steve Simon 2008-08-30 14:04 ` erik quanstrom 1 sibling, 0 replies; 13+ messages in thread From: Steve Simon @ 2008-08-30 10:50 UTC (permalink / raw) To: 9fans purely as an aside this detailed conversation on how notes are handled in the kernel is very interesting, I was trying to understand this myself recently and gave up; I will now try again. Thanks, -Steve ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 5:48 ` cinap_lenrek 2008-08-30 10:50 ` Steve Simon @ 2008-08-30 14:04 ` erik quanstrom 2008-08-30 17:40 ` cinap_lenrek 2008-08-30 18:09 ` cinap_lenrek 1 sibling, 2 replies; 13+ messages in thread From: erik quanstrom @ 2008-08-30 14:04 UTC (permalink / raw) To: cinap_lenrek, 9fans > I'm not claiming anyone to be wrong. I want to solve > this problem. i'm not sure there is a problem with notes. but i really can't answer that question. someone with a better understanding of what notes are supposed to be could answer this. > I also noted that i think this is the correct behavior in > the previous mail. My patch is on the other side inpostnote(), > where the note is put in the up->note[] array, not where > it decides to kill the proc. i'm not convinced that you've explained why this change is correct or why it could solve the problem. after all, the NNOTED array is only 5 entries long. what if one gets 5 user notes before the system note? do you kill the process (isn't this how it works now?), make notes unreliable or block the sender? none of these options seems like an improvement to me. > > sleeping in a note handler seems > > like it's pushing the limits to me. > > It doesnt matter if we sleep() here. You could do a > write() or any other syscall instead. Or do the looping > and wait for the timer interrupt. i know. but sleep gives other processes a large window to deliver more notes. what is the external note, by the way? a keyboard interrupt? > Being curious, why is sleep() in a note handler a bad > idea? as i see it (please chime in if you disagree), a process in a note handler is in a funny state. the kernel treats this state as funny as it keeps up->notified around to note (ha!) this fact. as a general rule, i would think that anything prolonging this funny state would tend to cause problems. > Dont tell me... But i cant think of any other way to catch > linux syscalls for now. without changing your mechanism, you could move the work out of the note handler and let the note handler just set state. for example, byron's rc ^c handler just set a global interrupted variable. there were only three or four places where this variable was checked. these locations were carefully vetted for longjmp safety. and lonjmp'ed back to the command loop (if interactive). it doesn't sound like it would work well, but it worked perfectly. about the same time byron was working on rc, i wrote my own shell. i wasn't convinced that byron's technique would work or would be more reliable than doing things from the signal handler. ... experience says that i was wrong. there's also a special dodge of a note continuation handler for ape available. this hints that the note interface might be pushed too far by ape. - erik ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 14:04 ` erik quanstrom @ 2008-08-30 17:40 ` cinap_lenrek 2008-08-30 18:09 ` cinap_lenrek 1 sibling, 0 replies; 13+ messages in thread From: cinap_lenrek @ 2008-08-30 17:40 UTC (permalink / raw) To: 9fans > > I'm not claiming anyone to be wrong. I want to solve > > this problem. > > i'm not sure there is a problem with notes. but i really can't > answer that question. someone with a better understanding > of what notes are supposed to be could answer this. > > > I also noted that i think this is the correct behavior in > > the previous mail. My patch is on the other side inpostnote(), > > where the note is put in the up->note[] array, not where > > it decides to kill the proc. > > i'm not convinced that you've explained why this change is > correct or why it could solve the problem. after all, the > NNOTED array is only 5 entries long. what if one gets 5 > user notes before the system note? assuming now that up->note[] is filled up and up->nnote = 5. the trap happens... we are in sys/src/9/pc/trap.c trpa(): we detect the fault and do a postnote(up, 1, "sys: trap: bla", NDebug) postnote() will see that p->notified is 0 becasue we are not in the note handler yet, so it will skip the: if(flag != NUser && (p->notify == 0 || p->notified)) p->nnote = 0; then it checks if here is any room to store the note, which is not the case. then we set p->notepending and exit. So now, our trap note got lost! We will *not* get killed then... and happily process the pending 5 user notes. With my suggested patch, it would have discarded all the notes and *have* the NDebug note posted. Notify will pick it up and kill us if here is no handler installed or if it happend while inside the note handler. > do you kill the process (isn't this how it works now?), no, it would lose the note with the current implementation. > make notes unreliable or block the sender? yes, trowing usernotes away when something important as an internal note happend to make sure it gets put in the note array so we have a chance to process it: if(n->flag!=NUser && (up->notified || up->notify==0)){ if(n->flag == NDebug) pprint("suicide: %s\n", n->msg); qunlock(&up->debug); pexit(n->msg, n->flag!=NDebug); } This was already in the code (see postnote()). The only thing that i claim to change is that it doesnt matter if we are notified or not when posting the note. We check for it in notify() ayway so the process *will* be terminated if it causes another trap while inside its note handler. > none of these > options seems like an improvement to me. losing the trap note and *not* handling it at all looks like a bug to me. > > > sleeping in a note handler seems > > > like it's pushing the limits to me. > > > > It doesnt matter if we sleep() here. You could do a > > write() or any other syscall instead. Or do the looping > > and wait for the timer interrupt. > > i know. but sleep gives other processes a large window > to deliver more notes. what is the external note, by the > way? a keyboard interrupt? External notes are the ones posted to /proc/pid/note[pg]. Like if you hit [DEL], rio writes "interrupt" the the programs note file running in the window. > > Being curious, why is sleep() in a note handler a bad > > idea? > > as i see it (please chime in if you disagree), a process in > a note handler is in a funny state. the kernel treats this > state as funny as it keeps up->notified around > to note (ha!) this fact. As i see it, up->notified just means we have put the process into its note handler, and any further notes that are in the up->note[] array (that are NUser) get not handled until the user process calls noted(). Thats exactly how it is documented in notify(2): Until the program calls noted, any further externally-generated notes (e.g., hangup or alarm) *[1] will be held off, and any further notes generated by erroneous behavior by the program (such as divide by zero) *[2] will kill the program. [1] that are, NUser notes [2] NDebug for traps > as a general rule, i would think that anything prolonging > this funny state would tend to cause problems. > > Dont tell me... But i cant think of any other way to catch > > linux syscalls for now. > > without changing your mechanism, you could move the > work out of the note handler and let the note handler > just set state. Not in this case. I need both notes to interrupt the linux machine code (which i cant change) and to catch the syscalls. With the current handling, the only place where it would be save to send (external) notes to a process from another is where i'm not in linux user code (which can trigger a trap when its doing a syscall). I dont generate traps in the linuxemu code. (Which would kill me as documneted in the manpage) > for example, byron's rc ^c handler just set a global > interrupted variable. there were only three or four > places where this variable was checked. these locations > were carefully vetted for longjmp safety. and lonjmp'ed > back to the command loop (if interactive). it doesn't > sound like it would work well, but it worked perfectly. The problem with notes and signals are that it can interrupt your code in places where your data structures will be in an inconsistent state so you would not be able to do anything usefull here. I agree that channels are the better and more deterministic way to pass another process some information or notification, because the "interruption" happens on well defined places only. But here are places where you need notes. For example if you have to interrupt blocking systemcalls like in ioproc(), or in my case, to get me out of the linux code to check if here are linux signals pending and catching systemcalls. > about the same time byron was working on rc, i wrote > my own shell. i wasn't convinced that byron's technique > would work or would be more reliable than doing things > from the signal handler. ... experience says that i was > wrong. > > there's also a special dodge of a note continuation handler > for ape available. this hints that the note interface might > be pushed too far by ape. Linuxemu doesnt use/need that special NSAVE NRSTR stuff. Really, if here are linux signals all i do is put the information to continue after the signal on the stack and transfer execution back to linux code in the linux signal handler. > - erik -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [9fans] notes and traps 2008-08-30 14:04 ` erik quanstrom 2008-08-30 17:40 ` cinap_lenrek @ 2008-08-30 18:09 ` cinap_lenrek 1 sibling, 0 replies; 13+ messages in thread From: cinap_lenrek @ 2008-08-30 18:09 UTC (permalink / raw) To: 9fans > i'm not convinced that you've explained why this change is > correct or why it could solve the problem. after all, the > NNOTED array is only 5 entries long. what if one gets 5 > user notes before the system note? > do you kill the process (isn't this how it works now?), > make notes unreliable or block the sender? none of these > options seems like an improvement to me. Hm, i think here would be a better way to handle it without losing user notes while make sure internal system notes get posted (and handled) in any case. What if we change postnote to: int postnote(Proc *p, int dolock, char *n, int flag) { int x, s, ret; Rendez *r; Proc *d, **l; if(dolock) qlock(&p->debug); if(flag != NUser){ x = 0; if(p->nnote < NNOTE){ if(p->nnote) memmove(&p->note[0], &p->note[1], p->nnote * sizeof(p->note[0])); p->nnote++; } } else { x = -1; if(p->nnote+1 < NNOTE) x = p->nnote++; } ret = 0; if(x >= 0) { strcpy(p->note[x].msg, n); p->note[x].flag = flag; ret = 1; } p->notepending = 1; if(dolock) qunlock(&p->debug); ... In that case, if a external note is posted, we make sure here is always room for one further internal note in the array. (see the p->nnote+1 < NNOTE) On arrival of a internal note, we move the rest down the array and put our note in the first entry. So the following conditions hold true: - internal notes get *always* posted (so the process gets terminated if its not handled or while in the note handler) - we do not drop external notes on internal note arrival and the sender can detect if posting the note failed. - internal notes get not queued after external notes so they will be handled by the next call to notify() and not tick notify() to think that while processing a previous user note, that this caused an internal note and kill the process before it has a chance to see the that note. Anything wrong here? -- cinap ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-08-30 18:09 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-08-28 13:21 [9fans] notes and traps Kernel Panic 2008-08-28 14:09 ` erik quanstrom 2008-08-28 15:28 ` Kernel Panic 2008-08-29 10:04 ` Kernel Panic 2008-08-29 10:09 ` Kernel Panic 2008-08-29 18:26 ` erik quanstrom 2008-08-30 2:18 ` cinap_lenrek 2008-08-30 3:38 ` erik quanstrom 2008-08-30 5:48 ` cinap_lenrek 2008-08-30 10:50 ` Steve Simon 2008-08-30 14:04 ` erik quanstrom 2008-08-30 17:40 ` cinap_lenrek 2008-08-30 18:09 ` cinap_lenrek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).