* Re: [9fans] thread confusion
From: Fco. J. Ballesteros @ 2005-09-21 14:44 UTC
To: 9fans

AFAIK, you must call threadnotify() to install a handler for your note.
If you don't do that, your process is killed (which is what you are
seeing right now). You should probably install a handler that says
'it's ok, got the note'. Use threadnotify to do this.

I understand that you are interested in the "side effect" of
interrupting the I/O call. It's funny, anyway, because I had the same
problem a few days ago; I had to abort a connection to a `Broken-maybe'
file server. I tried not to use interrupts, but I had nevertheless
decided to wrap the call as

	alarm(x) read() alarm(0)

After letting Russ know, he (once more) suggested that I not use
interrupts and that I read the Alef paper (which I had read before,
btw). However, after thinking it over twice, I was able to avoid the
interrupts. (The process is kept there; it will sooner or later abort
due to a broken connection.)

Thus, excuse me for suggesting this again ;-), have you tried not to use
interrupts? In your case, if "the other end" decides to give up, can't it
let you know so you could shutdown and restart in a clean way?

hth

* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 15:05 UTC
To: Fans of the OS Plan 9 from Bell Labs

> Thus, excuse me for suggesting this again ;-), have you tried not to use
> interrupts? In your case, if "the other end" decides to give up, can't it
> let you know so you could shutdown and restart in a clean way?

I have tried not to use interrupts.
I know when the other end wants to restart.
My essential problem seems to be to 'shutdown' this pending library call -
it will not time out by itself -- if indeed it does not react
to me closing the pipe, it will stay there forever.
But maybe I did something wrong there.

I'll think things through again
(as I wrote: back to the drawing board).

Thanks for your reaction.
Axel.

* Re: [9fans] thread confusion
From: Russ Cox @ 2005-09-21 15:42 UTC
To: Fans of the OS Plan 9 from Bell Labs

> My essential problem seems to be to 'shutdown' this pending library call -
> it will not time out by itself -- if indeed it does not react
> to me closing the pipe, it will stay there forever.
> But maybe I did something wrong there.

You have to install a note handler, as Nemo said.

But leave the proc alone and figure out the pipe close
bug instead. The most likely problem is that you haven't
actually closed the other side of the pipe completely.
For example, maybe you have forked a child who inherited
a copy of that fd, and that child is holding the pipe up.

Programming with notes or signals is asking for trouble.
Always. I'm sorry that threadint/threadkill are in the
library.

Russ

* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 20:25 UTC
To: Fans of the OS Plan 9 from Bell Labs

> But leave the proc alone and figure out the pipe close
> bug instead. The most likely problem is that you haven't
> actually closed the other side of the pipe completely.
> For example, maybe you have forked a child who inherited
> a copy of that fd, and that child is holding the pipe up.

the problem seems to be that at both ends of the pipe
a read is 'hanging'.
(one from tlsClient, and one from my own pipe reader proc)
If I in this situation close (both ends of) the pipe,
it doesn't work.

I have tried to mimic the situation in a small test program,
and there I found:
if I first close one randomly chosen end of the pipe,
and then do a zero-length write at the other end and
then close that end, it works.
It also works if I first close one end of the pipe,
and then do the zero-length write and close at the
other end.

is this a correct procedure, or would another be preferred?

> Programming with notes or signals is asking for trouble.

I do use a timer, by having a proc that repeatedly sleeps
and decrements a counter, and when the counter reaches
zero it sends a (nil) timeout message on a channel.
in alts I not only wait for the io channels but
also for the timer timeout channel.

the question is how to start and reset the timer.
I have been thinking about using channels for that
too, but that seems deadlock prone: how to avoid
the case where I want to send a reset message to
the timer when the timer wants to send an expiration
message to me?

could I work around that by having a timer thread
in the same process with the main thread(s) that use it,
and a separate clock proc (process) that does
regular sleeps to generate regular tick messages?

the timer thread forever does an alt to
 - receive timer start message (contains timer and timeout value)
 - receive timer reset message (contains timer)
 - receive tick message, triggers decrement of counters
   and sending of timeout message if counter reaches zero

the clock proc forever sleeps and sends tick messages
to the timer thread.

Axel.

* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 20:32 UTC
To: Fans of the OS Plan 9 from Bell Labs

To avoid repeating myself too much:

if I first close one randomly chosen end of the pipe,
and then do a zero-length write at the other end and
then close that end, it works.
It also works if I first do the zero-length write
and close at one end and then do the close at the
other end

is this a correct procedure, or would another be preferred?

Axel.

* Re: [9fans] thread confusion
From: Russ Cox @ 2005-09-21 20:37 UTC
To: Fans of the OS Plan 9 from Bell Labs

> if I first close one randomly chosen end of the pipe,
> and then do a zero-length write at the other end and
> then close that end, it works.

By it works I assume you mean you get a zero-length
read out the other end. But that's because you did a
zero-length write, not because the pipe is signaling EOF.

> It also works if I first close one end of the pipe,
> and then do the zero-length write and close at the
> other end.

Pipes are symmetric so this is good.

> is this a correct procedure, or would another be preferred?

What doesn't work? Can you post a small test program
that doesn't make a blocked read fail when the other
end of the pipe is closed? Again, it sounds like you're
not closing all the references to one end of the pipe.
If multiple programs have references to a pipe end,
they *all* need to close them. Make sure that the
proc running tlsClient doesn't have a reference too.

> the question is how to start and reset the timer.

It depends how granular this timer is. If we're talking
about something large like seconds, then it is reasonable
to have the timer proc just poll the channel with nbrecvp
for new work or cancellations after it ticks off each second.

Your alternate approach, with a tick stream, is also reasonable.

No matter which you use, the return channels that the
timer proc writes to should be buffered so that the timer
proc never blocks writing to them.

Russ

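The point that *all* references to a pipe end must be closed before a
reader sees EOF can be tried out with a small experiment. The sketch
below is a POSIX analogue, not Plan 9 code, and the names
pipe_eof_demo and struct eofdemo are invented for it: a forked child
inherits a copy of the write end, the parent closes its own copy, and
the parent then probes the read end before and after the child lets go.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What read() on the read end reported at each stage. */
struct eofdemo {
	ssize_t before;		/* while the child still holds the write end */
	int before_errno;	/* errno that went with `before` */
	ssize_t after;		/* once every copy of the write end is closed */
};

/* Hypothetical experiment; returns 0 on success, -1 on setup failure. */
int
pipe_eof_demo(struct eofdemo *r)
{
	int p[2], go[2];
	char c;
	pid_t pid;

	assert(r != NULL);
	if(pipe(p) < 0 || pipe(go) < 0)
		return -1;
	pid = fork();
	if(pid < 0)
		return -1;
	if(pid == 0){
		/* child: keeps only its inherited copy of the write end p[1] */
		close(p[0]);
		close(go[1]);
		if(read(go[0], &c, 1) < 0)	/* wait for the parent's go-ahead */
			_exit(1);
		_exit(0);			/* exiting drops the copy of p[1] */
	}

	close(go[0]);
	close(p[1]);	/* the parent's write end is closed now... */
	fcntl(p[0], F_SETFL, O_NONBLOCK);
	errno = 0;
	r->before = read(p[0], &c, 1);	/* ...but the child's copy holds the pipe up */
	r->before_errno = errno;

	if(write(go[1], "x", 1) != 1)	/* let the child exit */
		return -1;
	close(go[1]);
	waitpid(pid, NULL, 0);

	r->after = read(p[0], &c, 1);	/* no writer left: end of file */
	close(p[0]);
	return 0;
}
```

On a POSIX system the first probe is expected to fail with EAGAIN
(the pipe is still open for writing somewhere) and the second to
return 0. The non-blocking read is only there so the probe cannot
hang; a blocking reader would simply sit in read() until the child
went away.
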
* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 22:34 UTC
To: Russ Cox, Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]

> Again, it sounds like you're
> not closing all the references to one end of the pipe.
> If multiple programs have references to a pipe end,
> they *all* need to close them. Make sure that the
> proc running tlsClient doesn't have a reference too.

Just checking: I assume by 'references' you mean
file descriptors in the process' fd table?

ehm... I just realized: I create the procs with proccreate,
so all procs share the same references, and when I close them
in the main proc, they go away in the other ones as well,
or so it seems, at least according to cat /proc/*/fd.
nevertheless, without the zero-write before one of the closes
they just keep hanging in the read, even when their fd table
no longer shows the pipe file descriptors in any of the fd tables.

If I don't share the fd table, and just give each of the
sub processes their own end of the pipe, I'll never be
able to close those from within those processes, because
they are both locked up in a read (essentially waiting
for each other).

I have attached a silly program.

Axel.

[-- Attachment #2: Type: text/plain, Size: 3820 bytes --]

#include <u.h>
#include <libc.h>
#include <thread.h>

enum {
	STACK = 16*2048,
};

typedef struct State {
	int id;
	int fd, tobeclosed;
	Channel *c;
} State;

static void
subproc(void *arg)
{
	State *m;
	char buf[1024];
	int n;

	m = arg;
	print("subproc tid=%d tobeclosed=%d\n", threadid(), m->tobeclosed);
	sleep(15000);
//	print(" subproc tid=%d pid=%d\n", m->id, threadpid(m->id));
//	close(m->tobeclosed);
//	sleep(15000);
	n = 5;	/* number of bytes to write */
	print("subproc writing fd=%d n=%d\n", m->fd, n);
	n = write(m->fd, buf, n);
	print("subproc written fd=%d n=%d\n", m->fd, n);
	while((n = read(m->fd, buf, sizeof(buf))) > 0)
		print("subproc fd=%d read n=%d\n", m->fd, n);
	print("subproc eof fd=%d n=%d\n", m->fd, n);
	sleep(15000);
	sendul(m->c, 0);
	print("exiting subproc tid=%d pid=%d\n", m->id, threadpid(m->id));
	threadexits(nil);
}

static void
mainproc(void *arg)
{
	State *m;
	char buf[1024];
	int n;

	m = arg;
	print("mainproc tid=%d tobeclosed=%d\n", threadid(), m->tobeclosed);
	sleep(15000);
//	print(" mainproc tid=%d pid=%d\n", m->id, threadpid(m->id));
//	close(m->tobeclosed);
//	sleep(15000);
	n = 7;	/* number of bytes to write */
	print("mainproc writing fd=%d n=%d\n", m->fd, n);
	n = write(m->fd, buf, n);
	print("mainproc written fd=%d n=%d\n", m->fd, n);
	while((n = read(m->fd, buf, sizeof(buf))) > 0)
		print("mainproc fd=%d n=%d\n", m->fd, n);
	print("mainproc fd=%d eof n=%d\n", m->fd, n);
	sleep(15000);
	sendul(m->c, 0);
	print("exiting mainproc tid=%d pid=%d\n", m->id, threadpid(m->id));
	threadexits(nil);
}

void
threadmain(int argc, char *argv[])
{
	State mainState, *m;
	State subState, *s;
	char buf[256];
	int i, j, l, n, N, maineof, subeof, ret, hang;
	int p[2];
	Alt a[] = {
	/*	c	v	op */
		{nil,	nil,	CHANRCV},
		{nil,	nil,	CHANRCV},
		{nil,	nil,	CHANEND},
	};

	hang = 0;	// change to 1 to hang
	N = 1;

	m = &mainState;
	s = &subState;

	print("threadmain tid=%d pid=%d\n", threadid(), threadpid(threadid()));

	memset(m, 0, sizeof(State));
	memset(s, 0, sizeof(State));

	/* sendul sends a ulong, so size the channel elements to match */
	m->c = chancreate(sizeof(ulong), 0);
	a[0].c = m->c;
	s->c = chancreate(sizeof(ulong), 0);
	a[1].c = s->c;

	for (l=0; l < N; l++) {
		print("for %d\n", l);
		if (pipe(p) < 0) {
			fprint(2, "pipe failed: %r\n");
			threadexitsall("pipe failed");
		}

		m->fd = p[0];
		m->tobeclosed = p[1];
		s->fd = p[1];
		s->tobeclosed = p[0];

//		m->id = procrfork(mainproc, m, STACK, RFFDG);
		m->id = proccreate(mainproc, m, STACK);
		print("started mainproc tid=%d pid=%d\n", m->id, threadpid(m->id));
//		s->id = procrfork(subproc, s, STACK, RFFDG);
		s->id = proccreate(subproc, s, STACK);
		print("started subproc tid=%d pid=%d\n", s->id, threadpid(s->id));

		sleep(60000);
		print("threadmain after sleep\n");

		i = 0;
		j = 1;

		print("threadmain closing %d\n", p[j]);
		close(p[j]);
		print("threadmain closed %d\n", p[j]);

		if (!hang) {
			print("threadmain writing zero to %d\n", p[i]);
			n = write(p[i], buf, 0);
			print("threadmain write %d returns %d\n", p[i], n);
			sleep(15000);
		}

		print("threadmain closing %d\n", p[i]);
		close(p[i]);
		print("threadmain closed %d\n", p[i]);
		sleep(15000);

		maineof = 0;
		subeof = 0;
		while(!maineof || !subeof) {
			print("threadmain while alt maineof=%d subeof=%d\n", maineof, subeof);
			switch(ret = alt(a)){
			case 0:
				print("main eof\n");
				maineof = 1;
				break;
			case 1:
				print("sub eof\n");
				subeof = 1;
				break;
			default:
				print("should not happen ret=%d\n", ret);
				sysfatal("should not happen");
			}
		}
		print("threadmain while alt done\n");
//		print("threadmain threadint mainid\n");
//		threadint(m->id);
//		print("threadmain done threadint mainid\n");
	}

	print("threadmain exiting\n");
	threadexits(nil);
}

* Re: [9fans] thread confusion
From: Russ Cox @ 2005-09-21 22:44 UTC
To: Axel Belinfante
Cc: Fans of the OS Plan 9 from Bell Labs

It appears that your program, at its core, is doing this:

	void
	readproc(void *v)
	{
		int fd;
		char buf[100];

		fd = (int)v;
		read(fd, buf, sizeof buf);
	}

	void
	threadmain(int argc, char **argv)
	{
		int p[2];

		pipe(p);
		proccreate(readproc, (void*)p[0], 8192);
		proccreate(readproc, (void*)p[1], 8192);

		close(p[0]);	/* and here you expect the first readproc to be done */
		close(p[1]);	/* and here the second */
	}

Each read call is holding up a reference to its channel
inside the kernel, so that even though you've closed the fd
and removed the ref from the fd table, there is still a reference
to each side of the pipe in the form of the process blocked
on the read.

I've never been sure whether the implicit ref held during
the system call is good behavior, but it's hard to change.

In your case, writing 0 (or anything) makes the read finish,
releasing the last ref to the underlying pipe when the
system call finishes, and then everything cleans up as
expected.

So you've found your workaround, and now we understand
why it works.

Russ

* Re: [9fans] thread confusion
From: rog @ 2005-09-26 18:40 UTC
To: 9fans

> I do use a timer, by having a proc that repeatedly sleeps
> and decrements a counter, and when the counter reaches
> zero it sends a (nil) timeout message on a channel.
> in alts I not only wait for the io channels but
> also for the timer timeout channel.
>
> the question is how to start and reset the timer.
> I have been thinking about using channels for that
> too, but that seems deadlock prone: how to avoid
> the case where I want to send a reset message to
> the timer when the timer wants to send an expiration
> message to me?

i think this is a good question.

i've found writing time-based code using the threads library to be
quite awkward. it seems to me that there may be room for an extension
to help with writing this kind of code.

the difficulty with the plan 9 thread library (and with Limbo too) is
that sleep(2) exists in a different universe to channels, so one has
to use a separate process to bridge the gap.

but when this thread is sleeping, it's not possible to communicate
with it, so one needs another thread to act as an intermediary, and
one has to design the interface carefully - if possible one doesn't
want a separate process and thread for each thread that wishes to wait
for a little while.

this implies multiplexing access to the sleeping process between many
other threads, which, depending on the kind of access required
(one-shot? repeating event? no-faster-than?) starts to make things
quite complex.

i haven't yet seen a nicely designed interface that starts to make
this kind of thing as easy as i think it could be.

Occam had "timer" variables which could be used like:

	TIMER time
	INT start
	SEQ
		time ? start
		ALT
			inputch ? val
				... no timeout, do something
			time ? AFTER start + 3 * Tickspersecond
				... time out after three seconds; do something else

this seems to me to be a nice solution - as long as this is
sufficiently lightweight, it's then easy to leverage this to build up
whatever other timer mechanisms one requires.

here are some attributes i'd like to see in a timing mechanism for the
thread library (however implemented):

	useful in many different scenarios.
	overhead and latency comparable with use of regular channels.
	capable of dealing with the full range of interval requests
		(sub-millisecond upwards).
	does not soak up CPU time when unused.
	reasonable accuracy.
	easy, robust and non-error-prone to use.

is something like this possible?

* Re: [9fans] thread confusion
From: Russ Cox @ 2005-09-26 18:52 UTC
To: Fans of the OS Plan 9 from Bell Labs

What if we did without the timer variable?

	alt {
	x = <-c =>
		print("%d\n", x);
	timeout 1 =>
		print("1 second passed\n");
	}

You could model the timer variable easily enough:

	t := time()+10;
	for(;;){
		alt {
		x = <-c =>
			print("%d\n", x);
		timeout t-time() =>
			print("10 seconds passed\n");
			break;
		}
	}

which do you think would be more common? I think the former,
hence the dropping of explicit timer variables.

Once you've figured out what a good interface is, implementing
it is subtle and difficult to get right. But it only needs to be done
right once and then everyone benefits. Channel communication
is complicated too under the hood, but it's still a good abstraction.

Russ

* Re: [9fans] thread confusion
From: rog @ 2005-09-26 19:20 UTC
To: 9fans

> What if we did without the timer variable?

sure. as you say, it's pretty much equivalent.

the only thing that troubles me is that the Occam version uses
absolute timestamps, where yours are relative.

when dealing with short intervals of time, given that scheduling is
somewhat arbitrary, the difference between:

	# do something every microsecond (relative timeout)
	for(;;){
		alt{
		timeout 1 =>
			# do something
		}
	}

and:

	# do something every microsecond (absolute timeout)
	t := now();
	for(;;){
		alt{
		timeout t =>
			# do something;
			t++;
		}
	}

might be significant.

the former version allows errors to accumulate, where the latter does
not. having calculated the timeout necessary for the alt, you never
know exactly when it is going to actually start; a relative timeout is
inevitably inaccurate.

i think that might be one of the reasons why the transputer folks (who
thought quite hard about things) chose absolute over relative
timeouts.

> Once you've figured out what a good interface is, implementing
> it is subtle and difficult to get right. But it only needs to be done
> right once and then everyone benefits. Channel communication
> is complicated too under the hood, but it's still a good abstraction.

i agree totally. this was my motivation in posting.

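The accumulation argument can be made concrete with a little
arithmetic. The sketch below simulates both loops on a made-up clock
(no real sleeping), assuming every wakeup arrives a fixed `latency`
late; relative_drift and absolute_drift are illustrative names, and
the constant latency is a simplification of "scheduling is somewhat
arbitrary".

```c
#include <assert.h>

/* Relative timeouts: each wait is re-armed from the (late) wakeup
 * time, so every iteration adds its lateness to the total.
 * Returns how far behind schedule the n-th event fires. */
long
relative_drift(int n, long period, long latency)
{
	long now;
	int i;

	assert(n > 0 && latency >= 0 && latency < period);
	now = 0;
	for(i = 0; i < n; i++)
		now += period + latency;	/* timeout 1, woken late */
	return now - n * period;
}

/* Absolute timeouts: each deadline is derived from the previous
 * deadline, so lateness is not carried into the next iteration. */
long
absolute_drift(int n, long period, long latency)
{
	long now, deadline;
	int i;

	assert(n > 0 && latency >= 0 && latency < period);
	now = 0;
	deadline = 0;
	for(i = 0; i < n; i++){
		deadline += period;		/* t++ */
		now = deadline + latency;	/* timeout t, woken late */
	}
	return now - n * period;
}
```

With 1000 iterations, a period of 1000 ticks and 50 ticks of latency,
the relative loop ends 50000 ticks behind schedule while the absolute
loop is only ever 50 ticks behind.
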
* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-27 10:12 UTC
To: Fans of the OS Plan 9 from Bell Labs

> > I do use a timer, by having a proc that repeatedly sleeps
> > and decrements a counter, and when the counter reaches
> > zero it sends a (nil) timeout message on a channel.
> > in alts I not only wait for the io channels but
> > also for the timer timeout channel.
> >
> > the question is how to start and reset the timer.
> > I have been thinking about using channels for that
> > too, but that seems deadlock prone: how to avoid
> > the case where I want to send a reset message to
> > the timer when the timer wants to send an expiration
> > message to me?
>
> i think this is a good question.
>
> i've found writing time-based code using the threads library to be
> quite awkward. it seems to me that there may be room for an extension
> to help with writing this kind of code.
>
> the difficulty with the plan 9 thread library (and with Limbo too) is
> that sleep(2) exists in a different universe to channels, so one has
> to use a separate process to bridge the gap.
>
> but when this thread is sleeping, it's not possible to communicate
> with it, so one needs another thread to act as an intermediary, and
> one has to design the interface carefully - if possible one doesn't
> want a separate process and thread for each thread that wishes to wait
> for a little while.

I have seen the followup discussion after this post and like
the idea of support for this in the thread library.
Indeed the accuracy may/will be higher than what I'm using right now
(but for my use it is not really an issue, I guess).

This is just to share the approach I have taken after
my initial posts on the topic.

(after some bad experience) I've abandoned the idea of a timer
process that only delivers expiration messages, and with which
one communicates to start and cancel timers.
Instead I'm using (something like) the etimer(2) approach.
I now have a proc that regularly sends ticks over a channel
using non-blocking sends, and I decrement timers and check
for expiration in the alt.
Hmm... thinking on it while writing this, I suppose that
tickproc could just as well use blocking sends.

	void
	tickproc(void *v)
	{
		for(;;) {
			sleep(tickTime);
			nbsend((Channel*)v, nil);
		}
	}

use of ticks:

	...
	t = ticksToWait;
	done = 0;
	while(!done)
		switch(alt(a)){
		case iochannel:
			done = 1;
			... other handling ...
			break;
		...
		case tickchannel:
			t--;
			if (t == 0) {
				done = 1;
				... handle timeout ...
			}
		}
	...

This seems to get the job done in a less complex and
cleaner way than what I was using before.

Axel.

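The tick-and-countdown scheme above can be approximated on a POSIX
system for anyone who wants to experiment without libthread. The
sketch below is an analogue under loud assumptions, not the author's
code: a forked child plays tickproc, a pipe plays the tick channel, a
non-blocking write plays nbsend, and poll() plays alt. The names
start_ticker and wait_io_or_timeout are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a ticker that writes one byte per tick. The write is
 * non-blocking, so a slow reader costs a dropped tick instead of a
 * stuck ticker. Returns the child's pid, or -1 on failure. */
static pid_t
start_ticker(int tickfd[2], int tick_ms)
{
	pid_t pid;

	if(pipe(tickfd) < 0)
		return -1;
	pid = fork();
	if(pid < 0){
		close(tickfd[0]);
		close(tickfd[1]);
		return -1;
	}
	if(pid == 0){
		close(tickfd[0]);
		fcntl(tickfd[1], F_SETFL, O_NONBLOCK);
		for(;;){
			poll(NULL, 0, tick_ms);		/* sleep(tickTime) */
			if(write(tickfd[1], "t", 1) < 0 && errno != EAGAIN)
				_exit(0);		/* reader is gone */
		}
	}
	close(tickfd[1]);
	return pid;
}

/* Hypothetical helper: returns 1 if iofd has something to handle,
 * 0 if `ticks` ticks passed first, -1 on error. */
int
wait_io_or_timeout(int iofd, int ticks, int tick_ms)
{
	struct pollfd fds[2];
	int tickfd[2], result;
	pid_t pid;
	char c;

	assert(ticks > 0 && tick_ms > 0);
	pid = start_ticker(tickfd, tick_ms);
	if(pid < 0)
		return -1;
	fds[0].fd = iofd;
	fds[0].events = POLLIN;
	fds[1].fd = tickfd[0];
	fds[1].events = POLLIN;
	result = -1;
	for(;;){
		if(poll(fds, 2, -1) < 0)
			break;
		if(fds[0].revents){		/* case iochannel */
			result = 1;
			break;
		}
		if(fds[1].revents & POLLIN){	/* case tickchannel */
			if(read(tickfd[0], &c, 1) == 1 && --ticks == 0){
				result = 0;
				break;
			}
		}else if(fds[1].revents)
			break;			/* ticker died unexpectedly */
	}
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	close(tickfd[0]);
	return result;
}
```

Starting a ticker per call keeps the sketch short; the message above
uses one long-lived tickproc instead, which avoids paying for process
creation on every wait.
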
* Re: [9fans] thread confusion
From: Fco. J. Ballesteros @ 2005-09-21 15:48 UTC
To: 9fans

: Programming with notes or signals is asking for trouble.
: Always. I'm sorry that threadint/threadkill are in the
: library.

Nothing that can't be solved using Cut. :-)
Apart from aquarela and execnet, how many programs are using this?

* [9fans] thread confusion
From: Fco. J. Ballesteros @ 2005-09-21 13:53 UTC
To: 9fans

: A/The main weird thing is that after doing threadint on a thread
: (created with proccreate) which presumably is hanging in a read
: sometimes(?) the process just disappears without leaving a trace,
: even though it is packed with syslog calls
: (of which only the first part gets executed).

Did you call threadnotify()?
Put a print there (in the handler) to see what's going on.

Also, forwarding advice Russ gave some time ago :-), read the Alef
paper and don't use interrupts at all.

BTW, if you are debugging, use threadsetname() and then use ps -a.
Let me know if I can help somehow.

* Re: [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 14:32 UTC
To: Fans of the OS Plan 9 from Bell Labs

> : A/The main weird thing is that after doing threadint on a thread
> : (created with proccreate) which presumably is hanging in a read
> : sometimes(?) the process just disappears without leaving a trace,
> : even though it is packed with syslog calls
> : (of which only the first part gets executed).
>
> Did you call threadnotify()?

No, threadint() (see below)

> Put a print there (the handler) to see what's going on.

lots of syslog already (instead of print - does that matter?)

> Also, forwarding advice Russ gave time before :-), read the alef paper
> and don't use interrupts at all.

Agreed. I tried to do that (not use interrupts).

The problem is that in this particular thread/proc I call
library routine tlsClient which may hang in a read from a
pipe of which I am holding the other end (to tunnel the messages).
It may happen that 'the other end' decides to give up on
a TLS handshake in progress and start a new TLS handshake,
in which case I have to 'clean up' my side of the handshake
in progress. This is where sometimes things go wrong:
it seems that just closing my end of the pipe is not sufficient
to get tlsClient out of the read.

That's why I tried to resort to threadint:
"Threadint interrupts a thread that is blocked in a
channel operation or system call"
(although now the question is what it means to be interrupted)

Probably I should instead investigate how/why closing
(my end of) the pipe is not sufficient.
Would not be surprised if I'm making a mistake that's so
trivial/basic that I'm just overlooking it :-(

(in the mean time my piece of code/administration
to deal with all this is starting to live a life
of its own so just getting it right and simple would
be good. back to the drawing board, I guess...)

Axel.

* [9fans] thread confusion
From: Axel Belinfante @ 2005-09-21 13:25 UTC
To: 9fans

I've been wrestling with the thread library trying to do resource
management in my 802.1x thingy.

This has proven to be harder than I envisioned - I guess I've been
spending more time trying to get this right (I want this right since
it may run as a daemon for a long time, doing a new tls handshake
every 20 minutes or even more often) than I think I've spent on the
protocol/state machine part :-(

One reason may be that I'm still not very experienced with thread(2).
Another may be that weird things are happening.

A/The main weird thing is that after doing threadint on a thread
(created with proccreate) which presumably is hanging in a read
sometimes(?) the process just disappears without leaving a trace,
even though it is packed with syslog calls
(of which only the first part gets executed).

Actually, it does leave a trace, because when invoking acid on
the main process and doing threads() or stacks() acid complains
that setproc cannot read /proc/XXX/mem where XXX presumably was
the pid of the disappeared process.
(this seems to suggest that also the thread administration
was not aware of the thread/process dying?)

This is hard to track since I'm not really able to reproduce it,
though I may be able to detect it when it happened.

Any ideas of what might be going on here, or how to debug this?

Another weird thing (bug?) is that threadpid always returns -1
(I started looking a bit into this, but maybe someone in the know
sees the problem immediately)

(furthermore thread(2) and thread.h seem to be inconsistent
regarding return values of (e.g.) threadint*, threadkill*
but this is not hard to fix; I could/can submit a patch)

Axel.

end of thread, other threads: [~2005-09-27 10:12 UTC | newest]

Thread overview: 16+ messages

2005-09-21 14:44 [9fans] thread confusion Fco. J. Ballesteros
2005-09-21 15:05 ` Axel Belinfante
2005-09-21 15:42   ` Russ Cox
2005-09-21 20:25     ` Axel Belinfante
2005-09-21 20:32       ` Axel Belinfante
2005-09-21 20:37       ` Russ Cox
2005-09-21 22:34         ` Axel Belinfante
2005-09-21 22:44           ` Russ Cox
2005-09-26 18:40       ` rog
2005-09-26 18:52         ` Russ Cox
2005-09-26 19:20           ` rog
2005-09-27 10:12         ` Axel Belinfante

-- strict thread matches above, loose matches on Subject: below --

2005-09-21 15:48 Fco. J. Ballesteros
2005-09-21 13:53 Fco. J. Ballesteros
2005-09-21 14:32 ` Axel Belinfante
2005-09-21 13:25 Axel Belinfante