* [9fans] self inflicted gunshot wound
@ 2009-04-07 14:16 erik quanstrom
2009-04-08 1:48 ` Russ Cox
0 siblings, 1 reply; 5+ messages in thread
From: erik quanstrom @ 2009-04-07 14:16 UTC
To: 9fans
like kutner, the plumber decided to off itself for
seemingly inscrutable reasons this morning.
the abort condition does not appear to hold:
	if(t > s+n)
		abort();
since 0x3a497 < 0x3a430+0x93 = 0x3a4c3, and also
a != nil, as would be required.
the interesting thing that happened at the
time was that one of plumber's clients was
off in the weeds waiting for something to
happen.
ideas?
- erik
abort()+0x0 /sys/src/libc/9sys/abort.c:6
plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
	n=0x93
	a=0x3e990
	s=0x3a430
	t=0x3a497
plumbpack(m=0x3c710,np=0x3e7c4)+0x31 /sys/src/libplumb/mesg.c:148
	ndata=0x10
	attr=0x6523
	n=0x1430
	buf=0x0
	p=0x3a330
drainqueue(d=0x1b288)+0x84 /sys/src/cmd/plumb/fsys.c:393
	prevs=0x0
	nexts=0x3eb30
	prevr=0x0
	i=0x0
	r=0x3a330
	s=0x3e7b0
	n=0x103cb
fsysread(buf=0x28f50,f=0x3c210,t=0x3a1f0)+0x1ed /sys/src/cmd/plumb/fsys.c:811
	o=0x17
	e=0x0
	clock=0x3a1f0
	b=0x3c210
	i=0x13
	d=0x1a7f
	n=0x1f494
fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
	t=0x3a1f0
	buf=0x28f50
	n=0x17
acid: regs()
PC 0x0000c80c abort /sys/src/libc/9sys/abort.c:6
SP 0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
CS 0x00000023 DS 0x0000001b SS 0x0000001b
GS 0x0000001b FS 0x0000001b ES 0x0000001b
TRAP 0x0000000e page fault
AX 0x0003a4c3 BX 0x0003a4c6 CX 0x0003a430 DX 0x00000093
DI 0x0003a4c7 SI 0x0003ea19 BP 0x0003e9f0
acid: stacks()
p=(Proc)0x3f090 pid 4505 Sched
t=(Thread)0x40f10 Rendez /sys/src/cmd/plumb/fsys.c:295 newfid
_threadrendezvous(tag=0x1939c,val=0x1)+0x11d /sys/src/libthread/rendez.c:56
qlock(q=0x1f448)+0x6f /sys/src/libc/9sys/qlock.c:74
newfid(fid=0x30d)+0x10 /sys/src/cmd/plumb/fsys.c:295
fsysproc()+0x165 /sys/src/cmd/plumb/fsys.c:261
launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
0xfefefefe ?file?:0
p=(Proc)0x3c750 pid 4506 Sched
t=(Thread)0x3be30 Rendez /sys/src/cmd/plumb/fsys.c:529 dispose
_threadrendezvous(tag=0x19390,val=0x1)+0x11d /sys/src/libthread/rendez.c:56
qlock(q=0x1f448)+0x6f /sys/src/libc/9sys/qlock.c:74
dispose(rs=0x0,m=0x39a70,e=0x0,t=0x28bc0,buf=0x68ff0)+0x10 /sys/src/cmd/plumb/fsys.c:529
fsyswrite(buf=0x68ff0,f=0x3c270,t=0x28bc0)+0x1ef /sys/src/cmd/plumb/fsys.c:898
fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
0xfefefefe ?file?:0
p=(Proc)0x39010 pid 16359 Running
t=(Thread)0x287a0 Running /sys/src/libplumb/mesg.c:125 plumbpackattr
abort()+0x0 /sys/src/libc/9sys/abort.c:6
plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
plumbpack(m=0x3c710,np=0x3e7c4)+0x31 /sys/src/libplumb/mesg.c:148
drainqueue(d=0x1b288)+0x84 /sys/src/cmd/plumb/fsys.c:393
fsysread(buf=0x28f50,f=0x3c210,t=0x3a1f0)+0x1ed /sys/src/cmd/plumb/fsys.c:811
fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
0xfefefefe ?file?:0
p=(Proc)0x6b030 pid 83108 Running
t=(Thread)0x39f50 Running /sys/src/cmd/plumb/fsys.c:241 fsysproc
pread()+0x7 /sys/src/libc/9syscall/pread.s:5
read(fd=0x6,buf=0x6d9f0,n=0x4)+0x2f /sys/src/libc/9sys/read.c:7
readn(n=0x4,av=0x6d9f0,f=0x6)+0x3a /sys/src/libc/port/readn.c:13
read9pmsg(abuf=0x6d9f0,fd=0x6,n=0x2018)+0x24 /sys/src/libc/9sys/read9pmsg.c:14
fsysproc()+0x74 /sys/src/cmd/plumb/fsys.c:241
launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
0xfefefefe ?file?:0
* Re: [9fans] self inflicted gunshot wound
2009-04-07 14:16 [9fans] self inflicted gunshot wound erik quanstrom
@ 2009-04-08 1:48 ` Russ Cox
2009-04-08 2:59 ` erik quanstrom
0 siblings, 1 reply; 5+ messages in thread
From: Russ Cox @ 2009-04-08 1:48 UTC
To: Fans of the OS Plan 9 from Bell Labs
> abort()+0x0 /sys/src/libc/9sys/abort.c:6
> plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
> n=0x93
> a=0x3e990
> s=0x3a430
> t=0x3a497
t is unlikely to be correct here; it would have been saved
at the last call to strlen but since then got +='ed with the result.
> acid: regs()
> PC 0x0000c80c abort /sys/src/libc/9sys/abort.c:6
> SP 0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
> CS 0x00000023 DS 0x0000001b SS 0x0000001b
> GS 0x0000001b FS 0x0000001b ES 0x0000001b
> TRAP 0x0000000e page fault
> AX 0x0003a4c3 BX 0x0003a4c6 CX 0x0003a430 DX 0x00000093
> DI 0x0003a4c7 SI 0x0003ea19 BP 0x0003e9f0
there's s+n in AX. t is likely to be BX or DI, judging from the
pointer values; it has either written 3 or 4 bytes past the
end of the allocated section, which explains the abort.
you'd have to disassemble plumbpackattr to make sure.
it would be interesting to print *(*plumbpackattr:s\s)
to see if the string is corrupted.
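The pointer arithmetic in this reply can be checked mechanically. The sketch below is portable C rather than Plan 9 C; the constants are copied from the acid output quoted above, and the helper names `overrun` and `bytespast` are made up for illustration. Which register actually holds t remains an assumption until plumbpackattr is disassembled.

```c
#include <stdint.h>

/* Values copied from the acid dump quoted in this thread. */
enum {
	S       = 0x3a430,	/* s: start of the packed buffer */
	N       = 0x93,		/* n: length computed for the buffer */
	T_STACK = 0x3a497,	/* t as printed in the stack trace */
	AX      = 0x3a4c3,
	BX      = 0x3a4c6,
	DI      = 0x3a4c7
};

/* Mirrors the quoted check: nonzero when t > s+n. */
int
overrun(uint32_t t, uint32_t s, uint32_t n)
{
	return t > s + n;
}

/* Bytes by which t lies beyond s+n, or 0 if t is within bounds. */
uint32_t
bytespast(uint32_t t, uint32_t s, uint32_t n)
{
	return overrun(t, s, n) ? t - (s + n) : 0;
}
```

With these values, `S + N` equals `AX`, the stack copy of t does not trip the check, and `bytespast` gives 3 for `BX` and 4 for `DI`, matching the "3 or 4 bytes past" estimate.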
> the interesting thing that happened at the
> time was that one of plumber's clients was
> off in the weeds waiting for something to
> happen.
i don't understand what you mean.
plumber's clients are always waiting for something
to happen. the plumber's only job is to tell them
when it does.
i suspect the global buffer in plumbpackattr's quote.
if you had multiple threads running through
plumbpackattr at once, it might cause the kind of crash you saw.
all the ordinary plumbing is protected by the QLock named queue,
but it looks like maybe if you'd been writing the rules file
at exactly the same time, that might have triggered
a simultaneous plumbpackattr call.
i'd prefer to be sure before throwing a lock in plumbpackattr.
russ
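A minimal sketch of the failure mode suspected here, assuming a quoting helper that hands back a pointer into one static buffer. `demo_quote`, `demo_race`, and `sharedbuf` are hypothetical stand-ins, not the libplumb source, and the interleaving is simulated in a single thread so the result is deterministic.

```c
#include <stdio.h>
#include <string.h>

/* One buffer shared by every caller: the suspected hazard. */
static char sharedbuf[64];

/* Hypothetical helper: wraps s in single quotes inside sharedbuf. */
char*
demo_quote(const char *s)
{
	snprintf(sharedbuf, sizeof sharedbuf, "'%s'", s);
	return sharedbuf;
}

/*
 * Simulates two callers interleaving. Caller A quotes avalue and
 * measures it, caller B then quotes bvalue into the same buffer,
 * and A finally reads what its pointer now refers to. Returns how
 * many bytes longer the string is than what A measured.
 */
size_t
demo_race(const char *avalue, const char *bvalue)
{
	size_t measured, copied;
	char *q;

	q = demo_quote(avalue);		/* A: quote and measure */
	measured = strlen(q);
	demo_quote(bvalue);		/* B: reuses the buffer behind A's back */
	copied = strlen(q);		/* A: sees B's data through q */
	return copied > measured ? copied - measured : 0;
}
```

If A sized its output from `measured` and then copied `copied` bytes, it would run past the end of its allocation, which is the kind of overrun the abort check catches.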
* Re: [9fans] self inflicted gunshot wound
2009-04-08 1:48 ` Russ Cox
@ 2009-04-08 2:59 ` erik quanstrom
2009-04-08 3:33 ` Russ Cox
0 siblings, 1 reply; 5+ messages in thread
From: erik quanstrom @ 2009-04-08 2:59 UTC
To: 9fans
On Tue Apr 7 21:50:14 EDT 2009, rsc@swtch.com wrote:
> > abort()+0x0 /sys/src/libc/9sys/abort.c:6
> > plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
> > n=0x93
> > a=0x3e990
> > s=0x3a430
> > t=0x3a497
>
> t is unlikely to be correct here; it would have been saved
> at the last call to strlen but since then got +='ed with the result.
>
> > acid: regs()
> > PC 0x0000c80c abort /sys/src/libc/9sys/abort.c:6
> > SP 0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
> > CS 0x00000023 DS 0x0000001b SS 0x0000001b
> > GS 0x0000001b FS 0x0000001b ES 0x0000001b
> > TRAP 0x0000000e page fault
> > AX 0x0003a4c3 BX 0x0003a4c6 CX 0x0003a430 DX 0x00000093
> > DI 0x0003a4c7 SI 0x0003ea19 BP 0x0003e9f0
>
> there's s+n in AX. t is likely to be BX or DI, judging from the
> pointer values; it has either written 3 or 4 bytes past the
> end of the allocated section, which explains the abort.
> you'd have to disassemble plumbpackattr to make sure.
> it would be interesting to print *(*plumbpackattr:s\s)
> to see if the string is corrupted.
acid: *(*plumbpackattr:s\s)
filetype=mail sender=xxxx@xxxx.xxx length=8749 mailtype=delete date='Sun Mar4de7153cecd4a9b45aead1clfs digest=aff98fb56526d94ab768adbc93d12d989a11ed53
hmmm. i think you might be on to something after all.
maybe it was fratricide.
> i don't understand what you mean.
> plumber's clients are always waiting for something
> to happen. the plumber's only job is to tell them
> when it does.
several were waiting on something else to happen; they were
sleeping waiting for an exclusive-open file. the only reason
i mentioned it is that there may have been 5 minutes between the
time that plumber tried to write the message and when it
could be delivered.
> i suspect the global buffer in plumbpackattr's quote.
> if you had multiple threads running through
> plumbpackattr at once, it might cause the kind of crash you saw.
> all the ordinary plumbing is protected by the QLock named queue,
> but it looks like maybe if you'd been writing the rules file
> at exactly the same time, that might have triggered
> a simultaneous plumbpackattr call.
unfortunately, i was not writing the rules file, that i know of.
- erik
* Re: [9fans] self inflicted gunshot wound
2009-04-08 2:59 ` erik quanstrom
@ 2009-04-08 3:33 ` Russ Cox
2009-04-08 3:36 ` erik quanstrom
0 siblings, 1 reply; 5+ messages in thread
From: Russ Cox @ 2009-04-08 3:33 UTC
To: Fans of the OS Plan 9 from Bell Labs
> acid: *(*plumbpackattr:s\s)
> filetype=mail sender=xxxx@xxxx.xxx length=8749 mailtype=delete
> date='Sun Mar4de7153cecd4a9b45aead1clfs
> digest=aff98fb56526d94ab768adbc93d12d989a11ed53
> several were waiting on something else to happen; they were
> sleeping waiting for an exclusive-open file. the only reason
> i mentioned it is that there may have been 5 minutes between the
> time that plumber tried to write the message and when it
> could be delivered.
aha. plumbunpackattr is also using attrbuf.
that explains it. a new plumbing message came
in at the same time an old one was being
delivered. this can only happen if a reader
gets behind and is catching up while new
messages are still coming in.
i would put a lock around the use of attrbuf
in both plumbpackattr and plumbunpackattr.
russ
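One way the suggested lock could look, sketched in portable C with POSIX threads instead of Plan 9's libthread (where a QLock with qlock/qunlock, as seen in the stack traces earlier in the thread, would be the natural choice). `sharedbuf`, `buflock`, `locked_quote`, and `run_workers` are hypothetical names; the real attrbuf code is not reproduced here.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char sharedbuf[64];
static pthread_mutex_t buflock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Quotes s through the shared buffer and returns a private copy.
 * The lock covers both the write and the read of sharedbuf, so no
 * other caller can change it in between. The caller frees the result.
 */
char*
locked_quote(const char *s)
{
	char *copy;
	size_t n;

	pthread_mutex_lock(&buflock);
	snprintf(sharedbuf, sizeof sharedbuf, "'%s'", s);
	n = strlen(sharedbuf) + 1;
	copy = malloc(n);
	if(copy != NULL)
		memcpy(copy, sharedbuf, n);
	pthread_mutex_unlock(&buflock);
	return copy;
}

struct worker {
	const char *value;	/* input to quote */
	const char *want;	/* expected quoted form */
	int iters;
	int bad;		/* count of results that did not match */
};

static void*
worker_main(void *arg)
{
	struct worker *w = arg;
	char *q;
	int i;

	for(i = 0; i < w->iters; i++){
		q = locked_quote(w->value);
		if(q == NULL || strcmp(q, w->want) != 0)
			w->bad++;
		free(q);
	}
	return NULL;
}

/* Runs two workers at once; returns total mismatches, or -1 on error. */
int
run_workers(int iters)
{
	struct worker a = {"mail", "'mail'", iters, 0};
	struct worker b = {"delete", "'delete'", iters, 0};
	pthread_t ta, tb;

	if(pthread_create(&ta, NULL, worker_main, &a) != 0)
		return -1;
	if(pthread_create(&tb, NULL, worker_main, &b) != 0){
		pthread_join(ta, NULL);
		return -1;
	}
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return a.bad + b.bad;
}
```

The point of the design is that the lock is held for the whole use of the buffer, not just the write; releasing it before the copy would reopen the same window.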
* Re: [9fans] self inflicted gunshot wound
2009-04-08 3:33 ` Russ Cox
@ 2009-04-08 3:36 ` erik quanstrom
0 siblings, 0 replies; 5+ messages in thread
From: erik quanstrom @ 2009-04-08 3:36 UTC
To: 9fans
> i would put a lock around the use of attrbuf
> in both plumbpackattr and plumbunpackattr.
>
> russ
why not just use malloc?
- erik
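A sketch of the malloc alternative, assuming the caller is willing to free the result. `quote_alloc` is a hypothetical helper, not a libplumb function, and it ignores the escaping of embedded quote characters that a real quoting routine would need.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a freshly allocated copy of s wrapped in single quotes,
 * or NULL if allocation fails. Each call owns its own buffer, so
 * concurrent callers cannot overwrite each other's result.
 */
char*
quote_alloc(const char *s)
{
	size_t n;
	char *q;

	n = strlen(s);
	q = malloc(n + 3);	/* two quotes plus the terminating NUL */
	if(q == NULL)
		return NULL;
	q[0] = '\'';
	memcpy(q + 1, s, n);
	q[n + 1] = '\'';
	q[n + 2] = '\0';
	return q;
}
```

Compared with a lock, this trades an allocation per call for having no shared state at all; the cost is that every caller must remember to free the string.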