9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] self inflicted gunshot wound
From: erik quanstrom @ 2009-04-07 14:16 UTC (permalink / raw)
  To: 9fans

like kutner, the plumber decided to off itself for
seemingly inscrutable reasons this morning.

the abort condition does not appear to hold:
	if(t > s+n)
		abort();
since 0x3a497 < 0x3a430+0x93 and also
a != nil, as would be required.

the interesting thing that happened at the
time was that one of plumber's clients was
off in the weeds waiting for something to
happen.

ideas?

- erik

abort()+0x0 /sys/src/libc/9sys/abort.c:6
plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
	n=0x93
	a=0x3e990
	s=0x3a430
	t=0x3a497
plumbpack(m=0x3c710,np=0x3e7c4)+0x31 /sys/src/libplumb/mesg.c:148
	ndata=0x10
	attr=0x6523
	n=0x1430
	buf=0x0
	p=0x3a330
drainqueue(d=0x1b288)+0x84 /sys/src/cmd/plumb/fsys.c:393
	prevs=0x0
	nexts=0x3eb30
	prevr=0x0
	i=0x0
	r=0x3a330
	s=0x3e7b0
	n=0x103cb
fsysread(buf=0x28f50,f=0x3c210,t=0x3a1f0)+0x1ed /sys/src/cmd/plumb/fsys.c:811
	o=0x17
	e=0x0
	clock=0x3a1f0
	b=0x3c210
	i=0x13
	d=0x1a7f
	n=0x1f494
fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
	t=0x3a1f0
	buf=0x28f50
	n=0x17

acid: regs()
PC	0x0000c80c abort  /sys/src/libc/9sys/abort.c:6
SP	0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
CS	0x00000023 DS	 0x0000001b SS	0x0000001b
GS	0x0000001b FS	 0x0000001b ES	0x0000001b
TRAP	0x0000000e page fault
AX	0x0003a4c3 BX	0x0003a4c6 CX	0x0003a430 DX	0x00000093
DI	0x0003a4c7 SI	0x0003ea19 BP	0x0003e9f0

acid: stacks()
p=(Proc)0x3f090    pid 4505  Sched
	t=(Thread)0x40f10    Rendez     /sys/src/cmd/plumb/fsys.c:295 newfid
		_threadrendezvous(tag=0x1939c,val=0x1)+0x11d /sys/src/libthread/rendez.c:56
		qlock(q=0x1f448)+0x6f /sys/src/libc/9sys/qlock.c:74
		newfid(fid=0x30d)+0x10 /sys/src/cmd/plumb/fsys.c:295
		fsysproc()+0x165 /sys/src/cmd/plumb/fsys.c:261
		launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
		0xfefefefe ?file?:0


p=(Proc)0x3c750    pid 4506  Sched
	t=(Thread)0x3be30    Rendez     /sys/src/cmd/plumb/fsys.c:529 dispose
		_threadrendezvous(tag=0x19390,val=0x1)+0x11d /sys/src/libthread/rendez.c:56
		qlock(q=0x1f448)+0x6f /sys/src/libc/9sys/qlock.c:74
		dispose(rs=0x0,m=0x39a70,e=0x0,t=0x28bc0,buf=0x68ff0)+0x10 /sys/src/cmd/plumb/fsys.c:529
		fsyswrite(buf=0x68ff0,f=0x3c270,t=0x28bc0)+0x1ef /sys/src/cmd/plumb/fsys.c:898
		fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
		launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
		0xfefefefe ?file?:0


p=(Proc)0x39010    pid 16359  Running
	t=(Thread)0x287a0    Running    /sys/src/libplumb/mesg.c:125 plumbpackattr
		abort()+0x0 /sys/src/libc/9sys/abort.c:6
		plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
		plumbpack(m=0x3c710,np=0x3e7c4)+0x31 /sys/src/libplumb/mesg.c:148
		drainqueue(d=0x1b288)+0x84 /sys/src/cmd/plumb/fsys.c:393
		fsysread(buf=0x28f50,f=0x3c210,t=0x3a1f0)+0x1ed /sys/src/cmd/plumb/fsys.c:811
		fsysproc()+0x186 /sys/src/cmd/plumb/fsys.c:262
		launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
		0xfefefefe ?file?:0


p=(Proc)0x6b030    pid 83108  Running
	t=(Thread)0x39f50    Running    /sys/src/cmd/plumb/fsys.c:241 fsysproc
		pread()+0x7 /sys/src/libc/9syscall/pread.s:5
		read(fd=0x6,buf=0x6d9f0,n=0x4)+0x2f /sys/src/libc/9sys/read.c:7
		readn(n=0x4,av=0x6d9f0,f=0x6)+0x3a /sys/src/libc/port/readn.c:13
		read9pmsg(abuf=0x6d9f0,fd=0x6,n=0x2018)+0x24 /sys/src/libc/9sys/read9pmsg.c:14
		fsysproc()+0x74 /sys/src/cmd/plumb/fsys.c:241
		launcher386(arg=0x0,f=0x17f6)+0x10 /sys/src/libthread/386.c:10
		0xfefefefe ?file?:0



* Re: [9fans] self inflicted gunshot wound
From: Russ Cox @ 2009-04-08  1:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> abort()+0x0 /sys/src/libc/9sys/abort.c:6
> plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
>        n=0x93
>        a=0x3e990
>        s=0x3a430
>        t=0x3a497

t is unlikely to be correct here; it would have been saved
at the last call to strlen but since then got +='ed with the result.

> acid: regs()
> PC      0x0000c80c abort  /sys/src/libc/9sys/abort.c:6
> SP      0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
> CS      0x00000023 DS    0x0000001b SS  0x0000001b
> GS      0x0000001b FS    0x0000001b ES  0x0000001b
> TRAP    0x0000000e page fault
> AX      0x0003a4c3 BX   0x0003a4c6 CX   0x0003a430 DX   0x00000093
> DI      0x0003a4c7 SI   0x0003ea19 BP   0x0003e9f0

there's s+n in AX.  t is likely to be BX or DI, judging from the
pointer values; it has either written 3 or 4 bytes past the
end of the allocated section, which explains the abort.
you'd have to disassemble plumbpackattr to make sure.
it would be interesting to print *(*plumbpackattr:s\s)
to see if the string is corrupted.
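
The register arithmetic can be checked mechanically. A minimal sketch in portable C rather than Plan 9 C, assuming CX holds s and DX holds n (their values match the stack trace); the helper name overrun is hypothetical and not part of libplumb:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes by which t lies beyond s+n.
 * Zero or negative means t is still inside the buffer.
 * Hypothetical helper for checking the numbers in the dump.
 */
ptrdiff_t
overrun(uintptr_t t, uintptr_t s, uintptr_t n)
{
	assert(s + n >= s);	/* assume the end address does not wrap */
	return (ptrdiff_t)t - (ptrdiff_t)(s + n);
}
```

With the dump's values, overrun(0x3a4c6, 0x3a430, 0x93) is 3 for BX and overrun(0x3a4c7, 0x3a430, 0x93) is 4 for DI. The stale t from the stack trace, 0x3a497, gives -44, which is why the abort condition looked false.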

> the interesting thing that happened at the
> time was that one of plumber's clients was
> off in the weeds waiting for something to
> happen.

i don't understand what you mean.
plumber's clients are always waiting for something
to happen.  the plumber's only job is to tell them
when it does.

i suspect the global buffer in plumbpackattr's quote.
if you had multiple threads running through
plumbpackattr at once, it might cause the kind of crash you saw.
all the ordinary plumbing is protected by the QLock named queue,
but it looks like maybe if you'd been writing the rules file
at exactly the same time, that might have triggered
a simultaneous plumbpackattr call.

i'd prefer to be sure before throwing a lock in plumbpackattr.
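
To illustrate the suspected hazard, here is a minimal sketch in portable C, not the libplumb source. The names sharedbuf and quote_shared and the quoting rule (wrap in single quotes, double any embedded quote) are assumptions made for the example. It shows sequentially what two interleaved callers would do to each other: the second call rewrites the storage the first caller is still holding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { Bufsize = 64 };

/* one scratch buffer shared by every caller (hypothetical name) */
static char sharedbuf[Bufsize];

/*
 * Quote s into the shared buffer and return a pointer to it.
 * Every caller gets the same pointer, so a later call
 * silently replaces an earlier caller's result.
 */
char*
quote_shared(const char *s)
{
	char *t;

	assert(2*strlen(s) + 3 <= Bufsize);	/* worst case: every byte doubled */
	t = sharedbuf;
	*t++ = '\'';
	for(; *s != '\0'; s++){
		if(*s == '\'')
			*t++ = '\'';	/* double an embedded quote */
		*t++ = *s;
	}
	*t++ = '\'';
	*t = '\0';
	return sharedbuf;
}
```

If one caller keeps the pointer returned for "Sun Mar" and another then quotes "delete", the first caller's string now reads 'delete'. With real threads the overwrite can land partway through a copy, which would leave a spliced string and a length that no longer matches what was measured.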

russ


* Re: [9fans] self inflicted gunshot wound
From: erik quanstrom @ 2009-04-08  2:59 UTC (permalink / raw)
  To: 9fans

On Tue Apr  7 21:50:14 EDT 2009, rsc@swtch.com wrote:
> > abort()+0x0 /sys/src/libc/9sys/abort.c:6
> > plumbpackattr(attr=0x28b00)+0x126 /sys/src/libplumb/mesg.c:125
> >        n=0x93
> >        a=0x3e990
> >        s=0x3a430
> >        t=0x3a497
>
> t is unlikely to be correct here; it would have been saved
> at the last call to strlen but since then got +='ed with the result.
>
> > acid: regs()
> > PC      0x0000c80c abort  /sys/src/libc/9sys/abort.c:6
> > SP      0x00068e78 ECODE 0x00000004 EFLAG 0x00000206
> > CS      0x00000023 DS    0x0000001b SS  0x0000001b
> > GS      0x0000001b FS    0x0000001b ES  0x0000001b
> > TRAP    0x0000000e page fault
> > AX      0x0003a4c3 BX   0x0003a4c6 CX   0x0003a430 DX   0x00000093
> > DI      0x0003a4c7 SI   0x0003ea19 BP   0x0003e9f0
>
> there's s+n in AX.  t is likely to be BX or DI, judging from the
> pointer values; it has either written 3 or 4 bytes past the
> end of the allocated section, which explains the abort.
> you'd have to disassemble plumbpackattr to make sure.
> it would be interesting to print *(*plumbpackattr:s\s)
> to see if the string is corrupted.

acid: *(*plumbpackattr:s\s)
filetype=mail sender=xxxx@xxxx.xxx length=8749 mailtype=delete date='Sun Mar4de7153cecd4a9b45aead1clfs digest=aff98fb56526d94ab768adbc93d12d989a11ed53

hmmm.  i think you might be on to something after all.
maybe it was fratricide.

> i don't understand what you mean.
> plumber's clients are always waiting for something
> to happen.  the plumber's only job is to tell them
> when it does.

several were waiting on something else to happen; they were
sleeping waiting for an exclusive-open file.  the only reason
i mentioned it is that there may have been 5 minutes between the
time that plumber tried to write the message and when it
could be delivered.

> i suspect the global buffer in plumbpackattr's quote.
> if you had multiple threads running through
> plumbpackattr at once, it might cause the kind of crash you saw.
> all the ordinary plumbing is protected by the QLock named queue,
> but it looks like maybe if you'd been writing the rules file
> at exactly the same time, that might have triggered
> a simultaneous plumbpackattr call.

unfortunately, i was not writing the rules file, that i know of.

- erik



* Re: [9fans] self inflicted gunshot wound
From: Russ Cox @ 2009-04-08  3:33 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> acid: *(*plumbpackattr:s\s)
> filetype=mail sender=xxxx@xxxx.xxx length=8749 mailtype=delete
> date='Sun Mar4de7153cecd4a9b45aead1clfs
> digest=aff98fb56526d94ab768adbc93d12d989a11ed53

> several were waiting on something else to happen; they were
> sleeping waiting for an exclusive-open file.  the only reason
> i mentioned it is that there may have been 5 minutes between the
> time that plumber tried to write the message and when it
> could be delivered.

aha.  plumbunpackattr is also using attrbuf.
that explains it.  a new plumbing message came
in at the same time an old one was being
delivered.  this can only happen if a reader
gets behind and is catching up while new
messages are still coming in.

i would put a lock around the use of attrbuf
in both plumbpackattr and plumbunpackattr.
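
A minimal sketch of that locking discipline, written in portable C with a POSIX mutex standing in for a Plan 9 lock. Only the name attrbuf comes from this thread; attrlock, pack_locked, the buffer size and the quoting rule are assumptions for illustration, not the libplumb code. The point is that the lock is held from the first write to attrbuf until its contents have been copied out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

enum { Bufsize = 64 };

static char attrbuf[Bufsize];	/* shared scratch buffer */
static pthread_mutex_t attrlock = PTHREAD_MUTEX_INITIALIZER;	/* hypothetical */

/*
 * Quote val in attrbuf and copy the result to dst, all under attrlock.
 * Returns the number of bytes copied (excluding the NUL),
 * or -1 if the result does not fit in attrbuf or dst.
 */
int
pack_locked(char *dst, size_t ndst, const char *val)
{
	size_t n;
	char *t;
	int r;

	assert(dst != NULL && val != NULL);
	if(2*strlen(val) + 3 > Bufsize)
		return -1;
	pthread_mutex_lock(&attrlock);
	t = attrbuf;
	*t++ = '\'';
	for(; *val != '\0'; val++){
		if(*val == '\'')
			*t++ = '\'';	/* double an embedded quote */
		*t++ = *val;
	}
	*t++ = '\'';
	*t = '\0';
	n = (size_t)(t - attrbuf);
	if(n + 1 <= ndst){
		memcpy(dst, attrbuf, n + 1);	/* copy out before unlocking */
		r = (int)n;
	}else
		r = -1;
	pthread_mutex_unlock(&attrlock);
	return r;
}
```

An unpacking routine that also uses attrbuf would have to take the same attrlock for the fix to hold; locking only one side leaves the race in place.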

russ


* Re: [9fans] self inflicted gunshot wound
From: erik quanstrom @ 2009-04-08  3:36 UTC (permalink / raw)
  To: 9fans

> i would put a lock around the use of attrbuf
> in both plumbpackattr and plumbunpackattr.
>
> russ

why not just use malloc?
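
A minimal sketch of what the malloc alternative might look like, in portable C. The name quote_alloc and the quoting rule (wrap in single quotes, double any embedded quote) are assumptions for the example, not the libplumb code. Each call gets its own storage, so no lock is needed, at the cost of an allocation per attribute and a caller that must free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated quoted copy of s,
 * or NULL if malloc fails.  The caller frees the result.
 */
char*
quote_alloc(const char *s)
{
	char *buf, *t;

	assert(s != NULL);
	buf = malloc(2*strlen(s) + 3);	/* worst case: every byte doubled */
	if(buf == NULL)
		return NULL;
	t = buf;
	*t++ = '\'';
	for(; *s != '\0'; s++){
		if(*s == '\'')
			*t++ = '\'';	/* double an embedded quote */
		*t++ = *s;
	}
	*t++ = '\'';
	*t = '\0';
	return buf;
}
```

Two results obtained this way never share storage, so a message being delivered late cannot be disturbed by a new one arriving.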

- erik


