9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] odd out-of-memory behavior
@ 2008-07-25 16:44 erik quanstrom
  2008-07-25 18:09 ` Russ Cox
  0 siblings, 1 reply; 7+ messages in thread
From: erik quanstrom @ 2008-07-25 16:44 UTC (permalink / raw)
  To: 9fans

i'm seeing some out-of-memory behavior i don't quite understand.
there is no swap configured.  the machine is a cpuserver.
the symptom is that this message is repeated on the console
maybe 20 times.  fs in this case is upas/fs.  (the standard one.)

311954: fs killed: out of memory
out of physical memory; no swap configured
311954: fs killed: out of memory
out of physical memory; no swap configured
311954: fs killed: out of memory

there's a sleep of 5 seconds after killbig
is called.  so, though it's hard to imagine,
it must be taking 100s to clean up this process.

i'm not sure i have any ideas on how this could
happen.

- erik





* Re: [9fans] odd out-of-memory behavior
  2008-07-25 16:44 [9fans] odd out-of-memory behavior erik quanstrom
@ 2008-07-25 18:09 ` Russ Cox
  2008-07-25 18:25   ` erik quanstrom
  2008-07-27  1:20   ` erik quanstrom
  0 siblings, 2 replies; 7+ messages in thread
From: Russ Cox @ 2008-07-25 18:09 UTC (permalink / raw)
  To: 9fans

> 311954: fs killed: out of memory
> out of physical memory; no swap configured
> 311954: fs killed: out of memory
> out of physical memory; no swap configured
> 311954: fs killed: out of memory
>
> there's a sleep of 5 seconds after killbig
> is called.  so, though it's hard to imagine,
> it must be taking 100s to clean up this process.

/sys/src/9/port/proc.c:/^killbig marks the process
to be killed, but if it can't acquire the lock on that
process's segments, the memory is not actually
freed immediately:

	kp->procctl = Proc_exitbig;
	for(i = 0; i < NSEG; i++) {
		s = kp->seg[i];
		if(s != 0 && canqlock(&s->lk)) {
			mfreeseg(s, s->base, (s->top - s->base)/BY2PG);
			qunlock(&s->lk);
		}
	}

Perhaps another upas/fs proc sharing the same
segment is holding the segment lock and
blocking on something else.
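
The skip-if-busy behavior of that loop can be reproduced in user space.
What follows is only a sketch under stated assumptions, not the kernel
code: a pthread mutex stands in for the kernel QLock,
pthread_mutex_trylock stands in for canqlock, and the Seg layout, the
NSEG value, and the freesegs name are all made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { NSEG = 4 };	/* hypothetical value, not the kernel's */

typedef struct Seg Seg;
struct Seg {
	pthread_mutex_t lk;	/* assumption: stands in for the kernel QLock */
	int pages;		/* resident pages still held by this segment */
};

/*
 * Try to free every segment, skipping any whose lock is busy.
 * Returns the number of pages actually released.  A skipped
 * segment keeps all of its pages; nothing retries it here.
 */
int
freesegs(Seg *seg[], int nseg)
{
	int i, freed;
	Seg *s;

	assert(nseg >= 0);
	freed = 0;
	for(i = 0; i < nseg; i++){
		s = seg[i];
		/* trylock returns 0 on success, where canqlock returns nonzero */
		if(s != NULL && pthread_mutex_trylock(&s->lk) == 0){
			freed += s->pages;
			s->pages = 0;
			pthread_mutex_unlock(&s->lk);
		}
	}
	return freed;
}
```

If the lock on the largest segment is held when freesegs runs, only the
small segments are released and the call still returns normally, which
would look from the outside like a kill that freed almost nothing.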

If you can make it happen again, you could try
to run

	acid -k -l kernel 1 /386/9pccpu  # or your kernel image
	stacks()

though of course without any memory it's going to
be hard to start acid.  You might be able to pull it off
if you cpu somewhere else, bind /mnt/term/proc /proc,
and then start acid there before you run the machine
out of memory.  As long as the exportfs serving /mnt/term
doesn't need any new memory pages, it should be able
to serve /proc well enough to the remote acid.

Russ




* Re: [9fans] odd out-of-memory behavior
  2008-07-25 18:09 ` Russ Cox
@ 2008-07-25 18:25   ` erik quanstrom
  2008-07-25 19:39     ` Russ Cox
  2008-07-27  1:20   ` erik quanstrom
  1 sibling, 1 reply; 7+ messages in thread
From: erik quanstrom @ 2008-07-25 18:25 UTC (permalink / raw)
  To: 9fans

unfortunately or fortunately this is a rare problem.
hopefully the caching upas will mature faster than
our mailboxes grow.

thanks for re-pointing out the acid tricks.  i shall
lay a trap.  but in the interest of covering the careful
thought bit ...

> /sys/src/9/port/proc.c:/^killbig marks the process
> to be killed, but if it can't acquire the lock on that
> process's segments, the memory is not actually
> freed immediately:
>
> 	kp->procctl = Proc_exitbig;
> 	for(i = 0; i < NSEG; i++) {
> 		s = kp->seg[i];
> 		if(s != 0 && canqlock(&s->lk)) {
> 			mfreeseg(s, s->base, (s->top - s->base)/BY2PG);
> 			qunlock(&s->lk);
> 		}
> 	}
>
> Perhaps another upas/fs proc sharing the same
> segment is holding the segment lock and
> blocking on something else.

how would that happen?  upas/fs -p doesn't fork.
(it's being run from imap4d.)

is there some other reason that segments would
be shared?

i originally thought someone else might be sitting
on the shared segments, but i couldn't explain how
that might be happening.  i also thought the purpose
of this loop was to hunt down relatives sharing memory
with killbig's vic:

	for(p = procalloc.arena; p < ep; p++) {
		if(p->state == Dead || p->kp)
			continue;
		if(p != kp && p->seg[BSEG] && p->seg[BSEG] == kp->seg[BSEG])
			p->procctl = Proc_exitbig;
	}

so much for the "careful" thought.  what am i missing?

- erik





* Re: [9fans] odd out-of-memory behavior
  2008-07-25 18:25   ` erik quanstrom
@ 2008-07-25 19:39     ` Russ Cox
  0 siblings, 0 replies; 7+ messages in thread
From: Russ Cox @ 2008-07-25 19:39 UTC (permalink / raw)
  To: 9fans

> how would that happen?  upas/fs -p doesn't fork.
> (it's being run from imap4d.)

maybe more than one process isn't involved.
that would make your job easier.  ☺

> i originally thought someone else might be sitting
> on the shared segments, but i couldn't explain how
> that might be happening.  i also thought the purpose
> of this loop was to hunt down relatives sharing memory
> with killbig's vic:
>
> 	for(p = procalloc.arena; p < ep; p++) {
> 		if(p->state == Dead || p->kp)
> 			continue;
> 		if(p != kp && p->seg[BSEG] && p->seg[BSEG] == kp->seg[BSEG])
> 			p->procctl = Proc_exitbig;
> 	}
>
> so much for the "careful" thought.  what am i missing?

that loop identifies and marks them, but it doesn't kill them.
they won't die until the next time they attempt to cross
the kernel-user boundary.
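
the difference between marking and killing can be sketched in ordinary
user-space c.  this is an illustration under assumptions, not kernel
code: the Proc layout, the state names, and the markbig and boundary
functions are hypothetical; only the Proc_exitbig name is borrowed from
the loop quoted above.

```c
#include <assert.h>
#include <stddef.h>

enum { Proc_nothing, Proc_exitbig };	/* Proc_exitbig borrowed from the kernel; values arbitrary */
enum { Running, Blocked, Dead };	/* hypothetical, simplified states */

typedef struct Proc Proc;
struct Proc {
	int state;
	int procctl;	/* pending request, written by another process */
	int pages;	/* memory still held */
};

/* what the killer does: leave a note.  no memory is freed here. */
void
markbig(Proc *p)
{
	assert(p != NULL);
	p->procctl = Proc_exitbig;
}

/*
 * what the victim does when it crosses the kernel-user boundary.
 * a blocked process never reaches this check, so its note goes
 * unread and its pages stay allocated.
 * returns the number of pages released.
 */
int
boundary(Proc *p)
{
	int n;

	if(p->state != Running || p->procctl != Proc_exitbig)
		return 0;
	n = p->pages;
	p->pages = 0;
	p->state = Dead;
	return n;
}
```

in this model a marked process that stays blocked holds its memory
indefinitely; the pages come back only once it runs again and reaches
the boundary check.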

i also wonder if perhaps there is some way that you
can manage to end up sleeping for the pager in fixfault
while holding the lock of the big segment.  i don't see
one immediately, but that doesn't mean it's not there.

if you can get acid running, 100 seconds should be plenty
of time to get stack traces that would solve this.

russ




* Re: [9fans] odd out-of-memory behavior
  2008-07-25 18:09 ` Russ Cox
  2008-07-25 18:25   ` erik quanstrom
@ 2008-07-27  1:20   ` erik quanstrom
  2008-07-27  2:20     ` Russ Cox
  2008-07-27  2:21     ` Russ Cox
  1 sibling, 2 replies; 7+ messages in thread
From: erik quanstrom @ 2008-07-27  1:20 UTC (permalink / raw)
  To: rsc, 9fans

> If you can make it happen again, you could try
> to run
>
> 	acid -k -l kernel 1 /386/9pccpu  # or your kernel image
> 	stacks()
>

it's not immediately obvious what i am doing wrong:

akin# acid -k -l kernel 1 /386/9pccpu
/386/9pccpu:386 plan 9 boot image
/sys/lib/acid/port
/sys/lib/acid/386
/sys/lib/acid/kernel
acid: include("acid")
acid: include("procacid")
acid: stacks()
=========================================================
0xf0312008 1: init dennis pc 0x00008984 Await (Wakeme) ut 2 st 2 qpc 0x00000000
<stdin>:5: (error) no stack frame: can't translate address 0xf001bf30

- erik




* Re: [9fans] odd out-of-memory behavior
  2008-07-27  1:20   ` erik quanstrom
@ 2008-07-27  2:20     ` Russ Cox
  2008-07-27  2:21     ` Russ Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Russ Cox @ 2008-07-27  2:20 UTC (permalink / raw)
  To: quanstro; +Cc: 9fans

> it's not immediately obvious what i am doing wrong:
>
> akin# acid -k -l kernel 1 /386/9pccpu
> /386/9pccpu:386 plan 9 boot image
> /sys/lib/acid/port
> /sys/lib/acid/386
> /sys/lib/acid/kernel
> acid: include("acid")
> acid: include("procacid")
> acid: stacks()
> =========================================================
> 0xf0312008 1: init dennis pc 0x00008984 Await (Wakeme) ut 2 st 2 qpc 0x00000000
> <stdin>:5: (error) no stack frame: can't translate address 0xf001bf30

i forgot to say you should

	mappc()

first.  the pc kernel maps some extra data memory
below the text segment, which isn't accounted for
in the default acid map.

russ




* Re: [9fans] odd out-of-memory behavior
  2008-07-27  1:20   ` erik quanstrom
  2008-07-27  2:20     ` Russ Cox
@ 2008-07-27  2:21     ` Russ Cox
  1 sibling, 0 replies; 7+ messages in thread
From: Russ Cox @ 2008-07-27  2:21 UTC (permalink / raw)
  To: quanstro, 9fans

Apologies for the double-post.  Looking again,
the function that does the mapping is now called kinit, not mappc.

Russ




