9front - general discussion about 9front
 help / color / mirror / Atom feed
* Kernel memory leak
@ 2020-08-21 15:40 Alex Musolino
  2020-08-23  1:20 ` [9front] " cinap_lenrek
  2020-08-23  3:07 ` cinap_lenrek
  0 siblings, 2 replies; 5+ messages in thread
From: Alex Musolino @ 2020-08-21 15:40 UTC (permalink / raw)
  To: 9front

Hi all,

As has been mentioned on #cat-v, some of us have known for a while now
that there is a memory leak in the 9front kernel.  Each month or so my
vultr VPS needs a reboot because it has run out of kernel memory.
Usually I reboot it just before this happens to minimise downtime.
Anyway, this week I finally tracked it down and now have a fix.

I noticed some time ago that there were small step changes in the
amount of kernel memory being used every minute, pretty well exactly.
Eventually I realised that there was a cron job running every minute;
disabling the cron job stopped the leak.

After some investigation I found that the following sequence is enough
to trigger the bug:

	@{rfork n; mount -a '#s/boot' /mnt/root; bind /mnt/root /}

This is more or less what cron(8) does when it runs a job in a new
namespace (see the first 2 line of /lib/namespace).  Each invocation
will leak one Mhead object, one Chan object, and some Path objects.
You can verify this yourself with kmem(1).

Here's what happens.  The initial mount triggers a special case in
namec which causes an Mhead to be allocated for the old Chan object
along with a Mount referencing the same Chan:

	if(m == nil){
		/*
		 *  nothing mounted here yet.  create a mount
		 *  head and add to the hash table.
		 */
		m = newmhead(old);
		*l = m;

		/*
		 *  if this is a union mount, add the old
		 *  node to the mount chain.
		 */
		if(order != MREPL)
			m->mount = newmount(old, 0, nil);
	}

Both newmhead and newmount bump the refcount of the old Chan.  This is
fine until the subsequent bind(2) where the original Chan is updated to
point back to the Mhead (see the Abind case in namec) and the refcount
of the Mhead object is incremented.  Now we have cycle and neither the
Mhead nor the Chan can or will be freed.

My fix (below) changes findmount to allocate a new Chan object if it
would otherwise create a cycle.  I've done only light testing but the
leak is gone and there don't seem to be any other ill effects.

Thoughts?

diff -r ec3da4e6c943 sys/src/9/port/chan.c
--- a/sys/src/9/port/chan.c	Fri Aug 21 10:06:22 2020 +0930
+++ b/sys/src/9/port/chan.c	Sat Aug 22 00:46:08 2020 +0930
@@ -864,6 +864,8 @@
 			}
 			if(*cp != nil)
 				cclose(*cp);
+			if(to == m->from)
+				to = cunique(to);
 			*cp = to;
 			return 1;
 		}

--
Cheers,
Alex Musolino


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] Kernel memory leak
  2020-08-21 15:40 Kernel memory leak Alex Musolino
@ 2020-08-23  1:20 ` cinap_lenrek
  2020-08-23  2:36   ` Alex Musolino
  2020-08-23  3:07 ` cinap_lenrek
  1 sibling, 1 reply; 5+ messages in thread
From: cinap_lenrek @ 2020-08-23  1:20 UTC (permalink / raw)
  To: 9front

very good catch.

i think the cunique() would *ALWAYS* be required when we are going
to stuff something on the chan's umh, as otherwise we'd have a race
condition because the returned chan from domount() -> findmount()
is shared.

alternatively, we could probably get rid of Abind mode completely,
and just have cmount() do the mounthead lookup itself instead of
using new->umh.

but for now, i think the best would be to do it like this:

	switch(amode){
	case Abind:
		/* no need to maintain path - cannot dotdot an Abind */
		m = nil;
		if(!nomount)
			domount(&c, &m, nil);
+		c = cunique(c);
+		if(c->umh != nil){	//BUG
+			print("bind umh\n");
+			putmhead(c->umh);
+		}
-		putmhead(c->umh);
		c->umh = m;
		break;

can you check if this works?

--
cinap


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] Kernel memory leak
  2020-08-23  1:20 ` [9front] " cinap_lenrek
@ 2020-08-23  2:36   ` Alex Musolino
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Musolino @ 2020-08-23  2:36 UTC (permalink / raw)
  To: 9front

> i think the cunique() would *ALWAYS* be required when we are going
> to stuff something on the chan's umh, as otherwise we'd have a race
> condition because the returned chan from domount() -> findmount()
> is shared.

Yeah, it makes more sense to do it at the same time/place as the umh
update.  I guess we do the same thing for the Aopen/Acreate case where
we might also update the umh?  We also change umh in devshr.c but I
guess this is fine since it's only temporary.

> can you check if this works?

Works fine as far as I can tell.  How about the following patch then?

diff -r ec3da4e6c943 sys/src/9/port/chan.c
--- a/sys/src/9/port/chan.c	Fri Aug 21 10:06:22 2020 +0930
+++ b/sys/src/9/port/chan.c	Sun Aug 23 12:01:46 2020 +0930
@@ -1410,7 +1410,11 @@
 		m = nil;
 		if(!nomount)
 			domount(&c, &m, nil);
-		putmhead(c->umh);
+		c = cunique(c);
+		if(c->umh != nil){	//BUG
+			print("bind umh\n");
+			putmhead(c->umh);
+		}
 		c->umh = m;
 		break;
 
@@ -1454,9 +1458,14 @@
 				c->umh = nil;
 			}
 			/* only save the mount head if it's a multiple element union */
-			if(m != nil && m->mount != nil && m->mount->next != nil)
+			if(m != nil && m->mount != nil && m->mount->next != nil){
+				c = cunique(c);
+				if(c->umh != nil){	//BUG
+					print("Aopen/Acreate umh\n");
+					putmhead(c->umh);
+				}
 				c->umh = m;
-			else
+			}else
 				putmhead(m);
 
 			/* save registers else error() in open has wrong value of c saved */

--
Cheers,
Alex Musolino


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] Kernel memory leak
  2020-08-21 15:40 Kernel memory leak Alex Musolino
  2020-08-23  1:20 ` [9front] " cinap_lenrek
@ 2020-08-23  3:07 ` cinap_lenrek
  2020-08-23  3:16   ` Alex Musolino
  1 sibling, 1 reply; 5+ messages in thread
From: cinap_lenrek @ 2020-08-23  3:07 UTC (permalink / raw)
  To: 9front

changeset:   7937:e4d33fdafb17
tag:         tip
user:        cinap_lenrek@felloff.net
date:        Sun Aug 23 05:07:30 2020 +0200
summary:     kernel: fix Abind cyclic reference and mounthead leaks (thanks Alex Musolino)

--
cinap


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] Kernel memory leak
  2020-08-23  3:07 ` cinap_lenrek
@ 2020-08-23  3:16   ` Alex Musolino
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Musolino @ 2020-08-23  3:16 UTC (permalink / raw)
  To: 9front

Even better, thanks!


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-23  3:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21 15:40 Kernel memory leak Alex Musolino
2020-08-23  1:20 ` [9front] " cinap_lenrek
2020-08-23  2:36   ` Alex Musolino
2020-08-23  3:07 ` cinap_lenrek
2020-08-23  3:16   ` Alex Musolino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).