From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from vultr.musolino.id.au ([45.76.123.158]) by ewsd; Fri Aug 21 11:41:05 EDT 2020
Received: from 58.170.204.252 ([58.170.204.252]) by vultr; Sat Aug 22 01:40:29 EST 2020
Message-ID: <65DEAD1E639AB624F6FBBC11D8EEDD17@musolino.id.au>
To: 9front@9front.org
Subject: Kernel memory leak
From: Alex Musolino <alex@musolino.id.au>
Date: Sat, 22 Aug 2020 01:10:27 +0930
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
List-ID: <9front.9front.org>
List-Help: <http://lists.9front.org>
X-Glyph: ➈
X-Bullshit: ACPI event scripting shader full-stack locator 

Hi all,

As has been mentioned on #cat-v, some of us have known for a while now
that there is a memory leak in the 9front kernel.  Each month or so my
vultr VPS needs a reboot because it has run out of kernel memory.
Usually I reboot it just before this happens to minimise downtime.
Anyway, this week I finally tracked it down and now have a fix.

I noticed some time ago that there were small step changes in the
amount of kernel memory being used every minute, pretty well exactly.
Eventually I realised that there was a cron job running every minute;
disabling the cron job stopped the leak.

After some investigation I found that the following sequence is enough
to trigger the bug:

	@{rfork n; mount -a '#s/boot' /mnt/root; bind /mnt/root /}

This is more or less what cron(8) does when it runs a job in a new
namespace (see the first 2 lines of /lib/namespace).  Each invocation
will leak one Mhead object, one Chan object, and some Path objects.
You can verify this yourself with kmem(1).

Here's what happens.  The initial mount triggers a special case in
namec which causes an Mhead to be allocated for the old Chan object
along with a Mount referencing the same Chan:

	if(m == nil){
		/*
		 *  nothing mounted here yet.  create a mount
		 *  head and add to the hash table.
		 */
		m = newmhead(old);
		*l = m;

		/*
		 *  if this is a union mount, add the old
		 *  node to the mount chain.
		 */
		if(order != MREPL)
			m->mount = newmount(old, 0, nil);
	}

Both newmhead and newmount bump the refcount of the old Chan.  This is
fine until the subsequent bind(2) where the original Chan is updated to
point back to the Mhead (see the Abind case in namec) and the refcount
of the Mhead object is incremented.  Now we have a cycle and neither the
Mhead nor the Chan can or will be freed.

My fix (below) changes findmount to allocate a new Chan object if it
would otherwise create a cycle.  I've done only light testing but the
leak is gone and there don't seem to be any other ill effects.
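
To make the cycle concrete, here is a toy model in portable C.  It is
a sketch only, not kernel code: the structs keep just the two fields
that take part in the cycle, the Mount object is left out, the helper
bodies are simplified stand-ins that merely share names with the
kernel functions, and livecount() is a hypothetical driver.  It also
assumes that cunique hands back a private copy when the Chan is
shared.

```c
#include <stddef.h>

typedef struct Chan Chan;
typedef struct Mhead Mhead;

struct Chan {
	int ref;
	Mhead *umh;	/* mount head this Chan points back to, if any */
};

struct Mhead {
	int ref;
	Chan *from;	/* Chan the mount head was created for */
};

/* Objects come from static storage so the model itself cannot leak;
 * nlive counts the ones that were handed out and never released. */
static Chan chans[2];
static Mhead mhead;
static int nchan, nlive;

static void putmhead(Mhead *m);

static Chan*
newchan(void)
{
	Chan *c = &chans[nchan++];

	c->ref = 1;
	c->umh = NULL;
	nlive++;
	return c;
}

static void
cclose(Chan *c)
{
	if(--c->ref > 0)
		return;
	if(c->umh != NULL)
		putmhead(c->umh);	/* a dying Chan drops its mount head */
	nlive--;
}

static Mhead*
newmhead(Chan *from)
{
	mhead.ref = 1;
	mhead.from = from;
	from->ref++;		/* the mount head keeps its Chan alive */
	nlive++;
	return &mhead;
}

static void
putmhead(Mhead *m)
{
	if(--m->ref > 0)
		return;
	cclose(m->from);	/* a dying mount head drops its Chan */
	nlive--;
}

/* Replay mount-then-bind and return how many objects are still
 * allocated once every outside reference has been dropped. */
int
livecount(int unique)
{
	Chan *old, *to;
	Mhead *m;

	nchan = nlive = 0;
	old = newchan();	/* old->ref = 1 */
	m = newmhead(old);	/* old->ref = 2, m->ref = 1 */

	to = m->from;		/* what the lookup hands back */
	to->ref++;
	if(unique && to == m->from){
		cclose(to);	/* stand-in for cunique: give up the */
		to = newchan();	/* shared Chan, take a private copy */
	}

	to->umh = m;		/* the bind points the Chan at the Mhead */
	m->ref++;

	putmhead(m);		/* drop the outside references */
	cclose(old);
	cclose(to);
	return nlive;
}
```

In this model livecount(0) returns 2: the Chan and the Mhead each
hold the last reference to the other.  livecount(1) returns 0, since
the back pointer now lives in the copy and closing the copy unwinds
everything.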

Thoughts?

diff -r ec3da4e6c943 sys/src/9/port/chan.c
--- a/sys/src/9/port/chan.c	Fri Aug 21 10:06:22 2020 +0930
+++ b/sys/src/9/port/chan.c	Sat Aug 22 00:46:08 2020 +0930
@@ -864,6 +864,8 @@
 			}
 			if(*cp != nil)
 				cclose(*cp);
+			if(to == m->from)
+				to = cunique(to);
 			*cp = to;
 			return 1;
 		}

--
Cheers,
Alex Musolino