9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed
From: "Russ Cox" <rsc@swtch.com>
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
Subject: Re: [9fans] infrequent panics
Date: Wed, 12 Mar 2008 20:33:26 -0400	[thread overview]
Message-ID: <ee9e417a0803121733k71b2940v9afb6bc804d6320b@mail.gmail.com> (raw)
In-Reply-To: <c65e570684403b79be5b6c6f8b4259dd@quanstro.net>

This is a cool bug; thanks for the report.
I love a good concurrency bug.

>  Wed Mar 12 13:20:01: hdr 0a110c09 00000020 f018c7ce cafebabe 20737023 7064755b
>  Wed Mar 12 13:20:01: tail 0a110c09 00000020 f018c7ce cafebabe 20737023 7064755b | 72657320 20726576

This says "#ps [udp server]" (I'm guessing about the last few bytes),
which is the name of the dns proc as it shows up in ps.

The stack trace showed this buffer being corrupt in devattach,
which executes:

	buf = smalloc(4+strlen(spec)+1);
	sprint(buf, "#%C%s", tc, spec);
	c->path = newpath(buf);
	free(buf);

There's already something wrong here -- if spec contains bad
UTF then buf will overflow all the time.  Really it should be
snprint or sprint(buf, "#%C") followed by strcat.  Let's ignore that,
since there's no bad UTF in your string.

If you poke around at how this gets called on a file like #p/1/blah,
you'll find that at this point in the code, spec == up->genbuf+2.
So if something else was overwriting up->genbuf at just the
wrong time (between the strlen and the execution of %s)
with the process name as it shows up in ps, then you'd see
the corruption in the stack trace.

Since up->genbuf is private to up (i.e., the current process)
I wondered at first if somehow the scheduler was broken and
was trying to run the same Proc on multiple CPUs, but
Erik assured me the scheduler had not been touched.

But aha!  If you poke around in the source code you will find
that there is one place where one proc uses another's genbuf:

% g genbuf | grep -v 'up->genbuf'
chan.c:1630: 	/* place final element in genbuf for e.g. exec */
devproc.c:717: 		j = procargs(p, p->genbuf, sizeof p->genbuf);
devproc.c:723: 		memmove(a, &p->genbuf[offset], n);
portdat.h:701: 	char	genbuf[128];	/* buffer used e.g. for last name
element from namec */
%

And look, it's the code for reading #p/1/args, which is
how ps figures out what to print:

/sys/src/9/port/devproc.c:/^procread
	case Qargs:
		qlock(&p->debug);
		j = procargs(p, p->genbuf, sizeof p->genbuf);
		qunlock(&p->debug);
		if(offset >= j)
			return 0;
		if(offset+n > j)
			n = j-offset;
		memmove(a, &p->genbuf[offset], n);
		return n;

So if you ran ps at exactly the right time (while one of the
procs on the machine was opening a #-ed file name and
was between strlen and sprint), you'd win the race.
You might improve your chances if the machine were
close to out of memory and the smalloc had to wait for
more memory.

Oops.  Changing those lines to use up->genbuf instead
of p->genbuf should eliminate the race.

Russ


  reply	other threads:[~2008-03-13  0:33 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-12 20:08 erik quanstrom
2008-03-13  0:33 ` Russ Cox [this message]
2008-03-13 10:44   ` roger peppe
2008-03-13  2:14 erik quanstrom
2008-03-13  5:57 Brian L. Stuart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ee9e417a0803121733k71b2940v9afb6bc804d6320b@mail.gmail.com \
    --to=rsc@swtch.com \
    --cc=9fans@9fans.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).