Development discussion of WireGuard
From: Brad Spengler <spender@grsecurity.net>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Pipacs <pageexec@gmail.com>,
	WireGuard mailing list <wireguard@lists.zx2c4.com>
Subject: Re: kernel warning with 0.0.20170223: entered softirq 3 NET_RX net_rx_action+0x0/0x760 with preempt_count 00000101, exited with 00000100?
Date: Mon, 27 Feb 2017 06:53:08 -0500
Message-ID: <20170227115307.GA2499@grsecurity.net>
In-Reply-To: <CAHmME9ppn7wonfYTjOu9V-mmuRnF-S6v+1aoLneq4zLeoEShGg@mail.gmail.com>


Hi,

From looking at the code, this seems to have been introduced during
the port to 4.9.  In previous kernels (and in both stable kernels),
get_cpu() was used to look at the per-cpu irq stack pointer value, and
it was properly paired with put_cpu() in all cases.  When the code was
reworked for 4.9 to use new upstream helper functions to identify the
various stacks (including one that uses this_cpu_ptr to identify the
irq stack), that single put_cpu() wasn't removed with the others.
We'll have this fixed in the next patch.
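
As a rough illustration of the imbalance (a userspace model, not the
actual grsecurity code; every model_* name is made up for this sketch),
a plain counter stands in for preempt_count, get_cpu() is modelled as
an increment and put_cpu() as a decrement:

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's preemption counter (hypothetical). */
static unsigned int model_preempt_count;

/* Models get_cpu(): disables preemption (count + 1) and returns a CPU id. */
int model_get_cpu(void)
{
	model_preempt_count++;
	return 0; /* single pretend CPU */
}

/* Models put_cpu(): re-enables preemption (count - 1). */
void model_put_cpu(void)
{
	assert(model_preempt_count > 0); /* underflow would be a bug in the model */
	model_preempt_count--;
}

/* Older shape: the per-cpu lookup is bracketed by get/put. */
void model_check_balanced(void)
{
	int cpu = model_get_cpu();
	(void)cpu; /* the real code would read a per-cpu stack pointer here */
	model_put_cpu();
}

/* Shape after the rework: the get is gone, one put was left behind. */
void model_check_leftover_put(void)
{
	/* the lookup is now done by a helper that needs no get_cpu() */
	model_put_cpu();
}

/* Runs one checker as if from softirq context; returns the count on exit. */
unsigned int model_run_softirq(void (*checker)(void))
{
	model_preempt_count = 0x101; /* entry value taken from the warning */
	checker();
	printf("entered with %08x, exited with %08x\n",
	       0x101u, model_preempt_count);
	return model_preempt_count;
}
```

With model_check_leftover_put, model_run_softirq() exits one lower than
it entered, which is the 00000101 -> 00000100 pattern from the subject
line; with model_check_balanced the value is unchanged.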

Thanks!
-Brad

On Mon, Feb 27, 2017 at 04:22:34AM +0100, Jason A. Donenfeld wrote:
> Hey Pipacs,
> 
> I've been receiving reports of strange bugs from grsec users with
> WireGuard. The first set of bugs was a heisenbug crash, and I never
> found the root cause, but it seemed to happen in the rx path. Then
> today Timothée emailed another different bug from a grsec box, also
> along the rx path. This time it was related to the preemption count
> being wrong coming into and going out of the rx softirq. This kind of
> preemption mismatch, I figure, might account for the earlier bug I
> never solved.
> 
> So armed with this new information, I went hunting. I followed the
> path inward, surrounding the body of each function with:
> 
> int i = preempt_count();
> function_body...
> if (i != preempt_count()) pr_err("LORDHAVEMERCY\n");
> 
> Eventually I isolated the bug to an interesting situation like this:
> 
> int i = preempt_count();
> other_function(...);
> if (i != preempt_count()) pr_err("This will print out\n");
> 
> void other_function(int a)
> {
> int vla[a];
> int i = preempt_count();
> function_body...
> if (i != preempt_count()) pr_err("This will NOT print out\n");
> }
> 
> Since I only got the outer print, I thought this was strange, so I rearranged:
> 
> void other_function(int a)
> {
> int i = preempt_count();
> int vla[a];
> if (i != preempt_count()) pr_err("This will print out\n");
> function_body...
> }
> 
> Yay, we found the bug. But wtf, what could possibly be changing the
> preempt_count there?
> 
> So I went disassembling, and lo and behold the clever PaX stack leak
> plugin was adding calls to pax_check_alloca. Very nice! But still, why
> the preemption bug situation? I went hunting further:
> 
> void __used pax_check_alloca(unsigned long size)
> {
>  ...
>        case STACK_TYPE_IRQ:
>                stack_left = sp & (IRQ_STACK_SIZE - 1);
>                put_cpu();
>                break;
>  ...
> }
> 
> Do you see the bug? Looks like somebody snuck in a "put_cpu()" there,
> where it really does not belong. "put_cpu()" basically just jiggers
> the preempt_count. I can confirm that removing the erroneous call to
> "put_cpu()" fixes the bug.
> 
> So, either this is by design, and there's some odd subtlety I'm
> missing, or this is a bug that should be fixed in grsec/PaX.
> 
> In the case of the latter, I believe this introduces a security
> vulnerability, since it opens up a whole host of interesting race
> conditions that can be exploited.
> 
> Thanks,
> Jason
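
The VLA experiment quoted above can be modelled in userspace as well.
This is only a sketch under assumptions: hidden_alloca_check() is a
made-up stand-in for the call the plugin inserts at the VLA
declaration, and model_count stands in for preempt_count(). It shows
why a snapshot taken after the VLA misses the change while the caller
still sees it:

```c
#include <assert.h>

static int model_count = 1; /* stand-in for preempt_count() */

/* Stand-in for the plugin-inserted call at the VLA declaration. */
void hidden_alloca_check(void)
{
	assert(model_count > 0);
	model_count--; /* the stray decrement being hunted */
}

/* Snapshot taken after the VLA: the decrement has already happened.
 * Returns 1 if this function notices a mismatch, 0 otherwise. */
int snapshot_after_vla(void)
{
	hidden_alloca_check(); /* where "int vla[a];" would be */
	int before = model_count;
	/* function body would run here */
	return before != model_count;
}

/* Snapshot taken before the VLA: the decrement is now visible. */
int snapshot_before_vla(void)
{
	int before = model_count;
	hidden_alloca_check(); /* where "int vla[a];" would be */
	return before != model_count;
}

/* Caller-side check around a callee, like the outer pr_err() test.
 * Stores what the callee saw and returns what the caller saw. */
int caller_sees_mismatch(int (*callee)(void), int *callee_saw)
{
	model_count = 1;
	int before = model_count;
	*callee_saw = callee();
	return before != model_count;
}
```

In both arrangements the caller sees the mismatch; only
snapshot_before_vla() sees it from inside the function, matching the
two outcomes described in the quoted mail.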


