From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <018fd7d22990276158c51cac837d041b@quanstro.net>
References: <9ab217670904161449s715246f0te2c24244e9c9865a@mail.gmail.com>
	<018fd7d22990276158c51cac837d041b@quanstro.net>
Date: Thu, 16 Apr 2009 19:36:40 -0400
Message-ID: <9ab217670904161636p62f77a18ufe0c14ac6245f078@mail.gmail.com>
From: "Devon H. O'Dell" <devon.odell@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] security questions
Topicbox-Message-UUID: dee2a7b4-ead4-11e9-9d60-3106f5b1d025

2009/4/16 erik quanstrom <quanstro@quanstro.net>:
> On Thu Apr 16 17:51:42 EDT 2009, devon.odell@gmail.com wrote:
>> 2009/4/16 erik quanstrom <quanstro@quanstro.net>:
>> > have you taken a look at the protection measures already
>> > built into the kernel like smalloc?
>>
>> At least in FreeBSD, you can't sleep in an interrupt thread. I suppose
>> that's probably also the case in Plan 9 interrupt handlers, and this
>> would mitigate that situation.
>
> plan 9 doesn't have interrupt threads, but that's beside the point.
>
> interrupts are driven by the hardware, not users.  so smalloc, which
> is used to allow user space to wait for memory if it is not currently
> available doesn't make any sense.

My misunderstanding, then. I brought it up because smalloc lives in
port/alloc.c, which is also compiled into the kernel. I'm not
concerned about OOM conditions in userland.

> having the potential for running out of memory in an interrupt
> handler might be a sign that a little code reorg is in order, if you
> are worried about this sort of thing.  (and even if you're not.)

The potential for running out of memory in an interrupt handler exists
if a user has found a way to consume kernel resources from userland
and the interrupt handler then needs to allocate just one more byte.

> in any event, i think there is more code to deal with these problems
> in the kernel that first meets the eye.  much of it is small and, if you're
> not looking for it, easy to miss.

I don't think so. You can get very specific about the problem or you
can get very generic. This is a generic implementation that would
allow for dealing with such problems as they arise, and actually
dealing with them in an easy, extensible fashion, without having to
add huge support code diffs for calculation of every resource you're
tracking (as is the case with rlimits, which require you to add
functions for each limit, hooks everywhere you want to track them,
shell support, and sysctls).
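
To illustrate what "generic" could mean here, a minimal sketch in
portable C. This is not the actual code; every name in it (Res,
Restab, resadd, rescharge, resrelease) is hypothetical. The point is
that one table and one pair of charge/release routines cover every
tracked resource, so adding a new limit is a table entry rather than
a new set of functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; illustrative only, not kernel code. */
enum { MAXRES = 8 };

typedef struct Res Res;
struct Res {
	const char *name;	/* e.g. "kmem" (illustrative label) */
	unsigned long used;	/* units currently charged */
	unsigned long limit;	/* maximum units allowed */
};

typedef struct Restab Restab;
struct Restab {
	Res res[MAXRES];
	int nres;
};

/* Register a resource; returns its index, or -1 if the table is full. */
int
resadd(Restab *t, const char *name, unsigned long limit)
{
	Res *r;

	assert(t != NULL && name != NULL);
	if(t->nres >= MAXRES)
		return -1;
	r = &t->res[t->nres];
	r->name = name;
	r->used = 0;
	r->limit = limit;
	return t->nres++;
}

/* Charge n units; returns 0 on success, -1 if the limit would be exceeded. */
int
rescharge(Restab *t, int id, unsigned long n)
{
	Res *r;

	assert(t != NULL && id >= 0 && id < t->nres);
	r = &t->res[id];
	if(n > r->limit - r->used)	/* used <= limit always holds, so no underflow */
		return -1;
	r->used += n;
	return 0;
}

/* Give back n units; clamps at zero rather than wrapping around. */
void
resrelease(Restab *t, int id, unsigned long n)
{
	Res *r;

	assert(t != NULL && id >= 0 && id < t->nres);
	r = &t->res[id];
	r->used = n > r->used ? 0 : r->used - n;
}
```

A caller would register each resource once with resadd and then wrap
the relevant allocation paths in rescharge/resrelease. Locking is
omitted here; real kernel code would need it.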

Regarding the codebase, it's less code than you'd think. It's very
little code so far, and I'm about halfway finished with the support
part. The initial implementation will probably be suboptimal, but I
think it will be useful as a proof of concept.

Identifying the areas that need attention is the difficult part, but
I'll address that further down.

>> Depends again on the application. If you're talking about a terminal,
>> yes. If you're talking about a CPU server where someone is working on
>> code, someone else is writing a presentation, and yet another person
>> is in the middle of a video transcode, you're talking about a lot of
>> wasted time, potentially.
>
> and potentially, i could win the lottery.  ☺.

And potentially, this code would go into Plan 9. ☺ ☺ ☺.

I think those chances are a little smaller. For fileservers and
terminal servers, I don't think this is a big issue. For CPU servers,
I think it is. I'm considering setting up a public Plan 9 cluster,
though how useful it will be remains to be seen. Part of the purpose
of this endeavor is to find exactly these kinds of conditions. I don't
know whether anybody would be interested, but perhaps some incentives
can be provided.

> i have had exactly 1 out-of-resource reboot in the last 18 months.
> without real data on what and where the problems are, i would
> think this would become a difficult issue.

I don't know what your use case is, though I know that you probably
use the system more heavily than I. I think with people trying to find
issues, it could be a much easier endeavor, and I think it's a fun one
to address.

The fact that there was one shows that these issues can occur. All
that's needed is identification and reproduction.

We're shielded in part by our small userbase and relative lack of
interest in examining code and auditing. But that's not security.

> - erik

--dho