From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Eckhardt
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-Reply-To: <13426df11002181527tf091fbek49420d9a074614e7@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <21665.1266616762.1@lunacy.ugrad.cs.cmu.edu>
Date: Fri, 19 Feb 2010 16:59:22 -0500
Message-ID: <21666.1266616762@lunacy.ugrad.cs.cmu.edu>
Subject: Re: [9fans] pineview atom
Topicbox-Message-UUID: d71b2686-ead5-11e9-9d60-3106f5b1d025

> You'd need to look at fraction of total that is data vs. code,
> then at fraction of total code that is going to cause hurt if
> flipped. This stuff can have numbers attached.
>
> Here's an example from my world. 1 MB of code, 32 MB of kernel,
> and 2GB minus that of data. This is a lower end ratio as the
> nodes don't have much memory.
>
> If the data is flipped, you're not going to know of errors unless
> you are looking for numerical instability.

Also subtract out all of the kernel code which is boot-only: it
needs to be uncorrupted for just the twinkling of an eye. Almost
all of every format string (used or not) can be corrupted without
anything dramatic happening. While you're in the kernel, the
exception-handling label stack could be totally trashed as long as
nobody invokes error() during this system call. Or maybe a bit
flip rewrites an instruction to use %ebx instead of %eax, but at a
point when they both contain the same value.

There's lots of stuff which doesn't have to be totally right to
"work", and even the stuff that must be 100% right may be fine if
it's wrong at the right time.

"Back in the old days", a lot of VAX-11/750's running BSD Unix
crashed because of parity errors in their TLB's. 750's running VMS
"didn't have this problem", because VMS would silently work around
it; BSD grew that code--see, for example, <229@astrovax.UUCP>.
Then bits could flip all the time with nobody noticing!

Dave Eckhardt
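
To put rough numbers on the quoted example, here is a minimal sketch.
It assumes a single bit flip lands uniformly at random in a 2 GB
(2048 MB) node, and that the 1 MB of code and the 32 MB of kernel are
counted separately, with everything else being data. The function
name and both assumptions are mine, not from the thread.

```python
def flip_fractions(code_mb=1, kernel_mb=32, total_mb=2048):
    """Fraction of memory (and so of uniformly random bit flips)
    falling in each region. The sizes are the ones from the quoted
    example; the uniform-flip model is an assumption."""
    data_mb = total_mb - code_mb - kernel_mb
    return {
        "code": code_mb / total_mb,
        "kernel": kernel_mb / total_mb,
        "data": data_mb / total_mb,
    }


if __name__ == "__main__":
    for region, fraction in flip_fractions().items():
        print(f"{region:>6}: {fraction:.4%}")
```

Under those assumptions code plus kernel come to 33/2048 of memory,
which is an upper bound on the flips that can hurt: boot-only code,
format strings and the like shrink it further.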