From mboxrd@z Thu Jan 1 00:00:00 1970
References: <0a7dc5268ce4dceb21ea20cdcc191693@terzarima.net> <27544caa847ff61fed1ae5f4d87218d0@ladd.quanstro.net> <20110717073847.GB539@polynum.com> <20110717084411.95552B827@mail.bitblocks.com> <817df5ce4b22486b880c9a2df854300b@ladd.quanstro.net>
In-Reply-To: <817df5ce4b22486b880c9a2df854300b@ladd.quanstro.net>
Mime-Version: 1.0 (iPhone Mail 8J2)
Content-Type: text/plain; charset=us-ascii
Message-Id: <39EE9680-DB31-47AC-981E-BC2CAAAE133E@bitblocks.com>
Content-Transfer-Encoding: quoted-printable
From: Bakul Shah
Date: Sun, 17 Jul 2011 10:16:44 -0700
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] NUMA
Topicbox-Message-UUID: 03a9b89c-ead7-11e9-9d60-3106f5b1d025

On Jul 17, 2011, at 8:24 AM, erik quanstrom wrote:

> On Sun Jul 17 04:45:18 EDT 2011, bakul@bitblocks.com wrote:
>
>> Also note that the ISA implementations these days are quite
>> complex (perhaps even more than your typical program). We
>> don't see this complexity because it is all hidden behind a
>> relatively simple ISA. But remember the F00F bug? Usually the
>> vendor has a long errata list (typically only available on a
>> need to know basis and only under NDA!). And usually they
>> don't formally prove the implementation right; they just run
>> zillions of test vectors! I bet you would be scandalized if
>> you knew what they do :-)
>
> i have the errata. i've read them. and i find them reassuring.
> you might find that surprising, but the longer and more detailed
> the errata, the longer and more intricate the testing was. also
> long errata sheets, especially of really arcane bugs indicate the
> vendor isn't sweeping embarrassing ones under the rug. i've
> seen parts with 2-3 errata that were just buggy. they hadn't even
> tested some large bits of functionality once! on the other hand
> some processors i work with have very long errata, but none of
> them matter.
> intel kindly makes the errata available to the public
> for their gbe controllers. e.g.
>
> http://download.intel.com/design/network/specupdt/322444.pdf
> page 15, errata#10 is typical. the spec was violated, but it is
> difficult to imagine working hardware for which this would matter.
>
> i can't speak for vendors on why errata is sometimes nda,
> but i would imagine that the main fear is that the errata can
> reveal too much about the implementation. on the other hand,
> many vendors have open errata. i've yet to see need-to-know
> errata.

I am sure (or sure hope) things have changed, but in at least two cases in the past the vendor reps told me that yes, the bug was known, *after* I told them I had logic analyzer traces that showed the bug. One was a very well-known CPU vendor, the other a SCSI chip manufacturer.

I suspect incidents like the F00F bug changed attitudes quite a bit, at least for vendors like Intel.

> by the way, proving an implementation correct seems simply
> impossible. many errata (perhaps like the one i mentioned)
> come down to variations in the process that might not have
> met the models. and how would you prove that one of the
> many physical steps in producing a chip correct anyway?

You can perhaps prove logical properties for simpler subsystems (an ALU, for instance). Or generate logic from a description in an HLL such as Scheme, which might be easier to prove, but of course then you have to worry about the translator! But not the physical processes.

I do think more formal proof methods might get used as more and more parallelism gets exploited. The combinatorial explosion of testing might lead us there!

Anyway, my point was just that there are no certainties; just degrees of uncertainty! You should *almost always* opt for speed (and simplicity) by figuring out how much uncertainty will be tolerated by your customers :-) A 99.9% solution available today has more value than a 100% solution that is 10 times slower and a year late.
99.9% of the time! But I guess that is my engineer's bias!

>
> - erik
>