From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <8ae9ed145e1976dbe61a49a643a4e682@9netics.com>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] quantity vs. quality
Date: Tue, 13 Jun 2006 09:34:27 -0700
From: Skip Tavakkolian <9nut@9netics.com>
In-Reply-To: 
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 69dfac58-ead1-11e9-9d60-3106f5b1d025

excellent points; i believe this.  there's no sense in masking
errors with pseudo-recovery.  good test coverage should expose
programmer misunderstanding.

if the system can't afford memory allocation errors, then
preallocating (statically or dynamically) and capping the maximum
the system should ever need makes it easy to simulate exhaustion
in testing, and keeps memory usage and response times bounded.
watchdog processes and memory checksums are possible additional
measures.  (sketches of both appear after the quoted text below.)

> i think this has been mentioned on the list before (otherwise i wouldn't
> have known to look for it) but when considering error recovery tactics, it's
> worth looking at http://www.sics.se/~joe/thesis/armstrong_thesis_2003.pdf
> ("Making reliable distributed systems in the presence of software errors")
>
> he summarises their approach to error recovery as follows:
>
> - if you can't do what you want to do, die.
> - let it crash.
> - do not program defensively.
>
> they built a telecoms switching system with a reported measured
> reliability of 99.9999999% following this philosophy.
>
> see section 4.3 (page 101) for details.
>
> the key is that another process gets notified of the error.
>
> he makes this useful distinction between "error" and "exception":
>
> - exceptions occur when the run-time system does not know what to do.
> - errors occur when the programmer does not know what to do.
>
> i would suggest that most out-of-memory conditions are best
> classed as errors, not exceptions.
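
to make the preallocation point concrete, here is a minimal sketch
in C of one way to do it: a single static arena sized to the most
the program should ever need, so exhaustion becomes an ordinary
return value you can force in tests by shrinking the cap.  the
names (pool_alloc, POOLMAX) and the size are mine, purely for
illustration; they don't come from any particular system.

	/*
	 * sketch only: fixed preallocated arena with a hard cap.
	 * pool_alloc and POOLMAX are made-up names for illustration.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	enum { POOLMAX = 1024*1024 };		/* cap: the most we should ever need */

	static unsigned char pool[POOLMAX];	/* static preallocation: no malloc at run time */
	static size_t used;

	void*
	pool_alloc(size_t n)
	{
		void *p;

		n = (n + 7) & ~(size_t)7;	/* keep 8-byte alignment */
		if(n > POOLMAX - used)
			return NULL;		/* exhaustion is an ordinary, testable result */
		p = pool + used;
		used += n;
		return p;
	}

	int
	main(void)
	{
		void *p;

		/* in testing, shrink POOLMAX to drive this path deliberately */
		p = pool_alloc(512);
		if(p == NULL){
			fprintf(stderr, "pool exhausted\n");
			exit(1);
		}
		memset(p, 0, 512);
		return 0;
	}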
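
and a minimal watchdog sketch in the let-it-crash spirit, assuming
posix fork/wait rather than anything plan 9 or erlang specific:
the worker dies on any error it doesn't understand, and the parent
is the "other process" that gets notified (through the wait status)
and restarts it, giving up after a bounded number of attempts.
run_worker is a hypothetical stand-in for the real job.

	/*
	 * sketch only: a watchdog parent that restarts a crashing worker.
	 * assumes posix fork/wait; run_worker is a hypothetical stand-in.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/wait.h>

	static void
	run_worker(void)
	{
		/* real work would go here; on an error the code
		 * doesn't know how to handle, it just dies: */
		exit(1);
	}

	int
	main(void)
	{
		pid_t pid;
		int status, tries;

		for(tries = 0; tries < 5; tries++){
			pid = fork();
			if(pid < 0)
				exit(1);		/* can't even fork; give up */
			if(pid == 0){
				run_worker();		/* child: do the job, die on error */
				exit(0);
			}
			if(waitpid(pid, &status, 0) < 0)	/* parent: get notified */
				exit(1);
			if(WIFEXITED(status) && WEXITSTATUS(status) == 0)
				return 0;		/* clean finish */
			fprintf(stderr, "worker died (status %d); restarting\n", status);
			sleep(1);		/* back off so a hard failure doesn't spin */
		}
		fprintf(stderr, "giving up after repeated failures\n");
		return 1;
	}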