At 2019-10-18T19:20:35-0400, Arthur Krewat wrote:
> I didn't have an 8087 floating point accelerator, so I wrote my
> assembler example to use two 16-bit words of integers, combining them
> for a 31-bit integer value with sign.
>
> Now mind you, the C version used real floating point, and a software
> floating point library with no hardware accelerator. At that point, I
> realized C was the way to go. It had passed my experiment with flying
> colors. The C compiler, I believe, was from Computer Innovations,
> Copyright (c) 1981,82,83,84,85.
>
> The reason this is similar to Ken's statement above: In the assembler
> version, the cube would deform quite a bit before the run would
> finish. A 31-bit integer didn't accurately reflect the result of the
> math. Over time, that slight inaccuracy really added up. The accuracy
> of the C version using floats was spot on. So while I basically
> cheated for the assembler version, causing the deformation of the cube
> over time, the C version was 100% accurate even though it was slower.
>
> I wonder, is there something inherently different between PDP-11/7
> floats and Intel's leading to the inaccuracy Ken mentions? Was the
> PDP-11 (or the -7) floating point that much different than IEEE-754?

It sounds like it could be a simple matter of precision to me. It takes
32 bits to store a single-precision floating-point value; double
precision requires 64. In IEEE 754, the double-precision significand is
53 bits (52 stored bits plus the implicit leading 1), against only 24
bits for single precision.

I can never remember the C type promotion rules without looking them
up, but IIRC, at least in some circumstances, C promotes floats to
doubles, at least for intermediate results. And the software
floating-point library you used could well have done the same, or
perhaps it used doubles all the way internally. Either of these could
have prevented accumulated roundoff.

I've heard, with a level of conviction somewhere between folklore and
formal demonstration[1], that for many practical numerical problems,
single precision is just not quite good enough, but double precision is
ample. Somewhere between 24 and 53 bits of significand, perhaps, there
is a sweet spot. The wisdom I've absorbed is: if you have to do
floating point, use doubles, unless you can clearly and convincingly
articulate why you absolutely need more precision, or can get away with
less. (For some 3D game-rendering applications, half precision is
adequate.) A non-quantified "single-precision will be faster"
declaration should be understood to include a lot of "!!1!11"
punctuation after it, and such people should be handled as delicately
as any other Gentoo user.

Regards,
Branden

[1] Example: Ben Klemens, _21st-Century C_, O'Reilly.
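
P.S. A minimal illustration of the accumulated-roundoff point (my own
toy, not a reconstruction of Arthur's cube demo): repeatedly adding
0.1, which has no exact binary representation, drifts badly in single
precision but stays close in double. On an IEEE 754 machine:

    #include <stdio.h>

    int main(void)
    {
        float  fsum = 0.0f;   /* single precision: 24-bit significand */
        double dsum = 0.0;    /* double precision: 53-bit significand */
        int i;

        /*
         * 0.1 is not exactly representable in binary, so every
         * addition rounds; a million additions let the error pile up.
         */
        for (i = 0; i < 1000000; i++) {
            fsum += 0.1f;
            dsum += 0.1;
        }

        /* The exact answer would be 100000. */
        printf("float  sum: %f\n", fsum);  /* drifts visibly high */
        printf("double sum: %f\n", dsum);  /* off only far past the decimal point */
        return 0;
    }

The float total ends up noticeably wrong in the integer part, while the
double total agrees with 100000 to several decimal places, which is the
same flavour of drift Arthur saw with his 31-bit integer cube.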