From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3e1162e60712190715q3f50d601sd88b1e0c6c74afc8@mail.gmail.com>
Date: Wed, 19 Dec 2007 07:15:09 -0800
From: "David Leimbach"
To: 9fans@cse.psu.edu
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_15451_9062593.1198077309064"
Subject: [9fans] fun and scary evil C code
Topicbox-Message-UUID: 1ca47a20-ead3-11e9-9d60-3106f5b1d025

------=_Part_15451_9062593.1198077309064
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I was amused by this:

http://www.steike.com/code/useless/evil-c/

I particularly liked the "internalEndianMagic".

I see this in some XML libs, as well as GMP and other open sourced code.

http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2005-12/msg00198.html 

From:
https://svn.r-project.org/R/trunk/src/extra/trio/trionan.c
/*
 * Endian-agnostic indexing macro.
 *
 * The value of internalEndianMagic, when converted into a 64-bit
 * integer, becomes 0x0706050403020100 (we could have used a 64-bit
 * integer value instead of a double, but not all platforms supports
 * that type). The value is automatically encoded with the correct
 * endianess by the compiler, which means that we can support any
 * kind of endianess. The individual bytes are then used as an index
 * for the IEEE 754 bit-patterns and masks.
 */
#define TRIO_DOUBLE_INDEX(x) (((unsigned char *)&internalEndianMagic)[7-(x)])
static TRIO_CONST double internalEndianMagic = 7.949928895127363e-275;
#endif
pretty weird stuff.
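
To make the trio comment concrete, here is a minimal sketch of how an index macro of this kind can be used to read the sign and exponent of a double on either byte order. The names endian_magic, DOUBLE_INDEX, double_sign and double_exponent are made up for this illustration and are not trio's. It assumes an 8-byte IEEE 754 double and a compiler that converts the decimal literal to the intended bit pattern.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: double is an 8-byte IEEE 754 binary64 and the literal below
 * converts to the bit pattern 0x0706050403020100, as the trio comment says. */
static const double endian_magic = 7.949928895127363e-275;

/* Memory offset of the x-th most significant byte of a double (x = 0..7). */
#define DOUBLE_INDEX(x) (((const unsigned char *)&endian_magic)[7 - (x)])

/* Sign bit: top bit of the most significant byte. */
static int double_sign(double d)
{
    const unsigned char *p = (const unsigned char *)&d;

    assert(CHAR_BIT == 8 && sizeof(double) == 8);
    return (p[DOUBLE_INDEX(0)] >> 7) & 1;
}

/* Biased 11-bit exponent: low 7 bits of byte 0, high 4 bits of byte 1. */
static int double_exponent(double d)
{
    const unsigned char *p = (const unsigned char *)&d;

    return ((p[DOUBLE_INDEX(0)] & 0x7f) << 4) | (p[DOUBLE_INDEX(1)] >> 4);
}
```

For example, double_exponent(1.0) gives the bias, 1023, and double_sign(-1.0) gives 1, whichever end of the double the host stores first.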

------=_Part_15451_9062593.1198077309064--

From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fun and scary evil C code
From: "Russ Cox"
Date: Wed, 19 Dec 2007 15:37:25 -0500
In-Reply-To: <3e1162e60712190715q3f50d601sd88b1e0c6c74afc8@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20071219203654.3714F1E8C1C@holo.morphisms.net>
Topicbox-Message-UUID: 1cb0d78e-ead3-11e9-9d60-3106f5b1d025

> #define TRIO_DOUBLE_INDEX(x) (((unsigned char *)&internalEndianMagic)[7-(x)])

this is actually done in /sys/src/9/port/devcons.c too:

static uvlong uvorder = 0x0001020304050607ULL;

static uchar*
le2vlong(vlong *to, uchar *f)
{
	uchar *t, *o;
	int i;

	t = (uchar*)to;
	o = (uchar*)&uvorder;
	for(i = 0; i < sizeof(vlong); i++)
		t[o[i]] = f[i];
	return f+sizeof(vlong);
}

static uchar*
vlong2le(uchar *t, vlong from)
{
	uchar *f, *o;
	int i;

	f = (uchar*)&from;
	o = (uchar*)&uvorder;
	for(i = 0; i < sizeof(vlong); i++)
		t[i] = f[o[i]];
	return t+sizeof(vlong);
}

presotto wrote the code but said he learned the trick from ken.

there, of course, we have a real compiler and don't have to
write uvlong constants as floating point numbers
(wow that seems fragile).

russ

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47699C10.5060708@gmail.com>
Date: Wed, 19 Dec 2007 17:32:48 -0500
From: Robert William Fuller
User-Agent: Thunderbird 2.0.0.6 (X11/20071013)
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] fun and scary evil C code
References: <20071219203654.3714F1E8C1C@holo.morphisms.net>
In-Reply-To: <20071219203654.3714F1E8C1C@holo.morphisms.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 1cbd2f34-ead3-11e9-9d60-3106f5b1d025

Russ Cox wrote:
>> #define TRIO_DOUBLE_INDEX(x) (((unsigned char *)&internalEndianMagic)[7-(x)])
>
> this is actually done in /sys/src/9/port/devcons.c too:
>
> static uvlong uvorder = 0x0001020304050607ULL;
>
> static uchar*
> le2vlong(vlong *to, uchar *f)
> {
> 	uchar *t, *o;
> 	int i;
>
> 	t = (uchar*)to;
> 	o = (uchar*)&uvorder;
> 	for(i = 0; i < sizeof(vlong); i++)
> 		t[o[i]] = f[i];
> 	return f+sizeof(vlong);
> }
>
> static uchar*
> vlong2le(uchar *t, vlong from)
> {
> 	uchar *f, *o;
> 	int i;
>
> 	f = (uchar*)&from;
> 	o = (uchar*)&uvorder;
> 	for(i = 0; i < sizeof(vlong); i++)
> 		t[i] = f[o[i]];
> 	return t+sizeof(vlong);
> }
>
> presotto wrote the code but said he learned the trick from ken.
>
> there, of course, we have a real compiler and don't have to
> write uvlong constants as floating point numbers
> (wow that seems fragile).
>
> russ

Now THAT's something to be proud of. Especially without comments.
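
For readers who want the trick spelled out, here is a commented sketch of the same idea in portable C99. It is an illustration only, not the devcons.c code: the names byte_rank, le_to_u64 and u64_to_le are invented here, and the constant is the trio-style 0x0706050403020100 rather than uvorder, chosen so that each table entry is simply the significance of the byte at that address. It assumes 8-bit bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/*
 * Byte k of this constant (k = 0 is least significant) has the value k,
 * so reading it back through an unsigned char pointer gives, for each
 * memory offset, the significance of the byte stored there.
 */
static const uint64_t byte_rank = 0x0706050403020100ULL;

/* Decode 8 little-endian wire bytes into a host-order integer. */
static uint64_t le_to_u64(const unsigned char *wire)
{
    const unsigned char *rank = (const unsigned char *)&byte_rank;
    unsigned char host[sizeof(uint64_t)];
    uint64_t v;
    size_t i;

    assert(CHAR_BIT == 8);
    for (i = 0; i < sizeof host; i++)
        host[i] = wire[rank[i]];    /* wire[k] carries significance k */
    memcpy(&v, host, sizeof v);
    return v;
}

/* Encode a host-order integer as 8 little-endian wire bytes. */
static void u64_to_le(unsigned char *wire, uint64_t v)
{
    const unsigned char *rank = (const unsigned char *)&byte_rank;
    unsigned char host[sizeof(uint64_t)];
    size_t i;

    memcpy(host, &v, sizeof host);
    for (i = 0; i < sizeof host; i++)
        wire[rank[i]] = host[i];    /* host byte at offset i has rank[i] */
}
```

On a little-endian host the table reads 0,1,...,7 and both loops are plain copies; on a big-endian host it reads 7,6,...,0 and both loops reverse the bytes. No #ifdef and no run-time endianness test is needed, because the compiler lays out the constant in host order.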
From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3cfdec3af4604f467d7e0e9c5e07e4ef@quanstro.net>
From: erik quanstrom
Date: Wed, 19 Dec 2007 17:38:57 -0500
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fun and scary evil C code
In-Reply-To: <47699C10.5060708@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Topicbox-Message-UUID: 1cc15ffa-ead3-11e9-9d60-3106f5b1d025

> > presotto wrote the code but said he learned the trick from ken.
> >
> > there, of course, we have a real compiler and don't have to
> > write uvlong constants as floating point numbers
> > (wow that seems fragile).
> >
> > russ
>
> Now THAT's something to be proud of. Especially without comments.

which. writing the code or learning tricks from ken? ☺

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] fun and scary evil C code
In-reply-to: Your message of "Wed, 19 Dec 2007 15:37:25 EST." <20071219203654.3714F1E8C1C@holo.morphisms.net>
From: Bakul Shah
Date: Wed, 19 Dec 2007 14:51:21 -0800
Message-Id: <20071219225121.BF3715B42@mail.bitblocks.com>
Topicbox-Message-UUID: 1cc58a94-ead3-11e9-9d60-3106f5b1d025

> > #define TRIO_DOUBLE_INDEX(x) (((unsigned char *)&internalEndianMagic)[7-(x)])
>
> this is actually done in /sys/src/9/port/devcons.c too:
>
> static uvlong uvorder = 0x0001020304050607ULL;
>
> static uchar*
> le2vlong(vlong *to, uchar *f)
> {
> 	uchar *t, *o;
> 	int i;
>
> 	t = (uchar*)to;
> 	o = (uchar*)&uvorder;
> 	for(i = 0; i < sizeof(vlong); i++)
> 		t[o[i]] = f[i];
> 	return f+sizeof(vlong);
> }
>
> static uchar*
> vlong2le(uchar *t, vlong from)
> {
> 	uchar *f, *o;
> 	int i;
>
> 	f = (uchar*)&from;
> 	o = (uchar*)&uvorder;
> 	for(i = 0; i < sizeof(vlong); i++)
> 		t[i] = f[o[i]];
> 	return t+sizeof(vlong);
> }
>
> presotto wrote the code but said he learned the trick from ken.
>
> there, of course, we have a real compiler and don't have to
> write uvlong constants as floating point numbers
> (wow that seems fragile).

Most likely the TRIO_DOUBLE_INDEX macro came from some code from
the pre-long-long days, when you had no other way to write down
an 8 byte value in host endian order (even in a less than portable
way).

From mboxrd@z Thu Jan 1 00:00:00 1970
Mime-Version: 1.0 (Apple Message framework v753)
In-Reply-To: <20071219203654.3714F1E8C1C@holo.morphisms.net>
References: <20071219203654.3714F1E8C1C@holo.morphisms.net>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Message-Id: <67C6BAFA-908C-4756-848F-4D65095913EB@mac.com>
Content-Transfer-Encoding: 7bit
From: dave.l@mac.com
Subject: Re: [9fans] fun and scary evil C code
Date: Thu, 20 Dec 2007 00:18:14 +0000
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Topicbox-Message-UUID: 1cd1219c-ead3-11e9-9d60-3106f5b1d025

> there, of course, we have a real compiler and don't have to
> write uvlong constants as floating point numbers
> (wow that seems fragile).

Scarily: they're not: if you read on, that macro is for picking apart
a double into bytes and vice-versa.
(i.e. its insanity is internally consistent).

It's still ludicrous and fragile.

As far as I can see,
they're assuming that double is IEEE 754 and that you can
rip apart such a representation on a random-endian machine
and re-assemble it on any other random-endian machine.

Does C99 or any other C mandate the actual memory layout
of floats and doubles or the exact conversion of constant
representations?
I'm fairly sure they somehow mandate IEEE 754 properties,
but do they actually say that floats and doubles have to be stored
exactly that way in 4 or 8 bytes?

Even if we assume sizeof(double) == 8,
what if my implementation is perverse and interleaves the exponent
bits amongst the mantissa bits?
Where is this disallowed in the standard(s)?

DaveL

From mboxrd@z Thu Jan 1 00:00:00 1970
Mime-Version: 1.0 (Apple Message framework v753)
In-Reply-To: <67C6BAFA-908C-4756-848F-4D65095913EB@mac.com>
References: <20071219203654.3714F1E8C1C@holo.morphisms.net> <67C6BAFA-908C-4756-848F-4D65095913EB@mac.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id: <3FA5A9CA-EE09-4FDB-A9AA-118E3E8A3AB6@mac.com>
Content-Transfer-Encoding: 7bit
From: dave.l@mac.com
Subject: Re: [9fans] fun and scary evil C code
Date: Thu, 20 Dec 2007 00:33:50 +0000
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Topicbox-Message-UUID: 1cd4dc42-ead3-11e9-9d60-3106f5b1d025

> As far as I can see,
> they're assuming that double is IEEE 754 and that you can
> rip apart such a representation on a random-endian machine
> and re-assemble it on any other random-endian machine.

I lied.
They're actually using that stuff to pick apart the (presumed IEEE 754)
doubles into exponent, mantissa, ...

My questions still stand ...

Also, that file is such a tangled nest of ifdefs ...
is anyone sure whether (and if so, when) that code is invoked?

DaveL

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <45359f79d760ffae8f9d6d769e8087c2@quanstro.net>
From: erik quanstrom
Date: Wed, 19 Dec 2007 19:51:38 -0500
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fun and scary evil C code
In-Reply-To: <3FA5A9CA-EE09-4FDB-A9AA-118E3E8A3AB6@mac.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 1d4c3e90-ead3-11e9-9d60-3106f5b1d025

> I lied.
> They're actually using that stuff to pick apart the (presumed IEEE 754)
> doubles into exponent, mantissa, ...
>
> My questions still stand ...
>
> Also, that file is such a tangled nest of ifdefs ...
> is anyone sure whether (and if so, when) that code is invoked?

interesting conflicts on the web.
http://en.wikipedia.org/wiki/IEEE_754 says c doesn't require ieee,
http://home.datacomm.ch/t_wolf/tw/c/c9x_changes.html point #21.
says it does.

in any event, if ieee is used, i think the trick is safe.

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Thu, 20 Dec 2007 09:29:29 +0000
From: "roger peppe"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
Subject: Re: [9fans] fun and scary evil C code
In-Reply-To: <45359f79d760ffae8f9d6d769e8087c2@quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <3FA5A9CA-EE09-4FDB-A9AA-118E3E8A3AB6@mac.com> <45359f79d760ffae8f9d6d769e8087c2@quanstro.net>
Topicbox-Message-UUID: 1da651e6-ead3-11e9-9d60-3106f5b1d025

On Dec 20, 2007 12:51 AM, erik quanstrom wrote:
> in any event, if ieee is used, i think the trick is safe.

surely it depends crucially on the consistency of the decimal to
float conversion? is that standard?

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8031fba473b2f8f2d904869083bd3f74@quanstro.net>
From: erik quanstrom
Date: Thu, 20 Dec 2007 08:02:22 -0500
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fun and scary evil C code
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 1e42b27a-ead3-11e9-9d60-3106f5b1d025

as long as the conversion you're after has an exact ieee representation,
i can't see how two compliant implementations could come up with
differing representations. (two different numbers can't have the same
ieee representation, except -0 and +0.) the conversion process doesn't
need any floating point itself and the only interpolation comes when
numbers don't have exact representations.

it didn't occur to me immediately after russ' post that what he said
was literally correct:

#include <u.h>
#include <libc.h>

#define Magici(x) (((uchar*)&magic)[7-(x)])
static double magic = 7.949928895127363e-275;

void
main(void)
{
	uint sign, exp;
	uvlong sig, *v;

	v = (uvlong*)&magic;		/* view the double's bits as a uvlong */
	sign = *v>>63;			/* bit 63 */
	exp = (*v>>52) & 0x7ff;		/* bits 62..52, biased exponent */
	sig = *v&~(4096-1LL<<52);	/* bits 51..0, fraction */
	print("%b*%b*%ullx\n", sign, exp, sig);
	print("%ullx\n", *v);
	exits(nil);
}

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fun and scary evil C code
From: "Russ Cox"
Date: Thu, 20 Dec 2007 11:29:55 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20071220162924.124251E8C1C@holo.morphisms.net>
Topicbox-Message-UUID: 1e88f8ca-ead3-11e9-9d60-3106f5b1d025

> Does C99 or any other C mandate the actual memory layout
> of floats and doubles or the exact conversion of constant
> representations?
> I'm fairly sure they somehow mandate IEEE 754 properties,
> but do they actually say that floats and doubles have to be stored
> exactly that way in 4 or 8 bytes?
>
> Even if we assume sizeof(double) == 8,
> what if my implementation is perverse and interleaves the exponent
> bits amongst the mantissa bits?
> Where is this disallowed in the standard(s)?

It doesn't matter what the standard says; it matters what
implementations do. C implementations are going to provide
what the underlying hardware does, and almost all hardware
does IEEE 754.

> as long as the conversion you're after has an exact ieee representation,
> i can't see how two compliant implementations could come up with
> differing representations. (two different numbers can't have the same
> ieee representation, except -0 and +0.) the conversion process doesn't
> need any floating point itself and the only interpolation comes when
> numbers don't have exact representations.

Historically, it has not always been true that decimal <-> binary
conversion of doubles has been precise enough to replicate a specific
bit pattern. While conversion doesn't require floating point, it is
often done in floating point anyway for convenience.

Russ
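
One way to sidestep the decimal conversion question, where a C99 compiler is available, is to spell the constant as a hexadecimal floating constant and to check the decimal spelling against the intended pattern. The sketch below is only an illustration under those assumptions: double_bits, magic_hex, magic_dec and decimal_literal_is_exact are hypothetical names, and it presumes an 8-byte IEEE 754 double whose byte order matches that of a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a double as a host-order integer.
 * Assumption: double and uint64_t are both 8 bytes with the same byte order. */
static uint64_t double_bits(double d)
{
    uint64_t u;

    assert(sizeof d == sizeof u);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* The same value written as a C99 hexadecimal floating constant:
 * exponent field 0x070 = 112, so the power is 112 - 1023 = -911, and the
 * 52 fraction bits are 0x6050403020100. No decimal conversion is involved. */
static const double magic_hex = 0x1.6050403020100p-911;

/* The decimal spelling used by trio. */
static const double magic_dec = 7.949928895127363e-275;

/* 1 if the compiler's decimal conversion reproduced the intended pattern. */
static int decimal_literal_is_exact(void)
{
    return double_bits(magic_dec) == 0x0706050403020100ULL;
}
```

A build could call decimal_literal_is_exact() once at startup and refuse to use the index trick if it returns 0. With a correctly rounding compiler it should return 1, but that is exactly the property in question, so the sketch tests it rather than assuming it.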