From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Xavier.Leroy@inria.fr>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by yquem.inria.fr (Postfix) with ESMTP id 1BB05BBB8
	for <caml-list@yquem.inria.fr>; Mon, 23 Jun 2008 09:58:31 +0200 (CEST)
X-IronPort-AV: E=Sophos;i="4.27,688,1204498800"; 
   d="scan'208";a="12431094"
Received: from estephe.inria.fr (HELO [128.93.11.95]) ([128.93.11.95])
  by mail2-relais-roc.national.inria.fr with ESMTP; 23 Jun 2008 09:58:31 +0200
Message-ID: <485F57A7.8020704@inria.fr>
Date: Mon, 23 Jun 2008 09:58:31 +0200
From: Xavier Leroy <Xavier.Leroy@inria.fr>
User-Agent: Thunderbird 1.5.0.7 (X11/20060915)
MIME-Version: 1.0
To: Brian Hurt <bhurt@spnz.org>
Cc: =?ISO-8859-1?Q?Daniel_B=FCnzli?= <daniel.buenzli@erratique.ch>,
	caml-list caml-list <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] Strange behaviour of string_of_float
References: <4b5157c30806220956p164d8346i3e832bc8b8666306@mail.gmail.com>	<F2ECF8A0-AC98-4AC1-90C5-40AA94511004@erratique.ch> <Pine.LNX.4.64.0806222059280.3616@localhost>
In-Reply-To: <Pine.LNX.4.64.0806222059280.3616@localhost>
X-Enigmail-Version: 0.94.0.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Spam: no; 0.00; serialize:01 integers:01 integers:01 deserialize:01 nans:01 infinities:01 iirc:01 specifies:01 overkill:01 integer:01 caml-list:01 int:01 int:01 behaviour:01 caml:02 

>> If you can serialize to binary, you can acheive what you want by
>> serializing the 64 bits integers you get with Int64.bits_of_float and
>> applying Int64.float_of_bits to the integers you deserialize.
>
> Actually, for serialization, I strongly reccommend first using
> classify_float to split off and handle NaNs, Infinities, etc., then
> using frexp to split the float into a fraction and exponent.  The
> exponent is just an int, and the fractional part can be multiplied by,
> say, 2^56 and then converted into an integer.
>
> The advantage of doing things this way, despite it being slightly more
> complicated, is two fold: one, it gaurentees you the ability to recovery
> the exact binary value of the float, and two, it sidesteps a huge number
> of compatibility issues between architectures- IIRC, IEEE 754 specifies
> how many bits have to be used to represent each part of the float, but
> not where they have to be in the word.

The only architecture I know where this problem could occur is the old
(pre-EABI) ABI for ARM, which has "mixed-endian" floats.  But the
implementation of Int64.{bits_of_float,float_of_bits} goes to some
length to rearrange bits as expected, i.e. with the sign bit in the
most significant bit of the int64, followed by the exponent bits,
followed by the mantissa bits in the least significant bits of the
int64.

So, the case analysis on the float that Brian suggests is a bit of an
overkill, and I strongly suggest using the result of
Int64.bits_of_float as the exact, serializable representation of a
Caml float.

- Xavier Leroy