[Caml-list] float pretty-printing precision, once more.

caml-list - the Caml user's mailing list

* [Caml-list] float pretty-printing precision, once more.
@ 2002-12-09 23:04 jeanmarc.eber
  2002-12-09 23:46 ` Yaron M. Minsky
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: jeanmarc.eber @ 2002-12-09 23:04 UTC (permalink / raw)
  To: caml-list

caml 3.06+1:  

# let f = 1. /. 86400.;;   
val f : float = 1.15740740741e-05   
# let s = string_of_float f;;   
val s : string = "1.15740740741e-05"   
# let f1 = float_of_string s;;   
val f1 : float = 1.15740740741e-05   
# f1 = f;;   
- : bool = false   
# f1 -. f;;   
- : float = 2.59259844496e-17   

This situation may be understandable, but is unfortunate.  

Disclaimer: I'm not a specialist of the IEEE float format.  

Do I have at hand, at least on an architecture supporting the IEEE format, a  
function that pretty-prints any valid float value (by valid I mean that I  
exclude the "special" values like NaN, infinity, etc.) so that  
float_of_string applied to the resulting string returns my initial value,  
or, at least, a value that, if subtracted from my initial one, returns
zero ?  

Background:  

In fact, my question goes a little bit further, as it concerns indeed the  
parsing of floats in the caml compiler (that uses internally float_of_string  
if I'm correct).  

Suppose you calculate somewhere (with a caml program, say) a float
constant (such a calculation may last for hours!), and you want after  
obtaining the result to *generate* a caml source using this calculated  
value. You will probably generate something like  

let my_const = <a float text representation>  

But my example shows that you are losing precision and accuracy if you
just use string_of_float. 

Of course the goal is to incorporate this value in a caml source, not  
to read it in binary form from a file (that would be easy!).  

Does anybody know a solution to my problem ?

Jean-Marc Eber  

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-09 23:04 [Caml-list] float pretty-printing precision, once more jeanmarc.eber
@ 2002-12-09 23:46 ` Yaron M. Minsky
  2002-12-10  0:07 ` Brian Hurt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Yaron M. Minsky @ 2002-12-09 23:46 UTC (permalink / raw)
  To: jeanmarc.eber; +Cc: caml-list

I'm no expert on floating point representations either, but sprintf with
a sufficiently long length seems to work.

# let test x = float_of_string (Printf.sprintf "%.30e" x) = x;;
val test : float -> bool = <fun>
# test (sqrt 2.);;
- : bool = true
# test (sqrt 2. *. 1.343e26);;
- : bool = true
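
The two checks above can be widened into a quick experiment. The sketch
below is only illustrative: the helper names (round_trips,
random_finite_float, count_failures) are made up for this example, and it
assumes the platform's printf and strtod are correctly rounded, which is
what Printf.sprintf and float_of_string rely on underneath.

```ocaml
(* Does printing x with [fmt] and reading it back give x again? *)
let round_trips (fmt : (float -> string, unit, string) format) x =
  float_of_string (Printf.sprintf fmt x) = x

(* Draw a random non-negative finite double by picking random IEEE bits
   and rejecting NaN and infinity. *)
let random_finite_float () =
  let rec go () =
    let x = Int64.float_of_bits (Random.int64 Int64.max_int) in
    match classify_float x with
    | FP_nan | FP_infinite -> go ()
    | _ -> x
  in
  go ()

(* Count how many of [n] random floats fail to round-trip with [fmt]. *)
let count_failures fmt n =
  let failures = ref 0 in
  for _i = 1 to n do
    if not (round_trips fmt (random_finite_float ())) then incr failures
  done;
  !failures

let () =
  Printf.printf "%%.12g failures: %d\n" (count_failures "%.12g" 1000);
  Printf.printf "%%.17g failures: %d\n" (count_failures "%.17g" 1000);
  Printf.printf "%%.30e failures: %d\n" (count_failures "%.30e" 1000)
```

A random test like this can only suggest that a precision is sufficient;
it does not prove it.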

y


* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-09 23:04 [Caml-list] float pretty-printing precision, once more jeanmarc.eber
  2002-12-09 23:46 ` Yaron M. Minsky
@ 2002-12-10  0:07 ` Brian Hurt
  2002-12-10  2:13 ` David Chase
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Brian Hurt @ 2002-12-10  0:07 UTC (permalink / raw)
  To: jeanmarc.eber; +Cc: caml-list


This looks just like rounding error to me.  I note that the string output
has only 12 significant digits, and the error is, coincidentally enough,
right about f * 1e-12.
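
To put a number on that estimate, here is a small sketch that reproduces
the 12-digit round trip and computes the relative error.  It assumes that
string_of_float behaves like Printf.sprintf "%.12g", which matches the
output shown in the original message; the name rel_err is just for this
example.

```ocaml
let f = 1. /. 86400.

(* Print with 12 significant digits, then read the text back. *)
let f1 = float_of_string (Printf.sprintf "%.12g" f)

(* Relative error introduced by the round trip. *)
let rel_err = abs_float (f1 -. f) /. f

let () = Printf.printf "relative error = %.3g\n" rel_err
```

With 12 significant digits the relative error of any value is bounded by
about 5e-12, so an error of this size is what one would expect from
truncating the decimal text.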

But then, I'm not an expert in IEEE FP either...

Brian


* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-09 23:04 [Caml-list] float pretty-printing precision, once more jeanmarc.eber
  2002-12-09 23:46 ` Yaron M. Minsky
  2002-12-10  0:07 ` Brian Hurt
@ 2002-12-10  2:13 ` David Chase
  2002-12-10  9:49 ` Xavier Leroy
  2002-12-10 13:09 ` Damien Doligez
  4 siblings, 0 replies; 10+ messages in thread
From: David Chase @ 2002-12-10  2:13 UTC (permalink / raw)
  To: caml-list

You need David Gay's gdtoa (strtod) from netlib.
If I had more time, I would give you more information, but
that is the answer to your question if your question involves
translation between strings and machine floating point.  Holler
when you get stuck.

And yes, on this particular topic, I am now an expert.

David


* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-09 23:04 [Caml-list] float pretty-printing precision, once more jeanmarc.eber
                   ` (2 preceding siblings ...)
  2002-12-10  2:13 ` David Chase
@ 2002-12-10  9:49 ` Xavier Leroy
  2002-12-10 13:09 ` Damien Doligez
  4 siblings, 0 replies; 10+ messages in thread
From: Xavier Leroy @ 2002-12-10  9:49 UTC (permalink / raw)
  To: jeanmarc.eber; +Cc: caml-list

> # let f = 1. /. 86400.;;   
> val f : float = 1.15740740741e-05   
> # let s = string_of_float f;;   
> val s : string = "1.15740740741e-05"   
> # let f1 = float_of_string s;;   
> val f1 : float = 1.15740740741e-05   
> # f1 = f;;   
> - : bool = false   
> # f1 -. f;;   
> - : float = 2.59259844496e-17   
>   
> This situation may be understandable, but is unfortunate.  

Well, you get a relative error of 2e-12.  Are you sure your
computations require more precision than this?  In physics, the answer
would be almost universally "no".  In finance, you know better than I...

> Suppose you calculate somewhere (with a caml program, say) a float
> constant (such a calculation may last for hours!), and you want after  
> obtaining the result to *generate* a caml source using this calculated  
> value. You will probably generate something like  
>   
> let my_const = <a float text representation>  
>   
> But my example shows that you are losing precision and accuracy if you
> just use string_of_float. 

There are two approaches to your problem:

1- The physicist's approach:
Your float constant is known to have no more than N significant digits
(e.g. because it's based on measurements that have a 10^-N error margin).
Then, use the printf %g format corresponding to N to generate your source.

2- The programmer's approach:
Having no idea on the actual precision of their data, programmers want
bit-for-bit identity on float representation.  You can do that in
OCaml using Int64.bits_of_float and Int64.float_of_bits, which give
direct access to the IEEE bit-level representation of floats.
For instance, generate your Caml code with

printf "let my_const = Int64.float_of_bits(Int64.of_string \"%Ld\")\n"
       (Int64.bits_of_float my_const)

That will generate something like

let my_const = Int64.float_of_bits(Int64.of_string "4532949752942055721")

which is truly unreadable, but is guaranteed to give you the exact
same float when executed.

(To help reading the source, consider putting the decimal
representation of the float in a generated comment.)
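
The two approaches can be sketched side by side as follows.  The function
names (physicist_literal, programmer_binding) and the ~digits label are
invented for this illustration; the only library calls used are
Printf.sprintf, Int64.bits_of_float, Int64.float_of_bits and
Int64.of_string.

```ocaml
(* Approach 1: print with a chosen number of significant digits.
   "%.*g" takes the precision as an extra integer argument. *)
let physicist_literal ~digits x = Printf.sprintf "%.*g" digits x

(* Approach 2: emit the IEEE bits, with the decimal value in a
   generated comment to help human readers. *)
let programmer_binding name x =
  Printf.sprintf
    "let %s = Int64.float_of_bits (Int64.of_string \"%Ld\") (* %.17g *)"
    name (Int64.bits_of_float x) x

let () =
  let c = 1. /. 86400. in
  print_endline (physicist_literal ~digits:6 c);
  print_endline (programmer_binding "my_const" c)
```

The first form stays readable but is only as exact as the chosen digit
count; the second is exact by construction, whatever printf and strtod do
on the platform.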

Hope this helps,

- Xavier Leroy

* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-09 23:04 [Caml-list] float pretty-printing precision, once more jeanmarc.eber
                   ` (3 preceding siblings ...)
  2002-12-10  9:49 ` Xavier Leroy
@ 2002-12-10 13:09 ` Damien Doligez
  2002-12-10 15:37   ` Jacques Carette
  2002-12-10 15:47   ` Xavier Leroy
  4 siblings, 2 replies; 10+ messages in thread
From: Damien Doligez @ 2002-12-10 13:09 UTC (permalink / raw)
  To: caml-list

On Tuesday, Dec 10, 2002, at 00:04 Europe/Paris, 
jeanmarc.eber@lexifi.com wrote:

> caml 3.06+1:
[string_of_float loses precision on floating-point numbers]

In the current working version (3.06+18), the precision used by
string_of_float has been increased to 17 digits.  I *think* this
is enough to represent all the double-precision floating-point
numbers:

           Objective Caml version 3.06+18 (2002-11-07)

   # let f = 1. /. 86400.;;
   val f : float = 1.1574074074074073e-05
   # let s = string_of_float f;;
   val s : string = "1.1574074074074073e-05"
   # let f1 = float_of_string s;;
   val f1 : float = 1.1574074074074073e-05
   # f1 = f;;
   - : bool = true
   # f1 -. f;;
   - : float = 0.

However, it has the unfortunate side-effect of revealing this
awful truth about FP numbers: many "interesting" numbers are
impossible to represent in floating-point.  For example:

   # 0.1;;
   - : float = 0.10000000000000001

There is no floating-point number equal to 0.1 and the
best approximation you can get is 0.10000000000000001.
I expect this will quickly become a much-used FAQ entry...
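
A small sketch makes this concrete.  It uses only standard library calls;
the hexadecimal bit pattern shows the repeating binary expansion of 0.1
being cut off and rounded in the last place.

```ocaml
let () =
  (* 12 significant digits hide the approximation, 17 reveal it. *)
  Printf.printf "%.12g\n" 0.1;
  Printf.printf "%.17g\n" 0.1;
  (* Raw IEEE bits of 0.1, in hexadecimal. *)
  Printf.printf "%Lx\n" (Int64.bits_of_float 0.1);
  (* Arithmetic on inexact values need not land on the nearest double. *)
  Printf.printf "0.1 +. 0.2 = 0.3 is %b\n" (0.1 +. 0.2 = 0.3)
```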

I don't know if we will keep this change for the next release
of O'Caml.  But you can always use (Printf.sprintf "%.17g")
instead of string_of_float, and you'll get all the digits,
significant or not.
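
For the source-generation use case, one detail needs care: "%.17g" prints
1.0 as "1", which OCaml would read as an int.  A minimal sketch of a
generator follows, assuming finite inputs only (NaN and infinities are not
handled); float_literal and emit_constant are hypothetical names, not
library functions.

```ocaml
(* Render a finite float as an OCaml float literal that reads back
   exactly, assuming correctly rounded printf and strtod. *)
let float_literal x =
  let s = Printf.sprintf "%.17g" x in
  (* Append a dot when the text has neither a dot nor an exponent,
     so that the result is a float literal rather than an int. *)
  if String.contains s '.' || String.contains s 'e' then s else s ^ "."

(* Produce a binding such as: let my_const = 1.1574074074074073e-05 *)
let emit_constant name x =
  Printf.sprintf "let %s = %s" name (float_literal x)

let () = print_endline (emit_constant "my_const" (1. /. 86400.))
```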

-- Damien


* RE: [Caml-list] float pretty-printing precision, once more.
  2002-12-10 13:09 ` Damien Doligez
@ 2002-12-10 15:37   ` Jacques Carette
  2002-12-10 15:47   ` Xavier Leroy
  1 sibling, 0 replies; 10+ messages in thread
From: Jacques Carette @ 2002-12-10 15:37 UTC (permalink / raw)
  To: caml-list

Damien Doligez wrote:

> In the current working version (3.06+18), the precision used by
> string_of_float has been increased to 17 digits.  I *think* this
> is enough to represent all the double-precision floating-point
> numbers:

Almost...  there are a few (rare) cases where 18 digits are needed!  But 18
is indeed the 'right' number, assuming the system does proper IEEE rounding.
I am not an expert on this, but I used to have one as an employee (back when
I was manager of the Mathematics Group at Maple), and I learned this from
him.

> I don't know if we will keep this change for the next release
> of O'Caml.

Correctness and ease-of-use do clash badly when it comes to FP.  I prefer
correctness, because later syntactic sugar can deal with the ease-of-use
problem, but the other way around does not work at all.

Jacques C.


* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-10 13:09 ` Damien Doligez
  2002-12-10 15:37   ` Jacques Carette
@ 2002-12-10 15:47   ` Xavier Leroy
  2002-12-11  4:03     ` David Chase
  1 sibling, 1 reply; 10+ messages in thread
From: Xavier Leroy @ 2002-12-10 15:47 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

> [string_of_float loses precision on floating-point numbers]
> In the current working version (3.06+18), the precision used by
> string_of_float has been increased to 17 digits.

Yes, this was my feeble attempt to work around the precision loss
using only what is available, i.e. sprintf.  However, as you say:

> However, it has the unfortunate side-effect of revealing this
> awful truth about FP numbers: many "interesting" numbers are
> impossible to represent in floating-point.  For example:
> 
>    # 0.1;;
>    - : float = 0.10000000000000001
> 
> There is no floating-point number equal to 0.1 and the
> best approximation you can get is 0.10000000000000001.

If that was really the best approximation, there would be nothing to
argue.  But both 0.10000000000000001 and 0.1 read back as identical
floats, so the latter should be printed instead, but sprintf (on
Linux at least) is too stupid to realize this.
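
One way to get the shorter output with sprintf alone is to try increasing
precisions and stop at the first one that reads back exactly.  This is
only a sketch under the assumption that printf and strtod are correctly
rounded; shortest_repr is a made-up name, and the loop is a brute-force
stand-in, not the algorithm a dedicated conversion library would use.

```ocaml
(* Return the output of "%.<p>g" for the smallest p in 1..17 whose
   text reads back as exactly x; fall back to 17 digits otherwise. *)
let shortest_repr x =
  let rec go p =
    let s = Printf.sprintf "%.*g" p x in
    if p >= 17 || float_of_string s = x then s else go (p + 1)
  in
  go 1

let () =
  print_endline (shortest_repr 0.1);
  print_endline (shortest_repr (1. /. 86400.))
```

The cost is up to 17 formatting and parsing passes per number, which is
usually acceptable for generating source text.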

- Xavier Leroy

* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-10 15:47   ` Xavier Leroy
@ 2002-12-11  4:03     ` David Chase
  2002-12-12  1:41       ` David Chase
  0 siblings, 1 reply; 10+ messages in thread
From: David Chase @ 2002-12-11  4:03 UTC (permalink / raw)
  To: caml-list

At 04:47 PM 12/10/2002 +0100, Xavier Leroy wrote:
>If that was really the best approximation, there would be nothing to
>argue.  But both 0.10000000000000001 and 0.1 read back as identical
>floats, so the latter should be printed instead, but sprintf (on
>Linux at least) is too stupid to realize this.

Gdtoa will supply you with the non-stupid answer, if you care
to ask it.  Again -- this I know well, I have used gdtoa to get
specification-conforming double->String conversion in a Java
VM/library.  Here is a test program that demonstrates how to
use gdtoa in "minimum unambiguous result length" mode.  This
assumes int is 32 bits long:

#include <stdio.h>
#include "gdtoa.h"
/* Always define to zero or one.  It is zero on Pentium. */
#define LITTLE_ENDIAN 1
FPI d_convert = { 53, 1-1023-53+1, 2046-1023-53+1, FPI_Round_near, 0};
union pun {
  double d;
  int i[2];
};
char * test_gdtoa(double d, int mode, int ndigits, int * plength, int * pdecpt) {
  union pun convert;
  convert.d = d;
  {
  int i0 = convert.i[0^LITTLE_ENDIAN];
  int i1 = convert.i[1^LITTLE_ENDIAN];
  
  int isNeg = i0 < 0;
  int m0 = i0 & 0xfffff;
  int e0 = ((unsigned)i0 >> 20) & 0x7ff; 
  int int_args[8];
  char * ptr_args[1];
  char * result;
  int length;
  
  /* NaN or Inf */
  if (e0 == 0x7ff) {
    if ((m0 | i1) != 0) {
      *plength = -1;
      return "NaN";
    } else {
      *plength = -1;
      return isNeg ? "-Infinity" : "Infinity";
    }
  }
  
  /* +/- 0 */
  if ((m0 | i1 | e0) == 0) {
    *plength = -1;
    return isNeg ? "-0.0" : "0.0";
  }
  
  /* If the exponent is larger than zero,
     then the leading "1" in the mantissa
     is implicit.  Otherwise, the number is
     denormalized, and the exponent is
     really 1. */
  
  if (e0 != 0) m0 |= 0x100000;
  else e0 = 1;
  
  /* Adjust exponent to remove bias and to
     make the mantissa all "integer", e.g.
     12345e-4 instead of 1.2345e+0 */
  e0 -= 0x3ff + 52;
  int_args[0] = i1;
  int_args[1] = m0;
  int_args[2] = STRTOG_Normal;
  result = gdtoa(&d_convert, /* fpi */
                 e0,         /* be   */
                 int_args + 0, /* bits */
                 int_args + 2, /* kindP */
                 mode,         /* mode */
                 ndigits,      /* ndigits */
                 int_args + 3, /* decpt */
                 ptr_args + 0); /* rve */
  *pdecpt = int_args[3];
  length = ptr_args[0] - result;
  if (length == 0) {
    *plength = -1;
    return isNeg ? "-0.0" : "0.0";
  }
  *plength = length;
  return result;
  }
}

static void test_one(double d) {
  int decpt;
  int length;
  char * result = test_gdtoa(d, 0, 0, &length, &decpt);

  if (length == -1) {
    printf("test of %f returns special value %s\n", d, result);
  } else {
    printf("test of %f returns regular value %s, decpt %d, length %d\n", d, result, decpt, length);
  }
}

int main(int argc, char ** argv) {
  test_one(1.0);
  test_one(1.0/0.0);
  test_one(-1.0/0.0);
  test_one(-1.0/0.0 + 1.0/0.0);
  test_one(0.0);
  test_one(-0.0);
  test_one(-1.0);
  test_one(0.25);
  test_one(4.0);
  test_one(8192.125);
  test_one(0.1);
  test_one(0.01);
  test_one(0.9);
  test_one(0.09);
  test_one(0.2);
  test_one(0.02);
  test_one(0.3);
  test_one(0.03);
  return 0;
}


David Chase



* Re: [Caml-list] float pretty-printing precision, once more.
  2002-12-11  4:03     ` David Chase
@ 2002-12-12  1:41       ` David Chase
  0 siblings, 0 replies; 10+ messages in thread
From: David Chase @ 2002-12-12  1:41 UTC (permalink / raw)
  To: caml-list

Correction to the source code

At 11:03 PM 12/10/2002 -0500, David Chase wrote:
>#include "gdtoa.h"
>/* Always define to zero or one.  It is zero on Pentium. */
>#define LITTLE_ENDIAN 1

No, it is 1 on Pentium, and zero on Sparc and most other
machines.  Sorry about that.

David

