Re: [Caml-list] string_of_float less accurate than sprintf "%f" ?

From: Xavier Leroy <xavier.leroy@inria.fr>
To: "Beck01, Wolfgang" <BeckW@t-systems.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] string_of_float less accurate than sprintf "%f" ?
Date: Sat, 4 May 2002 10:53:24 +0200
Message-ID: <20020504105324.A15588@pauillac.inria.fr>
In-Reply-To: <C3F9C806AEC6D5119643000347055E322088D8@G9JNW.mgb01.telekom.de>; from BeckW@t-systems.com on Tue, Apr 30, 2002 at 10:21:43AM +0200

> while doing some time measurements with Unix.gettimeofday() I
> discovered a problem with string_of_float:
> 
> # string_of_float 123456789.123456789;;
> - : string = "123456789.123"
> 
> OK, it may be just an inaccuracy of the float type. However,
> sprintf returns a different result:
> 
> # sprintf "%f" 123456789.123456789;;
> - : string = "123456789.1234567"

This is not an inaccuracy of the underlying float value, which is
indeed double-precision, but a consequence of the definition of
string_of_float, which is essentially sprintf "%.12g".  So, you get a
different rounding than sprintf "%f".  Here, %f is more precise, but
not always (try with 1e-12 for instance).
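
To make the comparison concrete, here is a small sketch that formats
the two values discussed above with both formats.  The names g12 and
fixed are made up for this illustration, and g12 only approximates
string_of_float under the "%.12g" description given above.

```ocaml
(* "%.12g" keeps 12 significant digits in total, which is essentially
   what string_of_float does according to the explanation above. *)
let g12 (x : float) : string = Printf.sprintf "%.12g" x

(* "%f" keeps a fixed number of digits after the decimal point. *)
let fixed (x : float) : string = Printf.sprintf "%f" x

let () =
  let big = 123456789.123456789 and tiny = 1e-12 in
  (* For the large value, %f shows more digits than %.12g. *)
  Printf.printf "%%.12g: %s   %%f: %s\n" (g12 big) (fixed big);
  (* For the tiny value, %f loses everything while %.12g does not. *)
  Printf.printf "%%.12g: %s   %%f: %s\n" (g12 tiny) (fixed tiny)
```

With 1e-12, the fixed-point format has no digit left to show, while
"%.12g" switches to exponent notation and keeps the value.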

> My application needs to be fast (that's why I am using OCaml :-) and
> sprintf is of course slower than string_of_float.

Not by much.  I recommend that you use sprintf with the floating-point
format appropriate for your application, e.g. "%.3f" for printing
times with a millisecond precision.  
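
A minimal sketch of that recommendation, assuming times are held as
floats in seconds (as Unix.gettimeofday returns them).  The helper
name string_of_seconds and the sample timestamp are invented for the
example.

```ocaml
(* Hypothetical helper: print a time in seconds with millisecond
   precision, i.e. exactly three digits after the decimal point. *)
let string_of_seconds (t : float) : string = Printf.sprintf "%.3f" t

let () =
  (* A made-up timestamp standing in for Unix.gettimeofday (). *)
  let t = 1020156804.123456 in
  print_endline (string_of_seconds t)
```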

Now, you might wonder why string_of_float doesn't "do the right thing"
and print its float argument with as many digits as necessary to ensure
e.g. float_of_string(string_of_float f) = f.  The main reason is
pragmatic: OCaml's float-to-string conversions are built on top of the
sprintf() function from the C runtime library, and the latter doesn't
provide a "print a float with enough digits to represent it exactly"
format.  David Chase mentioned some third-party source that does this
(thanks for the pointer); I wish the C library would provide something
like this.
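
Lacking such a format, one workaround is to retry with increasing
precision until the text reads back as the same float.  The sketch
below is only an illustration, not how string_of_float works: the
helper round_trip_string is hypothetical, and it assumes the C
library's conversions are correctly rounded, in which case 17
significant digits always suffice for a double.

```ocaml
(* Hypothetical helper: try 12, 13, ... significant digits and stop at
   the first text that converts back to exactly the same float.
   The search is capped at 17 digits. *)
let round_trip_string (f : float) : string =
  let rec try_prec prec =
    let s = Printf.sprintf "%.*g" prec f in
    if prec >= 17 || float_of_string s = f then s else try_prec (prec + 1)
  in
  try_prec 12

let () =
  let f = 123456789.123456789 in
  Printf.printf "string_of_float round-trips:   %b\n"
    (float_of_string (string_of_float f) = f);
  Printf.printf "round_trip_string round-trips: %b\n"
    (float_of_string (round_trip_string f) = f)
```

This costs several sprintf calls in the worst case, so it is not a
substitute for a dedicated shortest-representation algorithm.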

There might be a more philosophical issue behind this.  For a
numerical analyst, or physicist, or experimental scientist in general,
floating-point numbers are just approximations of experimental
measures, or results of computations on these approximations.
Hence, there is no such thing as "the" string
representation of a floating-point value: not all digits are meaningful,
and how many significant digits to print depends on the physical
problem being modeled and solved.  With this viewpoint,
string_of_float doesn't make any sense, and you should always use
sprintf with the float format appropriate for your problem.

Then, there is the computer engineering viewpoint on floating-point
numbers, which are collections of (say) 64 bits with well-defined (if
a bit convoluted) operations on them such as addition, multiplication,
etc, as specified in IEEE 754.  From this viewpoint, it makes sense to
have conversions to and from strings that are exact, i.e. without
information loss.  (It is feasible, but much harder than it sounds;
there were two full papers at PLDI 94 (?) on this problem.)
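
As a crude illustration of the "collection of 64 bits" viewpoint, the
sketch below prints the raw IEEE 754 bit pattern in hexadecimal and
reads it back.  This sidesteps the hard part (an exact *decimal*
conversion) entirely; the helper names bits_string and
float_of_bits_string are made up for the example.

```ocaml
(* Hypothetical helper: the 64-bit IEEE 754 pattern of a float as
   16 hexadecimal digits.  Unreadable, but loses no information. *)
let bits_string (f : float) : string =
  Printf.sprintf "%016Lx" (Int64.bits_of_float f)

(* Hypothetical inverse: parse the 16 hex digits back into a float. *)
let float_of_bits_string (s : string) : float =
  Int64.float_of_bits (Int64.of_string ("0x" ^ s))

let () =
  let f = 123456789.123456789 in
  let s = bits_string f in
  Printf.printf "%s -> same float: %b\n" s (float_of_bits_string s = f)
```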

I'm not taking sides here, just noticing that Java takes the computer
engineering viewpoint and C (and Caml, by inheritance of
implementation :-) takes the physicist's viewpoint...

- Xavier Leroy
