caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Nicolas George <nicolas.george@ens.fr>
To: Caml mailing list <caml-list@inria.fr>
Cc: David MENTRE <dmentre@linux-france.org>
Subject: Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
Date: Thu, 16 Jun 2005 21:02:07 +0200	[thread overview]
Message-ID: <20050616190206.GA553@clipper.ens.fr> (raw)
In-Reply-To: <87slzima67.fsf@linux-france.org>

[-- Attachment #1: Type: text/plain, Size: 2369 bytes --]

L'octidi 28 prairial, an CCXIII, David MENTRE a écrit :
>  1. convert between big and little endian 32 bits integers;

Don't do that.

>  2. convert between 32 bits integers and string binary representation
>     (to store integers in Buffer and string data structures);

What you mean to do is represent an integer in a bounded interval as a
fixed-length sequence of finite-valued objects. Said that way, children
learn how to do it in school: it's writing the number in some base. Since
bytes in a string can take 256 values, one will obviously use base 256.

The first (rightmost) "digit" will be (n mod 256).
The second "digit" will be ((n / 256) mod 256).
The third "digit" will be ((n / (256 * 256)) mod 256)
The fourth (leftmost) "digit" will be ((n / (256 * 256 * 256)) mod 256).

And so on, but since your numbers are less than 256*256*256*256, all
remaining "digits" are 0. So all you have to do is store these four bytes in
your string, in any order you may prefer.

"Big endian" is when you store the fourth, the third, the second and the
first; it is the nearest to the way we humans write numbers; and the lexical
order is the same as the numeric order. "Small endian" is when you store the
first, the second, the third and the fourth.

But, and that is important, this does not depend on the hardware it runs on:
it is purely arithmetic.

The reverse operation is simply

n = d1 + d2 * 256 + d3 * 256 * 256 + d4 * 256 * 256 * 256

>  3. detect machine endianness at runtime.

Don't do that. I develop: there are no guarantees that numbers are either in
big or little endian. I have heard that some architectures exist where
8-bits bytes in 16-bits words are in little endian, but 16-bits words in
32-bit words are in big endian, which gives 3412 as a global order.

Using the internal representation of integers can so never be reliable. On
the contrary, compilers ensure that arithmetic in reasonable interval is the
real Peano arithmetic, for all architectures.

Using the internal representation of numbers may allow to gain some cycles
on the packing-unpacking, but it is probably nothing in regard to anything
that will be done with the data (disc access or network for example).
Furthermore, if you have to worry about inverting the order of the bytes in
the number, the gain will be even smaller.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 185 bytes --]

  reply	other threads:[~2005-06-16 19:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-16 18:25 David MENTRE
2005-06-16 19:02 ` Nicolas George [this message]
2005-06-16 19:32   ` [Caml-list] " David MENTRE
2005-06-16 20:14     ` Nicolas George
2005-06-17  7:29     ` Luca Pascali
2005-06-16 22:25 ` Nicolas Cannasse
2005-06-17 18:10   ` David MENTRE

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050616190206.GA553@clipper.ens.fr \
    --to=nicolas.george@ens.fr \
    --cc=caml-list@inria.fr \
    --cc=dmentre@linux-france.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).