caml-list - the Caml user's mailing list
* How to handle endianness and binary string conversion for 32 bits integers (Int32)?
@ 2005-06-16 18:25 David MENTRE
  2005-06-16 19:02 ` [Caml-list] " Nicolas George
  2005-06-16 22:25 ` Nicolas Cannasse
  0 siblings, 2 replies; 7+ messages in thread
From: David MENTRE @ 2005-06-16 18:25 UTC (permalink / raw)
  To: caml-list

Hello,

I would like to:

 1. convert between big and little endian 32 bits integers;

 2. convert between 32 bits integers and string binary representation
    (to store integers in Buffer and string data structures);

 3. detect machine endianness at runtime.

How could I do that? My main concern is point 2.

I know some of my issues have been already discussed on this list but I
have been unable to google them.

Yours,
d.
-- 
pub  1024D/A3AD7A2A 2004-10-03 David MENTRE <dmentre@linux-france.org>
 5996 CC46 4612 9CA4 3562  D7AC 6C67 9E96 A3AD 7A2A



* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 18:25 How to handle endianness and binary string conversion for 32 bits integers (Int32)? David MENTRE
@ 2005-06-16 19:02 ` Nicolas George
  2005-06-16 19:32   ` David MENTRE
  2005-06-16 22:25 ` Nicolas Cannasse
  1 sibling, 1 reply; 7+ messages in thread
From: Nicolas George @ 2005-06-16 19:02 UTC (permalink / raw)
  To: Caml mailing list; +Cc: David MENTRE


On Octidi 28 Prairial, year CCXIII, David MENTRE wrote:
>  1. convert between big and little endian 32 bits integers;

Don't do that.

>  2. convert between 32 bits integers and string binary representation
>     (to store integers in Buffer and string data structures);

What you mean to do is represent an integer from a bounded interval as a
fixed-length sequence of finite-valued objects. Put that way, children
learn how to do it in school: it is writing the number in some base. Since
each byte in a string can take 256 values, one will obviously use base 256.

The first (rightmost) "digit" will be (n mod 256).
The second "digit" will be ((n / 256) mod 256).
The third "digit" will be ((n / (256 * 256)) mod 256)
The fourth (leftmost) "digit" will be ((n / (256 * 256 * 256)) mod 256).

And so on, but since your numbers are less than 256*256*256*256, all
remaining "digits" are 0. So all you have to do is store these four bytes in
your string, in any order you may prefer.

"Big endian" is when you store the fourth, the third, the second and the
first; it is the closest to the way we humans write numbers, and the
lexicographic order of the strings matches the numeric order. "Little endian"
is when you store the first, the second, the third and the fourth.

But, and that is important, this does not depend on the hardware it runs on:
it is purely arithmetic.

The reverse operation is simply

n = d1 + d2 * 256 + d3 * 256 * 256 + d4 * 256 * 256 * 256
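
A minimal OCaml sketch of the base-256 arithmetic above, assuming a 64-bit
platform where a native int can hold every value from 0 to 256^4 - 1. The
names digits_of_int and int_of_digits are illustrative and do not come from
the thread:

```ocaml
(* Assumes a 64-bit OCaml, so that 0xFFFF_FFFF fits in a native int. *)

(* Split n (0 <= n < 256^4) into its four base-256 "digits",
   returned most significant first (big-endian order). *)
let digits_of_int n =
  if n < 0 || n > 0xFFFF_FFFF then invalid_arg "digits_of_int";
  let d1 = n mod 256 in
  let d2 = (n / 256) mod 256 in
  let d3 = (n / (256 * 256)) mod 256 in
  let d4 = (n / (256 * 256 * 256)) mod 256 in
  (d4, d3, d2, d1)

(* The reverse operation: n = d1 + d2*256 + d3*256^2 + d4*256^3. *)
let int_of_digits (d4, d3, d2, d1) =
  d1 + d2 * 256 + d3 * 256 * 256 + d4 * 256 * 256 * 256
```

Nothing here inspects how the machine stores integers; the byte order is
decided only by the order in which the four digits are written out.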

>  3. detect machine endianness at runtime.

Don't do that. Let me elaborate: there is no guarantee that numbers are stored
in either big or little endian. I have heard that some architectures exist
where 8-bit bytes in 16-bit words are little endian, but 16-bit words in
32-bit words are big endian, which gives 3412 as the overall order.

Relying on the internal representation of integers can therefore never be
reliable. By contrast, compilers ensure that arithmetic within a reasonable
interval is true Peano arithmetic, on all architectures.

Using the internal representation of numbers may save a few cycles on
packing and unpacking, but that is probably negligible compared with whatever
else is done with the data (disk or network access, for example).
Furthermore, if you have to worry about reversing the order of the bytes in
the number, the gain will be even smaller.


* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 19:02 ` [Caml-list] " Nicolas George
@ 2005-06-16 19:32   ` David MENTRE
  2005-06-16 20:14     ` Nicolas George
  2005-06-17  7:29     ` Luca Pascali
  0 siblings, 2 replies; 7+ messages in thread
From: David MENTRE @ 2005-06-16 19:32 UTC (permalink / raw)
  To: Nicolas George; +Cc: Caml mailing list

Hello Nicolas,

Nicolas George <nicolas.george@ens.fr> writes:

>>  1. convert between big and little endian 32 bits integers;
>
> Don't do that.

Except that I'm writing a network interface that should be specified so
that externally written programs can read my data.

In fact, your post reminded me of a chapter of "The Practice of
Programming" by Kernighan and Pike. They advise reading and writing integers
byte by byte, using 8-bit masking and shifting.

So I'll follow that approach, which is close to yours and is independent
of machine endianness.
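
A rough sketch of that masking-and-shifting approach applied to a Buffer, as
asked for in the original question. The function names buffer_add_int32_be
and int32_of_be_at are hypothetical, chosen only for this illustration:

```ocaml
(* Append the four bytes of n to buf, most significant byte first. *)
let buffer_add_int32_be buf (n : int32) =
  let byte shift =
    Char.chr
      (Int32.to_int (Int32.logand (Int32.shift_right_logical n shift) 0xFFl))
  in
  List.iter (fun shift -> Buffer.add_char buf (byte shift)) [24; 16; 8; 0]

(* Read the four bytes of s starting at offset pos as a big-endian int32.
   Raises Invalid_argument if fewer than four bytes are available. *)
let int32_of_be_at (s : string) (pos : int) : int32 =
  let byte i = Int32.of_int (Char.code s.[pos + i]) in
  let ( <<< ) = Int32.shift_left and ( ||| ) = Int32.logor in
  (byte 0 <<< 24) ||| (byte 1 <<< 16) ||| (byte 2 <<< 8) ||| byte 3
```

Changing the shift list to [0; 8; 16; 24] would give the little-endian
layout instead; the choice belongs to the wire format, not to the machine.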

Thank you for the tip, ;-)
Yours,
d.
-- 
pub  1024D/A3AD7A2A 2004-10-03 David MENTRE <dmentre@linux-france.org>
 5996 CC46 4612 9CA4 3562  D7AC 6C67 9E96 A3AD 7A2A



* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 19:32   ` David MENTRE
@ 2005-06-16 20:14     ` Nicolas George
  2005-06-17  7:29     ` Luca Pascali
  1 sibling, 0 replies; 7+ messages in thread
From: Nicolas George @ 2005-06-16 20:14 UTC (permalink / raw)
  To: David MENTRE; +Cc: Caml mailing list


On Octidi 28 Prairial, year CCXIII, David MENTRE wrote:
> Except that I'm writing a network interface that should be specified so
> that externally written programs can read my data.

My point is: you do not need to convert between big and little endian, you
only have to convert between sequences of bytes and integers. You never even
need to know that integers are stored as bytes in the memory of the
computer.

In fact, I expect that you will not find any way to access the internal
representation of integers in OCaml, which is a good thing. Most of the
portability bugs in C come from people who have not understood the principle
of having access to the internal representation of objects (which is not to
use it :-). Alas, a lot of books about C are very x86-centric and explain
it really badly.

> In fact, your post reminded me of a chapter of "The Practice of
> Programming" by Kernighan and Pike. They advise to read/write integers
> byte by byte using 8 bit masking and shifting.
> 
> So I'll follow that approach, which is close to yours and is independent
> of machine endianness.

I think this is really exactly the same approach, and I am flattered to hold
the same position as Kernighan.


* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 18:25 How to handle endianness and binary string conversion for 32 bits integers (Int32)? David MENTRE
  2005-06-16 19:02 ` [Caml-list] " Nicolas George
@ 2005-06-16 22:25 ` Nicolas Cannasse
  2005-06-17 18:10   ` David MENTRE
  1 sibling, 1 reply; 7+ messages in thread
From: Nicolas Cannasse @ 2005-06-16 22:25 UTC (permalink / raw)
  To: caml-list, David MENTRE

> Hello,
> 
> I would like to:
> 
>  1. convert between big and little endian 32 bits integers;
> 
>  2. convert between 32 bits integers and string binary representation
>     (to store integers in Buffer and string data structures);

The ExtLib IO module has some code for that.
See http://ocaml-lib.sourceforge.net/doc/IO.html 

Nicolas



* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 19:32   ` David MENTRE
  2005-06-16 20:14     ` Nicolas George
@ 2005-06-17  7:29     ` Luca Pascali
  1 sibling, 0 replies; 7+ messages in thread
From: Luca Pascali @ 2005-06-17  7:29 UTC (permalink / raw)
  To: caml-list

David MENTRE wrote:

> Hello Nicolas,
>
> Nicolas George <nicolas.george@ens.fr> writes:
>
>>>  1. convert between big and little endian 32 bits integers;
>>
>> Don't do that.
>
> Except that I'm writing a network interface that should be specified so
> that externally written programs can read my data.
>
> In fact, your post reminded me of a chapter of "The Practice of
> Programming" by Kernighan and Pike. They advise to read/write integers
> byte by byte using 8 bit masking and shifting.
>
> So I'll follow that approach, which is close to yours and is independent
> of machine endianness.
>
> Thank you for the tip, ;-)
> Yours,
> d.

C provides functions for this purpose:
htons
htonl
ntohs
ntohl
(I don't remember whether there are others for other sizes.)

The hton_ functions convert numbers from machine endianness (whatever it is)
to network endianness; the ntoh_ functions do the opposite.

These functions are always present; they swap the bytes of the integer if
needed, and otherwise are just the identity.

I suggest using them instead of doing the same job in OCaml.


An alternative that does the job in OCaml is to convert the number to
hexadecimal with sprintf "%08x" and convert each pair of hex digits to a char
(remember the *0* in the format, so that the result is always an 8-character
string). You can then send the first pair (MSB) first to produce one
endianness, or send the last pair (LSB) first to produce the other.
On the receiving side, you read the characters and rebuild the number using
the arithmetic formula.

This way you do not need to know your local endianness, but you are sure
about the endianness of the link.
It is longer, but it works.
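
A small sketch of the hexadecimal route described above, assuming the value
is an Int32 (hence the "%08lx" format rather than "%08x"). The function name
bytes_of_int32_hex and its ~big_endian parameter are made up for this
illustration:

```ocaml
(* Format n as exactly 8 hex digits, then turn each pair of digits into
   one char. With ~big_endian:true the first pair (MSB) comes first;
   otherwise the last pair (LSB) comes first. *)
let bytes_of_int32_hex ~big_endian (n : int32) : string =
  let hex = Printf.sprintf "%08lx" n in
  let pair i = Char.chr (int_of_string ("0x" ^ String.sub hex (2 * i) 2)) in
  String.init 4 (fun i -> pair (if big_endian then i else 3 - i))
```

This goes through a temporary string, so it is slower than masking and
shifting, but it produces the same bytes.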

LP

-- 
*********************************************************************
Luca Pascali
luca@barettadeit.com
asxcaml-guru@barettadeit.com

http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. 02 370 111 55
fax. 02 370 111 54

Our technology:
http://www.asxcaml.org/
http://www.freerp.org/


	

	
		

* Re: [Caml-list] How to handle endianness and binary string conversion for 32 bits integers (Int32)?
  2005-06-16 22:25 ` Nicolas Cannasse
@ 2005-06-17 18:10   ` David MENTRE
  0 siblings, 0 replies; 7+ messages in thread
From: David MENTRE @ 2005-06-17 18:10 UTC (permalink / raw)
  To: Nicolas Cannasse; +Cc: caml-list

Hello,

"Nicolas Cannasse" <warplayer@free.fr> writes:

> Extlib IO module have some code about that,
> See http://ocaml-lib.sourceforge.net/doc/IO.html 

I've looked at this code but, from a quick look, it seems to handle only
OCaml int.

For what it's worth, here is my own wheel (placed here under a public domain
"license"):


<<timestamp.ml>>=
open Int32
@ 

\section{32 bits integer to/from string conversion}

Function [[be_of_int32]] converts an [[Int32]] into its big
endian binary representation.

<<timestamp.ml>>=
let be_of_int32 n =
  let byte_mask = of_int 0xff in
  let char_of_int32 x = Char.chr (to_int x) in
  (* d0 is the least significant byte, d3 the most significant *)
  let d0 = char_of_int32 (logand n byte_mask) in
  let d1 = char_of_int32 (logand (shift_right_logical n 8) byte_mask) in
  let d2 = char_of_int32 (logand (shift_right_logical n 16) byte_mask) in
  let d3 = char_of_int32 (logand (shift_right_logical n 24) byte_mask) in
  (* strings are immutable in current OCaml, so build the result in Bytes *)
  let big_endian = Bytes.make 4 d3 in
  Bytes.set big_endian 1 d2;
  Bytes.set big_endian 2 d1;
  Bytes.set big_endian 3 d0;
  Bytes.to_string big_endian
@ 


Function [[int32_of_be]] converts a big endian binary
representation of a 32 bits integer into an [[Int32]].

<<timestamp.ml>>=
let int32_of_be be =
  if String.length be <> 4 then 
   raise (Invalid_argument "int32_of_be");
  (* be.[0] is the most significant byte, be.[3] the least significant *)
  let d0 = of_int (Char.code be.[3])
  and d1 = of_int (Char.code be.[2]) 
  and d2 = of_int (Char.code be.[1]) 
  and d3 = of_int (Char.code be.[0]) in
  (logor (shift_left d3 24) 
     (logor (shift_left d2 16) 
        (logor (shift_left d1 8) d0)))
@ 



\section{Automatic tests}

<<timestamp.ml>>=
let _ =
  if Config.do_autotests then begin
    Printf.printf "  timestamp autotests...";
    assert(int32_of_be "\001\002\003\004" = of_string "0x01020304");
    assert(be_of_int32 (of_string "0x01020304") = "\001\002\003\004");
    assert(int32_of_be "\255\254\253\252" = of_string "0xfffefdfc");
    assert(be_of_int32 (of_string "0xfffefdfc") = "\255\254\253\252");
    Printf.printf "done\n"
  end
@ 
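
For readers on a recent compiler: the standard library has shipped
equivalent primitives since OCaml 4.08, well after this thread. The wrappers
below are only a sketch with invented names (be_of_int32_stdlib,
int32_of_be_stdlib); they should behave like the two functions above:

```ocaml
(* Requires OCaml 4.08 or later for Bytes.set_int32_be / get_int32_be. *)

(* Big-endian 4-byte string representation of an int32. *)
let be_of_int32_stdlib (n : int32) : string =
  let b = Bytes.create 4 in
  Bytes.set_int32_be b 0 n;
  Bytes.to_string b

(* Inverse conversion; rejects strings that are not exactly 4 bytes long. *)
let int32_of_be_stdlib (s : string) : int32 =
  if String.length s <> 4 then invalid_arg "int32_of_be_stdlib";
  Bytes.get_int32_be (Bytes.of_string s) 0
```

Buffer.add_int32_be, from the same release, appends the bytes directly to a
Buffer without an intermediate string.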


Yours,
d.
-- 
pub  1024D/A3AD7A2A 2004-10-03 David MENTRE <dmentre@linux-france.org>
 5996 CC46 4612 9CA4 3562  D7AC 6C67 9E96 A3AD 7A2A

