caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* ocaml, int32/64, bigarray and unsigned values ...
@ 2005-04-11  7:46 Sven Luther
  2005-04-11 12:57 ` [Caml-list] " Eric Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Sven Luther @ 2005-04-11  7:46 UTC (permalink / raw)
  To: caml-list, sven.luther

Hello,

I had plans to do a rewrite of GNU parted, a project which i am involved with,
in ocaml, and am being blocked by a few issues. 

I know i can read disk sectors easily with the large-file support, which would
mean that i could support all underlying files that the ocaml standard library
supports, as opposed to parted which has some special code for linux, hurd,
and a couple of others. This would probably mean that if i did it right, i
could even use said library on windows, haven't investigated though.

Now to my problems, which are basically two.

  1) most disk partition tables and filesystem have a mapping from a given
  disk 512 byte sector to a descriptive structure. In C you simply define the
  structure which corresponds to it, and you cast the sector to it, and then
  test if some magic numbers and checksums are or not enough to identify the
  sector as of the given type. The nearest to that would be either trying to
  use the bigarray infrastructure and mmap capability, but it only makes
  provision for mapping arrays and not structures. The other possibilities is
  to either have some C bindings which do the proper cast, or to have access
  functions which transform parts of a byte array into values. The first one
  is ugly, as i was aiming for a purely ocaml solution (so i can build and
  arch/plateform independent bytecode tool), and the second would probably be
  a disaster speed wise, and also somewhat ugly unless properly encapsulated
  in an abstract module.

Which brings me to the second problem.

  2) Disk descriptors like partition table and filesystems, need to have exact
  values, and the values are mostly unsigned 8, 16, 32 or 64 bit integers,
  strings and bit fields. The int64 and int32 offer these kind of values, but
  only the signed version. Is it save to make calculation on a signed number
  and ignoring the sign bit ? Does this not cause risk of overflow ? I am not
  particularly knowledgeable of the different signed/unsigned implementations
  on the different architectures and plateform that i would need to support.
  Also, i believe that bit fields are not easily available, altough there is
  some support in the Int32 and int64 bit-wise operators, but again we have
  the signed vs unsigned problem, altough it is maybe ignored for bit
  operations ?

These two questions also are of importance if you want to write chip drivers
in ocaml, since you have to mmap the mmio registers of the chips, and have a
similar exact access to the registers used, altough the registers should fall
better in the bigarray mapping, since you mostly access those as 8, 16, 32, 64
or even 128 bit values.

But maybe ocaml 3.09 could have direct support for these kind of operations,
opening a new field of usage for ocaml ?

Friendly,

Sven Luther


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11  7:46 ocaml, int32/64, bigarray and unsigned values Sven Luther
@ 2005-04-11 12:57 ` Eric Cooper
  2005-04-11 15:35   ` Sven Luther
  2005-04-12 17:19   ` [Caml-list] ocaml, int32/64, bigarray and unsigned values Paul Snively
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Cooper @ 2005-04-11 12:57 UTC (permalink / raw)
  To: caml-list

On Mon, Apr 11, 2005 at 09:46:19AM +0200, Sven Luther wrote:
> I had plans to do a rewrite of GNU parted, a project which i am
> involved with, in ocaml, and am being blocked by a few issues.
> [...]
>   1) most disk partition tables and filesystem have a mapping from a
>   given disk 512 byte sector to a descriptive structure.
>   [...]
>   or to have access functions which transform parts of
>   a byte array into values. The first one is ugly, as i was aiming
>   for a purely ocaml solution (so i can build and arch/plateform
>   independent bytecode tool), and the second would probably be a
>   disaster speed wise, and also somewhat ugly unless properly
>   encapsulated in an abstract module.

I would use the second approach.  I would define a logically
equivalent OCaml record or class, and conversion functions between
that object and a string + offset (or Bigarray of bytes, plus
offset).  Passing around an offset into a larger byte array can save a
lot of copying.

You can probably structure your code so that you only convert to/from
bytes in a few places, not likely to be performance-critical.

> Which brings me to the second problem.
> 
>   2) Disk descriptors like partition table and filesystems, need to
>   have exact values, and the values are mostly unsigned 8, 16, 32 or
>   64 bit integers, strings and bit fields. The int64 and int32 offer
>   these kind of values, but only the signed version. Is it save to
>   make calculation on a signed number and ignoring the sign bit ?
>   Does this not cause risk of overflow ?

That's the beauty of 2's-complement representation of signed numbers.
The sign bit is just a consequence of which half of the values encode
negative numbers, from -1 (0xFF...FF) to min_int (0x80...00), so the
leading bit is the sign bit.  You can just do arithmetic and interpret
the results as unsigned.

>   Also, i believe that bit fields are not easily available, altough
>   there is some support in the Int32 and int64 bit-wise operators,
>   but again we have the signed vs unsigned problem, altough it is
>   maybe ignored for bit operations ?

You can do anything you need with shifting and masking.  That should
probably also be hidden in the bytearray-to-record conversion
routines.

It would be very cool to have such a "hard core" utility as a
disk partition editor in OCaml!

-- 
Eric Cooper             e c c @ c m u . e d u


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11 12:57 ` [Caml-list] " Eric Cooper
@ 2005-04-11 15:35   ` Sven Luther
  2005-04-11 16:13     ` Eric Cooper
                       ` (3 more replies)
  2005-04-12 17:19   ` [Caml-list] ocaml, int32/64, bigarray and unsigned values Paul Snively
  1 sibling, 4 replies; 9+ messages in thread
From: Sven Luther @ 2005-04-11 15:35 UTC (permalink / raw)
  To: caml-list

On Mon, Apr 11, 2005 at 08:57:05AM -0400, Eric Cooper wrote:
> On Mon, Apr 11, 2005 at 09:46:19AM +0200, Sven Luther wrote:
> > I had plans to do a rewrite of GNU parted, a project which i am
> > involved with, in ocaml, and am being blocked by a few issues.
> > [...]
> >   1) most disk partition tables and filesystem have a mapping from a
> >   given disk 512 byte sector to a descriptive structure.
> >   [...]
> >   or to have access functions which transform parts of
> >   a byte array into values. The first one is ugly, as i was aiming
> >   for a purely ocaml solution (so i can build and arch/plateform
> >   independent bytecode tool), and the second would probably be a
> >   disaster speed wise, and also somewhat ugly unless properly
> >   encapsulated in an abstract module.
> 
> I would use the second approach.  I would define a logically
> equivalent OCaml record or class, and conversion functions between
> that object and a string + offset (or Bigarray of bytes, plus
> offset).  Passing around an offset into a larger byte array can save a
> lot of copying.
> 
> You can probably structure your code so that you only convert to/from
> bytes in a few places, not likely to be performance-critical.

Mmm, one could imagine a generic set of access function inside a byte array
(would have to handle endianess and such though), and then a structure defined
as a set of lazy values corresponding to the access functions in question, so
only values actually accessed get computed.

That said, 

> > Which brings me to the second problem.
> > 
> >   2) Disk descriptors like partition table and filesystems, need to
> >   have exact values, and the values are mostly unsigned 8, 16, 32 or
> >   64 bit integers, strings and bit fields. The int64 and int32 offer
> >   these kind of values, but only the signed version. Is it save to
> >   make calculation on a signed number and ignoring the sign bit ?
> >   Does this not cause risk of overflow ?
> 
> That's the beauty of 2's-complement representation of signed numbers.
> The sign bit is just a consequence of which half of the values encode
> negative numbers, from -1 (0xFF...FF) to min_int (0x80...00), so the
> leading bit is the sign bit.  You can just do arithmetic and interpret
> the results as unsigned.

Ok, but it would be nice to tell this black on white in the manual. I was
half-guessing that something such was the case, but wasn't entirely sure about
the fact, and as well, partitioning is very sensitive stuff, i wanted to be
sure.

Now, what about conversion to Int32 or Int64 ? Would an unsigned Int32 which
is represented as a negative signed Int32 not get broken when used to
calculate Int64 values ? And what about comparisons ? Obviously max_int + 1 >
max_int will be wrong since max_int + 1 would be considered a negative number
(-0 maybe ?).

> >   Also, i believe that bit fields are not easily available, altough
> >   there is some support in the Int32 and int64 bit-wise operators,
> >   but again we have the signed vs unsigned problem, altough it is
> >   maybe ignored for bit operations ?
> 
> You can do anything you need with shifting and masking.  That should
> probably also be hidden in the bytearray-to-record conversion
> routines.

Yeah, bit shifting should be ok, since the sign is ignored for those.

> It would be very cool to have such a "hard core" utility as a
> disk partition editor in OCaml!

Yep, altough having to do ugly hacks in the first part to map the sectors to
ocaml structures is not a good advertizement once you want to convince C users
that it is a better implementation.

Also, the next difficulty is providing C callbacks which are compatible with
libparted.

Friendly,

Sven Luther


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11 15:35   ` Sven Luther
@ 2005-04-11 16:13     ` Eric Cooper
  2005-04-13  6:54     ` Florian Hars
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Eric Cooper @ 2005-04-11 16:13 UTC (permalink / raw)
  To: caml-list

On Mon, Apr 11, 2005 at 05:35:51PM +0200, Sven Luther wrote:
> Now, what about conversion to Int32 or Int64 ? Would an unsigned
> Int32 which is represented as a negative signed Int32 not get broken
> when used to calculate Int64 values ?

You'll have to watch out for sign-extension: when a signed integer is
widened, the leading bits get filled with 1s to preserve the sign.
That's the wrong behavior if you want to widen an unsigned integer.
The Int{32,64} modules don't seem to have of_unsigned_int functions,
but you can simulate them by checking if the result is negative and
adjusting it (by adding 2^n).

> And what about comparisons ?

Right, you'll have to define your own, because for example -1 < 0,
but you want 0 < 0xFF...FF.  You can just test for negative numbers to
simulate it yourself (since any negative int is greater than any
positive int when treating them as unsigned, otherwise the native int
comparison works).

> Obviously max_int + 1 > max_int will be wrong since max_int + 1
> would be considered a negative number (-0 maybe ?).

Well, max_int + 1 = min_int, but that's what you want when that bit pattern is
interpreted as unsigned.  The only incorrect results will come from
overflow, which silently "wraps around" just like in C.

-- 
Eric Cooper             e c c @ c m u . e d u


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11 12:57 ` [Caml-list] " Eric Cooper
  2005-04-11 15:35   ` Sven Luther
@ 2005-04-12 17:19   ` Paul Snively
  1 sibling, 0 replies; 9+ messages in thread
From: Paul Snively @ 2005-04-12 17:19 UTC (permalink / raw)
  To: Eric Cooper; +Cc: caml-list


On Apr 11, 2005, at 5:57 AM, Eric Cooper wrote:

> It would be very cool to have such a "hard core" utility as a
> disk partition editor in OCaml!
>
Let me ditto this: a lot of people are falling all over the new book, 
"Practical Common Lisp," because it talks about things like parsing ID3 
tags from MP3 files, and so talks about the general process of parsing 
binary files in Common Lisp. What you're trying to do is very similar, 
and in fact I'd love to see a "general module" for defining, reading, 
and writing C/C++-style structures to/from binary files. parted sounds 
like a great example of that kind of work. Thanks for digging into it!

> -- 
> Eric Cooper             e c c @ c m u . e d u
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
Best regards,
Paul Snively


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11 15:35   ` Sven Luther
  2005-04-11 16:13     ` Eric Cooper
@ 2005-04-13  6:54     ` Florian Hars
  2005-04-13 18:28     ` Ken Rose
  2005-05-25  6:06     ` partition tables and ocaml Taras
  3 siblings, 0 replies; 9+ messages in thread
From: Florian Hars @ 2005-04-13  6:54 UTC (permalink / raw)
  To: Sven Luther; +Cc: caml-list

Sven Luther wrote:

> Ok, but it would be nice to tell this black on white in the manual.

It is. "All arithmetic operations over int64 are taken modulo 2^64" ;-)

The core of a solution would be something like

open Int64

let of_uint32 n =
   if Int32.compare n Int32.zero >= 0 then
     of_int32 n
   else
     add (of_int32 n) 4294967296L

let compare a b =
   if logand min_int (logxor a b) = min_int then
     if logand min_int a = min_int then 1 else (-1)
   else
     Pervasives.compare a b

let to_string n = format "%u" n

But you can't enter numbers larger than Int64.max_int that way, this would 
require a C function and a fix to the compiler:

         Objective Caml version 3.08.3

# #load "uInt64.cma";;
# Int64.compare Int64.max_int (Int64.succ Int64.max_int);;
- : int = 1
# UInt64.compare Int64.max_int (Int64.succ Int64.max_int);;
- : int = -1
# Int64.of_int32 (Int32.min_int);;
- : int64 = -2147483648L
# UInt64.of_uint32 (Int32.min_int);;
- : int64 = 2147483648L
# UInt64.of_uint32 (Int32.of_int (-1));;
- : int64 = 4294967295L
# UInt64.to_string (-1L);;
- : string = "18446744073709551615"
# Int64.of_string (UInt64.to_string (-1L));;
Exception: Failure "int_of_string".
# Int64.to_string 18446744073709551615L;;
Integer literal exceeds the range of representable integers of type int64

See http://pauillac.inria.fr/bin/caml-bugs/feature%20wish?id=2928
Strangely enough, ocaml accepts max_int + 1 although that literal exceeds the
range of representable integers, too:

# UInt64.to_string Int64.min_int;;
- : string = "9223372036854775808"
# Int64.to_string 9223372036854775808L;;
- : string = "-9223372036854775808"
# Int64.to_string 9223372036854775809L;;
Integer literal exceeds the range of representable integers of type int64
# min_int;;
- : int = -4611686018427387904
# 4611686018427387904;;
- : int = -4611686018427387904
# 4611686018427387905;;
Integer literal exceeds the range of representable integers of type int


Yours, Florian.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Caml-list] ocaml, int32/64, bigarray and unsigned values ...
  2005-04-11 15:35   ` Sven Luther
  2005-04-11 16:13     ` Eric Cooper
  2005-04-13  6:54     ` Florian Hars
@ 2005-04-13 18:28     ` Ken Rose
  2005-05-25  6:06     ` partition tables and ocaml Taras
  3 siblings, 0 replies; 9+ messages in thread
From: Ken Rose @ 2005-04-13 18:28 UTC (permalink / raw)
  To: caml-list

Sven Luther wrote:

> Yeah, bit shifting should be ok, since the sign is ignored for those.

But be careful about right shifts, which are different signed than 
unsigned.  Signed repeats the sign bit, while unsigned shifts in zeroes. 
  If you mask afterwards, it should be fine.

  - ken


^ permalink raw reply	[flat|nested] 9+ messages in thread

* partition tables and ocaml
  2005-04-11 15:35   ` Sven Luther
                       ` (2 preceding siblings ...)
  2005-04-13 18:28     ` Ken Rose
@ 2005-05-25  6:06     ` Taras
  2005-05-25  9:09       ` Sven Luther
  3 siblings, 1 reply; 9+ messages in thread
From: Taras @ 2005-05-25  6:06 UTC (permalink / raw)
  To: Sven Luther; +Cc: caml-list

Sven Luther wrote:

>On Mon, Apr 11, 2005 at 08:57:05AM -0400, Eric Cooper wrote:
>  
>
>>On Mon, Apr 11, 2005 at 09:46:19AM +0200, Sven Luther wrote:
>>    
>>
>>>I had plans to do a rewrite of GNU parted, a project which i am
>>>involved with, in ocaml, and am being blocked by a few issues.
>>>[...]
>>>  1) most disk partition tables and filesystem have a mapping from a
>>>  given disk 512 byte sector to a descriptive structure.
>>>  [...]
>>>  or to have access functions which transform parts of
>>>  a byte array into values. The first one is ugly, as i was aiming
>>>  for a purely ocaml solution (so i can build and arch/plateform
>>>  independent bytecode tool), and the second would probably be a
>>>  disaster speed wise, and also somewhat ugly unless properly
>>>  encapsulated in an abstract module.
>>>      
>>>
>>I would use the second approach.  I would define a logically
>>equivalent OCaml record or class, and conversion functions between
>>that object and a string + offset (or Bigarray of bytes, plus
>>offset).  Passing around an offset into a larger byte array can save a
>>lot of copying.
>>
>>You can probably structure your code so that you only convert to/from
>>bytes in a few places, not likely to be performance-critical.
>>    
>>
>
>Mmm, one could imagine a generic set of access function inside a byte array
>(would have to handle endianess and such though), and then a structure defined
>as a set of lazy values corresponding to the access functions in question, so
>only values actually accessed get computed.
>
>That said, 
>  
>
Hello,
Is a port of Parted to OCaml still in the works? In case you are still 
wondering, I would like to show out that OCaml can indeed deal with 
partitions in a painless manner.
See http://glek.net/subversion/os/kernel/modules/PartitionTable.ml
The read_tables function is capable of doing primary and extended 
partition table parsing in next to no code. Obviously Parted would need 
a bit more code than that, but things wouldn't get much hairier since 
partition entries are only 16 bytes and that code parses 6 of them :)

Cheers,
Taras


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: partition tables and ocaml
  2005-05-25  6:06     ` partition tables and ocaml Taras
@ 2005-05-25  9:09       ` Sven Luther
  0 siblings, 0 replies; 9+ messages in thread
From: Sven Luther @ 2005-05-25  9:09 UTC (permalink / raw)
  To: Taras; +Cc: Sven Luther, caml-list

On Tue, May 24, 2005 at 11:06:12PM -0700, Taras wrote:
> Sven Luther wrote:
> 
> >On Mon, Apr 11, 2005 at 08:57:05AM -0400, Eric Cooper wrote:
> > 
> >
> >>On Mon, Apr 11, 2005 at 09:46:19AM +0200, Sven Luther wrote:
> >>   
> >>
> >>>I had plans to do a rewrite of GNU parted, a project which i am
> >>>involved with, in ocaml, and am being blocked by a few issues.
> >>>[...]
> >>> 1) most disk partition tables and filesystem have a mapping from a
> >>> given disk 512 byte sector to a descriptive structure.
> >>> [...]
> >>> or to have access functions which transform parts of
> >>> a byte array into values. The first one is ugly, as i was aiming
> >>> for a purely ocaml solution (so i can build and arch/plateform
> >>> independent bytecode tool), and the second would probably be a
> >>> disaster speed wise, and also somewhat ugly unless properly
> >>> encapsulated in an abstract module.
> >>>     
> >>>
> >>I would use the second approach.  I would define a logically
> >>equivalent OCaml record or class, and conversion functions between
> >>that object and a string + offset (or Bigarray of bytes, plus
> >>offset).  Passing around an offset into a larger byte array can save a
> >>lot of copying.
> >>
> >>You can probably structure your code so that you only convert to/from
> >>bytes in a few places, not likely to be performance-critical.
> >>   
> >>
> >
> >Mmm, one could imagine a generic set of access function inside a byte array
> >(would have to handle endianess and such though), and then a structure 
> >defined
> >as a set of lazy values corresponding to the access functions in question, 
> >so
> >only values actually accessed get computed.
> >
> >That said, 
> > 
> >
> Hello,
> Is a port of Parted to OCaml still in the works? In case you are still 

Yes, altough i am a bit short on time right now.

> wondering, I would like to show out that OCaml can indeed deal with 
> partitions in a painless manner.
> See http://glek.net/subversion/os/kernel/modules/PartitionTable.ml

Cool, will look at this. Also maybe you would be interested in cooperating on
the project ? 

> The read_tables function is capable of doing primary and extended 
> partition table parsing in next to no code. Obviously Parted would need 
> a bit more code than that, but things wouldn't get much hairier since 
> partition entries are only 16 bytes and that code parses 6 of them :)

Yes, but fdisk/mbr like partition tables are trivial, parted supports many
kind of partition table, among them the mac/amiga ones which i am most
familiar with, and which handle linked lists of partitions.

Friendly,

Sven Luther


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2005-05-25  9:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-04-11  7:46 ocaml, int32/64, bigarray and unsigned values Sven Luther
2005-04-11 12:57 ` [Caml-list] " Eric Cooper
2005-04-11 15:35   ` Sven Luther
2005-04-11 16:13     ` Eric Cooper
2005-04-13  6:54     ` Florian Hars
2005-04-13 18:28     ` Ken Rose
2005-05-25  6:06     ` partition tables and ocaml Taras
2005-05-25  9:09       ` Sven Luther
2005-04-12 17:19   ` [Caml-list] ocaml, int32/64, bigarray and unsigned values Paul Snively

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).