caml-list - the Caml user's mailing list
* picking / marshaling to strings in ocaml-revision-stable way
@ 2008-05-31  6:43 Luca de Alfaro
  2008-05-31  7:24 ` [Caml-list] " asmadeus77
  2008-05-31  8:43 ` Jacques Garrigue
  0 siblings, 2 replies; 16+ messages in thread
From: Luca de Alfaro @ 2008-05-31  6:43 UTC (permalink / raw)
  To: Inria Ocaml Mailing List

I need a way to convert  data structures to strings, in a way that is robust
with respect to different versions of Ocaml.
What I need to translate are mostly mixes of tuples, lists and variant
types.  A typical example of data to marshal/pickle may look like:

(3.4, [Move (4, 3, 5); Del (4, 2); Ins (4, 2)], "an example")

I heard that the marshaling of the module Marshal is not robust with respect
to changes in the version of Ocaml, and since I need to insert the data in a
database for long-term use, this is a serious drawback. I need the
marshaling and unmarshaling to be completely independent from the version of
Ocaml, and from the particular architecture where the marshaling occurs.
I could of course write my own solution, but I am wondering if there are any
suitable modules  available that I could use.

Many thanks!

Luca

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31  6:43 picking / marshaling to strings in ocaml-revision-stable way Luca de Alfaro
@ 2008-05-31  7:24 ` asmadeus77
  2008-05-31  8:43 ` Jacques Garrigue
  1 sibling, 0 replies; 16+ messages in thread
From: asmadeus77 @ 2008-05-31  7:24 UTC (permalink / raw)
  To: Luca de Alfaro; +Cc: Inria Ocaml Mailing List

Hello,
You can try OCaml's sexplib, which uses Lisp-like structures to store
data... And I don't think this will change anytime soon :)

(I don't know if it can store anything "worse" than a 3-tuple, but it
works with pairs and triples, lists, and arrays, which should be
enough for you)

Regards,
Dominique Martinet


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31  6:43 picking / marshaling to strings in ocaml-revision-stable way Luca de Alfaro
  2008-05-31  7:24 ` [Caml-list] " asmadeus77
@ 2008-05-31  8:43 ` Jacques Garrigue
  2008-05-31  9:38   ` Berke Durak
  1 sibling, 1 reply; 16+ messages in thread
From: Jacques Garrigue @ 2008-05-31  8:43 UTC (permalink / raw)
  To: luca; +Cc: caml-list

From: "Luca de Alfaro" <luca@dealfaro.org>

> I need a way to convert  data structures to strings, in a way that is robust
> with respect to different versions of Ocaml.
> What I need to translate are mostly mixes of tuples, lists and variant
> types.  A typical example of data to marshal/pickle may look like:
> 
> (3.4, [Move (4, 3, 5); Del (4, 2); Ins (4, 2)], "an example")
> 
> I heard that the marshaling of the module Marshal is not robust with respect
> to changes in the version of Ocaml, and since I need to insert the data in a
> database for long-term use, this is a serious drawback. I need the
> marshaling and unmarshaling to be completely independent from the version of
> Ocaml, and from the particular architecture where the marshaling occurs.
> I could of course write my own solution, but I am wondering if there are any
> suitable modules  available that I could use.

AFAIK, ocaml's marshalling doesn't depend on the version, and is
architecture independent (there is only a limitation with integer
overflow when passing more than 31-bit integer values from 64-bit to
32-bit). So marshalling should be sufficient for your needs.
It is however sensitive to the data format you use (the types for your
data), and there is currently no way to verify that the type has not
changed between two versions of a program.
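
As a rough illustration of the last point, here is a minimal sketch of a
Marshal round trip guarded by a hand-maintained format tag.  Only
Marshal.to_string and Marshal.from_string come from the standard
library; the edit type, the format_tag string and the save/load helpers
are assumptions made up for this example.

```ocaml
(* Illustrative types matching the example value from the question. *)
type edit = Move of int * int * int | Del of int * int | Ins of int * int
type entry = float * edit list * string

(* Hypothetical tag: it must be bumped by hand whenever [entry] changes. *)
let format_tag = "edits-v1"

(* Store the tag as a plain-text first line, then the marshaled bytes. *)
let save (v : entry) : string = format_tag ^ "\n" ^ Marshal.to_string v []

(* Check the tag before unmarshaling.  Marshal performs no type check of
   its own, so the annotation below is a promise, not a verification. *)
let load (s : string) : entry option =
  let n = String.length format_tag in
  if String.length s > n && String.sub s 0 n = format_tag && s.[n] = '\n'
  then Some (Marshal.from_string s (n + 1) : entry)
  else None
```

The tag only protects against type changes if someone remembers to
update it, which is exactly the weakness described above.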

---------------------------------------------------------------------------
Jacques Garrigue      Nagoya University     garrigue at math.nagoya-u.ac.jp
		   <A HREF=http://www.math.nagoya-u.ac.jp/~garrigue/>JG</A>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31  8:43 ` Jacques Garrigue
@ 2008-05-31  9:38   ` Berke Durak
  2008-05-31 16:54     ` Luca de Alfaro
  0 siblings, 1 reply; 16+ messages in thread
From: Berke Durak @ 2008-05-31  9:38 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: luca, caml-list

I second Dominique's suggestion to use Sexplib.  At the very least, use a
plaintext format.
Don't use Marshal for long-term storage of values.  Avoid it if you
can.  Been there, done that.
Why?

(1) Not type-safe.  Translation: your program *will segfault* and you
won't know why.
(2) Not human-readable nor editable.
(3) Not future-proof.  What happens if you change your type
definition?  Your program will segfault.  So you'll have to migrate
your data.  But how?  You'll have to find the exact revision used to
generate the binary data.  Good luck with that.  Did you put a
revision number in your data?  Are you sure it was up-to-date?  Then
you'll have to hand-write a converter that uses type declarations from
the old and the new modules.  I hope your dependencies are not too
complex.  Not fun *at all*.

However, there are some situations where Marshal is appropriate:

(1) Your data is not acyclic, contains closures, or needs sharing to
be compact enough.  Sexplib doesn't handle these.
(2) The data won't live long anyway.  As in: you're doing IPC between
known versions of Ocaml programs.
(3) You desperately need speed.  As in: you're processing 200GB of
Wikipedia data.
Then I can understand.
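
To make the plaintext suggestion concrete, here is a small sketch that
writes the example value from the original question as an S-expression.
It deliberately does not use Sexplib: the sexp type only mirrors the
Atom/List shape, and sexp_of_edit, sexp_of_entry, needs_quotes and
to_string are hand-written names assumed for this illustration.

```ocaml
(* A stand-in for an S-expression tree; not Sexplib's own type. *)
type sexp = Atom of string | List of sexp list

type edit = Move of int * int * int | Del of int * int | Ins of int * int

let atom_of_int i = Atom (string_of_int i)

(* One list per constructor: the constructor name, then its arguments. *)
let sexp_of_edit = function
  | Move (a, b, c) ->
    List [Atom "Move"; atom_of_int a; atom_of_int b; atom_of_int c]
  | Del (a, b) -> List [Atom "Del"; atom_of_int a; atom_of_int b]
  | Ins (a, b) -> List [Atom "Ins"; atom_of_int a; atom_of_int b]

let sexp_of_entry (x, edits, note) =
  List [Atom (string_of_float x); List (List.map sexp_of_edit edits); Atom note]

(* Atoms containing spaces, parentheses or quotes must be quoted. *)
let needs_quotes s =
  s = "" || List.exists (String.contains s) [' '; '('; ')'; '"']

let rec to_string = function
  | Atom s -> if needs_quotes s then Printf.sprintf "%S" s else s
  | List l -> "(" ^ String.concat " " (List.map to_string l) ^ ")"
```

For the example value this yields
(3.4 ((Move 4 3 5) (Del 4 2) (Ins 4 2)) "an example"), which stays
readable and editable by hand.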
-- 
Berke Durak


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31  9:38   ` Berke Durak
@ 2008-05-31 16:54     ` Luca de Alfaro
  2008-05-31 17:00       ` Robert Fischer
  2008-05-31 17:06       ` Yaron Minsky
  0 siblings, 2 replies; 16+ messages in thread
From: Luca de Alfaro @ 2008-05-31 16:54 UTC (permalink / raw)
  To: Berke Durak; +Cc: Jacques Garrigue, caml-list

Thanks for this insight... I suspected that Marshal lacked robustness,
but without all the details you mentioned!  Actually, I DO desperately
need speed, as I am processing terabytes of Wikipedia data, but precisely
because the datasets are so large, I cannot afford to recompute or
convert them often, so I want a robust format.  Furthermore, I think the
bottleneck for me is the speed of MySQL and the disk anyway, not the
small amount of time that natively compiled OCaml would take for the
conversion (unfortunately, I have to do more complex computation than
converting a few lists and datatypes to ASCII anyway).  Moreover, a
plaintext format greatly helps debugging; it also helps that I can read
the same data with other programming languages.

Speaking of debugging, and said in passing, I cannot say enough how much I
LOVE ocamldebug's ability to execute code backwards.  It is such a
revelation.  You simply go to the error, then back off a bit to see how you
got there.  But this is a topic for another thread.

Many thanks,

Luca


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 16:54     ` Luca de Alfaro
@ 2008-05-31 17:00       ` Robert Fischer
  2008-05-31 17:24         ` Luca de Alfaro
                           ` (3 more replies)
  2008-05-31 17:06       ` Yaron Minsky
  1 sibling, 4 replies; 16+ messages in thread
From: Robert Fischer @ 2008-05-31 17:00 UTC (permalink / raw)
  To: caml-list

How far is the Jane St S-exp library from producing JSON?  I've not actually looked
at it, but that'd be super nifty in the interoperation world.

~~ Robert.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 16:54     ` Luca de Alfaro
  2008-05-31 17:00       ` Robert Fischer
@ 2008-05-31 17:06       ` Yaron Minsky
  1 sibling, 0 replies; 16+ messages in thread
From: Yaron Minsky @ 2008-05-31 17:06 UTC (permalink / raw)
  To: Caml-list List

If you're willing to sacrifice readability for speed and compactness,  
you might want to consider jane street's bin-prot library as well...

Yaron Minsky

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 17:00       ` Robert Fischer
@ 2008-05-31 17:24         ` Luca de Alfaro
  2008-05-31 22:18           ` Martin Jambon
  2008-05-31 17:25         ` blue storm
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Luca de Alfaro @ 2008-05-31 17:24 UTC (permalink / raw)
  To: Robert Fischer; +Cc: caml-list

Is there a standard way to represent variant types in JSON?
As in:
type edit = Ins of int * int | Del of int * int | Mov of int * int * int

Would it be a list of two elements, the first being the name of the
variant and the second the encoding of its arguments?
Is this the standard way?

Luca

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 17:00       ` Robert Fischer
  2008-05-31 17:24         ` Luca de Alfaro
@ 2008-05-31 17:25         ` blue storm
  2008-05-31 21:34         ` Berke Durak
  2008-06-02 11:13         ` Richard Jones
  3 siblings, 0 replies; 16+ messages in thread
From: blue storm @ 2008-05-31 17:25 UTC (permalink / raw)
  To: Robert Fischer; +Cc: caml-list

On 5/31/08, Robert Fischer <robert@fischerventure.com> wrote:
> How far is the reach from the Jane St S-exp library from producing JSON?
> I've not actually looked
> at it, but that'd be super nifty in the interoperation world.

You may be interested in the json-static syntax extension from Martin Jambon :
http://martin.jambon.free.fr/json-static.html


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 17:00       ` Robert Fischer
  2008-05-31 17:24         ` Luca de Alfaro
  2008-05-31 17:25         ` blue storm
@ 2008-05-31 21:34         ` Berke Durak
  2008-05-31 22:51           ` Stefano Zacchiroli
  2008-06-01 11:14           ` Martin Jambon
  2008-06-02 11:13         ` Richard Jones
  3 siblings, 2 replies; 16+ messages in thread
From: Berke Durak @ 2008-05-31 21:34 UTC (permalink / raw)
  To: Robert Fischer; +Cc: caml-list

On Sat, May 31, 2008 at 7:00 PM, Robert Fischer
<robert@fischerventure.com> wrote:
> How far is the reach from the Jane St S-exp library from producing JSON?  I've not actually looked at it, but that'd be super nifty in the interoperation world.

If you just want JSON syntax, you can use Sexplib to convert an
arbitrary type to a
Sexp.t

  type t = Atom of string | List of t list

and then output in Json format:

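(* Needs [open Printf] and the Atom/List constructors above in scope. *)
let rec output_json oc = function
  | Atom u -> fprintf oc "%S" u  (* uses OCaml string escaping *)
  | List xl ->
    output_char oc '[';
    List.iteri (fun i x -> if i > 0 then output_char oc ','; output_json oc x) xl;
    output_char oc ']'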

You can then do the same thing for parsing.
-- 
Berke


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 17:24         ` Luca de Alfaro
@ 2008-05-31 22:18           ` Martin Jambon
  0 siblings, 0 replies; 16+ messages in thread
From: Martin Jambon @ 2008-05-31 22:18 UTC (permalink / raw)
  To: Luca de Alfaro; +Cc: Robert Fischer, caml-list

On Sat, 31 May 2008, Luca de Alfaro wrote:

> Is there a standard way to represent variant types in Json?
> As in:
> type edit = Ins of int * int | Del of int * int | Mov of int * int * int
>
> Using a list of two elements the first the name of the variant, the second
> the encoding of the variant itself?
> Is this the standard way?

For this type definition using json-static:

type json t = A | B of int | C of int * int | D of (int * int)

here's the mapping:

A -> "A"
B 1 -> [ "B", 1 ]
C (1, 2) -> [ "C", 1, 2 ]
D (1, 2) -> [ "D", [ 1, 2 ] ]

See http://martin.jambon.free.fr/json-static-readme.txt for more.

It is totally not standard, because the world of mainstream programming 
languages ignores the notion of variants.
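
For the edit type from the question, the same tag-first convention can
be sketched by hand as follows.  This is not the code json-static
generates; json_of_tagged and json_of_edit are names assumed for the
illustration, and the constructor arguments are limited to ints.

```ocaml
type edit = Ins of int * int | Del of int * int | Mov of int * int * int

(* Encode a constructor as a JSON array: the tag first, then its
   arguments.  Tags are plain identifiers, so they need no escaping. *)
let json_of_tagged tag args =
  let quoted_tag = "\"" ^ tag ^ "\"" in
  "[" ^ String.concat ", " (quoted_tag :: List.map string_of_int args) ^ "]"

let json_of_edit = function
  | Ins (a, b) -> json_of_tagged "Ins" [a; b]
  | Del (a, b) -> json_of_tagged "Del" [a; b]
  | Mov (a, b, c) -> json_of_tagged "Mov" [a; b; c]
```

With this sketch, Mov (4, 3, 5) is written as ["Mov", 4, 3, 5].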

Note that the option type uses null for None and x for Some x, which is 
very handy for loading foreign data, but has the problem of representing 
both None and Some None as null.
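
The ambiguity can be seen in a few lines.  The sketch below follows the
null convention described above with hand-written helpers
(json_of_option, json_of_int_option and json_of_nested are assumed
names, not part of json-static).

```ocaml
(* None is written as null, Some x is written as x itself. *)
let json_of_option json_of_x = function
  | None -> "null"
  | Some x -> json_of_x x

let json_of_int_option : int option -> string =
  json_of_option string_of_int

(* Nesting the encoder loses the distinction between the outer None
   and Some None: both come out as null. *)
let json_of_nested : int option option -> string =
  json_of_option json_of_int_option
```

A reader of the JSON text cannot tell which of the two values was
written, so nested options do not round-trip under this convention.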

Overall json-static or the conventions used by json-static are not usable 
for arbitrary OCaml data supported by Marshal. The purpose is to be able 
to exchange data with other applications that use JSON as well. There are 
lots of them and the big advantage of JSON is its great simplicity.

And finally you don't get nude pictures when you look for "json" in 
Google...
;-)


Martin


--
http://wink.com/profile/mjambon
http://mjambon.com


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 21:34         ` Berke Durak
@ 2008-05-31 22:51           ` Stefano Zacchiroli
  2008-06-02  9:04             ` Berke Durak
  2008-06-01 11:14           ` Martin Jambon
  1 sibling, 1 reply; 16+ messages in thread
From: Stefano Zacchiroli @ 2008-05-31 22:51 UTC (permalink / raw)
  To: caml-list

On Sat, May 31, 2008 at 11:34:34PM +0200, Berke Durak wrote:
> and then output in Json format:

Aren't you being naive about escaping needs here?

I don't know the details of the two languages involved, but we would
have to be very lucky for their escaping conventions to be the same ...

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{upsilon.cc,cs.unibo.it,debian.org}  -<%>-  http://upsilon.cc/zack/
(15:56:48)  Zack: e la demo dema ?    /\    All one has to do is hit the
(15:57:15)  Bac: no, la demo scema    \/    right keys at the right time


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 21:34         ` Berke Durak
  2008-05-31 22:51           ` Stefano Zacchiroli
@ 2008-06-01 11:14           ` Martin Jambon
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Jambon @ 2008-06-01 11:14 UTC (permalink / raw)
  To: Berke Durak; +Cc: Robert Fischer, caml-list

On Sat, 31 May 2008, Berke Durak wrote:

> On Sat, May 31, 2008 at 7:00 PM, Robert Fischer
> <robert@fischerventure.com> wrote:
>> How far is the reach from the Jane St S-exp library from producing JSON?  I've not actually looked at it, but that'd be super nifty in the interoperation world.
>
> If you just want JSON syntax, you can use Sexplib to convert an
> arbitrary type to a
> Sexp.t
>
>  type t = Atom of string | List of t list
>
> and then output in Json format:
>
> let rec output_json oc = function
> | Atom u -> fprintf oc "%S" u
> | List xl -> fprintf oc "[%a]" (fun oc xl -> List.iter (fun x ->
> fprintf "%a," output_json x) xl) xl
>
> You can then do the same thing for parsing.

You won't obtain anything useful if you treat Atoms as JSON strings and 
Lists as JSON arrays because JSON has also null, numbers, booleans and 
objects.

This is the JSON standard: http://www.json.org/

And that is the concrete type used to represent JSON trees in the 
json-wheel library (which json-static uses):

type json_type =
     Object of (string * json_type) list
   | Array of json_type list
   | String of string
   | Int of int
   | Float of float
   | Bool of bool
   | Null
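
To show what a tree of this type buys you, here is a rough sketch of a
printer for it.  It is not json-wheel's own printer: escape and
to_string are names assumed for the example, strings are assumed to be
valid UTF-8, and floats are printed naively.

```ocaml
type json_type =
  | Object of (string * json_type) list
  | Array of json_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Quote a string for JSON: short escapes for quote, backslash and
   newline, \uXXXX for other control characters, other bytes copied. *)
let escape s =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (fun c ->
       match c with
       | '"' -> Buffer.add_string b "\\\""
       | '\\' -> Buffer.add_string b "\\\\"
       | '\n' -> Buffer.add_string b "\\n"
       | c when Char.code c < 0x20 ->
         Buffer.add_string b (Printf.sprintf "\\u%04x" (Char.code c))
       | c -> Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b

let rec to_string = function
  | Object fields ->
    let field (k, v) = escape k ^ ":" ^ to_string v in
    "{" ^ String.concat "," (List.map field fields) ^ "}"
  | Array items -> "[" ^ String.concat "," (List.map to_string items) ^ "]"
  | String s -> escape s
  | Int i -> string_of_int i
  | Float f -> Printf.sprintf "%.17g" f (* nan/infinity are not valid JSON *)
  | Bool b -> string_of_bool b
  | Null -> "null"
```

Each JSON construct has its own constructor, so a record can become an
Object and a number stays a number instead of turning into a string.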


Cheers,

Martin

--
http://wink.com/profile/mjambon
http://mjambon.com


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 22:51           ` Stefano Zacchiroli
@ 2008-06-02  9:04             ` Berke Durak
  2008-06-02  9:21               ` Stefano Zacchiroli
  0 siblings, 1 reply; 16+ messages in thread
From: Berke Durak @ 2008-06-02  9:04 UTC (permalink / raw)
  To: caml-list

On Sun, Jun 1, 2008 at 12:51 AM, Stefano Zacchiroli <zack@upsilon.cc> wrote:
> On Sat, May 31, 2008 at 11:34:34PM +0200, Berke Durak wrote:
>> and then output in Json format:
>
> Aren't you being naive about escaping needs here?
>
> I don't know the details of the two involved languages, but it would
> mean to be very lucky if the escaping conventions are the same ...

Of course, the code was intended for illustrative purposes. JSON
escapes characters using 4-digit Unicode hex codes, except for \n,
\r, etc., so it would probably work for ASCII.
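
A quick way to see where the two conventions part ways is to look at
what %S produces.  The sketch below only uses Printf.sprintf from the
standard library; ocaml_quoted, plain, control and accented are names
chosen for the illustration.

```ocaml
(* %S quotes a string using OCaml's own escaping rules. *)
let ocaml_quoted s = Printf.sprintf "%S" s

(* Printable ASCII: the result is also a valid JSON string. *)
let plain = ocaml_quoted "an example"

(* A control character becomes a three-digit decimal escape (\001),
   which JSON does not accept; JSON would need \u0001 here. *)
let control = ocaml_quoted "a\001"

(* Bytes above 127 are escaped one at a time, so UTF-8 text such as an
   accented "e" (bytes 195 and 169) is no longer readable as JSON. *)
let accented = ocaml_quoted "\195\169"
```

So the shortcut holds for printable ASCII, and a real JSON writer needs
its own escaping routine for everything else.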

Martin Jambon:

> You won't obtain anything useful if you treat Atoms as JSON strings and Lists as JSON arrays because JSON has also null, numbers, booleans and objects.

The real issue is that records are mapped to lists of lists, making
lookup difficult and cumbersome.  But that's still JSON syntax,
formally... You could theoretically write a piece of code to
"recognize" a record and emit a Json object but that wouldn't be very
elegant.
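
A rough sketch of such a "recognizer", under stated assumptions: the `sexp` type below is a local stand-in with the same Atom/List shape, not Sexplib's own module, and `json`, `as_field` and `json_of_sexp` are hypothetical names. The heuristic treats a non-empty list as a record when every element has the form (key value) with an atom as key.

```ocaml
(* Local stand-ins so the sketch is self-contained. *)
type sexp = Atom of string | List of sexp list

type json =
  | Str of string
  | Arr of json list
  | Obj of (string * json) list

(* Some (key, value) when the item looks like a record field (key value). *)
let as_field = function
  | List [Atom k; v] -> Some (k, v)
  | _ -> None

let rec json_of_sexp = function
  | Atom s -> Str s
  | List [] -> Arr []
  | List items ->
      let fields = List.filter_map as_field items in
      if List.length fields = List.length items then
        (* Every element is a (key value) pair: emit an object. *)
        Obj (List.map (fun (k, v) -> (k, json_of_sexp v)) fields)
      else Arr (List.map json_of_sexp items)
```

The inelegance shows up immediately: a plain list of pairs such as ((a b) (c d)) is indistinguishable from a record and also becomes an object, because the type information is gone by the time the S-expression is inspected.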

> This is the JSON standard: http://www.json.org/

I know, we are using Json (but not Json-wheel) to pass the annotated
syntax tree from the C legacy front-end to the Ocaml JVM backend.
However we are using Sexplib for all the internal serialization needs
(mostly for debugging) since it integrates so well with Ocaml.

> And that is the concrete type used to represent JSON trees in the json-wheel library (which json-static uses):

That's a blessing but also a curse.  You retain more information on
your format, but that also complicates anything that manipulates
syntax trees.

The nice thing about Sexp is that its Path module provides a small
manipulation language for migrating your data.  If that's not enough,
you can always load it in any Scheme or Lisp and spit it back.
-- 
Berke Durak



* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-06-02  9:04             ` Berke Durak
@ 2008-06-02  9:21               ` Stefano Zacchiroli
  0 siblings, 0 replies; 16+ messages in thread
From: Stefano Zacchiroli @ 2008-06-02  9:21 UTC (permalink / raw)
  To: caml-list

On Mon, Jun 02, 2008 at 11:04:58AM +0200, Berke Durak wrote:
> > I don't know the details of the two involved languages, but it would
> > mean to be very lucky if the escaping conventions are the same ...
> Of course, the code was intended for illustrative purposes. JSON
> escapes characters using 4-digit Unicode hex codes, except for
> \n, \r, etc., so it would probably work for ASCII.

OK then, but the original question was "how far" Sexplib is from
producing JSON, and the answer should then be "still a bit far": it is
not just a function of a couple of lines. You have to write some code to
obtain fully compliant JSON.

Instead of asking people to write it over and over again, it would
probably be a good idea to provide a patch for Sexplib adding
serialization capabilities towards JSON. That is assuming, of course,
that the Sexplib authors are interested in integrating such a feature.

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
I'm still an SGML person,this newfangled /\ All one has to do is hit the
XML stuff is so ... simplistic  -- Manoj \/ right keys at the right time



* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
  2008-05-31 17:00       ` Robert Fischer
                           ` (2 preceding siblings ...)
  2008-05-31 21:34         ` Berke Durak
@ 2008-06-02 11:13         ` Richard Jones
  3 siblings, 0 replies; 16+ messages in thread
From: Richard Jones @ 2008-06-02 11:13 UTC (permalink / raw)
  To: Robert Fischer; +Cc: caml-list

On Sat, May 31, 2008 at 12:00:17PM -0500, Robert Fischer wrote:
> How far is the reach from the Jane St S-exp library from producing
> JSON?  I've not actually looked at it, but that'd be super nifty in
> the interoperation world.

It's worth noting:

  http://martin.jambon.free.fr/json-wheel.html
  http://code.google.com/p/deriving/

and I guess maybe even:

  http://code.google.com/p/bitmatch/

if you wanted a way to generate and parse a stable binary format.

Rich.

-- 
Richard Jones
Red Hat



Thread overview: 16+ messages
2008-05-31  6:43 picking / marshaling to strings in ocaml-revision-stable way Luca de Alfaro
2008-05-31  7:24 ` [Caml-list] " asmadeus77
2008-05-31  8:43 ` Jacques Garrigue
2008-05-31  9:38   ` Berke Durak
2008-05-31 16:54     ` Luca de Alfaro
2008-05-31 17:00       ` Robert Fischer
2008-05-31 17:24         ` Luca de Alfaro
2008-05-31 22:18           ` Martin Jambon
2008-05-31 17:25         ` blue storm
2008-05-31 21:34         ` Berke Durak
2008-05-31 22:51           ` Stefano Zacchiroli
2008-06-02  9:04             ` Berke Durak
2008-06-02  9:21               ` Stefano Zacchiroli
2008-06-01 11:14           ` Martin Jambon
2008-06-02 11:13         ` Richard Jones
2008-05-31 17:06       ` Yaron Minsky
