From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id EC830BBAF for ; Sun, 1 Jun 2008 00:21:16 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoUAAKRsQUjAXQIno2dsb2JhbACSCwEBAQEBAQcFCAcRmhI X-IronPort-AV: E=Sophos;i="4.27,572,1204498800"; d="scan'208";a="11388493" Received: from concorde.inria.fr ([192.93.2.39]) by mail2-smtp-roc.national.inria.fr with ESMTP; 01 Jun 2008 00:21:16 +0200 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id m4VMLGvo026119 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Sun, 1 Jun 2008 00:21:16 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoEAAKRsQUhCbwQZlGdsb2JhbACSCwEBAQEJAwoHEQSaDg X-IronPort-AV: E=Sophos;i="4.27,572,1204498800"; d="scan'208";a="13028361" Received: from out1.smtp.messagingengine.com ([66.111.4.25]) by mail1-smtp-roc.national.inria.fr with ESMTP; 01 Jun 2008 00:21:15 +0200 Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id F0DA711126A; Sat, 31 May 2008 18:21:14 -0400 (EDT) Received: from heartbeat2.messagingengine.com ([10.202.2.161]) by compute1.internal (MEProxy); Sat, 31 May 2008 18:21:15 -0400 X-Sasl-enc: ImnXQM9KjYw7hrkpbVWWYSlhaF/9ECjCvroHnz0BZfXC 1212272474 Received: from [192.168.1.10] (ALyon-157-1-145-205.w90-42.abo.wanadoo.fr [90.42.152.205]) by mail.messagingengine.com (Postfix) with ESMTPSA id DF9D035D88; Sat, 31 May 2008 18:21:13 -0400 (EDT) Date: Sun, 1 Jun 2008 00:18:35 +0200 (CEST) From: Martin Jambon X-X-Sender: martin@martin.ec.wink.com To: Luca de Alfaro Cc: Robert Fischer , caml-list@inria.fr Subject: Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way In-Reply-To: <28fa90930805311024g282b232fyb99f253e437f69e8@mail.gmail.com> Message-ID: References: <28fa90930805302343n18b98b17t72a22ea82539fc6f@mail.gmail.com> <20080531.174314.107716495.garrigue@math.nagoya-u.ac.jp> <28fa90930805310954w3089478bqfd8c3f821fff207e@mail.gmail.com> <48418421.5090601@fischerventure.com> <28fa90930805311024g282b232fyb99f253e437f69e8@mail.gmail.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Miltered: at concorde with ID 4841CF5C.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ens-lyon:01 marshaling:01 variants:01 ocaml:01 marshaling:01 wikipedia:01 ocaml:01 datatypes:01 ocamldebug:01 berke:01 durak:01 berke:01 durak:01 type-safe:01 segfault:01 On Sat, 31 May 2008, Luca de Alfaro wrote: > Is there a standard way to represent variant types in Json? > As in: > type edit = Ins of int * int | Del of int * int | Mov of int * int * int > > Using a list of two elements the first the name of the variant, the second > the encoding of the variant itself? > Is this the standard way? For this type definition using json-static: type json t = A | B of int | C of int * int | D of (int * int) here's the mapping: A -> "A" B 1 -> [ "B", 1 ] C (1, 2) -> [ "C", 1, 2 ] D (1, 2) -> [ "D", [ 1, 2 ] ] See http://martin.jambon.free.fr/json-static-readme.txt for more. It is totally not standard, because the world of mainstream programming languages ignores the notion of variants. Note that the option type uses null for None and x for Some x, which is very handy for loading foreign data, but has the problem of representing both None and Some None as null. Overall json-static or the conventions used by json-static are not usable for arbitrary OCaml data supported by Marshal. The purpose is to be able to exchange data with other applications that use JSON as well. There are lots of them and the big advantage of JSON is its great simplicity. And finally you don't get nude pictures when you look for "json" in Google... ;-) Martin > On Sat, May 31, 2008 at 10:00 AM, Robert Fischer > wrote: > >> How far is the reach from the Jane St S-exp library from producing JSON? >> I've not actually looked >> at it, but that'd be super nifty in the interoperation world. >> >> ~~ Robert. >> >> Luca de Alfaro wrote: >>> Thanks for this insight... I imagined the lack of robustness of >> Marshaling, >>> but without all the details you mentioned!... actually, I DO desperately >>> need speed, as I am processing TB's of Wikipedia data, but precisely >> because >>> the datasets are so large, I cannot afford having to recompute / convert >>> them often, and so I want a robust format. Furthermore, I think the >>> bottleneck for me is anyway the speed of mysql and the disk, not really >> the >>> small amount of time that natively compiled Ocaml would take for the >>> conversion (I have anyway to do more complex computation that converting >> a >>> few lists and datatypes to ascii, unfortunately). Moreover, a plaintext >>> format greatly helps debugging; it also helps that I can read the same >> data >>> with other programming languages. >>> >>> Speaking of debugging, and said in passing, I cannot say enough how much >> I >>> LOVE the ability of ocamldebug of executing code backwards. It is such a >>> revelation. You simply go to the error, then back off a bit to see how >> you >>> got there. But, this is a topic for another thread. >>> >>> Many thanks, >>> >>> Luca >>> >>> >>> On Sat, May 31, 2008 at 2:38 AM, Berke Durak >> wrote: >>> >>>> I second Luca's suggestion to use Sexplib. At the very least, use a >>>> plaintext format. >>>> Don't use Marshal for long-term storage of values. Avoid it if you >>>> can. Been there, done that. >>>> Why? >>>> >>>> (1) Not type-safe. Translation: your program *wil segfault* and you >>>> won't know why. >>>> (2) Not human-readable nor editable. >>>> (3) Not future-proof. What happens if you change your type >>>> definition? Your program >>>> will segfault. So you'll have to migrate your data. But how? You'll >>>> have to find >>>> the exact revision used to generate the binary data. Good luck with >>>> that. Did you put >>>> a revision number in your data? Are you sure it was up-to-date? Then >>>> you'll have to hand-write a converter that uses type declarations from >>>> the old and the new modules. >>>> I hope your dependencies are not too complex. Not fun *at all*. >>>> >>>> However, there are some situations where Marshal is appropriate : >>>> >>>> (1) Your data is not acyclic, contains closures, or needs sharing to >>>> be compact enough. Sexplib doesn't handle these. >>>> (2) The data won't live long anyway. As in: you're doing IPC between >>>> known versions of Ocaml programs. >>>> (3) You desperately need speed. As in: you're processing 200GB of >>>> Wikipedia data. >>>> Then I can understand. >>>> -- >>>> Berke Durak >>>> >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> Caml-list mailing list. Subscription management: >>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list >>> Archives: http://caml.inria.fr >>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >>> Bug reports: http://caml.inria.fr/bin/caml-bugs >> >> _______________________________________________ >> Caml-list mailing list. Subscription management: >> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list >> Archives: http://caml.inria.fr >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs >> > -- http://wink.com/profile/mjambon http://mjambon.com