From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p11Fqu4D014343 for ; Tue, 1 Feb 2011 16:52:56 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqYBACu7R03U4xEJkWdsb2JhbACEF5JNjiMVAQEBAQkLCgcRAyGsH5A5AoEldoJBdQSIAodj X-IronPort-AV: E=Sophos;i="4.60,411,1291590000"; d="scan'208";a="90445568" Received: from moutng.kundenserver.de ([212.227.17.9]) by mail2-smtp-roc.national.inria.fr with ESMTP; 01 Feb 2011 16:52:51 +0100 Received: from office1.lan.sumadev.de (dslb-084-059-076-183.pools.arcor-ip.net [84.59.76.183]) by mrelayeu.kundenserver.de (node=mrbap0) with ESMTP (Nemesis) id 0M1iEk-1Q0RmC0sva-00toTH; Tue, 01 Feb 2011 16:52:50 +0100 Received: from [192.168.5.106] (dslb-084-059-076-183.pools.arcor-ip.net [84.59.76.183]) by office1.lan.sumadev.de (Postfix) with ESMTPA id 005B35F701; Tue, 1 Feb 2011 16:52:49 +0100 (CET) From: Gerd Stolpmann To: caml-list@inria.fr Cc: plasma-list@ocaml-programming.de Content-Type: text/plain; charset="UTF-8" Date: Tue, 01 Feb 2011 16:52:48 +0100 Message-ID: <1296575568.24058.83.camel@thinkpad> Mime-Version: 1.0 X-Mailer: Evolution 2.28.1 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:UTCTJNjwPSTOxuSJUqZ4Hrk+qVeDBwDuBPrJ4u5HRc4 zeyEsoZRxroHDFrUfSDws1ED9kwPztcyBocqXz6791eQQdOsLi SfnmnYU+wc3g07fcZLNTBfuQHBiq/t4PuwEWcGbR/cOzeWj+JG GLFdc6DgvoDRrvlv5/QPt2z8Za+Yxju8Aqcd4fcwXO5IKF4hGs MEI4tFIFf22GTRTMLXXJw== Subject: [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.3 Hi, I've just released Plasma-0.3. Plasma consists of two parts (for now), namely Plasma MapReduce, a map/reduce compute framework, and PlasmaFS, the underlying distributed filesystem. Major changes in version 0.3 : * Optimized blocklist representation (extent-based) * Improved block allocator to minimize disk seeks * Allocating datanode access tickets in advance * Sophisticated RAM management * The command-line utility "plasma" supports wildcards Of course, there are also numerous bug fixes and performance improvements. Plasma MapReduce is a distributed implementation of the map/reduce algorithm scheme. In a sentence, map/reduce performs a parallel List.map on an input file, sorts and splits the output by some criterion into partitions, and runs a List.fold_left on each partition. Only that it does not do that sequentially, but in a distributed way, and chunk by chunk. Because of this Plasma MapReduce can process very large files, and if run on enough computers, this also will work in reasonable time. Of course, map and reduce are Ocaml functions here. This all works on top of a distributed filesystem, PlasmaFS. This is a user-space filesystem that is primarily accessed over RPC (but it is also mountable as NFS volume). Actually, most of the effort went here. PlasmaFS focuses on reliability and speed for big blocksizes. To get this, it implements ACID transactions, replicates data and metadata with two-phase commit, uses a shared memory data channel if possible, and monitors itself. Unlike other filesystems for map/reduce, PlasmaFS implements the complete set of usual file operations, including random reads and writes. It can also be used as unspecialized global filesystem. Both pieces of software are bundled together in one download. The project page with further links is http://projects.camlcity.org/projects/plasma.html There is now also a homepage at http://plasma.camlcity.org This is an early alpha release (0.3). A lot of things work already, and you can already run distributed map/reduce jobs. However, it is in no way complete. Plasma is installable via GODI for Ocaml 3.12. For discussions on specifics of Plasma there is a separate mailing list: https://godirepo.camlcity.org/mailman/listinfo/plasma-list Gerd -- ------------------------------------------------------------ Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de Phone: +49-6151-153855 Fax: +49-6151-997714 ------------------------------------------------------------