caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: caml-list@inria.fr
Cc: plasma-list@ocaml-programming.de
Subject: [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.3
Date: Tue, 01 Feb 2011 16:52:48 +0100	[thread overview]
Message-ID: <1296575568.24058.83.camel@thinkpad> (raw)

Hi,

I've just released Plasma-0.3. Plasma consists of two parts (for now),
namely Plasma MapReduce, a map/reduce compute framework, and PlasmaFS,
the underlying distributed filesystem.

Major changes in version 0.3 :

      * Optimized blocklist representation (extent-based)
      * Improved block allocator to minimize disk seeks
      * Allocating datanode access tickets in advance
      * Sophisticated RAM management
      * The command-line utility "plasma" supports wildcards

Of course, there are also numerous bug fixes and performance
improvements.

Plasma MapReduce is a distributed implementation of the map/reduce
algorithm scheme. In a sentence, map/reduce performs a parallel List.map
on an input file, sorts and splits the output by some criterion into
partitions, and runs a List.fold_left on each partition. Only that it
does not do that sequentially, but in a distributed way, and chunk by
chunk. Because of this Plasma MapReduce can process very large files,
and if run on enough computers, this also will work in reasonable time.
Of course, map and reduce are Ocaml functions here.

This all works on top of a distributed filesystem, PlasmaFS. This is a
user-space filesystem that is primarily accessed over RPC (but it is
also mountable as NFS volume). Actually, most of the effort went here.
PlasmaFS focuses on reliability and speed for big blocksizes. To get
this, it implements ACID transactions, replicates data and metadata with
two-phase commit, uses a shared memory data channel if possible, and
monitors itself. Unlike other filesystems for map/reduce, PlasmaFS
implements the complete set of usual file operations, including random
reads and writes. It can also be used as unspecialized global
filesystem.

Both pieces of software are bundled together in one download. The
project page with further links is

http://projects.camlcity.org/projects/plasma.html

There is now also a homepage at

http://plasma.camlcity.org

This is an early alpha release (0.3). A lot of things work already, and
you can already run distributed map/reduce jobs. However, it is in no
way complete.

Plasma is installable via GODI for Ocaml 3.12.

For discussions on specifics of Plasma there is a separate mailing list:

https://godirepo.camlcity.org/mailman/listinfo/plasma-list

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


                 reply	other threads:[~2011-02-01 15:52 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1296575568.24058.83.camel@thinkpad \
    --to=info@gerd-stolpmann.de \
    --cc=caml-list@inria.fr \
    --cc=plasma-list@ocaml-programming.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).