caml-list - the Caml user's mailing list
* [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.4
@ 2011-10-12 16:19 Gerd Stolpmann
  2011-10-13 22:07 ` Gerd Stolpmann
  0 siblings, 1 reply; 2+ messages in thread
From: Gerd Stolpmann @ 2011-10-12 16:19 UTC (permalink / raw)
  To: caml-list; +Cc: plasma-list

Hi,

I've just released Plasma-0.4. Plasma consists of two parts (for now),
namely Plasma MapReduce, a map/reduce compute framework, and PlasmaFS,
the underlying distributed filesystem.

Major changes in version 0.4:

      * Added a security system (including strong authentication and
        authorization). This is quite a big change, and makes PlasmaFS a
        highly secure DFS.
      * Datanodes are now monitored, and failed nodes are automatically
        marked as unavailable. The monitoring system uses multicast
        messaging.
      * The namenode can now take advantage of multi-processing, removing
        a potential bottleneck.
      * Improved the caching subsystem.
      * Better management of file buffers in map/reduce jobs.

Of course, there are also numerous bug fixes and performance
improvements.

Plasma MapReduce is a distributed implementation of the map/reduce
algorithm scheme. In a sentence, map/reduce performs a parallel List.map
on an input file, sorts and splits the output by some criterion into
partitions, and runs a List.fold_left on each partition. The difference
is that it does this not sequentially but in a distributed way, chunk
by chunk. Because of this, Plasma MapReduce can process very large
files, and if run on enough computers, it does so in reasonable time.
Of course, map and reduce are OCaml functions here.
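
To make the List.map / List.fold_left analogy concrete, here is a
minimal sequential, in-memory sketch of the scheme. It is not the Plasma
API: the names partition_by_key, map_reduce and word_count are made up
for illustration, and it assumes the "criterion" for partitioning is
simply the key of each mapped (key, value) pair.

```ocaml
(* Sequential, in-memory sketch of the map/reduce scheme.
   All names are hypothetical; this is not the Plasma MapReduce API. *)

(* Sort (key, value) pairs by key and group equal keys into partitions.
   The stable sort keeps values of the same key in their input order. *)
let partition_by_key pairs =
  let sorted =
    List.stable_sort (fun (k1, _) (k2, _) -> compare k1 k2) pairs in
  List.fold_right
    (fun (k, v) acc ->
       match acc with
       | (k', vs) :: rest when k' = k -> (k', v :: vs) :: rest
       | _ -> (k, [v]) :: acc)
    sorted []

(* map    : turns one input record into a (key, value) pair
   reduce : folds the values of one partition, starting from init *)
let map_reduce ~map ~reduce ~init records =
  let mapped = List.map map records in            (* "map" phase *)
  let partitions = partition_by_key mapped in     (* sort and split *)
  List.map                                        (* "reduce" phase *)
    (fun (k, vs) -> (k, List.fold_left reduce init vs))
    partitions

(* Example job: count how often each word occurs. *)
let word_count words =
  map_reduce ~map:(fun w -> (w, 1)) ~reduce:( + ) ~init:0 words
```

A real framework would run the map phase and each partition's fold on
different machines and stream the data chunk by chunk instead of holding
lists in memory; the sketch only shows the shape of the computation.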

This all works on top of a distributed filesystem, PlasmaFS. This is a
user-space filesystem that is primarily accessed over RPC (but it is
also mountable as an NFS volume). Actually, most of the effort went
here. PlasmaFS focuses on reliability and on speed for large block
sizes. To achieve this, it implements ACID transactions, replicates
data and metadata with two-phase commit, uses a shared-memory data
channel where possible, and monitors itself. Unlike other filesystems
for map/reduce, PlasmaFS implements the complete set of usual file
operations, including random reads and writes. It can also be used as
a general-purpose global filesystem.
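
For readers unfamiliar with two-phase commit, the idea can be sketched
with a toy in-memory model. This is only an illustration of the general
protocol, not PlasmaFS code: the replica record and the prepare, commit,
abort and two_phase_commit functions are hypothetical, and a real
implementation would also have to deal with logging, timeouts and
crashes between the phases.

```ocaml
(* Toy two-phase commit over in-memory replicas.
   Hypothetical types and functions, not taken from PlasmaFS. *)

type replica = {
  mutable committed : string;      (* last committed value *)
  mutable staged : string option;  (* value prepared but not committed *)
  healthy : bool;                  (* whether this replica votes "yes" *)
}

(* Phase 1: ask a replica to stage the new value; returns its vote. *)
let prepare r v =
  if r.healthy then begin
    r.staged <- Some v;
    true
  end else
    false

(* Phase 2, success case: make the staged value the committed one. *)
let commit r =
  match r.staged with
  | Some v -> r.committed <- v; r.staged <- None
  | None -> ()

(* Phase 2, failure case: discard the staged value. *)
let abort r = r.staged <- None

(* Commit on all replicas only if every replica voted "yes";
   otherwise abort everywhere. Returns true on commit. *)
let two_phase_commit replicas v =
  let votes = List.map (fun r -> prepare r v) replicas in
  if List.for_all (fun ok -> ok) votes then begin
    List.iter commit replicas;
    true
  end else begin
    List.iter abort replicas;
    false
  end
```

The point of the two phases is that no replica changes its committed
state until all of them have agreed, so the copies cannot diverge on a
single update.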

Both pieces of software are bundled together in one download. The
project page with further links is

http://projects.camlcity.org/projects/plasma.html

There is now also a homepage at

http://plasma.camlcity.org

This is an early alpha release (0.4). A lot of things work already, and
you can already run distributed map/reduce jobs. However, it is in no
way complete.

Plasma is installable via GODI for OCaml 3.12.

There is now a chart comparing Plasma with Hadoop. In one sentence,
PlasmaFS is based on a superior filesystem design, and now has to prove
that the implementation really works. Plasma map/reduce generalizes
the algorithm scheme compared with Hadoop, but still has some
shortcomings in the implementation:

http://plasma.camlcity.org/plasma/dl/plasma-0.4/doc/html/Plasmafs_and_hdfs.html

http://plasma.camlcity.org/plasma/dl/plasma-0.4/doc/html/Plasmamr_and_hadoop.html


For discussions on specifics of Plasma there is a separate mailing list:

https://godirepo.camlcity.org/mailman/listinfo/plasma-list

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.
------------------------------------------------------------



* Re: [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.4
  2011-10-12 16:19 [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.4 Gerd Stolpmann
@ 2011-10-13 22:07 ` Gerd Stolpmann
  0 siblings, 0 replies; 2+ messages in thread
From: Gerd Stolpmann @ 2011-10-13 22:07 UTC (permalink / raw)
  To: caml-list; +Cc: plasma-list

There is now also plasma-0.4.1, which fixes some performance bugs.

Also, there are now some simple performance numbers:

http://plasma.camlcity.org/plasma/perf.html

Gerd
