[9fans] RFC: 9p file system with queue semantics (as in message queues)


* [9fans] RFC: 9p file system with queue semantics (as in message queues)
@ 2010-02-25 22:21 Ciprian Dorin, Craciun
  2010-02-26 12:55 ` roger peppe
  0 siblings, 1 reply; 2+ messages in thread
From: Ciprian Dorin, Craciun @ 2010-02-25 22:21 UTC
  To: Fans of the OS Plan 9 from Bell Labs

    Hello all!

    In what follows I would like to ask for your comments regarding a
9p file server that exports a file system with (message) queue
semantics. (My main interest here is in the semantics themselves,
and less in the implementation details. But all comments are
welcome :) .)

    To keep things short I have prepared a short description of my
proposal (for those without much time), and also a longer one (with
more details, for the patient ones).


----------------------------------------
    Short version
----------------------------------------

    [Motivation] I want a message-queue-like IPC mechanism for
(distributed) applications where it is not possible (or not wanted)
to implement or use an existing queueing library (like the JMS- or
AMQP-based ones), but where using the file system is trivial.
Possible target languages: Bash, Tcl, Lua, even Python or C.

    Also there could be legacy (or maybe just old) applications that
already use the file system as an IPC mechanism and which could be
updated to use a queue with only slight changes. Possible
applications: anything related to SMTP, just think about how qmail or
Postfix work.

    [Solution] Implementing a 9p file server that exports a file
system with the following structure:
    / (this / is actually relative to the mount point)
    --> queues
        --> <queue-name>
            --> enqueue -- a folder
            --> dequeue -- a folder
            --> commit -- folder :)
            --> rollback -- still a folder

    Possible operations (I'm assuming we use it via shell scripting,
and the commands found on most UNIX-es):
    * queue access (creation if not existing, or just "opening" it):
        mkdir /queues/to-smtp-gateway
    * enqueue operation:
        cp /.../path-to-my-email-file \
            /queues/to-smtp-gateway/enqueue/email-192832.eml
        # instead of cp I could just create and edit the file in that folder
        touch /queues/to-smtp-gateway/commit/email-192832.eml
    * dequeue operation:
        touch /queues/from-pop-server/dequeue/email-9283828.eml
        # if there is no data in the queue the touch operation fails
        # do something with the file like reading it or copying it
        touch /queues/from-pop-server/commit/email-9283828.eml
    * rollback:
        in any case just touch the same file name inside the rollback
        folder and the entire operation is rolled back
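
    The operations above can be modelled in a few lines of Python.
This is only a minimal in-memory sketch of the proposed semantics, not
the actual implementation: the class name `TxQueue` and its method
names are made up for illustration, and it assumes that a rolled-back
dequeue puts the message back at the head of the queue.

```python
from collections import deque


class TxQueue:
    """In-memory model of one queue with enqueue/dequeue/commit/rollback."""

    def __init__(self):
        self.ready = deque()    # committed messages, oldest first
        self.pending_enq = {}   # name -> data written but not yet committed
        self.pending_deq = {}   # name -> data handed out but not yet committed

    def enqueue(self, name, data):
        # Models creating a file inside the enqueue folder.
        self.pending_enq[name] = data

    def dequeue(self, name):
        # Models `touch dequeue/<name>`; fails when the queue is empty.
        if not self.ready:
            raise FileNotFoundError("queue is empty")
        self.pending_deq[name] = self.ready.popleft()
        return self.pending_deq[name]

    def commit(self, name):
        # Models `touch commit/<name>`.
        if name in self.pending_enq:
            self.ready.append(self.pending_enq.pop(name))
        elif name in self.pending_deq:
            del self.pending_deq[name]
        else:
            raise KeyError(name)

    def rollback(self, name):
        # Models `touch rollback/<name>`.
        if name in self.pending_enq:
            del self.pending_enq[name]
        elif name in self.pending_deq:
            # Assumption: the message returns to the head of the queue.
            self.ready.appendleft(self.pending_deq.pop(name))
        else:
            raise KeyError(name)
```

    An enqueued message only becomes visible to `dequeue` after its
`commit`, and a dequeued message only disappears for good after its
own `commit`.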

    End of short version. :)
    Comments? (Or go for the extended version.)


----------------------------------------
    Extended version
----------------------------------------

    [Motivation -- extended] My real motives are in fact somewhat
different: for the moment I'm working at a university, and here we
have a large (by our standards, but I'm betting small by your
standards) cluster of Linux servers. Sometimes I have to run some
independent simulations or jobs on (parts of) this cluster. So my
possible solutions are:
    * Condor, Slurm, or any other tried-and-true queueing system --
the problem with them is that most are big (as in heavy) solutions,
which need a stable environment, are tedious to install, and need a
lot of care; (also they need root access to install and operate...)
    * Globus (either pre-WS or gLite): I don't even want to get
into this :) :) it just scares me... :) :)
    * XCPU -- I'm aware of XCPU, but it needs me to push tasks onto
the worker nodes... (it could be used for the execution of the jobs in
my queue;)
    * SSH + dtach -- my current solution -- I distribute an equal
number of job files to the servers, and then I just run a couple of
processes that try to grab a job file and execute it; (the problem is
that the job assignment is static, and if one worker node finishes
early it just idles;)

    What I would like to have:
    * (on the submitter) just copy the job files into a folder on my
workstation (laptop) and that's all;
    * (on the worker nodes) just try to acquire a file from a folder,
execute it and write the result back to another folder;
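
    The difference between static assignment and workers that pull
their own jobs can be sketched as follows. This is only an
illustration using threads and Python's standard `queue.Queue` in
place of the proposed file system; the function name `run_workers` and
its parameters are hypothetical.

```python
import queue
import threading


def run_workers(jobs, n_workers, execute):
    """Let n_workers pull jobs from one shared queue until it is empty."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to acquire, so this worker stops
            outcome = execute(job)
            with lock:
                results[job] = outcome

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

    Because every worker asks for the next job only when it has
finished the previous one, a fast worker simply takes more jobs
instead of idling.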

    [Features] What the queue file system should support:
    * transactional processing of individual enqueues / dequeues: as
seen from my short description I want to be able to obtain the data
file (in case of dequeue), read it (maybe multiple times, as in open /
read / close, again open / read / close, etc.) and only when I'm done
processing it, I want to tell the system to commit the dequeue
operation;
    * transactional processing of multiple related enqueues /
dequeues: just think about an application that acts like a pipeline:
it dequeues a task, executes it and enqueues it for further processing
(to another queue); now the dequeue of the original message and the
enqueue of the processed message should be atomic; (this is of course
extended to multiple enqueues / dequeues from multiple queues);
    * (maybe) tagging messages with some meta-data, and allowing me
to dequeue only those messages that are tagged in a certain way (think
of pattern matching in CLIPS or Prolog); (this allows me to match two
related messages from two different queues, wait until I have both of
them and to process them as one (like a join in a workflow)); (this
could allow me to implement something like map-reduce, if one process
chooses to dequeue all messages tagged in a certain way);
    * any other ideas? :)
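
    The tagged dequeue and the join of two related messages could
look roughly like this. It is a sketch under assumptions: messages are
modelled as `(tags, payload)` pairs held in plain Python lists, a
pattern is a dict of required tag values, and the names
`dequeue_matching` and `join` are invented for the example.

```python
def dequeue_matching(messages, pattern):
    """Remove and return the oldest message whose tags match `pattern`.

    messages: list of (tags, payload) pairs, oldest first
    pattern:  dict mapping tag name -> required value
    Returns None when no message matches.
    """
    for index, (tags, payload) in enumerate(messages):
        if all(key in tags and tags[key] == value
               for key, value in pattern.items()):
            del messages[index]
            return tags, payload
    return None


def join(left, right, key):
    """Dequeue one message from `left` and its partner from `right`.

    Two messages are partners when they carry the same value for the
    tag `key`. Nothing is removed unless both halves are present.
    """
    for tags, payload in list(left):
        if key not in tags:
            continue
        partner = dequeue_matching(right, {key: tags[key]})
        if partner is not None:
            left.remove((tags, payload))
            return (tags, payload), partner
    return None
```

    A reduce-style consumer would call `dequeue_matching` in a loop
with the same pattern until it returns `None`.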

    [Semantics] I've already described the semantics for the first
feature set in the short version, so I won't repeat them here.

    For the multi-operation / multi-queue semantics I would propose
something like this:
    / (root)
    --> queues -- the same as before (only one operation is transactional)
    --> transactions
        --> <transaction-id>
            --> queues
                --> <queue-name>
                    --> enqueue
                    --> dequeue
                    --> commit -- only to allow applications to work
                                  unmodified under the new transaction semantics
                    --> rollback
        --> commit -- overall transactions commit folder
        --> rollback -- likewise

    How these transactions work is simple: just `mkdir` a transaction
folder inside `transactions`. Then `mkdir` the queues that we want to
access inside its `queues` folder. Then, to commit or roll back, just
touch a file named exactly like the transaction inside the `commit`
or `rollback` folder of `transactions`.
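
    The all-or-nothing behaviour of such a transaction can be
sketched with a small in-memory model. The `Transaction` class below
is hypothetical and single-process only (no durability, no isolation
between concurrent transactions); it just shows that enqueues stay
invisible until commit and that dequeues are undone on rollback.

```python
from collections import deque


class Transaction:
    """Groups dequeues and enqueues on several queues into one unit."""

    def __init__(self, queues):
        self.queues = queues  # queue name -> deque of committed messages
        self.taken = []       # (queue name, message) dequeued so far
        self.added = []       # (queue name, message) waiting for commit

    def dequeue(self, queue_name):
        # Raises IndexError when the queue is empty.
        message = self.queues[queue_name].popleft()
        self.taken.append((queue_name, message))
        return message

    def enqueue(self, queue_name, message):
        # Not visible in the target queue until commit().
        self.added.append((queue_name, message))

    def commit(self):
        for queue_name, message in self.added:
            self.queues[queue_name].append(message)
        self.taken.clear()
        self.added.clear()

    def rollback(self):
        # Undo in reverse order so each queue gets its original order back.
        for queue_name, message in reversed(self.taken):
            self.queues[queue_name].appendleft(message)
        self.taken.clear()
        self.added.clear()
```

    For the pipeline case, a stage would `dequeue` from its input
queue, `enqueue` the processed message to the next queue, and call
`commit` only once both steps have succeeded.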

    Now about the names for transactions and enqueue / dequeue files: I
would propose UUIDs (and enforce these names), as this would reduce
the likelihood of name clashes.
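
    Generating and checking such names is straightforward with
Python's standard `uuid` module. The helper names `new_name` and
`is_valid_name` are only illustrative, and the choice of random
(version 4) UUIDs is an assumption, since no UUID variant is specified
above.

```python
import uuid


def new_name():
    """Return a fresh random (version 4) UUID as a file name."""
    return str(uuid.uuid4())


def is_valid_name(name):
    """Return True if `name` is a canonically formatted version 4 UUID."""
    try:
        parsed = uuid.UUID(name)
    except ValueError:
        return False
    return parsed.version == 4 and str(parsed) == name
```

    A server that enforces these names could reject a create request
for any file or transaction folder whose name fails `is_valid_name`.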

    Also because we have a central 9p file server that exports this
file system, there are two possible ways to "attach" the file system
to a node:
    * each client, when it attaches the same file system (`aname`),
obtains a fresh view that is not shared with any other client; (thus
one node can't interfere with another one's transaction);
    * each client attaches the same `aname` and obtains a consolidated
view of all the other operations going on in the cluster; (we could
thus obtain distributed transactions;) (something like what was
obtained inside `/proc` inside a Beowulf cluster -- as I understood
from the XCPU and Beowulf papers;)


----------------------------------------
    Technical (as in implementation) details
----------------------------------------

    I already have the operations implemented in Python (in an OOP
fashion) (both individual transactions and multi-operation /
multi-queue operations, thanks to BerkeleyDB). I've already managed
to export and test a (local) file system based on FUSE (but with a
slightly different way to obtain the semantics, and only for the
individual-operation transactions).

    About the 9p protocol, I've already implemented the server side
of the protocol (decoding messages from the client -> server, and
encoding messages from the server -> client) in Python, and I've
exposed it to the network with the help of the Twisted framework. (The
message decoder / encoder, the OOP entities that embody / hide the 9p
file system semantics, and the Twisted protocol and factory are all
decoupled and can be reused independently.)
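
    To give an idea of what the message decoder / encoder has to do,
here is a minimal sketch for a single 9P2000 message, `Tversion`. It
is not the code described above; the function names are made up. It
relies only on the 9P2000 wire format: every message starts with
size[4] type[1] tag[2], integers are little-endian, and a string is a
two-byte length followed by UTF-8 bytes.

```python
import struct

TVERSION = 100  # type number of the Tversion message in 9P2000
NOTAG = 0xFFFF  # tag value conventionally used for version messages


def encode_string(text):
    """Encode a 9P string: two-byte length, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def encode_tversion(tag, msize, version):
    """Build size[4] Tversion tag[2] msize[4] version[s]."""
    body = struct.pack("<BHI", TVERSION, tag, msize) + encode_string(version)
    # The size field counts itself, hence the extra 4 bytes.
    return struct.pack("<I", 4 + len(body)) + body


def decode_header(buf):
    """Return (size, type, tag) from the first 7 bytes of a message."""
    return struct.unpack_from("<IBH", buf, 0)


def decode_tversion(buf):
    """Return (tag, msize, version) from an encoded Tversion message."""
    size, mtype, tag = decode_header(buf)
    if mtype != TVERSION or size != len(buf):
        raise ValueError("not a complete Tversion message")
    (msize,) = struct.unpack_from("<I", buf, 7)
    (length,) = struct.unpack_from("<H", buf, 11)
    version = buf[13:13 + length].decode("utf-8")
    return tag, msize, version
```

    Keeping this layer as pure functions over bytes is one way to keep
it independent of the network framework that feeds it.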

    If this works nicely I'm thinking of moving (if time allows) to
RabbitMQ (now obtaining distributed queues), and Erlang (better
performance from the network part of the project). Another direction
would be to stick to BerkeleyDB and add support for its key / value
tables (as B-trees or hash tables).

    Any comments or observations about my technical choices?


---------- Finally the end :) :)


    I hope I haven't missed anything. And I also hope that at least
someone has reached this sentence :).

    Thanks to all of you who have devoted time to my email,
    Ciprian Craciun.




* Re: [9fans] RFC: 9p file system with queue semantics (as in message queues)
  2010-02-25 22:21 [9fans] RFC: 9p file system with queue semantics (as in message queues) Ciprian Dorin, Craciun
@ 2010-02-26 12:55 ` roger peppe
  0 siblings, 0 replies; 2+ messages in thread
From: roger peppe @ 2010-02-26 12:55 UTC
  To: Fans of the OS Plan 9 from Bell Labs

you might want to take a look at the Owen system
under inferno, which addresses some of your issues.
here are some pointers:

http://inferno-owen.googlecode.com/hg/doc/owen/intro.pdf
http://www.vitanuova.com/papers/ugrid.pdf
http://code.google.com/p/inferno-owen/source/browse/man?r=d640591ab8c2faf30a8a44c9abfb02801ae2407d
