9fans - fans of the OS Plan 9 from Bell Labs
From: Enrico Weigelt <weigelt@metux.de>
To: 9fans@9fans.net
Subject: Re: [9fans] Fossil+Venti on Linux
Date: Thu, 29 May 2008 11:12:25 +0200
Message-ID: <20080529091225.GA1617@nibiru.local>
In-Reply-To: <78feb60ec33f8a38ccbc38625b6ea653@quanstro.net>

* erik quanstrom <quanstro@quanstro.net> wrote:

> > As a more sophisticated approach, I'm planning a *real* clustered
> > venti, which also keeps track of block atimes and copy counters.
> > This way, seldom-used blocks can be removed from one node as long
> > as there are still enough copies in the cluster. (This probably
> > requires a redesign of the log architecture.)
>
> one of venti's design goals was to structure the arenas so that filled
> arenas are immutable.  this is important for recoverability.  if you
> know the arena was filled and thus has not changed, any backup
> will do.
> put simply, venti trades the ability to delete for reliability.

Right, my approach would be a paradigm change. But my venti-2 would
be used for completely different things: distributed data storage
instead of an eternal log ;P
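
To make that concrete, the per-block bookkeeping on a node might look
roughly like the sketch below (in Go, just for illustration - the field
names and the min-copies policy are my own assumptions, nothing venti
has today):

package main

import (
	"fmt"
	"time"
)

// blockMeta is what a venti-2 node could track per stored block
// (names and fields are assumptions for this sketch).
type blockMeta struct {
	score  [20]byte  // venti-style SHA-1 score of the block contents
	atime  time.Time // last time a client read this block from this node
	copies int       // copies known to exist on other cluster nodes
}

// canEvict reports whether this node may drop its local copy:
// the block must be cold and enough replicas must remain elsewhere.
func canEvict(b blockMeta, minCopies int, coldAfter time.Duration) bool {
	return time.Since(b.atime) > coldAfter && b.copies >= minCopies
}

func main() {
	b := blockMeta{atime: time.Now().Add(-90 * 24 * time.Hour), copies: 3}
	fmt.Println(canEvict(b, 2, 30*24*time.Hour)) // true: cold and replicated enough
}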

> since storage is very cheap, i think this is a good tradeoff.

I'm thinking of a scale where storage isn't that cheap ...

> > This still isn't a replicated/distributed fs, but a clustered
> > block storage, maybe even a basis for a truly replicated fs.
> > BTW: with a bit more logic, we could even build something like
> > Amazon's S3 on that ;-)
>
> what problem are you trying to solve?  if you are trying to go for
> reliability, i would think it would be easier to use raid+backups
> for data stability.

Easier, yes, but more expensive (at least the iron).

> consider this case.  two fs want to add different files to the same
> directory "at the same time".  i don't see how block storage can
> help you with any of the problems that arise from this case.

It shouldn't, just as a RAID can't help a local fs with multiple
users adding files to the same directory.

In my concept, the distribution of the block storage has nothing
to do with the (possible) distribution of the fs. My venti-2 will
be like a SAN, just with content addressing :)
So, instead of a SAN or local RAID, you can simply use a venti-2
cloud. The venti clients (e.g. fossil, vac, ...) do not need any
knowledge of this fact.
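
To illustrate what I mean by "a SAN with content addressing": blocks
are addressed by a hash of their contents (venti uses SHA-1 scores),
so writing the same data twice lands on the same address. A minimal
in-memory sketch in Go, only to show the interface, nothing more:

package main

import (
	"crypto/sha1"
	"fmt"
)

// Score is the content address of a block: the SHA-1 of its bytes.
type Score [sha1.Size]byte

type BlockStore struct {
	blocks map[Score][]byte
}

func NewBlockStore() *BlockStore {
	return &BlockStore{blocks: make(map[Score][]byte)}
}

// Write stores a block and returns its score. Writing the same
// content twice is a no-op: the address is the content.
func (s *BlockStore) Write(data []byte) Score {
	sc := Score(sha1.Sum(data))
	s.blocks[sc] = append([]byte(nil), data...)
	return sc
}

// Read returns the block for a score, or false if it is unknown here.
func (s *BlockStore) Read(sc Score) ([]byte, bool) {
	b, ok := s.blocks[sc]
	return b, ok
}

func main() {
	bs := NewBlockStore()
	sc := bs.Write([]byte("hello, venti"))
	fmt.Printf("score %x\n", sc)
}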

A venti-based distributed filesystem is a completely different issue.
All nodes will store their (payload) data in one venti (-cloud).
Of course the nodes have to coordinate their actions (through a
separate channel), but this will only be required for metadata,
not payload. Data cache coherency isn't an issue anymore, since a
data block itself cannot change - only a file's data pointers,
which belong to the metadata, will change.
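
A little sketch of why coherency drops out (again Go, and the block
layout is invented for illustration, not venti's real format): a file
version is just an ordered list of scores, so a new version produces
new blocks and new metadata - no stored block is ever touched.

package main

import (
	"crypto/sha1"
	"fmt"
)

const blockSize = 8192

type Score [sha1.Size]byte

// writeFile splits data into fixed-size blocks, stores each one, and
// returns the score list -- the only part that lives in metadata.
func writeFile(store map[Score][]byte, data []byte) []Score {
	var scores []Score
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		blk := data[off:end]
		sc := Score(sha1.Sum(blk))
		store[sc] = append([]byte(nil), blk...) // immutable once written
		scores = append(scores, sc)
	}
	return scores
}

func main() {
	store := make(map[Score][]byte)
	v1 := writeFile(store, []byte("version one of the file"))
	v2 := writeFile(store, []byte("version two of the file"))
	// Both versions coexist; only the score lists (metadata) differ.
	fmt.Println(len(v1), len(v2), len(store))
}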

For example, if only one node can write to a file (and writes don't
have to appear to others simultaneously reading the same file,
a.k.a. transaction semantics ;-)), single files could be stored via
vac, and the fs cluster only has to manage directories. The directory
server(s) then manage the permissions and directory updates. Each
commit of a new file or file change triggers a directory update.
This can be done transactionally via an RDBMS.
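
Roughly like this (a hypothetical sketch - the dirents table, its
columns and the Postgres-style placeholders are my assumptions, not
an existing schema):

package dirserver

import "database/sql"

// commitFile records a new file version in the directory server's
// database: either the whole update becomes visible, or none of it.
func commitFile(db *sql.DB, dir, name, vacScore string) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // becomes a no-op once Commit has succeeded

	// Point the directory entry at the new vac score (upsert).
	_, err = tx.Exec(
		`INSERT INTO dirents (dir, name, vac_score)
		 VALUES ($1, $2, $3)
		 ON CONFLICT (dir, name) DO UPDATE SET vac_score = EXCLUDED.vac_score`,
		dir, name, vacScore)
	if err != nil {
		return err
	}
	return tx.Commit()
}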

The nice thing about this concept is that the venti cloud could even
be built from hosts that aren't completely trusted (as long as the data
itself is properly encrypted) - as long as there are enough copies and
you have enough peerings in the cloud, single nodes can't harm your data.
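
For illustration, the client-side encryption could be as simple as the
sketch below (AES-GCM picked arbitrarily; the untrusted node then only
ever stores ciphertext, and the content address would be computed over
that ciphertext). Note that a random nonce means identical blocks no
longer coincide across writers; a convergent scheme would preserve
that, at the cost of leaking which blocks are equal.

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// sealBlock encrypts a data block with AES-GCM under key (16/24/32 bytes).
// The random nonce is prepended to the returned ciphertext.
func sealBlock(key, data []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, data, nil), nil
}

func main() {
	key := make([]byte, 32)
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	ct, err := sealBlock(key, []byte("payload block"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d ciphertext bytes go to the cloud\n", len(ct))
}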


cu
--
----------------------------------------------------------------------
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 cellphone: +49 174 7066481   email: info@metux.de   skype: nekrad666
----------------------------------------------------------------------
 Embedded Linux / Porting / Open-Source QM / Distributed Systems
----------------------------------------------------------------------


