9fans - fans of the OS Plan 9 from Bell Labs
From: Roman Shaposhnik <rvs@sun.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Changelogs & Patches?
Date: Mon, 19 Jan 2009 22:48:08 -0800
Message-ID: <F421E2C8-1CC1-4D1D-BEFE-3E748AF0B136@sun.com>
In-Reply-To: <14ec7b180901060622q2a179705g53c7fe62e70ee90a@mail.gmail.com>

Hi Andrey!

Sorry, it took me longer to dig through the code than
I had hoped. So, if you're still game...

On Jan 6, 2009, at 6:22 AM, andrey mirtchovski wrote:
> i'm using zfs right now for a project storing a few terabytes worth of
> data and vm images.

Was it set up that way from the get-go, or did you use venti-based
solutions before?

> i have two zfs servers and about 10 pools of
> different sizes with several hundred different zfs filesystems and
> volumes of raw disk exported via iscsi.

What kind of clients are on the other side of iscsi?

> clones play a vital part in the whole set up (they number in the
> thousands).
> for what it's worth, zfs is the best thing in linux-world (sorry,
> solaris and *bsd too)

You're using it on Linux?

>> Fair enough. But YourTextGoesHere then becomes a transient property
>> of my namespace, where in case of ZFS it is truly a tag for a
>> snapshot.
>
> all snapshots have tags: their top-level sha1 score. what i supplied
> was simply a way to translate that to any random text. you don't need
> to, nor do you have to do this (by the way, do you get the irony of
> forcing snapshots to contain the '@' character in their name? sounds a
> lot like '#' to me ;)

Ok. Fair enough. I think I'm convinced on that point.
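
To make the tag idea concrete, here is a minimal sketch in Python. It is
an illustration only, not how fossil or venti store anything: the `tags`
table and the `score_of`/`tag`/`resolve` helpers are made-up names, and
the "score" is simply the SHA-1 of some stand-in bytes.

```python
import hashlib

# Hypothetical tag table: human-readable label -> score (hex SHA-1).
tags = {}

def score_of(data: bytes) -> str:
    """Return the SHA-1 of data as 40 hex digits, standing in for a score."""
    return hashlib.sha1(data).hexdigest()

def tag(label: str, score: str) -> None:
    """Attach a label to a score; the score itself never changes."""
    tags[label] = score

def resolve(name: str) -> str:
    """Accept either a label or a raw score and return the score."""
    return tags.get(name, name)

root = score_of(b"stand-in for a snapshot root block")
tag("YourTextGoesHere", root)
```

The label lives entirely outside the snapshot, which is the "transient
property of my namespace" point made above: dropping the table loses the
names but not the scores.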

> snapshots are generally accessible via fossil as a directory with the
> date of the snapshot as its name. this starts making more sense when
> you take into consideration that snapshots are global per fossil, but
> then you can run several fossils without having them step on their
> toes when it comes to venti. at least until you get a collision in
> blocks' hashes.

Aha! And here are my first questions: you say that I can run multiple
fossils off of the same venti and thus have a setup that is very close
to zfs clones:
    1. how do you do that exactly? fossil -f doesn't work for me (nor
       should it, according to the docs)
    2. how do you work around the fact that each fossil needs its own
       partition (unlike ZFS, where all the clones can share the same
       pool of blocks)?

> venti is write-once. if you instantiate a fossil from a venti score it
> is, by definition, read-only, as all changes to the current fossil
> will not appear to another fossil instantiated from the same venti
> score. changes are committed to venti once you do a fossil snap,
> however that automatically generates a new snapshot score (not
> modifying the old one). it should be clear from the paper.

I think I understand it now (except for the fossil -f part), but how do
you promote (zfs promote) such a clone?
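
The write-once behaviour described in the quoted paragraph can be
sketched roughly as follows. This is a toy model under my own
assumptions, not venti's actual interface: `WriteOnceStore`, `put` and
`get` are hypothetical names, and a whole "file tree" is reduced to one
block of bytes.

```python
import hashlib

class WriteOnceStore:
    """Toy content-addressed store: blocks are keyed by their SHA-1 score."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        score = hashlib.sha1(data).hexdigest()
        # Writing the same bytes twice is a no-op; nothing is ever overwritten.
        self._blocks.setdefault(score, data)
        return score

    def get(self, score: str) -> bytes:
        data = self._blocks[score]
        # Re-hash on read: a mismatch means the stored bytes have changed.
        if hashlib.sha1(data).hexdigest() != score:
            raise ValueError("block does not match its score")
        return data

store = WriteOnceStore()
old = store.put(b"file tree, version 1")
new = store.put(b"file tree, version 2")  # stands in for a snap after changes
```

Both scores stay valid afterwards: `old` still names version 1, and the
changed tree is reachable only through `new`.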

> where the second choice becomes a nuisance for me is in the case
> where one has thousands of clones and needs to keep track of
> thousands of names in order to ensure that when the right one has
> finished the right clone disappears.

I see what you mean, but in case of venti -- nothing disappears, really.
From that perspective you can sort of make those zfs clones linger.
The storage consumption won't be any different, right?

>>> - none of this can be done remotely
>>
>> Meaning?
>
> from machine X in the datacentre i want to be able to say "please
> create me a clone of the latest snapshot of this filesystem" without
> having to ssh to the solaris node running zfs.

Well, if it's the protocol you don't like -- writing your own daemon
that will respond to such requests sounds like a trivial task
to me.
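
A rough sketch of what such a daemon could look like, in Python. The
one-line "clone <filesystem> <target>" protocol, the port number and the
function names are all invented for illustration; the only real
interfaces used are `zfs list` and `zfs clone`. There is no
authentication or error handling here, so treat it as a starting point
rather than something to deploy.

```python
import socketserver
import subprocess

def latest_snapshot(filesystem, run=subprocess.check_output):
    """Return the newest snapshot of `filesystem`, or None if it has none."""
    # -H: no header; -s creation: oldest first, so the last line is newest.
    out = run(["zfs", "list", "-H", "-t", "snapshot", "-o", "name",
               "-s", "creation", "-r", filesystem])
    # -r also lists snapshots of child filesystems; keep only our own.
    names = [n for n in out.decode().splitlines()
             if n.startswith(filesystem + "@")]
    return names[-1] if names else None

def clone_latest(filesystem, target, run=subprocess.check_output):
    """Clone the newest snapshot of `filesystem` into `target`."""
    snap = latest_snapshot(filesystem, run)
    if snap is None:
        return "error: no snapshots"
    run(["zfs", "clone", snap, target])
    return "ok " + snap

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # Hypothetical one-line protocol: "clone <filesystem> <target>"
        words = self.rfile.readline().decode().split()
        if len(words) == 3 and words[0] == "clone":
            reply = clone_latest(words[1], words[2])
        else:
            reply = "error: usage: clone <filesystem> <target>"
        self.wfile.write((reply + "\n").encode())

def serve(port=5640):  # the port number is arbitrary
    with socketserver.TCPServer(("", port), Handler) as srv:
        srv.serve_forever()
```

Calling `serve()` on the Solaris node would let machine X send a single
line over TCP instead of logging in via ssh. The `run` parameter exists
only so the command logic can be exercised without a real pool.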

> i couldn't find the source for libzfs either, without having to
> register to the opensolaris developers' site.
[...]

> and i think i'm using a pretty new version of zfs and my experiences
> are, in fact, quite recent :)

Well, the fact that you had to register in order to access the code
suggests a pretty dated experience ;-)
     http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/

> instead of reverse engineering a library that i have not much faith
> in, i wrote a python 9p server that uses local zfs/zpool commands to
> do what i could've done with C and libzfs. it's a hack but it gets the
> job done. now i can access block X of zfs volume Y remotely via 9p (at
> one third the speed, to be fair).

Well, Solaris desperately wanted to enter the Open Source geekdom
and from your experience it seems like it was a success ;-) Seriously
though, I personally found reading source code of zdb to be
absolutely illuminating about all sorts of things ZFS:
     http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/zdb/zdb.c

But yes -- just like with any unruly OS project you have to really
invest your time if you want to tag along. I think it was Russ who
made a comment that Free Software is only free if your time has no
value :-(

> i would be glad to help you understand the differences between zfs and
> fossil/venti with my limited knowledge of both.

Great! I tried to do as much homework as possible (hence the delay) but
I still have some questions left:
    0. A dumb one: what's the proper way of cleanly shutting down
       fossil and venti?

    1. What's the use of copying arenas to CD/DVD? Is it purely backup,
       since they have to stay on-line forever?

    2. Would fossil/venti notice silent data corruptions in blocks?

    3. Do you think it's a good idea to have volume management be
       part of filesystems, since that way you can try to heal the data
       on-the-fly?

    4. If I have a venti server and a bunch of sha1 scores, can I somehow
       instantiate a single fossil serving all of them under /archive?


Thanks,
Roman.



