9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] Changelogs & Patches?
@ 2008-12-22 15:27 Venkatesh Srinivas
  2008-12-22 15:29 ` erik quanstrom
                   ` (3 more replies)
  0 siblings, 4 replies; 91+ messages in thread
From: Venkatesh Srinivas @ 2008-12-22 15:27 UTC (permalink / raw)
  To: 9fans

Hi,

The contrib index mentions that daily changelogs for Plan 9 are in
sources/extra/changes, but those haven't been updated since early 2007.
Is there any preferred way to get changelogs / diffs these days?

Also, in sources/patch, there are patches in neither applied/ nor sorry/.
Are these patches in the queue? Applied? Not applied?

Thanks,
-- vs



* Re: [9fans] Changelogs & Patches?
  2008-12-22 15:27 [9fans] Changelogs & Patches? Venkatesh Srinivas
@ 2008-12-22 15:29 ` erik quanstrom
  2008-12-22 16:41 ` Charles Forsyth
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2008-12-22 15:29 UTC (permalink / raw)
  To: 9fans

> Also, in sources/patch, there are patches in neither applied/ nor sorry/.
> Are these patches in the queue? Applied? Not applied?

in the queue and not applied.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-22 15:27 [9fans] Changelogs & Patches? Venkatesh Srinivas
  2008-12-22 15:29 ` erik quanstrom
@ 2008-12-22 16:41 ` Charles Forsyth
  2008-12-25  6:34   ` Roman Shaposhnik
  2008-12-22 17:03 ` Devon H. O'Dell
  2008-12-23  4:46 ` Nathaniel W Filardo
  3 siblings, 1 reply; 91+ messages in thread
From: Charles Forsyth @ 2008-12-22 16:41 UTC (permalink / raw)
  To: 9fans

>Is there any preferred way to get changelogs / diffs these days?

i use

9fs sources
diff /whatever /n/sources/plan9/whatever

and after a pull

yesterday -d ...
when i'm especially curious or anxious.

it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!),
maintained by the command that applies patches, to which the readme/notes
file(s) for each patch are appended as it is applied.  not all changes are made through patches, though.
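
a minimal sketch of that idea in rc -- the CHANGES path and $patchdir are
illustrative, not an existing convention; plan 9's chmod sets DMAPPEND and
DMEXCL with the a and l mode letters:

	touch /n/sources/plan9/CHANGES
	chmod +al /n/sources/plan9/CHANGES	# append-only (DMAPPEND), exclusive-use (DMEXCL)
	# then, in whatever script applies a patch:
	cat $patchdir/readme >> /n/sources/plan9/CHANGES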



* Re: [9fans] Changelogs & Patches?
  2008-12-22 15:27 [9fans] Changelogs & Patches? Venkatesh Srinivas
  2008-12-22 15:29 ` erik quanstrom
  2008-12-22 16:41 ` Charles Forsyth
@ 2008-12-22 17:03 ` Devon H. O'Dell
  2008-12-23  4:31   ` Uriel
  2008-12-23  4:46 ` Nathaniel W Filardo
  3 siblings, 1 reply; 91+ messages in thread
From: Devon H. O'Dell @ 2008-12-22 17:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2008/12/22 Venkatesh Srinivas <me@acm.jhu.edu>:
> Hi,
>
> The contrib index mentions that daily changelogs for Plan 9 are in
> sources/extra/changes, but those haven't been updated since early 2007.
> Is there any preferred way to get changelogs / diffs these days?

I used to maintain the changelogs, but ended up generating ENOTIME,
pretty much like everyone else who has worked on it. It's something
I think I might pick up again; either Russ or Uriel emailed me a set
of scripts for maintaining it, so it's mostly just a question of
getting the scripts set up and doing it.

--dho

> Also, in sources/patch, there are patches in neither applied/ nor sorry/.
> Are these patches in the queue? Applied? Not applied?
>
> Thanks,
> -- vs
>
>



* Re: [9fans] Changelogs & Patches?
  2008-12-22 17:03 ` Devon H. O'Dell
@ 2008-12-23  4:31   ` Uriel
  0 siblings, 0 replies; 91+ messages in thread
From: Uriel @ 2008-12-23  4:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

It is pretty much a question of it being a totally backwards way of
doing things, with one set of people making the changes, and another
set of people writing the changelog by guessing at the meaning of
those changes.

(This is claimed to be due to the first set of people not having the
time to write down what changes they make. Of course, those same
people seem to think the time the second group spends inquiring
about the nature of the changes is not wasteful.)

But following more conventional practices and heeding the crazy advice
of unqualified people like Brian when he writes:

"*Keep records*. I maintain a FIXES file that describes every change
to the code since the Awk book was published in 1988" [1]

would be anathema to the Plan 9 way of doing things.

uriel


[1]: http://www.cs.princeton.edu/~bwk/testing.html

On Mon, Dec 22, 2008 at 6:03 PM, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> 2008/12/22 Venkatesh Srinivas <me@acm.jhu.edu>:
>> Hi,
>>
>> The contrib index mentions that daily changelogs for Plan 9 are in
>> sources/extra/changes, but those haven't been updated since early 2007.
>> Is there any preferred way to get changelogs / diffs these days?
>
> I used to maintain the changelogs, but ended up generating ENOTIME,
> pretty much like everyone else who has worked on it. It's something
> I think I might pick up again; either Russ or Uriel emailed me a set
> of scripts for maintaining it, so it's mostly just a question of
> getting the scripts set up and doing it.
>
> --dho
>
>> Also, in sources/patch, there are patches in neither applied/ nor sorry/.
>> Are these patches in the queue? Applied? Not applied?
>>
>> Thanks,
>> -- vs
>>
>>
>
>



* Re: [9fans] Changelogs & Patches?
  2008-12-22 15:27 [9fans] Changelogs & Patches? Venkatesh Srinivas
                   ` (2 preceding siblings ...)
  2008-12-22 17:03 ` Devon H. O'Dell
@ 2008-12-23  4:46 ` Nathaniel W Filardo
  2008-12-25  6:50   ` Roman Shaposhnik
  3 siblings, 1 reply; 91+ messages in thread
From: Nathaniel W Filardo @ 2008-12-23  4:46 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Hi,
>
> The contrib index mentions that daily changelogs for Plan 9 are in
> sources/extra/changes, but those haven't been updated since early 2007.
> Is there any preferred way to get changelogs / diffs these days?

Relatedly, is there a better way to mirror the development history of Plan 9
than running "@{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -x}" or
similar?
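
(A per-day variant of the same idea, with an illustrative date and target
path; Plan 9's tar uses the standard streams when no f modifier is given:

	9fs sourcesdump
	mkdir -p /n/mirror/2008/1222
	@{cd /n/sourcesdump/2008/1222; tar c .} | @{cd /n/mirror/2008/1222; tar x}
)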

Thanks.
--nwf;

* Re: [9fans] Changelogs & Patches?
  2008-12-22 16:41 ` Charles Forsyth
@ 2008-12-25  6:34   ` Roman Shaposhnik
  2008-12-25  6:40     ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-25  6:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 22, 2008, at 8:41 AM, Charles Forsyth wrote:
>> Is there any preferred way to get changelogs / diffs these days?
>
> yesterday -d ...
> when i'm especially curious or anxious.

But yesterday won't work in a more lightweight environment (such as
9vx), will it?

> it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!)
> maintained by the command that applies patches, which appends the
> readme/notes
> file(s) for each patch as it is applied.  not all changes are done
> through patches.


Speaking of which -- is there any FAQ on the current development
practices of the Plan9 project? Things like the patch lifecycle, etc.?

Thanks,
Roman.




* Re: [9fans] Changelogs & Patches?
  2008-12-25  6:34   ` Roman Shaposhnik
@ 2008-12-25  6:40     ` erik quanstrom
  2008-12-26  4:28       ` Roman Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2008-12-25  6:40 UTC (permalink / raw)
  To: 9fans

>>> Is there any preferred way to get changelogs / diffs these days?
>>
>> yesterday -d ...
>> when i'm especially curious or anxious.
>
> But yesterday won't work in a more lightweight environment (such as
> 9vx) will it?

exactly the same as plan 9 does.

as long as the fs supports a dump fs, 9vx will support yesterday.

for example, i've been mounting my diskless fs with 9vx.  yesterday
works just fine.  i'm sure you could use a linux-based venti with
plan 9-based fossil as well.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-23  4:46 ` Nathaniel W Filardo
@ 2008-12-25  6:50   ` Roman Shaposhnik
  2008-12-25 14:37     ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-25  6:50 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 22, 2008, at 8:46 PM, Nathaniel W Filardo wrote:
>> Hi,
>>
>> The contrib index mentions that daily changelogs for Plan 9 are in
>> sources/extra/changes, but those haven't been updated since early
>> 2007.
>> Is there any preferred way to get changelogs / diffs these days?
>
> Relatedly, is there a better way to mirror the development history
> of Plan 9
> than running "@{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -
> x}" or
> similar?

I surely hope the festive mood of the season will protect me from being
ostracized for asking this, but is there any chance of mapping Plan9
development practices onto some of the established ways of source
code management? I mostly long for things like being able to browse
Plan9 history with a clear understanding of who did what, and for
what reason.

Say what you will about the Linux kernel, but things like these:
    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary
surely make it much more bearable to work with^H^H^H^H^H around.

Thanks,
Roman.

P.S. I see that Russ uses the Mercurial SCM for some of his other
projects, so maybe my question is not that weird, after all...



* Re: [9fans] Changelogs & Patches?
  2008-12-25  6:50   ` Roman Shaposhnik
@ 2008-12-25 14:37     ` erik quanstrom
  2008-12-26 13:27       ` Charles Forsyth
  2008-12-27  7:40       ` Roman Shaposhnik
  0 siblings, 2 replies; 91+ messages in thread
From: erik quanstrom @ 2008-12-25 14:37 UTC (permalink / raw)
  To: 9fans

> I surely hope the festive mood of the season will protect me from being
> ostracized for asking this, but is there any chance of mapping Plan9
> development practices onto some of the established ways of source
> code management? I mostly long for things like being able to browse
> Plan9 history with a clear understanding of who did what, and for
> what reason.

in the holiday spirit ☺, isn't this similar logic:
1. scm packages are peace, hope, and light; everybody knows that.
2. if you don't use a scum package you are in the darkness.
3. if you are in the darkness, you must be saved, or be cast into
   the pit.

despite the season, and typical attitudes, i don't think that
development practices are a spiritual or moral decision.
they are a practical one.  and what they have done at the
labs appears to me to be working.  in my own experience,
i've found scum always to cost time.  but my big objection
is the automatic merge.  automatic merges make it way too
easy to merge bad code without reviewing the diffs.

while a descriptive history is good, it takes a lot of extra work
to generate.  just because it's part of the scum process doesn't
make it free.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-25  6:40     ` erik quanstrom
@ 2008-12-26  4:28       ` Roman Shaposhnik
  2008-12-26  4:45         ` lucio
  2008-12-26  4:57         ` Anthony Sorace
  0 siblings, 2 replies; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-26  4:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 24, 2008, at 10:40 PM, erik quanstrom wrote:
>>>> Is there any preferred way to get changelogs / diffs these days?
>>>
>>> yesterday -d ...
>>> when i'm especially curious or anxious.
>>
>> But yesterday won't work in a more lightweight environment (such as
>> 9vx) will it?
>
> exactly the same as plan 9 does.
>
> as long as the fs supports a dump fs, 9vx will support yesterday.

True. But not having an fs that supports dump is exactly what makes
9vx a lighter-weight environment (unless I'm grossly mistaken and
#Z in 9vx actually has a way of supporting dump).

> for example, i've been mounting my diskless fs with 9vx.  yesterday
> works just fine.  i'm sure you could use a linux-based venti with
> plan 9-based fossil as well.


True, but I'd really like NOT to have any extra software running and
still have the ability to do replica/* and yesterday under 9vx.

Can this be done?

Thanks,
Roman.



* Re: [9fans] Changelogs & Patches?
  2008-12-26  4:28       ` Roman Shaposhnik
@ 2008-12-26  4:45         ` lucio
  2008-12-26  4:57         ` Anthony Sorace
  1 sibling, 0 replies; 91+ messages in thread
From: lucio @ 2008-12-26  4:45 UTC (permalink / raw)
  To: 9fans

> True, but I'd really like to NOT have any extra software running and
> still have and ability to do replica/* and yesterday under 9vx.

I'm only vaguely familiar with 9vx, so I can't speak to that, but you
can certainly do replica/*, as it is a user-level tool; as for
yesterday, you can apply it to /n/sources, which is what you seem to
imply is your requirement.

++L




* Re: [9fans] Changelogs & Patches?
  2008-12-26  4:28       ` Roman Shaposhnik
  2008-12-26  4:45         ` lucio
@ 2008-12-26  4:57         ` Anthony Sorace
  2008-12-26  6:19           ` blstuart
  2008-12-27  8:00           ` Roman Shaposhnik
  1 sibling, 2 replies; 91+ messages in thread
From: Anthony Sorace @ 2008-12-26  4:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

depends what you mean by "extra". if that means "outside 9vx", then
yes; if it means "besides what 9vx uses by default", no.

yesterday(1) relies on having dump-style snapshots. 9vx, as shipped,
gets its root file system from #Z, which doesn't have snapshots.

erik offered some suggestions for hosting various bits of things
outside 9vx and connecting to that in order to get the dumps. those
options are valid, but you can just as well host the entire thing
within 9vx. it's not the default configuration, but i believe
instructions are out there (9fans or the wiki).

using fossil for your root, instead of #Z, will obviously cost you the
benefits of #Z - namely, the pass-through transparency. if your
primary interest is for replica/*, though, you might consider the
direction i've been headed: root from fossil, but import $home or /usr
from #Z.



* Re: [9fans] Changelogs & Patches?
  2008-12-26  4:57         ` Anthony Sorace
@ 2008-12-26  6:19           ` blstuart
  2008-12-27  8:00           ` Roman Shaposhnik
  1 sibling, 0 replies; 91+ messages in thread
From: blstuart @ 2008-12-26  6:19 UTC (permalink / raw)
  To: 9fans

> using fossil for your root, instead of #Z, will obviously cost you the
> benefits of #Z - namely, the pass-through transparency. if your
> primary interest is for replica/*, though, you might consider the
> direction i've been headed: root from fossil, but import $home or /usr
> from #Z.

That's close to what I'm doing.  When I'm running stand-alone,
I boot from fossil, bind #Z to /n/unix and bind my UNIX
home directory to a mount point in the fossil file system.
When running as a terminal, I boot from my file server, and
still use pretty much the same binds.
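
(A minimal sketch of those binds in rc, following the description above;
the paths are illustrative:

	bind '#Z' /n/unix			# the host file system
	bind -c /n/unix/home/bls /usr/bls/unix	# a host directory inside the fossil tree
)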

BLS




* Re: [9fans] Changelogs & Patches?
  2008-12-25 14:37     ` erik quanstrom
@ 2008-12-26 13:27       ` Charles Forsyth
  2008-12-26 13:33         ` Charles Forsyth
                           ` (2 more replies)
  2008-12-27  7:40       ` Roman Shaposhnik
  1 sibling, 3 replies; 91+ messages in thread
From: Charles Forsyth @ 2008-12-26 13:27 UTC (permalink / raw)
  To: 9fans

>while a descriptive history is good, it takes a lot of extra work
>to generate.

i've rarely found per-change histories to be any more useful than most other comments, i'm afraid.
you'd hope it would answer "what was he thinking?", but i found that either it was obvious or i still had to ask.
still, perhaps it could be regarded as an aid to future computer archaeologists, after
all shared context has been lost.

the intention of things like /CHANGES is mainly to point out moderate to large changes (eg, if you've
been waiting for a bug fix or there's a significant change to usage or operation).
it isn't intended to give details or rationale of the fix, any more than there is any of that for the
original code, really.  perhaps literate programming will fix that if it ever takes off.
(the set of people that write good descriptions and the set of people that write good code
don't necessarily have a big intersection.)  for larger additions or changes i sometimes wrote
short notes giving the background, the changes/additions and the rationale for them,
ranging from the equivalent of a long e-mail to a several-page paper. that worked quite
well, but was somewhat more work.

also useful for compilers are links to bug demonstration programs and regression tests.

the advantage of dump and snap is that the scope is the whole system: including emails, discussion documents,
the code, supporting tools -- everything in digital form.  if software works differently today
compared to yesterday, then



* Re: [9fans] Changelogs & Patches?
  2008-12-26 13:27       ` Charles Forsyth
@ 2008-12-26 13:33         ` Charles Forsyth
  2008-12-26 14:27         ` tlaronde
  2008-12-29 23:54         ` Roman Shaposhnik
  2 siblings, 0 replies; 91+ messages in thread
From: Charles Forsyth @ 2008-12-26 13:33 UTC (permalink / raw)
  To: 9fans

>the advantage of dump and snap is that the scope is the whole system: including emails, discussion documents,
>the code, supporting tools -- everything in digital form.  if software works differently today
>compared to yesterday, then

sorry, got cut off.   then in most cases, i'd expect 9fs dump to make it easy to track down the
set of differences and narrow the search to the culprit.  it might not even be a source change,
but a configuration file, or a file that was moved or removed.
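
(a sketch of that kind of search; the date and path are illustrative:

	9fs dump
	diff /n/dump/2008/1225/sys/src/cmd/upas/smtp/smtp.c /sys/src/cmd/upas/smtp/smtp.c
)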



* Re: [9fans] Changelogs & Patches?
  2008-12-26 13:27       ` Charles Forsyth
  2008-12-26 13:33         ` Charles Forsyth
@ 2008-12-26 14:27         ` tlaronde
  2008-12-26 17:25           ` blstuart
  2008-12-29 23:54         ` Roman Shaposhnik
  2 siblings, 1 reply; 91+ messages in thread
From: tlaronde @ 2008-12-26 14:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Dec 26, 2008 at 01:27:49PM +0000, Charles Forsyth wrote:
> perhaps literate programming will fix that if it ever takes off.

I use CWEB (D. Knuth and S. Levy's) intensively and it is indeed
invaluable. It doesn't magically improve code (my first attempts just
showed how poor my programming was: it is a magnifying glass, and
through it one sees the bugs' blinking eyes and bright smiles).

It is absolutely easy to use. But it is not merely another means of
programming; it is another way of programming.

Once you think about what you want to do (and recognize the layout
of CWEB as the layout of good old textbooks---the paragraphs), and
start putting down "axioms" and implementing the correct pieces, the
payoff is great in consistency and conciseness, hence in maintenance.
(At the beginning, I was writing "books", and my descriptions were
long and poor, sometimes even pure nonsense. Quality has increased
while length has decreased.)

BTW, I also use CVS and record a short description of the modifications
or extensions made. But to be honest, except for tagging which fault was
fixed and in which version, the rest has not been of any use (it is
supposed to be correctly explained in the documentation written with
CWEB...).

I also use CVS as a backup mechanism, i.e. many short-lived revisions
have no engineering meaning, since they are only backups of work in
progress. So my use of CVS is an impure one and cannot claim to be
exclusively engineering.

I do plan to set up a plan9 file server. But it's TODO.
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



* Re: [9fans] Changelogs & Patches?
  2008-12-26 14:27         ` tlaronde
@ 2008-12-26 17:25           ` blstuart
  2008-12-26 18:14             ` tlaronde
  0 siblings, 1 reply; 91+ messages in thread
From: blstuart @ 2008-12-26 17:25 UTC (permalink / raw)
  To: 9fans

> I use CWEB (D. Knuth and S. Levy's) intensively and it is indeed
> invaluable. It doesn't magically improve code (my first attempts just
> showed how poor my programming was: it is a magnifying glass, and
> through it one sees the bugs' blinking eyes and bright smiles).

Back when I used CWEB on a regular basis (I don't find myself
writing as much substantive code from scratch of late), I
experienced an interesting phenomenon.  I could write
pretty good code, almost as a stream of consciousness.
The tool made it natural to present the code in the order
in which I could understand it, rather than the order the
compiler wanted it.  But it was the effect of this that was
really interesting.  I found that as I wrote I'd think in terms
of several things I needed to do, and I'd put in placeholders
(chunk names) for all but the one I was writing just then.
As I'd finish a chunk, I'd go back and find another one
that I hadn't written yet, and I could easily pick them up in
whatever order I figured out how I wanted to handle them.
At some point, I just ran out of chunks that needed to
be written, and the code would be done.  It was almost
as if the completion of the code snuck up on me.  At
first, it was sort of a "maybe Knuth's on to something
here" but it happened often enough that I now consider
it a basic feature of the style.

Back to the topic in question though, I did find that
writing and maintaining good descriptions took almost
as much discipline as any other code documentation.
I did have to resist the urge to leave the textual part
of a chunk blank and just write the code.  I also had
to be diligent about updating the descriptions when
the code changed.  But for whatever reason (aesthetics,
the tool, living up to Knuth's example...) it did seem a
little easier in that context.

However, in terms of changelogs and such, I'd say
that's still an open question.  It would seem that there
should be some way to automate the creation of a
changelog (at least in the form of a list of pointers)
from the literate source.  But the literate style itself
doesn't really seem to create anything new in terms
of the high level overview that you'd see in release
notes or changelogs.

BLS




* Re: [9fans] Changelogs & Patches?
  2008-12-26 17:25           ` blstuart
@ 2008-12-26 18:14             ` tlaronde
  2008-12-26 18:20               ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: tlaronde @ 2008-12-26 18:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Dec 26, 2008 at 11:25:33AM -0600, blstuart@bellsouth.net wrote:
>
> Back when I used CWEB on a regular basis (I don't find myself
> writing as much substantive code from scratch of late), I
> experienced an interesting phenomenon.  I could write
> pretty good code, almost as a stream of consciousness.
> The tool made it natural to present the code in the order
> in which I could understand it, rather than the order the
> compiler wanted it.

Yes, but this means you have adapted the way you write code to the
logic behind literate programming. Starting from a "structured
programming" approach (literate programming is indeed more) is
probably best. If, as I have done..., one looks at the finger instead
of the moon, and takes it to be a way of formatting comments with all
the bells and whistles of TeX, one is definitely not on the right
track---and that's why the packages that format C comments embedded
in source are definitely not the same thing.

Once you get the hang of it, it really helps, as you describe. (I have
one library that I wrote almost in one go---the Esri SHAPE lib support
for KerGIS---and it does the job; it was not the first, but it was the
first I wrote with explanations in _French_, my native and thinking
language; so now, since I think in French, I write in French---but
code, including identifiers and one-line comments, is in \CEE. This is
the second lesson I learned.)

>
> However, in terms of changelogs and such, I'd say
> that's still an open question.  It would seem that there
> should be some way to automate the creation of a
> changelog (at least in the form of a list of pointers)
> from the literate source.  But the literate style itself
> doesn't really seem to create anything new in terms
> of the high level overview that you'd see in release
> notes or changelogs.

I like text, because of diffs. And CWEB has diffs ;) You can even
compare this with Brooks' "The Mythical Man-Month": adapting CWEB's
diff features slightly would give the change-highlighting documents
Brooks wrote about.

Even with data, to get to the point one needs only diffs (I use them
with vector map data to highlight what changes have been made between
different versions provided by surveyors. This, with the ability to
show the state of the data at YYYY-MM-DD hh:mm:ss, is invaluable.)

That is one of the many reasons I find plan9 so interesting: it is
text oriented.
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



* Re: [9fans] Changelogs & Patches?
  2008-12-26 18:14             ` tlaronde
@ 2008-12-26 18:20               ` erik quanstrom
  2008-12-26 18:52                 ` tlaronde
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2008-12-26 18:20 UTC (permalink / raw)
  To: 9fans

>> Back when I used CWEB on a regular basis (I don't find myself
>> writing as much substantive code from scratch of late), I

is it just me, or is it hard to read someone else's cweb code?
if it's not just me...

i wonder if the same thing that makes it easy to write from the top
down doesn't also make it hard to read.  you have to be thinking
the same way from the top, otherwise you're lost.

appropriately, this being a plan 9 list and all, i find code
written from the bottom up easier to read.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-26 18:20               ` erik quanstrom
@ 2008-12-26 18:52                 ` tlaronde
  2008-12-26 21:44                   ` blstuart
  0 siblings, 1 reply; 91+ messages in thread
From: tlaronde @ 2008-12-26 18:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:
> appropriately, this being a plan 9 list and all, i find code
> written from the bottom up easier to read.
>

Depending on the task (on the aim of the software), one happens to split
from top to bottom, and to review and amend from bottom to top.
There is a navigation between the two.

Bottom to top is easier because you are building more complicated
things from basic things.

But the definition of these elements (the software's orthonormal
basis), the justification of these elements, can be in part, has to be
in part, the result of top-to-bottom thought.

The general papers about Unix and Plan 9, the explanations of the logic
of the whole, cannot, IMHO, be tagged as "bottom to top". They are
simply to the point ;)
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



* Re: [9fans] Changelogs & Patches?
  2008-12-26 18:52                 ` tlaronde
@ 2008-12-26 21:44                   ` blstuart
  2008-12-26 22:04                     ` Eris Discordia
  0 siblings, 1 reply; 91+ messages in thread
From: blstuart @ 2008-12-26 21:44 UTC (permalink / raw)
  To: 9fans

> On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:
>> appropriately, this being a plan 9 list and all, i find code
>> written from the bottom up easier to read.
>
> Depending on the task (on the aim of the software), one happens to split
> from top to bottom, and to review and amend from bottom to top.
> There is a navigation between the two.
>
> Bottom to top is easier because you are building more complicated
> things from basic things.

Some time back, I was trying to understand how to teach the
reality of composing software.  (Yes, I do think of it as a creative
activity very similar to composing music.)  The top-down and
bottom-up ideas abound and make sense, but they never seemed
to capture the reality.  Then one day, after introspecting on the
way I write code, I realized it's not one or the other; it's outside-in.
I don't know what little tools I need to build until I have some
sense of the big picture, but I can't really establish the exact
boundaries between major elements until I've worked out the
cleanest way to build the lower-level bits.  So I iteratively work
back and forth between big picture and building blocks until
they meet in the middle.

As an aside, that's also when I realized what had always bugged
me about the classic approach to team programming.  The
interfaces between major parts really come last, but in assigning
work to team members, you have to force them to come first.
And of course, from that perspective, it makes perfect sense
why the best examples of programming are ones where the
first versions are created by only 1 or 2 people, and why the
monstrosities created by large teams of "professional software
engineers" are so often massive collections of mechanisms
that don't work well together.

BLS




* Re: [9fans] Changelogs & Patches?
  2008-12-26 21:44                   ` blstuart
@ 2008-12-26 22:04                     ` Eris Discordia
  2008-12-26 22:30                       ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Eris Discordia @ 2008-12-26 22:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> The Story of Mel
>
> [...]
>
> I compared Mel's hand-optimized programs with the same code massaged by
> the optimizing assembler program, and Mel's always ran faster. That was
> because the "top-down" method of program design hadn't been invented
> yet, and Mel wouldn't have used it anyway. He wrote the innermost parts
> of his program loops first, so they would get first choice of the optimum
> address locations on the drum. The optimizing assembler wasn't smart
> enough to do it that way.
>
> [...]

-- <http://catb.org/jargon/html/story-of-mel.html>

Know why Mel is no more in business? 'Cause one man can only do so much
work. The Empire State Building took many men to build, as did Khufu's
pyramid, and there was no whining about "many mechanisms that don't work
well together." Now go call your managers "PHBs."

--On Friday, December 26, 2008 3:44 PM -0600 blstuart@bellsouth.net wrote:

>> On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:
>>> appropriately, this being a plan 9 list and all, i find code
>>> written from the bottom up easier to read.
>>
>> Depending on the task (on the aim of the software), one happens to split
>> from top to bottom, and to review and amend from bottom to top.
>> There is a navigation between the two.
>>
>> Bottom to top is easier because you are building more complicated
>> things from basic things.
>
> Some time back, I was trying to understand how to teach the
> reality of composing software.  (Yes, I do think of it as a creative
> activity very similar to composing music.)  The top-down and
> bottom-up ideas abound and make sense, but they never seemed
> to capture the reality.  Then one day, after introspecting on the
> way I write code, I realized it's not one or the other; it's outside-in.
> I don't know what little tools I need to build until I have some
> sense of the big picture, but I can't really establish the exact
> boundaries between major elements until I've worked out the
> cleanest way to build the lower-level bits.  So I iteratively work
> back and forth between big picture and building blocks until
> they meet in the middle.
>
> As an aside, that's also when I realized what had always bugged
> me about the classic approach to team programming.  The
> interfaces between major parts really come last, but in assigning
> work to team members, you have to force them to come first.
> And of course, from that perspective, it makes perfect sense
> why the best examples of programming are ones where the
> first versions are created by only 1 or 2 people, and why the
> monstrosities created by large teams of "professional software
> engineers" are so often massive collections of mechanisms
> that don't work well together.
>
> BLS
>
>







* Re: [9fans] Changelogs & Patches?
  2008-12-26 22:04                     ` Eris Discordia
@ 2008-12-26 22:30                       ` erik quanstrom
  2008-12-26 23:00                         ` blstuart
  2008-12-27  6:04                         ` Eris Discordia
  0 siblings, 2 replies; 91+ messages in thread
From: erik quanstrom @ 2008-12-26 22:30 UTC (permalink / raw)
  To: 9fans

> Know why Mel is no more in business? 'Cause one man can only do so much
> work. The Empire State Building took many men to build, as did Khufu's
> pyramid, and there was no whining about "many mechanisms that don't work
> well together." Now go call your managers "PHBs."

building a pyramid, starting at the top is one of those things
that just doesn't scale.

- erik



* Re: [9fans] Changelogs & Patches?
  2008-12-26 22:30                       ` erik quanstrom
@ 2008-12-26 23:00                         ` blstuart
  2008-12-27  6:04                         ` Eris Discordia
  1 sibling, 0 replies; 91+ messages in thread
From: blstuart @ 2008-12-26 23:00 UTC (permalink / raw)
  To: 9fans

> building a pyramid, starting at the top is one of those things
> that just doesn't scale.

But if you figure out how, it's probably worth a Nobel.

BLS




* Re: [9fans] Changelogs & Patches?
  2008-12-26 22:30                       ` erik quanstrom
  2008-12-26 23:00                         ` blstuart
@ 2008-12-27  6:04                         ` Eris Discordia
  2008-12-27 10:36                           ` tlaronde
  1 sibling, 1 reply; 91+ messages in thread
From: Eris Discordia @ 2008-12-27  6:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> building a pyramid, starting at the top is one of those things
> that just doesn't scale.

For that, you have "bottom-up," right? But there's no "meet-in-the-middle"
for a pyramid, or for software. Unless the big picture is small enough to
fit in one man's head and let him "context-switch" back and forth between
the general and the particular, in which case you have to give up expanding
software functionality at the one-man barrier.

All admirable architecture, and admirable software, is, in addition to
being a manifestation of great technique, a manifestation of great
management--even informal management is management in the end. Instead of
"it all begins with Adam and Steve," as Brian Stuart suggests, ways have
been found of managing large teams of people with different specializations,
and those ways work. The Mgmt has a raison d'etre, despite what
techno-people like to suggest.

--On Friday, December 26, 2008 5:30 PM -0500 erik quanstrom
<quanstro@quanstro.net> wrote:

>> Know why Mel is no more in business? 'Cause one man can only do so much
>> work. The Empire State Building took many men to build, as did Khufu's
>> pyramid, and there was no whining about "many mechanisms that don't work
>> well together." Now go call your managers "PHBs."
>
> building a pyramid, starting at the top is one of those things
> that just doesn't scale.
>
> - erik
>







* Re: [9fans] Changelogs & Patches?
  2008-12-25 14:37     ` erik quanstrom
  2008-12-26 13:27       ` Charles Forsyth
@ 2008-12-27  7:40       ` Roman Shaposhnik
  1 sibling, 0 replies; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-27  7:40 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 25, 2008, at 6:37 AM, erik quanstrom wrote:
>>
> despite the season, and typical attitudes, i don't think that
> development practices are a spiritual or moral decision.
> they are a practical one.

Absolutely! Agreed 100%. My original question was not
at all aimed at "saving" Plan9 development practices from
the fiery inferno. Far from it. I simply wanted to figure out
whether the things that really help me follow the development
of other open source projects are available under Plan9. It is
ok for them to be different (e.g. not based on traditional SCMs)
and it is even ok for them not to be available at all.

> and what they have done at the labs appears to me to be working.

It surely does work, in the sense that Plan9 is very much alive and
kicking. But there are also some things that make following Plan9
development and doing software archaeology more difficult than, let's
say, in plan9port.

It very well may be just my own ignorance (in which case, please
educate me on these subjects) but my current impression is that
sources.cs.bell-labs.com is the de-facto SCM for Plan9. In fact,
it is the only way to get new source into the official tree, yet still
have some ability to track the old stuff via main/archive. This model,
however well suited to the closely-knit inner circle of developers,
makes it difficult for me to follow the project. Why? Well, here's my
top reason:
      Plan9 development history is not "quantized" in atomic changesets,
      but rather in 24-hour periods. Even if a developer wanted to record
      the fact that a particular state of the tree corresponds to a bug
      fix or a feature implementation, the only way to do that would be
      not to allow any other changes in within the 24h window. This seems
      rather awkward.

Two less severe problems are the lack of easy tracking of change
ownership and of code migration through time and space. Both are quite
important when one tries to figure out how (and why!) we got from
    /n/sourcesdump/2002/*
to
    /n/sourcesdump/2008/*
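
(For instance, one concrete way of asking that question, with an
illustrative path:

	9fs sourcesdump
	diff /n/sourcesdump/2002/0601/plan9/sys/src/cmd/cat.c /n/sourcesdump/2008/0601/plan9/sys/src/cmd/cat.c
)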

> in my own experience, i've found scum always to cost time.
> but my big objection is the automatic merge.  automatic merges
> make it way too easy to merge bad code without reviewing the diffs.
>
> while a descriptive history is good, it takes a lot of extra work
> to generate.  just because it's part of the scum process doesn't
> make it free.


Agreed. Just as there's a price to pay when one tries to write
clean code, there's a price to pay when one tries to maintain a
clean history(*). In both cases, however, I personally would gladly
pay that price. Otherwise I simply risk insanity once the project
gets over a couple thousand lines of code, or more than a year old.

Thanks,
Roman.

(*) My definition of a clean history is a set of minimal self-contained
changesets.



* Re: [9fans] Changelogs & Patches?
  2008-12-26  4:57         ` Anthony Sorace
  2008-12-26  6:19           ` blstuart
@ 2008-12-27  8:00           ` Roman Shaposhnik
  2008-12-27 11:56             ` erik quanstrom
  1 sibling, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-27  8:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 25, 2008, at 8:57 PM, Anthony Sorace wrote:
> erik offered some suggestions for hosting various bits of things
> outside 9vx and connecting to that in order to get the dumps. those
> options are valid, but you can just as well host the entire thing
> within 9vx. it's not the default configuration, but i believe
> instructions are out there (9fans or the wiki).
>
> using fossil for your root, instead of #Z, will obviously cost you the
> benefits of #Z - namely, the pass-through transparency.

That's good advice, thanks. I wonder, however, whether such transparency
can be achieved the other way around -- serving my entire home
directory via fossil from plan9port, under both UNIX and 9vx. Has anyone
tried such a config?

> if your primary interest is for replica/*,

I'm actually still trying to figure out how replica/* fits together
with sources being a fossil server. These two, somehow, have to click,
but I haven't figured out the connection just yet. Any pointers to
good docs?

Thanks,
Roman.



* Re: [9fans] Changelogs & Patches?
  2008-12-27  6:04                         ` Eris Discordia
@ 2008-12-27 10:36                           ` tlaronde
  2008-12-27 16:27                             ` Eris Discordia
  0 siblings, 1 reply; 91+ messages in thread
From: tlaronde @ 2008-12-27 10:36 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sat, Dec 27, 2008 at 06:04:42AM +0000, Eris Discordia wrote:
> "it all begins with Adam and Steve," as Brian Stuart suggests, ways have
> been found of managing large teams of people with different specializations
> and those ways work. The Mgmt has a raison d'etre, despite what
> techno-people like to suggest.

Because when, say, Napoleon was commanding hundreds of thousands of
soldiers, he was not commanding hundreds of thousands of soldiers
individually. He gave orders to a handful, each of whom gave orders to
a handful, etc. But it was his idea that went from top to bottom.

French: "main tenir": holding ("tenir") in one hand ("main").

You can "maintenir" a huge piece of software if it is orthogonalized:
when you take one piece, the whole plate of spaghetti does not come
with it (you pull only on the articulation, the communication, the API
with the rest).

And for people, the military adds: hold in one hand, so the other is
free to slap when needed (and a foot is free to kick if the first
lesson was not received strongly enough).
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



* Re: [9fans] Changelogs & Patches?
  2008-12-27  8:00           ` Roman Shaposhnik
@ 2008-12-27 11:56             ` erik quanstrom
  2008-12-30  0:31               ` Roman Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2008-12-27 11:56 UTC (permalink / raw)
  To: 9fans

> I'm actually still trying to figure out how replica/* fits together with
> sources being a fossil server. These two, somehow, have to
> click, but I haven't figured out the connection just yet. Any pointers
> to the good docs?

there's no connection.  replica would work without a fossil server.
for that matter, replica would work without a dump.  all you need
is an original and a changed version.
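
(for the common case of tracking sources, the stock invocation is roughly
this, assuming the /dist/replica/network configuration shipped with the
distribution -- see replica(1):

	9fs sources
	replica/pull -v /dist/replica/network
)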

i used replica (plus a few additional tools) to make a faithful copy
of the coraid fileserver.  http://www.quanstro.net/plan9/history.pdf

- erik



* Re: [9fans] Changelogs & Patches?
  2008-12-27 10:36                           ` tlaronde
@ 2008-12-27 16:27                             ` Eris Discordia
  0 siblings, 0 replies; 91+ messages in thread
From: Eris Discordia @ 2008-12-27 16:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I'm baffled. Slap me, or kick me--your choice.

--On Saturday, December 27, 2008 11:36 AM +0100 tlaronde@polynum.com wrote:

> On Sat, Dec 27, 2008 at 06:04:42AM +0000, Eris Discordia wrote:
>> "it all begins with Adam and Steve," as Brian Stuart suggests, ways have
>> been found of managing large teams of people with different
>> specializations  and those ways work. The Mgmt has a raison d'etre,
>> despite what  techno-people like to suggest.
>
> Because when, say, Napoleon was commanding hundreds of thousands of
> soldiers, he was not commanding hundreds of thousands of soldiers
> individually. He gave orders to a handful, each of whom gave orders to
> a handful, etc. But it was his idea that went from top to bottom.
>
> French: "main tenir": holding ("tenir") in one hand ("main").
>
> You can "maintenir" a huge piece of software if it is orthogonalized:
> when you take one piece, the whole plate of spaghetti does not come
> with it (you pull only on the articulation, the communication, the API
> with the rest).
>
> And for people, the military adds: hold in one hand, so the other is
> free to slap when needed (and a foot is free to kick if the first
> lesson was not received strongly enough).
> --
> Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
>                  http://www.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
>







* Re: [9fans] Changelogs & Patches?
  2008-12-26 13:27       ` Charles Forsyth
  2008-12-26 13:33         ` Charles Forsyth
  2008-12-26 14:27         ` tlaronde
@ 2008-12-29 23:54         ` Roman Shaposhnik
  2008-12-30  0:13           ` hiro
                             ` (3 more replies)
  2 siblings, 4 replies; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-29 23:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 26, 2008, at 5:27 AM, Charles Forsyth wrote:
>> while a descriptive history is good, it takes a lot of extra work
>> to generate.
>
> i've rarely found per-change histories to be any more useful than
> most other comments, i'm afraid.

I believe it all depends on what it is that you look at source code
for. A long time ago I used to study mathematics. Soviet mathematical
schooling was really quite exceptional, but there was one thing that I
now wish had been different. You see, soviet math caught the Bourbaki
virus in its early childhood. And that meant that math texts and math
teaching were all about polished final results. None of that messy and
disgusting process of actually discovering those results. None. The
process itself was considered too imprecise and muddy:
    "Rigor consisted in getting rid of an accretion of superfluous
     details. Conversely, lack of rigor gave my father an impression of
     a proof where one was walking in mud, where one had to pick up
     some sort of filth in order to get ahead. Once that filth was
     taken away, one could get at the mathematical object, a sort of
     crystallized body whose essence is its structure."
                                      From: http://ega-math.ru/Cartier.htm
And thus the circle of those who "just got it" was formed.

Back when I was a student, I wanted to belong to that circle so badly
that I missed a fundamental point: the very creation of the circle
turned all of us from active participants in the process into art
gallery goers. And that was a fine change for those who just wanted to
appreciate fine math, but it was a kiss of death for less gifted
individuals who wanted to do math themselves (I won't touch the subject
of whether less gifted individuals are supposed to do math in the first
place, since it's too personal and painful).

Ok, with math it is a bit difficult to have the records of the process
AND the final object at the same time (well, good teachers understood
that, and their lectures were the ones worth attending). But in
software engineering we DO have a chance to have our cake and eat it
too -- albeit only if we put as much focus on maintaining history (our
records of the process) as we put on maintaining the code itself (the
final results).

> the advantage of dump and snap is that the scope is the whole system:
> including emails, discussion documents, the code, supporting tools --
> everything in digital form.  if software works differently today
> compared to yesterday, then in most cases, i'd expect 9fs dump to
> make it easy to track down the set of differences and narrow the
> search to the culprit.  it might not even be a source change, but a
> configuration file, or a file that was moved or removed.

I don't deny that 9fs dump is quite useful, and it seems to match the
organization of the Plan9 developer club pretty well. Personally,
though, I'd say that the usefulness of the dump would be greatly
improved if one had the ability to take ad-hoc archival snapshots AND
to assign tags, not only dates, to them.
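
(For what it's worth, the archival-snapshot half already exists on a
fossil system; a sketch, assuming the console is posted at the usual
/srv/fscons:

	con -l /srv/fscons	# attach to the fossil console
	fsys main snap -a	# take an archival snapshot now

It's the tag half that is missing.)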

That would, in effect, bring the whole process quite close to what
established SCMs do, with only one major feature (the ability to easily
trade history between different hosts) still missing.

Thanks,
Roman.




* Re: [9fans] Changelogs & Patches?
  2008-12-29 23:54         ` Roman Shaposhnik
@ 2008-12-30  0:13           ` hiro
  2008-12-30  1:07           ` erik quanstrom
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: hiro @ 2008-12-30  0:13 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

So is it time for a new file server then? :D



* Re: [9fans] Changelogs & Patches?
  2008-12-27 11:56             ` erik quanstrom
@ 2008-12-30  0:31               ` Roman Shaposhnik
  2008-12-30  0:57                 ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2008-12-30  0:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Dec 27, 2008, at 3:56 AM, erik quanstrom wrote:
>> I'm actually still trying to figure out how replica/* fits together
>> with sources being a fossil server. These two, somehow, have to click,
>> but I haven't figured out the connection just yet. Any pointers to
>> good docs?
>
> there's no connection.  replica would work without a fossil server.
> for that matter, replica would work without a dump.  all you need
> is an original and a changed version.

Got it. The bit that I didn't quite get initially was the fact that
there's history accumulated in the dumps, and that history might need
to be transferred *exactly* as it is to a different fileserver. With
replica transferring only the end result (the "present moment", in
history terms) there seemed to be a missing link...

> i used replica (plus a few additional tools) to make a faithful copy
> of the coraid fileserver.  http://www.quanstro.net/plan9/history.pdf


...but your article answered that last question completely -- though
I wonder whether a direct transfer of history between two venti
servers would be possible.

Thanks,
Roman.

P.S. I also didn't quite understand the business of synchronizing Qids.
I have always thought that they are only meaningful for the duration
of the server's lifetime, and thus that all applications are quite
immune to potential Qid changes as long as the connection gets dropped
and re-established. Or was it that your goal was to migrate so
seamlessly that *running* applications wouldn't notice?



* Re: [9fans] Changelogs & Patches?
  2008-12-30  0:31               ` Roman Shaposhnik
@ 2008-12-30  0:57                 ` erik quanstrom
  2009-01-05  5:19                   ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2008-12-30  0:57 UTC (permalink / raw)
  To: 9fans

> ...but your article answered that last question completely -- though
> I wonder whether a direct transfer of history between two venti
> servers would be possible.

if one were to transfer history between two fs's with the same on-disk
format, a simple device copy would be sufficient.  i was moving from
a 32-bit 4k-block fs to geoff's 64-bit work with 8k blocks.

history is not a property of venti.  venti is a sparse virtual drive
with ~2^80 bits of storage.  blocks are addressed by the sha1 hash of
their content.  fossil is the fileserver.  the analogy would be a change
in fossil format.  my technique would work for fossil, too.
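
(that said, the newer venti tools can copy a rooted tree between two
servers given its score; a sketch, with illustrative host names and
$score holding the root score:

	venti/copy oldhost newhost $score
)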

> P.S. I also didn't quite understand the business of synchronizing Qids.
> I have always thought that they are only meaningful for the duration
> of the server's lifetime, and thus that all applications are quite immune
> to potential Qid changes as long as the connection gets dropped and
> re-established. Or was it that your goal was to migrate so seamlessly
> that *running* applications wouldn't notice?

that's okay.  russ thinks i'm nuts on this point, too.

perhaps the paper wasn't fully clear.  i wanted to make the assertion
that if, on the original fs, qid(patha) == qid(pathb), then on the new
fs, qid(patha') == qid(pathb').  the qids weren't the same.  for
various reasons (i.e.  not every copy of every file makes it to a
dump), they can't be.  it's just a very complicated way of saying i
didn't want to recopy the same data needlessly and increase the size
of the fs.  i just couldn't think of an easy way of making the same
assertion without reading every file for each day of
the dump.  remember, the original fs was a pentium ii with a
100mbit ethernet card.  it still took 2 weeks to copy the data
to the new fs.

and russ is right in that it was overkill.  but, hey, if it's worth doing,
it's worth doing in grand excess.

oh, by the way, the replica dbs are reusable.  they could also, if
one wished, be generated by the fs as part of the dump process.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-29 23:54         ` Roman Shaposhnik
  2008-12-30  0:13           ` hiro
@ 2008-12-30  1:07           ` erik quanstrom
  2008-12-30  1:48           ` Charles Forsyth
  2009-01-03 22:03           ` sqweek
  3 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2008-12-30  1:07 UTC (permalink / raw)
  To: 9fans

> I don't deny that 9fs dump is quite useful, and it seems to match the
> organization of the Plan9 developer club pretty well. Personally,
> though, I'd say that the usefulness of the dump would be greatly
> improved if one had the ability to take ad-hoc archival snapshots AND
> to assign tags, not only dates, to them.

i can't recommend reading ken's kernel (the fs) enough.
it's recognizable as related to plan 9, but it is much simpler.
it can afford to be static.

it would be nifty if an early version of the fs
(with typedef long Device) could be put up on sources
for historical interest.

- erik




* Re: [9fans] Changelogs & Patches?
  2008-12-29 23:54         ` Roman Shaposhnik
  2008-12-30  0:13           ` hiro
  2008-12-30  1:07           ` erik quanstrom
@ 2008-12-30  1:48           ` Charles Forsyth
  2008-12-30 13:18             ` Uriel
  2009-01-03 22:03           ` sqweek
  3 siblings, 1 reply; 91+ messages in thread
From: Charles Forsyth @ 2008-12-30  1:48 UTC (permalink / raw)
  To: 9fans

>> i've rarely found per-change histories to be any more useful than
>> most other comments, i'm afraid.

>And that meant that math texts and math teaching were all about polished
>final results.

ah. my statement was ambiguous.
i meant the per-change chatter in the history, not the changes in the history.
it's fine to have the chatter, but it isn't essential, because nothing
relies on it: the chatter never causes the system to change its
behaviour.



* Re: [9fans] Changelogs & Patches?
  2008-12-30  1:48           ` Charles Forsyth
@ 2008-12-30 13:18             ` Uriel
  2008-12-30 15:06               ` C H Forsyth
  0 siblings, 1 reply; 91+ messages in thread
From: Uriel @ 2008-12-30 13:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Knowing *who* made the change is often even more useful than the change comment.

uriel

On Tue, Dec 30, 2008 at 2:48 AM, Charles Forsyth <forsyth@terzarima.net> wrote:
>>> i've rarely found per-change histories to be any more useful than
>>> most other comments, i'm afraid.
>
>>And that meant that math texts and math teaching were all about polished
>>final results.
>
> ah. my statement was ambiguous.
> i meant per-change chatter in the history, not the changes in the history.
> it's fine to have the chatter, but it isn't essential, because nothing
> relies on it: the chatter never causes the system to change its
> behaviour.
>
>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2008-12-30 13:18             ` Uriel
@ 2008-12-30 15:06               ` C H Forsyth
  2008-12-30 17:31                 ` Uriel
  0 siblings, 1 reply; 91+ messages in thread
From: C H Forsyth @ 2008-12-30 15:06 UTC (permalink / raw)
  To: 9fans

>Knowing *who* made the change is often even more useful than the change comment.

yes. i use ls -lm on our trees, but that might not work on less direct things like sources.
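
for example (file and last-modifier invented):

    % ls -lm /sys/src/cmd/ls.c
    [bob] --rw-rw-r-- M 8 sys sys 7114 Jan 16 11:02 /sys/src/cmd/ls.c

the name in brackets is the user who last modified the file.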



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2008-12-30 15:06               ` C H Forsyth
@ 2008-12-30 17:31                 ` Uriel
  2008-12-31  1:58                   ` Noah Evans
  0 siblings, 1 reply; 91+ messages in thread
From: Uriel @ 2008-12-30 17:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth <forsyth@vitanuova.com> wrote:
>>Knowing *who* made the change is often even more useful than the change comment.
>
> yes. i use ls -lm on our trees, but that might not work on less direct things like sources.

It would work if the development trees were public...

uriel



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2008-12-30 17:31                 ` Uriel
@ 2008-12-31  1:58                   ` Noah Evans
  0 siblings, 0 replies; 91+ messages in thread
From: Noah Evans @ 2008-12-31  1:58 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

http://code.google.com/hosting/createProject


On Tue, Dec 30, 2008 at 12:31 PM, Uriel <uriel99@gmail.com> wrote:
> On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth <forsyth@vitanuova.com> wrote:
>>>Knowing *who* made the change is often even more useful than the change comment.
>>
>> yes. i use ls -lm on our trees, but that might not work on less direct things like sources.
>
> It would work if the development trees were public...
>
> uriel
>
>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2008-12-29 23:54         ` Roman Shaposhnik
                             ` (2 preceding siblings ...)
  2008-12-30  1:48           ` Charles Forsyth
@ 2009-01-03 22:03           ` sqweek
  2009-01-05  5:05             ` Roman V. Shaposhnik
  3 siblings, 1 reply; 91+ messages in thread
From: sqweek @ 2009-01-03 22:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> Personally, though, I'd say that the usefulness of the dump would be
> greatly improved if one had the ability to do ad-hoc archival snapshots
> AND assign tags, not only dates, to them.

 Tags don't make that much sense in this context since the dump is for
the whole filesystem, not a specific project. However, tagging a
source tree can be done with a simple dircp. It's not as though the
duplicate data costs you anything when you're backed by venti.
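
e.g. (names invented):

    mkdir /sys/src-mytag
    dircp /sys/src /sys/src-mytag

after the next archival snapshot venti stores the duplicated blocks
only once.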
-sqweek



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-03 22:03           ` sqweek
@ 2009-01-05  5:05             ` Roman V. Shaposhnik
  2009-01-05  5:12               ` erik quanstrom
  2009-01-05  5:24               ` andrey mirtchovski
  0 siblings, 2 replies; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-05  5:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Sun, 2009-01-04 at 07:03 +0900, sqweek wrote:
> On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> > Personally, though, I'd say that the usefulness of the dump would be
> > greatly improved if one had the ability to do ad-hoc archival snapshots
> > AND assign tags, not only dates, to them.
>
>  Tags don't make that much sense in this context since the dump is for
> the whole filesystem, not a specific project.

Well, as Charles pointed out -- in the case of Plan9 development the
whole system is the entire project.

> However, tagging a source tree can be done with a simple dircp.
> It's not as though the duplicate data costs you anything when you're
> backed by venti.

Hm. Good point. Although timing-wise, I'd expect dircp to be dreadfully
slow.

Well, I guess I really got spoiled by ZFS's ability to do things like
    $ zfs snapshot pool/projects/foo@YourTextGoesHere
and especially:
    $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/branch

I'm still trying to figure out what kind of approximation of the above
would be possible with fossil/venti.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-05  5:05             ` Roman V. Shaposhnik
@ 2009-01-05  5:12               ` erik quanstrom
  2009-01-06  5:06                 ` Roman Shaposhnik
  2009-01-05  5:24               ` andrey mirtchovski
  1 sibling, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-05  5:12 UTC (permalink / raw)
  To: 9fans

> Well, I guess I really got spoiled by ZFS's ability to do things like
>     $ zfs snapshot pool/projects/foo@YourTextGoesHere
> and especially:
>     $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/branch
>
> I'm still trying to figure out what kind of approximation of the above
> would be possible with fossil/venti.

how about making a copy?  venti will coalesce duplicate blocks.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2008-12-30  0:57                 ` erik quanstrom
@ 2009-01-05  5:19                   ` Roman V. Shaposhnik
  2009-01-05  5:28                     ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-05  5:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 2008-12-29 at 19:57 -0500, erik quanstrom wrote:
> history is not a property of venti.  venti is a sparse virtual drive
> with ~2^80 bits storage.  blocks are addressed by sha1 hash of
> their content. fossil is the fileserver.  the analogy would be a change
> in fossil format.  my technique would work for fossil, too.

I see now. It's interesting to note how it seems to map the division
of labor in ZFS, where we have ZFS pools as virtual drives. Now
that I've mentioned it, I think it'll do me good to try and evaluate
fossil/venti through the lens of ZFS. I guess it's not a coincidence
that ZFS actually has built-in support for the kind of history
transfer you were implementing.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-05  5:05             ` Roman V. Shaposhnik
  2009-01-05  5:12               ` erik quanstrom
@ 2009-01-05  5:24               ` andrey mirtchovski
  2009-01-06  5:49                 ` Roman Shaposhnik
  1 sibling, 1 reply; 91+ messages in thread
From: andrey mirtchovski @ 2009-01-05  5:24 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Well, I guess I really got spoiled by ZFS's ability to do things like
>    $ zfs snapshot pool/projects/foo@YourTextGoesHere

at the console type "snap". if you're allowing snaps to be mounted on
the local fs then the equivalent would be "mkdir /YourTextGoesHere;
bind /n/dump/... /YourTextGoesHere". note that zfs restricts where
the snapshot can be mounted :p venti snapshots are, by default, read
only.

>    $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/branch

that's as simple as starting a new fossil with -f 'somehex', where
"somehex" is the score of the corresponding snap.

this gives you both read-only snapshots, and as many clones as you wish.

note that you're cheating here, and by quite a bit:

- snapshots are read only and generally unmountable (unless you go
through the effort of making them mountable by setting a special option,
which i'm not sure is per-snapshot)

- clones can only be created off of snapshots

- clones are read-writable but they can only be mounted within the
/pool/fs/branch hierarchy. if you want to share them you need to
explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;

- none of this can be done remotely

- libzfs has an unpublished interface, so if one wants to, say, write
a 9p server to expose zfs functionality to remote hosts they must
either reverse engineer libzfs or use other means.

so, while i'm sure you enjoy zfs quite a bit, for others used to
plan9's venti/fossil way of doing things zfs can be quite a pain.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-05  5:19                   ` Roman V. Shaposhnik
@ 2009-01-05  5:28                     ` erik quanstrom
  0 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-05  5:28 UTC (permalink / raw)
  To: 9fans

> fossil/venti through the lens of ZFS. I guess it's not a coincidence
> that ZFS actually has built-in support for the kind of history
> transfer you were implementing.

the transfer would have been trivial, had the filesystems been
compatible.  what i did was reenact the actions that built the
original fs on the new fs by manipulating the clock on the target.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-05  5:12               ` erik quanstrom
@ 2009-01-06  5:06                 ` Roman Shaposhnik
  2009-01-06 13:55                   ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-06  5:06 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Jan 4, 2009, at 9:12 PM, erik quanstrom wrote:
>> Well, I guess I really got spoiled by ZFS's ability to do things like
>>    $ zfs snapshot pool/projects/foo@YourTextGoesHere
>> and especially:
>>    $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/
>> branch
>>
>> I'm still trying to figure out what kind of approximation of the
>> above
>> would be possible with fossil/venti.
>
> how about making a copy?  venti will coalesce duplicate blocks.

But wouldn't you still need to send these blocks over the wire (thus
consuming bandwidth and time)?

Thanks,
Roman.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-05  5:24               ` andrey mirtchovski
@ 2009-01-06  5:49                 ` Roman Shaposhnik
  2009-01-06 14:22                   ` andrey mirtchovski
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-06  5:49 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Cool! Looks like I found a "bi-lingual" person! ;-) Andrey,
would you mind if I ask you to translate some other things
between ZFS and venti/fossil for me?

On Jan 4, 2009, at 9:24 PM, andrey mirtchovski wrote:
>> Well, I guess I really got spoiled by ZFS's ability to do things like
>>   $ zfs snapshot pool/projects/foo@YourTextGoesHere
>
> at the console type "snap". if you're allowing snaps to be mounted on
> the local fs then the equivalent would be "mkdir /YourTextGoesHere;
> bind /n/dump/... /YourTextGoesHere".

Fair enough. But YourTextGoesHere then becomes a transient property
of my namespace, whereas in the case of ZFS it is truly a tag for a snapshot.

> note that zfs restricts where
> the snapshot can be mounted :p venti snapshots are, by default, read
> only.

Well, strictly speaking Solaris does have a reasonable approximation
of bind in the form of lofs -- so remapping the default ZFS mount point to
something else is not a big deal.

>>   $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/branch
>
> that's as simple as starting a new fossil with -f 'somehex', where
> "somehex" is the score of the corresponding snap.
>
> this gives you both read-only snapshots,

Meaning?

> and as many clones as you wish.

Cool!

> note that you're cheating here, and by quite a bit:

Let's see about that ;-)

> - snapshots are read only and generally unmountable (unless you go
> through the effort of making them mountable by setting a special option,
> which i'm not sure is per-snapshot)

Huh? That's weird -- I routinely access them via
      /<pool>/<fs>/.zfs/snapshot/<snapshot name>
and I don't remember setting any kind of options. The visibility
of .zfs can be tweaked, but all it really affects is Tab in bash ;-)

> - clones can only be created off of snapshots

But that does sound reasonable. What else is there except snapshots
and an active tree? Or are you objecting to the extra step that is
needed when you really want to clone the active tree?

> - clones are read-writable but they can only be mounted within the
> /pool/fs/branch hierarchy. if you want to share them you need to
> explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;

In general -- this is true :-( But I think there's a way now to do that.
If you're really interested -- I can take a look and let you know.

> - none of this can be done remotely

Meaning?

> - libzfs has an unpublished interface, so if one wants to, say, write
> a 9p server to expose zfs functionality to remote hosts they must
> either reverse engineer libzfs or use other means.


This one is a bit unfair. The interface is published alright. As much
as anything in Open Source is. It is also documented at the level
that would be considered reasonable for Linux. The fact that
it is not *stable* is what makes the usually thorough Solaris
documentation lacking.

But all in all, following along doesn't require much more extra
effort compared to following along any other evolving OS
project.

And yes, the situation has changed compared to what it used to
be when Solaris 10 just came out. If you had a bad experience
with libzfs some time ago -- I'm sorry, but if you try again you
might find it more to your liking.

Thanks,
Roman.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06  5:06                 ` Roman Shaposhnik
@ 2009-01-06 13:55                   ` erik quanstrom
  0 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-06 13:55 UTC (permalink / raw)
  To: 9fans

> >> I'm still trying to figure out what kind of approximation of the
> >> above
> >> would be possible with fossil/venti.
> >
> > how about making a copy?  venti will coalesce duplicate blocks.
>
> But wouldn't you still need to send these blocks over the wire (thus
> consuming bandwidth and time)?

key word "approximation".  ☺

assuming that not all of your tree is in cache,
moving the blocks over the wire would be much
faster than the disk access.  assuming just gbe,
you should be able to copy 50mb/s out of and
back into the same venti server.  how big are
your snapshots that this would be a problem?
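
(for scale: gbe is about 125mb/s raw, so 50mb/s leaves generous room
for protocol overhead and the disks.)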

i don't know enough about fossil's structure, but i think
you could write a specialized tool for this.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06  5:49                 ` Roman Shaposhnik
@ 2009-01-06 14:22                   ` andrey mirtchovski
  2009-01-06 16:19                     ` erik quanstrom
  2009-01-20  6:48                     ` Roman Shaposhnik
  0 siblings, 2 replies; 91+ messages in thread
From: andrey mirtchovski @ 2009-01-06 14:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

i'm using zfs right now for a project storing a few terabytes worth of
data and vm images. i have two zfs servers and about 10 pools of
different sizes with several hundred different zfs filesystems and
volumes of raw disk exported via iscsi. clones play a vital part in
the whole set up (they number in the thousands). for what it's worth,
zfs is the best thing in linux-world (sorry, solaris and *bsd too) for
that kind of task. my comment is that, coming from fossil/venti, zfs
feels just a bit more convoluted, with more special cases that
seem like design mishaps, at least compared to what i'm used to
in the world i'm coming from.

i'll try to explain:

> Fair enough. But YourTextGoesHere then becomes a transient property
> of my namespace, whereas in the case of ZFS it is truly a tag for a snapshot.

all snapshots have tags: their top-level sha1 score. what i supplied
was simply a way to translate that to any random text. you don't have
to do this, though. (by the way, do you get the irony of
forcing snapshots to contain the '@' character in their name? sounds a
lot like '#' to me ;)

snapshots are generally accessible via fossil as a directory with the
date of the snapshot as its name. this starts making more sense when
you take into consideration that snapshots are global per fossil, but
then you can run several fossils without having them step on each
other's toes when it comes to venti. at least until you get a collision
in blocks' hashes.

in fact, i'm so used to fossil's dated snapshots that in my setup i
have restricted 'YourTextGoesHere' to actually be a date. that gives
me so much more context in the case where something goes wrong and i
have to go back through the snapshots for a filesystem or a volume to
find the last known good one.

> Well, strictly speaking Solaris does have a reasonable approximation
> of bind in the form of lofs -- so remapping the default ZFS mount point to
> something else is not a big deal.

did not know that

>
>>>  $ zfs clone pool/projects/foo@YourTextGoesHere pool/projects/branch
>>
>> that's as simple as starting a new fossil with -f 'somehex', where
>> "somehex" is the score of the corresponding snap.
>>
>> this gives you both read-only snapshots,
>
> Meaning?

venti is write-once. if you instantiate a fossil from a venti score it
is, by definition, read-only, as all changes to the current fossil
will not appear to another fossil instantiated from the same venti
score. changes are committed to venti once you do a fossil snap,
however that automatically generates a new snapshot score (not
modifying the old one). it should be clear from the paper.
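
e.g., at the fossil console (the fsys name is whatever you configured;
'main' is just the conventional one):

    fsys main snap        (ephemeral snapshot, kept on the fossil disk)
    fsys main snap -a     (archival snapshot, written through to venti)

as i understand it, the score of the archival snapshot is what you'd
later hand to flfmt -v.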

>> - snapshots are read only and generally unmountable (unless you go
>> through the effort of making them mountable by setting a special option,
>> which i'm not sure is per-snapshot)
>
> Huh? That's weird -- I routinely access them via
>     /<pool>/<fs>/.zfs/snapshot/<snapshot name>
> and I don't remember setting any kind of options. The visibility
> of .zfs can be tweaked, but all it really affects is Tab in bash ;-)
>
>> - clones can only be created off of snapshots
>
> But that does sound reasonable. What else is there except snapshots
> and an active tree? Or are you objecting to the extra step that is
> needed when you really want to clone the active tree?

i have .zfs exports turned off (it's off by default) because the
read-only snapshots are useless in my environment. instead i must
create clones off one or many snapshots and keep track of them and
delete them when their tasks have been accomplished.

this is an example of the design decision difference between
fossil/venti and zfs: venti commits storage permanently and everything
becomes a snapshot, while the designers of zfs decided to create a
two-stage process introducing a read-only intermediary between the
original data and a read-write access to it independent of other
clients.

where the second choice becomes a nuisance for me is in the case where
one has thousands of clones and needs to keep track of thousands of
names in order to ensure that when the right task has finished the
right clone disappears. it's good that zfs can handle so many,
otherwise it would've been useless.

note that other systems take the plan9 approach to heart: qemu for
example has the -snapshot argument which allows me to boot many VMs,
fossil-style, off a single vm image without worrying whether they'll
step on each other's toes. that way seems so much simpler and natural
to me, but then i'm jaded by venti :)

>> - clones are read-writable but they can only be mounted within the
>> /pool/fs/branch hierarchy. if you want to share them you need to
>> explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;
>
> In general -- this is true :-( But I think there's a way now to do that.
> If you're really interested -- I can take a look and let you know.

my problem is with the local/remote duality of exports: if i create a
zfs cloned filesystem it's immediately locally available and perhaps
(via 'sharenfs' inheritance from its parent) i can mount it via nfs
from a remote node. if i create a zfs cloned volume i need to arrange
an iscsi method of access from a remote node. both nfs and iscsi have
a host of nasty settings that need to be correct on both ends in order
for things to work right. i can never hope to export an nfs share
outside my DMZ.

i don't see a solution to this problem: the unix world is committed to
nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so
i listed it as a complaint.

>> - none of this can be done remotely
>
> Meaning?

from machine X in the datacentre i want to be able to say "please
create me a clone of the latest snapshot of this filesystem" without
having to ssh to the solaris node running zfs.

>> - libzfs has an unpublished interface, so if one wants to, say, write
>> a 9p server to expose zfs functionality to remote hosts they must
>> either reverse engineer libzfs or use other means.
>
>
> This one is a bit unfair. The interface is published alright. As much
> as anything in Open Source is. It is also documented at the level
> that would be considered reasonable for Linux. The fact that
> it is not *stable* is what makes the usually thorough Solaris
> documentation lacking.
>
> But all in all, following along doesn't require much more extra
> effort compared to following along any other evolving OS
> project.

i wanted to write a filesystem exporting zfs command functionality to
nodes within a datacentre (create/modify/delete/list
filesystems/volumes/snapshots). i tried reading the libzfs
documentation for linking to it and couldn't find such documentation.
i couldn't find the source for libzfs either, without having to
register with the opensolaris developers' site.

instead of reverse engineering a library that i have not much faith
in, i wrote a python 9p server that uses local zfs/zpool commands to
do what i could've done with C and libzfs. it's a hack but it gets the
job done. now i can access block X of zfs volume Y remotely via 9p (at
one third the speed, to be fair).
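
the guts of it are no more than this kind of thing (a toy sketch with
made-up names, not my actual server):

    import subprocess

    def zfs_list():
        # ask the local zfs command for dataset names; -H drops the
        # header so the output is script-friendly
        out = subprocess.check_output(['zfs', 'list', '-H', '-o', 'name'])
        return out.decode().splitlines()

    def zfs_clone(snapshot, clone):
        # create a writable clone from an existing snapshot
        subprocess.check_call(['zfs', 'clone', snapshot, clone])

the 9p server just maps file operations onto calls like these.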

and i think i'm using a pretty new version of zfs and my experiences
are, in fact, quite recent :)

i would be glad to help you understand the differences between zfs and
fossil/venti with my limited knowledge of both.

cheers: andrey

nb: please don't take this as a wholesale criticism of zfs. as stated
earlier, it is quite a sane system to work with. my gripes only appear
when one compares it to the "fossil/venti experience".



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06 14:22                   ` andrey mirtchovski
@ 2009-01-06 16:19                     ` erik quanstrom
  2009-01-06 23:23                       ` Roman V. Shaposhnik
  2009-01-20  6:48                     ` Roman Shaposhnik
  1 sibling, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-06 16:19 UTC (permalink / raw)
  To: 9fans

very interesting post.

> this is an example of the design decision difference between
> fossil/venti and zfs: venti commits storage permanently and everything
> becomes a snapshot, while the designers of zfs decided to create a
> two-stage process introducing a read-only intermediary between the
> original data and a read-write access to it independent of other
> clients.

a big difference between the decisions is in data integrity.
it's much easier to break a fs that rewrites than it is a worm-based
fs, even if the actual media are the same.  and a broken rewriting
fs is much harder to recover.  russ wrote up a bit on recovering one
good venti from an old copy and a damaged current venti.  this
same approach (basically fs | fs') works for any worm fs.

> from a remote node. if i create a zfs cloned volume i need to arrange
> an iscsi method of access from a remote node. both nfs and iscsi have
> a host of nasty settings that need to be correct on both ends in order
> for things to work right. i can never hope to export an nfs share
> outside my DMZ.
>
> i don't see a solution to this problem: the unix world is committed to
> nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so
> i listed it as a complaint.

oh, my perfect chance to shill aoe!  how to configure aoe on plan 9
	echo bind /net/ether0 >/dev/aoe/ctl
now for the hard part
	# (this space intentionally left blank.)

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06 16:19                     ` erik quanstrom
@ 2009-01-06 23:23                       ` Roman V. Shaposhnik
  2009-01-06 23:44                         ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-06 23:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-06 at 11:19 -0500, erik quanstrom wrote:
> very interesting post.

indeed. I actually need some time to digest it ;-)

> > this is an example of the design decision difference between
> > fossil/venti and zfs: venti commits storage permanently and everything
> > becomes a snapshot, while the designers of zfs decided to create a
> > two-stage process introducing a read-only intermediary between the
> > original data and a read-write access to it independent of other
> > clients.
>
> a big difference between the decisions is in data integrity.
> it's much easier to break a fs that rewrites than it is a
> worm-based fs.

True. But there's a grey area here: an FS that *never* rewrites
live blocks, but can reclaim dead ones. That's essentially
what ZFS does.

> > i don't see a solution to this problem: the unix world is committed to
> > nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so
> > i listed it as a complaint.
>
> oh, my perfect chance to shill aoe!  how to configure aoe on plan 9
> 	echo bind /net/ether0>/dev/aoe/ctl
> now for the hard part
> 	# (this space intentionally left blank.)

;-)

What's your personal experience on aoe vs. iscsi?

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06 23:23                       ` Roman V. Shaposhnik
@ 2009-01-06 23:44                         ` erik quanstrom
  2009-01-08  0:36                           ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-06 23:44 UTC (permalink / raw)
  To: 9fans

>> a big difference between the decisions is in data integrity.
>> it's much easier to break a fs that rewrites than it is a
>> worm-based fs.
>
> True. But there's a grey area here: an FS that *never* rewrites
> live blocks, but can reclaim dead ones. That's essentially
> what ZFS does.

unfortunately, i would think that can result in data loss since
i can no longer take a set of copies of the fs {fs_0, ... fs_n}
and create a new copy with all the data possibly recovered
by picking a set of "good" blocks from the fs_i, since i can make
a block dead by removing the file using it and i can make it
live again by writing a new file.

perhaps i've misinterpreted what you are saying?

> What's your personal experience on aoe vs. iscsi?

i have no iscsi experience.

aoe has been pretty fun to work with.  the spec can
be read in half an hour.  (it's maybe ten pages.)  i
implemented a virtual aoe target for plan 9, vblade,
from scratch on a friday evening.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06 23:44                         ` erik quanstrom
@ 2009-01-08  0:36                           ` Roman V. Shaposhnik
  2009-01-08  1:11                             ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-08  0:36 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-06 at 18:44 -0500, erik quanstrom wrote:
> >> a big difference between the decisions is in data integrity.
> >> it's much easier to break a fs that rewrites than it is a
> >> worm-based fs.
> >
> > True. But there's a grey area here: an FS that *never* rewrites
> > live blocks, but can reclaim dead ones. That's essentially
> > what ZFS does.
>
> unfortunately, i would think that can result in data loss since
> i can no longer take a set of copies of the fs {fs_0, ... fs_n}
> and create a new copy with all the data possibly recovered
> by picking a set of "good" blocks from the fs_i, since i can make
> a block dead by removing the file using it and i can make it
> live again by writing a new file.
>
> perhaps i've misinterpreted what you are saying?

Let's see. Maybe it's my misinterpretation of what venti does. But as
far as I understand, it boils down to this: I give venti a block of any
length, it gives me a score back. Now internally, venti might decide
to split that huge block into a series of smaller ones and store it
as a tree. But still, all I get back is a single score. I don't care
whether that score really describes my raw data block, or a block full
of scores that actually describe raw data. All I care about is that when
I give venti that score back it'll reconstruct the data. I also
have a guarantee that the data will never ever be deleted.

Now, because of that guarantee (blocks are never deleted) and since
all blocks bigger than 56k get split, venti has the nice property of
reusing blocks from existing trees. This happens as a by-product
of the design: I ask venti to store a block, and if that same block
was already there, there will be an extra arrow pointing at it.
All in all, a very compact way of representing a forest of trees.
Each tree corresponds to a VtEntry data structure, and blocks full
of VtEntry structures are called VtEntryDirs. Finally, a root
VtEntryDir is pointed at by a VtRoot structure.
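
A toy model of that property (just an illustration with made-up names,
not venti's actual code):

    import hashlib

    store = {}                      # score -> block

    def put(block):
        score = hashlib.sha1(block).hexdigest()
        store[score] = block        # same block, same score: stored once
        return score

    def get(score):
        block = store[score]
        # corruption can't go unnoticed: the score wouldn't match
        assert hashlib.sha1(block).hexdigest() == score
        return block

    a = put(b'8k of file data')
    b = put(b'8k of file data')     # the duplicate write costs nothing
    assert a == b and len(store) == 1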

Contrast this with ZFS, where blocks are *not* addressed via scores,
but rather with a vdev:offset pairs called DVAs. This, of course,
means that there's no block coalescing going on. You ask ZFS to store
a block it gives you a DVA back. You ask it to store the same block
again, you get a different DVA (well, actually it gives you a block
pointer which is DVA augmented by extra stuff).

That fundamental property of ZFS makes it impossible to have a
single block implicitly referenced by multiple trees, unless the
block happens to be part of an explicit snapshot of the same object
at some later point in time.

Thus, when there's a need to modify an existing object, ZFS never
touches the old blocks. It builds a tree of blocks, *explicitly*
reusing those blocks that haven't changed. When it is done building
the new tree, the old one is still the active one. The last transaction
that happens updates an uberblock (ZFS speak for VtRoot) in an atomic
fashion, thus making the new tree the active one. The old tree is still
around at that point: if it is not part of a snapshot, it can be
"garbage collected" and the blocks can be freed; if it is part of a
snapshot, it is preserved. In the latter case the behavior seems
to be exactly what venti does.

But even in the former case I don't see how the corruption could be
possible. Please elaborate.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-08  0:36                           ` Roman V. Shaposhnik
@ 2009-01-08  1:11                             ` erik quanstrom
  2009-01-20  6:20                               ` Roman Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-08  1:11 UTC (permalink / raw)
  To: 9fans

> Let's see. Maybe it's my misinterpretation of what venti does. But as
> far as I understand, it boils down to this: I give venti a block of any
> length, it gives me a score back. Now internally, venti might decide

just a clarification.  this is done by the client.  from venti(6):
       Files and Directories
          Venti accepts blocks up to 56 kilobytes in size. By conven-
          tion, Venti clients use hash trees of blocks to represent
          arbitrary-size data files. [...]


> But even in the former case I don't see how the corruption could be
> possible. Please elaborate.

i didn't say there would be corruption.  i assumed corruption
and outlined how one could recover the maximal set of data
and have a consistent fs (assuming the damage doesn't cut a
full strip across all backups) by simply picking a good
block at each lba from the available damaged and/or incomplete
backups, which may originate at different times.  (russ was the
first that i know of to put this into practice.)

in the case of zfs, my claim is that since zfs can reuse blocks, two
vdev backups, each with corruption or missing data in different places
are pretty well useless.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-08  1:11                             ` erik quanstrom
@ 2009-01-20  6:20                               ` Roman Shaposhnik
  2009-01-20 14:19                                 ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-20  6:20 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I think I'm now ready to pick up this old thread (if anybody's still
interested...)

On Jan 7, 2009, at 5:11 PM, erik quanstrom wrote:
>> Let's see. Maybe it's my misinterpretation of what venti does. But as
>> far as I understand, it boils down to this: I give venti a block of any
>> length, it gives me a score back. Now internally, venti might decide
>
> just a clarification.  this is done by the client.  from venti(6):
>       Files and Directories
>          Venti accepts blocks up to 56 kilobytes in size. By conven-
>          tion, Venti clients use hash trees of blocks to represent
>          arbitrary-size data files. [...]

Right. This, by the way, suggests that the onus is on the clients
to help venti reuse as many blocks as possible. Have there been
any established practices for finding the best "cut-here" points?

>> But even in the former case I don't see how the corruption could be
>> possible. Please elaborate.
>
> i didn't say there would be corruption.  i assumed corruption
> and outlined how one could recover the maximal set of data
> and have a consistent fs (assuming the damage doesn't cut a
> full strip across all backups) by simply picking a good
> block at each lba from the available damaged and/or incomplete
> backups, which may originate at different times.  (russ was the
> first that i know of to put this into practice.)
>
> in the case of zfs, my claim is that since zfs can reuse blocks, two
> vdev backups, each with corruption or missing data in different places
> are pretty well useless.


Got it. However, I'm still not fully convinced there's a definite edge
one way or the other. Don't get me wrong: I'm not trying to defend
ZFS (I don't think it needs defending, anyway) but rather I'm trying
to test my mental model of how both work.

We assume a damaged set of arenas for venti and a damaged set
of vdevs for ZFS. Everything is off-line at that point and we are
running strictly in forensics mode. The show, basically, consists
of three acts:
     1. salvaging as many good data blocks as possible
     2. building higher-order structures out of primary data blocks
     3. trying to rebuild as much of a consistent FS as possible
          using all the available blocks

It seems to me that #1 and #2 are 100% the same in terms of
the probability of success. In fact, one might claim that ZFS has
a slight edge because of:
      a. "volume management" being part of the FS
      b. the "ditto blocks" IOW every block pointer having up to
          3 alternative locations for the block it points to
The net result is that you might end up with more good blocks
to choose from in ZFS world, than in venti's case. Which brings
us to #3.

Once again, we might have more blocks to choose from than
we want (including "free" blocks) but the generation number
should be enough of a clue to filter unwanted things out.

Thanks,
Roman.

P.S. Oh, and in the case of ZFS a damaged vdev will be detected (and
possibly re-silvered) under normal working conditions, while
fossil might not even notice a corruption.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-06 14:22                   ` andrey mirtchovski
  2009-01-06 16:19                     ` erik quanstrom
@ 2009-01-20  6:48                     ` Roman Shaposhnik
  2009-01-20 14:13                       ` erik quanstrom
  2009-01-20 23:52                       ` andrey mirtchovski
  1 sibling, 2 replies; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-20  6:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hi Andrey!

Sorry, it took me longer to dig through the code than
I hoped. So, if you're still game...

On Jan 6, 2009, at 6:22 AM, andrey mirtchovski wrote:
> i'm using zfs right now for a project storing a few terabytes worth of
> data and vm images.

Is that how it was from the get-go, or did you use venti-based solutions
before?

> i have two zfs servers and about 10 pools of
> different sizes with several hundred different zfs filesystems and
> volumes of raw disk exported via iscsi.

What kind of clients are on the other side of iscsi?

> clones play a vital part in the whole set up (they number in the
> thousands).
> for what it's worth, zfs is the best thing in linux-world (sorry,
> solaris and *bsd too)

You're using it on Linux?

>> Fair enough. But YourTextGoesHere then becomes a transient property
>> of my namespace, whereas in the case of ZFS it is truly a tag for a
>> snapshot.
>
> all snapshots have tags: their top-level sha1 score. what i supplied
> was simply a way to translate that to any random text. you don't have
> to do this, though. (by the way, do you get the irony of
> forcing snapshots to contain the '@' character in their name? sounds a
> lot like '#' to me ;)

Ok. Fair enough. I think I'm convinced on that point.

> snapshots are generally accessible via fossil as a directory with the
> date of the snapshot as its name. this starts making more sense when
> you take into consideration that snapshots are global per fossil, but
> then you can run several fossils without having them step on each
> other's toes when it comes to venti. at least until you get a collision
> in blocks' hashes.

Aha! And here are my first questions: you say that I can run multiple
fossils off of the same venti and thus have a setup that is very close
to zfs clones:
    1. how do you do that exactly? fossil -f doesn't work for me
       (nor should it, according to the docs)
    2. how do you work around the fact that each fossil needs its own
       partition (unlike ZFS where all the clones can share the same
       pool of blocks)?

> venti is write-once. if you instantiate a fossil from a venti score it
> is, by definition, read-only, as all changes to the current fossil
> will not appear to another fossil instantiated from the same venti
> score. changes are committed to venti once you do a fossil snap,
> however that automatically generates a new snapshot score (not
> modifying the old one). it should be clear from the paper.

I think I understand it now (except for the fossil -f part), but how do
you promote (zfs promote) such a clone?

> where the second choice becomes a nuisance for me is in the case where
> one has thousands of clones and needs to keep track of thousands of
> names in order to ensure that when the right task has finished the
> right clone disappears.

I see what you mean, but in the case of venti -- nothing disappears, really.
From that perspective you can sort of make those zfs clones linger.
The storage consumption won't be any different, right?

>>> - none of this can be done remotely
>>
>> Meaning?
>
> from machine X in the datacentre i want to be able to say "please
> create me a clone of the latest snapshot of this filesystem" without
> having to ssh to the solaris node running zfs.

Well, if it's the protocol you don't like -- writing your own daemon
that will respond to such requests sounds like a trivial task
to me.

> i couldn't find the source for libzfs either, without having to
> register with the opensolaris developers' site.
[...]

> and i think i'm using a pretty new version of zfs and my experiences
> are, in fact, quite recent :)

well, the fact that you had to register in order to access the code
suggests a pretty dated experience ;-)
     http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/

> instead of reverse engineering a library that i have not much faith
> in, i wrote a python 9p server that uses local zfs/zpool commands to
> do what i could've done with C and libzfs. it's a hack but it gets the
> job done. now i can access block X of zfs volume Y remotely via 9p (at
> one third the speed, to be fair).

Well, Solaris desperately wanted to enter the Open Source geekdom,
and from your experience it seems like it was a success ;-) Seriously
though, I personally found reading the source code of zdb to be
absolutely illuminating about all things ZFS:
     http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/zdb/zdb.c

But yes -- just like with any unruly OS project, you have to really
invest your time if you want to tag along. I think it was Russ who
made the comment that Free Software is only free if your time has
no value :-(

> i would be glad to help you understand the differences between zfs and
> fossil/venti with my limited knowledge of both.

Great! I tried to do as much homework as possible (hence the delay) but
I still have some questions left:

    0. A dumb one: what's the proper way of cleanly shutting down
    fossil and venti?

    1. What's the use of copying arenas to CD/DVD? Is it purely backup,
    since they have to stay on-line forever?

    2. Would fossil/venti notice silent data corruptions in blocks?

    3. Do you think it's a good idea to have volume management be
    part of the filesystem, since that way you can try to heal the data
    on-the-fly?

    4. If I have a venti server and a bunch of sha1 codes, can I somehow
    instantiate a single fossil serving all of them under /archive?


Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20  6:48                     ` Roman Shaposhnik
@ 2009-01-20 14:13                       ` erik quanstrom
  2009-01-20 16:19                         ` Steve Simon
  2009-01-20 23:52                       ` andrey mirtchovski
  1 sibling, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-20 14:13 UTC (permalink / raw)
  To: 9fans

>     1. What's the use of copying arenas to CD/DVD? Is it purely backup,
>      since they have to stay on-line forever?

backup.

>     2. Would fossil/venti notice silent data corruptions in blocks?

venti would.  the score wouldn't match the block.

>     3. Do you think it's a good idea to have volume management be
>     part of the filesystem, since that way you can try to heal the data
>     on-the-fly?

i think they are separate questions.  i see a couple of strong
disadvantages to combining volume management with the fs:
- it's hard to reason about; zfs redundancy strategies
seem idiosyncratic.
- you need a different volume management solution for
non-zfs needs.
- to manage the storage you need to be a zfs expert.
conversely, to manage zfs you need to be a storage
expert.
- raid5 is very slow if you move the raid computation away
from the data, as you need to move the data to the
computation.

>     4. If I have a venti server and a bunch of sha1 codes, can I somehow
>     instantiate a single fossil serving all of them under /archive?

i don't understand the question.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20  6:20                               ` Roman Shaposhnik
@ 2009-01-20 14:19                                 ` erik quanstrom
  2009-01-20 22:30                                   ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-20 14:19 UTC (permalink / raw)
  To: 9fans

> > in the case of zfs, my claim is that since zfs can reuse blocks, two
> > vdev backups, each with corruption or missing data in different places
> > are pretty well useless.
>
>
> Got it. However, I'm still not fully convinced there's a definite edge
> one way or the other. Don't get me wrong: I'm not trying to defend
> ZFS (I don't think it needs defending, anyway) but rather I'm trying
> to test my mental model of how both work.

if you end up rewriting a free block in zfs, there sure is.  you
can't decide which one is correct.

> P.S. Oh, and in case of ZFS a damaged vdev will be detected (and
> possibly re-silvered) under normal working conditions, while
> fossil might not even notice a corruption.

not true.  one of many score checks:

srv/lump.c:103: 				seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 14:13                       ` erik quanstrom
@ 2009-01-20 16:19                         ` Steve Simon
  0 siblings, 0 replies; 91+ messages in thread
From: Steve Simon @ 2009-01-20 16:19 UTC (permalink / raw)
  To: 9fans

>     4. If I have a venti server and a bunch of sha1 codes, can I somehow
>     instantiate a single fossil serving all of them under /archive?

Not at present, there is no way to insert a vac score into a fossil hierarchy
other than at the root of the hierarchy (flfmt -v).

what you can do is copy your archives into a fossil and then snap -a them into the archive
hierarchy - I did this for some small old backups (100s of MB) but it is slow as the data
must be copied, the directory entries come out with a modern date, and your fossil must
be big enough to hold the whole backup.

details on the tip o' the day page on the wiki.
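
roughly (score and paths invented):

    vacfs vac:0123456789abcdef0123456789abcdef01234567   # mounts on /n/vac
    mkdir /oldbackup
    dircp /n/vac /oldbackup

then, at the fossil console:

    fsys main snap -a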

-Steve



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 14:19                                 ` erik quanstrom
@ 2009-01-20 22:30                                   ` Roman V. Shaposhnik
  2009-01-20 23:36                                     ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-20 22:30 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-20 at 09:19 -0500, erik quanstrom wrote:
> > > in the case of zfs, my claim is that since zfs can reuse blocks, two
> > > vdev backups, each with corruption or missing data in different places
> > > are pretty well useless.
> >
> >
> > Got it. However, I'm still not fully convinced there's a definite edge
> > one way or the other. Don't get me wrong: I'm not trying to defend
> > ZFS (I don't think it needs defending, anyway) but rather I'm trying
> > to test my mental model of how both work.
>
> if you end up rewriting a free block in zfs, there sure is.  you
> can't decide which one is correct.

You don't have to "decide". You get to use the generation # for that.

> > P.S. Oh, and in case of ZFS a damaged vdev will be detected (and
> > possibly re-silvered) under normal working conditions, while
> > fossil might not even notice a corruption.
>
> not true.  one of many score checks:
>
> srv/lump.c:103: 				seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

I don't buy this argument for a simple reason: here's a very
easy example that proves my point:

term% fossil/fossil -f /tmp/fossil.bin
fsys: dialing venti at net!$venti!venti
warning: connecting to venti: Connection refused
term% mount /srv/fossil /n/f
term% cd /n/f/test
term% echo 'this  is innocent text' > text.txt
term% cat text.txt
this  is innocent text
term% dd -if /dev/cons -of /tmp/fossil.bin -bs 1 -count 8 -oseek 278528 -trunc 0
this WAS
8+0 records in
8+0 records out

term% rm /srv/fossil /srv/fscons
term% fossil/fossil -f /tmp/fossil.bin
fsys: dialing venti at net!$venti!venti
warning: connecting to venti: Connection refused
create /active/adm: file already exists
create /active/adm adm sys d775: create /active/adm: file already exists
create /active/adm/users: file already exists
create /active/adm/users adm sys 664: create /active/adm/users: file already exists
        nuser 5 len 84
term% mount /srv/fossil /n/f2
term% cat /n/f2/test/text.txt
this WAS innocent text
term%

Of course, with ZFS, the above corruption would always be
noticed and sometimes (depending on your vdev setup)
even silently fixed.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 22:30                                   ` Roman V. Shaposhnik
@ 2009-01-20 23:36                                     ` erik quanstrom
  2009-01-21  1:43                                       ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-20 23:36 UTC (permalink / raw)
  To: 9fans

> > > Got it. However, I'm still not fully convinced there's a definite edge
> > > one way or the other. Don't get me wrong: I'm not trying to defend
> > > ZFS (I don't think it needs defending, anyway) but rather I'm trying
> > > to test my mental model of how both work.
> >
> > if you end up rewriting a free block in zfs, there sure is.  you
> > can't decide which one is correct.
>
> You don't have to "decide". You get to use the generation # for that.
>

what generation number?  are there other things that your argument
depends on that you haven't mentioned yet?

> > not true.  one of many score checks:
> >
> > srv/lump.c:103: 				seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);
>
> I don't buy this argument for a simple reason: here's a very
> easy example that proves my point:
>
> term% fossil/fossil -f /tmp/fossil.bin
> fsys: dialing venti at net!$venti!venti
> warning: connecting to venti: Connection refused

well, there's your problem.  you corrupted
the cache, not the venti store.  (you have no
venti store in this example.)

i should have been more clear that venti does the
checking.  there are many things that fossil doesn't
do that it should.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20  6:48                     ` Roman Shaposhnik
  2009-01-20 14:13                       ` erik quanstrom
@ 2009-01-20 23:52                       ` andrey mirtchovski
  2009-01-21  4:49                         ` Dave Eckhardt
                                           ` (2 more replies)
  1 sibling, 3 replies; 91+ messages in thread
From: andrey mirtchovski @ 2009-01-20 23:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Is that how it was from the get-go, or did you use venti-based solutions
> before?

it's how i found it.

>> i have two zfs servers and about 10 pools of
>> different sizes with several hundred different zfs filesystems and
>> volumes of raw disk exported via iscsi.
>
> What kind of clients are on the other side of iscsi?

linux machines.

> You're using it on Linux?

the zfs servers are OpenSolaris boxes.

> Aha! And here are my first questions: you say that I can run multiple
> fossils
> off of the same venti and thus have a setup that is very close to zfs
> clones:
>   1. how do you do that exactly? fossil -f doesn't work for me (nor should
> it according to the docs)

i meant formatting the fossil disk with flfmt -v, sorry. it had been
quite a while since i last had to restart from an old venti score :)
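
i.e. (score and partition invented):

    fossil/flfmt -v 0123456789abcdef0123456789abcdef01234567 /dev/sdC0/fossil
    fossil/fossil -f /dev/sdC0/fossil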

>   2. how do you work around the fact that each fossil needs its own
>        partition (unlike ZFS where all the clones can share the same pool
>        of blocks)?

ultimately all blocks are shared on the same venti server unless you
use separate ones. fossil does have the functionality to serve two
different file systems from two different disks, but i don't think
anyone has used that (but see the example at the end).

> I think I understand it now (except for the fossil -f part), but how do
> you promote (zfs promote) such a clone?

i'm unconvinced that 'promoting' is a genuine feature: it seems to me
that the designers had to invent 'promoting' because they made the
decision to make snapshots read-only in the first place. perhaps i'm
wrong, but if the purpose of promoting something is to make it a true
member of the filesystem community (with all capabilities that
entails), then the corresponding feature in fossil would be to
instantiate one from the particular venti score for the dump. i.e.,
flfmt -v.

> I see what you mean, but in the case of venti -- nothing disappears, really.
> From that perspective you can sort of make those zfs clones linger.
> The storage consumption won't be any different, right?

the storage consumption should be the same, i presume. my problem is
that in the case of zfs having several hundred snapshots significantly
degrades the performance of the management tools, to the extent that
zfs list takes 30 seconds with about a thousand entries. compare that
to fossil handling 5 years' worth of daily dumps in less than a second.
but that's not really a serious argument ;)

>
> Great! I tried to do as much homework as possible (hence the delay) but
> I still have some questions left:
>    0. A dumb one: what's the proper way of cleanly shutting down fossil
>    and venti?

see fshalt. it used to be, like most other things, that one could just
turn the machine off without worry. then some bad things happened and
fshalt was written.

>   1. What's the use of copying arenas to CD/DVD? Is it purely backup,
>    since they have to stay on-line forever?

people who back up to cd/dvd can answer that :)

>   3. Do you think it's a good idea to have volume management be
>   part of the filesystem, since that way you can try to heal the data
>   on-the-fly?

i don't know...

>   4. If I have a venti server and a bunch of sha1 codes, can I somehow
>   instantiate a single fossil serving all of them under /archive?

not sure if this will work. you'll need as many partitions as you have
sha1 scores. then for each, do fossil/flfmt -v score partition.

once you've started fossil on the console type, for each partition/score:

fsys somename config partition
fsys somename venti ventiserver
fsys somename open

it's convoluted, yes. there may be an easier way. i know of people
using vacfs and vac to back up their linux machines to venti. actions
like the ones you're describing would be much easier there, although i
am not sure vacfs has all the functionality to be a usable file system
(for example, it's read-only).

for my personal $0.02 i will say that this argument seems to revolve
around trying to bend fossil and venti to match the functionality of
zfs and the design decisions of the team that wrote it. i, frankly,
think that it should be the other way around; zfs should provide the
equivalent of the fossil/venti snapshot/dump functionality to its
users. that, to me, would be a benefit (of course it gets you sued by
netapp too, but that's beside the point). all these
filesystem/snapshot/clone games are just a bunch of toys to make the
admins happy and have little effective use for the end user.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 23:36                                     ` erik quanstrom
@ 2009-01-21  1:43                                       ` Roman V. Shaposhnik
  2009-01-21  2:02                                         ` erik quanstrom
  2009-01-21 19:02                                         ` Uriel
  0 siblings, 2 replies; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-21  1:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-20 at 18:36 -0500, erik quanstrom wrote:
> > > > Got it. However, I'm still not fully convinced there's a definite edge
> > > > one way or the other. Don't get me wrong: I'm not trying to defend
> > > > ZFS (I don't think it needs defending, anyway) but rather I'm trying
> > > > to test my mental model of how both work.
> > >
> > > if you end up rewriting a free block in zfs, there sure is.  you
> > > can't decide which one is correct.
> >
> > You don't have to "decide". You get use generation # for that.
> >
>
> what generation number?

I'm talking about a field in each ZFS block pointer. The
field is actually called "birth txg", but I thought alluding
to VtEntry.gen would make it easier to understand what I had
in mind.

> are there other things that your argument
> depends on that you haven't mentioned yet?

Fair question. It depends on at least a cursory reading
of the ZFS on-disk specification. I felt uneasy in this
conversation precisely because I had only a very vague recollection
of the Venti/Fossil paper. I guess it cuts both ways:
   http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

> > term% fossil/fossil -f /tmp/fossil.bin
> > fsys: dialing venti at net!$venti!venti
> > warning: connecting to venti: Connection refused
>
> well, there's your problem.  you corrupted
> the cache, not the venti store.  (you have no
> venti store in this example.)

I was specifically referring to "normal operations",
to conjure an image of a typical setup of fossil+venti.

In such a setup a corrupted block from a fossil
partition will go undetected and could end up
being stored in venti. At that point it will become
venti's "problem".

> i should have been more clear that venti does the
> checking.  there are many things that fossil doesn't
> do that it should.

Sure, but I can't really use venti without using
fossil (again: we are talking about a typical setup
here, not something like vac/vacfs), can I?

If I can NOT, then fossil becomes a weak link that
can let corrupted data go undetected all the way
to a venti store.

This is quite worrisome for me. At least compared to
ZFS it is.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21  1:43                                       ` Roman V. Shaposhnik
@ 2009-01-21  2:02                                         ` erik quanstrom
  2009-01-26  6:28                                           ` Roman V. Shaposhnik
  2009-01-21 19:02                                         ` Uriel
  1 sibling, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-21  2:02 UTC (permalink / raw)
  To: 9fans

> > well, there's your problem.  you corrupted
> > the cache, not the venti store.  (you have no
> > venti store in this example.)
>
> I was specifically referring to "normal operations",
> to conjure an image of a typical setup of fossil+venti.
>
> In such a setup a corrupted block from a fossil
> partition will go undetected and could end up
> being stored in venti. At that point it will become
> venti's "problem".

it's important to keep in mind that fossil is just a write buffer.
it is not intended for the permanent storage of data.  while
corrupt data could end up in venti, the exposure lies only
between snapshots.  you can roll back to the previous good
score and continue.

ken's fs has a proper cache.  a corrupt cache can be recovered
from by dumping the cache and restarting from the last good
superblock.  in the days when the fs was really a worm stored
on mo disks, the worm was said to be very reliable storage.
with raid+scrubbing we try to overcome the limitations of
magnetic media.  while there isn't any block checksum, there is a
block tag.  tag checking has spotted a few instances of
corruption on my fs.  fs-level checksumming and encryption
is definately something i've considered.  actually, with tags
and encryption, checksumming is not necessary for error
detection.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 23:52                       ` andrey mirtchovski
@ 2009-01-21  4:49                         ` Dave Eckhardt
  2009-01-21  6:38                         ` Steve Simon
  2009-01-26  6:16                         ` Roman V. Shaposhnik
  2 siblings, 0 replies; 91+ messages in thread
From: Dave Eckhardt @ 2009-01-21  4:49 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

>> What's the use of copying arenas to CD/DVD?  Is it purely
>> backup, since they have to stay on-line forever?

> people who back up to cd/dvd can answer that :)

My venti, which backs a fossil used by 70 student accounts, of
which five are currently "active", fills arenas *very* slowly.
I burn the newest arena to one of two CD-RW's every week or so,
and when an arena fills I burn it and all the previous ones to
CD-R (nobody knows how long CD-R's last, because by the time
you can do a longevity study the dyes have all changed).  At
the end of most semesters I burn just the most recent arena
to CD-R even though it's not full.
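
The mechanics are roughly this (a sketch from memory; the partition
and arena names are invented):

   venti/rdarena /dev/sdC0/arenas arenas0 > /tmp/arenas0
   # then burn /tmp/arenas0 to disc, e.g. with cdfs(4)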

Off-site I have a small stack of CD-R's, total cost maybe $5.
The two CD-RW's set me back around $3 each, a couple years ago
when they were more expensive.

This wouldn't be fun for somebody generating lots of new
data, but then again I'm not hosting a Debian mirror, and
this way I'd still have *something* even after a machine room
flood or EMP event.

Dave Eckhardt



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 23:52                       ` andrey mirtchovski
  2009-01-21  4:49                         ` Dave Eckhardt
@ 2009-01-21  6:38                         ` Steve Simon
  2009-01-21 14:02                           ` erik quanstrom
  2009-01-26  6:16                         ` Roman V. Shaposhnik
  2 siblings, 1 reply; 91+ messages in thread
From: Steve Simon @ 2009-01-21  6:38 UTC (permalink / raw)
  To: 9fans

> ... fossil does have the functionality to serve two
> different file systems from two different disks, but i don't  think
> anyone has used that ...

I do this: 'main', backed up by venti, and 'other', which holds useful stuff
that needn't be backed up, e.g. RFCs, cdrom images, datasheets, etc. The
latter is accessed via 9fs juke, as an homage to the CDROM jukebox that once
provided a similar filesystem at the labs.

-Steve



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21  6:38                         ` Steve Simon
@ 2009-01-21 14:02                           ` erik quanstrom
  0 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-21 14:02 UTC (permalink / raw)
  To: 9fans

On Wed Jan 21 01:40:13 EST 2009, steve@quintile.net wrote:
> > ... fossil does have the functionality to serve two
> > different file systems from two different disks, but i don't  think
> > anyone has used that ...
>
> I do this: 'main', backed up by venti, and 'other', which holds useful stuff
> that needn't be backed up, e.g. RFCs, cdrom images, datasheets, etc. The
> latter is accessed via 9fs juke, as an homage to the CDROM jukebox that once
> provided a similar filesystem at the labs.

actually, it was an hp jukebox that had mo disks.
alliance (née plasmon) makes 60gb udo2 drives
  http://www.plasmon.com/archive_solutions/udodrives.html
and these libraries
  http://www.plasmon.com/archive_solutions/glibrary.html
the media are supposedly good for 50 years.

www.quanstro.net/plan9/disklessfs.pdf describes coraid's
worm-replacement strategy.  it is both better (offsite,
very fast access) and not better (the media are less reliable
and not write-once).

it would be neat to have a filesystem built as
filsys main cpe2.0"kcache"{e2.1jw0w1}
with all the speed of disks and a permanent record,
but clearly not very cost effective.  and direct-
attach storage doesn't look like the right place for
the worm.  it should be offsite.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21  1:43                                       ` Roman V. Shaposhnik
  2009-01-21  2:02                                         ` erik quanstrom
@ 2009-01-21 19:02                                         ` Uriel
  2009-01-21 19:53                                           ` Steve Simon
                                                             ` (2 more replies)
  1 sibling, 3 replies; 91+ messages in thread
From: Uriel @ 2009-01-21 19:02 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik <rvs@sun.com> wrote:
> I was specifically referring to "normal operations",
> to conjure an image of a typical setup of fossil+venti.
>
> In such a setup a corrupted block from a fossil
> partition will go undetected and could end up
> being stored in venti. At that point it will become
> venti's "problem".
>
>> i should have been more clear that venti does the
>> checking.  there are many things that fossil doesn't
>> do that it should.
>
> Sure, but I can't really use venti without using
> fossil (again: we are talking about a typical setup
> here, not something like vac/vacfs), can I?
>
> If I can NOT, then fossil becomes a weak link that
> can let corrupted data go undetected all the way
> to a venti store.

Fossil has always been a weak link, and probably will always be until
somebody replaces it. There was some idea of replacing it with a
version of ken's fs that uses a venti backend...

Venti's rock-solid design is the only thing that makes fossil
minimally tolerable despite its usual tendency of stepping on its hair
and falling on its face.

uriel

> This is quite worrisome for me. At least compared to
> ZFS it is.
>
> Thanks,
> Roman.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21 19:02                                         ` Uriel
@ 2009-01-21 19:53                                           ` Steve Simon
  2009-01-24  3:15                                             ` Roman V. Shaposhnik
  2009-01-21 20:01                                           ` erik quanstrom
  2009-01-24  3:19                                           ` Roman V. Shaposhnik
  2 siblings, 1 reply; 91+ messages in thread
From: Steve Simon @ 2009-01-21 19:53 UTC (permalink / raw)
  To: 9fans

> Fossil has always been a weak link, and probably will always be until
> somebody replaces it. There was some idea of replacing it with a
> version of ken's fs that uses a venti backend...
>
> Venti's rock-solid design is the only thing that makes fossil
> minimally tolerable despite its usual tendency of stepping on its hair
> and falling on its face.

Interesting, you have first hand experience of this?

I have found fossil and venti to be a completely reliable combination,
running 24x7 on two machines for four years now.

I have had three failures of fossil, all due to disks dying, and two
of those were my own fault (overcooling).

-Steve



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21 19:02                                         ` Uriel
  2009-01-21 19:53                                           ` Steve Simon
@ 2009-01-21 20:01                                           ` erik quanstrom
  2009-01-24  3:19                                           ` Roman V. Shaposhnik
  2 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-21 20:01 UTC (permalink / raw)
  To: 9fans

> Fossil has always been a weak link, and probably will always be until
> somebody replaces it. There was some idea of replacing it with a
> version of ken's fs that uses a venti backend...

i looked into how that would go just enough to see
that venti would work at cross purposes to the
fs.  having a worm address doesn't make much sense
when you can address by content.

in hindsight, that was likely obvious to everyone
but me.

i think ken's fs makes perfect sense without venti.
it has reasonable device support these days
(aoe, ata, ahci, marvell 88sx).

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21 19:53                                           ` Steve Simon
@ 2009-01-24  3:15                                             ` Roman V. Shaposhnik
  2009-01-24  3:36                                               ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-24  3:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, 2009-01-21 at 19:53 +0000, Steve Simon wrote:
> > Fossil has always been a weak link, and probably will always be until
> > somebody replaces it. There was some idea of replacing it with a
> > version of ken's fs that uses a venti backend...
> >
> > Venti's rock-solid design is the only thing that makes fossil
> > minimally tolerable despite its usual tendency of stepping on its hair
> > and falling on its face.
>
> Interesting, you have first hand experience of this?
>
> I have found fossil and venti to be a completely reliable combination,
> running 24x7 on two machines for four years now.
>
> I have had three failures of fossil, all due to disks dying, and two
> of those were my own fault (overcooling).

You never know when end-to-end data consistency will start to really
matter. Just the other day I attended a cloud conference where
some Amazon EC2 customers were swapping stories of Amazon's networking
"stack" malfunctioning and silently corrupting data that was written
onto EBS. All of a sudden, something like ZFS started to sound like
a really good idea to them.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21 19:02                                         ` Uriel
  2009-01-21 19:53                                           ` Steve Simon
  2009-01-21 20:01                                           ` erik quanstrom
@ 2009-01-24  3:19                                           ` Roman V. Shaposhnik
  2009-01-24  3:25                                             ` erik quanstrom
  2 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-24  3:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, 2009-01-21 at 20:02 +0100, Uriel wrote:
> On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik <rvs@sun.com> wrote:
> > Sure, but I can't really use venti without using
> > fossil (again: we are talking about a typical setup
> > here, not something like vac/vacfs), can I?
> >
> > If I can NOT, then fossil becomes a weak link that
> > can let corrupted data go undetected all the way
> > to a venti store.
>
> Fossil has always been a weak link, and probably will always be until
> somebody replaces it. There was some idea of replacing it with a
> version of ken's fs that uses a venti backend...
>
> Venti's rock-solid design is the only thing that makes fossil
> minimally tolerable despite its usual tendency of stepping on its hair
> and falling on its face.

After spending some time reading the sources and grokking fossil
I don't think it is a walking disaster. Far from it.

There are a couple of places where things can be improved,
to make *me* happier (YMMV), and I'll try to focus on these
in replying to Andrei's email. Just to get some closure on
this discussion.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-24  3:19                                           ` Roman V. Shaposhnik
@ 2009-01-24  3:25                                             ` erik quanstrom
  0 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-24  3:25 UTC (permalink / raw)
  To: 9fans

> After spending some time reading the sources and grokking fossil
> I don't think it is a walking disaster. Far from it.
>
> There are a couple of places where things can be improved,
> to make *me* happier (YMMV), and I'll try to focus on these
> in replying to Andrei's email. Just to get some closure on
> this discussion.

it's important to note, though, that fossil is a write
buffer and not a proper cache.  i believe this fact
is the main source of legitimate gripes with fossil.

the other source of trouble is that both fossil and
venti have at times suffered from being quite unfriendly
when shut down unexpectedly.  since they run on
cpu servers, and since there is a temptation to have
an all-in-wonder cpu server, unexpected shutdowns
can be more common than one would like.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-24  3:15                                             ` Roman V. Shaposhnik
@ 2009-01-24  3:36                                               ` erik quanstrom
  2009-01-26  6:21                                                 ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-24  3:36 UTC (permalink / raw)
  To: 9fans

> You never know when end-to-end data consistency will start to really
> matter. Just the other day I attended a cloud conference where
> some Amazon EC2 customers were swapping stories of Amazon's networking
> "stack" malfunctioning and silently corrupting data that was written
> onto EBS. All of a sudden, something like ZFS started to sound like
> a really good idea to them.

i know we need to bow down before zfs's greatness, but i still have
some questions. ☺

does ec2 corrupt all one's data en masse?  how do you do meaningful
redundancy in a cloud where one controls none of the failure-prone
pieces?

finally, if p is the probability of a lost block, when does p become too
large for zfs' redundancy to overcome failures?  does this depend on
the amount of i/o one does on the data, or does zfs scrub at a minimum
rate anyway?  if it does, that would be expensive.

maybe ec2 is heads amazon wins, tails you lose?

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-20 23:52                       ` andrey mirtchovski
  2009-01-21  4:49                         ` Dave Eckhardt
  2009-01-21  6:38                         ` Steve Simon
@ 2009-01-26  6:16                         ` Roman V. Shaposhnik
  2009-01-26 16:22                           ` Russ Cox
  2 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26  6:16 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-20 at 16:52 -0700, andrey mirtchovski wrote:
> for my personal $0.02 i will say that this argument seems to revolve
> around trying to bend fossil and venti to match the functionality of
> zfs and the design decisions of the team that wrote it.

That is NOT the conversation I'm interested in. My main objective is
to evaluate the venti/fossil approach to storage and the kind of benefits
it might provide. It is inevitable that I will contrast venti/fossil
with ZFS, simply because that is the background I'm coming from.

> i, frankly, think that it should be the other way around; zfs should
> provide the equivalent of the fossil/venti snapshot/dump functionality
> to its users. that, to me would be a benefit.

Ok. It is fair to turn the tables. So now, let me ask you: what are
the benefits of fossil/venti that you want to see in ZFS? So far
the only real issue that you've identified is this:

  ||| where the second choice becomes a nuisance for me is in the
  ||| case where one has thousands of clones and needs to keep track
  ||| of thousands of names in order to ensure that when the right one
  ||| has finished the right clone disappears.

And I think it is a valid one. But is there anything else (except
the issues that have to do with the fact that ZFS lives in UNIX
whereas fossil/venti live in Plan 9)?

As for me, here's my wish list so far. It is all about fossil, since
it looks like venti is quite fine (at least for my purposes) the
way it is:
     1. Block consistency. Yes I know the argument here is that you
     can always roll back to the last known archival snapshot on venti.
     But the point is to know *when* to roll back. And unless fossil
     warns you that a block has been corrupted, you wouldn't know.

     2. live "mounting" of arbitrary scores corresponding to vac
     VtRoot's to arbitrary sub-directories in my fossil tree. After
     all, if I can do "create" of regular files and sub-directories
     via fossil's console, why can't I create pointers to the existing
     venti file-hierarchies?

     3. Not sure whether this is a fossil requirement or not, but I
     feel uneasy that a root score is sort of unrecoverable from the
     pure venti archive. It's either that I know it or I don't.

> all these filesystem/snapshot/clone games are just a bunch of toys to
> make the admins happy and have little effective use for the end user.

I disagree. Remember that this whole conversation started from
the simple premise that a good archival system could be an
efficient replacement for an SCM. If your end users are
software developers, that IS very relevant to them.

It is actually quite remarkable how similar the models of
fossil/venti and Git seem to be: both build on the notion
of the immutable history. Both address the history by the
hash index. Both have a mutable area whose only purpose
is to stage data for the subsequent commit to the permanent
history. Etc.
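
Loosely, side by side (the fossil command is from fossilcons(8);
treat the pairing as an analogy, not an equivalence):

   # git: stage, then commit to the immutable history
   git add .
   git commit -m checkpoint
   # fossil: the write buffer is the stage; snap -a archives to venti
   echo 'fsys main snap -a' >>/srv/fscons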

> > I see what you mean, but in the case of venti -- nothing disappears, really.
> > From that perspective you can sort of make those zfs clones linger.
> > The storage consumption won't be any different, right?
>
> the storage consumption should be the same, i presume. my problem is
> that in the case of zfs having several hundred snapshots significantly
> degrades the performance of the management tools to the extent that
> zfs list takes 30 seconds with about a thousand entries.

Really?!?

> compared to
> fossil handling 5 years worth of daily dumps in less than a second.
> but that's not really a serious argument ;)

And what's the output of
   term% ls -d <path-to-your-fossil>/archive/*/*/* | wc -l

> > Great! I tried to do as much homework as possible (hence the delay) but
> > I still have some questions left:
> >    0. A dumb one: what's the proper way of cleanly shutting down fossil
> >    and venti?
>
> see fshalt.

Hm. There doesn't seem to be much shutdown code for fossil/venti
in there. Does that mean that sync'ing venti and then just slay(1)'ing
it is ok?

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-24  3:36                                               ` erik quanstrom
@ 2009-01-26  6:21                                                 ` Roman V. Shaposhnik
  2009-01-26 13:53                                                   ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26  6:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Fri, 2009-01-23 at 22:36 -0500, erik quanstrom wrote:
> > You never know when end-to-end data consistency will start to really
> > matter. Just the other day I attended the cloud conference where 
> > some Amazon EC2 customers were swapping stories of Amazon's networking
> > "stack" malfunctioning and silently corrupting data that was written
> > onto EBS. All of sudden, something like ZFS started to sound like 
> > a really good idea to them.
> 
> i know we need to bow down before zfs's greatness, but i still have
> some questions. ☺

Oh, come on! I said "something like ZFS" ;-) These guys are on
Linux, for crying out loud! They need to be saved one way
or the other (and Solaris at least has *some* AMIs available
on EC2).

> does ec2 corrupt all one's data en masse?

From what I understood, it was NOT en masse. But the scary
thing is that they only noticed because of dumb luck
(the app core dumped because the input it was getting was not
properly formatted, or something).

> how do you do meaningful redundancy in a cloud where one controls
> none of the failure-prone pieces?

Well, that's the very point I'm trying to make: you have
to be at least notified that your data got corrupted.

Once you do get notified, you can recover in a variety
of different ways: from simply re-uploading/re-generating
your data all the way to RAID-like things.

> finally, if p is the probability of a lost block, when does p become too
> large for zfs' redundancy to overcome failures?

It depends on the vdev configuration. You can do simple mirroring
or you can do RAID-Z (which is more or less RAID-5 done properly).

> does this depend on the amount of i/o one does on the data, or does
> zfs scrub at a minimum rate anyway?  if it does, that would be expensive.

You can do resilvering (fixing data that is known to be
bad) or scrubbing (verifying and fixing *all* the data). You
can also configure things so that bad blocks either do or
don't trigger automatic resilvering. Does this answer your question?
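
For illustration (device names are placeholders; the two creates are
alternatives):

   # a mirrored pair, or a single-parity RAID-Z vdev
   zpool create tank mirror c0t0d0 c0t1d0
   zpool create tank raidz c0t0d0 c0t1d0 c0t2d0
   # walk every allocated block, verifying checksums and repairing
   zpool scrub tank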

> maybe ec2 is heads amazon wins, tails you lose?

The scariest takeaway from the conference was: with the economy
the way it is, physical on-site datacenters are becoming a
luxury for all but the wealthiest companies. Thus, whether
we like it or not, virtual data centers are here to stay.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-21  2:02                                         ` erik quanstrom
@ 2009-01-26  6:28                                           ` Roman V. Shaposhnik
  2009-01-26 13:42                                             ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26  6:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 2009-01-20 at 21:02 -0500, erik quanstrom wrote:
> > In such a setup a corrupted block from a fossil
> > partition will go undetected and could end up
> > being stored in venti. At that point it will become
> > venti's "problem".
>
> it's important to keep in mind that fossil is just a write buffer.
> it is not intended for the permanent storage of data.

Sure. But it must store the data *intact* long enough
for me to be able to do a snap. It has to be able to
at least warn me about data corruption.

> while corrupt data could end up in venti, the exposure lies only
> between snapshots.  you can roll back to the previous good
> score and continue.

That is my *entire* point. If fossil doesn't tell you that
the data in its buffer was/is corrupted -- you have no
reason to roll back.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26  6:28                                           ` Roman V. Shaposhnik
@ 2009-01-26 13:42                                             ` erik quanstrom
  2009-01-26 16:15                                               ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-26 13:42 UTC (permalink / raw)
  To: 9fans

> > it's important to keep in mind that fossil is just a write buffer.
> > it is not intended for the permanent storage of data.
>
> Sure. But it must store the data *intact* long enough
> for me to be able to do a snap. It has to be able to
> at least warn me about data corruption.

do you have any references to spontaneous data corruption
happening so soon on media that can be written elsewhere
without corruption?  ian ibm paper argus for raid[56] + chksum
that claimed that the p(lifetime) = 10^-13.

http://domino.watson.ibm.com/library/cyberdig.nsf/80741a79b3d5f4d085256b3600733b05/ca7b221ad09be77885257149004f7c53?OpenDocument&Highlight=0,RZ3652

but i didn't see any reason that this would apply to short-term
storage.

> That is my *entire* point. If fossil doesn't tell you that
> the data in its buffer was/is corrupted -- you have no
> reason to roll back.

if you're that worried, you do not need to modify fossil.
why don't you write an sdecc driver that takes as configuration
another sd device and a block size?  then you can just
add ecc on the way in and check it on the way out.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26  6:21                                                 ` Roman V. Shaposhnik
@ 2009-01-26 13:53                                                   ` erik quanstrom
  2009-01-26 16:21                                                     ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-26 13:53 UTC (permalink / raw)
  To: 9fans

> It depends on the vdev configuration. You can do simple mirroring
> or you can do RAID-Z (which is more or less RAID-5 done properly).

"raid5 done properly"?  could you back up this claim?

also, with services like ec2, it's no use doing raid, since all
your data could be on the same drive regardless of what they tell
you.

> > does this depend on the amount of i/o one does on the data, or does
> > zfs scrub at a minimum rate anyway?  if it does, that would be expensive.
>
> You can do resilvering (fixing data that is known to be
> bad) or scrubbing (verifying and fixing *all* the data). You
> can also configure things so that bad blocks either do or
> don't trigger automatic resilvering. Does this answer your question?

no.  not at all.  if you're serious about using ec2, one of the
costs you need to control is your b/w usage.  you're going to
notice overly-aggressive scrubbing in your monthly bill.

> > maybe ec2 is heads amazon wins, tails you lose?
>
> The scariest takeaway from the conference was: with the economy
> the way it is physical on-site datacenters are becoming a
> luxury for all but the most wealthy companies. Thus whether
> we like it or not virtual data centers are here to stay.

if the numbers i came up with for coraid are correct, it would cost
coraid about 50x more to use ec2.  that is, if we can run plan 9
at all.

- erik



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 13:42                                             ` erik quanstrom
@ 2009-01-26 16:15                                               ` Roman V. Shaposhnik
  2009-01-26 16:39                                                 ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26 16:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 2009-01-26 at 08:42 -0500, erik quanstrom wrote:
> > That is my *entire* point. If fossil doesn't tell you that
> > the data in its buffer was/is corrupted -- you have no
> > reason to roll back.
>
> if you're that worried, you do not need to modify fossil.
> why don't you write an sdecc driver that takes as configuration
> another sd device and a block size?  then you can just
> add ecc on the way in and check it on the way out.

This approach will work too. But it seems that asking fossil
to verify a checksum when the block is about to go to venti
is not that much of an overhead.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 13:53                                                   ` erik quanstrom
@ 2009-01-26 16:21                                                     ` Roman V. Shaposhnik
  2009-01-26 17:37                                                       ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26 16:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 2009-01-26 at 08:53 -0500, erik quanstrom wrote:
> > It depends on the vdev configuration. You can do simple mirroring
> > or you can do RAID-Z (which is more or less RAID-5 done properly).
>
> "raid5 done properly"?  could you back up this claim?

Yes. See here for details:
   http://blogs.sun.com/bonwick/entry/raid_z

> > > does this depend on the amount of i/o one does on the data, or does
> > > zfs scrub at a minimum rate anyway?  if it does, that would be expensive.
> >
> > You can do resilvering (fixing data that is known to be
> > bad) or scrubbing (verifying and fixing *all* the data). You
> > can also configure things so that bad blocks either do or
> > don't trigger automatic resilvering. Does this answer your question?
>
> no.  not at all.

Then, please, restate it.

> if you're serious about using ec2, one of the
> costs you need to control is your b/w usage.  you're going to
> notice overly-aggressive scrubbing in your monthly bill.

Only if you asked for that to happen. It's all under your control.
You may decide to never ever do scrubbing.

> > The scariest takeaway from the conference was: with the economy
> > the way it is, physical on-site datacenters are becoming a
> > luxury for all but the wealthiest companies. Thus, whether
> > we like it or not, virtual data centers are here to stay.
>
> if the numbers i came up with for coraid are correct, it would cost
> coraid about 50x more to use ec2.  that is, if we can run plan 9
> at all.

You may think what you want, but obviously quite a few existing small to
mid-size companies disagree, including a couple of labs with MPI apps
now running on EC2. Maybe your numbers are wrong, maybe your usage
patterns are different. Who knows.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26  6:16                         ` Roman V. Shaposhnik
@ 2009-01-26 16:22                           ` Russ Cox
  2009-01-26 19:42                             ` Roman V. Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: Russ Cox @ 2009-01-26 16:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> As for me, here's my wish list so far. It is all about fossil, since
> it looks like venti is quite fine (at least for my purposes) the
> way it is:
>     1. Block consistency. Yes I know the argument here is that you
>     can always roll back to the last known archival snapshot on venti.
>     But the point is to know *when* to roll back. And unless fossil
>     warns you that a block has been corrupted, you wouldn't know.

I don't understand what you mean.  Do you want fossil to tell
you when your disk is silently corrupting data, or something else?

>     2. live "mounting" of arbitrary scores corresponding to vac
>     VtRoot's to arbitrary sub-directories in my fossil tree. After
>     all, if I can do "create" of regular files and sub-directories
>     via fossil's console why can't I create pointers to the existing
>     venti file-hierarchies?

The only reason this is hard is the choice of qids.
You need to decide whether to reuse the qids in the archive
or renumber them to avoid conflicts with existing qids.
The vac format already has a way to offset the qids of whole
subtrees, but then if you make the tree editable and new files are
created, it gets complicated.

>     3. Not sure whether this is a fossil requirement or not, but I
>     feel uneasy that a root score is sort of unrecoverable from the
>     pure venti archive. It's either that I know it or I don't.

I don't understand what you mean here either.
From a venti archive, you do cat file.vac to find
the actual score.

For what it's worth, I'll be the first to admit that fossil has a
ton of rough edges and things that could be done better.
There were early design decisions that we didn't know the
implications of until relatively late in the implementation,
and I would revisit many of those if I had the luxury of
doing it over.  It is very much version 0.

The amazing thing to me about fossil is how indestructible
it is when used with venti.  While I was finishing fossil,
I ran it on my laptop as my day-to-day file system, and I never
lost a byte of data despite numerous bugs, because venti
itself was solid, and I always did an archive to venti before
trying out new code.  Once you see the data in the archive
tree, you can be very sure it's not going away.

> It is actually quite remarkable how similar the models of
> fossil/venti and Git seem to be: both build on the notion
> of the immutable history. Both address the history by the
> hash index. Both have a mutable area whose only purpose
> is to stage data for the subsequent commit to the permanent
> history. Etc.

I don't think it's too remarkable.  Content hash addressing was
in the air for the last decade or so and there were a lot of
systems built using it.  The one thing it does really well
is eliminate any worry about cache coherency and versioning.
That makes it very attractive for any system with large
amounts of data or multiple machines.  Once you've gone down
that route, you have to come to grips with how to implement
mutability in a fundamentally immutable system, and the
obvious way is with a mutable write buffer staging writes
out to the immutable storage.

> Hm. There doesn't seem to be much shutdown code for fossil/venti
> in there. Does that mean that sync'ing venti and then just slay(1)'ing
> it is ok?

Yes, it is.  Venti is designed to be crash-proof, as is fossil.
They get the write ordering right and pick up where they left off.
They are not, however, disk corruption-proof.
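
A minimal manual version of that, as a sketch (assuming fossil's
console is posted at /srv/fscons; command names per fossilcons(8)):

   echo fsys all sync >>/srv/fscons   # flush fossil's dirty blocks
   slay fossil | rc                   # then kill the processes
   slay venti | rc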

Russ


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 16:15                                               ` Roman V. Shaposhnik
@ 2009-01-26 16:39                                                 ` erik quanstrom
  2009-01-27  4:45                                                   ` Roman Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-26 16:39 UTC (permalink / raw)
  To: 9fans

> This approach will work too. But it seems that asking fossil
> to verify a checksum when the block is about to go to venti
> is not that much of an overhead.

if checksumming is a good idea, shouldn't it be available outside
fossil?

perhaps the argument is that it might be more efficient
to implement this inside fossil.  while this might be the case, i
don't see how the small overhead of an sd layer would matter
when you're assuming an ec2-style service, which will have a
minimum latency in the 10s of milliseconds.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 16:21                                                     ` Roman V. Shaposhnik
@ 2009-01-26 17:37                                                       ` erik quanstrom
  2009-01-27  4:51                                                         ` Roman Shaposhnik
  0 siblings, 1 reply; 91+ messages in thread
From: erik quanstrom @ 2009-01-26 17:37 UTC (permalink / raw)
  To: 9fans

> Yes. See here for details:
>    http://blogs.sun.com/bonwick/entry/raid_z

since these arguments rely heavily on the meme that
	software raid == bad
i have a hard time signing on.  i believe i'm repeating
myself by saying that afaik, there is no such thing as pure
hardware raid; that is, there is no hardware that does
all of what raid level n does in hardware.  even if it's
an embedded processor, it's all software raid.  perhaps
there's an xor engine to speed things along.

the other part of the argument — the "write hole" —
depends on two things that i don't think are universal:
a) zfs' demand for transactional storage, b) a particular
raid implementation.

fancy raid cards often have battery-backed ram, and thus
from the pov of the host, writes are atomic.  i don't have
any nda that lets me see the firmware for a variety of raid
devices, but i find it hard to believe that all raid vendors
rewrite the entire stripe whenever the write is smaller than
the stripe size, or that all rewrite the data before the
parity.

> You may think what you want, but obviously quite a few existing small to
> mid-size companies disagree. Including a couple of labs with MPI apps
> now running on EC2.

more people use windows than use plan 9.  should
i therefore conclude that my use of plan 9 is illogical?
http://en.wikipedia.org/wiki/Appeal_to_the_majority

why do you think that mpi has anything to do with
a plan 9 infrastructure?

> Maybe your numbers are wrong, maybe your usage
> patterns are different. Who knows.

a single cpu on ec2 costs $150/month.  my 6 personal
machines don't suck down that much juice.

the machines i have largely cost less than $500.  so
that's like $14/month.  that doesn't change the equation
much.

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 16:22                           ` Russ Cox
@ 2009-01-26 19:42                             ` Roman V. Shaposhnik
  2009-01-26 20:11                               ` Steve Simon
  0 siblings, 1 reply; 91+ messages in thread
From: Roman V. Shaposhnik @ 2009-01-26 19:42 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 2009-01-26 at 08:22 -0800, Russ Cox wrote:
> > As for me, here's my wish list so far. It is all about fossil, since
> > it looks like venti is quite fine (at least for my purposes) the
> > way it is:
> >     1. Block consistency. Yes I know the argument here is that you
> >     can always roll back to the last known archival snapshot on venti.
> >     But the point is to know *when* to roll back. And unless fossil
> >     warns you that a block has been corrupted, you wouldn't know.
>
> I don't understand what you mean.  Do you want fossil to tell
> you when your disk is silently corrupting data, or something else?

Implementation-wise, I would be happy to see the same score checks that
venti does implemented in fossil. Complaining like this:
   seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);
would be good enough.

> >     2. live "mounting" of arbitrary scores corresponding to vac
> >     VtRoot's to arbitrary sub-directories in my fossil tree. After
> >     all, if I can do "create" of regular files and sub-directories
> >     via fossil's console why can't I create pointers to the existing
> >     venti file-hierarchies?
>
> The only reason this is hard is the choice of qids.
> You need to decide whether to reuse the qids in the archive
> or renumber them to avoid conflicts with existing qids.
> The vac format already has a way to offset the qids of whole
> subtrees, but then if you make the tree editable and new files are
> created, it gets complicated.

I see. Thanks for the explanation.

> >     3. Not sure whether this is a fossil requirement or not, but I
> >     feel uneasy that a root score is sort of unrecoverable from the
> >     pure venti archive. It's either that I know it or I don't.
>
> I don't understand what you mean here either.
> From a venti archive, you do cat file.vac to find
> the actual score.

As I mentioned: this one is not really a hard requirement, but
rather me thinking out loud. To me it feels that Venti is
opaque, in the sense that if I don't know the score to give to flfmt -v,
then there's no way to browse through venti to see what
could be there (unless I get physical access to the arenas, I guess).

Now, suppose I have a fossil buffer that I constantly snap to venti.
That will build quite a lengthy chain of VtRoots. Then my fossil
buffer gets totally corrupted. I no longer know what the
score of the most recent snapshot was. And I don't think I know of any
way to find that out.

> The amazing thing to me about fossil is how indestructible
> it is when used with venti.

I agree. That has been very much the case during my short
evaluation of the two.

> > It is actually quite remarkable how similar the models of
> > fossil/venti and Git seem to be: both build on the notion
> > of the immutable history. Both address the history by the
> > hash index. Both have a mutable area whose only purpose
> > is to stage data for the subsequent commit to the permanent
> > history. Etc.
>
> I don't think it's too remarkable.  Content hash addressing was
> in the air for the last decade or so and there were a lot of
> systems built using it.  The one thing it does really well
> is eliminate any worry about cache coherency and versioning.
> That makes it very attractive for any system with large
> amounts of data or multiple machines.  Once you've gone down
> that route, you have to come to grips with how to implement
> mutability in a fundamentally immutable system, and the
> obvious way is with a mutable write buffer staging writes
> out to the immutable storage.

All true. Yet it is surprising how many DSCMs that were built
on the idea of hash-addressable history got the "implementation
of mutability" part wrong. Git is the closest one to what I
now understand to be the fossil/venti approach.

Thanks,
Roman.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 19:42                             ` Roman V. Shaposhnik
@ 2009-01-26 20:11                               ` Steve Simon
  0 siblings, 0 replies; 91+ messages in thread
From: Steve Simon @ 2009-01-26 20:11 UTC (permalink / raw)
  To: 9fans

> Now, suppose I have a fossil buffer that I constantly snap to venti.
> That will build quite a lengthy chain of VtRoots. Then my fossil
> buffer gets totally corrupted. I no longer know what the
> score of the most recent snapshot was. And I don't think I know of any
> way to find that out.

there is a command, fossil/last, which prints the last snapped root score.
I run this from cron nightly and send the resulting score to a remote machine.

If all else fails, there is a script in /sys/src/cmd/venti/words/dumpvacroots
which interrogates the http server built into venti and prints all the recent
root scores.
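
For example (the partition name is made up):

   fossil/last /dev/sdC0/fossil            # the most recent root score
   /sys/src/cmd/venti/words/dumpvacroots   # recent roots, via venti's http server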

I have had to use this in the past, when I had a dead disk and was less
careful with my scores - all was fine, but I learnt my lesson.

-Steve



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 16:39                                                 ` erik quanstrom
@ 2009-01-27  4:45                                                   ` Roman Shaposhnik
  0 siblings, 0 replies; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-27  4:45 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Jan 26, 2009, at 8:39 AM, erik quanstrom wrote:
>> This approach will work too. But it seems that asking fossil
>> to verify a checksum when the block is about to go to venti
>> is not that much of an overhead.
>
> if checksumming is a good idea, shouldn't it be available outside
> fossil?

It is available -- in venti ;-)

> perhaps the argument is that it might be more efficient
> to implement this inside fossil.

The argument has nothing to do with efficiency. However,
the way fossil is structured, I think you're right: it won't be
able to get additional benefits from its own checksumming.

>  while this might be the case, i
> don't see how the small overhead of an sd layer would matter
> when you're assuming an ec2-style service, which will have a
> minimum latency in the 10s of milliseconds.

Somehow you've got this strange idea that I'm engineering
something for ec2-style services. I am not. EC2 was a simple
example I used once. If it agitates you too much I promise
not to use it in the future ;-)

Thanks,
Roman.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-26 17:37                                                       ` erik quanstrom
@ 2009-01-27  4:51                                                         ` Roman Shaposhnik
  2009-01-27  5:44                                                           ` erik quanstrom
  0 siblings, 1 reply; 91+ messages in thread
From: Roman Shaposhnik @ 2009-01-27  4:51 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Jan 26, 2009, at 9:37 AM, erik quanstrom wrote:
> the other part of the argument — the "write hole"
> depends on two things that i don't think are universal
> a) zfs' demand for transactional storage

Huh?!?

> b) a particular raid implementation.
>
> fancy raid cards

I think you missed what the I in RAID is supposed to
be expanding into ;-)

>  i don't have any nda that let me see the firmware for a variety of  
> raid
> devices, but i find it hard to believe that all raid vendors
> rewrite the entire stripe whenever the write is smaller than
> the stripe size and all could rewrite the data before the
> parity.

Fancy ones might try to do fancy things, but see above.

> why do you think that mpi has anything to do with
> a plan 9 infastructure?

It is the other way around: the fact that Plan9 still
doesn't have anything to do with MPI keeps it
away from the kind of clusters Ron used to care
about (although, in reality, it is all about gcc anyway,
so MPI is a lesser argument here).

>> May be your numbers are wrong, may be your usage
>> patterns are different. Who knows.
>
> a single cpu on ec2 costs $150/month.

I don't know where you got that number, but my instance
on EC2 costs me about $70/month. Oh, wait! I know!
It's all because Solaris is so energy-efficient ;-)

> the machines i have largely cost less than $500.  so
> that's like $14/month.  that doesn't change the equation
> much.

I believe you are distorting my argument on purpose.

So let's just drop this conversation, ok?

Thanks,
Roman.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [9fans] Changelogs & Patches?
  2009-01-27  4:51                                                         ` Roman Shaposhnik
@ 2009-01-27  5:44                                                           ` erik quanstrom
  0 siblings, 0 replies; 91+ messages in thread
From: erik quanstrom @ 2009-01-27  5:44 UTC (permalink / raw)
  To: 9fans

>> the other part of the argument — the "write hole"
>> depends on two things that i don't think are universal
>> a) zfs' demand for transactional storage
>
> Huh?!?

why else would the zfs guys be worried about a
"write hole" for zfs?

what would happen to a raid-z if a write returned
as successful but was really only written to the disk's cache,
and before the whole write is completed, the disk or
chassis loses power?  isn't that also a "write hole"?

i suppose the answer to this problem is the checksumming.
but if that is the case, what is the point of raid-z?

- erik




^ permalink raw reply	[flat|nested] 91+ messages in thread

end of thread, other threads:[~2009-01-27  5:44 UTC | newest]

Thread overview: 91+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-22 15:27 [9fans] Changelogs & Patches? Venkatesh Srinivas
2008-12-22 15:29 ` erik quanstrom
2008-12-22 16:41 ` Charles Forsyth
2008-12-25  6:34   ` Roman Shaposhnik
2008-12-25  6:40     ` erik quanstrom
2008-12-26  4:28       ` Roman Shaposhnik
2008-12-26  4:45         ` lucio
2008-12-26  4:57         ` Anthony Sorace
2008-12-26  6:19           ` blstuart
2008-12-27  8:00           ` Roman Shaposhnik
2008-12-27 11:56             ` erik quanstrom
2008-12-30  0:31               ` Roman Shaposhnik
2008-12-30  0:57                 ` erik quanstrom
2009-01-05  5:19                   ` Roman V. Shaposhnik
2009-01-05  5:28                     ` erik quanstrom
2008-12-22 17:03 ` Devon H. O'Dell
2008-12-23  4:31   ` Uriel
2008-12-23  4:46 ` Nathaniel W Filardo
2008-12-25  6:50   ` Roman Shaposhnik
2008-12-25 14:37     ` erik quanstrom
2008-12-26 13:27       ` Charles Forsyth
2008-12-26 13:33         ` Charles Forsyth
2008-12-26 14:27         ` tlaronde
2008-12-26 17:25           ` blstuart
2008-12-26 18:14             ` tlaronde
2008-12-26 18:20               ` erik quanstrom
2008-12-26 18:52                 ` tlaronde
2008-12-26 21:44                   ` blstuart
2008-12-26 22:04                     ` Eris Discordia
2008-12-26 22:30                       ` erik quanstrom
2008-12-26 23:00                         ` blstuart
2008-12-27  6:04                         ` Eris Discordia
2008-12-27 10:36                           ` tlaronde
2008-12-27 16:27                             ` Eris Discordia
2008-12-29 23:54         ` Roman Shaposhnik
2008-12-30  0:13           ` hiro
2008-12-30  1:07           ` erik quanstrom
2008-12-30  1:48           ` Charles Forsyth
2008-12-30 13:18             ` Uriel
2008-12-30 15:06               ` C H Forsyth
2008-12-30 17:31                 ` Uriel
2008-12-31  1:58                   ` Noah Evans
2009-01-03 22:03           ` sqweek
2009-01-05  5:05             ` Roman V. Shaposhnik
2009-01-05  5:12               ` erik quanstrom
2009-01-06  5:06                 ` Roman Shaposhnik
2009-01-06 13:55                   ` erik quanstrom
2009-01-05  5:24               ` andrey mirtchovski
2009-01-06  5:49                 ` Roman Shaposhnik
2009-01-06 14:22                   ` andrey mirtchovski
2009-01-06 16:19                     ` erik quanstrom
2009-01-06 23:23                       ` Roman V. Shaposhnik
2009-01-06 23:44                         ` erik quanstrom
2009-01-08  0:36                           ` Roman V. Shaposhnik
2009-01-08  1:11                             ` erik quanstrom
2009-01-20  6:20                               ` Roman Shaposhnik
2009-01-20 14:19                                 ` erik quanstrom
2009-01-20 22:30                                   ` Roman V. Shaposhnik
2009-01-20 23:36                                     ` erik quanstrom
2009-01-21  1:43                                       ` Roman V. Shaposhnik
2009-01-21  2:02                                         ` erik quanstrom
2009-01-26  6:28                                           ` Roman V. Shaposhnik
2009-01-26 13:42                                             ` erik quanstrom
2009-01-26 16:15                                               ` Roman V. Shaposhnik
2009-01-26 16:39                                                 ` erik quanstrom
2009-01-27  4:45                                                   ` Roman Shaposhnik
2009-01-21 19:02                                         ` Uriel
2009-01-21 19:53                                           ` Steve Simon
2009-01-24  3:15                                             ` Roman V. Shaposhnik
2009-01-24  3:36                                               ` erik quanstrom
2009-01-26  6:21                                                 ` Roman V. Shaposhnik
2009-01-26 13:53                                                   ` erik quanstrom
2009-01-26 16:21                                                     ` Roman V. Shaposhnik
2009-01-26 17:37                                                       ` erik quanstrom
2009-01-27  4:51                                                         ` Roman Shaposhnik
2009-01-27  5:44                                                           ` erik quanstrom
2009-01-21 20:01                                           ` erik quanstrom
2009-01-24  3:19                                           ` Roman V. Shaposhnik
2009-01-24  3:25                                             ` erik quanstrom
2009-01-20  6:48                     ` Roman Shaposhnik
2009-01-20 14:13                       ` erik quanstrom
2009-01-20 16:19                         ` Steve Simon
2009-01-20 23:52                       ` andrey mirtchovski
2009-01-21  4:49                         ` Dave Eckhardt
2009-01-21  6:38                         ` Steve Simon
2009-01-21 14:02                           ` erik quanstrom
2009-01-26  6:16                         ` Roman V. Shaposhnik
2009-01-26 16:22                           ` Russ Cox
2009-01-26 19:42                             ` Roman V. Shaposhnik
2009-01-26 20:11                               ` Steve Simon
2008-12-27  7:40       ` Roman Shaposhnik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).