* [9fans] FS to skip/put-together duplicate files
@ 2007-08-10 11:39 Enrico Weigelt
2007-08-10 11:51 ` Gabriel Diaz
2007-08-10 12:26 ` erik quanstrom
0 siblings, 2 replies; 18+ messages in thread
From: Enrico Weigelt @ 2007-08-10 11:39 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Hi folks,
I host a lot of web applications which share 99% of their code.
Disk space is not the issue, but bandwidth for remote backup is.
So my idea is to let a filesystem automatically link together
equal files in the storage, but present them as separate ones.
Once a file gets changed, it will be unlinked/copied automatically.
Is there already such a filesystem?
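The behaviour asked for here (identical files stored once, split apart again when one of them changes) can be modelled in a few lines. This is a minimal in-memory sketch in Python, not an existing filesystem: the class DedupStore and its methods are hypothetical names, and SHA-1 is an assumed choice of content hash.

```python
import hashlib


class DedupStore:
    """Whole-file dedup: each path points at a content hash, each blob is kept once."""

    def __init__(self):
        self.blobs = {}   # content hash -> file contents
        self.refs = {}    # content hash -> number of paths pointing at it
        self.paths = {}   # path -> content hash

    def write(self, path, data):
        key = hashlib.sha1(data).hexdigest()
        if self.paths.get(path) == key:
            return                      # contents unchanged, nothing to do
        self._unlink(path)              # detach the path from its old blob
        if key not in self.blobs:
            self.blobs[key] = data
            self.refs[key] = 0
        self.refs[key] += 1
        self.paths[path] = key

    def read(self, path):
        return self.blobs[self.paths[path]]

    def _unlink(self, path):
        old = self.paths.pop(path, None)
        if old is None:
            return
        self.refs[old] -= 1
        if self.refs[old] == 0:         # last reference gone, drop the blob
            del self.blobs[old]
            del self.refs[old]


if __name__ == "__main__":
    store = DedupStore()
    store.write("app1/index.php", b"<?php echo 1;")
    store.write("app2/index.php", b"<?php echo 1;")
    print(len(store.blobs), "blob(s) stored for 2 identical files")
```

A backup tool walking such a store would only need to transfer each blob once, which is the bandwidth saving the question is after.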
thx
--
---------------------------------------------------------------------
Enrico Weigelt == metux IT service - http://www.metux.de/
---------------------------------------------------------------------
Please visit the OpenSource QM Taskforce:
http://wiki.metux.de/public/OpenSource_QM_Taskforce
Patches / Fixes for a lot dozens of packages in dozens of versions:
http://patches.metux.de/
---------------------------------------------------------------------
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-10 11:39 [9fans] FS to skip/put-together duplicate files Enrico Weigelt
@ 2007-08-10 11:51 ` Gabriel Diaz
2007-08-10 12:26 ` erik quanstrom
1 sibling, 0 replies; 18+ messages in thread
From: Gabriel Diaz @ 2007-08-10 11:51 UTC (permalink / raw)
To: weigelt, Fans of the OS Plan 9 from Bell Labs
hello
i think venti coalesces duplicates at the block level, so if the file
contents are the same you will have two files with references to the
same blocks, but read the venti paper to be sure :)
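A minimal sketch of that block-level idea in Python, assuming fixed-size blocks addressed by their SHA-1 hash (venti calls the hash a score). BlockStore is a hypothetical class and the 8192-byte block size is only an assumption for illustration; this is not venti's actual on-disk format or protocol.

```python
import hashlib

BLOCK = 8192  # assumed block size, for illustration only


class BlockStore:
    """Content-addressed store: a block is kept once, keyed by its SHA-1 score."""

    def __init__(self):
        self.blocks = {}  # score -> block data

    def put(self, data):
        """Split data into blocks, store unseen ones, return the list of scores."""
        scores = []
        for off in range(0, len(data), BLOCK):
            block = data[off:off + BLOCK]
            score = hashlib.sha1(block).digest()
            self.blocks.setdefault(score, block)  # duplicate blocks cost nothing
            scores.append(score)
        return scores

    def get(self, scores):
        """Reassemble a file from its list of scores."""
        return b"".join(self.blocks[s] for s in scores)


if __name__ == "__main__":
    store = BlockStore()
    original = b"A" * BLOCK + b"B" * BLOCK + b"C" * BLOCK
    modified = b"A" * BLOCK + b"B" * BLOCK + b"D" * BLOCK
    store.put(original)
    store.put(original)   # an identical second file adds no blocks
    store.put(modified)   # only the changed block is new
    print(len(store.blocks), "blocks stored for 9 blocks written")
```

Two files with the same contents end up as two lists of scores pointing at the same stored blocks, and a file that differs in one block adds just that block.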
regards.
gabi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-10 11:39 [9fans] FS to skip/put-together duplicate files Enrico Weigelt
2007-08-10 11:51 ` Gabriel Diaz
@ 2007-08-10 12:26 ` erik quanstrom
2007-08-13 3:32 ` YAMANASHI Takeshi
1 sibling, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2007-08-10 12:26 UTC (permalink / raw)
To: weigelt, 9fans
> I host a lot of web applications which share 99% of their code.
> Disk space is not the issue, but bandwidth for remote backup is.
> So my idea is to let a filesystem automatically link together
> equal files in the storage, but present them as separate ones.
> Once a file gets changed, it will be unlinked/copied automatically.
>
> Is there already such a filesystem?
no.
however there are updatedb/compactdb which can be used to
create a list of changed files and replica/applylog which can
be used to apply them.
i used these tools to copy history from one kenfs to a new one.
i actually used cphist (/n/sources/patch/saved/cphist) and not
applylog.
you could also use the log on the generating machine to build
a mkfs archive and compress that, ftp it and apply it on the
other end.
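The approach above boils down to: record what each tree looks like, work out what changed, and ship only that. A minimal sketch in Python, assuming nothing about the actual replica log format; snapshot and changes are hypothetical helpers written for illustration, not Plan 9 tools.

```python
import hashlib
import os


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-1 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digests[os.path.relpath(full, root)] = hashlib.sha1(f.read()).hexdigest()
    return digests


def changes(old, new):
    """Compare two snapshots and return (action, path) pairs, sorted by path."""
    log = []
    for path in sorted(set(old) | set(new)):
        if path not in new:
            log.append(("delete", path))
        elif path not in old:
            log.append(("add", path))
        elif old[path] != new[path]:
            log.append(("change", path))
    return log


if __name__ == "__main__":
    before = {"a.php": "1111", "b.php": "2222"}
    after = {"b.php": "3333", "c.php": "4444"}
    for action, path in changes(before, after):
        print(action, path)
```

Only the files named in the resulting list need to be archived, compressed and sent to the other end.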
- erik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-10 12:26 ` erik quanstrom
@ 2007-08-13 3:32 ` YAMANASHI Takeshi
2007-08-13 12:01 ` erik quanstrom
2007-08-14 13:16 ` erik quanstrom
0 siblings, 2 replies; 18+ messages in thread
From: YAMANASHI Takeshi @ 2007-08-13 3:32 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
how about mounting venti-backed fossil files from linux as AoE drives?
vblade on sources exports the plan 9 file, and the aoe driver for linux
does the mounting.
Venti would compress blocks and condense duplicated blocks into a single block.
I'm not sure whether identical files are split at the same block boundaries, though.
--
YAMANASHI Takeshi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-13 3:32 ` YAMANASHI Takeshi
@ 2007-08-13 12:01 ` erik quanstrom
2007-08-13 13:43 ` Francisco J Ballesteros
2007-08-14 13:16 ` erik quanstrom
1 sibling, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2007-08-13 12:01 UTC (permalink / raw)
To: 9fans
On Sun Aug 12 23:35:45 EDT 2007, 9.nashi@gmail.com wrote:
> how about mounting venti-backed fossil files from linux as AoE drives?
> vblade on sources exports the plan 9 file, and the aoe driver for linux
> does the mounting.
>
> Venti would compress blocks and condense duplicated blocks into a single block.
> I'm not sure whether identical files are split at the same block boundaries, though.
we are going to do this with kenfs and aoe. our main filesystem is going to look
something like
cm0f{(m1m2m3)e99.0e100.1}
where e99.0 will be a local shelf and e100.1 will be remote. there is no compression,
but only changed blocks are dumped and kenfs doesn't really care how long the
dump takes.
- erik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-13 12:01 ` erik quanstrom
@ 2007-08-13 13:43 ` Francisco J Ballesteros
2007-08-13 13:52 ` erik quanstrom
0 siblings, 1 reply; 18+ messages in thread
From: Francisco J Ballesteros @ 2007-08-13 13:43 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
We use venti-fossil on Coraid's SR aoe drives just fine.
The frontend is a separate Plan 9 machine that uses fs(3) to partition the aoe
drives (which are raid-1 lblades). It works great.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-13 13:43 ` Francisco J Ballesteros
@ 2007-08-13 13:52 ` erik quanstrom
2007-08-13 14:40 ` Francisco J Ballesteros
0 siblings, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2007-08-13 13:52 UTC (permalink / raw)
To: 9fans
if you're setting up a new venti+fossil+aoe fs, i would recommend using
sdaoe.
(i don't recommend fiddling with something that's already working, though.)
- erik
> We use venti-fossil on Coraid's SR aoe drives just fine.
> The frontend is a separate Plan 9 machine that uses fs(3) to partition the aoe
> drives (which are raid-1 lblades). It works great.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-13 13:52 ` erik quanstrom
@ 2007-08-13 14:40 ` Francisco J Ballesteros
0 siblings, 0 replies; 18+ messages in thread
From: Francisco J Ballesteros @ 2007-08-13 14:40 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Ours has been working since the day after we got the SR.
However, I'll try sdaoe just for fun, for a while, and then will go back to
our "production" scheme.
thanks for the hint
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-13 3:32 ` YAMANASHI Takeshi
2007-08-13 12:01 ` erik quanstrom
@ 2007-08-14 13:16 ` erik quanstrom
2007-08-14 13:57 ` Steve Simon
2007-08-14 16:18 ` Uriel
1 sibling, 2 replies; 18+ messages in thread
From: erik quanstrom @ 2007-08-14 13:16 UTC (permalink / raw)
To: 9fans
> how about mounting venti-backed fossil files from linux as AoE drives?
> vblade on sources exports the plan 9 file, and the aoe driver for linux
> does the mounting.
>
> Venti would compress blocks and condense duplicated blocks into a single block.
> I'm not sure whether identical files are split at the same block boundaries, though.
>
this is always cited as the "killer functionality" of venti. essentially it trades cpu time
for disk space. however, the couple of times where a concrete venti solution was
discussed (separating attachments into separate files, e.g. for de-duping), it was
deemed to be slower because attachments need to be split out to be recognized
as the same. (9fans.net/archive/2005/10 and 9fans.net/archive/2005/11/1).
i wonder about this today with such large disks.
does anyone have an example of a case where compression and uniquing are required?
- erik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-14 13:16 ` erik quanstrom
@ 2007-08-14 13:57 ` Steve Simon
2007-08-14 14:13 ` erik quanstrom
2007-08-14 16:18 ` Uriel
1 sibling, 1 reply; 18+ messages in thread
From: Steve Simon @ 2007-08-14 13:57 UTC (permalink / raw)
To: 9fans
> does anyone have an example of a case where compression and uniquing are required?
the compression is nice to have of course but the uniquing is very
neat. I have always thought of it as plan9's answer to CVS et al.
When you do a release of a software package you copy the files to
a new directory with the name of the release (the equivalent of
tagging your release in CVS) - and continue working. this then takes
up only the space for the directory entries, and all releases are always
available. branching is trivial (dircp); only a pretty merge tool
is missing - I have diff3 from edition7 in my contrib area,
but some sort of interactive differencing GUI tool would be very
handy sometimes.
have I drifted off topic I wonder...
-Steve
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-14 13:57 ` Steve Simon
@ 2007-08-14 14:13 ` erik quanstrom
2007-08-14 15:02 ` David Leimbach
0 siblings, 1 reply; 18+ messages in thread
From: erik quanstrom @ 2007-08-14 14:13 UTC (permalink / raw)
To: 9fans
> > does anyone have an example of a case where compression and uniquing are required?
>
> the compression is nice to have of course but the uniquing is very
> neat. I have always thought of it as plan9's answer to CVS et al.
>
> When you do a release of a software package you copy the files to
> a new directory with the name of the release (the equivalent of
> tagging your release in CVS) - and continue working. this then takes
> up only the space for the directory entries, and all releases are always
> available. branching is trivial (dircp); only a pretty merge tool
> is missing - I have diff3 from edition7 in my contrib area,
> but some sort of interactive differencing GUI tool would be very
> handy sometimes.
>
> have I drifted off topic I wonder...
seems on topic to me.
the extra disk space used for a copy of source should be tiny.
if you have a 500GB disk (< $150 at newegg), making several extra
copies of /sys/src would cost you 1/1000th of your disk space if not uniqued.
the old way to do versioning is to remember the date of the release
and use history. this also doesn't use any disk space, except for the
deltas.
- erik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-14 14:13 ` erik quanstrom
@ 2007-08-14 15:02 ` David Leimbach
0 siblings, 0 replies; 18+ messages in thread
From: David Leimbach @ 2007-08-14 15:02 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Anyone run Venti on flash or eeprom before? It might be more suitable there
than on giant disks.
Dave
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
2007-08-14 13:16 ` erik quanstrom
2007-08-14 13:57 ` Steve Simon
@ 2007-08-14 16:18 ` Uriel
2007-08-14 16:25 ` erik quanstrom
1 sibling, 1 reply; 18+ messages in thread
From: Uriel @ 2007-08-14 16:18 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On 8/14/07, erik quanstrom <quanstro@coraid.com> wrote:
> > how about mounting venti-backed fossil files from linux as AoE drives?
> > vblade on sources exports the plan 9 file, and the aoe driver for linux
> > does the mounting.
> >
> > Venti would compress blocks and condense duplicated blocks into a single block.
> > I'm not sure whether identical files are split at the same block boundaries, though.
> >
>
> this is always cited as the "killer functionality" of venti. essentially it trades cpu time
> for disk space. however, the couple of times where a concrete venti solution was
> discussed (separating attachments into separate files, e.g. for de-duping), it was
> deemed to be slower because attachments need to be split out to be recognized
> as the same. (9fans.net/archive/2005/10 and 9fans.net/archive/2005/11/1).
I think Mechiel Lukkien's GSoC project might be helpful with such
issues, see http://gsoc.cat-v.org/people/mjl/blog//2007-08-06-1_Rabin_fingerprints
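The boundary problem raised earlier in the thread (identical data that sits at a different offset no longer lines up with fixed-size blocks) is what content-defined chunking addresses. A minimal sketch in Python, assuming a simple polynomial rolling hash rather than true Rabin fingerprints; WINDOW, BASE, MOD and MASK are illustrative values, not anything taken from venti or from the GSoC project.

```python
import random

WINDOW = 16                        # bytes covered by the rolling hash (assumed)
BASE = 257                         # polynomial base (assumed)
MOD = (1 << 31) - 1                # modulus for the hash (assumed)
MASK = 0xFF                        # cut when the low 8 bits are all ones
TOP = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window


def chunks(data):
    """Split data where the hash of the last WINDOW bytes matches MASK."""
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD   # drop the oldest byte
        h = (h * BASE + byte) % MOD                  # shift in the new byte
        if i >= WINDOW - 1 and (h & MASK) == MASK:   # window full and hash matches
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])                     # trailing partial chunk
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    data = bytes(rng.getrandbits(8) for _ in range(20000))
    shifted = b"X" + data          # same content, moved by one byte
    a, b = chunks(data), chunks(shifted)
    print(len(a), "chunks,", len(set(a) & set(b)), "also present after the shift")
```

Because a cut depends only on the last WINDOW bytes, every chunk after the first one reappears unchanged in the shifted copy, so a store keyed by chunk hash would only have to hold the data around the inserted byte again. With fixed-size blocks the one-byte shift would move every block boundary relative to the content.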
uriel
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
@ 2007-08-15 0:42 YAMANASHI Takeshi
2007-08-15 0:47 ` erik quanstrom
0 siblings, 1 reply; 18+ messages in thread
From: YAMANASHI Takeshi @ 2007-08-15 0:42 UTC (permalink / raw)
To: 9fans
> does anyone have an example of a case where compression and uniquing are required?
I'm not sure about compression, but uniquing must be a very neat feature
when you want to build a P2P overlaid venti.
--
"on travel, off the network ... and a fossil in my pocket"
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9fans] FS to skip/put-together duplicate files
@ 2007-08-15 0:49 YAMANASHI Takeshi
0 siblings, 0 replies; 18+ messages in thread
From: YAMANASHI Takeshi @ 2007-08-15 0:49 UTC (permalink / raw)
To: 9fans
By p2p overlaid venti, I meant something like this.
http://project-iris.net/isw-2003/papers/sit.pdf
--
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2007-08-15 0:49 UTC | newest]
Thread overview: 18+ messages
2007-08-10 11:39 [9fans] FS to skip/put-together duplicate files Enrico Weigelt
2007-08-10 11:51 ` Gabriel Diaz
2007-08-10 12:26 ` erik quanstrom
2007-08-13 3:32 ` YAMANASHI Takeshi
2007-08-13 12:01 ` erik quanstrom
2007-08-13 13:43 ` Francisco J Ballesteros
2007-08-13 13:52 ` erik quanstrom
2007-08-13 14:40 ` Francisco J Ballesteros
2007-08-14 13:16 ` erik quanstrom
2007-08-14 13:57 ` Steve Simon
2007-08-14 14:13 ` erik quanstrom
2007-08-14 15:02 ` David Leimbach
2007-08-14 16:18 ` Uriel
2007-08-14 16:25 ` erik quanstrom
2007-08-15 0:27 ` Uriel
2007-08-15 0:42 YAMANASHI Takeshi
2007-08-15 0:47 ` erik quanstrom
2007-08-15 0:49 YAMANASHI Takeshi