From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Sat, 14 Jan 2012 11:32:30 -0500
To: 9fans@9fans.net
Message-ID: <2ca6969da468ef7d305866d2c3c484f4@chula.quanstro.net>
In-Reply-To:
References: <20120113113026.GA419@polynum.com> <20120114003032.1C08F1CC8F@mail.bitblocks.com> <201201140201.51504.dexen.devries@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] fossil pb: FOUND!
Topicbox-Message-UUID: 5c2da14a-ead7-11e9-9d60-3106f5b1d025

> What about virtual machine images?
>
> > the tradeoff for this compression is a large amount of memory,
> > fragmentation, and cpu usage.  that is to say, storage latency.
>
> I have 24GB RAM. My primary laptops have 8GB RAM. I have all this RAM
> not because of dedup but because I do memory intensive tasks, like
> running virtual machines. I believe this is true for many users.

russ posted some notes on how much memory and disk bandwidth are
required to write at a constant b/w of X mb/s to venti.  venti requires
enormous resources to provide this capability.

also, 24gb isn't really much storage.  that's 1000 vm images/disk,
assuming that you store the regions with all zeros.

one thing to note is that we're silently comparing block (ish) storage
(venti) to file systems.  this isn't really a useful comparison.  i don't
know of many folks who store big disk images on file systems.

we have some customers who do do this, and they use the vsx to clone
a base vm image.  there's no de-dup, but only the changed extents get
stored.

> I'm of a completely different opinion regarding fragmentation. On
> SSDs, it's a non issue.

that's not correct.  a very good ssd will do only about 10,000 r/w
random iops.  (certainly they show better numbers for the easy case of
compressible 100% write workloads.)  that's less than 40mb/s.  on the
other hand, a good ssd will do about 10x, if reading sequentially.

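the "less than 40mb/s" figure can be checked with a back-of-envelope
calculation.  the sketch below assumes 4 KiB per i/o, which the message
does not state; the function name and the constant are illustrative only.

```python
# Back-of-envelope conversion from random IOPS to throughput.
# ASSUMPTION: 4 KiB per I/O. The message gives no block size, so this
# value is a guess chosen because it reproduces the "< 40mb/s" figure.
BLOCK_SIZE = 4096  # bytes per I/O (assumed)


def iops_to_mib_per_s(iops, block_size=BLOCK_SIZE):
    """Return throughput in MiB/s for `iops` operations of `block_size` bytes."""
    return iops * block_size / (1024 * 1024)


# The ~10,000 random r/w iops quoted for a very good ssd.
random_mib = iops_to_mib_per_s(10_000)

# The "about 10x" figure quoted for sequential reads.
sequential_mib = 10 * random_mib

print(f"random:     {random_mib:.1f} MiB/s")
print(f"sequential: {sequential_mib:.1f} MiB/s")
```

with a different assumed block size the absolute numbers change, but the
10x gap between random and sequential access stays the same.
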
> My CPU can SHA-1 hash orders of magnitude faster than it can read from
> disk, and that's using only generic instructions, plus, it's sitting
> idle anyway.

it's not clear to me that the sha-1 hash in venti has any real bearing
on venti's end performance.  do you have any data or references for this?

- erik