From mboxrd@z Thu Jan 1 00:00:00 1970
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to: Your message of "Thu, 23 Jun 2011 09:28:49 PDT."
References: <4E01F311.3060305@0x6a.com> <20110623104644.5cd888d7@wks-ddc.exosec.local> <4E0352E9.9050600@0x6a.com> <201106231725.10215.dexen.devries@gmail.com>
Date: Thu, 23 Jun 2011 11:47:10 -0700
From: Bakul Shah
Message-Id: <20110623184710.082F0B827@mail.bitblocks.com>
Subject: Re: [9fans] Survey: Current Fossil+venti Filesystem
Topicbox-Message-UUID: f52fb19a-ead6-11e9-9d60-3106f5b1d025

On Thu, 23 Jun 2011 09:28:49 PDT ron minnich wrote:
>
> The main point I took from the talk they gave was that failure was
> most strongly related to the number of writes in FLASH. If your
> striping strategy is to duplicate writes to each drive, you faced the
> happy prospect of doing a write and having both drives fail at the
> same time. Hard drives have a different way of failing. We've seen
> weirdness like this here, with drives in a bunch of nodes that all
> seem to fail simultaneously, well within rated lifetime. Not cheap
> drives either. Of course that was a little while ago and things seem
> to have gotten better, but it's worth a warning.

All they are saying is to age SSDs at different rates to avoid
correlated failures.

Disk drives have a similar problem in that disks from the same batch
seem to die at a similar age. One issue is that N years later it is
not cost effective to get a replacement disk of the same size. Now I
think this (dying at the same age) is actually a good thing! The key
is not to wait until they die; just replace them all when you decide
to replace *any*! ZFS helps since it will automatically grow the
space. (For instance, on my home system I originally used a mirror of
two used 250GB IDE disks and another mirror of two 300GB SATA disks,
striped together. I first replaced both IDE disks with bigger ATA
disks. Later I replaced the 300GB SATA disks with 1TB disks and now I
have a lot more space to play with.)

@work I used ZFS raidz2 on 6 x 2TB drives and a mirror of 2 x 80GB
SSDs for root + the write intent log (this is a server for backing up
N machines, so write performance is more critical). Due to a mixup we
are using MLC SSDs instead of SLC SSDs (to be replaced at some point).
Not ideal but works well enough.
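
A rough sketch of the capacity arithmetic behind these layouts, in case it helps. It assumes the usual rules of thumb: a mirror holds only as much as its smallest member, raidz2 gives up two disks' worth of space to parity, and striped vdevs add up. The function names are made up for illustration, sizes are nominal GB, and filesystem overhead is ignored.

```python
# Back-of-the-envelope capacity model (illustrative helper names, nominal GB).

def mirror_capacity(disks_gb):
    """A mirror vdev is limited by its smallest member."""
    return min(disks_gb)

def raidz_capacity(disks_gb, parity):
    """Approximate raidz capacity: (n - parity) times the smallest disk."""
    return (len(disks_gb) - parity) * min(disks_gb)

def pool_capacity(vdev_capacities_gb):
    """Vdevs in a pool are striped together, so their capacities add."""
    return sum(vdev_capacities_gb)

# Home pool as first built: two mirrors striped together.
ide_mirror = mirror_capacity([250, 250])
sata_mirror = mirror_capacity([300, 300])
home_before = pool_capacity([ide_mirror, sata_mirror])

# Replacing only one SATA disk gains nothing; the 300GB disk still limits it.
sata_half_replaced = mirror_capacity([1000, 300])

# Replacing both lets the mirror (and the pool) grow.
sata_replaced = mirror_capacity([1000, 1000])

# Work pool: raidz2 across six 2TB drives.
work_pool = raidz_capacity([2000] * 6, parity=2)

print(home_before, sata_half_replaced, sata_replaced, work_pool)
```

This is also why replacing the whole set at once pays off: the extra space only appears once the last small disk in a vdev is gone.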