From: Bakul Shah
Date: Sun, 29 Aug 2021 20:36:37 -0700
To: Jon Steinhart
Cc: The Unix Heretics Society mailing list
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?

Chances are your disk has a URE rate of 1 in 10^14 bits ("enterprise" disks
may have a URE rate of 1 in 10^15). 10^14 bits is about 12.5TB, so for a 16TB
disk you should use at least mirroring if you ever plan to fill it up. And a
machine with ECC RAM (& trust but verify!). I am no fan of btrfs but these are
things I'd consider for any FS. Even if you have done all this, remember that
disk mortality follows a bathtub curve.

I use FreeBSD + ZFS so I'd recommend ZFS (on Linux). ZFS scrub works in the
background on an active system, and so does resilvering (though things slow
down). On my original zfs filesystem I have replaced all four disks twice.
I have been using zfs since 2005 and it has rarely required any babysitting;
I reboot it when upgrading to a new release or applying kernel patches.
"Backups" are done via zfs send/recv of snapshots.
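(A rough sketch of that scrub/snapshot/send routine; the pool, dataset, and
device names here, and the snapshot dates, are placeholders rather than an
actual setup:)

    # Scrub: walk every block in the pool and verify checksums while the
    # pool stays online; progress and any errors show up in zpool status.
    zpool scrub tank
    zpool status -v tank

    # Replacing a failed disk resilvers in the background the same way.
    zpool replace tank da2 da3

    # "Backups": snapshot a dataset, then send the snapshot to another
    # pool (or pipe it through ssh to another machine).
    zfs snapshot tank/home@2021-08-29
    zfs send tank/home@2021-08-29 | zfs recv backup/home

    # Later, send only the difference between two snapshots.
    zfs snapshot tank/home@2021-08-30
    zfs send -i tank/home@2021-08-29 tank/home@2021-08-30 | zfs recv backup/home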
> On Aug 29, 2021, at 3:12 PM, Jon Steinhart wrote:
>
> I recently upgraded my machines to fc34.  I just did a stock
> uncomplicated installation using the defaults and it failed miserably.
>
> Fc34 uses btrfs as the default filesystem so I thought that I'd give it
> a try.  I was especially interested in the automatic checksumming because
> the majority of my storage is large media files and I worry about bit
> rot in seldom used files.  I have been keeping a separate database of
> file hashes and in theory btrfs would make that automatic and transparent.
>
> I have 32T of disk on my system, so it took a long time to convert
> everything over.  A few weeks after I did this I went to unload my
> camera and couldn't because the filesystem that holds my photos was
> mounted read-only.  WTF?  I didn't do that.
>
> After a bit of poking around I discovered that btrfs SILENTLY remounted the
> filesystem because it had errors.  Sure, it put something in a log file,
> but I don't spend all day surfing logs for things that shouldn't be going
> wrong.  Maybe my expectation that filesystems just work is antiquated.
>
> This was on a brand new 16T drive, so I didn't think that it was worth
> the month that it would take to run the badblocks program, which doesn't
> really scale to modern disk sizes.  Besides, SMART said that it was fine.
>
> Although it's been discredited by some, I'm still a believer in "stop and
> fsck" policing of disk drives.  Unmounted the filesystem and ran fsck to
> discover that btrfs had to do its own thing.  No idea why; I guess some
> think that incompatibility is a good thing.
>
> Ran "btrfs check" which reported errors in the filesystem but was otherwise
> useless BECAUSE IT DIDN'T FIX ANYTHING.  What good is knowing that the
> filesystem has errors if you can't fix them?
>
> Near the top of the manual page it says:
>
>     Warning
>         Do not use --repair unless you are advised to do so by a developer
>         or an experienced user, and then only after having accepted that
>         no fsck successfully repair all types of filesystem corruption. Eg.
>         some other software or hardware bugs can fatally damage a volume.
>
> Whoa!  I'm sure that operators are standing by, call 1-800-FIX-BTRFS.
> Really?  Is this a ploy by the developers to form a support business?
>
> Later on, the manual page says:
>
>     DANGEROUS OPTIONS
>         --repair
>             enable the repair mode and attempt to fix problems where possible
>
>             Note there's a warning and 10 second delay when this option
>             is run without --force to give users a chance to think twice
>             before running repair, the warnings in documentation have
>             shown to be insufficient
>
> Since when is it dangerous to repair a filesystem?  That's a new one to me.
>
> Having no option other than not being able to use the disk, I ran btrfs
> check with the --repair option.  It crashed.  The lesson so far is that
> trusting my data to an unreliable, unrepairable filesystem is not a good
> idea.  Since this was one of my media disks I just rebuilt it using ext4.
>
> Last week I was working away and tried to write out a file, only to discover
> that /home and /root had become read-only.  Charming.  Tried rebooting,
> but couldn't since btrfs filesystems aren't checked and repaired.  Plugged
> in a flash drive with a live version, managed to successfully run --repair,
> and rebooted.  It lasted about 15 minutes before flipping back to read-only
> with the same error.
>
> Time to suck it up and revert.  Started a clean reinstall.  Got stuck
> because it crashed during disk setup, with anaconda giving me a completely
> useless big python stack trace.  Eventually figured out that it was
> unable to delete the btrfs filesystem that had errors, so it just crashed
> instead.  Wiped it using dd; nice that some reliable tools still survive.
> Finished the installation and am back up and running.
>
> Any of the rest of you have any experiences with btrfs?  I'm sure that it
> works fine at large companies that can afford a team of disk babysitters.
> What benefits does btrfs provide that other filesystem formats such as
> ext4 and ZFS don't?  Is it just a continuation of the "we have to do
> everything ourselves and under no circumstances use anything that came
> from the BSD world" mentality?
>
> So what's the future for filesystem repair?  Does it look like the past?
> Is Ken's original need for dsw going to rise from the dead?
>
> In my limited experience btrfs is a BiTteR FileSystem to swallow.
>
> Or, as Saturday Night Live might put it: And now, Linux, starring the
> not ready for prime time filesystem.  Seems like something that's been
> under development for around 15 years should be in better shape.
>
> Jon

--
Bakul
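(For reference, the check/repair sequence described in Jon's message
corresponds roughly to the commands below; the device name /dev/sdb1 is a
placeholder, and --repair is the option the manual page warns about:)

    # btrfs check is meant to run against an unmounted filesystem and by
    # default only reports problems; it does not change anything on disk.
    # (For /home this has to happen from a live/rescue environment, as in
    # Jon's account, since /home is in use on a running system.)
    umount /home
    btrfs check /dev/sdb1

    # The step the manual page warns about: attempt repairs in place.
    # Without --force it prints the warning and waits 10 seconds first.
    btrfs check --repair /dev/sdb1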