Date: Sun, 29 Aug 2021 23:14:28 -0400
From: "Theodore Ts'o"
To: Henry Bent
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?
Cc: The Unix Heretics Society mailing list

On Sun, Aug 29, 2021 at 07:09:50PM -0400, Henry Bent wrote:
> On Sun, 29 Aug 2021 at 18:13, Jon Steinhart wrote:
>
> > I recently upgraded my machines to fc34.  I just did a stock
> > uncomplicated installation using the defaults and it failed miserably.
> >
> > Fc34 uses btrfs as the default filesystem so I thought that I'd give it
> > a try.
>
> ... cut out a lot about how no sane person would want to use btrfs ...

The ext2/ext3/ext4 file system utilities contain, as far as I know, the
first fsck that was developed with a full regression test suite from
the very beginning, integrated into the sources.  (Just run "make
check" and you'll know if you've broken something --- or it's how I
know the person contributing code was sloppy and didn't bother to run
"make check" before sending me patches to review....)

What a lot of people don't seem to understand is that file system
utilities are *important*, and more work than you might think.  The
ext4 file system is roughly 71 kLOC (thousand lines of code) in the
kernel.  E2fsprogs is 340 kLOC.  In contrast, the btrfs kernel code is
145 kLOC (btrfs does have a lot more "sexy new features"), but its
btrfs-progs utilities are currently only 124 kLOC.

And the e2fsprogs line count doesn't include the library of 350+
corrupted file system images that is part of its regression test
suite.  Btrfs has a few unit tests (as does e2fsprogs), but it doesn't
have anything similar in terms of a library of corrupted file system
images to test its fsck functionality.  (Then again, neither do the
file system utilities for FFS, so a regression test suite is not
required to create a high quality fsck program.  In my opinion, it
very much helps, though!)

> > Or, as Saturday Night Live might put it:  And now, linux, starring the
> > not ready for prime time filesystem.  Seems like something that's been
> > under development for around 15 years should be in better shape.
> >
>
> To my way of thinking this isn't a Linux problem, or even a btrfs problem,
> it's a Fedora problem.  They're the ones who decided to switch their
> default filesystem to something that clearly isn't ready for prime time.

I was present at the very beginning of btrfs.  In November 2007,
various file system developers from a number of the big IT companies
got together (IBM, Intel, HP, Red Hat, etc.) and folks decided that
Linux "needed an answer to ZFS".  In preparation for that meeting, I
did some research, asking various contacts I had at various companies
how much effort and how long it took to create a new file system from
scratch and make it "enterprise ready".  I asked folks at Digital how
long it took for advfs, IBM for AIX and GPFS, etc., etc.  And the
answer I got back at that time was between 50 and 200 Person Years,
with the bulk of the answers being between 100 and 200 PYs (the single
50 PY estimate was an outlier).  This was everything --- kernel and
userspace coding, testing and QA, performance tuning, documentation,
etc., etc.  The calendar-time estimates I was given were between 5 and
7 calendar years, and even then, users would take at least another 2-3
years of "kicking the tires" before they would trust *their* precious
enterprise data on the file system.

There was an Intel engineer at that meeting, who shall remain
nameless, who said, "Don't tell the managers that or they'll never
greenlight the project!  Tell them 18 months...."

And so I and other developers at IBM continued working on ext4, which
we never expected would be able to compete with btrfs and ZFS in terms
of "sexy new features", but our focus was on performance, scalability,
and robustness.

And it probably was about 2015 or so that btrfs finally became more or
less stable, but only if you restricted yourself to core
functionality.  (E.g., snapshots, file-system level RAID, etc., were
still dodgy at the time.)

I will say that at Google, ext4 is still our primary file system,
mainly because all of our expertise is currently focused there.  We
are starting to support XFS in "beta" ("Preview") for Cloud Optimized
OS, since there are some enterprise customers who are using XFS on
their systems, and they want to continue using XFS as they migrate
from on-prem to the Cloud.  We fully support XFS for Anthos Migrate
(which is a read-mostly workload).  For Cloud Optimized OS, which is
our high-security, ChromeOS-based Linux distribution with a read-only,
cryptographically signed root file system optimized for Docker and
Kubernetes workloads, we're still building our expertise, working on
getting bug fixes backported, etc., so we can support XFS the way
enterprises expect.

I'm not aware of any significant enterprise usage of btrfs, which is
why we're not bothering to support btrfs at $WORK.  The only big
company which is using btrfs in production that I know of is Facebook,
because they have a bunch of btrfs developers, but even there, they
aren't using btrfs exclusively for all of their workloads.

My understanding is that Fedora decided to make btrfs the default
because they wanted to get more guinea pigs to flush out the bugs.
Note that Red Hat, which is responsible for Red Hat Enterprise Linux
(their paid product, where they make $$$) and Fedora, which is their
freebie "community distribution" --- well, Red Hat does not currently
support btrfs for their RHEL product.

Make of that what you will....

                                        - Ted