The Unix Heritage Society mailing list
From: "Theodore Ts'o" <tytso@mit.edu>
To: Henry Bent <henry.r.bent@gmail.com>
Cc: The Unix Heretics Society mailing list <tuhs@minnie.tuhs.org>
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?
Date: Sun, 29 Aug 2021 23:14:28 -0400
Message-ID: <YSxNFKq9r3dyHT7l@mit.edu>
In-Reply-To: <CAEdTPBeoaxAhnOGj5ywY_Y7nwRavfmn7FMw2JTGMn-t8W3mYaw@mail.gmail.com>

On Sun, Aug 29, 2021 at 07:09:50PM -0400, Henry Bent wrote:
> On Sun, 29 Aug 2021 at 18:13, Jon Steinhart <jon@fourwinds.com> wrote:
> 
> > I recently upgraded my machines to fc34.  I just did a stock
> > uncomplicated installation using the defaults and it failed miserably.
> >
> > Fc34 uses btrfs as the default filesystem so I thought that I'd give it
> > a try.
> 
> ... cut out a lot about how no sane person would want to use btrfs ...

The ext2/ext3/ext4 file system utilities are, as far as I know, the
first fsck that was developed with a full regression test suite from
the very beginning and integrated into the sources.  (Just run "make
check" and you'll know if you've broken something --- or it's how I
know the person contributing code was sloppy and didn't bother to run
"make check" before sending me patches to review....)

What a lot of people don't seem to understand is that file system
utilities are *important*, and more work than you might think.  The
ext4 file system is roughly 71 kLOC (thousand lines of code) in the
kernel.  E2fsprogs is 340 kLOC.  In contrast, the btrfs kernel code is
145 kLOC (btrfs does have a lot more "sexy new features"), but its
btrfs-progs utilities are currently only 124 kLOC.
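
To put those four numbers side by side, here is a minimal sketch that
computes how many lines of utility code exist per line of kernel code.
The kLOC figures are the ones quoted above; the dictionary layout and
the function name utils_to_kernel_ratio are made up for illustration
only.

```python
# Line counts in kLOC, exactly as quoted in the message above.
KLOC = {
    "ext4": {"kernel": 71, "utils": 340},    # utils = e2fsprogs
    "btrfs": {"kernel": 145, "utils": 124},  # utils = btrfs-progs
}


def utils_to_kernel_ratio(kernel_kloc, utils_kloc):
    """Return kLOC of userspace utilities per kLOC of kernel code."""
    return utils_kloc / kernel_kloc


for fs, counts in KLOC.items():
    ratio = utils_to_kernel_ratio(counts["kernel"], counts["utils"])
    print(f"{fs}: {ratio:.2f} kLOC of utilities per kLOC of kernel code")
```

By this rough measure ext4 carries several times more utility code
than kernel code, while btrfs carries somewhat less.  Line counts are
only a crude proxy for effort, of course.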

And the e2fsprogs line count doesn't include the library of 350+
corrupted file system images that are part of its regression test
suite.  Btrfs has a few unit tests (as does e2fsprogs), but it doesn't
have anything similar in terms of a library of corrupted file system
images to test its fsck functionality.  (Then again, neither do the
file system utilities for FFS, so a regression test suite is not
required to create a high quality fsck program.  In my opinion, it
very much helps, though!)

> > Or, as Saturday Night Live might put it:  And now, linux, starring the
> > not ready for prime time filesystem.  Seems like something that's been
> > under development for around 15 years should be in better shape.
> >
> 
> To my way of thinking this isn't a Linux problem, or even a btrfs problem,
> it's a Fedora problem.  They're the ones who decided to switch their
> default filesystem to something that clearly isn't ready for prime time.

I was present at the very beginning of btrfs.  In November, 2007,
various file system developers from a number of the big IT companies
got together (IBM, Intel, HP, Red Hat, etc.) and folks decided that
Linux "needed an answer to ZFS".  In preparation for that meeting, I
did some research asking various contacts I had at various companies
how much effort and how long it took to create a new file system from
scratch and make it "enterprise ready".  I asked folks at Digital
how long it took for advfs, IBM for AIX and GPFS, etc., etc.  And the
answer I got back at that time was between 50 and 200 Person Years,
with the bulk of the answers being between 100 and 200 PYs (the single
50 PY estimate was an outlier).  This was everything --- kernel and
userspace coding, testing and QA, performance tuning, documentation,
etc. etc.  The calendar-time estimates I was given were between 5 and 7
calendar years, and even then, users would take at least another 2-3
years minimum of "kicking the tires", before they would trust *their*
precious enterprise data on the file system.

There was an Intel engineer at that meeting, who shall remain
nameless, who said, "Don't tell the managers that or they'll never
greenlight the project!  Tell them 18 months...."

And so I and other developers at IBM continued working on ext4, which
we never expected would be able to compete with btrfs and ZFS in terms
of "sexy new features", but our focus was on performance, scalability,
and robustness.

And it probably was about 2015 or so that btrfs finally became more or
less stable, but only if you restricted yourself to core
functionality.  (Features such as snapshots and file-system level RAID
were still dodgy at the time.)

I will say that at Google, ext4 is still our primary file system,
mainly because all of our expertise is currently focused there.  We
are starting to support XFS in "beta" ("Preview") for Cloud Optimized
OS, since there are some enterprise customers who are using XFS on
their systems, and they want to continue using XFS as they migrate
from on-prem to the Cloud.  We fully support XFS for Anthos Migrate
(which is a read-mostly workload), and we're still building our
expertise, working on getting bug fixes backported, etc., so we can
support XFS the way enterprises expect for Cloud Optimized OS, which
is our high-security, ChromeOS based Linux distribution with a
read-only, cryptographically signed root file system optimized for
Docker and Kubernetes workloads.

I'm not aware of any significant enterprise usage of btrfs, which is
why we're not bothering to support btrfs at $WORK.  The only big
company which is using btrfs in production that I know of is Facebook,
because they have a bunch of btrfs developers, but even there, they
aren't using btrfs exclusively for all of their workloads.

My understanding is that Fedora decided to make btrfs the default
because they wanted to get more guinea pigs to flush out the bugs.
Note that Red Hat is responsible for both Red Hat Enterprise Linux
(their paid product, where they make $$$) and Fedora, which is their
freebie "community distribution" --- and Red Hat does not currently
support btrfs for their RHEL product.

Make of that what you will....

						- Ted
