The Unix Heritage Society mailing list
From: "Theodore Ts'o" <tytso@mit.edu>
To: Larry McVoy <lm@mcvoy.com>
Cc: The Unix Heretics Society mailing list <tuhs@minnie.tuhs.org>
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?
Date: Sun, 29 Aug 2021 23:46:47 -0400
Message-ID: <YSxUpxoVnUquMwOz@mit.edu>
In-Reply-To: <20210829235745.GC20021@mcvoy.com>

On Sun, Aug 29, 2021 at 04:57:45PM -0700, Larry McVoy wrote:
> 
> I give them credit for remounting read-only when seeing errors, they may
> have gotten that from BitKeeper.

Actually, the btrfs folks got that from ext2/ext3/ext4.  The original
behavior was "don't worry, be happy" (log errors and continue), and I
added two additional options, "remount read-only", and "panic and
reboot the system".  I recommend the last especially for high
availability systems, since you can then fail over to the secondary
system, and fsck can repair the file system on the reboot path.
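
For illustration, here is a minimal sketch of how those three behaviors
map onto the ext4 errors= mount option (continue, remount-ro, panic).
The helper name, device, and mount point are hypothetical; only the
option values themselves come from ext4.

```python
# Map the three error behaviors described above to ext4's errors= option.
# The dictionary keys are informal labels chosen for this sketch.
ERROR_POLICIES = {
    "log and continue": "continue",
    "remount read-only": "remount-ro",
    "panic and reboot": "panic",
}


def fstab_line(device, mountpoint, policy):
    """Build an /etc/fstab entry that selects the given ext4 error policy."""
    option = ERROR_POLICIES[policy]
    # Fields: device, mount point, fs type, options, dump flag, fsck pass.
    return f"{device} {mountpoint} ext4 defaults,errors={option} 0 2"


# Hypothetical device and mount point for a high-availability node.
print(fstab_line("/dev/sdb1", "/srv/data", "panic and reboot"))
```

The same default can also be stored in the superblock with tune2fs -e,
so that it applies even when no errors= option is given at mount time.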


The primary general-purpose file systems in Linux which are under
active development these days are btrfs, ext4, f2fs, and xfs.  They
all have slightly different focus areas.  For example, f2fs is best
for low-end flash, the kind that is found on $30 mobile handsets
on sale in countries like India (aka, "the next billion users").  It
has deep knowledge of "cost-optimized" flash where random writes are
to be avoided at all costs because write amplification is a terrible
thing with very primitive FTLs.

For very large file systems (e.g., large RAID arrays with petabytes of
data), XFS will probably do better than ext4 for many workloads.

Btrfs is the file system for users who have ZFS envy.  I believe many
of those sexy new features are best done at other layers in the
storage stack, but if you *really* want file-system level snapshots
and rollback, btrfs is the only game in town for Linux.  (Unless of
course you don't mind using ZFS and hope that Larry Ellison won't sue
the bejesus out of you, and if you don't care about potential GPL
violations....)

Ext4 is still getting new features added; we recently added
light-weight journaling (a simplified version of the 2017 Usenix ATC
iJournaling paper[1]), and just last week we added a parallelized
orphan list called Orphan File[2], which optimizes parallel truncate
and unlink workloads.  (Neither of these features is enabled by
default yet; maybe in a few years, or earlier if community
distros want to volunteer their users to be guinea pigs.  :-)

[1] https://www.usenix.org/system/files/conference/atc17/atc17-park.pdf
[2] https://www.spinics.net/lists/linux-ext4/msg79021.html

We currently aren't adding the "sexy new features" of btrfs or ZFS,
but that's mainly because there isn't a business justification to pay
for the engineering effort needed to add them.  I have some design
sketches of how we *could* add them to ext4, but most of us ext4
developers like food with our meals, and I'm still a working stiff so
I focus on work that adds value to my employer --- and, of course,
helping other ext4 developers working at other companies figure out
ways to justify new features that would add value to *their*
employers.

I might work on some sexy new features if I won the Powerball Lottery
and could retire rich, or if I were working at a company where engineers
could work on whatever technologies they wanted without getting
permission from the business types, but those companies tend not to
end well (especially after they get purchased by Oracle....)

						- Ted
