Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?

The Unix Heritage Society mailing list

From: Bakul Shah <bakul@iitbombay.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: The Unix Heretics Society mailing list <tuhs@minnie.tuhs.org>
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?
Date: Mon, 30 Aug 2021 15:35:44 -0700
Message-ID: <7E649324-2690-4D9D-A175-B74089EFEE9A@iitbombay.org>
In-Reply-To: <YSzHd9xcwTDO+cru@mit.edu>

On Aug 30, 2021, at 4:56 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> 
> On Sun, Aug 29, 2021 at 08:36:37PM -0700, Bakul Shah wrote:
>> Chances are your disk has a URE rate of 1 in 10^14 bits ("enterprise"
>> disks may have a URE rate of 1 in 10^15). 10^14 bits is about 12.5TB. For 16TB
>> disks you should use at least mirroring, provided some day you'd want
>> to fill up the disk. And a machine with ECC RAM (& trust but verify!).
>> I am no fan of btrfs but these are the things I'd consider for any FS.
>> Even if you have done all this, consider the fact that disk mortality
>> has a bathtub curve.
> 
> You may find this article interesting: "The case of the 12TB URE:
> Explained and debunked"[1], and the following comment on a reddit
> post[2] discussing this article:
> 
>   "Lol of course it's a myth.
> 
>   I don't know why or how anyone thought there would be a URE
>   anywhere close to every 12TB read.
> 
>   Many of us have large pools that are dozens and sometimes hundreds of TB.
> 
>   I have 2 64TB pools and scrub them every month. I can go years
>   without a checksum error during a scrub, which means that all my
>   ~50TB of data was read correctly without any URE many times in a
>   row which means that I have sometimes read 1PB (50TB x 2 pools x 10
>   months) worth from my disks without any URE.
> 
>   Last I checked, the spec sheets say < 1 in 1x10^14 which means less
>   than 1 in 12TB. 0 in 1PB is less than 1 in 12TB so it meets the
>   spec."
> 
> [1] https://heremystuff.wordpress.com/2020/08/25/the-case-of-the-12tb-ure/
> [2] https://www.reddit.com/r/DataHoarder/comments/igmab7/the_12tb_ure_myth_explained_and_debunked/

It seems this guy doesn't understand statistics. He checked his 2 pools
and, from a sample of (likely) 4 disks, is confident that URE specs are
crap. Even from an economic PoV it doesn't make sense.
Why wouldn't the disk companies tout an even lower error rate if they
could get away with it? Presumably these rates are derived by reading
many, many disks and averaging.
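
As a rough check on the figures quoted above, here is a minimal Python
sketch of the arithmetic. It assumes decimal terabytes, independent
errors, and an error rate exactly equal to the spec-sheet number (a
simplification, since spec sheets state the figure as a bound). The
function names are made up for illustration.

```python
import math

BITS_PER_TB = 8e12  # decimal terabyte: 10^12 bytes * 8 bits per byte


def bits_to_tb(bits):
    """Convert a bit count to decimal terabytes."""
    return bits / BITS_PER_TB


def expected_ures(tb_read, bits_per_ure=1e14):
    """Expected number of UREs when reading tb_read terabytes.

    Assumption: errors occur independently at exactly one per
    bits_per_ure bits read.
    """
    return tb_read * BITS_PER_TB / bits_per_ure


def prob_no_ure(tb_read, bits_per_ure=1e14):
    """Probability of reading tb_read terabytes with no URE.

    Uses the Poisson approximation P(0) = exp(-expected count),
    which is very close to the exact value for such small per-bit rates.
    """
    return math.exp(-expected_ures(tb_read, bits_per_ure))


if __name__ == "__main__":
    print(bits_to_tb(1e14))      # size of 10^14 bits in TB
    print(prob_no_ure(12.5))     # chance of a clean 12.5TB read
    print(expected_ures(1000))   # expected UREs in a 1PB read
```

Under these assumptions, 10^14 bits works out to 12.5TB, a 12.5TB read
completes without a URE with probability of about 0.37, and a 1PB read
would expect about 80 UREs.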

Here's what the author says on a serverfault thread:
https://serverfault.com/questions/812891/what-is-exactly-an-ure

  @DavidBalažic Evidently, your sample size of one invalidates the
  entirety of probability theory! I suggest you submit a paper to
  the Nobel Committee. – Ian Kemp Apr 16 '19 at 5:37

  @IanKemp If someone claims that all numbers are divisible by 7 and
  I find ONE that is not, then yes, a single find can invalidate an
  entire theory. BTW, still not a single person has confirmed the myth
  in practice (by experiment), did they? Why should they, when belief
  is more than knowledge...– David Balažic Apr 16 '19 at 12:22

Incidentally, it is hard to believe he scrubs his 2x64TB pools once a month.
Assuming 250MB/s sequential throughput, and that his scrubber can stream at
that rate, it will take him close to 6 days (3 days if reading them in
parallel) to read every block. During this time these pools won't be
useful for anything else. It is unclear whether he is using any RAID or a
filesystem that does checksums. Without that he would be unable to detect
hidden data corruption.

In contrast, ZFS will only scrub *live* data. As more of the disks are
filled up, scrub will take progressively more time. Similarly,
replacing a ZFS mirror won't read the source disk in its entirety,
only the live data.
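
The scrub-time estimate above can be reproduced with a short Python
sketch. It assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes)
and a constant streaming rate. The `fill` parameter is a hypothetical
knob for the fraction of live data, treating scrub time as simply
proportional to it, which is an idealization.

```python
SECONDS_PER_DAY = 86_400


def scrub_days(pool_tb, n_pools=1, mb_per_s=250.0, parallel=False, fill=1.0):
    """Estimate the days needed to read every (live) block once.

    pool_tb   -- size of each pool in decimal terabytes
    n_pools   -- number of equally sized pools
    mb_per_s  -- assumed sustained sequential throughput per pool
    parallel  -- True if all pools are read at the same time
    fill      -- fraction of each pool holding live data (assumption:
                 read time scales linearly with it)
    """
    bytes_per_pool = pool_tb * 1e12 * fill
    seconds_per_pool = bytes_per_pool / (mb_per_s * 1e6)
    total = seconds_per_pool if parallel else seconds_per_pool * n_pools
    return total / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(scrub_days(64, n_pools=2), 2))                 # one after the other
    print(round(scrub_days(64, n_pools=2, parallel=True), 2))  # both at once
    print(round(scrub_days(64, n_pools=2, fill=0.5), 2))       # half full, live data only
```

With these numbers, two 64TB pools read one after the other take about
5.9 days, and about 3 days if read in parallel or if only half of each
pool holds live data.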

> Of course, disks do die, and ECC and backups and checksums are good
> things.  But the whole "read 12TB, get an error" saying really
> misunderstands how hdd failures work.  Losing an entire platter, or
> maybe the entire 12TB disk dying due to a head crash, adds a lot of
> uncorrectable read errors to the numerator of the URE statistics.

That is not how URE specs are derived.

> 
> It just goes to show that human intuition really sucks at statistics,

Indeed :-)
