The Unix Heritage Society mailing list
From: Arthur Krewat <krewat@kilonet.net>
To: tuhs@minnie.tuhs.org
Subject: Re: [TUHS] Is it time to resurrect the original dsw (delete with switches)?
Date: Mon, 30 Aug 2021 12:46:59 -0400
Message-ID: <434d8b8d-ed64-cc73-9947-3d415a90bb08@kilonet.net>
In-Reply-To: <20210830130603.A7D4C640CC6@lignose.oclsc.org>

On 8/30/2021 9:06 AM, Norman Wilson wrote:
> A key point is that the character of the errors they
> found suggests it's not just the disks one ought to worry
> about, but all the hardware and software (much of the latter
> inside disks and storage controllers and the like) in the
> storage stack.
I had a pair of Dell MD1000s, full of SATA drives (28 total), with the 
SATA/SAS interposers on the back of the drive. Was getting checksum 
errors in ZFS on a handful of the drives. Took the time to build a new 
array, on a Supermicro backplane, and no more errors with the exact same 
drives.

I'm theorizing it was either the interposers, or the SAS 
backplane/controllers in the MD1000. Without ZFS, who knows how 
swiss-cheesy my data would be.

Not to mention the time I set up a Solaris x86 cluster zoned to a 
Compellent and periodically would get one or two checksum errors in ZFS. 
This was the only cluster out of a handful that had issues, and only on 
that one filesystem. Of course, it was a production PeopleSoft Oracle 
database. I guess moving to a VMware Linux guest and XFS just swept the 
problem under the rug, but the hardware is not being reused so there's that.

> I had heard anecdotes long before (e.g. from Andrew Hume)
> suggesting silent data corruption had become prominent
> enough to matter, but this paper was the first real study
> I came across.
>
> I have used ZFS for my home file server for more than a
> decade; presently on an antique version of Solaris, but
> I hope to migrate to OpenZFS on a newer OS and hardware.
> So far as I can tell ZFS in old Solaris is quite stable
> and reliable.  As Ted has said, there are philosophical
> reasons why some prefer to avoid it, but if you don't
> subscribe to those it's a fine answer.
>
Been running Solaris 11.3 and ZFS for quite a few years now, at home. 
Before that, Solaris 10. I set up a home Red Hat 8 server, w/ZoL 
(0.8), earlier this year - so far, no issues, with 40+TB online. I have 
various test servers with ZoL 2.0 on them, too.

I have so much online data that I use as the "live copy" - going back to 
early-'80s copies of my TOPS-10 stuff. Even though I have copious 
amounts of LTO tape copies of this data, I won't go back to the "out of 
sight out of mind" mentality.

Trying to get customers to buy into that idea is another story.

art k.

PS: I refuse to use a workstation that doesn't use ECC RAM, either. I 
like Swiss cheese on a sandwich. I don't like my (or my customers') data 
emulating it.
