From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, HTML_MESSAGE,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 22203 invoked from network); 5 Feb 2021 18:17:21 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 5 Feb 2021 18:17:21 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 774F29C6CF; Sat, 6 Feb 2021 04:17:19 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 30ABF9BA45; Sat, 6 Feb 2021 04:16:45 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=pass (2048-bit key; unprotected) header.d=bsdimp-com.20150623.gappssmtp.com header.i=@bsdimp-com.20150623.gappssmtp.com header.b="tyVz2/e0"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 27D709BA45; Sat, 6 Feb 2021 04:16:40 +1000 (AEST) Received: from mail-qt1-f174.google.com (mail-qt1-f174.google.com [209.85.160.174]) by minnie.tuhs.org (Postfix) with ESMTPS id 0E2F49BA3F for ; Sat, 6 Feb 2021 04:16:39 +1000 (AEST) Received: by mail-qt1-f174.google.com with SMTP id e11so5637896qtg.6 for ; Fri, 05 Feb 2021 10:16:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=PGCgnjeefqhqMyFi07qJ3iOvbQFjvazYaeMtsupRLvo=; b=tyVz2/e0ERz7BLxnjyLy4dMt60rGeIsQu/0bYUaB36vfBJFe2CBN1ASfPHjP4+a6yD A2cvnVXX2SIy2lXInoRAYOG/T4Zbz9AiQxmD8Xt2d5DgEHxJd0KlvAFbD8b9RTzXj9EG JS5WlIHDSF/Bt7tnolimldgL1Vd/AGqS4xbjlJSf+ErtNsZz3R8f36H5iMGasjtemFfw obTfuzNciAnspC/EEs1/GKdMqtMrf10BFMPt7JacHQ2K9nXj4WKl684quzmLWkRj6zh8 imySkM7zezZHuNY/BOAWZ382G+h4WYdYc+LUrj6Zz5EKhROZtk4TOPbhVYBp/8X80isl bIlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=PGCgnjeefqhqMyFi07qJ3iOvbQFjvazYaeMtsupRLvo=; b=TUAJ963Xml5+J+LNKsM1kfDhroZKKPO4TZuEJ8CqQito9KqVrNObulNIFZDPtYJCCa 082a//U5BmWZ2bxzCLw1Iq2ts2kxbzo3BKzA8MAi2z+3oGW2qrBIebs23z+6PuTDi93L 8eg1u0JJaVnPZfQIYxOhZUteHP8S46qECqqUrONZg5WhBKlYnZujr978FgoyPkzlTssh MR7RJzvP2M/A6Wbll0fIvccf15TiTFOjCzi3SYz2H6uq0QOKlUxdT3osA44OONPgqkH/ Ui9gXm0ilLaJLPZup4anZGSGikyFo6peyy7T5dut2Qc0E6sxAwFYyI/f4+Pjkpa6pDhN aCLw== X-Gm-Message-State: AOAM533NF9kNC3ro2DkJm8EkFApeI2wH16WRBusYDUkxxl6ZybFbECwq s+E6/kSM6Yh1a6ucKuWRIKDMlB7vrkg0lNolOrjBqrFFy2s= X-Google-Smtp-Source: ABdhPJyIagVs97N2jTMIvdLZuDfj+GkE/E58pDkRN2UJ8UvBNZMOQvvR93BSfJzhli/7VqRY3MnYarODI9tzkV6QicU= X-Received: by 2002:a05:622a:90:: with SMTP id o16mr5201799qtw.49.1612548998055; Fri, 05 Feb 2021 10:16:38 -0800 (PST) MIME-Version: 1.0 References: <20210205003315.GK13701@mcvoy.com> <26D923CF-5319-4207-BA28-6EFA0E3BB1F8@iitbombay.org> <20210205141820.GO13701@mcvoy.com> In-Reply-To: <20210205141820.GO13701@mcvoy.com> From: Warner Losh Date: Fri, 5 Feb 2021 11:16:26 -0700 Message-ID: To: Larry McVoy Content-Type: multipart/alternative; boundary="000000000000365d5a05ba9ad24b" Subject: Re: [TUHS] FreeBSD behind the times? (was: Favorite unix design principles?) X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bakul Shah , TUHS main list Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --000000000000365d5a05ba9ad24b Content-Type: text/plain; charset="UTF-8" On Fri, Feb 5, 2021, 7:19 AM Larry McVoy wrote: > On Thu, Feb 04, 2021 at 09:17:54PM -0800, Bakul Shah wrote: > > On Feb 4, 2021, at 4:33 PM, Larry McVoy wrote: > > > > > > Ignoring the page cache and make their own cache has big problems. > > > You can mmap() ZFS files and doing so means that when a page is > referenced > > > it is copied from the ZFS cache to the page cache. That creates a > > > coherency problem, I can write via the mapping and I can write via > > > write(2) and now you have two copies of the data that don't match, > > > that's pretty much OS no-no #1. > > > > Write(2)ing to a mapped page sounds pretty dodgy. Likely to get you > > in trouble in any case. Similarly read(2)ing. > > The entire point of the SunOS 4.0 VM system was that the page you > saw via mmap(2) is the exact same page you saw via read(2). It's > the page cache, it has page sized chunks of memory that cache > file,offset pairs. > > There is one, and only one, copy of the truth. Doesn't matter how > you get at it, there is only one "it". > > ZFS broke that contract and that was a step backwards in terms of > OS design. > The double copy is the primary reason we don't use it to store videos we serve. It's a performance bottleneck as well. And fixing it is... rather involved... possible, but a lot of work to teach the ARC about the buffer cache or the buffer cache about the ARC... But for everything else I do, I accept the imperfect design because of all the other features it unlocks. Warner > --000000000000365d5a05ba9ad24b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Fri, Feb 5, 2021, 7:19 AM Larry McVoy <lm@mcvoy.com> wrote:
On Thu, Feb 04, 2021 at 09:17:54PM -0800, Bakul Shah wrot= e:
> On Feb 4, 2021, at 4:33 PM, Larry McVoy <lm@mcvoy.com> wrote:
> >
> > Ignoring the page cache and make their own cache has big problems= .
> > You can mmap() ZFS files and doing so means that when a page is r= eferenced
> > it is copied from the ZFS cache to the page cache.=C2=A0 That cre= ates a
> > coherency problem, I can write via the mapping and I can write vi= a
> > write(2) and now you have two copies of the data that don't m= atch,
> > that's pretty much OS no-no #1.
>
> Write(2)ing to a mapped page sounds pretty dodgy. Likely to get you > in trouble in any case. Similarly read(2)ing.

The entire point of the SunOS 4.0 VM system was that the page you
saw via mmap(2) is the exact same page you saw via read(2).=C2=A0 It's<= br> the page cache, it has page sized chunks of memory that cache
file,offset pairs.

There is one, and only one, copy of the truth.=C2=A0 Doesn't matter how=
you get at it, there is only one "it".

ZFS broke that contract and that was a step backwards in terms of
OS design.

The double copy is the primary reason we don't use it to stor= e videos we serve. It's a performance bottleneck as well.

And fixing it is... rather involved..= . possible, but a lot of work to teach the ARC about the buffer cache or th= e buffer cache about the ARC...

But for everything else I do, I accept the imperfect design becau= se of all the other features it unlocks.

<= div dir=3D"auto">Warner
<= blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px= #ccc solid;padding-left:1ex">
--000000000000365d5a05ba9ad24b--