From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <9front-bounces@9front.inri.net>
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI autolearn=ham autolearn_force=no version=3.4.4
Received: from 9front.inri.net (9front.inri.net [168.235.81.73]) by inbox.vuxu.org (Postfix) with ESMTP id 1108A23C39 for ; Wed, 8 May 2024 18:35:57 +0200 (CEST)
Received: from dpmailmta01.doteasy.com ([65.61.219.8]) by 9front; Wed May 8 12:33:49 -0400 2024
X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=192.168.101.81;
Received: from dpmailrp01.doteasy.com (unverified [192.168.101.81]) by dpmailmta01.doteasy.com (DEO) with ESMTP id 134654820-1394429 for <9front@9front.org>; Wed, 08 May 2024 09:33:41 -0700
Received: from dpmail01.doteasy.com (dpmail01.doteasy.com [192.168.101.1]) by dpmailrp01.doteasy.com (8.15.2/8.15.2/Debian-8+deb9u1) with ESMTP id 448GXdvB016870 for <9front@9front.org>; Wed, 8 May 2024 09:33:40 -0700
X-SmarterMail-Authenticated-As: fde101@fjrhome.net
Received: from [192.168.1.95] (pool-173-67-134-57.hrbgpa.fios.verizon.net [173.67.134.57]) by dpmail01.doteasy.com with SMTP (version=Tls12 cipher=Aes256 bits=256); Wed, 8 May 2024 09:33:20 -0700
Message-ID: <959215bb-a8d0-40b9-bbd5-89c21a90c9ac@fjrhome.net>
Date: Wed, 8 May 2024 12:33:13 -0400
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: 9front@9front.org
References: <328dc131-044d-408d-a040-512a44ae6e7b@fjrhome.net> <9df183e7-7a94-4d58-9a68-2dbc0e73018f@posixcafe.org>
Content-Language: en-US
From: "Frank D. Engel, Jr."
In-Reply-To: <9df183e7-7a94-4d58-9a68-2dbc0e73018f@posixcafe.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Exim-Id: 959215bb-a8d0-40b9-bbd5-89c21a90c9ac
X-Bayes-Prob: 0.5 (Score 0, tokens from: base:default, @@RPTN)
X-CanIt-Geo: No geolocation information available for 192.168.101.1
X-CanItPRO-Stream: base:default
X-Canit-Stats-ID: 01ckExE9f - bdc40e88a886 - 20240508
X-Scanned-By: CanIt (www . roaringpenguin . com) on 192.168.101.81
X-Originating-IP: 192.168.101.81
List-ID: <9front.9front.org>
List-Help:
X-Glyph: ➈
X-Bullshit: deep-learning enhancement-aware solution
Subject: Re: [9front] Enabling a service
Reply-To: 9front@9front.org
Precedence: bulk

When you perform a write, you send that write to the file server on the remote system and it updates the block, generating the appropriate hash. If another client sends a write with a collision, the file server handles it the same way it already would - the two are ultimately processed in some order or another.

Each client system may have updated a locally cached copy of the block with the same written data, but if the data did not match what was generated on the file server because it was updated in the meantime, then the hash on the client system will not match what the remote server came up with. Consequently, when it later tries to read that block again, the file server sends it the hash that it has, which does not exist in the local cache, so the client is forced to request the correct block from the file server instead of using the locally cached copy.

Similarly, the other client performed its write and has yet another different hash, since the server has the data from both clients. When that client tries to perform another read, its hash won't match either, forcing it too to obtain the correct copy of the data from the file server.

Had only one of them been manipulating the data, then that one would have the correct copy with a matching hash, so it would be able to use its local copy.

On 5/8/24 12:10, Jacob Moody wrote:
> On 5/8/24 10:49, Frank D. Engel, Jr. wrote:
>> How did it work for Venti when there were multiple users?
> My understanding of venti is somewhat limited, so take my explanation with a grain of salt.
> If you have divergent fossils using the same venti you will get divergent root scores; they
> become two paths that have to be merged manually.
>
>> You still have a single source of truth on the file server as all of the
>> data would still be written there with this approach and it would still
>> manage the directory structure, so any data that would come from other
>> users would ultimately just be pulled from the file server and loaded
>> into their cache separately.
> I got your proposal the wrong way around, you are talking about a local
> venti and a remote fossil. At the point you decide that you want a single
> remote source of truth (filesystem) that everyone must reconcile with you are still going
> to have issues with merging. Venti works as a backing for multiple disjoint fossils
> because it has no single source of truth for the filesystem, it just stores blocks.
>
> No matter how you cut it you are going to have to deal with multiple people merging
> their cache into the single root of truth with potentially latent updates.
> Either you have to serialize all mutations at the source of truth (and at that point
> you are latency bound) or you have to be clever about merging. A lot of ink has been
> spilled about this problem in the scope of web programming (CRDTs iirc), perhaps
> that may serve as some inspiration.
>
>> One challenge might seem to be simultaneous writes to different parts of
>> the same block, but in this case the locally calculated hash for the
>> block that was written to cache would (hopefully) not match the one
>> calculated by the file server which would reflect both updates, so when
>> the read would occur the file server would send a hash that would not be
>> in the local cache and the read would be sent across to the file server
>> for the updated block, with the incorrect local block eventually being
>> aged out of the cache as it started to fill up.
>>
>>
> Reading this and rereading your previous email I still do not fully understand
> how you plan to deal with collisions. If you have a local cache that you write
> to first and then you slowly drain that to the remote system you are still
> going to have merge issues. The fileserver is going to say at some point "No
> this is based on stale information" and you'll have to figure out how to retroactively
> reconcile this error with a system that has already forgotten this request.
>
>
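
For illustration only, here is a minimal sketch of the read path described at the top of this message, written in Python rather than anything that exists in 9front. It assumes writes go straight through to the file server (not the write-behind cache Jacob is asking about), that the server hands back the current hash of a block on each read, and that the client keeps a cache keyed by hash. The names FileServer, Client, score, BLOCK_SIZE and remote_fetches are all made up for this example, and SHA-1 is just a stand-in hash.

```python
import hashlib

BLOCK_SIZE = 16  # hypothetical, tiny so the example is easy to follow


def score(data: bytes) -> str:
    """Content hash of a block (SHA-1 chosen only for illustration)."""
    return hashlib.sha1(data).hexdigest()


class FileServer:
    """Hypothetical single source of truth holding the real blocks."""

    def __init__(self, nblocks: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(nblocks)]

    def write(self, blockno: int, offset: int, data: bytes) -> None:
        # Writes are applied in arrival order. The caller must keep
        # offset + len(data) within BLOCK_SIZE.
        block = bytearray(self.blocks[blockno])
        block[offset:offset + len(data)] = data
        self.blocks[blockno] = bytes(block)

    def current_score(self, blockno: int) -> str:
        return score(self.blocks[blockno])

    def fetch(self, blockno: int) -> bytes:
        return self.blocks[blockno]


class Client:
    """Hypothetical client with a local cache keyed by block hash."""

    def __init__(self, server: FileServer) -> None:
        self.server = server
        self.cache = {}          # hash -> block contents
        self.view = {}           # block number -> last contents seen locally
        self.remote_fetches = 0  # how often the full block was transferred

    def write(self, blockno: int, offset: int, data: bytes) -> None:
        # Write-through: the server always receives the write.
        self.server.write(blockno, offset, data)
        # Apply the same bytes to the local copy and cache it under the
        # locally computed hash, which may differ from the server's.
        local = bytearray(self.view.get(blockno, bytes(BLOCK_SIZE)))
        local[offset:offset + len(data)] = data
        self.view[blockno] = bytes(local)
        self.cache[score(self.view[blockno])] = self.view[blockno]

    def read(self, blockno: int) -> bytes:
        # The server sends the hash first; block data only on a miss.
        s = self.server.current_score(blockno)
        if s not in self.cache:
            self.remote_fetches += 1
            self.cache[s] = self.server.fetch(blockno)
        self.view[blockno] = self.cache[s]
        return self.cache[s]


if __name__ == "__main__":
    server = FileServer(1)
    a, b = Client(server), Client(server)
    a.read(0)
    b.read(0)
    a.write(0, 0, b"AAAA")  # two clients touch different parts
    b.write(0, 8, b"BBBB")  # of the same block
    print(a.read(0), a.remote_fetches)
    print(b.read(0), b.remote_fetches)
```

With two writers, neither client's locally computed hash matches the server's, so both reads fall through to the server. With a single writer the local hash matches and the cached copy is used. This only covers the write-through case; it does not try to answer the question of draining a local write cache to the server later.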