From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
Subject: Re: [9fans] QTCTL?
From: "Russ Cox"
Date: Thu, 1 Nov 2007 13:03:45 -0400
In-Reply-To: <13426df10711010904r317f9fd6v14a87dc2f024b0b1@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20071101170234.13E9E1E8C55@holo.morphisms.net>
Topicbox-Message-UUID: e39696be-ead2-11e9-9d60-3106f5b1d025

> The fact is we have loose consistency now, we just don't call it that.
> Anytime you are running a file from a server, you have loose
> consistency. It works ok in most cases.

Because all reads and writes go through to the server, all file
system operations on a particular server are globally ordered,
*and* that global ordering matches the actual sequence of events
in the physical world (because clients wait for R-messages).
That's a pretty strong consistency statement!

Any revocation-based system has to have the server wait for an
acknowledgement from the client. If there is no wait, then between
the time that the server sends the "oops, stop caching this" and
the client processes it, the client might incorrectly use the
now-invalid data. That's why a change file doesn't provide the
same consistency guarantees as pushing all reads/writes to the
file server. To get those, revocations fundamentally must wait
for the client.

It's also why this doesn't work:

> Tcache asks whether the server is prepared to cache
> Rcache makes lease available with parameters, Rerror says no.
>
> Tlease says, ok start my lease now (almost immediately follows Rache)
> Rlease lease expired or lease needs to be given back early
>
> Tcache done with old lease (may immediately ask for a new lease)
> etc.

because the Rlease/Tcache sequence is a s->c->s message. If a
client doesn't respond with the Tcache to formally give up the
lease, the server has no choice but to wait.

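To make the race concrete, here is a toy timeline in Python. It is
only a sketch: the function name stale_reads, the one-way delay, and
the specific times are invented for illustration and are not part of
9P or of any proposal in this thread. The server sends a revocation
at t_revoke, the client processes it one delay later, and the server
changes the file either at once or only after the client's
acknowledgement has made the trip back.

```python
def stale_reads(read_times, t_revoke, delay, wait_for_ack):
    """Return the client cache reads that observed invalid data.

    read_times:   times at which the client reads from its cache
    t_revoke:     time the server sends "stop caching this"
    delay:        one-way message latency (hypothetical, fixed)
    wait_for_ack: whether the server holds the change until the
                  client's acknowledgement arrives
    """
    t_client_stops = t_revoke + delay        # client processes the revocation
    if wait_for_ack:
        t_change = t_client_stops + delay    # ack has to travel back first
    else:
        t_change = t_revoke                  # server changes the file at once
    # A cached read is stale if the file has already changed but the
    # client has not yet stopped using its cache.
    return [t for t in read_times if t_change <= t < t_client_stops]


reads = [9.0, 10.5, 11.5, 13.0]
print(stale_reads(reads, t_revoke=10.0, delay=2.0, wait_for_ack=False))
# prints [10.5, 11.5]
print(stale_reads(reads, t_revoke=10.0, delay=2.0, wait_for_ack=True))
# prints []
```

Without the wait, every cached read in the window between the
server's change and the client's processing of the revocation is
stale. With the wait, that window is empty.
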
If you are willing to assume that each machine has a real-time
clock that runs approximately at the same rate (so that different
machines agree on what 5 seconds means, but not necessarily what
time it is right now), then you can fix the above messages by
saying that the client lease is only good for a fixed time period
(say 5 seconds) from the time that the client sent the Tlease.
Then the server can overestimate the lease length as 5 seconds
from when it sent the Rlease, and everything is safe. And if the
server sends a Rlease and the client doesn't respond with a Tcache
to officially renounce the lease, the server can just wait until
Tlease + 5 seconds and go on.

But that means the client has to be renewing the lease every
5 seconds (more frequently, actually). Also, in the case where
the lease is just expiring but not being revoked, then you have
to have some mechanism for establishing the new lease before the
old one runs out. If there is a time between two leases when you
don't hold any leases, then all your cached data becomes invalid.

The following works:

    Tnewlease     asks for a new lease
    Rnewlease     grants the lease, for n seconds
                  starting at time of Tnewlease
    Trenewlease   asks to renew the lease
    Rrenewlease   grants the renewal for n seconds
                  starting at time of Trenewlease

Now if the server needs to revoke the lease, it just refuses to
renew and waits until the current lease expires.

You can add a pseudo-callback to speed up revocation with a
cooperative client:

    Tneeditback   offers to give lease back to server early
    Rneeditback   says I accept your offer, please do
    Tdroplease    gives lease back
    Rdroplease    says okay I got it (not really necessary)

The lease ends when the client sends Tdroplease, *not* when the
server sends Rneeditback. It can't end at Rneeditback for the
same reason change files don't work. And this can *only* be an
optimization, because it depends on the client sending Tdroplease.
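
A minimal sketch of the timing argument, in Python. The class names,
LEASE_SECONDS, and the timestamps passed in are assumptions made for
illustration; nothing here is a real 9P message or API. The client
measures its lease from when it sent Tnewlease or Trenewlease, and the
server measures from when it sent the matching R-message. Since the
reply is always sent after the request, the server's estimate can
only be later than the client's. The demo uses one timeline for both
sides to stay readable; the argument itself needs only that the two
clocks agree on how long n seconds is.

```python
LEASE_SECONDS = 5.0  # hypothetical lease length n


class ClientLease:
    """Client view: the lease runs from when the T-message was sent."""

    def __init__(self):
        self.expires = None  # None means no lease is held

    def granted(self, t_sent):
        # Called on Rnewlease/Rrenewlease; t_sent is when the client
        # sent the matching Tnewlease/Trenewlease.
        self.expires = t_sent + LEASE_SECONDS

    def drop(self):
        # Called when the client sends Tdroplease.
        self.expires = None

    def cache_valid(self, now):
        return self.expires is not None and now < self.expires


class ServerLease:
    """Server view: overestimate from when the R-message was sent."""

    def __init__(self):
        self.expires = None
        self.revoking = False

    def grant(self, t_reply):
        # True means send Rnewlease/Rrenewlease; False means refuse.
        if self.revoking:
            return False
        self.expires = t_reply + LEASE_SECONDS
        return True

    def dropped(self):
        # Called when Tdroplease arrives: no need to wait any longer.
        self.expires = None

    def safe_to_modify(self, now):
        return self.expires is None or now >= self.expires


client, server = ClientLease(), ServerLease()
# Client sends Tnewlease at t=0.0; server sends Rnewlease at t=0.5.
server.grant(t_reply=0.5)
client.granted(t_sent=0.0)

server.revoking = True            # server wants the lease back
print(server.grant(t_reply=4.0))  # renewal refused: prints False

for now in (4.5, 5.25, 5.5):
    print(now, client.cache_valid(now), server.safe_to_modify(now))
# prints:
# 4.5 True False
# 5.25 False False
# 5.5 False True
```

At no point is the client's cache valid while the server considers
it safe to modify, even though the client never acknowledged
anything. A cooperative client that calls drop() and sends Tdroplease
lets the server call dropped() and proceed without waiting out the
remainder.
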
To get something that works in the presence of misbehaved clients
you have to be able to fall back on the "wait it out" strategy.

One could, of course, use a different protocol with a 9P front
end. That's okay for clients, but you'd still have to teach the
back-end server (i.e. fossil) to speak the other protocol directly
in order to get any guarantees. (If 9P doesn't cut it then
anything that's just in front of (not in place of) a 9P server
can't solve the problem.)

Russ