* [9fans] QTCTL?
From: Francisco J Ballesteros @ 2007-10-31 18:40 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

Hi,

While hunting yet another bug in the octopus, I've been thinking that one problem we have in general, in Plan 9, is that there are files that behave like files, and files that do not.

For example, append-only files do not: offsets are ignored on writes. Ctl files do not, either: you write a ctl string, and reading the file reports something else. Clone files are different files each time they are opened.

This is a problem when (as we do in the octopus) you try to cache files. But it's also a problem for things like tar, and for whoever tries to use the file as a plain one.

Why don't we add a QTCTL bit to Qid.type? It would mean "this file does not behave like a regular file; do not cache, and handle with care". I think the change could be incorporated without causing a nightmare, and it would make things cleaner regarding what one can expect from a file after looking at its directory entry.

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: Eric Van Hensbergen @ 2007-10-31 18:56 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

On 10/31/07, Francisco J Ballesteros <nemo@lsub.org> wrote:
> while hunting yet another bug in the octopus, I've been thinking that
> one problem that we have in general, in Plan 9, is that there are
> files that behave like files, and files that do not.
>
> For example, append only files do not, offsets are ignored on writes.
> ctl files are not, either. You write a ctl string, and reading the
> file reports something else. Clone files are different files, each
> time they are open.
>
> This is a problem when (like we do in the octopus) you try to cache
> files. But it's also a problem for things like tar and to whoever
> tries to use the file as a plain one.
>
> Why don't we add a QTCTL bit to Qid.type?
> It would mean "this file does not behave like a regular file, do not
> cache and handle with care".

IIRC, qid.version == 0 is used to mark synthetics (like ctl) for the purposes of being marked as uncacheable and should be handled with care.

      -eric
* Re: [9fans] QTCTL? 2007-10-31 18:56 ` Eric Van Hensbergen @ 2007-10-31 19:13 ` Charles Forsyth 2007-10-31 19:33 ` Eric Van Hensbergen 2007-10-31 20:43 ` geoff 0 siblings, 2 replies; 76+ messages in thread From: Charles Forsyth @ 2007-10-31 19:13 UTC (permalink / raw) To: 9fans > IIRC, qid.version == 0 is used to mark synthetics (like ctl) for the > purposes of being marked as uncacheable and should be handled with > care. i don't see that anywhere. MCACHE allows things in a mounted space to be cached; otherwise, i'd suppose not. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: Eric Van Hensbergen @ 2007-10-31 19:33 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

On 10/31/07, Charles Forsyth <forsyth@terzarima.net> wrote:
> > IIRC, qid.version == 0 is used to mark synthetics (like ctl) for the
> > purposes of being marked as uncacheable and should be handled with
> > care.
>
> i don't see that anywhere. MCACHE allows things in a mounted space
> to be cached; otherwise, i'd suppose not.

hurumph. Don't know where I got that from - I tried to base my v9fs caching stuff on cfs.c, but I don't see any qid.version==0 checks there. Then I thought perhaps it was from a conversation with Russ when I was doing caching in v9fs -- but on searching my gmail he was saying that it didn't always hold true -- so perhaps something new is needed...

      -eric
* Re: [9fans] QTCTL?
From: erik quanstrom @ 2007-10-31 19:39 UTC (permalink / raw)
To: 9fans

> hurumph. Don't know where I got that from - I tried to base my v9fs
> caching stuff on cfs.c, but I don't see any qid.version==0 checks
> there. Then I thought perhaps it was from a conversation with Russ
> when I was doing caching in v9fs -- but on searching my gmail he was
> saying that it didn't always hold true -- so perhaps something new is
> needed...

this is not true of devsd devices. the version is used to deal with removable devices. if you have a cd open and the media is changed, i/o to the device will return Echange. raw /dev/aoe devices will do the same thing.

- erik
* Re: [9fans] QTCTL? 2007-10-31 19:13 ` Charles Forsyth 2007-10-31 19:33 ` Eric Van Hensbergen @ 2007-10-31 20:43 ` geoff 2007-10-31 21:32 ` Charles Forsyth 2007-10-31 22:48 ` roger peppe 1 sibling, 2 replies; 76+ messages in thread From: geoff @ 2007-10-31 20:43 UTC (permalink / raw) To: 9fans I remember something similar to what Eric remembers: qid.vers of zero means `don't cache'. It might not be written down; it may have just been oral folklore at the labs. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 20:43 ` geoff @ 2007-10-31 21:32 ` Charles Forsyth 2007-10-31 22:48 ` roger peppe 1 sibling, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-10-31 21:32 UTC (permalink / raw) To: 9fans > I remember something similar to what Eric remembers: qid.vers of zero > means `don't cache'. It might not be written down; it may have just > been oral folklore at the labs. it wasn't used by either cache.c or cfs, which seemed to be the main candidates for cache management in practice. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: roger peppe @ 2007-10-31 22:48 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

> I remember something similar to what Eric remembers: qid.vers of zero
> means `don't cache'. It might not be written down; it may have just
> been oral folklore at the labs.

when the 9p2000 man pages were initially posted to this list for discussion, i made a suggestion to that effect, and i seem to recall rob saying "not a bad idea, but we haven't done it yet". it's still not done (or documented, at any rate), but i'm still not sure it's a bad idea.

> this is not true of devsd devices. the version is used to deal with
> removable devices. if you have a cd open and the media is changed,
> i/o to the device will return Echange.

using qid.version to indicate the status of the underlying device rather than of the file data seems to me like an abuse of the system. surely a status file would be a better way of indicating media change?

> Why don't we add a QTCTL bit to Qid.type?

if we ignore concurrent usage (which is, after all, a rare case), one big issue is idempotency - can i read twice at the same offset and get the same result? can i write two blocks at different offsets out of order (a la fcp) and end up with the same file? there are many possible ways that the read and write operations can be used in the construction of interesting file systems, but perhaps the semantics of "regular" files are sufficiently common and useful that it would be worth knowing whether a given file adheres to them. those semantics being something like:

- read or write twice at the same offset will yield the same result (modulo concurrent writers)
- read or write of several sequential items of data is the same as one read or write of all the data
- write, followed by a read at the same offset, yields the same data (modulo concurrent writers again)

so i guess i'd argue for a QTREGULAR (QTDATA?) bit rather than a QTCTL bit. that way, we could start off by adding that bit to those files that definitely observed the given semantics, and avoid arguing about which files were or were not "control" files.

(and qid.version==0 could still be useful as a "treat this file as if it always had a new version number" signifier.)

(and QTAPPEND is still useful to signify a modification, but not a complete discarding, of the given semantics.)
* Re: [9fans] QTCTL?
From: erik quanstrom @ 2007-10-31 23:35 UTC (permalink / raw)
To: 9fans

> when the 9p2000 man pages were initially posted to this list for
> discussion, i made a suggestion to that effect, and i seem to recall
> rob saying "not a bad idea, but we haven't done it yet".

here you argue that using the qid.version to infer something about the file is a good idea.

> using qid.version to indicate the status of the underlying device
> rather than of the file data seems to me like an abuse of the system.
> surely a status file would be a better way of indicating media change?

yet here you argue that using the qid.version to indicate that the medium underlying sdXX/data has changed is an "abuse".

to be a bit picky, the qid.version doesn't indicate the status of the device, it indicates how many times the media have changed.

it doesn't make sense for a process to blithely continue writing to the new medium without getting an error.

- erik
* Re: [9fans] QTCTL?
From: roger peppe @ 2007-11-01 9:29 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

On 10/31/07, erik quanstrom <quanstro@quanstro.net> wrote:
> to be a bit picky, the qid.version doesn't indicate the status of the
> device, it indicates how many times the media have changed.
>
> it doesn't make sense for a process to blithely continue writing to
> the new medium without getting an error.

i agree with that, and for read-only devices, using the version on the data file to indicate media change seems fine. but for writable devices, surely the version number should increment every time the device has been written? for writable devices, if there was a status file giving some information about the media, perhaps the version number on that would be a better place to record media changes.

regarding QTDECENT (or whatever it might be called), i recently came up against this. i've implemented most of a filesystem for latency lowering - kind of similar to fcp, but it allows naive clients to gain the benefits, and does it for streaming files too. it uses a filesystem at both the server and client sides - the client sends several requests at once, and the server gathers them into a coherent order.

the problem being that the server needs to decide what kind of semantics a given file supports - whether it has to preserve record boundaries as it reads, for example, or what to do if the reader does a seek (for a "decent" file, it can just discard the read-ahead data - for others, it should probably yield an error).

i don't really want to teach this filesystem about which files are "conventionally" normal - and it would be nice to just run one instance for an entire exported fs (accessed through another name, as in brian stuart's example).
* Re: [9fans] QTCTL? 2007-11-01 9:29 ` roger peppe @ 2007-11-01 11:03 ` Eric Van Hensbergen 2007-11-01 11:19 ` Charles Forsyth 2007-11-01 12:11 ` erik quanstrom 1 sibling, 1 reply; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 11:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 11/1/07, roger peppe <rogpeppe@gmail.com> wrote: > i don't really want to teach this filesystem about which files are > "conventionally" > normal - and it would be nice to just run one instance for an entire exported > fs (accessed through another name, as brian stuart's example). > Yes - I think transitive mounts, the desire to be able to mount pre-composed file systems, and even mixed mode synthetics (which have both ctl files and cacheable data) all lean toward having a way of identifying what should be cache-able. For instance, my Libra libraryOS environment currently only has a single channel with which to mount all resources -- so it mounts the file system at the same time as console and network files. I imagine if we ran 9p directly on top of one of the raw interconnect channels on Blue Gene we might be in a similar situation (although we aren't currently looking at that). -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 11:03 ` Eric Van Hensbergen @ 2007-11-01 11:19 ` Charles Forsyth 0 siblings, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 11:19 UTC (permalink / raw) To: 9fans i'd often mount each distinct underlying thing separately, not just for cache control but to select the right type of connection for each, or to get better bandwidth than could be achieved with just one connection. if a service is aggregating some others, and the connection classes are otherwise the same, then i'd have that service talk to the client side cache services to provide them with its own cache constraints. (and so on) ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: erik quanstrom @ 2007-11-01 12:11 UTC (permalink / raw)
To: 9fans

> i agree with that, and for read-only devices, using the version
> on the data file to indicate media change seems fine. but for
> writable devices, surely the version number should increment
> every time the device has been written? for writable devices,
> if there was a status file giving some information about the media,
> perhaps the version number on that would be a better place to
> record media changes.

i see the symmetry of what you're saying, but what would be the utility of maintaining the version this way? the version, as you describe it, wouldn't survive reboot and, for network-attached storage, wouldn't be coherent across machines.

i'm not sure that devices are either read-only or read-write. that might depend on the underlying (hot-pluggable) medium, and the device driver might not care.

i think it makes sense to use medium changes (not connections, if possible) to determine the version. the marvell and aoe drivers consider a device changed if the serial number or number of sectors changes. that is something most io clients are interested in: how many times do you want to write a random subset of blocks to a different device?

it does not seem too much of a stretch. the stat.size field isn't the "file size" of a stream (whatever that means), so i think this is well within the tradition.

- erik
* Re: [9fans] QTCTL?
From: erik quanstrom @ 2007-10-31 19:42 UTC (permalink / raw)
To: 9fans

> Why don't we add a QTCTL bit to Qid.type?
> It would mean "this file does not behave like a regular file, do not
> cache and handle with care".

why are the current namespace conventions insufficient? /mnt, /net, and /dev hold most, if not all, of the special files.

- erik
* Re: [9fans] QTCTL? 2007-10-31 19:42 ` erik quanstrom @ 2007-10-31 19:49 ` Eric Van Hensbergen 2007-10-31 20:03 ` erik quanstrom 0 siblings, 1 reply; 76+ messages in thread From: Eric Van Hensbergen @ 2007-10-31 19:49 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 10/31/07, erik quanstrom <quanstro@quanstro.net> wrote: > > Why don´t add a QTCTL bit to Qid.type? > > It would mean "this file does not behave like a regular file, do not cache and > > handle with care). > > why are the current namespace conventions insufficient? > /mnt, /net, and /dev hold most, if not all, of the special files. > The dynamic nature of namespace works against such conventions. Besides it would be nice to have a mechanism that could work in other systems that use 9p. File servers should be able to convey whether a file is cache-able or not. -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 19:49 ` Eric Van Hensbergen @ 2007-10-31 20:03 ` erik quanstrom 2007-10-31 20:10 ` Latchesar Ionkov ` (2 more replies) 0 siblings, 3 replies; 76+ messages in thread From: erik quanstrom @ 2007-10-31 20:03 UTC (permalink / raw) To: 9fans > The dynamic nature of namespace works against such conventions. > Besides it would be nice to have a mechanism that could work in other > systems that use 9p. File servers should be able to convey whether a > file is cache-able or not. i'm not sure i follow this argument. plan 9 namespaces are dynamic. one could put the network devices anywhere, but they are conventionally put on /net. there are no "regular" files in /net. perhaps if you gave a concrete example of why conventions can't sort this out it would make more sense to me. (i'm slow.) - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 20:03 ` erik quanstrom @ 2007-10-31 20:10 ` Latchesar Ionkov 2007-10-31 20:12 ` Eric Van Hensbergen 2007-10-31 20:17 ` Russ Cox 2 siblings, 0 replies; 76+ messages in thread From: Latchesar Ionkov @ 2007-10-31 20:10 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs You have a caching server (a separate machine) that caches files from servers that are far away. Some of the file servers it caches are synthetic. You mount one of them in /net. The server doesn't know that. On Oct 31, 2007, at 2:03 PM, erik quanstrom wrote: >> The dynamic nature of namespace works against such conventions. >> Besides it would be nice to have a mechanism that could work in other >> systems that use 9p. File servers should be able to convey whether a >> file is cache-able or not. > > i'm not sure i follow this argument. plan 9 namespaces are dynamic. > one could put the network devices anywhere, but they are > conventionally > put on /net. there are no "regular" files in /net. > > perhaps if you gave a concrete example of why conventions can't sort > this out it would make more sense to me. (i'm slow.) > > - erik > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 20:03 ` erik quanstrom 2007-10-31 20:10 ` Latchesar Ionkov @ 2007-10-31 20:12 ` Eric Van Hensbergen 2007-10-31 20:17 ` Russ Cox 2 siblings, 0 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-10-31 20:12 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 10/31/07, erik quanstrom <quanstro@quanstro.net> wrote: > > The dynamic nature of namespace works against such conventions. > > Besides it would be nice to have a mechanism that could work in other > > systems that use 9p. File servers should be able to convey whether a > > file is cache-able or not. > > i'm not sure i follow this argument. plan 9 namespaces are dynamic. > one could put the network devices anywhere, but they are conventionally > put on /net. there are no "regular" files in /net. > > perhaps if you gave a concrete example of why conventions can't sort > this out it would make more sense to me. (i'm slow.) > /net.alt (for one) While there is value in having conventions for where certain synthetics are bound (like /net), that doesn't mean that alternate synthetics aren't located in arbitrary places. Even if you use conventional locations, these may be elsewhere when transitively mounted /n/remote/net. Then you have the fact that Inferno has additional synthetics (like /cmd) that don't match Plan 9 conventions. And people using p9p and v9fs on Linux may use yet another set of conventions. Hardcoding a set number of paths or having a set number of paths in a configuration file feels wrong and its my opinion that going down that path isn't the right solution. -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: Russ Cox @ 2007-10-31 20:17 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

Plan 9's default is not to cache, making a "don't cache this" bit unnecessary. If the user explicitly requests caching (by using cfs, say), then he's responsible for making sure it is appropriate.

If I tell the computer to cache /net, that's not the computer's problem, any more than if I bind /proc /net.

Since there's no coherence protocol anyway, caching can't be done automatically. It might give the right answer most of the time, but it will screw up corner cases and make the system more fragile.

This whole synthetic vs not mentality is Unix brain-damage. On Plan 9 there is no distinction. Everything is synthetic (or everything is not, depending on your point of view).

Russ
* Re: [9fans] QTCTL?
From: Francisco J Ballesteros @ 2007-10-31 20:29 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

It's not "synthetic vs stored on disk". It's "behaves like a file vs does not". For example, some files in omero may be large (images) and it's nice to cache them. But some files do not behave as files (you open OTRUNC, rewrite, and might read something else from the file later).

In the octopus we mount devices, yet we want to cache most of their structure/data. For a local area network, it's ok to say "do not cache at all". But IMHO, for a wide area network, or slow adsl lines, it's not so ok.

Regarding coherency, you always have races; you have to reach the server anyway and that takes time. But in many cases this is not a problem. When it is, I agree that you have to use something else, or put something else within the file server involved.

On 10/31/07, Russ Cox <rsc@swtch.com> wrote:
> Plan 9's default is not to cache, making a "don't cache this" bit
> unnecessary. If the user explicitly requests caching (by using cfs,
> say), then he's responsible for making sure it is appropriate.
>
> If I tell the computer to cache /net, that's not the computer's
> problem, any more than if I bind /proc /net.
>
> Since there's no coherence protocol anyway, caching can't be done
> automatically. It might give the right answer most of the time, but it
> will screw up corner cases and make the system more fragile.
>
> This whole synthetic vs not mentality is Unix brain-damage. On Plan 9
> there is no distinction. Everything is synthetic (or everything is
> not, depending on your point of view).
>
> Russ
* Re: [9fans] QTCTL?
From: Charles Forsyth @ 2007-10-31 20:48 UTC (permalink / raw)
To: 9fans

> It's "behaves like a file vs does not".
>
> ... But some files do not behave as files

that description seems to assume that there is a real, proper, honest-to-God file (you know, one that has the decency to be located on a proper disc somewhere) that defines the "expected" behaviour.

a "file" in Plan 9 (or Inferno) is something you can name, open, and read and (perhaps) write.
* Re: [9fans] QTCTL?
From: Francisco J Ballesteros @ 2007-10-31 21:23 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

Not exactly. But I assume that

	echo a > f
	cat f

would yield

	a

and I expect a file with reported size "0" to have 0 bytes in it when read. That's (file) decency for me.

I know that most files do not have to behave that way; I've implemented some. But the point is that it's been more than once that I had to determine whether a file was "decent" or not. Isn't this a real problem with a simple fix? Why shouldn't this be addressed? I'd love to stand corrected, but I still think this is an actual problem.

On 10/31/07, Charles Forsyth <forsyth@terzarima.net> wrote:
> that description seems to assume that there is a real, proper,
> honest-to-God file (you know, that has the decency to be located on a
> proper disc somewhere) that defines the "expected" behaviour.
>
> a "file" in Plan 9 (or Inferno) is something you can name, open, and
> read and (perhaps) write.
* Re: [9fans] QTCTL?
From: Russ Cox @ 2007-10-31 21:40 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

> But I assume that
> 	echo a > f
> 	cat f
> would yield
> 	a
> and I expect a file with reported size "0" to have 0 bytes in it when
> read.
>
> That's (file) decency for me.

But this isn't even true for disk files if someone else or some other machine is writing to f around the same time.

If I'm doing tail -f on a remote log file and tail -f just does occasional reads at the end of the file, then you will get the wrong answer, because once the cache sees the eof it will never issue another read.

It is a fundamental problem with implementing caching atop a system that is not intended to be cached. Having a QTCTL bit (or a QTOKTOCACHE bit) will not solve the problem.

Cfs is not magic. It trades some of the reliability of 9P for some performance. It doesn't do a perfect job. If you choose to use cfs then you are accepting those degradations in semantics, even for "disk files".

What you really need is a way to ask the server "can I cache the following?" and have the server say yes or no, and then have some way to invalidate the cache, so that you get coherent behavior, even in the above case. We discussed various ways to add this to the protocol, but ultimately we didn't see any way simple enough that the specification effort wasn't outweighed by our not needing to solve the problem at that time. (We did add QTAPPEND to fix one glaring cfs bug.)

By all means experiment with real caching protocols using 9P. Perhaps you will find a nice way to add it and then 9P2010 can adopt it. QTCTL isn't enough, though: it pushes your problems farther away but doesn't solve them.

Russ
* Re: [9fans] QTCTL? 2007-10-31 21:40 ` Russ Cox @ 2007-10-31 22:11 ` Charles Forsyth 2007-10-31 22:26 ` Francisco J Ballesteros 2007-11-01 6:21 ` Bakul Shah 2 siblings, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-10-31 22:11 UTC (permalink / raw) To: 9fans > and then 9P2010 can adopt it. QTCTL isn't enough though: > it pushes your problems farther away but doesn't solve them. something similar was suggested some time ago (marking files as seekable or not) so it's probably worthwhile rereading that discussion if you can find it. i think someone even started adding the marker to various synthetic files until ... (and it was all undone). ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL?
From: Francisco J Ballesteros @ 2007-10-31 22:26 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

> If I'm doing tail -f on a remote log file and tail -f
> just does occasional reads at the end of the file,
> then you will get the wrong answer, because once
> the cache sees the eof it will never issue another
> read.

If the file is "decent", the cache must still check that the file is up to date. It might not do so every time (as we do, to trade for performance, as you say). That means that the cache would fetch further file data as soon as it sees a new qid.vers for the file, and tail -f would still work.

However, for some indecent files ;), the cache may have problems even if it trusts the file length as reported by the server, or the qid.vers. QTAPPEND is indeed something that says the file is weird; QTCTL would just signal the general case, not just a +a file.

I can do a quick experiment using Op, just to see whether, by faking up some QTCTLs in the Op server, the client can work with all files, even clone ones, and see what happens. I'm not seeking coherency; I'd just like to be able to cache what I can (keeping races as they are), to better tolerate latency.

Thanks a lot for all the comments, btw.

> It is a fundamental problem with implementing caching atop a system
> that is not intended to be cached. Having a QTCTL bit (or a
> QTOKTOCACHE bit) will not solve the problem.
>
> Cfs is not magic. It trades some of the reliability of 9P for some
> performance. It doesn't do a perfect job. If you choose to use cfs
> then you are accepting those degradations in semantics, even for
> "disk files".
>
> What you really need is a way to ask the server "can I cache the
> following?" and have the server say yes or no and then have some way
> to invalidate the cache, so that you get coherent behavior, even in
> the above case. We discussed various ways to add this to the protocol
> but ultimately we didn't see any way that was simple enough that the
> specification effort wasn't outweighed by our not needing to solve
> the problem at that time. (We did add QTAPPEND to fix one glaring cfs
> bug.)
>
> By all means experiment with real caching protocols using 9P. Perhaps
> you will find a nice way to add it and then 9P2010 can adopt it.
> QTCTL isn't enough though: it pushes your problems farther away but
> doesn't solve them.
>
> Russ
* Re: [9fans] QTCTL? 2007-10-31 22:26 ` Francisco J Ballesteros @ 2007-10-31 22:37 ` Charles Forsyth 2007-10-31 22:43 ` Francisco J Ballesteros 2007-10-31 23:32 ` Eric Van Hensbergen 2007-10-31 23:54 ` erik quanstrom 1 sibling, 2 replies; 76+ messages in thread From: Charles Forsyth @ 2007-10-31 22:37 UTC (permalink / raw) To: 9fans > QTCL would ... perhaps one of my points is that in the Plan 9/Inferno world, you'd be better off marking the files you can cache, not trying to identify the ones that you can't. one reason is the practical one of having to touch perhaps two or maybe three important servers instead of everything. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 22:37 ` Charles Forsyth @ 2007-10-31 22:43 ` Francisco J Ballesteros 2007-10-31 23:32 ` Eric Van Hensbergen 1 sibling, 0 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-10-31 22:43 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I was missing your point. It seems much easier that way. I shall try with QTDECENT instead of the other way around. On 10/31/07, Charles Forsyth <forsyth@terzarima.net> wrote: > > QTCL would ... > > perhaps one of my points is that in the Plan 9/Inferno world, you'd > be better off marking the files you can cache, not trying to identify the ones > that you can't. one reason is the practical one of having to touch perhaps two or > maybe three important servers instead of everything. > > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 22:37 ` Charles Forsyth 2007-10-31 22:43 ` Francisco J Ballesteros @ 2007-10-31 23:32 ` Eric Van Hensbergen 2007-10-31 23:41 ` [V9fs-developer] " Charles Forsyth [not found] ` <606b6f003ae9f0ed3e8c3c5f90ddc720@terzarima.net> 1 sibling, 2 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-10-31 23:32 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: V9FS Developers On 10/31/07, Charles Forsyth <forsyth@terzarima.net> wrote: > > QTCL would ... > > perhaps one of my points is that in the Plan 9/Inferno world, you'd > be better off marking the files you can cache, not trying to identify the ones > that you can't. one reason is the practical one of having to touch perhaps two or > maybe three important servers instead of everything. > That makes a lot of sense -- particularly in a Plan 9 context. It should be straightforward enough to modify appropriate "static" content file servers to set this bit. The current range of non-Plan9 servers (u9fs, spfs, etc.) presents a bit of difficulty in that there isn't a good way to transitively detect this sort of thing when, say, a p9p server is mounted on Linux and then re-exported, mapping it through the Linux VFS space. But then I suppose that's just a matter of not using the aforementioned tools to export "special" files or file systems. I've got some ideas to fix this that I want to play with using Lucho's new in-Linux-kernel-9p-server. Of course, all of this is probably a corner case that most people don't see anyway. In the context of FastOS, dealing with thousands of nodes, we are going to need to implement more sophisticated forms of caching. Since this is going to be pretty critical to even booting at larger scale, it's likely we'll be digging into this early next year. We'll keep folks in the loop as we get prototypes working. -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [V9fs-developer] [9fans] QTCTL? 2007-10-31 23:32 ` Eric Van Hensbergen @ 2007-10-31 23:41 ` Charles Forsyth [not found] ` <606b6f003ae9f0ed3e8c3c5f90ddc720@terzarima.net> 1 sibling, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-10-31 23:41 UTC (permalink / raw) To: ericvh, 9fans; +Cc: v9fs-developer > In the context of FastOS, dealing with thousands of nodes, we are > going to need to implement more sophisticated forms of caching. Since > this is going to be pretty critical to even booting at larger scale, > its likely we'll be digging into this early next year. We'll keep > folks in the loop as we get prototypes working. i didn't think that would lead to requiring new bits as markers anywhere, though in the protocol as such ^ permalink raw reply [flat|nested] 76+ messages in thread
[parent not found: <606b6f003ae9f0ed3e8c3c5f90ddc720@terzarima.net>]
* Re: [V9fs-developer] [9fans] QTCTL? [not found] ` <606b6f003ae9f0ed3e8c3c5f90ddc720@terzarima.net> @ 2007-11-01 1:13 ` Eric Van Hensbergen 0 siblings, 0 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 1:13 UTC (permalink / raw) To: Charles Forsyth; +Cc: v9fs-developer, 9fans On 10/31/07, Charles Forsyth <forsyth@terzarima.net> wrote: > > In the context of FastOS, dealing with thousands of nodes, we are > > going to need to implement more sophisticated forms of caching. Since > > this is going to be pretty critical to even booting at larger scale, > > its likely we'll be digging into this early next year. We'll keep > > folks in the loop as we get prototypes working. > > i didn't think that would lead to requiring new bits as markers anywhere, though > in the protocol as such > True - it was more of a general statement indicating we'd be looking at the issues and probably experimenting with things like leases and invalidations, etc. I didn't really go into detail because we haven't really discussed which paths we are going to explore. -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 22:26 ` Francisco J Ballesteros 2007-10-31 22:37 ` Charles Forsyth @ 2007-10-31 23:54 ` erik quanstrom 2007-11-01 0:03 ` Charles Forsyth 2007-11-01 1:25 ` Eric Van Hensbergen 1 sibling, 2 replies; 76+ messages in thread From: erik quanstrom @ 2007-10-31 23:54 UTC (permalink / raw) To: 9fans > If the file is "decent", the cache must still check out > that the file is up to date. It might not do so all the times > (as we do, to trade for performance, as you say). That means > that the cache would get further file data as soon as it sees > a new qid.vers for the file. And tail -f would still work. the problem is that no files are "decent" as long as concurrent access is allowed. "control" files have the decency at least to behave the same way all the time. if one goes down the road of client-side caching, i think concurrency issues need to be taken seriously. otherwise it's like having a multiprogramming kernel without locks. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 23:54 ` erik quanstrom @ 2007-11-01 0:03 ` Charles Forsyth 2007-11-01 1:25 ` Eric Van Hensbergen 1 sibling, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 0:03 UTC (permalink / raw) To: 9fans > if one goes down the road of client-side caching, i think > concurrency issues need to be taken seriously. otherwise > it's like having a multiprogramming kernel without locks. it's probably somewhat worse. in the latter case the system as a whole will probably fail quickly, badly. with the former, it's possible the system will survive and even that an application won't crash outright, but you'll see odd effects. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 23:54 ` erik quanstrom 2007-11-01 0:03 ` Charles Forsyth @ 2007-11-01 1:25 ` Eric Van Hensbergen 2007-11-01 1:44 ` erik quanstrom 2007-11-01 7:34 ` Skip Tavakkolian 1 sibling, 2 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 1:25 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 10/31/07, erik quanstrom <quanstro@quanstro.net> wrote: > > If the file is "decent", the cache must still check out > > that the file is up to date. It might not do so all the times > > (as we do, to trade for performance, as you say). That means > > that the cache would get further file data as soon as it sees > > a new qid.vers for the file. And tail -f would still work. > > the problem is that no files are "decent" as long as concurrent > access is allowed. "control" files have the decency at least > to behave the same way all the time. > Sure - however, there is a case for loose caches as well. For example, lots of remote file data is essentially read-only, or at the very worst it's updated very infrequently. Brucee had sessionfs, which although more specialized (I'm going to oversimplify here Brzr, so don't shoot me), could essentially be thought of as serving a snapshot of the system. You could cache to your heart's content because you'd always be reading from the same snapshot. If you ever wanted to roll the snapshot forward, you could blow away the cache -- or for optimum safety, restart the entire node. With such a mechanism you could even keep the cache around on disk for long periods of time (as long as the session was still exported by the file server). -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 1:25 ` Eric Van Hensbergen @ 2007-11-01 1:44 ` erik quanstrom 2007-11-01 2:15 ` Eric Van Hensbergen 2007-11-01 7:34 ` Skip Tavakkolian 1 sibling, 1 reply; 76+ messages in thread From: erik quanstrom @ 2007-11-01 1:44 UTC (permalink / raw) To: 9fans > Sure - however, there is a case for loose caches as well. For example, > lots of remote file data is essentially read-only, or at the very > worst it's updated very infrequently. Brucee had i might be speaking out of school. but i worry about the qualifiers "essentially" and "very infrequently". they tend not to scale. what about drawing a sharp line? these mounts are static and cachable. these are not and need coherency. perhaps the data that needs cache coherency doesn't need full file semantics. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 1:44 ` erik quanstrom @ 2007-11-01 2:15 ` Eric Van Hensbergen 0 siblings, 0 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 2:15 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Wed, 31 Oct 2007 8:44 pm, erik quanstrom wrote: >> Sure - however, there is a case for loose caches as well. For example, >> lots of remote file data is essentially read-only, or at the very >> worst its updated very infrequently. Brucee had > > i might be speaking out of school. but i worry about the qualifiers > "essentially" and "very infrequently". they tend not to scale. > > what about drawing a sharp line? these mounts are static and > cachable. these are not and need coherency. Yes - sessionfs satisfied the first case, items falling into the second class were served from a normal 9p server (w/no cache). > > perhaps the > data that needs cache coherency doesn't need full file sematics. > I think they are two separate issues. -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 1:25 ` Eric Van Hensbergen 2007-11-01 1:44 ` erik quanstrom @ 2007-11-01 7:34 ` Skip Tavakkolian 1 sibling, 0 replies; 76+ messages in thread From: Skip Tavakkolian @ 2007-11-01 7:34 UTC (permalink / raw) To: 9fans a side note, intro(5) says "The version is a version number for a file; typically, it is incremented every time the file is modified." wouldn't that mean that for devices, the version should change every time you read them? it's interesting that httpd (actually sendfd) already uses qid.path and qid.vers to generate the entity tag (ETag) header, and pays attention to the if-none-match header. perhaps Op's Tget can have a similar parameter using path/version. i keep wanting to go back to our proposal regarding Text/Rext extension messages. for caching, instead of a Tread a Text("read if-modified") request is sent. an advantage of Op is that Tget, for example, combines walk/open/read/clunk into one request to optimize for high latency networks. Text("get /a/b/foo") could do the same. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-10-31 21:40 ` Russ Cox 2007-10-31 22:11 ` Charles Forsyth 2007-10-31 22:26 ` Francisco J Ballesteros @ 2007-11-01 6:21 ` Bakul Shah 2007-11-01 14:28 ` Russ Cox 2 siblings, 1 reply; 76+ messages in thread From: Bakul Shah @ 2007-11-01 6:21 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > What you really need is a way to ask the server "can I > cache the following?" and have the server say yes or no > and then have some way to invalidate the cache, so that > you get coherent behavior, even in the above case. > We discussed various ways to add this to the protocol > but ultimately we didn't see any way that was simple > enough that the specification effort wasn't outweighed > by our not needing to solve the problem at that time. Do you recall what the issues were? Wouldn't something like load-linked/store-conditional suffice if the common case is a single writer? When the client does a "read-linked" call, the server sends an ID along with the data. The client can then do a "write-conditional" by passing the original ID and new data. If the ID is not valid anymore (if someone else wrote in the meantime) the write fails. The server doesn't have to keep any client state or inform anyone about an invalid cache. Of course, if any client fails to follow this protocol things fall apart but at least well behaved clients can get coherency. And this would work for cases such as making changes on a disconnected laptop and resyncing to the main server on the next connect. You wouldn't use this for synthetic files. This ID can be as simple as a file "generation" number incremented on each write or crypto strong checksum. As someone (Terje Mathiesen?) said all programming is an exercise in caching. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 6:21 ` Bakul Shah @ 2007-11-01 14:28 ` Russ Cox 2007-11-01 14:38 ` erik quanstrom ` (3 more replies) 0 siblings, 4 replies; 76+ messages in thread From: Russ Cox @ 2007-11-01 14:28 UTC (permalink / raw) To: 9fans > Do you recall what the issues were? The main proposal I remember was to give out read and write tokens and then revoke them as appropriate. The two big problems with this were that it meant having spontaneous server->client->server message exchange instead of the usual client->server->client (a very large semantic break with existing 9P) and that once you had granted a read token to a client the client could simply stop responding (acknowledging the revocation) and slow down the rest of the system. I think that an ideal caching solution would have the following properties. 1. For 9P servers that don't care to let their files be cached, the effort in doing so should be minimal. For example, perhaps servers would simply respond to a Tcache message with Rerror, like non-authenticating servers respond to Tauth with Rerror. Anything more is not going to get implemented. 2. All 9P messages would still be client->server->client. This fits with #1, but also excludes solutions that introduce new server->client->server messages after a successful Tcache. 3. If a client that is caching some data stops responding, the rest of the system can continue to function without it: slow clients don't slow the entire system. 4. Except for timing, the cached behavior is identical to what you'd see in an uncached setting, not some relaxed semantics. For example, suppose you adopted a model where each server response could have some cache invalidations piggybacked on it. This would provide a weak but precise consistency model in that any cached behaviors observed interacting with that server would be the same as shifting uncached behavior back in time a bit. 
It could be made to appear that the machine was just a few seconds behind the server, but otherwise fully consistent. The problem with this is when multiple machines are involved, and since Plan 9 is a networked system, this happens. For example, a common setup is for one machine to spool mail into /mail/queue and then run rx to another machine to kick the queue processor (the mail sender). If the other machine is behaving like it's 5 seconds behind, then it won't see the mail that just got spooled, making the rx kick worthless. 5. It is easy to get right. #1 is trivial. #2 and #3 are difficult and point to some kind of lease-based solution instead of tokens. #4 keeps us honest: weakened consistency like in my example or in cfs(4) or in recover(4) might occasionally be useful, but it will break important and subtle real-world cases and make the system a lot more fragile. If you pile up enough things that only work 99% of the time, you very quickly end up with a crappy system. (If that's what you want, might I suggest Linux?) #5 is probably wishful thinking on my part. > Wouldn't something like load-linked/store-conditional suffice > if the common case is a single writer? When the client does > a "read-linked" call, the server sends an ID along with the > data. The client can then do a "write-conditional" by passing > the original ID and new data. If the ID is not valid anymore > (if someone else wrote in the meantime) the write fails. The > server doesn't have to keep any client state or inform anyone > about an invalid cache. Of course, if any client fails to > follow this protocol things fall apart but at least well > behaved clients can get coherency. And this would work for > cases such as making changes on a disconnected laptop and > resyncing to the main server on the next connect. You > wouldn't use this for synthetic files. This ID can be as > simple as a file "generation" number incremented on each > write or crypto strong checksum. 
This doesn't solve the problem of one client caching the file contents and another writing to the file; how does the first find out that the file has changed before it uses the cached contents again? Russ ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 14:28 ` Russ Cox @ 2007-11-01 14:38 ` erik quanstrom 2007-11-01 14:41 ` Charles Forsyth ` (2 subsequent siblings) 3 siblings, 0 replies; 76+ messages in thread From: erik quanstrom @ 2007-11-01 14:38 UTC (permalink / raw) To: 9fans i haven't thought this through, but perhaps this would be an easier problem if we didn't change 9p, but we changed the model of the caching server. the current proposals assume that the caching servers don't know that other caches are out there. alternatively, the caching servers participate in a coherency protocol with 9p clients none-the-wiser. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 14:28 ` Russ Cox 2007-11-01 14:38 ` erik quanstrom @ 2007-11-01 14:41 ` Charles Forsyth 2007-11-01 15:26 ` Sape Mullender 2007-11-01 16:59 ` Bakul Shah 3 siblings, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 14:41 UTC (permalink / raw) To: 9fans >with this were that it meant having spontaneous server->client->server >message exchange instead of the usual client->server->client > 2. All 9P messages would still be client->server->client. > This fits with #1, but also excludes solutions that introduce > new server->client->server messages after a successful Tcache. for similar things, i typically have a file on which the client reads messages from the service. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 14:28 ` Russ Cox 2007-11-01 14:38 ` erik quanstrom 2007-11-01 14:41 ` Charles Forsyth @ 2007-11-01 15:26 ` Sape Mullender 2007-11-01 15:51 ` Latchesar Ionkov 2007-11-01 16:59 ` Bakul Shah 3 siblings, 1 reply; 76+ messages in thread From: Sape Mullender @ 2007-11-01 15:26 UTC (permalink / raw) To: 9fans > 2. All 9P messages would still be client->server->client. > This fits with #1, but also excludes solutions that introduce > new server->client->server messages after a successful Tcache. And there is the quandary. Allowing server->client->server messages (aka callbacks) complicates 9P beyond anything acceptable. On the other hand, these call-backs make the following possible: 1. Client obtains a lease to a file (say valid for exclusive access in the next five minutes) 2. Server needs the file for read or exclusive access by another client after one minute and wants the lease returned early. It initiates a callback. 3. Client flushes all data back to the server (i.e., it performs a series of writes) 4. Client responds to the callback 5. Server gives a lease to another client. This is the sequence of actions that maintains consistency for all parties obeying the protocol. Not obeying (e.g. ignoring callbacks, not writing back dirty data) will slow the system down (server must wait for lease to expire) or will just harm the client not obeying. Leases are a really good idea, but for the complexity of callbacks. If we can have this functionality without callbacks, that would be really nice. One could have only client-server-client calls like this: Tcache asks whether the server is prepared to cache Rcache makes lease available with parameters, Rerror says no. Tlease says, ok start my lease now (almost immediately follows Rcache) Rlease lease expired or lease needs to be given back early Tcache done with old lease (may immediately ask for a new lease) etc. 
So Tcache serves two purposes: it gives up an old lease if one existed and immediately asks for a new one if one is needed. This might give all the functionality we need without using callbacks. (Of course, the client still needs a proc waiting for that Rlease while doing its reads and writes). Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
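The lease bookkeeping behind Sape's Tcache/Tlease/Rlease sequence can be sketched on the server side as below. The names, the plain-seconds time source, and the single-lease-per-file simplification are all assumptions made for illustration, not part of any real 9P implementation.

```c
/* A lease is granted with an absolute expiry.  Another client asking
 * for access while a lease is outstanding must wait until the holder
 * returns it (after flushing dirty data) or until it expires -- which
 * is how a misbehaving client harms only itself plus some waiting. */

typedef struct Lease Lease;
struct Lease {
	int held;
	long expiry;	/* absolute time, seconds */
};

/* grant a lease for `secs' seconds if it is free or has expired */
int
grantlease(Lease *l, long now, long secs)
{
	if(l->held && now < l->expiry)
		return -1;	/* someone else holds it; caller must wait */
	l->held = 1;
	l->expiry = now + secs;
	return 0;
}

/* holder gives the lease back early (after writing back dirty data) */
void
returnlease(Lease *l)
{
	l->held = 0;
}

/* may another client proceed at time `now'? */
int
leasefree(Lease *l, long now)
{
	return !l->held || now >= l->expiry;
}
```

The expiry is what makes property #3 of Russ's list work: a client that stops responding delays others by at most the lease term, rather than forever.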
* Re: [9fans] QTCTL? 2007-11-01 15:26 ` Sape Mullender @ 2007-11-01 15:51 ` Latchesar Ionkov 2007-11-01 16:04 ` ron minnich ` (2 more replies) 0 siblings, 3 replies; 76+ messages in thread From: Latchesar Ionkov @ 2007-11-01 15:51 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Nov 1, 2007, at 9:26 AM, Sape Mullender wrote: > One could have only client-server-client calls like this: > > Tcache asks whether the server is prepared to cache > Rcache makes lease available with parameters, Rerror says no. > > Tlease says, ok start my lease now (almost immediately follows Rcache) > Rlease lease expired or lease needs to be given back early > > Tcache done with old lease (may immediately ask for a new lease) > etc. > > So Tcache serves two purposes: it gives up an old lease if one existed > and immediately asks for a new one if one is needed. > > This might give all the functionality we need without using callbacks. > (Of course, the client still needs a proc waiting for that Rlease > while > doing its reads and writes). In the case of a read cache (which is probably going to be used more often than a write cache), the client needs to send two RPCs every time a writer modifies the cached file. What if Rlease doesn't necessarily break the lease, but has an option (negotiated in Tcache) to let the client know that the file has changed without breaking the lease. Thanks, Lucho ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 15:51 ` Latchesar Ionkov @ 2007-11-01 16:04 ` ron minnich 2007-11-01 16:16 ` Latchesar Ionkov ` (3 more replies) 2007-11-01 16:17 ` Sape Mullender 2007-11-01 16:58 ` Sape Mullender 2 siblings, 4 replies; 76+ messages in thread From: ron minnich @ 2007-11-01 16:04 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Why not just have a file that a client reads that lets the client know of changes to files. client opens this server-provided file ("changes"? "dnotify"?) Server agrees to send client info about all FIDS which client has active that are changing. Form of the message? fid[4]offset[8]len[4] It's up to the client to figure out what to do. if the client doesn't care, no extra server overhead. no new T*, no callbacks (which i can tell you are horrible when you get to bigger machines -- having an 'ls' take 30 minutes is no fun). No leases. The fact is we have loose consistency now, we just don't call it that. Anytime you are running a file from a server, you have loose consistency. It works ok in most cases. ron ^ permalink raw reply [flat|nested] 76+ messages in thread
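Ron's change-notification record, fid[4]offset[8]len[4], would be unpacked by a client in 9P's little-endian wire order. The sketch below is illustrative only: the struct and function names are invented, and a real client would loop over whatever a read of the hypothetical "changes" file returned.

```c
/* Unpack one fid[4]offset[8]len[4] record, little-endian as in 9P. */

typedef unsigned char uchar;
typedef unsigned long long uvlong;

typedef struct Change Change;
struct Change {
	unsigned int fid;	/* client's fid for the changed file */
	uvlong offset;		/* where the change starts */
	unsigned int len;	/* how many bytes changed */
};

enum { Changesize = 4+8+4 };

static unsigned int
get32(uchar *p)
{
	return p[0] | p[1]<<8 | p[2]<<16 | (unsigned int)p[3]<<24;
}

static uvlong
get64(uchar *p)
{
	return get32(p) | (uvlong)get32(p+4)<<32;
}

/* parse one record read from the "changes" file; -1 on a short read */
int
unpackchange(uchar *buf, int n, Change *c)
{
	if(n < Changesize)
		return -1;	/* wait for the rest of the record */
	c->fid = get32(buf);
	c->offset = get64(buf+4);
	c->len = get32(buf+12);
	return Changesize;
}
```

On each record the client would simply drop any cached data it holds for that fid in the given range; a client that never opens the file pays nothing, which is the point of the proposal.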
* Re: [9fans] QTCTL? 2007-11-01 16:04 ` ron minnich @ 2007-11-01 16:16 ` Latchesar Ionkov 2007-11-01 16:21 ` Sape Mullender ` (2 subsequent siblings) 3 siblings, 0 replies; 76+ messages in thread From: Latchesar Ionkov @ 2007-11-01 16:16 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Leases are good for purposes other than caching, for example locking. I don't see much difference if the protocol is going to define a special filename or a new message. There are other small details that need to be solved -- the server and the client need to be extra careful that no events fall between the cracks (i.e. between a Rread and the subsequent Tread on the special file). On Nov 1, 2007, at 10:04 AM, ron minnich wrote: > Why not just have a file that a client reads that lets the client know > of changes to files. > > client opens this server-provided file ("changes"? "dnotify"?) > > Server agrees to send client info about all FIDS which client has > active that are changing. Form of the message? > fid[4]offset[8]len[4] > > It's up to the client to figure out what to do. > > if the client doesn't care, no extra server overhead. > > no new T*, no callbacks (which i can tell you are horrible when you > get to bigger machines -- having an 'ls' take 30 minutes is no fun). > No leases. > > The fact is we have loose consistency now, we just don't call it that. > Anytime you are running a file from a server, you have loose > consistency. It works ok in most cases. > > ron ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:04 ` ron minnich 2007-11-01 16:16 ` Latchesar Ionkov @ 2007-11-01 16:21 ` Sape Mullender 2007-11-01 16:58 ` Francisco J Ballesteros 3 siblings, 1 reply; 76+ messages in thread From: Sape Mullender @ 2007-11-01 16:21 UTC (permalink / raw) To: 9fans > Why not just have a file that a client reads that lets the client know > of changes to files. A bit better, but the comment I just made about breaking single-copy semantics still holds. The point is that merely notifying the client isn't enough. The server should wait for an acknowledgement to that notification (which possibly doesn't arrive until after the client has flushed its updates from the cache). Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:21 ` Sape Mullender @ 2007-11-01 16:58 ` Francisco J Ballesteros 2007-11-01 17:11 ` Charles Forsyth ` (2 more replies) 0 siblings, 3 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 16:58 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I was thinking about something like Tinval (asks the server for invalidations for files seen) Rinval (reports new invalidations) Tinval (asks for further invals, and lets the server know that Rinval was seen by the client) This is similar to the "changes" file proposed above, but it's simple, does not require two new RPCs (a server would respond to a Tinval with an Rerror (unknown request or whatever)), is not an upcall (although it behaves as one), and may both let the client know which files changed and which cache entries are invalid. We did consider this, but having all terminals connected to slow links to the central fs means that all fs activity might be slowed down by the link with the worst latency. However, my experience using this thing says that (at least in my case) I'm using at most one remote terminal at a time, or a bunch of well connected terminals. Which might suggest that this Tinval thing might pay. Time to experiment, perhaps. On 11/1/07, Sape Mullender <sape@plan9.bell-labs.com> wrote: > > Why not just have a file that a client reads that lets the client know > > of changes to files. > > A bit better, but the comment I just made about breaking single-copy > semantics still holds. The point is that merely notifying the client > isn't enough. The server should wait for an acknowledgement to that > notification (which possibly doesn't arrive until after the client has > flushed its updates from the cache). > > Sape > > ^ permalink raw reply [flat|nested] 76+ messages in thread
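The server end of the Tinval/Rinval exchange amounts to a per-connection queue of invalidations, answered whenever a Tinval is outstanding. A minimal sketch, with a fixed-size ring and invented names standing in for a real implementation:

```c
/* Queue qid.paths of changed files for one client; the client's
 * pending Tinval is answered from the head of the queue. */

enum { Nq = 32 };

typedef struct Invalq Invalq;
struct Invalq {
	unsigned long long path[Nq];	/* qid.paths of changed files */
	int r, w;			/* read and write counters */
};

/* a file changed: remember to tell this client */
int
queueinval(Invalq *q, unsigned long long path)
{
	if(q->w - q->r >= Nq)
		return -1;	/* client too slow; needs flow control */
	q->path[q->w++ % Nq] = path;
	return 0;
}

/* answer a Tinval: -1 means nothing to report yet (hold the request) */
int
nextinval(Invalq *q, unsigned long long *path)
{
	if(q->r == q->w)
		return -1;
	*path = q->path[q->r++ % Nq];
	return 0;
}
```

Because the next Tinval doubles as the acknowledgement of the previous Rinval, the server knows how far behind each client is, which is Sape's requirement that notification alone is not enough.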
* Re: [9fans] QTCTL? 2007-11-01 16:58 ` Francisco J Ballesteros @ 2007-11-01 17:11 ` Charles Forsyth 2007-11-01 17:11 ` Francisco J Ballesteros 2007-11-01 17:13 ` Sape Mullender 2007-11-01 17:38 ` ron minnich 2 siblings, 1 reply; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 17:11 UTC (permalink / raw) To: 9fans > This is similar to the "changes" file proposed above, but it's simple, does not > require two new RPCs (a server would respond to a Tinval with an Rerror > (unknown request or whatever), is not an upcall (although behaves as one) and > may both let the client know which files changed and which cache > entries are invalid. the advantage of the changes file is that it requires no new rpcs at all so you can do it today ^ permalink raw reply [flat|nested] 76+ messages in thread
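[Editor's sketch: for concreteness, one plausible shape for such a changes file is a line per invalidation. The line format and the parser below are made up, since nothing in the thread pins one down.]

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line read from a hypothetical "changes" file.
 * A line like "inval 1a2b" names the qid.path (in hex) of a
 * file whose cached copy is now stale.  Returns 1 and fills
 * *path on a match, 0 for anything else.
 */
int
parse_change(const char *line, unsigned long long *path)
{
	return sscanf(line, "inval %llx", path) == 1;
}
```

[A cache agent on the client would read the file in a loop, apply each invalidation, and acknowledge it, for instance by writing back on the file or by issuing the next read.]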
* Re: [9fans] QTCTL? 2007-11-01 17:11 ` Charles Forsyth @ 2007-11-01 17:11 ` Francisco J Ballesteros 0 siblings, 0 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 17:11 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I suppose you could implement the same Tinval Rinval Tinval .... protocol just by issuing sequential reads on a changes file, but you'd have to modify both the server and the client even if it's by using a file and not by including a new transaction in 9p. I admit that the good thing about using a file is that 9p remains untouched. On 11/1/07, Charles Forsyth <forsyth@terzarima.net> wrote: > > This is similar to the "changes" file proposed above, but it's simple, does not > > require two new RPCs (a server would respond to a Tinval with an Rerror > > (unknown request or whatever), is not an upcall (although behaves as one) and > > may both let the client know which files changed and which cache > > entries are invalid. > > the advantage of the changes file is that it requires no new rpcs at all > so you can do it today > > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:58 ` Francisco J Ballesteros 2007-11-01 17:11 ` Charles Forsyth @ 2007-11-01 17:13 ` Sape Mullender 2007-11-01 17:38 ` ron minnich 2 siblings, 0 replies; 76+ messages in thread From: Sape Mullender @ 2007-11-01 17:13 UTC (permalink / raw) To: 9fans > Tinval (asks the server for invalidations for files seen) > Rinval (reports new invalidations) > Tinval (asks for further invals, and let the server know that Rinval > was seen by client) That'll work. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:58 ` Francisco J Ballesteros 2007-11-01 17:11 ` Charles Forsyth 2007-11-01 17:13 ` Sape Mullender @ 2007-11-01 17:38 ` ron minnich 2007-11-01 17:56 ` Francisco J Ballesteros 2 siblings, 1 reply; 76+ messages in thread From: ron minnich @ 2007-11-01 17:38 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs right, but before the experiments start in earnest, see what they're doing at lustre.org, in nfs v3, etc. Much of this discussion is familiar. <shameless plug> you can also see what I did in mnfs ca. 1992, if you promise to ignore the use I put it to (DSM). I implemented invalidates for shared pages. It took, like 15 minutes to implement. It required that I run an nfs server on each client, however, and it worked because NFS blocks on a node have a global name: <fhandle>:<offset>. So the server tracked who had what pages, and an invalidate was actually a simple RPC from servers to clients. yes, this broke the c-s-c model, but hey ... I like it better than leases, personally. But neither this nor leases seems to scale terribly well to 4096 or more clients. ron ^ permalink raw reply [flat|nested] 76+ messages in thread
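[Editor's sketch: the bookkeeping ron describes can be written in a few lines. This is a toy reconstruction from his summary only (global block names <fhandle>:<offset>, server tracks holders, an invalidate is an RPC to each holder), not mnfs code, and all names are invented.]

```c
#include <assert.h>
#include <string.h>

enum { NPAGE = 16, NCLIENT = 4 };

typedef struct Page Page;
struct Page {
	int inuse;
	unsigned long fhandle;		/* NFS file handle */
	unsigned long offset;		/* block offset within the file */
	int holder[NCLIENT];		/* which clients cache this block */
};

Page pages[NPAGE];

/* record that client c read block <fh>:<off> */
void
track(unsigned long fh, unsigned long off, int c)
{
	int i, freeslot = -1;

	for(i = 0; i < NPAGE; i++){
		if(pages[i].inuse && pages[i].fhandle == fh && pages[i].offset == off){
			pages[i].holder[c] = 1;
			return;
		}
		if(!pages[i].inuse && freeslot < 0)
			freeslot = i;
	}
	if(freeslot >= 0){
		memset(&pages[freeslot], 0, sizeof pages[freeslot]);
		pages[freeslot].inuse = 1;
		pages[freeslot].fhandle = fh;
		pages[freeslot].offset = off;
		pages[freeslot].holder[c] = 1;
	}
}

/*
 * On a write, return how many clients (other than the writer)
 * must be sent an invalidate RPC, clearing the holders as we go.
 */
int
invalidate(unsigned long fh, unsigned long off, int writer)
{
	int i, c, n = 0;

	for(i = 0; i < NPAGE; i++)
		if(pages[i].inuse && pages[i].fhandle == fh && pages[i].offset == off)
			for(c = 0; c < NCLIENT; c++)
				if(pages[i].holder[c] && c != writer){
					pages[i].holder[c] = 0;
					n++;
				}
	return n;
}
```

[The global <fhandle>:<offset> name is what makes this cheap: the server never needs to translate per-client state, it just looks the block up and fires one RPC per remaining holder.]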
* Re: [9fans] QTCTL? 2007-11-01 17:38 ` ron minnich @ 2007-11-01 17:56 ` Francisco J Ballesteros 2007-11-01 18:01 ` Francisco J Ballesteros ` (2 more replies) 0 siblings, 3 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 17:56 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs We did look at nfs v3 and lbfs before implementing op. nfs v3 at least tried to address latency and not just bandwidth, as lbfs does, but it seemed to use more RPCs than needed for some tasks (I don't remember now, but have that written down somewhere). On 11/1/07, ron minnich <rminnich@gmail.com> wrote: > right, but before the experiments start in earnest, see what they're > doing at lustre.org, in nfs v3, etc. > > Much of this discussion is familiar. > > <shameless plug> you can also see what I did in mnfs ca. 1992, if you > promise to ignore the use I put it to (DSM). I implemented invalidates > for shared pages. It took, like 15 minutes to implement. It required > that I run an nfs server on each client, however, and it worked > because NFS blocks on a node have a global name: <fhandle>:<offset>. > So the server tracked who had what pages, and an invalidate was > actually a simple RPC from servers to clients. yes, this broke the > c-s-c model, but hey ... I like it better than leases, personally. > > But neither this nor leases seems to scale terribly well to 4096 or > more clients. > > ron > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:56 ` Francisco J Ballesteros @ 2007-11-01 18:01 ` Francisco J Ballesteros 2007-11-01 18:52 ` Eric Van Hensbergen 2007-11-01 23:24 ` ron minnich 2 siblings, 0 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 18:01 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs One thing that Op taught us was that it's quite nice to add changes in the protocol just by NOT changing the protocol, and putting special servers-clients as gateways. We could implement an export for caching clients (and a caching client) so that the export implements the Tinval-Rinval thing, and serves the caching clients. The export process may export its own namespace (perhaps just with the main fileserver mounted) and the caching clients would behave as 9p servers to others in their machines. This worked just fine in Inferno and could work if we try it in Plan 9. Also, I'm now thinking that we could take advantage of these new xport-cache links to avoid some RPCs. To maintain 9p semantics we could not use op in there, but perhaps there's something in the middle. I have to give this all more thought. Any comment on this, though? On 11/1/07, Francisco J Ballesteros <nemo@lsub.org> wrote: > We did look at nfs v3 and lbfs before implementing op. > > nfs v3 at least tried to address latency and not just bandwidth as lbfs, > but it seemed to use more RPCs than needed for some tasks (I don't remember > now, but have that written down somewhere). > > > > On 11/1/07, ron minnich <rminnich@gmail.com> wrote: > > right, but before the experiments start in earnest, see what they're > > doing at lustre.org, in nfs v3, etc. > > > > Much of this discussion is familiar. > > > > <shameless plug> you can also see what I did in mnfs ca. 1992, if you > > promise to ignore the use I put it to (DSM). I implemented invalidates > > for shared pages. It took, like 15 minutes to implement. 
It required > > that I run an nfs server on each client, however, and it worked > > because NFS blocks on a node have a global name: <fhandle>:<offset>. > > So the server tracked who had what pages, and an invalidate was > > actually a simple RPC from servers to clients. yes, this broke the > > c-s-c model, but hey ... I like it better than leases, personally. > > > > But neither this nor leases seems to scale terribly well to 4096 or > > more clients. > > > > ron > > > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:56 ` Francisco J Ballesteros 2007-11-01 18:01 ` Francisco J Ballesteros @ 2007-11-01 18:52 ` Eric Van Hensbergen 2007-11-01 19:29 ` Francisco J Ballesteros 2007-11-01 23:24 ` ron minnich 2 siblings, 1 reply; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 18:52 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 11/1/07, Francisco J Ballesteros <nemo@lsub.org> wrote: > We did look at nfs v3 and lbfs before implementing op. > > nfs v3 at least tried to address latency and not just bandwidth as lbfs, > but it seemed to use more RPCs than needed for some tasks (I don't remember > now, but have that written down somewhere). > IIRC, NFSv4 has more lease negotiation stuff as well as compound operations. Of course it's like a 500-page spec and looks to be more of an example of what not to do IMHO... -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:52 ` Eric Van Hensbergen @ 2007-11-01 19:29 ` Francisco J Ballesteros 0 siblings, 0 replies; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 19:29 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Sorry. My fault. I meant v4. On 11/1/07, Eric Van Hensbergen <ericvh@gmail.com> wrote: > On 11/1/07, Francisco J Ballesteros <nemo@lsub.org> wrote: > > We did look at nfs v3 and lbfs before implementing op. > > > > nfs v3 at least tried to address latency and not just bandwidth as lbfs, > > but it seemed to use more RPCs than needed for some tasks (I don't remember > > now, but have that written down somewhere). > > > > IIRC, NFSv4 has more lease negotiation stuff as well as compound operations. > Of course it's like a 500-page spec and looks to be more of an example > of what not to do IMHO... > > -eric > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:56 ` Francisco J Ballesteros 2007-11-01 18:01 ` Francisco J Ballesteros 2007-11-01 18:52 ` Eric Van Hensbergen @ 2007-11-01 23:24 ` ron minnich 2 siblings, 0 replies; 76+ messages in thread From: ron minnich @ 2007-11-01 23:24 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 11/1/07, Francisco J Ballesteros <nemo@lsub.org> wrote: > We did look at nfs v3 and lbfs before implementing op. > > nfs v3 at least tried to address latency and not just bandwidth as lbfs, > but it seemed to use more RPCs than needed for some tasks (I don't remember > now, but have that written down somewhere). it's all new and better in nfs v4.1 and lustre! more features! ron ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:04 ` ron minnich 2007-11-01 16:16 ` Latchesar Ionkov 2007-11-01 16:21 ` Sape Mullender @ 2007-11-01 17:03 ` Russ Cox 2007-11-01 17:12 ` Sape Mullender ` (3 more replies) 2007-11-01 17:14 ` Bakul Shah 3 siblings, 4 replies; 76+ messages in thread From: Russ Cox @ 2007-11-01 17:03 UTC (permalink / raw) To: 9fans > The fact is we have loose consistency now, we just don't call it that. > Anytime you are running a file from a server, you have loose > consistency. It works ok in most cases. Because all reads and writes go through to the server, all file system operations on a particular server are globally ordered, *and* that global ordering matches the actual sequence of events in the physical world (because clients wait for R-messages). That's a pretty strong consistency statement! Any revocation-based system has to have the server wait for an acknowledgement from the client. If there is no wait, then between the time that the server sends the "oops, stop caching this" and the client processes it, the client might incorrectly use the now-invalid data. That's why a change file doesn't provide the same consistency guarantees as pushing all reads/writes to the file server. To get those, revocations fundamentally must wait for the client. It's also why this doesn't work: > Tcache asks whether the server is prepared to cache > Rcache makes lease available with parameters, Rerror says no. > > Tlease says, ok start my lease now (almost immediately follows Rcache) > Rlease lease expired or lease needs to be given back early > > Tcache done with old lease (may immediately ask for a new lease) > etc. because the Rlease/Tcache sequence is an s->c->s message. If a client doesn't respond with the Tcache to formally give up the lease, the server has no choice but to wait. 
If you are willing to assume that each machine has a real-time clock that runs approximately at the same rate (so that different machines agree on what 5 seconds means, but not necessarily what time it is right now), then you can fix the above messages by saying that the client lease is only good for a fixed time period (say 5 seconds) from the time that the client sent the Tlease. Then the server can overestimate the lease length as 5 seconds from when it sent the Rlease, and everything is safe. And if the server sends a Rlease and the client doesn't respond with a Tcache to officially renounce the lease, the server can just wait until Tlease + 5 seconds and go on. But that means the client has to be renewing the lease every 5 seconds (more frequently, actually). Also, in the case where the lease is just expiring but not being revoked, then you have to have some mechanism for establishing the new lease before the old one runs out. If there is a time between two leases when you don't hold any leases, then all your cached data becomes invalid. The following works: Tnewlease asks for a new lease Rnewlease grants the lease, for n seconds starting at time of Tnewlease Trenewlease asks to renew the lease Rrenewlease grants the renewal for n seconds starting at time of Trenewlease Now if the server needs to revoke the lease, it just refuses to renew and waits until the current lease expires. You can add a pseudo-callback to speed up revocation with a cooperative client: Tneeditback offers to give lease back to server early Rneeditback says I accept your offer, please do Tdroplease gives lease back Rdroplease says okay I got it (not really necessary) The lease ends when the client sends Tdroplease, *not* when the server sends Rneeditback. It can't end at Rneeditback for the same reason change files don't work. And this can *only* be an optimization, because it depends on the client sending Tdroplease. 
To get something that works in the presence of misbehaved clients you have to be able to fall back on the "wait it out" strategy. One could, of course, use a different protocol with a 9P front end. That's okay for clients, but you'd still have to teach the back-end server (i.e. fossil) to speak the other protocol directly in order to get any guarantees. (If 9P doesn't cut it then anything that's just in front of (not in place of) a 9P server can't solve the problem.) Russ ^ permalink raw reply [flat|nested] 76+ messages in thread
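[Editor's sketch: the timing rule Russ describes, with the client counting its lease from the moment it sent Tlease and the server overestimating from the moment it sent Rlease, can be written as two predicates. The names and the plain integer clocks are invented for illustration; the only assumption carried over is that both clocks tick at roughly the same rate.]

```c
#include <assert.h>

/*
 * Client side: the lease is good for n seconds counted from
 * the moment the *client* sent its Tlease.
 */
int
client_lease_valid(long tlease_sent, long now, long n)
{
	return now - tlease_sent < n;
}

/*
 * Server side: overestimate by counting n seconds from the
 * moment the *server* sent Rlease, which is necessarily later
 * than the client's Tlease.  Whenever this returns true, the
 * client's own test above has already expired, so the server
 * may safely proceed even if the client never renounced the
 * lease.
 */
int
server_lease_over(long rlease_sent, long now, long n)
{
	return now - rlease_sent >= n;
}
```

[Since tlease_sent <= rlease_sent, the client's expiry tlease_sent + n can never come after the server's rlease_sent + n, which is the whole safety argument in one inequality.]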
* Re: [9fans] QTCTL? 2007-11-01 17:03 ` Russ Cox @ 2007-11-01 17:12 ` Sape Mullender 2007-11-01 17:35 ` erik quanstrom 2007-11-01 17:13 ` Charles Forsyth ` (2 subsequent siblings) 3 siblings, 1 reply; 76+ messages in thread From: Sape Mullender @ 2007-11-01 17:12 UTC (permalink / raw) To: 9fans > It's also why this doesn't work: > >> Tcache asks whether the server is prepared to cache >> Rcache makes lease available with parameters, Rerror says no. >> >> Tlease says, ok start my lease now (almost immediately follows Rcache) >> Rlease lease expired or lease needs to be given back early >> >> Tcache done with old lease (may immediately ask for a new lease) >> etc. > > because the Rlease/Tcache sequence is an s->c->s > message. If a client doesn't respond with the Tcache > to formally give up the lease, the server has no choice > but to wait. Correct. And if the server *does* wait in that case, single-copy semantics are maintained. My assumption is that clients will, in general, be well-behaved and use Tcache to allow the server to reuse the file earlier than indicated in the lease. Any maliciousness on the part of clients in this scheme would result in (possibly one-time only) temporary denial of service to users sharing a file; such users are not usually maliciously inclined. Indeed, the Rlease/Tcache sequence forms an s->c->s interaction and the second half (c->s) is a necessary one for synchronizing the client's release of the file to the next client's access to that file. This discussion reminds me of the distributed file system discussion raging in the eighties. I'm showing my age (and nothing has changed, but much has been forgotten). Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:12 ` Sape Mullender @ 2007-11-01 17:35 ` erik quanstrom 2007-11-01 18:36 ` erik quanstrom 0 siblings, 1 reply; 76+ messages in thread From: erik quanstrom @ 2007-11-01 17:35 UTC (permalink / raw) To: 9fans > Any maliciousness on the part of clients in this scheme would result > in (possibly one-time only) temporary denial of service to users > sharing a file; such users are not usually maliciously inclined. it doesn't take malice. 1 faulty client will do. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:35 ` erik quanstrom @ 2007-11-01 18:36 ` erik quanstrom 0 siblings, 0 replies; 76+ messages in thread From: erik quanstrom @ 2007-11-01 18:36 UTC (permalink / raw) To: 9fans sorry i didn't see that email for a bit. i'm at home working on loading the dump (up to 25 feb 2006). and i got distracted taking a look at what i needed to implement the last bit of functionality on my list. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:03 ` Russ Cox 2007-11-01 17:12 ` Sape Mullender @ 2007-11-01 17:13 ` Charles Forsyth 2007-11-01 17:16 ` Charles Forsyth 2007-11-01 17:52 ` Eric Van Hensbergen 2007-11-01 18:00 ` Latchesar Ionkov 3 siblings, 1 reply; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 17:13 UTC (permalink / raw) To: 9fans > That's why a change file doesn't > provide the same consistency guarantees as pushing > all reads/writes to the file server. To get those, revocations sorry: i was assuming that when needed the client (or rather a cache control agent on the client) would acknowledge by writing back on the file. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:13 ` Charles Forsyth @ 2007-11-01 17:16 ` Charles Forsyth 2007-11-01 17:20 ` Charles Forsyth 0 siblings, 1 reply; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 17:16 UTC (permalink / raw) To: 9fans > sorry: i was assuming that when needed the client (or rather a cache control agent on the client) > would acknowledge by writing back on the file. actually any service-specific agent on the client, it's not just for cache control, which is one reason i like it more than cache-specific things actually in the protocol. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:16 ` Charles Forsyth @ 2007-11-01 17:20 ` Charles Forsyth 0 siblings, 0 replies; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 17:20 UTC (permalink / raw) To: 9fans >> sorry: i was assuming that when needed the client (or rather an agent on the client) >> would acknowledge by writing back on the file. indeed in some cases a subsequent read might be defined to acknowledge the items previously read. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:03 ` Russ Cox 2007-11-01 17:12 ` Sape Mullender 2007-11-01 17:13 ` Charles Forsyth @ 2007-11-01 17:52 ` Eric Van Hensbergen 2007-11-01 18:00 ` Latchesar Ionkov 3 siblings, 0 replies; 76+ messages in thread From: Eric Van Hensbergen @ 2007-11-01 17:52 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On 11/1/07, Russ Cox <rsc@swtch.com> wrote: > > One could, of course, use a different protocol with > a 9P front end. That's okay for clients, but you'd still > have to teach the back-end server (i.e. fossil) to speak > the other protocol directly in order to get any guarantees. > (If 9P doesn't cut it then anything that's just in front of > (not in place of) a 9P server can't solve the problem.) > Sorry, could you clarify what you mean by this? -eric ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 17:03 ` Russ Cox ` (2 preceding siblings ...) 2007-11-01 17:52 ` Eric Van Hensbergen @ 2007-11-01 18:00 ` Latchesar Ionkov 2007-11-01 18:03 ` Francisco J Ballesteros 3 siblings, 1 reply; 76+ messages in thread From: Latchesar Ionkov @ 2007-11-01 18:00 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs The 5-second lease might work in the local network case, but then not caching at all is going to work out pretty well too. What if you want to cache over the Internet and your round-trip is 3-4 seconds :) On Nov 1, 2007, at 11:03 AM, Russ Cox wrote: >> The fact is we have loose consistency now, we just don't call it >> that. >> Anytime you are running a file from a server, you have loose >> consistency. It works ok in most cases. > > Because all reads and writes go through to the > server, all file system operations on a particular > server are globally ordered, *and* that global ordering > matches the actual sequence of events in the > physical world (because clients wait for R-messages). > That's a pretty strong consistency statement! > > Any revocation-based system has to have the server > wait for an acknowledgement from the client. > If there is no wait, then between the time that the server > sends the "oops, stop caching this" and the client > processes it, the client might incorrectly use the > now-invalid data. That's why a change file doesn't > provide the same consistency guarantees as pushing > all reads/writes to the file server. To get those, revocations > fundamentally must wait for the client. > > It's also why this doesn't work: > >> Tcache asks whether the server is prepared to cache >> Rcache makes lease available with parameters, Rerror says no. >> >> Tlease says, ok start my lease now (almost immediately follows Rcache) >> Rlease lease expired or lease needs to be given back early >> >> Tcache done with old lease (may immediately ask for a new lease) >> etc. 
> [...]
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:00 ` Latchesar Ionkov @ 2007-11-01 18:03 ` Francisco J Ballesteros 2007-11-01 18:08 ` Latchesar Ionkov 0 siblings, 1 reply; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 18:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs But 5 seconds would be enough to be convinced that a client is not properly responding to invalidation requests, and revoke all its leases. Why make other clients wait for more? I mean, assuming a central FS and clients connected in a star to it. On 11/1/07, Latchesar Ionkov <lionkov@lanl.gov> wrote: > The 5 seconds lease might work in the local network case, but not > caching at all is going to work out pretty well too. What if you want > to cache over Internet and you round-trip is 3-4 seconds :) > > On Nov 1, 2007, at 11:03 AM, Russ Cox wrote: > > >> The fact is we have loose consistency now, we just don't call it > >> that. > >> Anytime you are running a file from a server, you have loose > >> consistency. It works ok in most cases. > > > > Because all reads and writes go through to the > > server, all file system operations on a particular > > server are globally ordered, *and* that global ordering > > matches the actual sequence of events in the > > physical world (because clients wait for R-messages). > > That's a pretty strong consistency statement! > > > > Any revocation-based system has to have the server > > wait for an acknowledgement from the client. > > If there is no wait, then between the time that the server > > sends the "oops, stop caching this" and the client > > processes it, the client might incorrectly use the > > now-invalid data. That's why a change file doesn't > > provide the same consistency guarantees as pushing > > all reads/writes to the file server. To get those, revocations > > fundamentally must wait for the client. 
> > [...]
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:03 ` Francisco J Ballesteros @ 2007-11-01 18:08 ` Latchesar Ionkov 2007-11-01 18:16 ` erik quanstrom 2007-11-01 18:19 ` Francisco J Ballesteros 0 siblings, 2 replies; 76+ messages in thread From: Latchesar Ionkov @ 2007-11-01 18:08 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs The problem is that the clients with higher latencies badly need to be able to cache. And the ones with better latencies can afford not caching :) On Nov 1, 2007, at 12:03 PM, Francisco J Ballesteros wrote: > But 5 seconds would be enough to be convinced that a client is not > properly > responding to invalidation requests, and cease all its leases. Why > make other > clients wait for more? I mean, assuming a central FS and clients > connected on star > to it. > > On 11/1/07, Latchesar Ionkov <lionkov@lanl.gov> wrote: >> The 5 seconds lease might work in the local network case, but not >> caching at all is going to work out pretty well too. What if you want >> to cache over Internet and you round-trip is 3-4 seconds :) >> >> On Nov 1, 2007, at 11:03 AM, Russ Cox wrote: >> >>>> The fact is we have loose consistency now, we just don't call it >>>> that. >>>> Anytime you are running a file from a server, you have loose >>>> consistency. It works ok in most cases. >>> >>> Because all reads and writes go through to the >>> server, all file system operations on a particular >>> server are globally ordered, *and* that global ordering >>> matches the actual sequence of events in the >>> physical world (because clients wait for R-messages). >>> That's a pretty strong consistency statement! >>> >>> Any revocation-based system has to have the server >>> wait for an acknowledgement from the client. >>> If there is no wait, then between the time that the server >>> sends the "oops, stop caching this" and the client >>> processes it, the client might incorrectly use the >>> now-invalid data. 
That's why a change file doesn't >>> provide the same consistency guarantees as pushing >>> all reads/writes to the file server. To get those, revocations >>> fundamentally must wait for the client. >>> >>> It's also why this doesn't work: >>> >>>> Tcache asks whether the server is prepared to cache >>>> Rcache makes lease available with parameters, Rerror says no. >>>> >>>> Tlease says, ok start my lease now (almost immediately >>>> follows Rache) >>>> Rlease lease expired or lease needs to be given back early >>>> >>>> Tcache done with old lease (may immediately ask for a new >>>> lease) >>>> etc. >>> >>> because the Rlease/Tcache sequence is a s->c->s >>> message. If a client doesn't respond with the Tcache >>> to formally give up the lease, the server has no choice >>> but to wait. >>> >>> If you are willing to assume that each machine has >>> a real-time clock that runs approximately at the >>> same rate (so that different machines agree on >>> what 5 seconds means, but not necessarily what >>> time it is right now), then you can fix the above messages >>> by saying that the client lease is only good for a fixed >>> time period (say 5 seconds) from the time that the >>> client sent the Tlease. Then the server can overestimate >>> the lease length as 5 seconds from when it sent the >>> Rlease, and everything is safe. And if the server >>> sends a Rlease and the client doesn't respond with >>> a Tcache to officially renounce the lease, the server >>> can just wait until Tlease + 5 seconds and go on. >>> But that means the client has to be renewing the >>> lease every 5 seconds (more frequently, actually). >>> >>> Also, in the case where the lease is just expiring >>> but not being revoked, then you have to have some >>> mechanism for establishing the new lease before >>> the old one runs out. If there is a time between >>> two leases when you don't hold any leases, then >>> all your cached data becomes invalid. 
>>> >>> The following works: >>> >>> Tnewlease asks for a new lease >>> Rnewlease grants the lease, for n seconds starting at time of >>> Tnewlease >>> >>> Trenewlease asks to renew the lease >>> Rrenewlease grants the renewal for n seconds starting at time of >>> Trenewlease >>> >>> Now if the server needs to revoke the lease, it just >>> refuses to renew and waits until the current lease expires. >>> >>> You can add a pseudo-callback to speed up revocation >>> with a cooperative client: >>> >>> Tneeditback offers to give lease back to server early >>> Rneeditback says I accept your offer, please do >>> >>> Tdroplease gives lease back >>> Rdroplease says okay I got it (not really necessary) >>> >>> The lease ends when the client sends Tdroplease, >>> *not* when the server sends Rneeditback. It can't end >>> at Rneeditback for the same reason change files don't work. >>> And this can *only* be an optimization, because it >>> depends on the client sending Tdroplease. To get >>> something that works in the presence of misbehaved >>> clients you have to be able to fall back on the >>> "wait it out" strategy. >>> >>> One could, of course, use a different protocol with >>> a 9P front end. That's okay for clients, but you'd still >>> have to teach the back-end server (i.e. fossil) to speak >>> the other protocol directly in order to get any guarantees. >>> (If 9P doesn't cut it then anything that's just in front of >>> (not in place of) a 9P server can't solve the problem.) >>> >>> Russ >>> >> >> ^ permalink raw reply [flat|nested] 76+ messages in thread
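Russ's Tnewlease/Trenewlease/Tdroplease exchange can be sketched as a small server-side model. This is only an illustrative toy under the assumptions in his message (the server over-estimates every lease as n seconds from the moment it answered, and revokes by refusing renewal and waiting the lease out). The per-fid lease table, the injectable clock, and everything except the T/R message names are inventions of this sketch, not part of any real 9P server.

```python
import time

LEASE_SECS = 5  # illustrative lease length, echoing the 5 seconds in the thread

class LeaseServer:
    """Toy model of the Tnewlease/Trenewlease/Tdroplease exchange.

    The server over-estimates every lease as LEASE_SECS from the moment
    it answered, so its view of the lease always outlives the client's."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.leases = {}       # fid -> server-side lease expiry time
        self.revoking = set()  # fids whose renewals we now refuse

    def t_newlease(self, fid):
        """Handle Tnewlease: grant a lease, or None (Rerror) while revoking."""
        if fid in self.revoking:
            return None
        self.leases[fid] = self.clock() + LEASE_SECS
        return self.leases[fid]

    def t_renewlease(self, fid):
        """Handle Trenewlease: extend the lease unless it is being revoked."""
        if fid in self.revoking or fid not in self.leases:
            return None
        self.leases[fid] = self.clock() + LEASE_SECS
        return self.leases[fid]

    def revoke(self, fid):
        """Revoke by refusing renewal; return the time after which the
        server may safely proceed even if the client never answers."""
        self.revoking.add(fid)
        return self.leases.get(fid, self.clock())

    def t_droplease(self, fid):
        """Cooperative fast path: the client returns the lease early."""
        self.leases.pop(fid, None)
        self.revoking.discard(fid)
```

Note that `revoke` never blocks on the client: it only reports when the "wait it out" strategy is complete, with `t_droplease` as the purely optional optimization the message describes.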
* Re: [9fans] QTCTL? 2007-11-01 18:08 ` Latchesar Ionkov @ 2007-11-01 18:16 ` erik quanstrom 2007-11-01 18:19 ` Francisco J Ballesteros 1 sibling, 0 replies; 76+ messages in thread From: erik quanstrom @ 2007-11-01 18:16 UTC (permalink / raw) To: 9fans > The problem is that the clients with higher latencies badly need to > be able to cache. And the ones with better latencies can afford not > caching :) the irony is, the higher the latency, the greater the cost of synchronization. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:08 ` Latchesar Ionkov 2007-11-01 18:16 ` erik quanstrom @ 2007-11-01 18:19 ` Francisco J Ballesteros 2007-11-01 18:35 ` Sape Mullender 1 sibling, 1 reply; 76+ messages in thread From: Francisco J Ballesteros @ 2007-11-01 18:19 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I know, I think I was not clear, sorry. The point is that, referring to the Tinval-Rinval-Tinval-... sequence, a server could afford to consider its Rinval acknowledged if the client happens not to respond (by issuing another Tinval) within 5 seconds. That would "freeze" clients for at most 5 seconds when a client fails to respond, but that should not happen often, and it would not make other clients slow. Also, regarding > the irony is, the higher the latency, the greater the cost of synchronization. if we consider that this would happen only for rw files, and that rd files would be considered as leased up to the next Rinval mentioning them, the cost would probably not be too high. But I won't actually know before implementing and trying it. On 11/1/07, Latchesar Ionkov <lionkov@lanl.gov> wrote: > The problem is that the clients with higher latencies badly need to > be able to cache. And the ones with better latencies can afford not > caching :) > > On Nov 1, 2007, at 12:03 PM, Francisco J Ballesteros wrote: > > > But 5 seconds would be enough to be convinced that a client is not > > properly > > responding to invalidation requests, and cease all its leases. Why > > make other > > clients wait for more? I mean, assuming a central FS and clients > > connected on star > > to it. > > > > On 11/1/07, Latchesar Ionkov <lionkov@lanl.gov> wrote: > >> The 5 seconds lease might work in the local network case, but not > >> caching at all is going to work out pretty well too. 
What if you want > >> to cache over Internet and you round-trip is 3-4 seconds :) > >> > >> On Nov 1, 2007, at 11:03 AM, Russ Cox wrote: > >> > >>>> The fact is we have loose consistency now, we just don't call it > >>>> that. > >>>> Anytime you are running a file from a server, you have loose > >>>> consistency. It works ok in most cases. > >>> > >>> Because all reads and writes go through to the > >>> server, all file system operations on a particular > >>> server are globally ordered, *and* that global ordering > >>> matches the actual sequence of events in the > >>> physical world (because clients wait for R-messages). > >>> That's a pretty strong consistency statement! > >>> > >>> Any revocation-based system has to have the server > >>> wait for an acknowledgement from the client. > >>> If there is no wait, then between the time that the server > >>> sends the "oops, stop caching this" and the client > >>> processes it, the client might incorrectly use the > >>> now-invalid data. That's why a change file doesn't > >>> provide the same consistency guarantees as pushing > >>> all reads/writes to the file server. To get those, revocations > >>> fundamentally must wait for the client. > >>> > >>> It's also why this doesn't work: > >>> > >>>> Tcache asks whether the server is prepared to cache > >>>> Rcache makes lease available with parameters, Rerror says no. > >>>> > >>>> Tlease says, ok start my lease now (almost immediately > >>>> follows Rache) > >>>> Rlease lease expired or lease needs to be given back early > >>>> > >>>> Tcache done with old lease (may immediately ask for a new > >>>> lease) > >>>> etc. > >>> > >>> because the Rlease/Tcache sequence is a s->c->s > >>> message. If a client doesn't respond with the Tcache > >>> to formally give up the lease, the server has no choice > >>> but to wait. 
> >>> > >>> If you are willing to assume that each machine has > >>> a real-time clock that runs approximately at the > >>> same rate (so that different machines agree on > >>> what 5 seconds means, but not necessarily what > >>> time it is right now), then you can fix the above messages > >>> by saying that the client lease is only good for a fixed > >>> time period (say 5 seconds) from the time that the > >>> client sent the Tlease. Then the server can overestimate > >>> the lease length as 5 seconds from when it sent the > >>> Rlease, and everything is safe. And if the server > >>> sends a Rlease and the client doesn't respond with > >>> a Tcache to officially renounce the lease, the server > >>> can just wait until Tlease + 5 seconds and go on. > >>> But that means the client has to be renewing the > >>> lease every 5 seconds (more frequently, actually). > >>> > >>> Also, in the case where the lease is just expiring > >>> but not being revoked, then you have to have some > >>> mechanism for establishing the new lease before > >>> the old one runs out. If there is a time between > >>> two leases when you don't hold any leases, then > >>> all your cached data becomes invalid. > >>> > >>> The following works: > >>> > >>> Tnewlease asks for a new lease > >>> Rnewlease grants the lease, for n seconds starting at time of > >>> Tnewlease > >>> > >>> Trenewlease asks to renew the lease > >>> Rrenewlease grants the renewal for n seconds starting at time of > >>> Trenewlease > >>> > >>> Now if the server needs to revoke the lease, it just > >>> refuses to renew and waits until the current lease expires. 
> >>> > >>> You can add a pseudo-callback to speed up revocation > >>> with a cooperative client: > >>> > >>> Tneeditback offers to give lease back to server early > >>> Rneeditback says I accept your offer, please do > >>> > >>> Tdroplease gives lease back > >>> Rdroplease says okay I got it (not really necessary) > >>> > >>> The lease ends when the client sends Tdroplease, > >>> *not* when the server sends Rneeditback. It can't end > >>> at Rneeditback for the same reason change files don't work. > >>> And this can *only* be an optimization, because it > >>> depends on the client sending Tdroplease. To get > >>> something that works in the presence of misbehaved > >>> clients you have to be able to fall back on the > >>> "wait it out" strategy. > >>> > >>> One could, of course, use a different protocol with > >>> a 9P front end. That's okay for clients, but you'd still > >>> have to teach the back-end server (i.e. fossil) to speak > >>> the other protocol directly in order to get any guarantees. > >>> (If 9P doesn't cut it then anything that's just in front of > >>> (not in place of) a 9P server can't solve the problem.) > >>> > >>> Russ > >>> > >> > >> > > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:19 ` Francisco J Ballesteros @ 2007-11-01 18:35 ` Sape Mullender 2007-11-01 19:09 ` Charles Forsyth 0 siblings, 1 reply; 76+ messages in thread From: Sape Mullender @ 2007-11-01 18:35 UTC (permalink / raw) To: 9fans > I know, I think I was not clear, sorry. > > The point is that, referring to the > Tinval-Rinval-Tinval-... > sequence, a server could afford to consider its Rinval acknowledged > if the client happens not to respond (by issuing another Tinval) > within 5 seconds. > That would "freeze" clients for at most 5 seconds when a client fails > to respond, > but that should not happen often, and it would not make other clients slow. > > Also, regarding >> the irony is, the higher the latency, the greater the cost of synchronization. > > if we consider that this would happen only for rw files, and that rd files would > be considered as leased up to the next Rinval mentioning them, the cost would > probably not be too high. But I won't actually know before > implementing and trying it. There's a funny obsession in this discussion with optimal performance in the least common scenarios. Let me reiterate: 1. 90% (a symbolic number) of files are not shared 2. Of the 10% remaining, 90% of files are read-shared 3. Of the 1% remaining, 90% of clients are well-behaved 4. In the 0.1% remaining, only the first request in a series of requests issued by a badly behaving client will result in making the server wait for the lease timeout (in all subsequent cases, the server just won't give that client a lease, or will give only an extremely short one) 5. That leaves 0.01% of files. Optimize away, guys. Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 18:35 ` Sape Mullender @ 2007-11-01 19:09 ` Charles Forsyth 2007-11-01 19:07 ` erik quanstrom 0 siblings, 1 reply; 76+ messages in thread From: Charles Forsyth @ 2007-11-01 19:09 UTC (permalink / raw) To: 9fans > There's a funny obsession in this discussion with optimal performance > in the least common scenarios. but surely that's absolutely typical of most such discussions in computing and perhaps sports. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 19:09 ` Charles Forsyth @ 2007-11-01 19:07 ` erik quanstrom 0 siblings, 0 replies; 76+ messages in thread From: erik quanstrom @ 2007-11-01 19:07 UTC (permalink / raw) To: 9fans >> There's a funny obsession in this discussion with optimal performance >> in the least common scenarios. > > but surely that's absolutely typical of most such discussions in computing > and perhaps sports. oddly, it's the same with spam. - erik ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:04 ` ron minnich ` (2 preceding siblings ...) 2007-11-01 17:03 ` Russ Cox @ 2007-11-01 17:14 ` Bakul Shah 3 siblings, 0 replies; 76+ messages in thread From: Bakul Shah @ 2007-11-01 17:14 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > Why not just have a file that a client reads that lets the client know > of changes to files. > > client opens this server-provided file ("changes"? "dnotify"?) > > Server agrees to send client info about all FIDS which client has > active that are changing. Form of the message? > fid[4]offset[8]len[4] A changefile would be useful in many ways. You can implement Unix's select(2) or poll(2). You can discover when devices disappear or reappear for auto configuration. You can grab new email as soon as it gets delivered to your mbox etc. Change propagation is very handy. ^ permalink raw reply [flat|nested] 76+ messages in thread
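The proposed notification record (fid[4]offset[8]len[4]) would be trivial for a client to consume. The sketch below assumes the integers are little-endian, as 9P encodes its other fields; the "changes" file itself, the exact framing of a read, and all names here are hypothetical, taken only from the proposal quoted above.

```python
import struct

# 9P encodes integers little-endian, so a plausible wire format for
# the fid[4] offset[8] len[4] record is:
CHANGE_REC = struct.Struct("<IQI")  # 4 + 8 + 4 = 16 bytes per record

def parse_changes(buf):
    """Split one read of the hypothetical 'changes' file into
    (fid, offset, length) tuples, one per modified region."""
    recs = []
    for i in range(0, len(buf) - CHANGE_REC.size + 1, CHANGE_REC.size):
        recs.append(CHANGE_REC.unpack_from(buf, i))
    return recs
```

A client would loop reading this file and invalidate (or refetch) just the byte ranges named in each record, which is exactly the select/poll-style wakeup Bakul describes.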
* Re: [9fans] QTCTL? 2007-11-01 15:51 ` Latchesar Ionkov 2007-11-01 16:04 ` ron minnich @ 2007-11-01 16:17 ` Sape Mullender 2007-11-01 16:27 ` Sape Mullender 2007-11-01 16:58 ` Sape Mullender 2 siblings, 1 reply; 76+ messages in thread From: Sape Mullender @ 2007-11-01 16:17 UTC (permalink / raw) To: 9fans > In the case of read cache (which is probably going to be used more > often than write-cache), the client needs to send two RPCs every time > a writer modifies the cached file. What if Rlease doesn't necessarily > break the lease, but has an option (negotiated in Tcache) to let the > client know that the file is changed without breaking the lease. That breaks single-copy semantics: A client may have acted on data after it had been changed by somebody else. Say, A and B are sharing the file. B has a read-lease on the file, A obtains a write lease, modifies the file and sends a message to A to read what was changed. A reads the file (which is still in the cache and has not been updated). Meanwhile, just after B obtained the write lease, the server notifies A that the lease is expiring early, but this message travels slowly and doesn't arrive until the whole exchange is over. Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 16:17 ` Sape Mullender @ 2007-11-01 16:27 ` Sape Mullender 0 siblings, 0 replies; 76+ messages in thread From: Sape Mullender @ 2007-11-01 16:27 UTC (permalink / raw) To: 9fans I type too fast — got A and B mixed up. Below is the fix: > That breaks single-copy semantics: A client may have acted on data after > it had been changed by somebody else. Say, A and B are sharing the file. > B has a read-lease on the file, A obtains a write lease, modifies the file > and sends a message to B to read what was changed. B reads the file (which > is still in the cache and has not been updated). Meanwhile, just after A > obtained the write lease, the server notifies B that the lease is expiring > early, but this message travels slowly and doesn't arrive until the whole > exchange is over. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [9fans] QTCTL? 2007-11-01 15:51 ` Latchesar Ionkov 2007-11-01 16:04 ` ron minnich 2007-11-01 16:17 ` Sape Mullender @ 2007-11-01 16:58 ` Sape Mullender 2 siblings, 0 replies; 76+ messages in thread From: Sape Mullender @ 2007-11-01 16:58 UTC (permalink / raw) To: 9fans > In the case of read cache (which is probably going to be used more > often than write-cache), the client needs to send two RPCs every time > a writer modifies the cached file. What if Rlease doesn't necessarily > break the lease, but has an option (negotiated in Tcache) to let the > client know that the file is changed without breaking the lease. > > Thanks, > Lucho Another point in this discussion: 1. Most files are not shared 2. Some files are read-shared 3. Very, very few files are read-write shared (Satya did some research on this at CMU — quite some time ago) Having said that, we do want correct semantics all the time, especially for read-write sharing. A file server can use heuristics to decide the timeout for leases. For example, it could always grant 10-minute leases to begin with. Doesn't cost a thing unless the client refuses to return a lease early (but clients will rarely be asked to do so). With updates in the recent past, or with the first occurrence of read-write sharing, lease times can be drastically reduced. Note that for files not shared or read-shared, callbacks do not happen, so lease calls will occur at the rate of the lease time, which is once every few minutes. Big deal. Sape ^ permalink raw reply [flat|nested] 76+ messages in thread
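Sape's heuristic (grant generous leases by default, drastically shorter ones after recent updates or any read-write sharing) can be sketched in a few lines. The 10-minute and 5-second figures echo the thread; the 60-second "recent write" window and every name here are assumptions of this sketch, not anything a real file server implements.

```python
import time

class LeaseHeuristic:
    """Toy lease-length policy: long leases for quiet files, short
    ones for files with recent writes or observed read-write sharing."""

    LONG = 600     # default 10-minute lease, as in the message
    SHORT = 5      # drastically reduced lease
    RECENT = 60.0  # assumed window for 'updates in the recent past'

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.last_write = {}    # path -> time of last write
        self.rw_shared = set()  # paths ever seen read-write shared

    def note_write(self, path, shared=False):
        self.last_write[path] = self.clock()
        if shared:
            self.rw_shared.add(path)

    def lease_secs(self, path):
        recent = self.clock() - self.last_write.get(path, -1e9) < self.RECENT
        return self.SHORT if recent or path in self.rw_shared else self.LONG
```

Since unshared and read-shared files never trigger callbacks, the only recurring cost of the long default is one renewal every few minutes, which matches the "Big deal" arithmetic above.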
* Re: [9fans] QTCTL? 2007-11-01 14:28 ` Russ Cox ` (2 preceding siblings ...) 2007-11-01 15:26 ` Sape Mullender @ 2007-11-01 16:59 ` Bakul Shah 3 siblings, 0 replies; 76+ messages in thread From: Bakul Shah @ 2007-11-01 16:59 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > > Do you recall what the issues were? > > The main proposal I remember was to give out read and write > tokens and then revoke them as appropriate. The two big problems > with this were that it meant having spontaneous server->client->server > message exchange instead of the usual client->server->client > (a very large semantic break with existing 9P) and that once you > had granted a read token to a client the client could simply stop > responding (acknowledging the revocation) and slow down the > rest of the system. Thanks. > > Wouldn't something like load-linked/store-conditional suffice > > if the common case is a single writer? When the client does > > a "read-linked" call, the server sends an ID along with the > > data. The client can then do a "write-conditional" by passing > > the original ID and new data. If the ID is not valid anymore > > (if someone else wrote in the meantime) the write fails. The > > server doesn't have to keep any client state or inform anyone > > about an invalid cache. Of course, if any client fails to > > follow this protocol things fall apart but at least well > > behaved clients can get coherency. And this would work for > > cases such as making changes on a disconnected laptop and > > resyncing to the main server on the next connect. You > > wouldn't use this for synthetic files. This ID can be as > > simple as a file "generation" number incremented on each > > write or crypto strong checksum. > > This doesn't solve the problem of one client caching the file > contents and another writing to the file; how does the first > find out that the file has changed before it uses the cached > contents again? 
It can in effect find out if the file changed between two calls to the server. My thought was that there are many schemes for providing full consistency, each with its own strengths and problems, so maybe support for a specific scheme doesn't belong in 9P, but if it provides the "relaxed" consistency of LL/SC, various full consistency schemes can be built on top of that. So for example if you read a lockfile and it says `free' you conditionally write `busy'. If the write succeeds you have exclusive access to the file being protected by this lockfile. But if the read says `busy' or if your conditional write fails, you try again. Cooperating user processes can set up any other scheme as well, such as shared-read exclusive-write or a lease-based one etc. If a version number accompanies every read/write, one can implement multi-versioned concurrency -- readers get a consistent version but writers can only write the "head" version. It would be nice to be able to implement that but I wouldn't want it built in. ^ permalink raw reply [flat|nested] 76+ messages in thread
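Bakul's read-linked / write-conditional idea, lockfile example included, might look like this in miniature. The generation counter stands in for the ID he describes (a real server could equally use the qid version or a strong checksum); nothing here is actual 9P, just a toy model of the proposed semantics.

```python
class VersionedFile:
    """Toy server-side file for the read-linked / write-conditional
    scheme: each write bumps a generation number, and a conditional
    write succeeds only if the caller's generation still matches."""

    def __init__(self, data=b""):
        self.data = data
        self.gen = 0

    def read_linked(self):
        """Return (generation, contents); the ID accompanies the data."""
        return self.gen, self.data

    def write_conditional(self, gen, data):
        """Apply the write only if nobody wrote since `gen` was read."""
        if gen != self.gen:
            return False  # someone wrote in the meantime
        self.data = data
        self.gen += 1
        return True

def try_lock(f):
    """Lockfile protocol from the message: read, and if the file says
    'free', conditionally write 'busy'; retry on any failure."""
    gen, data = f.read_linked()
    return data == b"free" and f.write_conditional(gen, b"busy")
```

The server keeps no per-client state and issues no callbacks; misbehaved clients can at worst hurt themselves, which is the point of leaving the stronger schemes to cooperating user processes.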
end of thread, other threads:[~2007-11-01 23:24 UTC | newest] Thread overview: 76+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-10-31 18:40 [9fans] QTCTL? Francisco J Ballesteros 2007-10-31 18:56 ` Eric Van Hensbergen 2007-10-31 19:13 ` Charles Forsyth 2007-10-31 19:33 ` Eric Van Hensbergen 2007-10-31 19:39 ` erik quanstrom 2007-10-31 20:43 ` geoff 2007-10-31 21:32 ` Charles Forsyth 2007-10-31 22:48 ` roger peppe 2007-10-31 23:35 ` erik quanstrom 2007-11-01 9:29 ` roger peppe 2007-11-01 11:03 ` Eric Van Hensbergen 2007-11-01 11:19 ` Charles Forsyth 2007-11-01 12:11 ` erik quanstrom 2007-10-31 19:42 ` erik quanstrom 2007-10-31 19:49 ` Eric Van Hensbergen 2007-10-31 20:03 ` erik quanstrom 2007-10-31 20:10 ` Latchesar Ionkov 2007-10-31 20:12 ` Eric Van Hensbergen 2007-10-31 20:17 ` Russ Cox 2007-10-31 20:29 ` Francisco J Ballesteros 2007-10-31 20:48 ` Charles Forsyth 2007-10-31 21:23 ` Francisco J Ballesteros 2007-10-31 21:40 ` Russ Cox 2007-10-31 22:11 ` Charles Forsyth 2007-10-31 22:26 ` Francisco J Ballesteros 2007-10-31 22:37 ` Charles Forsyth 2007-10-31 22:43 ` Francisco J Ballesteros 2007-10-31 23:32 ` Eric Van Hensbergen 2007-10-31 23:41 ` [V9fs-developer] " Charles Forsyth [not found] ` <606b6f003ae9f0ed3e8c3c5f90ddc720@terzarima.net> 2007-11-01 1:13 ` Eric Van Hensbergen 2007-10-31 23:54 ` erik quanstrom 2007-11-01 0:03 ` Charles Forsyth 2007-11-01 1:25 ` Eric Van Hensbergen 2007-11-01 1:44 ` erik quanstrom 2007-11-01 2:15 ` Eric Van Hensbergen 2007-11-01 7:34 ` Skip Tavakkolian 2007-11-01 6:21 ` Bakul Shah 2007-11-01 14:28 ` Russ Cox 2007-11-01 14:38 ` erik quanstrom 2007-11-01 14:41 ` Charles Forsyth 2007-11-01 15:26 ` Sape Mullender 2007-11-01 15:51 ` Latchesar Ionkov 2007-11-01 16:04 ` ron minnich 2007-11-01 16:16 ` Latchesar Ionkov 2007-11-01 16:21 ` Sape Mullender 2007-11-01 16:58 ` Francisco J Ballesteros 2007-11-01 17:11 ` Charles Forsyth 2007-11-01 17:11 ` Francisco J Ballesteros 2007-11-01 
17:13 ` Sape Mullender 2007-11-01 17:38 ` ron minnich 2007-11-01 17:56 ` Francisco J Ballesteros 2007-11-01 18:01 ` Francisco J Ballesteros 2007-11-01 18:52 ` Eric Van Hensbergen 2007-11-01 19:29 ` Francisco J Ballesteros 2007-11-01 23:24 ` ron minnich 2007-11-01 17:03 ` Russ Cox 2007-11-01 17:12 ` Sape Mullender 2007-11-01 17:35 ` erik quanstrom 2007-11-01 18:36 ` erik quanstrom 2007-11-01 17:13 ` Charles Forsyth 2007-11-01 17:16 ` Charles Forsyth 2007-11-01 17:20 ` Charles Forsyth 2007-11-01 17:52 ` Eric Van Hensbergen 2007-11-01 18:00 ` Latchesar Ionkov 2007-11-01 18:03 ` Francisco J Ballesteros 2007-11-01 18:08 ` Latchesar Ionkov 2007-11-01 18:16 ` erik quanstrom 2007-11-01 18:19 ` Francisco J Ballesteros 2007-11-01 18:35 ` Sape Mullender 2007-11-01 19:09 ` Charles Forsyth 2007-11-01 19:07 ` erik quanstrom 2007-11-01 17:14 ` Bakul Shah 2007-11-01 16:17 ` Sape Mullender 2007-11-01 16:27 ` Sape Mullender 2007-11-01 16:58 ` Sape Mullender 2007-11-01 16:59 ` Bakul Shah