From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <46BC5AD2.8090109@ec.gc.ca>
Date: Fri, 10 Aug 2007 08:32:18 -0400
From: John Marshall <John.Marshall@ec.gc.ca>
User-Agent: Thunderbird 2.0.0.6 (X11/20070728)
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] synthetic filesystems and changing data
References: <46BBC396.9040009@ec.gc.ca> <46BC0A6D.9060404@proweb.co.uk>
In-Reply-To: <46BC0A6D.9060404@proweb.co.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: a3de0106-ead2-11e9-9d60-3106f5b1d025

matt wrote:
> The coder for the file system makes the choice whether to keep a 
> snapshot of the data to be read per FID, this way the FID reader will 
> always get the file they asked for, which, after the first read, isn't 
> necessarily the current version or to just return whatever each Tread 
> asks for regardless of whether the underlying data has changed.
> 
> It is akin to an SQL cursor. When I make an SQL request the rows 
> returned are a snapshot of the data when the query was made. If I read 
> them at one row per second there's a chance that the actual data in the 
> datastore changes between reads.

Hi,

Although I was not immediately convinced that a versioning approach was
all that useful, I can now see that it roughly matches what I've always
done for servicing web requests for dynamically generated data: each
connection keeps its own version, which may or may not be out of date,
and dealing with that is left up to the client. I guess the difference is
that buffering each version in the web context has always been so natural.
All the "state" (connection, user, object being referenced, cached data)
was kept separately for the entire lifetime of the connection/transaction
and I did not have to worry about anything. I could even time out the
transaction easily. In the versioning case I would have to 1) store the
different versions somewhere (local fs or memory), 2) track their
association with the related fids, and 3) remove them on clunk. And I
would have to keep each version around until the clunk, which might
happen in 1 second, 10 minutes, or never.
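
To make that bookkeeping concrete, here is a minimal sketch of the
per-fid versioning case. It is Python for brevity and is not a real 9P
server: the class SnapshotServer, its methods, and the generate callback
are all made up for illustration, and taking the snapshot at open time
is an assumption.

```python
class SnapshotServer:
    """Toy model of a synthetic file server keeping one snapshot per fid."""

    def __init__(self, generate):
        self.generate = generate   # hypothetical callable returning current bytes
        self.snapshots = {}        # fid -> bytes captured at open time

    def open(self, fid):
        # Steps 1 and 2: store this version and associate it with the fid.
        self.snapshots[fid] = self.generate()

    def read(self, fid, offset, count):
        # Every read on this fid sees the version captured at open.
        data = self.snapshots[fid]
        return data[offset:offset + count]

    def clunk(self, fid):
        # Step 3: drop the version. Until this is called the copy stays around.
        self.snapshots.pop(fid, None)


state = {"text": b"status: ok\n"}
srv = SnapshotServer(lambda: state["text"])
srv.open(1)
state["text"] = b"status: failed\n"    # underlying data changes mid-read
first = srv.read(1, 0, 4)
rest = srv.read(1, 4, 100)
srv.clunk(1)
```

The two reads together still return the original contents, which is the
consistency I want, but nothing in the sketch bounds how long a snapshot
lives if the clunk never arrives.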

> Either way there's a "tough luck" aspect, if you have a solution I'm 
> sure everyone would be delighted to hear it :)
>
> It's a perennial problem. When opening a file for writing some programs 
> (acme for instance) will warn you if the file has changed since you 
> opened it. That's a solution to the same problem.

In people's experience, do the following seem like reasonable
alternatives, if the situation allows for it?
1) Return an error message if the message size is too small to return
    the content in one response. The application would simply trap
    the error condition and retry.
2) If the file changes over the course of returning the individual
    messages for a read, return an error message indicating a changed
    file (the read operation would no longer return data). Again, the
    application would simply trap the error condition and retry. This
    is something like what editors do (acme, as you mention), except
    that the read operation is affected rather than trying to prevent
    an overwrite of possibly newer data.
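
Option 2 could be sketched roughly as below. This is Python for brevity
and only a toy model, not 9P code: CheckedServer, FileChanged and
read_all are hypothetical names, and the counter plays a role similar
to the version field a 9P qid carries. The assumption is that a read at
offset 0 starts a new multi-message read.

```python
class FileChanged(Exception):
    """Stands in for an error reply such as 'file changed during read'."""


class CheckedServer:
    def __init__(self, data):
        self.data = data
        self.vers = 0      # bumped on every change to the data
        self.seen = {}     # fid -> vers observed when the read started

    def update(self, data):
        self.data = data
        self.vers += 1

    def read(self, fid, offset, count):
        if offset == 0:
            self.seen[fid] = self.vers          # a new read sequence begins
        elif self.seen.get(fid) != self.vers:
            raise FileChanged("file changed during read")
        return self.data[offset:offset + count]


def read_all(srv, fid, count, retries=5):
    """Client side: trap the error and restart the whole read."""
    for _ in range(retries):
        buf, offset = b"", 0
        try:
            while True:
                chunk = srv.read(fid, offset, count)
                if not chunk:
                    return buf
                buf += chunk
                offset += len(chunk)
        except FileChanged:
            continue
    raise FileChanged("gave up after %d retries" % retries)
```

The server keeps only one integer per fid instead of a copy of the data.
Option 1 would be the stricter variant of the same idea: refuse any read
whose count is smaller than the whole file.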

I guess I am trying to avoid versioning because there are too many
aspects that I am concerned I could not reasonably deal with. I'd
like to be able to support something akin to an atomic transaction
(although it would involve multiple reads/messages).

Any comments on a third option?
3) if versioning is used, reads would only be supported on a cached
    version for a limited period of time. Afterwards, the cached
    version would be dropped and an error message would be returned
    indicating something like "File version has expired."
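
A minimal sketch of option 3, again in Python and again with made-up
names (ExpiringSnapshots, VersionExpired, the ttl and clock parameters).
It assumes the snapshot is taken at open and that expired versions are
reaped lazily on the next read rather than by a timer. For simplicity,
a fid that was never opened gets the same error.

```python
import time


class VersionExpired(Exception):
    """Stands in for an error reply such as 'File version has expired.'"""


class ExpiringSnapshots:
    def __init__(self, generate, ttl, clock=time.monotonic):
        self.generate = generate   # hypothetical callable returning current bytes
        self.ttl = ttl             # seconds a cached version stays readable
        self.clock = clock         # injectable so the expiry can be tested
        self.snapshots = {}        # fid -> (deadline, bytes)

    def open(self, fid):
        self.snapshots[fid] = (self.clock() + self.ttl, self.generate())

    def reap(self):
        # Drop every cached version whose deadline has passed.
        now = self.clock()
        expired = [f for f, (deadline, _) in self.snapshots.items()
                   if deadline <= now]
        for fid in expired:
            del self.snapshots[fid]

    def read(self, fid, offset, count):
        self.reap()
        if fid not in self.snapshots:
            raise VersionExpired("File version has expired.")
        _, data = self.snapshots[fid]
        return data[offset:offset + count]

    def clunk(self, fid):
        self.snapshots.pop(fid, None)
```

This bounds how long any version is kept, so a client that never clunks
cannot pin a copy forever, at the cost of one more error the client has
to handle.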

I realize that none of these may be general solutions, but
for specific kinds of data (e.g., status information) they may be.
And, the client accessing the files would have to be prepared to
deal with the different kinds of error messages. The goal is to
get correct/consistent data.

Thanks for everyone's input,
John