From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <de0adc6d3a2f620248344ddd402220b4@terzarima.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] silly replica question (repeated m msgs won't go away)
From: Charles Forsyth <forsyth@terzarima.net>
In-Reply-To: <597ffdd9b7799e931a08e4658420c643@collyer.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Date: Tue, 10 Feb 2004 12:22:28 +0000
Topicbox-Message-UUID: dc7fa5d8-eacc-11e9-9e20-41e7f4b1d025

i think the replica scheme is actually quite pretty,
but there's a flaw that happens to afflict 'm' particularly.

the easiest way to fix this without code change is to use
-c and -s as desired (carefully!) to resolve every other inconsistency,
so that plan9.time will finally be updated, and it won't process the 'm' entries again.
better still, your subsequent updates will be quicker too because it won't replay so much
of the log!  perhaps just do one replica/pull -v ... to get a list of differences to check later,
and fetch anything new, then do replica/pull -vc ... to retain your current client state.
a subsequent replica/pull -v ... shouldn't show the dreaded 'm' entries.

it might be helpful to look at one case to see what went wrong.

the file on sources is
--rw-rw-r-- M 119 rsc sys 10583 Nov 25 20:58 /n/sources/plan9/sys/lib/antiword/8859-15.txt

the m messages are repeated because the log contains several batches of records like the following:

1069794016 8 a sys/lib/antiword/8859-15.txt - 640 sys sys 1069793886 10583
1069799416 4 m sys/lib/antiword/8859-15.txt - 644 sys sys 1069793886 10583
1069801217 4 m sys/lib/antiword/8859-15.txt - 664 sys sys 1069793886 10583

(with other records interspersed).  note that the mtime is the same in each record
(correctly, as it happens).

the 'a' record has the test:
			if(dbd.mtime >= rd.mtime)	/* already created this file; ignore */
				break;
which prevents it from going any further.

the 'm' record has the test:
			if(!(dbd.mode&DMDIR) && dbd.mtime > rd.mtime)		/* have newer version; ignore */
				break;
			if((dbd.mode&DMDIR) && dbd.mtime > now)
				break;
where `now' is the date of the log entry (its first field), not the current time.
the test for `newer version' is > not >= so each 'm' is considered further.
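
to make the difference between the two tests concrete, here is a minimal sketch
in portable C.  it is not applylog's code: Ent, skipadd and skipmeta are names
made up for illustration, and only the fields the tests look at are kept.

```c
#include <stdint.h>

#define DMDIR 0x80000000u	/* directory bit, as in Plan 9's mode word */

/* hypothetical, cut-down stand-in for the fields the tests look at */
typedef struct Ent Ent;
struct Ent {
	uint32_t mode;
	uint32_t mtime;
};

/* 'a' test: nonzero means the log record is ignored */
int
skipadd(Ent dbd, Ent rd)
{
	return dbd.mtime >= rd.mtime;	/* already created this file */
}

/* 'm' test: nonzero means ignored; now is the time of the log entry */
int
skipmeta(Ent dbd, Ent rd, uint32_t now)
{
	if(!(dbd.mode&DMDIR) && dbd.mtime > rd.mtime)	/* have newer version */
		return 1;
	if((dbd.mode&DMDIR) && dbd.mtime > now)
		return 1;
	return 0;
}
```

with dbd.mtime and rd.mtime both 1069793886, as in the records above, skipadd
returns 1 but skipmeta returns 0, so the 'm' record is looked at again.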

one of the tests is whether the local file mode matches the remote mode,
where the `remote mode' is the mode in the log entry not the state of the current remote file.
they don't match, so the local file's mode is changed by the first 'm', then changed again
by the second 'm', and since it has done something you get a comment each time,
and an update in the database.
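
the effect can be reproduced with a toy model.  the sketch below replays the
three records quoted above against a single .db entry; Rec, Db and replay are
hypothetical names, and the model assumes the local file's mode always matches
the mode recorded in .db, which the real applylog checks separately.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical, cut-down log record and .db entry; not applylog's own types */
typedef struct Rec Rec;
struct Rec {
	uint32_t now;	/* time of the log entry (first field) */
	uint32_t seq;	/* sequence number (second field) */
	char verb;	/* 'a' or 'm' */
	uint32_t mode;
	uint32_t mtime;
};

typedef struct Db Db;
struct Db {
	int present;
	uint32_t mode;
	uint32_t mtime;
};

/* the three records quoted above for sys/lib/antiword/8859-15.txt */
Rec antiword[] = {
	{1069794016u, 8, 'a', 0640, 1069793886u},
	{1069799416u, 4, 'm', 0644, 1069793886u},
	{1069801217u, 4, 'm', 0664, 1069793886u},
};

/* replay recs against db; return the number of 'm' records acted on */
int
replay(Db *db, Rec *recs, size_t n)
{
	size_t i;
	int nm;
	Rec *r;

	nm = 0;
	for(i = 0; i < n; i++){
		r = &recs[i];
		switch(r->verb){
		case 'a':
			if(db->present && db->mtime >= r->mtime)
				break;	/* already created this file; ignore */
			db->present = 1;
			db->mode = r->mode;
			db->mtime = r->mtime;
			break;
		case 'm':
			if(!db->present || db->mtime > r->mtime)
				break;	/* nothing to change, or have newer version */
			if(db->mode != r->mode){
				db->mode = r->mode;	/* stands in for the chmod and the comment */
				nm++;
			}
			break;
		}
	}
	return nm;
}
```

in this model, calling replay(&db, antiword, 3) twice on a db that starts
zeroed acts on two 'm' records each time: the second pass skips the 'a' but
flips the mode to 0644 and back to 0664 again.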

the file's mtime is the same in each log entry, because chmod correctly affects the parent directory,
not the file itself.  the -c or -s options have no effect on the 'm' entries themselves, because although they correctly update
the metadata (.db) to set the appropriate state, all the 'm' entries have the same
mtime and thus the newer-version test will continue to be false and they'll
rerun each time.  the trouble is that the .db has got only the mtime to resolve
`newer action', not the log sequence number for that action.

is there a simple change to the code? it can't simply set the mtime in .db
to the log entry time because it needs the correct last-modified time
for the file contents to be in .db to test later 'a'/'c'/'d' requests.  it isn't correct
to use >= in the mtime comparison because that would prevent 'm' from having much effect generally.
it would help to know which log entry it has seen.
there might be another way short of storing the log sequence in .db,
but i couldn't see it in the time i was prepared to spend on it.
i could see several partial/incorrect changes (eg, stat the actual remote file).
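
for what it's worth, here is a rough sketch of what `storing the log sequence
in .db' might look like in a toy model.  this is not something replica does:
the lognow and logseq fields, and the names Rec, Db, seen and replayseq, are
all assumptions made up for illustration.  the db keeps the contents' mtime
for the 'a' test, and separately remembers the last log record it acted on.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical, cut-down log record; not applylog's own type */
typedef struct Rec Rec;
struct Rec {
	uint32_t now;	/* time of the log entry (first field) */
	uint32_t seq;	/* sequence number (second field) */
	char verb;	/* 'a' or 'm' */
	uint32_t mode;
	uint32_t mtime;
};

/* hypothetical .db entry extended with the last log record applied */
typedef struct Db Db;
struct Db {
	int present;
	uint32_t mode;
	uint32_t mtime;	/* still the contents' mtime, for later 'a' tests */
	uint32_t lognow;	/* assumption: time of last record applied */
	uint32_t logseq;	/* assumption: its sequence number */
};

/* nonzero if r is no later than the last record applied to db */
static int
seen(Db *db, Rec *r)
{
	if(r->now != db->lognow)
		return r->now < db->lognow;
	return r->seq <= db->logseq;
}

/* replay recs against db; return the number of 'm' records acted on */
int
replayseq(Db *db, Rec *recs, size_t n)
{
	size_t i;
	int nm;
	Rec *r;

	nm = 0;
	for(i = 0; i < n; i++){
		r = &recs[i];
		if(seen(db, r))
			continue;	/* already processed this log record */
		if(r->verb == 'a' && !(db->present && db->mtime >= r->mtime)){
			db->present = 1;
			db->mode = r->mode;
			db->mtime = r->mtime;
		}
		if(r->verb == 'm' && db->present && db->mtime <= r->mtime
		&& db->mode != r->mode){
			db->mode = r->mode;
			nm++;
		}
		db->lognow = r->now;
		db->logseq = r->seq;
	}
	return nm;
}
```

fed the three antiword records twice, this model acts on both 'm' records the
first time and on none the second, while the mtime comparison stays as it was.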

i looked at replica's approach for possible use in Inferno, because it
was compact and seemed simpler (in code size) compared to some
alternatives.  as it happens, the replica problem has its subtle
points (even for a problem simpler than tra's).  i quickly resorted to
a case analysis on my own, and wrote Limbo code based on that.
i noticed at least one other difference between
my analysis and applylog's; i must check my notes when i remember
where i filed them.  i ended up deciding to process the log files differently,
more precisely as with a (class of) log-structured file system (i was
doing a lot with them at the time).  there is a particular invariant
for the contents of the client's .log.  i could possibly have used a
modified .db structure (including log sequence numbers), but in my
scheme that's just an optimisation (and there was another way to do that).
of course, something else might be wrong ...  we'll see.