zsh-users
* History corruption (over NFS)
@ 2005-02-10 16:54 Vincent Lefevre
  2005-02-10 18:41 ` Bart Schaefer
  0 siblings, 1 reply; 11+ messages in thread
From: Vincent Lefevre @ 2005-02-10 16:54 UTC
  To: zsh-users

I've noticed history corruption (several times) when using screen,
with the following .screenrc file: the end of the .zhistory file
gets many null characters. I thought that zsh was using a lock
mechanism.

sessionname test32
shell -zsh
chdir /users/spaces/vlefevre/oldtests
sorendition 4 06
autodetach on
defscrollback 6000
escape ^za
hardstatus off
hardstatus string "%h%n (%t)"
termcapinfo xterm*|rxvt hs:ts=\E]2;:fs=^G:ds=\E]2;TITLEDISABLED^G
bind r screen telnet ulysse 4913
screen -h 6000 -t server  1 zsh -c "./t3-server -v results.stp.-2.54 2> server.out"
screen -h 6000 -t secstep 2 nice -2 zsh -c "./t3-secstep -r=results.stp.-2.54 --remove -l=20 2>&1 2> secstep.out"

Some additional information:

ulysse:~/tmd/oldtests> echo $ZSH_VERSION
4.2.0
ulysse:~/tmd/oldtests> setopt|grep hist
extendedhistory
histignoredups
histignorespace
histnofunctions
histnostore
histreduceblanks
incappendhistory
ulysse:~/tmd/oldtests> unsetopt|grep hist
noappendhistory
nobanghist
cshjunkiehistory
histallowclobber
nohistbeep
histexpiredupsfirst
histfindnodups
histignorealldups
histsavenodups
histverify
sharehistory
ulysse:~/tmd/oldtests> echo $HISTFILE
/users/spaces/vlefevre/.zhistory
ulysse:~/tmd/oldtests> echo $HISTSIZE
8000
ulysse:~/tmd/oldtests> mount|grep /users/spaces
waly:/vol/users/spaces on /users/spaces type nfs (rw,nfsvers=3,rsize=32768,wsize=32768,proto=tcp,addr=152.81.1.27)

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA



* Re: History corruption (over NFS)
  2005-02-10 16:54 History corruption (over NFS) Vincent Lefevre
@ 2005-02-10 18:41 ` Bart Schaefer
  2005-02-10 23:54   ` Vincent Lefevre
  2005-02-14 15:56   ` Vincent Lefevre
  0 siblings, 2 replies; 11+ messages in thread
From: Bart Schaefer @ 2005-02-10 18:41 UTC
  To: zsh-users

On Feb 10,  5:54pm, Vincent Lefevre wrote:
}
} I've noticed history corruption (several times) when using screen,
} with the following .screenrc file: the end of the .zhistory file
} gets many null characters. I thought that zsh was using a lock
} mechanism.

My experience has been that this sort of thing occurs when a file is
shortened with e.g. ftruncate(), and is a problem with NFS and not
with whether a locking mechanism is in use.

One thing to check is whether the file on the NFS server actually does
contain those nul bytes, or if it's only the NFS client that sees them.
I vaguely recall that you may need to explicitly specify "noac" in the
NFS mount options, but it's been quite some time since I encountered
this particular problem.



* Re: History corruption (over NFS)
  2005-02-10 18:41 ` Bart Schaefer
@ 2005-02-10 23:54   ` Vincent Lefevre
  2005-02-14 15:56   ` Vincent Lefevre
  1 sibling, 0 replies; 11+ messages in thread
From: Vincent Lefevre @ 2005-02-10 23:54 UTC
  To: zsh-users

On 2005-02-10 18:41:11 +0000, Bart Schaefer wrote:
> One thing to check is whether the file on the NFS server actually
> does contain those nul bytes, or if it's only the NFS client that
> sees them.

The file on the server has these nul bytes; more precisely, the
server makes backups available in .snapshot directories, and one
of the backups has nul bytes.

> I vaguely recall that you may need to explicitly specify "noac" in the
> NFS mount options, but it's been quite some time since I encountered
> this particular problem.

OK.




* Re: History corruption (over NFS)
  2005-02-10 18:41 ` Bart Schaefer
  2005-02-10 23:54   ` Vincent Lefevre
@ 2005-02-14 15:56   ` Vincent Lefevre
  2005-02-14 16:24     ` Bart Schaefer
  2005-02-14 18:57     ` Wayne Davison
  1 sibling, 2 replies; 11+ messages in thread
From: Vincent Lefevre @ 2005-02-14 15:56 UTC
  To: zsh-users

On 2005-02-10 18:41:11 +0000, Bart Schaefer wrote:
> One thing to check is whether the file on the NFS server actually does
> contain those nul bytes, or if it's only the NFS client that sees them.
> I vaguely recall that you may need to explicitly specify "noac" in the
> NFS mount options, but it's been quite some time since I encountered
> this particular problem.

The "noac" didn't change anything (except making command execution
slower).

Here's what my corrupted .zhistory file looked like:

[...]
: 1108388462:0;ssh priam
^@^@^@^@^@^@^@^@[...]^@^@: 1108388476:0;t3-exec ulysse 0 -c=3 -l=3 [...]
: 1108390243:0;lt results.*
: 1108390256:0;head old-xtp/results.ctp.-3.54.1
: 1108390282:0;t3-secstep -f=ctp -e=-3 -m=54 -i=40 -n=44 --imax=7935
: 1108390290:0;screen -h 6000 -t server 1 zsh -c "./t3-server -v
+results.ctp.-3.54 2> server.out"
: 1108390309:0;ssh priam
[...]

If I've understood correctly, the first "ssh priam" started a zsh that
truncated the history file (since its size was above the limit). Note
that I quit this shell immediately. Then the current zsh instances
went on writing to the history file without noticing that it had been
truncated, hence the null bytes. The corrupt history was detected when
I started the second "ssh priam".
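
The suspected sequence can be replayed on a local scratch file with a
minimal C sketch. It is an illustration under stated assumptions, not a
trace of what zsh or the NFS client actually did: the helper name
count_nuls_after_stale_write is hypothetical, and the sketch forces a
stale write position by keeping a descriptor open without O_APPEND.
Since NFS has no append operation of its own, a client has to emulate
O_APPEND from its idea of the file size, so a client with an outdated
size could in principle end up in an equivalent state.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: replay "fill, truncate, write at the old end" on a
 * scratch file and return how many NUL bytes the file then contains,
 * or -1 on error.
 */
long count_nuls_after_stale_write(const char *path)
{
    static const char old_entry[] = ": 1108388462:0;ssh priam\n";
    static const char new_entry[] = ": 1108388476:0;t3-exec\n";
    char buf[4096];
    long nuls = 0;
    ssize_t n;

    /* Writer A fills the file with 100 entries; its offset is now the end. */
    int a = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (a < 0)
        return -1;
    for (int i = 0; i < 100; i++) {
        if (write(a, old_entry, sizeof old_entry - 1) < 0) {
            close(a);
            return -1;
        }
    }

    /* Writer B rewrites the file in place, keeping a single entry. */
    int b = open(path, O_WRONLY | O_TRUNC);
    if (b < 0 || write(b, old_entry, sizeof old_entry - 1) < 0) {
        if (b >= 0)
            close(b);
        close(a);
        return -1;
    }
    close(b);

    /* Writer A's offset was not reset by the truncation, so this write
     * lands far past the new end of the file. */
    if (write(a, new_entry, sizeof new_entry - 1) < 0) {
        close(a);
        return -1;
    }
    close(a);

    /* The gap between the new end and the stale offset reads back as NULs. */
    int r = open(path, O_RDONLY);
    if (r < 0)
        return -1;
    while ((n = read(r, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\0')
                nuls++;
    close(r);
    return n < 0 ? -1 : nuls;
}
```

Calling the helper on a scratch path leaves a file with the same shape
as the excerpt above: one surviving entry, a run of NUL bytes, then the
entry written at the stale position.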




* Re: History corruption (over NFS)
  2005-02-14 15:56   ` Vincent Lefevre
@ 2005-02-14 16:24     ` Bart Schaefer
  2005-02-15  9:50       ` Vincent Lefevre
  2005-02-14 18:57     ` Wayne Davison
  1 sibling, 1 reply; 11+ messages in thread
From: Bart Schaefer @ 2005-02-14 16:24 UTC
  To: zsh-users

On Feb 14,  4:56pm, Vincent Lefevre wrote:
} Subject: Re: History corruption (over NFS)
}
} The "noac" didn't change anything (except making command execution
} slower).
} 
} If I've understood correctly, the first "ssh priam" started a zsh that
} truncated the history file (since its size was above the limit). Note
} that I quit this shell immediately. Then the current zsh instances
} went on writing to the history file without noticing that it had been
} truncated, hence the null bytes.

Try changing from "noac" to an explicit "hard".  I wouldn't have thought
this was needed in a modern NFS environment, but the symptoms sure look
familiar.



* Re: History corruption (over NFS)
  2005-02-14 15:56   ` Vincent Lefevre
  2005-02-14 16:24     ` Bart Schaefer
@ 2005-02-14 18:57     ` Wayne Davison
  2005-02-14 21:06       ` Seth Kurtzberg
  2005-02-15  9:58       ` Vincent Lefevre
  1 sibling, 2 replies; 11+ messages in thread
From: Wayne Davison @ 2005-02-14 18:57 UTC
  To: zsh-users

On Mon, Feb 14, 2005 at 04:56:30PM +0100, Vincent Lefevre wrote:
> Then the current zsh instances went on writing to the history file
> without noticing that it had been truncated, hence the null bytes.

I think this is an NFS problem, since the zsh instances do not keep the
history file open -- they just open the file for appending, and add an
item to the end.  When the file is rewritten, it is opened for writing
with O_TRUNC, and the new contents output.

A little while ago I wrote a patch that caused zsh to change its
rewriting strategy to use a HISTORY_FILE.new file and then rename it
into place.  The purpose of this patch is to avoid losing any history if
zsh gets interrupted during the writing of the new history file.
However, it should also be a friendlier update strategy for NFS dirs
because the inode of the file changes.  The downside is that it can undo
some user's use of symlinks, hardlinks, or special group permissions on
their history file (because it is replacing the history file instead of
rewriting it in-place).  Perhaps this algorithm should be made optional?
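
Outside of zsh, the write-then-rename strategy described above can be
sketched as a small standalone C function. This is a simplified
illustration, not the patch itself: the name replace_file_via_rename is
hypothetical, the whole new contents are passed as one string, and no
locking is done.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` with `contents` by writing
 * `path`.new and renaming it into place. Returns 0 on success, -1 on error.
 */
int replace_file_via_rename(const char *path, const char *contents)
{
    size_t len = strlen(path) + sizeof ".new";   /* sizeof counts the NUL */
    char *tmp = malloc(len);
    if (!tmp)
        return -1;
    snprintf(tmp, len, "%s.new", path);

    /* Discard a leftover from an interrupted earlier run, then create the
     * temporary file exclusively so an existing file is never reused. */
    unlink(tmp);
    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        free(tmp);
        return -1;
    }

    /* Write everything, allowing for short writes. */
    const char *p = contents;
    size_t todo = strlen(contents);
    while (todo > 0) {
        ssize_t n = write(fd, p, todo);
        if (n < 0) {
            close(fd);
            unlink(tmp);
            free(tmp);
            return -1;
        }
        p += n;
        todo -= (size_t)n;
    }

    /* Only after the new file is complete does it take over the name, so
     * the old file is either fully intact or fully replaced. */
    if (close(fd) < 0 || rename(tmp, path) < 0) {
        unlink(tmp);
        free(tmp);
        return -1;
    }
    free(tmp);
    return 0;
}
```

Because the final name ends up pointing at a newly created file, any
symlink, hard link, or special ownership attached to the old file is
lost, which is the downside mentioned above.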

I've attached a diff in case you want to try it out.

..wayne..

[-- Attachment #2: hist-dot-new.patch --]
[-- Type: text/plain, Size: 1528 bytes --]

--- orig/hist.c	2005-01-28 02:16:20 -0800
+++ hist.c	2004-10-18 15:21:54 -0700
@@ -2004,7 +2004,7 @@
 void
 savehistfile(char *fn, int err, int writeflags)
 {
-    char *t, *start = NULL;
+    char *t, *tmpfile, *start = NULL;
     FILE *out;
     Histent he;
     zlong xcurhist = curhist - !!(histactive & HA_ACTIVE);
@@ -2041,12 +2041,14 @@
 	    extended_history = 1;
     }
     if (writeflags & HFILE_APPEND) {
+	tmpfile = NULL;
 	out = fdopen(open(unmeta(fn),
 			O_CREAT | O_WRONLY | O_APPEND | O_NOCTTY, 0600), "a");
     }
     else {
-	out = fdopen(open(unmeta(fn),
-			 O_CREAT | O_WRONLY | O_TRUNC | O_NOCTTY, 0600), "w");
+	tmpfile = bicat(unmeta(fn), ".new");
+	unlink(tmpfile);
+	out = fdopen(open(tmpfile, O_WRONLY | O_CREAT | O_EXCL, 0600), "w");
     }
     if (out) {
 	for (; he && he->histnum <= xcurhist; he = down_histent(he)) {
@@ -2091,6 +2093,11 @@
 	    lasthist.text = ztrdup(start);
 	}
 	fclose(out);
+	if (tmpfile) {
+	    if (rename(tmpfile, unmeta(fn)) < 0)
+		zerr("can't rename %s.new to $HISTFILE", fn, 0);
+	    free(tmpfile);
+	}
 
 	if (writeflags & HFILE_SKIPOLD
 	 && !(writeflags & (HFILE_FAST | HFILE_NO_REWRITE))) {
@@ -2110,8 +2117,13 @@
 	    pophiststack();
 	    histactive = remember_histactive;
 	}
-    } else if (err)
-	zerr("can't write history file %s", fn, 0);
+    } else if (err) {
+	if (tmpfile) {
+	    zerr("can't write history file %s.new", fn, 0);
+	    free(tmpfile);
+	} else
+	    zerr("can't write history file %s", fn, 0);
+    }
 
     unlockhistfile(fn);
 }


* Re: History corruption (over NFS)
  2005-02-14 18:57     ` Wayne Davison
@ 2005-02-14 21:06       ` Seth Kurtzberg
  2005-02-14 22:33         ` Wayne Davison
  2005-02-15  9:58       ` Vincent Lefevre
  1 sibling, 1 reply; 11+ messages in thread
From: Seth Kurtzberg @ 2005-02-14 21:06 UTC
  To: Wayne Davison; +Cc: zsh-users

Wayne Davison wrote:

> On Mon, Feb 14, 2005 at 04:56:30PM +0100, Vincent Lefevre wrote:
> > Then the current zsh instances went on writing to the history file
> > without noticing that it had been truncated, hence the null bytes.
>
> I think this is an NFS problem, since the zsh instances do not keep the
> history file open -- they just open the file for appending, and add an
> item to the end.  When the file is rewritten, it is opened for writing
> with O_TRUNC, and the new contents output.
>
> A little while ago I wrote a patch that caused zsh to change its
> rewriting strategy to use a HISTORY_FILE.new file and then rename it
> into place.  The purpose of this patch is to avoid losing any history if
> zsh gets interrupted during the writing of the new history file.
> However, it should also be a friendlier update strategy for NFS dirs
> because the inode of the file changes.  The downside is that it can undo
> some user's use of symlinks, hardlinks, or special group permissions on
> their history file (because it is replacing the history file instead of
> rewriting it in-place).  Perhaps this algorithm should be made optional?

Won't this cause problems in the mode where history is shared, and 
history is written for every command, not just once when the shell 
terminates?




* Re: History corruption (over NFS)
  2005-02-14 21:06       ` Seth Kurtzberg
@ 2005-02-14 22:33         ` Wayne Davison
  0 siblings, 0 replies; 11+ messages in thread
From: Wayne Davison @ 2005-02-14 22:33 UTC
  To: Seth Kurtzberg; +Cc: zsh-users

On Mon, Feb 14, 2005 at 02:06:12PM -0700, Seth Kurtzberg wrote:
> Won't this cause problems in the mode where history is shared, and 
> history is written for every command

Nope -- appended history still appends just fine.  As mentioned, zsh
re-opens the history file for every append, so it matters not if the
file gets rewritten to a new inode.  Also, every append and every
rewrite is locked by using a .zhistory.LOCK file, so multiple shells
don't update the history file at the same time.
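
The general lock-file idea can be sketched in C as follows. This is a
generic illustration and is not taken from zsh's source: the names
acquire_lock and release_lock, the one-second retry interval, and the
use of exclusive creation are all assumptions, and whether exclusive
creation is reliable on a particular NFS setup is a separate question.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Build "<path>.LOCK" into buf; returns 0 on success, -1 if it won't fit. */
static int lock_name(const char *path, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s.LOCK", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Hypothetical helper: take the lock for `path` by creating `path`.LOCK
 * exclusively, trying up to max_tries times. Returns 0 when the lock was
 * acquired, -1 if it is still held by someone else or on any other error.
 */
int acquire_lock(const char *path, int max_tries)
{
    char lock[4096];
    if (lock_name(path, lock, sizeof lock) < 0)
        return -1;
    for (int i = 0; i < max_tries; i++) {
        /* O_EXCL makes the open fail if the lock file already exists. */
        int fd = open(lock, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd >= 0) {
            close(fd);
            return 0;
        }
        if (errno != EEXIST)
            return -1;          /* unexpected failure, do not retry */
        if (i + 1 < max_tries)
            sleep(1);           /* lock is held; wait before retrying */
    }
    return -1;
}

/* Hypothetical helper: release the lock by removing `path`.LOCK. */
int release_lock(const char *path)
{
    char lock[4096];
    if (lock_name(path, lock, sizeof lock) < 0)
        return -1;
    return unlink(lock);
}
```

A writer would call acquire_lock before appending or rewriting and
release_lock afterwards. The sketch does not handle stale locks left
behind by a process that died while holding one.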

> not just once when the shell terminates?

Zsh also periodically rewrites the history file when the number of
appended lines exceeds the SAVEHIST count by a certain percentage.

..wayne..



* Re: History corruption (over NFS)
  2005-02-14 16:24     ` Bart Schaefer
@ 2005-02-15  9:50       ` Vincent Lefevre
  2005-02-15 16:15         ` Bart Schaefer
  0 siblings, 1 reply; 11+ messages in thread
From: Vincent Lefevre @ 2005-02-15  9:50 UTC
  To: zsh-users

On 2005-02-14 16:24:40 +0000, Bart Schaefer wrote:
> Try changing from "noac" to an explicit "hard".

The nfs(5) man page says:

   hard       If an NFS file operation has a major timeout then report
              "server not responding"  on  the  console  and  continue
              retrying indefinitely.  This is the default.

Why would this option change anything here?




* Re: History corruption (over NFS)
  2005-02-14 18:57     ` Wayne Davison
  2005-02-14 21:06       ` Seth Kurtzberg
@ 2005-02-15  9:58       ` Vincent Lefevre
  1 sibling, 0 replies; 11+ messages in thread
From: Vincent Lefevre @ 2005-02-15  9:58 UTC
  To: zsh-users

On 2005-02-14 10:57:05 -0800, Wayne Davison wrote:
> A little while ago I wrote a patch that caused zsh to change its
> rewriting strategy to use a HISTORY_FILE.new file and then rename it
> into place. The purpose of this patch is to avoid losing any history
> if zsh gets interrupted during the writing of the new history file.
> However, it should also be a friendlier update strategy for NFS dirs
> because the inode of the file changes.

Will the other NFS clients necessarily notice the inode change?

Anyway it seems that this solution would work. I did a test with
Emacs, which changes the inode when modifying a file, simulating
what zsh does by removing some .zhistory lines at the beginning,
and there was no problem with NFS.

> The downside is that it can undo some user's use of symlinks,
> hardlinks, or special group permissions on their history file
> (because it is replacing the history file instead of rewriting it
> in-place). Perhaps this algorithm should be made optional?

Making it optional would be a good idea, much better than requiring
some mount options (which can slow down other applications...).




* Re: History corruption (over NFS)
  2005-02-15  9:50       ` Vincent Lefevre
@ 2005-02-15 16:15         ` Bart Schaefer
  0 siblings, 0 replies; 11+ messages in thread
From: Bart Schaefer @ 2005-02-15 16:15 UTC
  To: zsh-users

On Feb 15, 10:50am, Vincent Lefevre wrote:
} Subject: Re: History corruption (over NFS)
}
} On 2005-02-14 16:24:40 +0000, Bart Schaefer wrote:
} > Try changing from "noac" to an explicit "hard".
} 
} Why would this option change anything here?

Sorry, you're right.  I haven't been doing NFS administration for a while,
I should have looked before I spoke.

The option I was thinking of was "sync".



