* History corruption (over NFS)
From: Vincent Lefevre @ 2005-02-10 16:54 UTC
To: zsh-users

I've noticed history corruption (several times) when using screen,
with the following .screenrc file: the end of the .zhistory file
gets many null characters. I thought that zsh was using a lock
mechanism.

sessionname test32
shell -zsh
chdir /users/spaces/vlefevre/oldtests
sorendition 4 06
autodetach on
defscrollback 6000
escape ^za
hardstatus off
hardstatus string "%h%n (%t)"
termcapinfo xterm*|rxvt hs:ts=\E]2;:fs=^G:ds=\E]2;TITLEDISABLED^G
bind r screen telnet ulysse 4913
screen -h 6000 -t server 1 zsh -c "./t3-server -v results.stp.-2.54 2> server.out"
screen -h 6000 -t secstep 2 nice -2 zsh -c "./t3-secstep -r=results.stp.-2.54 --remove -l=20 2>&1 2> secstep.out"

Some additional information:

ulysse:~/tmd/oldtests> echo $ZSH_VERSION
4.2.0
ulysse:~/tmd/oldtests> setopt|grep hist
extendedhistory
histignoredups
histignorespace
histnofunctions
histnostore
histreduceblanks
incappendhistory
ulysse:~/tmd/oldtests> unsetopt|grep hist
noappendhistory
nobanghist
cshjunkiehistory
histallowclobber
nohistbeep
histexpiredupsfirst
histfindnodups
histignorealldups
histsavenodups
histverify
sharehistory
ulysse:~/tmd/oldtests> echo $HISTFILE
/users/spaces/vlefevre/.zhistory
ulysse:~/tmd/oldtests> echo $HISTSIZE
8000
ulysse:~/tmd/oldtests> mount|grep /users/spaces
waly:/vol/users/spaces on /users/spaces type nfs (rw,nfsvers=3,rsize=32768,wsize=32768,proto=tcp,addr=152.81.1.27)

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

* Re: History corruption (over NFS)
From: Bart Schaefer @ 2005-02-10 18:41 UTC
To: zsh-users

On Feb 10, 5:54pm, Vincent Lefevre wrote:
}
} I've noticed history corruption (several times) when using screen,
} with the following .screenrc file: the end of the .zhistory file
} gets many null characters. I thought that zsh was using a lock
} mechanism.

My experience has been that this sort of thing occurs when a file is
shortened with e.g. ftruncate(), and is a problem with NFS and not with
whether a locking mechanism is in use.

One thing to check is whether the file on the NFS server actually does
contain those nul bytes, or if it's only the NFS client that sees them.

I vaguely recall that you may need to explicitly specify "noac" in the
NFS mount options, but it's been quite some time since I encountered
this particular problem.

* Re: History corruption (over NFS)
From: Vincent Lefevre @ 2005-02-10 23:54 UTC
To: zsh-users

On 2005-02-10 18:41:11 +0000, Bart Schaefer wrote:
> One thing to check is whether the file on the NFS server actually
> does contain those nul bytes, or if it's only the NFS client that
> sees them.

The file on the server has these nul bytes; more precisely, the server
makes backups available in .snapshot directories, and one of the
backups has nul bytes.

> I vaguely recall that you may need to explicitly specify "noac" in the
> NFS mount options, but it's been quite some time since I encountered
> this particular problem.

OK.

* Re: History corruption (over NFS)
From: Vincent Lefevre @ 2005-02-14 15:56 UTC
To: zsh-users

On 2005-02-10 18:41:11 +0000, Bart Schaefer wrote:
> One thing to check is whether the file on the NFS server actually does
> contain those nul bytes, or if it's only the NFS client that sees them.
> I vaguely recall that you may need to explicitly specify "noac" in the
> NFS mount options, but it's been quite some time since I encountered
> this particular problem.

The "noac" didn't change anything (except making command execution
slower). Here's what my corrupted .zhistory file looked like:

[...]
: 1108388462:0;ssh priam
^@^@^@^@^@^@^@^@[...]^@^@: 1108388476:0;t3-exec ulysse 0 -c=3 -l=3
[...]
: 1108390243:0;lt results.*
: 1108390256:0;head old-xtp/results.ctp.-3.54.1
: 1108390282:0;t3-secstep -f=ctp -e=-3 -m=54 -i=40 -n=44 --imax=7935
: 1108390290:0;screen -h 6000 -t server 1 zsh -c "./t3-server -v +results.ctp.-3.54 2> server.out"
: 1108390309:0;ssh priam
[...]

If I've understood correctly, the first "ssh priam" started a zsh that
truncated the history file (since its size was above the limit). Note
that I quit this shell immediately. Then the current zsh instances
went on writing to the history file without noticing that it had been
truncated, hence the null bytes. The corrupt history was detected when
I started the second "ssh priam".

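The mechanism suspected in the message above (a file is shortened, then
a writer still believes in the old end-of-file offset) can be reproduced
on a local filesystem with the following sketch. It is only an
illustration under stated assumptions: stale_append_nul_count is a
hypothetical helper, not zsh code, and the explicit pwrite() at the old
length merely simulates what a client with a stale view of the file size
might do. It makes no claim about how NFS or zsh actually behave.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not part of zsh).
 * Writes `old` to `path`, truncates the file to `keep` bytes, then
 * writes `entry` at the stale offset strlen(old).  Returns the number
 * of NUL bytes found in the resulting file, or -1 on error.  The demo
 * file is removed before returning. */
long stale_append_nul_count(const char *path, const char *old,
                            size_t keep, const char *entry)
{
    size_t old_len = strlen(old), entry_len = strlen(entry);
    char buf[256];
    long nuls = 0;
    off_t pos = 0;
    ssize_t n;
    int fd;

    assert(keep <= old_len);
    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Fill the file, shorten it, then write at the old end offset. */
    if (write(fd, old, old_len) != (ssize_t)old_len ||
        ftruncate(fd, (off_t)keep) < 0 ||
        pwrite(fd, entry, entry_len, (off_t)old_len) != (ssize_t)entry_len) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Bytes between the new length and the stale offset read as NUL. */
    while ((n = pread(fd, buf, sizeof buf, pos)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\0')
                nuls++;
        pos += n;
    }
    close(fd);
    unlink(path);
    return n < 0 ? -1 : nuls;
}
```

The gap between the truncated length and the stale offset is never
written, so the filesystem reports it as NUL bytes, which matches the
run of ^@ characters in the excerpt.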
* Re: History corruption (over NFS)
From: Bart Schaefer @ 2005-02-14 16:24 UTC
To: zsh-users

On Feb 14, 4:56pm, Vincent Lefevre wrote:
} Subject: Re: History corruption (over NFS)
}
} The "noac" didn't change anything (except making command execution
} slower).
}
} If I've understood correctly, the first "ssh priam" started a zsh that
} truncated the history file (since its size was above the limit). Note
} that I quit this shell immediately. Then the current zsh instances
} went on writing to the history file without noticing that it had been
} truncated, hence the null bytes.

Try changing from "noac" to an explicit "hard". I wouldn't have thought
this was needed in a modern NFS environment, but the symptoms sure look
familiar.

* Re: History corruption (over NFS)
From: Vincent Lefevre @ 2005-02-15 9:50 UTC
To: zsh-users

On 2005-02-14 16:24:40 +0000, Bart Schaefer wrote:
> Try changing from "noac" to an explicit "hard".

The nfs(5) man page says:

    hard    If an NFS file operation has a major timeout then report
            "server not responding" on the console and continue
            retrying indefinitely. This is the default.

Why would this option change anything here?

* Re: History corruption (over NFS)
From: Bart Schaefer @ 2005-02-15 16:15 UTC
To: zsh-users

On Feb 15, 10:50am, Vincent Lefevre wrote:
} Subject: Re: History corruption (over NFS)
}
} On 2005-02-14 16:24:40 +0000, Bart Schaefer wrote:
} > Try changing from "noac" to an explicit "hard".
}
} Why would this option change anything here?

Sorry, you're right. I haven't been doing NFS administration for a
while, I should have looked before I spoke. The option I was thinking
of was "sync".

* Re: History corruption (over NFS)
From: Wayne Davison @ 2005-02-14 18:57 UTC
To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

On Mon, Feb 14, 2005 at 04:56:30PM +0100, Vincent Lefevre wrote:
> Then the current zsh instances went on writing to the history file
> without noticing that it had been truncated, hence the null bytes.

I think this is an NFS problem, since the zsh instances do not keep the
history file open -- they just open the file for appending, and add an
item to the end. When the file is rewritten, it is opened for writing
with O_TRUNC, and the new contents output.

A little while ago I wrote a patch that caused zsh to change its
rewriting strategy to use a HISTORY_FILE.new file and then rename it
into place. The purpose of this patch is to avoid losing any history if
zsh gets interrupted during the writing of the new history file.
However, it should also be a friendlier update strategy for NFS dirs
because the inode of the file changes. The downside is that it can undo
some user's use of symlinks, hardlinks, or special group permissions on
their history file (because it is replacing the history file instead of
rewriting it in-place). Perhaps this algorithm should be made optional?

I've attached a diff in case you want to try it out.

..wayne..

[-- Attachment #2: hist-dot-new.patch --]
[-- Type: text/plain, Size: 1528 bytes --]

--- orig/hist.c	2005-01-28 02:16:20 -0800
+++ hist.c	2004-10-18 15:21:54 -0700
@@ -2004,7 +2004,7 @@
 void
 savehistfile(char *fn, int err, int writeflags)
 {
-    char *t, *start = NULL;
+    char *t, *tmpfile, *start = NULL;
     FILE *out;
     Histent he;
     zlong xcurhist = curhist - !!(histactive & HA_ACTIVE);
@@ -2041,12 +2041,14 @@
         extended_history = 1;
     }
     if (writeflags & HFILE_APPEND) {
+        tmpfile = NULL;
         out = fdopen(open(unmeta(fn),
                         O_CREAT | O_WRONLY | O_APPEND | O_NOCTTY, 0600), "a");
     }
     else {
-        out = fdopen(open(unmeta(fn),
-                         O_CREAT | O_WRONLY | O_TRUNC | O_NOCTTY, 0600), "w");
+        tmpfile = bicat(unmeta(fn), ".new");
+        unlink(tmpfile);
+        out = fdopen(open(tmpfile, O_WRONLY | O_CREAT | O_EXCL, 0600), "w");
     }
     if (out) {
         for (; he && he->histnum <= xcurhist; he = down_histent(he)) {
@@ -2091,6 +2093,11 @@
             lasthist.text = ztrdup(start);
         }
         fclose(out);
+        if (tmpfile) {
+            if (rename(tmpfile, unmeta(fn)) < 0)
+                zerr("can't rename %s.new to $HISTFILE", fn, 0);
+            free(tmpfile);
+        }
 
         if (writeflags & HFILE_SKIPOLD
             && !(writeflags & (HFILE_FAST | HFILE_NO_REWRITE))) {
@@ -2110,8 +2117,13 @@
             pophiststack();
             histactive = remember_histactive;
         }
-    } else if (err)
-        zerr("can't write history file %s", fn, 0);
+    } else if (err) {
+        if (tmpfile) {
+            zerr("can't write history file %s.new", fn, 0);
+            free(tmpfile);
+        } else
+            zerr("can't write history file %s", fn, 0);
+    }
 
     unlockhistfile(fn);
 }

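The rewriting strategy in the patch above (write a ".new" file, then
rename it over the history file) can be sketched outside zsh roughly as
follows. This is a minimal illustration, not zsh's code:
rewrite_via_rename is a hypothetical name, and plain POSIX calls stand
in for zsh's bicat(), unmeta() and zerr(). History locking is left out
entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not part of zsh).
 * Replaces the contents of `path` by writing `contents` to
 * "<path>.new" and renaming that file over `path`.
 * Returns 0 on success, -1 on failure. */
int rewrite_via_rename(const char *path, const char *contents)
{
    size_t len, tmplen;
    char *tmp;
    int fd, ok;

    assert(path != NULL && contents != NULL);
    len = strlen(contents);
    tmplen = strlen(path) + sizeof ".new";   /* sizeof counts the NUL */
    tmp = malloc(tmplen);
    if (!tmp)
        return -1;
    snprintf(tmp, tmplen, "%s.new", path);

    /* Discard any leftover from an interrupted earlier rewrite, then
     * create the temporary file exclusively, as the patch does. */
    unlink(tmp);
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        free(tmp);
        return -1;
    }

    ok = write(fd, contents, len) == (ssize_t)len;
    if (close(fd) < 0)
        ok = 0;

    /* rename() swaps the new file into place in one step, so a reader
     * sees either the old contents or the new ones. */
    if (ok && rename(tmp, path) < 0)
        ok = 0;
    if (!ok)
        unlink(tmp);
    free(tmp);
    return ok ? 0 : -1;
}
```

Because the old file is never opened with O_TRUNC, an interrupted
rewrite leaves the existing history untouched. As noted in the message,
the trade-off is that the file gets a new inode, so symlinks, hard links
and special permissions on the original are not preserved.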
* Re: History corruption (over NFS)
From: Seth Kurtzberg @ 2005-02-14 21:06 UTC
To: Wayne Davison; +Cc: zsh-users

Wayne Davison wrote:
>On Mon, Feb 14, 2005 at 04:56:30PM +0100, Vincent Lefevre wrote:
>>Then the current zsh instances went on writing to the history file
>>without noticing that it had been truncated, hence the null bytes.
>
>I think this is an NFS problem, since the zsh instances do not keep the
>history file open -- they just open the file for appending, and add an
>item to the end. When the file is rewritten, it is opened for writing
>with O_TRUNC, and the new contents output.
>
>A little while ago I wrote a patch that caused zsh to change its
>rewriting strategy to use a HISTORY_FILE.new file and then rename it
>into place. The purpose of this patch is to avoid losing any history if
>zsh gets interrupted during the writing of the new history file.
>However, it should also be a friendlier update strategy for NFS dirs
>because the inode of the file changes. The downside is that it can undo
>some user's use of symlinks, hardlinks, or special group permissions on
>their history file (because it is replacing the history file instead of
>rewriting it in-place). Perhaps this algorithm should be made optional?

Won't this cause problems in the mode where history is shared, and
history is written for every command, not just once when the shell
terminates?

>I've attached a diff in case you want to try it out.
>
>..wayne..

* Re: History corruption (over NFS)
From: Wayne Davison @ 2005-02-14 22:33 UTC
To: Seth Kurtzberg; +Cc: zsh-users

On Mon, Feb 14, 2005 at 02:06:12PM -0700, Seth Kurtzberg wrote:
> Won't this cause problems in the mode where history is shared, and
> history is written for every command

Nope -- appended history still appends just fine. As mentioned, zsh
re-opens the history file for every append, so it matters not if the
file gets rewritten to a new inode. Also, every append and every
rewrite is locked by using a .zhistory.LOCK file, so multiple shells
don't update the history file at the same time.

> not just once when the shell terminates?

Zsh also periodically rewrites the history file when the number of
appended lines exceeds the SAVEHIST count by a certain percentage.

..wayne..

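The append path described above (take a lock file, re-open the history
file by name, append one entry, release the lock) might look something
like the sketch below. Everything here is an assumption made for
illustration: append_locked is a hypothetical name, the O_EXCL lock file
and the retry count are stand-ins chosen for brevity, and zsh's real
locking code may work quite differently. Exclusive creation is also not
reliable on every NFS version, so this is not a recommendation for NFS.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not zsh's locking code).
 * Appends `entry` to `path` while holding "<path>.LOCK".
 * Returns 0 on success, -1 if the lock or the write fails. */
int append_locked(const char *path, const char *entry)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    size_t len, locklen;
    char *lock;
    int lockfd = -1, fd, ok = 0;

    assert(path != NULL && entry != NULL);
    len = strlen(entry);
    locklen = strlen(path) + sizeof ".LOCK";  /* sizeof counts the NUL */
    lock = malloc(locklen);
    if (!lock)
        return -1;
    snprintf(lock, locklen, "%s.LOCK", path);

    /* Try a few times to create the lock file exclusively. */
    for (int tries = 0; tries < 10 && lockfd < 0; tries++) {
        lockfd = open(lock, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (lockfd < 0) {
            if (errno != EEXIST)
                break;
            nanosleep(&pause, NULL);
        }
    }
    if (lockfd < 0) {
        free(lock);
        return -1;
    }
    close(lockfd);

    /* Re-open by name on every append, so a history file that was
     * replaced by rename() since the last append is picked up. */
    fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_NOCTTY, 0600);
    if (fd >= 0) {
        ok = write(fd, entry, len) == (ssize_t)len;
        if (close(fd) < 0)
            ok = 0;
    }
    unlink(lock);
    free(lock);
    return ok ? 0 : -1;
}
```

Opening by name each time is what makes appends indifferent to the
inode change: no descriptor to the old file is held between commands.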
* Re: History corruption (over NFS)
From: Vincent Lefevre @ 2005-02-15 9:58 UTC
To: zsh-users

On 2005-02-14 10:57:05 -0800, Wayne Davison wrote:
> A little while ago I wrote a patch that caused zsh to change its
> rewriting strategy to use a HISTORY_FILE.new file and then rename it
> into place. The purpose of this patch is to avoid losing any history
> if zsh gets interrupted during the writing of the new history file.
> However, it should also be a friendlier update strategy for NFS dirs
> because the inode of the file changes.

Will the other NFS clients necessarily notice the inode change?

Anyway it seems that this solution would work. I did a test with
Emacs, which changes the inode when modifying a file, simulating what
zsh does by removing some .zhistory lines at the beginning, and there
was no problem with NFS.

> The downside is that it can undo some user's use of symlinks,
> hardlinks, or special group permissions on their history file
> (because it is replacing the history file instead of rewriting it
> in-place). Perhaps this algorithm should be made optional?

Making it optional would be a good idea, much better than requiring
some mount options (which can slow down other applications...).
