From: Philippe Troin
To: Zsh hackers list
Subject: Re: Issues with fcntl() history file locking
Date: Wed, 27 Feb 2019 22:36:53 -0800
References: <717dfbf28e1b56d070ad0038f0367e3d2ab99464.camel@fifi.org>
List-Id: Zsh Workers List
X-Seq: 44094

On Wed, 2019-02-27 at 13:27 -0800, Bart Schaefer wrote:
> On Wed, Feb 27, 2019 at 10:31 AM Philippe Troin wrote:
> > I've been using zsh with share_history for many years and never had
> > any real issues on several networks where my home directory is
> > mounted over NFS. Recently, it's been giving me trouble, maybe when
> > I bumped up my history file size to 10k entries.
> >
> > I then discovered hist_fcntl_lock, which I had not ever set, and
> > turned it on. It didn't improve anything.
>
> Well, it wouldn't ... in fact it would likely make things worse.
> flock() historically doesn't work reliably over NFS, and if you turn
> that option on you are disabling the symlink-based file locking that
> is usually more NFS-friendly. We used to do both kinds of locking
> when hist_fcntl_lock, but workers/32580 reverted to using only one
> kind ... I forget why I was asked to do that, probably something not
> working as fast as was desired.

Not necessarily worse. While you're right that (BSD) flock() never
worked correctly on NFS, that is not the case with POSIX fcntl() locks.
Zsh uses the latter, even though the zsh functions are named flock*.
Also, locking the file with fcntl() clears the NFS attribute cache for
that inode, making sure that you get the latest data.

> > Unfortunately, POSIX states that the fcntl() lock will be released
> > upon the closing the first descriptor to the file. [...and thus...]
> >
> > * writehistfile writes the history file without lock
>
> If that were the problem, you'd be likely to see corrupted entries
> (the read stopping somewhere in the middle of what's being written) or
> problems when both shells were writing to the file, which would also
> likely manifest as corrupted entries.
>
> Do the entries from terminal 1 NEVER show up in the file? Are they in
> the file but never show up in the history of terminal 2? Or are they
> just slow to arrive in terminal 2?
>
> I'd be more inclined to suspect async NFS issues rather than locking.
> Have you strace'd both processes to see when writes v. reads are
> happening?

The history file never gets corrupted. What I'm experiencing is a loss
of sync for a while: new commands on host1 never seem to appear (or
take a long time to appear) on host2. Given that this happens randomly,
it's hard to catch zsh in the act.
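
To make the POSIX close() behaviour quoted above concrete, here is a
minimal sketch in Python (not zsh code). It assumes a local filesystem
and a POSIX system; the helper names (PROBE, refused_elsewhere, demo)
are made up for this illustration. Python's fcntl.lockf() is a wrapper
around fcntl() record locks, so it should show the same semantics:
closing any descriptor for the file drops the process's lock.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical probe, run in a separate process: it tries a non-blocking
# exclusive fcntl() lock and exits 1 if some other process holds one.
PROBE = """
import fcntl, sys
with open(sys.argv[1], "r+") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(1)
sys.exit(0)
"""


def refused_elsewhere(path):
    """True if another process is refused the lock, i.e. we still hold it."""
    result = subprocess.run([sys.executable, "-c", PROBE, path])
    return result.returncode == 1


def demo():
    fd, path = tempfile.mkstemp()
    try:
        # Take an exclusive POSIX lock through the first descriptor.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        held_before = refused_elsewhere(path)

        # Open and close a second, unrelated descriptor to the same file.
        # Per POSIX, this close releases all of our locks on the file.
        fd2 = os.open(path, os.O_RDONLY)
        os.close(fd2)
        held_after = refused_elsewhere(path)

        return held_before, held_after
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The first value says whether the lock was held before the second
descriptor was closed, the second whether it is still held afterwards.
Over NFS the outcome may additionally depend on the client and server
lock implementation, which this sketch does not exercise.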

> > The right and hard way is to have the various calls to open() the
> > history file to actually use the flock_fd lock file descriptor (and
> > not close it when done with it, leaving that to unlockhistfile()).
> >
> > The easy messy way is to keep track of all the open descriptors to
> > the history file in a global variable, and delaying the actual close
> > until unlockhistfile() is called.
>
> If this actually turns out to be necessary, the second way is more
> similar to how we handle descriptors in other parts of the shell.

I'll do further experiments. This is my current hunch: everything is
swell as long as lines are only appended to the history file. But when
one host decides it's time to trim the history file, that's when stuff
hits the fan.

If anyone has an idea on how to force zsh to trim history reliably, I'm
all ears.

Phil.
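
As an illustration of the "easy messy way" quoted above, here is a
rough sketch in Python rather than zsh's C. It is not the zsh
implementation: the class DeferredCloseHistory, its methods, and the
probe helper are hypothetical names invented for this example. The idea
it tries to show is only that descriptors opened while the fcntl() lock
is held are remembered and closed at unlock time, so that no early
close() drops the lock.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical probe, run in a separate process: exits 1 if it is
# refused a non-blocking exclusive fcntl() lock on the file.
PROBE = """
import fcntl, sys
with open(sys.argv[1], "r+") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(1)
sys.exit(0)
"""


def refused_elsewhere(path):
    """True if another process cannot lock the file (we still hold it)."""
    return subprocess.run([sys.executable, "-c", PROBE, path]).returncode == 1


class DeferredCloseHistory:
    """Hypothetical helper: delays close() while the lock is held."""

    def __init__(self, path):
        self.path = path
        self.lock_fd = None
        self.pending = []  # descriptors whose close() has been deferred

    def lock(self):
        self.lock_fd = os.open(self.path, os.O_RDWR)
        fcntl.lockf(self.lock_fd, fcntl.LOCK_EX)

    def open(self, flags):
        return os.open(self.path, flags)

    def close(self, fd):
        if self.lock_fd is not None:
            # Closing now would release the fcntl() lock, so remember it.
            self.pending.append(fd)
        else:
            os.close(fd)

    def unlock(self):
        # Release the lock explicitly, then perform the deferred closes.
        fcntl.lockf(self.lock_fd, fcntl.LOCK_UN)
        for fd in self.pending:
            os.close(fd)
        self.pending.clear()
        os.close(self.lock_fd)
        self.lock_fd = None


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)  # no lock is held yet, so this close is harmless
    hist = DeferredCloseHistory(path)
    try:
        hist.lock()
        read_fd = hist.open(os.O_RDONLY)
        hist.close(read_fd)  # deferred, the lock should survive
        held_while_locked = refused_elsewhere(path)
        hist.unlock()
        held_after_unlock = refused_elsewhere(path)
        return held_while_locked, held_after_unlock
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The trade-off is the one already named in the thread: this keeps extra
descriptors open for the duration of the lock, whereas the "right and
hard way" would route all reads and writes through the single locked
descriptor.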