X-Seq: 44093
References: <717dfbf28e1b56d070ad0038f0367e3d2ab99464.camel@fifi.org>
In-Reply-To: <717dfbf28e1b56d070ad0038f0367e3d2ab99464.camel@fifi.org>
From: Bart Schaefer
Date:
Wed, 27 Feb 2019 13:27:37 -0800
Subject: Re: Issues with fcntl() history file locking
To: Philippe Troin
Cc: Zsh hackers list

On Wed, Feb 27, 2019 at 10:31 AM Philippe Troin wrote:
>
> I've been using zsh with share_history for many years and never had any
> real issues on several networks where my home directory is mounted over
> NFS. Recently, it's been giving me trouble, maybe when I bumped up my
> history file size to 10k entries.
>
> I then discovered hist_fcntl_lock, which I had not ever set, and turned
> it on. It didn't improve anything.

Well, it wouldn't ... in fact it would likely make things worse.
fcntl() locking historically doesn't work reliably over NFS, and if you
turn that option on you are disabling the symlink-based file locking
that is usually more NFS-friendly. We used to do both kinds of locking
when hist_fcntl_lock was set, but workers/32580 reverted to using only
one kind ... I forget why I was asked to do that, probably something
not working as fast as was desired.

> Unfortunately, POSIX states that the fcntl() lock will be released upon
> the closing of the first descriptor to the file. [...and thus...]
>
> * writehistfile writes the history file without lock

If that were the problem, you'd be likely to see corrupted entries (the
read stopping somewhere in the middle of what's being written) or
problems when both shells were writing to the file, which would also
likely manifest as corrupted entries.

Do the entries from terminal 1 NEVER show up in the file? Are they in
the file but never show up in the history of terminal 2? Or are they
just slow to arrive in terminal 2?

I'd be more inclined to suspect async NFS issues rather than locking.
Have you strace'd both processes to see when writes vs. reads are
happening?
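
For reference, the POSIX behaviour described in the quoted text can be
reproduced outside the shell. What follows is a minimal sketch, not zsh
code: every function name in it is made up for illustration, and it
assumes a local filesystem where fcntl() record locks behave as POSIX
specifies. It takes a whole-file write lock through one descriptor,
opens and closes a second descriptor on the same file, and then asks a
forked child (via F_GETLK) whether the lock is still held.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take a whole-file write lock on fd. Returns 0 on success, -1 on error. */
static int lock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Ask from a child process whether a write lock on path would conflict
 * with a lock held by someone else. A child is needed because a process
 * never conflicts with its own locks, and fcntl() locks are not
 * inherited across fork().
 * Returns 1 if the file is locked, 0 if not, -1 on error.
 */
static int is_locked_for_others(const char *path)
{
    int status, code;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl = {0};
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    code = WEXITSTATUS(status);
    return code == 2 ? -1 : code;
}

/*
 * Returns 1 if closing a second descriptor dropped the lock taken
 * through the first one, 0 if the lock survived, -1 on error.
 */
static int lock_lost_on_unrelated_close(const char *path)
{
    int other_fd, still_locked;
    int lock_fd = open(path, O_RDWR | O_CREAT, 0600);

    if (lock_fd < 0)
        return -1;
    if (lock_whole_file(lock_fd) < 0 || is_locked_for_others(path) != 1) {
        close(lock_fd);
        return -1;
    }

    /* Stand-in for reading or writing the file via a fresh descriptor. */
    other_fd = open(path, O_RDONLY);
    if (other_fd < 0) {
        close(lock_fd);
        return -1;
    }
    close(other_fd);            /* this close is what releases the lock */

    still_locked = is_locked_for_others(path);
    close(lock_fd);
    return still_locked < 0 ? -1 : !still_locked;
}
```

Calling lock_lost_on_unrelated_close() on a scratch file should show
whether a given filesystem follows the POSIX rule. It says nothing by
itself about whether this is what causes the missing history entries.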
> The right and hard way is to have the various calls to open() the
> history file to actually use the flock_fd lock file descriptor (and not
> close it when done with it, leaving that to unlockhistfile()).
>
> The easy messy way is to keep track of all the open descriptors to the
> history file in a global variable, and delaying the actual close until
> unlockhistfile() is called.

If this actually turns out to be necessary, the second way is more
similar to how we handle descriptors in other parts of the shell.