From: Peter Stephenson
To: ondra@mistotebe.net
Cc: zsh workers
Date: Fri, 23 Jan 2015 12:13:09 +0000
Subject: Re: History file lock contention
In-Reply-To: <20150123114628.GB10898@mistotebe.net>
List-Id: Zsh Workers List
X-Seq: 34356

On 23 January 2015 at 11:46, wrote:

> when running zsh from a terminal multiplexer like tmux, it is possible
> (and not uncommon) to close several processes at the exact same
> moment. In that case, all of them try to sync their history and all but
> one fail to acquire the lock. So far so good.
>
> However, all of these call nanosleep({1,0}), and since we might be on a
> multicore system, they might just be scheduled at the same time again;
> only one of them will progress and the scenario repeats. What the user
> sees is the terminals not disappearing immediately, but one after the
> other, a second per process.
>
> A common solution to this problem is to randomize the back-off, for
> example in the interval (0, 1s). On a decent system without load, this
> would make it more responsive (all of them would be likely to finish
> within a second of the user's request if possible) while not causing
> any undue increase in the load on the machine.

Yes, this is sensible. I think the current simplistic algorithm is mostly
there because we aren't making assumptions about the availability of sleep
with a resolution better than 1 second, and a random backoff in multiples
of whole seconds isn't very useful. However, it can be improved in various
largely standard ways that are already used in other parts of the shell,
so this is fixable without relying on system-specific features.

A better scheme would be to start with a small base wait, say 0.1 seconds,
plus a random backoff of at most the same length again, and then double
the base time on each retry, while keeping an overall limit of 10 seconds
from the first attempt before reporting a failure. We can then tweak this
as necessary. Or maybe just a random backoff with a doubling maximum and
no fixed base time is good enough for typical problems.

pws

P.S. if acronyms like EDCA and SIFS went through anyone else's mind, too,
my commiserations :-).
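For concreteness, the scheme sketched above (0.1 s base wait, a random
jitter of at most the base again, doubling on each retry, ~10 s overall
cap) might look roughly like the following. This is an illustrative
sketch only, not zsh's actual hist.c locking code; all the names here
(jittered_wait, acquire_with_backoff, BASE_NS, TOTAL_NS) are made up for
the example, and the real code would call nanosleep() where indicated:

```c
/* Randomized exponential backoff sketch -- not the real zsh code. */
#include <stdio.h>
#include <stdlib.h>

#define BASE_NS  100000000LL    /* initial base wait: 0.1 s in ns */
#define TOTAL_NS 10000000000LL  /* give up after roughly 10 s overall */

/* Return a wait of base plus a random jitter of at most the same
 * length again, i.e. a value in [base, 2*base]. */
static long long jittered_wait(long long base)
{
    return base + (long long)(((double)rand() / RAND_MAX) * (double)base);
}

/* Retry trylock() with doubling, jittered waits until it succeeds or
 * the total time waited would exceed TOTAL_NS.  Returns 1 on success,
 * 0 on giving up (report a failure to the user at that point). */
static int acquire_with_backoff(int (*trylock)(void))
{
    long long base = BASE_NS, elapsed = 0;

    while (!trylock()) {
        long long w = jittered_wait(base);
        if (elapsed + w > TOTAL_NS)
            return 0;           /* hit the overall cap: give up */
        /* real code would nanosleep() for w nanoseconds here */
        elapsed += w;
        base *= 2;              /* double the base each time we back off */
    }
    return 1;
}
```

The point of the jitter is to decorrelate the contending shells: even if
tmux wakes them all at the same instant, their retry times spread out, so
one of them gets the lock on each round instead of all colliding again.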