From: Dewayne
Date: Tue, 11 Oct 2022 11:38:19 +1100
Subject: Re: "Back off" setting for crashing services with s6+openrc?
To: supervision@list.skarnet.org

On 30/09/2022 11:21 pm, Laurent Bercot wrote:
> A service that keeps crashing is an abnormal condition

Yes - sometimes that's what villains like to do :)

> Note that I don't think making the restart delay configurable is a good
> trade-off. It adds complexity, size and failure cases to the s6-supervise
> code, it adds another file to a service directory for users to remember,
> it adds another avenue for configuration mistakes causing downtime, all
> that to save resources for a pathological case. The difference between
> 0 second and 1 second of free CPU is significant; longer delays have
> diminishing returns.

I'm not sure that it's entirely pathological, as I also use 'finish' with
a 'sleep' and timeout-finish in an effort to reduce SROP issues. It's also
fairly common for us to have a loadavg of 4x ncores.

In the general case, yes: if you want a process running then it should be
up asap, and for that I'm very appreciative. :)

Ref:
ROP  https://en.wikipedia.org/wiki/Return-oriented_programming
SROP https://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf
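
For illustration, the 'finish' plus 'sleep' plus timeout-finish arrangement mentioned above might be set up roughly as follows. This is only a sketch: make_backoff is a hypothetical helper, and the service directory path and the delay are placeholders rather than my actual values. It assumes the documented s6 behaviour that timeout-finish is given in milliseconds and that ./finish is killed once it runs longer than that (default 5000), so the timeout is set a few seconds above the sleep.

```shell
#!/bin/sh
# Sketch only: write a sleeping ./finish and a matching timeout-finish
# into an s6 service directory. make_backoff is a hypothetical helper.
make_backoff() {
    svcdir=$1    # service directory (placeholder path supplied by the caller)
    delay=$2     # back-off in whole seconds (placeholder value)

    # ./finish runs after ./run exits; sleeping here postpones the restart.
    printf '#!/bin/sh\nsleep %s\n' "$delay" > "$svcdir/finish"
    chmod +x "$svcdir/finish"

    # timeout-finish is in milliseconds. Leave 5 s of headroom (an arbitrary
    # choice) so the sleep is not cut short by the finish timeout.
    echo $(( (delay + 5) * 1000 )) > "$svcdir/timeout-finish"
}
```

Calling, say, `make_backoff /path/to/servicedir 10` would give a 10 second pause before each restart, with timeout-finish set to 15000.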