From: Peter Tribble <peter.tribble@gmail.com>
To: illumos-developer <developer@lists.illumos.org>
Subject: Re: [developer] Review - 15665 svc:/network/loopback exits successfully even if it fails
Date: Wed, 31 Jul 2024 10:44:51 +0100
Message-ID: <CAEgYsbEbQOHxZPnyeituctnLnG4pD8rK6c2D0NA=g84V5bsBPQ@mail.gmail.com>
In-Reply-To: <CAD0Ztp1Fhj=XOBbjw645vC2_R-0wHrX9gbm_3PzBrUR0PSaQwA@mail.gmail.com>
On Tue, Jul 30, 2024 at 11:46 PM Gordon Ross <gordon.w.ross@gmail.com> wrote:
> Optional dependency does that in SMF, right?
>
Well no, that's a rather different case. That is the "I don't care if it's
enabled or not, but if it is I'll have a hard dependency on it".
What we're after here is the "I do care that it's enabled, and must run
after it, but I'm prepared to live with errors".
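
To make the "live with errors" behaviour concrete, here is a minimal sketch
of how a start method could pick its exit code. It is an illustration under
stated assumptions, not the change under review: configure_loopback,
choose_exit_code, run_method and FAKE_LOOPBACK_RESULT are hypothetical names,
and SMF_EXIT_DEGRADED is assumed to be defined only on systems whose SMF
supports a degraded exit (it is left unset here to model one that does not).

```shell
#!/bin/sh
# Sketch only. On a real system SMF_EXIT_OK comes from
# /lib/svc/share/smf_include.sh; it is defined here so the sketch stands alone.
SMF_EXIT_OK=0

# Assumption: a degraded exit code may or may not exist on this platform.
# An empty value models "not available".
SMF_EXIT_DEGRADED="${SMF_EXIT_DEGRADED:-}"

# Hypothetical stand-in for the real work of creating the loopback
# interfaces. FAKE_LOOPBACK_RESULT lets the sketch simulate a failure.
configure_loopback() {
	return "${FAKE_LOOPBACK_RESULT:-0}"
}

# Print the exit code the method should use, given the result of the work.
choose_exit_code() {
	if [ "$1" -eq 0 ]; then
		echo "$SMF_EXIT_OK"
	elif [ -n "$SMF_EXIT_DEGRADED" ]; then
		# Report the failure without blocking dependent services.
		echo "$SMF_EXIT_DEGRADED"
	else
		# No degraded state available: drive on, but leave a trace.
		echo "loopback configuration failed; continuing" >&2
		echo "$SMF_EXIT_OK"
	fi
}

# Run the work and print the chosen exit code.
run_method() {
	configure_loopback
	choose_exit_code $?
}

# A real method script would finish with: exit "$(run_method)"
```

The point of the split is that the failure is always recorded somewhere
(exit code or log line), while dependents are still allowed to start.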
> On Tue, Jul 30, 2024 at 12:56 PM Jorge Schrauwen via illumos-developer <developer@lists.illumos.org> wrote:
>
>> This last reply from Peter made me think of the difference between
>> requires vs after in systemd speak.
>>
>> Although that is probably a lot of work, as one would need that feature
>> and somehow fix all manifests that express a dependency on loopback.
>>
>> Admittedly I sometimes miss a softer dependency in SMF in general.
>>
>> ~ sjorge
>>
>> On 26 Jul 2024, at 17:16, Peter Tribble <peter.tribble@gmail.com> wrote:
>>
>> On Fri, Jul 26, 2024 at 2:50 PM Andy Fiddaman <andy@omnios.org> wrote:
>>
>>>
>>> On Fri, 26 Jul 2024, Peter Tribble wrote:
>>>
>>> > On Fri, Jul 26, 2024 at 9:21 AM Andy Fiddaman <illumos@fiddaman.net> wrote:
>>> >
>>> > > Please can you review the following change?
>>> > >
>>> > > 15665 svc:/network/loopback exits successfully even if it fails
>>> > > https://www.illumos.org/issues/15665
>>> > > https://code.illumos.org/c/illumos-gate/+/3610
>>> > >
>>> >
>>> > When this first came up I expressed my belief that making this change is
>>> > the wrong thing to do, and I'll express it again.
>>>
>>> Apologies Peter. I had recalled that your objection to the original change
>>> was mostly around the addition of the extra dependency to the service, which
>>> I've removed in this new patch set (that is
>>> https://www.illumos.org/issues/15664 which remains open).
>>>
>>> > If this service fails, I think the best thing to do is drive on so that the
>>> > system can come up as far as possible to maximise the chance that the system
>>> > comes up far enough for an administrator to be able to get in and fix it. Not
>>> > putting the service into maintenance is a feature, not a bug.
>>>
>>> The impetus for this change is that over the past couple of years we've had
>>> a number of occasions where we've had to debug networking problems that
>>> have had their root in the fact that the loopback interfaces were not created
>>> for one reason or another. It happened again yesterday in a non-global zone. In
>>> all of these, it would have been really useful and expedited diagnosis if the
>>> service had gone into maintenance. I understand the perspective of allowing the
>>> system to come up as far as possible - to the point of remote access even - but
>>> it still seems wrong for a service to report success where it has not actually
>>> achieved its goal. Is there some middle ground here?
>>>
>>> > I think generally it would be wrong for a single voice to veto any change,
>>> > which means I would generally be uncomfortable sticking a -1 on it, but if
>>> > this does get into the gate it will be reverted in Tribblix.
>>>
>>> Understood. This definitely warrants further discussion.
>>>
>>
>> As I mentioned in my other reply, it seems that what we're after is some
>> way to mark a service as having generated an error without bringing the
>> system down by going into maintenance. Some sort of degraded mode.
>>
>> We have a couple of SMF exit codes that look interesting -
>> SMF_EXIT_MON_DEGRADE and SMF_EXIT_MON_OFFLINE, but I'm sure they were
>> never implemented. There's even an issue in this area -
>> https://www.illumos.org/issues/7711 (which refers back to 8891
>> which is another case of something dropping into maintenance breaking the
>> entire system).
>>
>> Interestingly, looking at the ssh method script for S11
>>
>> https://github.com/oracle/solaris-userland/blob/master/components/openssh/sources/sshd.sh#L132
>> you see the following:
>>
>> # Put the service into degraded mode in case some of previous
>> # configuration tasks failed.
>> # We do not let the service enter maintenance mode, since
>> # we want to keep the system as much operating as feasible.
>> #
>> if [ $ret1 -ne 0 ]; then
>>     smf_method_exit $SMF_EXIT_DEGRADED "hostkey_configuration" \
>>         "Failed to generate missing host keys."
>> fi
>>
>> So the equivalent of SMF_EXIT_DEGRADED might be what we're looking for?
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>>
--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Thread overview: 13+ messages
2024-07-26 8:20 Andy Fiddaman
2024-07-26 12:44 ` [developer] " Peter Tribble
2024-07-26 13:41 ` Toomas Soome
2024-07-26 14:05 ` Peter Tribble
2024-07-26 13:50 ` Andy Fiddaman
2024-07-26 15:14 ` Peter Tribble
2024-07-26 15:55 ` Jorge Schrauwen
2024-07-30 22:46 ` Gordon Ross
2024-07-31 9:44 ` Peter Tribble [this message]
2024-08-01 8:48 ` Joshua M. Clulow
2024-07-26 18:08 ` Alan Coopersmith
2024-08-07 20:04 ` Andy Fiddaman
2024-09-09 17:36 ` Andy Fiddaman