From: Peter Tribble
Date: Wed, 31 Jul 2024 10:44:51 +0100
Subject: Re: [developer] Review - 15665 svc:/network/loopback exits successfully even if it fails
To: illumos-developer

On Tue, Jul 30, 2024 at 11:46 PM Gordon Ross wrote:

> Optional dependency does that in SMF, right?

Well no, that's a rather different case. That is the "I don't care if it's
enabled or not, but if it is I'll have a hard dependency on it".

What we're after here is the "I do care that it's enabled, and must run
after it, but I'm prepared to live with errors".

> On Tue, Jul 30, 2024 at 12:56 PM Jorge Schrauwen via illumos-developer
> <developer@lists.illumos.org> wrote:
>
>> This last reply from Peter made me think of the difference between
>> requires vs after in systemd speak.
>>
>> Although that is probably a lot of work as one would need those features
>> and somehow fix all manifests that express a dependency on loopback.
>>
>> Admittedly I sometimes miss a softer dependency in smf in general.
>>
>> ~ sjorge
>>
>> On 26 Jul 2024, at 17:16, Peter Tribble wrote:
>>
>> On Fri, Jul 26, 2024 at 2:50 PM Andy Fiddaman wrote:
>>
>>> On Fri, 26 Jul 2024, Peter Tribble wrote:
>>>
>>> > On Fri, Jul 26, 2024 at 9:21 AM Andy Fiddaman wrote:
>>> >
>>> > > Please can you review the following change?
>>> > >
>>> > > 15665 svc:/network/loopback exits successfully even if it fails
>>> > > https://www.illumos.org/issues/15665
>>> > > https://code.illumos.org/c/illumos-gate/+/3610
>>> >
>>> > When this first came up I expressed my belief that making this change
>>> > is the wrong thing to do, and I'll express it again.
>>>
>>> Apologies Peter. I had recalled that your objection to the original
>>> change was mostly around the addition of the extra dependency to the
>>> service, which I've removed in this new patch set (that is
>>> https://www.illumos.org/issues/15664 which remains open).
>>>
>>> > If this service fails, I think the best thing to do is drive on so
>>> > that the system can come up as far as possible to maximise the chance
>>> > that the system comes up far enough for an administrator to be able
>>> > to get in and fix it. Not putting the service into maintenance is a
>>> > feature, not a bug.
>>>
>>> The impetus for this change is that over the past couple of years we've
>>> had a number of occasions where we've had to debug networking problems
>>> that have had their root in the fact that the loopback interfaces were
>>> not created for one reason or another. It happened again yesterday in a
>>> non-global zone. In all of these, it would have been really useful and
>>> expedited diagnosis if the service had gone into maintenance. I
>>> understand the perspective of allowing the system to come up as far as
>>> possible - to the point of remote access even - but it still seems
>>> wrong for a service to report success where it has not actually
>>> achieved its goal. Is there some middle ground here?
>>>
>>> > I think generally it would be wrong for a single voice to veto any
>>> > change, which means I would generally be uncomfortable sticking a -1
>>> > on it, but if this does get into the gate it will be reverted in
>>> > Tribblix.
>>>
>>> Understood. This definitely warrants further discussion.
>>
>> As I mentioned in my other reply, it seems that what we're after is some
>> way to mark a service as having generated an error without bringing the
>> system down by going into maintenance. Some sort of degraded mode.
>>
>> We have a couple of SMF exit codes that look interesting -
>> SMF_EXIT_MON_DEGRADE and SMF_EXIT_MON_OFFLINE, but I'm sure they were
>> never implemented. There's even an issue in this area -
>> https://www.illumos.org/issues/7711 (which refers back to 8891 which is
>> another case of something dropping into maintenance breaking the entire
>> system).
>>
>> Interestingly, looking at the ssh method script for S11
>>
>> https://github.com/oracle/solaris-userland/blob/master/components/openssh/sources/sshd.sh#L132
>>
>> you see the following:
>>
>>     # Put the service into degraded mode in case some of previous
>>     # configuration tasks failed.
>>     # We do not let the service enter maintenance mode, since
>>     # we want to keep the system as much operating as feasible.
>>     #
>>     if [ $ret1 -ne 0 ]; then
>>         smf_method_exit $SMF_EXIT_DEGRADED "hostkey_configuration" \
>>             "Failed to generate missing host keys."
>>     fi
>>
>> So the equivalent of SMF_EXIT_DEGRADED might be what we're looking for?
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/