From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, HTML_MESSAGE,MAILING_LIST_MULTI,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 9574 invoked from network); 15 Dec 2023 18:08:26 -0000 Received: from minnie.tuhs.org (50.116.15.146) by inbox.vuxu.org with ESMTPUTF8; 15 Dec 2023 18:08:26 -0000 Received: from minnie.tuhs.org (localhost [IPv6:::1]) by minnie.tuhs.org (Postfix) with ESMTP id D2ECE43E9D; Sat, 16 Dec 2023 04:08:24 +1000 (AEST) Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com [IPv6:2a00:1450:4864:20::52a]) by minnie.tuhs.org (Postfix) with ESMTPS id CF62343E9C for ; Sat, 16 Dec 2023 04:08:19 +1000 (AEST) Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-54bf9a54fe3so1215331a12.3 for ; Fri, 15 Dec 2023 10:08:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20230601.gappssmtp.com; s=20230601; t=1702663698; x=1703268498; darn=tuhs.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=2jyRHo6taAQN6ZsTneOlKLWSnTV1zXAgYTBGSUkBHVs=; b=nTvpgDRXBz/I9A3sUSFoBlq5nL5pLVFW5ESoVfKcL3nZsKgkYy8KLMVt27J+CLJyIy ASYfPIp7zsIzXDeIHVkPzuwGzMALML8Veh39DPE1tfEjD92Wq5bTQUsJbGKDzDEki5sU n+krKyrzVrEbVjzqPqOaMXswuKgYk4Ef67T4Xt+E7OQ44RwzKb9g198WS0OGBm4sSxTo DkwlVO65WYDixgjf3MCl+GAUBb0aPP/bhi5ZMwpvqiAZhnrLmaq472GEHOBr8CM3tf+p OvwsPl50BaZgpUl5swQmxd7NEqm6ROXLhurnhGHgJo7KevcWc6N1STiXUAYrbfbFf1Tx 6NFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702663698; x=1703268498; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=2jyRHo6taAQN6ZsTneOlKLWSnTV1zXAgYTBGSUkBHVs=; b=NiwwTN18QNd+tWi33UxdqQJKP0P2EPoXmB4+PRpZ3y8SbM6VlH9c7saaB2wunZtNcp 
TxXhCjrFcWFbxlZRZrJpV+OxlmFATibXNz8dD0MXZ8y40CJFMLaCBGYtcj0vQWRIRIfF jQ2mnTXJ2ilcK4FB3E2CD4xNSCDiKy+ZUgU15Fa0XRGBTr684xCnA5WsFnd72veCkUSG 4PJWZLxyVE4L8JeSJvI9pntHcXPhHarAXGnC3t9N71IM/+uSk5GpXiuF40phORuJIJnj GBC3Jb86NpSKQeU0CquXVXgxI7334lu1HOTiqISojLmfShaVRoVGTEkSJaBJa4PvgWq7 AZ4A== X-Gm-Message-State: AOJu0YwkqMjpWkK9HTIg425t8elC65drCJWCYuhaeA4NWTp6TExnCbIq p2qne6RfgTRajbi/EOYBCKaZEOjXwT5/JoHY0/06+SJrAxecqJ+MYko= X-Google-Smtp-Source: AGHT+IFaJC0v3xEQZBAhy1DnYRxKzYfz9jDlZs+RagJ/wkfPC+BjVWU1vpX+Eyi1164J+hU6Zoh3lAjAAwx7jExKm5g= X-Received: by 2002:a17:907:a810:b0:a1c:4c3e:99e5 with SMTP id vo16-20020a170907a81000b00a1c4c3e99e5mr7069837ejc.55.1702663697715; Fri, 15 Dec 2023 10:08:17 -0800 (PST) MIME-Version: 1.0 References: <20231214232935.802BB18C08F@mercury.lcs.mit.edu> <4416CB1B-CDE2-42DA-92F2-33284DB6093F@iitbombay.org> In-Reply-To: From: Warner Losh Date: Fri, 15 Dec 2023 11:08:06 -0700 Message-ID: To: Paul Winalski Content-Type: multipart/alternative; boundary="000000000000dfe3c7060c9048a3" Message-ID-Hash: KHMOWHAEKYHTMHNEQSU7PBLG6LJJQRXV X-Message-ID-Hash: KHMOWHAEKYHTMHNEQSU7PBLG6LJJQRXV X-MailFrom: wlosh@bsdimp.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: coff@tuhs.org X-Mailman-Version: 3.3.6b1 Precedence: list Subject: [COFF] Re: Terminology query - 'system process'? List-Id: Computer Old Farts Forum Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --000000000000dfe3c7060c9048a3 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Dec 15, 2023 at 10:51=E2=80=AFAM Paul Winalski wrote: > For me, the term "system process" means either: > > o A conventional, but perhaps privileged user-mode process that > performs a system function. 
An example would be the output side of a
> spooling system, or an operator communications process.
>
> o A process, or at least an address space + execution thread, that
> runs in privileged mode on the hardware and whose address space is in
> the resident kernel.
>
> Do Unix system processes participate in time-sliced scheduling the way
> that user processes do?
>

Yes. At least on FreeBSD they do. They are just processes that get
scheduled. They may have different priorities, etc, but all that factors
in, and those priorities allow them to compete and/or preempt already
running processes depending on a number of things. The only thing
special about kernel-only threads/processes is that they are optimized
knowing they never have a userland associated with them...


> On 12/14/23, Bakul Shah wrote:
> >
> > Exactly! If blocking was not required, you can do the work in an
> > interrupt handler. If blocking is required, you can't just use the
> > stack of a random process (while in supervisor mode) unless you
> > are doing some work specifically on its behalf.
> >
> >> Interestingly, other early systems don't seem to have thought of this
> >> structuring technique.
> >
> > I suspect IBM operating systems probably did use them. At least TSO
> > must have. Once you start *accounting* (and charging) for cpu time,
> > this idea must fall out naturally. You don't want to charge a process
> > for kernel time used for unrelated work!
>
> The usual programming convention for IBM S/360/370 operating systems
> (OS/360, OS/VS, TOS and DOS/360, DOS/VS) did not involve use of a
> stack at all, unless one was writing a routine involving recursive
> calls, and that was rare. Addressing for both program and data was
> done using a base register + offset. PL/I is the only IBM HLL I know
> that explicitly supported recursion. I don't know how they
> implemented automatic variables assigned to memory in recursive
> routines. It might have been a linked list rather than a stack.
>
> I remember when I first went from the IBM world and started
> programming VAX/VMS, I thought it was really weird to burn an entire
> register just for a process stack.
>
> > There was a race condition in V7 swapping code. Once a colleague and I
> > spent two weeks of 16 hour debugging days!
>
> I had a race condition in some multithreaded code I wrote. I couldn't
> find the bug. I even resorted to getting machine code listings of
> the whole program and marking the critical and non-critical sections
> with green and red markers. I eventually threw all of the code out
> and rewrote it from scratch. The second version didn't have the race
> condition.
>

The award for my 'longest bug chased' is at around 3-4 years. We had
a product, based on an arm9 CPU (so armv4), that would sometimes
hang. Well, individual threads in it would hang waiting for a lock, and so
weird aspects of the program stopped working in unusual ways. But the
root cause was a stuck lock, or a missed wakeup. It took months to recreate
this problem. I tried all manner of debugging, from trying to make it recur
faster (no luck) to auditing all locks/unlocks/wakeups to make sure there
were no leaks or subtle mismatches (there weren't, despite a 100MB log file).
It went on and on. I rewrote all the locking / sleeping / etc code, but also
no dice. Then one day, by chance, I was talking to someone who asked me
about atomic operations. I blew them off at first, but then realized the atomic
ops weren't implemented in hardware, but in software with the support of
the kernel (there were no CPU level atomic ops). Within an hour of realizing
this and auditing the code path, I had a fix to a race that was trivial to
discover once you looked at the code closely. My friend also found the same
race I had, at about the same time I was finishing up my fix (in which he
found another race; go pair programming). With the corrected fix, the weird
hanging went away, only to be reported once again...
in a unit that hadn't been updated
with the patch!

tl;dr: you never know what the root cause might be in weird, racy
situations.

Warner

--000000000000dfe3c7060c9048a3
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--000000000000dfe3c7060c9048a3--