From: Dan Cross
Date: Fri, 15 Dec 2023 08:43:09 -0500
To: Noel Chiappa
CC: coff@tuhs.org
Subject: [COFF] Re: Terminology query - 'system process'?
List-Id: Computer Old Farts Forum
On Thu, Dec 14, 2023 at 7:07 PM Noel Chiappa wrote:
> > Now I'd probably call them kernel threads as they don't have a separate
> > address space.
>
> Makes sense. One query about stacks, and blocking, there. Do kernel threads,
> in general, have per-thread stacks; so that they can block (and later resume
> exactly where they were when they blocked)?
>
> That was the thing that, I think, made kernel processes really attractive as
> a kernel structuring tool; you get code like this (from V6):
>
>     swap(rp->p_addr, a, rp->p_size, B_READ);
>     mfree(swapmap, (rp->p_size+7)/8, rp->p_addr);
>
> The call to swap() blocks until the I/O operation is complete, whereupon that
> call returns, and away one goes. Very clean and simple code.

Assuming we're talking about Unix, yes, each process has two stacks:
one for userspace, one in the kernel. The way I've always thought
about it, every process has two parts: the userspace part, and a
matching thread in the kernel. When Unix is running, it is always
running in the context of _some_ process (modulo early boot, before
any processes have been created, of course). Furthermore, when the
process is running in user mode, its kernel stack is empty. When a
process traps into the kernel, it runs on the kernel stack of the
corresponding kthread.

Processes may enter the kernel in one of two ways: directly, by
invoking a system call, or indirectly, by taking an interrupt. In the
latter case, the kernel simply runs the interrupt handler within the
context of whatever process happened to be running when the interrupt
occurred. In both cases, one usually says that the process is either
"running in userspace" (i.e., normal execution of whatever program is
running in the process) or "running in the kernel" (that is, the
kernel is executing in the context of that process).

Note that this affects behavior around blocking operations.
Traditionally, Unix device drivers had a notion of an "upper half" and
a "lower half." The upper half is the code invoked on behalf of a
process requesting services from the kernel via some system call; the
lower half is the code that runs in response to an interrupt for the
corresponding device. Since it's impossible, in general, to know what
process is running when an interrupt fires, it was important not to
perform operations in an interrupt handler that would cause the
current process to be unscheduled; hence the old adage, "don't sleep
in the bottom half of a device driver" (where "sleep" means sleep as
in sleep-and-wakeup, a la a condition variable, not "sleep for some
amount of time"): you would block some random process, which may
never be woken up again!
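
To make the upper-half/lower-half distinction concrete, here is a
minimal, hypothetical sketch in V6-flavoured C. The device "xx", its
buffer, and its start routine are invented for illustration (though
sleep(), wakeup(), spl5()/spl0(), B_DONE, and PRIBIO are the real V6
primitives); this is not code from any actual driver.

    struct buf xxbuf;       /* invented per-device buffer */

    xxread()        /* upper half: runs in the context of the calling process */
    {
            spl5();                         /* raise priority: hold off the device's interrupt */
            xxbuf.b_flags &= ~B_DONE;
            xxstart(&xxbuf);                /* invented: start the I/O on the device */
            while ((xxbuf.b_flags & B_DONE) == 0)
                    sleep(&xxbuf, PRIBIO);  /* block; this process's kernel stack is preserved */
            spl0();
            /* ... copy the data out to the user ... */
    }

    xxintr()        /* lower half: runs in whatever process context is current */
    {
            xxbuf.b_flags |= B_DONE;
            wakeup(&xxbuf);                 /* never sleep() here: the current process is arbitrary */
    }

The upper half may sleep because it is running on the kernel stack of
the process that asked for the I/O, and it will simply be rescheduled
when the transfer completes; the interrupt routine only marks the
buffer done and calls wakeup(), since sleeping there would suspend
whichever unrelated process happened to be running.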
An interesting aside here is signals. We think of them as an
asynchronous mechanism for interrupting a process, but their delivery
must be coordinated by the kernel; in particular, if I send a signal
to a process that is running in userspace, it (typically) won't be
delivered right away; rather, it will be delivered the next time the
process is scheduled to run, since the process must enter the kernel
before delivery can be effected. Signal delivery is a synthetic event,
unlike the delivery of a hardware interrupt, and the upcall happens in
userspace.

> Use of a kernel process probably makes the BSD pageout daemon code fairly
> straightforward, too (well, as straightforward as anything done by Berzerkly
> was :-).
>
> Interestingly, other early systems don't seem to have thought of this
> structuring technique. I assumed that Multics used a similar technique to
> write 'dirty' pages out, to maintain a free list. However, when I looked in
> the Multics Storage System Program Logic Manual:
>
> http://www.bitsavers.org/pdf/honeywell/large_systems/multics/AN61A_storageSysPLM_Sep78.pdf
>
> Multics just writes dirty pages as part of the page fault code: "This
> starting of writes is performed by the subroutine claim_mod_core in
> page_fault. This subroutine is invoked at the end of every page fault." (pg.
> 8-36, pg. 166 of the PDF.) (Which also increases the real-time delay to
> complete dealing with a page fault.)

Note that this says, "starting of writes." Presumably the writes
themselves were asynchronous; this just initiates the operations. It
certainly adds latency to the page fault handler, but not as much as
waiting for the operations to complete!

> It makes sense to have a kernel process do this; having the page fault code
> do it just makes that code more complicated. (The code in V6 to swap
> processes in and out is beautifully simple.) But it's apparently only obvious
> in retrospect (like many brilliant ideas :-).

I can kinda sorta see a method in the madness of the Multics approach.
If you think that page faults are relatively rare, and initiating I/O
is relatively cheap but still more expensive than executing "normal"
instructions, then it makes some sense that you might want to amortize
the cost of the fault by piggybacking the writes on it. Of course,
that's just speculation, and I don't really have a sense for how well
it worked out in Multics (which I have played around with and read
about, but which still seems largely mysterious to me).

In the Unix model, you've got scheduling latency to deal with to run
the pageout daemon; of course, that all happened as part of a context
switch, and in early Unix there was no demand paging (and so I suppose
page faults were considered fatal).

That said, using threads as an organizational metaphor for structured
concurrency in the kernel is wonderful compared to many of the
alternatives (hand-coded state machines, for example).

        - Dan C.
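
P.S.: To make the "kernel process as structuring tool" point concrete,
here is a minimal sketch, in V6-flavoured C, of the general shape of a
pageout-style kernel daemon. The names (pageout_wanted, pageout_scan,
pageoutdaemon) are invented for illustration; this is not the actual
BSD pageout code.

    int pageout_wanted;     /* invented flag: set (and wakeup()ed) when memory runs short */

    pageoutdaemon()         /* started once at boot as its own kernel process */
    {
            for (;;) {
                    while (pageout_wanted == 0)
                            sleep(&pageout_wanted, PSWP);   /* block on this thread's own kernel stack */
                    pageout_wanted = 0;
                    /*
                     * Scan for victim pages and start writes for the dirty
                     * ones. Because the daemon is a thread with its own
                     * stack, the scan can block in the middle of an I/O and
                     * resume exactly where it left off; the same logic
                     * without a thread has to be broken into explicit states
                     * driven from interrupt handlers.
                     */
                    pageout_scan();     /* invented helper */
            }
    }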