supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* ipc in process #1
@ 2019-05-09  7:10 Jeff
From: Jeff @ 2019-05-09  7:10 UTC (permalink / raw)
  To: supervision


IMO process #1 should be solely signal driven.
i guess there is no need for further ipc by other means,
especially not for those that require rw fs access to work
(eg fifos, unix sockets).

process #1 has to react to (some) incoming signals, and thus
signal handling has to be enabled anyway (mainly for SIGCHLD,
but also for signals that the kernel sends to indicate the
occurrence of important events like the three-finger salute et al).

apart from that, process #1 IMO has only the additional duty of
leaving stage 2 and entering stage 3 when requested. this can also
be done by sending process #1 signals that indicate what to do
(reboot, halt, poweroff, maybe suspend). access control is easy here
since only the super-user may signal process #1.
there are also quite a lot of real-time signals that could be used
for the purpose of notifying process #1.

hence there is no need for further ipc i guess.
for those who still need more than what is provided by signals
i would recommend using abstract unix domain sockets (Linux only)
or SysV message queues (the latter works even on OpenBSD) since
those ipc mechanisms can work properly without rw fs access.

SysV msg queues are especially useful in situations that only require
one-way ipc (ie the init process just reacts to commands sent by
clients without sending results, such as successful task completion,
back to them) since they are rather portable and provide cheap,
easy-to-set-up access control. and since process #1 is the first
process to create and use a SysV msg queue, the usual collision
problems with SysV ipc keys do not occur at all: process #1 can just
claim ANY well-known key, say 1, without interfering with other
processes' msg queues, and so all clients know which key to use to
locate the request queue (they can also send a signal like SIGIO to
process #1 to wake it up and have it process its msg queue; process
#1's pid is also well known. ;-).

this can also be used as an emergency protocol when ipc by other means
(such as unix sockets) becomes unavailable.
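
a minimal sketch of that scheme (the key values, the request format
and the function names are illustrative assumptions, not a fixed
protocol):

  /* pid 1 claims a well-known SysV key and drains the queue
     without ever blocking */
  #include <sys/ipc.h>
  #include <sys/msg.h>

  struct req { long mtype; char cmd[64]; };

  int setup_queue(void)
  {
      /* process #1 runs first, so a fixed key cannot collide yet;
         clients locate the queue with msgget((key_t)1, 0) */
      return msgget((key_t)1, IPC_CREAT | 0600);
  }

  void drain_queue(int qid)
  {
      struct req r;
      /* IPC_NOWAIT: fail with ENOMSG instead of sleeping */
      while (msgrcv(qid, &r, sizeof r.cmd, 0, IPC_NOWAIT) >= 0)
          ; /* handle_request(&r); - hypothetical dispatcher */
  }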




* Re: ipc in process #1
@ 2019-05-10 17:57 Laurent Bercot
From: Laurent Bercot @ 2019-05-10 17:57 UTC (permalink / raw)
  To: supervision


>IMO process #1 should be solely signal driven.
>i guess there is no need for further ipc by other means,
>especially not for those that require rw fs access to work
>(eg fifos, unix sockets).

The devil is in the details here. Theoretically, yes, you're right.
In practice, only using signals to control pid 1 is rather limiting,
and the choice to do so or not to do so is essentially decided by
other factors such as the existing interfaces you want to follow.

For instance, if you want to implement the stage 3 procedure in
pid 1, and you want to follow the LSB interface for the "shutdown"
binary - all of which is the case for sysvinit - then signals are
not enough: you need to be able to convey more information to pid 1
than a few signals can.

But yes, limiting what pid 1 does is reducing its attack surface,
which is a good idea in general, and it is totally possible to
design a correct pid 1 that is only controlled by signals.


>apart from that, process #1 IMO has only the additional duty of
>leaving stage 2 and entering stage 3 when requested.

And reaping zombies, and supervising at least one process. :)


>for those who still need more than what is provided by signals
>i would recommend using abstract unix domain sockets (Linux only)
>or SysV message queues (the latter works even on OpenBSD) since
>those ipc mechanisms can work properly without rw fs access.

Have you tried working with SysV message queues before recommending
them? Because my recommendation would be the exact opposite. Don't
ever use SysV message queues if you can avoid it. The API is very
clumsy, and does not mix with event loops, so it constrains your
programming model immensely - you're practically forced to use threads.
And that's a can of worms you certainly don't want to open in pid 1.

Abstract sockets are cool - the only issue is that they're not
portable, which would limit your init system to Linux only. If you're
going for a minimal init, it's a shame not to make it portable.

Really, the argument for an ipc mechanism that does not require a
rw fs is a weak one. Every modern Unix can mount a RAM filesystem
nowadays, and it is basically essential to do so if you want early
logs. Having no logs at all until you can mount a rw fs is a big
no-no, and being unable to log what your init system does *at all*,
unless you're willing to store everything in RAM until a rw fs
comes up, is worse. An early tmpfs technically stores things in RAM
too, but is much, *much* cleaner, and doesn't need ad-hoc code or
weird convolutions to make it work.

Just use an early tmpfs and use whatever IPC you want that uses a
rendez-vous point in that tmpfs to communicate with process 1.
But just say no to message queues.
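
A minimal sketch of what I mean, using the Linux mount(2) call (the
mount point and socket path are illustrative assumptions):

  /* mount an early tmpfs, then create the control socket on it */
  #include <string.h>
  #include <sys/mount.h>
  #include <sys/socket.h>
  #include <sys/un.h>

  int early_rendezvous(void)
  {
      struct sockaddr_un sa;
      int fd;

      /* a tiny RAM fs is enough for a control socket and early logs */
      if (mount("tmpfs", "/run", "tmpfs", MS_NOSUID | MS_NODEV, "size=1M") < 0)
          return -1;
      if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
          return -1;
      memset(&sa, 0, sizeof sa);
      sa.sun_family = AF_UNIX;
      strncpy(sa.sun_path, "/run/init.sock", sizeof sa.sun_path - 1);
      if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0)
          return -1;
      return listen(fd, 5) < 0 ? -1 : fd;
  }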

--
Laurent




* SysV shutdown util
@ 2019-05-11 11:26 Jeff
From: Jeff @ 2019-05-11 11:26 UTC (permalink / raw)
  To: supervision

10.05.2019, 20:03, "Laurent Bercot" <ska-supervision@skarnet.org>:
> then signals are not enough:
> you need to be able to convey more information to pid 1
> than a few signals can.

such as ?
what more information than the runlevel (0 or 6, maybe 1 to go
into single-user mode) does SysV init need to start the system shutdown ?
and shutdown itself just notifies all users via wall, logs the shutdown
time to wtmp and then notifies init via the /dev/initctl fifo.
this could all be done solely with 2 different signals, as sketched below.
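
a minimal sketch of such a mapping (the signal assignments are my own
illustrative assumptions, not SysV init's actual ones):

  #include <signal.h>
  #include <string.h>

  enum { ACT_NONE, ACT_HALT, ACT_POWEROFF, ACT_REBOOT };
  static volatile sig_atomic_t action = ACT_NONE;

  static void on_signal(int sig)
  {
      /* just record the request; the main loop acts on it later */
      switch (sig) {
      case SIGUSR1: action = ACT_HALT;     break;
      case SIGUSR2: action = ACT_POWEROFF; break;
      case SIGTERM: action = ACT_REBOOT;   break;
      }
  }

  void install_handlers(void)
  {
      struct sigaction sa;
      memset(&sa, 0, sizeof sa);
      sa.sa_handler = on_signal;
      sigaction(SIGUSR1, &sa, 0);
      sigaction(SIGUSR2, &sa, 0);
      sigaction(SIGTERM, &sa, 0);
  }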

the Void Linux team simply made a shell script out of it that brings
down the runit services and then runsvdir itself.




* emergency IPC with SysV message queues
@ 2019-05-11 12:38 Jeff
From: Jeff @ 2019-05-11 12:38 UTC (permalink / raw)
  To: supervision

10.05.2019, 20:03, "Laurent Bercot" <ska-supervision@skarnet.org>:
> Have you tried working with SysV message queues before recommending
> them ?

yes, i had the code in an init once, but i never completed that init.
dealing with SysV msg queues was not such a big deal from the code
side, though.

i used them merely as an additional emergency ipc method for when
other ipc methods become impossible. though actually signals were
sufficient in case of emergency.

> Because my recommendation would be the exact opposite. Don't
> ever use SysV message queues if you can avoid it. The API is very
> clumsy, and does not mix with event loops, so it constrains your
> programming model immensely - you're practically forced to use threads.
> And that's a can of worms you certainly don't want to open in pid 1.

that is wrong. just read the msg queue when a signal arrives
(say SIGIO, for example). catch that signal via a selfpipe and there
you are, no need to use threads. we were talking about process #1
anyway, so some msg queue restrictions do not apply here (like
discovering the ipc key). if you need to send replies back to clients,
set up a second queue for the replies (with key 2 - we are process #1
and free to claim ANY well-known key). if those clients wait for their
results and block, you could also signal them after writing the reply
to the second queue (that means clients send their pid along with
their requests; usually the message type field is (ab)used to pass
that information). see the sketch below.
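
a minimal sketch of that loop (SIGIO, the request format and the
hypothetical dispatcher are illustrative assumptions):

  #include <fcntl.h>
  #include <poll.h>
  #include <signal.h>
  #include <sys/msg.h>
  #include <unistd.h>

  static int selfpipe[2];

  static void on_sigio(int sig)
  {
      (void)sig;
      write(selfpipe[1], "x", 1); /* async-signal-safe */
  }

  void event_loop(int qid)
  {
      struct { long mtype; char cmd[64]; } r;
      struct pollfd pfd;
      char c;

      pipe(selfpipe);
      fcntl(selfpipe[1], F_SETFL, O_NONBLOCK); /* handler must not block */
      signal(SIGIO, on_sigio);

      pfd.fd = selfpipe[0];
      pfd.events = POLLIN;
      for (;;) {
          if (poll(&pfd, 1, -1) > 0) {
              read(pfd.fd, &c, 1); /* consume the wakeup token */
              /* drain the queue; IPC_NOWAIT means we never sleep here */
              while (msgrcv(qid, &r, sizeof r.cmd, 0, IPC_NOWAIT) >= 0)
                  ; /* handle_request(&r); - hypothetical */
          }
      }
  }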

> Abstract sockets are cool - the only issue is that they're not
> portable, which would limit your init system to Linux only.

well, we are talking about Linux here since that is where that
obscene systemd monstrosity has been rampaging for quite a while now.

> If you're going for a minimal init, it's a shame not to make it
> portable.

in my case it will be portable and will work solely signal driven.
but i do not see too much need for other unices to change their
init systems, especially not the BSDs.

of course BSD init could be improved upon, but it just works and
it is rather easy to understand how. they did not follow the SysV
runlevel BS, and speeding their inits up will mostly mean speeding
up /etc/rc and friends. BSD init also has respawn capabilities via
/etc/ttys, so a supervisor can be (re)started from there. though i
agree it is quite big for not doing too much ...

i hope the FreeBSD team (i do not think the other BSDs even consider
such a step) will not follow the BS introduced elsewhere (OpenBSD
probably will not). one danger for FreeBSD was some of their users'
demand to port that launchd joke over from macOS, or to come up
with something even worse (more in the direction of systemd).

> Really, the argument for an ipc mechanism that does not require a
> rw fs is a weak one.

not at all.

> Every modern Unix can mount a RAM filesystem nowadays,

that is a poor excuse, you wanted to be portable, right ?

> and it is basically essential to do so if you want early logs.

use the console device for your early logs, that requires console
access though ...

> Having no logs at all until you can mount a rw fs is a big
> no-no, and being unable to log what your init system does *at all*,
> unless you're willing to store everything in RAM until a rw fs
> comes up, is worse. An early tmpfs technically stores things in RAM
> too, but is much, *much* cleaner, and doesn't need ad-hoc code or
> weird convolutions to make it work.

> Just use an early tmpfs and use whatever IPC you want that uses a
> rendez-vous point in that tmpfs to communicate with process 1.
> But just say no to message queues.

that is just your opinion since your solution works that way.
other solutions are of course possible, eg using msg queues
just as a backup ipc method, since we can exploit being process #1
here. in the case of Linux i do not see any reason not to use
abstract unix sockets as the preferred ipc method in process #1
(unless the kernel does not support sockets, which is rare these
days, right ? on the BSDs the kernel supports sockets as well,
since socketpairs were said to be used there to implement pipes).
for the BSDs use normal unix sockets (this is safe to do even on
OpenBSD since we have emergency ipc via signals and SysV msg queues);
on Solaris, STREAMS (and possibly doors) might be used, as we also
have said backup mechanism in reserve.

but to be honest: a simple, reliable init implementation should be
solely signal driven. i was just thinking about more complex
integrated inits that have higher ipc demands (dunno what systemd
does :-).

you can tell a reliable init by the way it does ipc.
many inits do not get that right and rely on ipc mechanisms that
require rw fs access. if mounting the corresponding fs rw fails, they
are pretty hosed since their ipc does not work anymore, and their
authors were too clueless to just react to signals in case of
emergency, having abused the signal numbers to reread their config or
other needless BS.

i will top my claims even more:
you can tell a reliable init by it not using malloc, directly or
indirectly (hello opendir(3) !!! ;-). :PP

but more on that topic in another post since this one has become
quite long.




* Re: SysV shutdown util
@ 2019-05-11 12:56 Laurent Bercot
From: Laurent Bercot @ 2019-05-11 12:56 UTC (permalink / raw)
  To: supervision


>>  you need to be able to convey more information to pid 1
>>  than a few signals can.
>such as ?
>what more information than the runlevel (0 or 6, maybe 1 to go
>into single user) does SysV init need to start the system shutdown ?

The time of the shutdown. The "shutdown" command isn't necessarily
instantaneous, it can register a shutdown for a given time.
  You could deal with that by creating another process, which is
equivalent to running a daemon outside pid 1 (and that's exactly
what s6-linux-init-1.0.0.0 does), but if you want to handle it all
in pid 1, you need time management in it and that can't be done
via signals only.
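
A minimal sketch of what that time management looks like (the names
are illustrative): a pending deadline simply becomes the timeout
argument of the event loop's poll(), which a signals-only design has
no way to express:

  #include <poll.h>
  #include <time.h>

  /* wait for the next control message or the shutdown deadline,
     whichever comes first; deadline == 0 means none is scheduled */
  int wait_for_event(struct pollfd *pfd, time_t deadline)
  {
      int timeout = -1; /* block forever if no shutdown is pending */
      if (deadline) {
          time_t now = time(0);
          timeout = deadline > now ? (int)(deadline - now) * 1000 : 0;
      }
      return poll(pfd, 1, timeout);
  }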


>the Void Linux team just made a shell script out of it that just brings
>down the runit services and runsvdir itself.

  Yeah, that's not difficult. What is difficult is implementing the
legacy features of other inits.

--
  Laurent


* Re: emergency IPC with SysV message queues
@ 2019-05-11 13:34 Laurent Bercot
From: Laurent Bercot @ 2019-05-11 13:34 UTC (permalink / raw)
  To: supervision


  Please stop breaking threads. This makes conversations needlessly
difficult to follow, and clutters up mailboxes. Your mailer certainly
has a "Reply" or a "Reply to group" feature, that does not break
threads; please use it.


>that is wrong. just read the msg queue when a signal arrives
>(say SIGIO for example). catch that signal via selfpiping and there
>you are, no need to use threads.

That is obviously not going to work. Operations such as msgsnd() and
msgrcv() are synchronous and cannot be mixed with asynchronous event
loops. There is no way to be asynchronously notified of the
readability or writability of a message queue, which would be
essential for working with poll() and selfpipes for signals.

If you are suggesting a synchronous architecture around message
queues where the execution flow can be disrupted by signal handlers,
that is theoretically possible, but leads to incredibly ugly code
that is exceptionally difficult to write and maintain, to the point
where I would absolutely choose to use threads over this.
Interruption-driven code was common in the 1990s, was a complete
nightmare, and thankfully essentially disappeared when Linux gained
a half-working threading implementation. I would not
trust myself to write correct interruption-driven code, let alone
make it readable by other programmers. You should not trust yourself
either.


>>  Every modern Unix can mount a RAM filesystem nowadays,
>that is a poor excuse, you wanted to portable, right ?

Yes. Every modern Unix system has a way to create and mount a RAM
filesystem. The APIs themselves are not portable, and that is why
s6 doesn't do it itself, but the concept totally is. If I had more
experience with the BSD APIs, I could easily port s6-linux-init to
the BSDs. I hope someone more BSD-savvy than me will do so.

Using non-portable *functionality*, on the other hand, is an
entirely different game. Abstract sockets only exist on Linux, and
porting code that uses abstract sockets to another platform, without
using non-abstract Unix domain sockets, is much more difficult than
porting code that mounts a tmpfs.


>use the console device for your early logs, that requires console
>access though ...

  I said *logs*, as in, data that can be accessed later and reused,
and maybe stored into a safe place. Scrolling text on a console is
no substitute for logs.


>that is just your opinion since your solution works that way.

That is my opinion backed by 20 years of experience working with
Unix systems and 8 years with init systems, and evidence I've been
patiently laying since you started making various claims in the
mailing-list. You are obviously free to disagree with my opinion,
but I wish your arguments came backed with as much evidence as mine.


>you can tell a reliable init by the way it does ipc.
>many inits do not get that right and rely on ipc mechanisms that require
>rw fs access. if mounting the corresponding fs rw fails they are pretty
>hosed since their ipc does not work anymore and their authors were
>too clueless to just react to signals in case of emergency and abused
>signal numbers to reread their config or other needless BS.
>
>i top my claims even more:
>you can tell a reliable init by not even using malloc directly nor indirectly
>(hello opendir(3) !!! ;-). :PP

  Every piece of software is a compromise between theoretical minimalism
/ elegance and other requirements. Even minimalism and elegance are
sometimes subjective: my position is that using s6-svscan as process 1
is more elegant than having a separate "minimal" process 1 that still
needs to perform some kind of supervision, because writing the separate
process 1 would lead to some functionality duplication and also add
more code. Your opinion is obviously different; instead of taking cheap
shots, show me the code, and then, maybe, we can compare merits.

--
  Laurent




* Re: emergency IPC with SysV message queues
@ 2019-05-16 15:15 Jeff
From: Jeff @ 2019-05-16 15:15 UTC (permalink / raw)
  To: supervision

11.05.2019, 15:33, "Laurent Bercot" <ska-supervision@skarnet.org>:
> Please stop breaking threads. This makes conversations needlessly
> difficult to follow, and clutters up mailboxes.

i do that intentionally since i find the opposite easier to follow.
that often leads to complaints on other lists as well.

> That is obviously not going to work.

obviously ? to me this is not obvious at all.

> Operations such as msgsnd() and msgrcv() are synchronous and
> cannot be mixed with asynchronous event loops.

what does that exactly mean ? do you mean they block ?
that is not the case when the IPC_NOWAIT flag is used.

> There is no way to be asynchronously notified of the
> readability or writability of a message queue, which would be
> essential for working with poll() and selfpipes for signals.

i do not understand at all what you mean here.
the client should signal us (SIGIO for example) to wake us up; then
we look at the input queue without blocking (handling SIGIO via a
selfpipe), otherwise we ignore the msg queue. again: this is just a
backup ipc protocol.
i suggest not using an output queue to reply to client requests, which
keeps things simpler. but if you insist on a reply queue, write the
reply with a non-blocking msgsnd(2) call using said IPC_NOWAIT flag:
if the queue is full, the call simply fails with EAGAIN instead of
blocking, so again nothing will block.

and i have not even mentioned posix message queues, which can be
used from a select/poll based event loop, as sketched below ...
(dunno if OpenBSD has them)
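
a minimal sketch (Linux-specific: mq_overview(7) documents that a
mqd_t is a file descriptor there, so it works with poll(); the queue
name and sizes are illustrative assumptions):

  #include <fcntl.h>
  #include <mqueue.h>
  #include <poll.h>

  void mq_loop(void)
  {
      struct mq_attr attr = { .mq_maxmsg = 8, .mq_msgsize = 128 };
      char buf[128]; /* must be >= mq_msgsize */
      struct pollfd pfd;
      mqd_t q = mq_open("/initctl", O_RDONLY | O_CREAT | O_NONBLOCK,
                        0600, &attr);

      pfd.fd = (int)q; /* on Linux a mqd_t is just an fd */
      pfd.events = POLLIN;
      while (poll(&pfd, 1, -1) > 0)
          while (mq_receive(q, buf, sizeof buf, 0) >= 0)
              ; /* handle the request */
  }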

please also take into consideration the following:

* we run as process #1 which we should exploit (SysV ipc ids) here

* SysV ipc is just used as a backup ipc protocol where unix (abstract
  on Linux) sockets are the preferred default method 
  (and signals of course).

> If you are suggesting a synchronous architecture around message
> queues where the execution flow can be disrupted by signal handlers,
> that is theoretically possible, but leads to incredibly ugly code
> that is exceptionally difficult to write and maintain, to the point
> where I would absolutely choose to use threads over this.

just a claim; it is not too much work. now that you beg for it, i
will consider adding it again, just for you, as i would otherwise
have restricted myself to signal handling only, which is absolutely
sufficient in that case.

> Yes. Every modern Unix system has a way to create and mount a RAM
                     ^^^^^^^^^^ see ? how modern ? not that portable, right ?
                     there was a reason why the initctl fifo was stored in /dev ...

> filesystem. The APIs themselves are not portable, and that is why
> s6 doesn't do it itself, but the concept totally is. If I had more
> experience with the BSD APIs, I could easily port s6-linux-init to
> the BSDs. I hope someone more BSD-savvy than me will do so.

no, this is not portable; that is just your assumption, since s6 will
not work without a rw fs, which is indeed very limiting and far from
correct behaviour. the need for rw access is an artificial requirement,
introduced so that your daemontools-style supervisor tool suite can be
used for process #1.

> Using non-portable *functionality*, on the other hand, is an
> entirely different game. Abstract sockets only exist on Linux, and
> porting code that uses abstract sockets to another platform, without
> using non-abstract Unix domain sockets, is much more difficult than
> porting code that mounts a tmpfs.

we use normal unix sockets on the BSDs; on Solaris one could also use
STREAMS for local ipc (that also needs rw access AFAIK).

that is ok since we provide a portable backup/emergency method that
works even on OpenBSD. on Linux we exploit its non-portable abstract
sockets which makes ipc even less dependent on rw mounts.
that is perfectly ok IMO.

> That is my opinion backed by 20 years of experience working with
> Unix systems and 8 years with init systems, and evidence I've been
> patiently laying since you started making various claims in the mailing-list.

it is you who makes claims on your web pages, so i just wondered how
you back them. quote:

> System V IPCs, i.e. message queues and semaphores.

why use semaphores ? they are primarily meant to ease the use
of SysV shared memory, but epoch init uses shared memory without them.

> The interfaces to those IPCs are quite specific and can't mix with

specific to what ? even older unices support them.
and SysV shared memory is in fact a very fast ipc method.

> select/poll loops, that's why nobody in their right mind uses them. 

there also exist the successor posix ipc mechanisms,
with which one can do exactly that.

> You are obviously free to disagree with my opinion,
> but I wish your arguments came backed with as much evidence as mine.

they are backed very well by the man pages of the syscalls in question.
please also note the difference between an ipc mechanism and a
protocol that makes use of it.

>> (hello opendir(3) !!! ;-). :PP

hello (pu,se)tenv(3) !!




* Re: emergency IPC with SysV message queues
@ 2019-05-16 21:25 Laurent Bercot
From: Laurent Bercot @ 2019-05-16 21:25 UTC (permalink / raw)
  To: supervision

>>  Please stop breaking threads. This makes conversations needlessly
>>  difficult to follow, and clutters up mailboxes.
>i do that intentionally since i find the opposite easier to follow.
>that often leads to complaints on other lists as well.

Oh? And the other complaints haven't given you a clue?
We are a friendly community, and that includes choosing to follow
widely adopted threading conventions in order to make your readers
comfortable, instead of breaking them because you happen not to like
them. Please behave accordingly and don't be a jerk.


>>  Operations such as msgsnd() and msgrcv() are synchronous and
>>  cannot be mixed with asynchronous event loops.
>what does that exactly mean ? do you mean they block ?
>that is not the case when the IPC_NOWAIT flag is used.

No, it means you cannot be notified of readability or writability,
they do not work with poll(). Sure, you have a flag that makes
them non-blocking, but to use it, you need to either loop around
it, which is polling (i.e. terrible), or to use another channel
for notifications. Which is what you are suggesting below:


>the client should signal us (SIGIO for example) to wake us up; then
>we look at the input queue without blocking (handling SIGIO via a
>selfpipe), otherwise we ignore the msg queue.

Okay, so your IPC mechanism isn't just message queues, it's a mix
of two different channels: message queues *plus* signals. Signals
for notification, message queues for data transmission. Yes, it can
work, but it's more complex than it has to be, using two Unix
facilities instead of one. You basically need a small library for
the client side. Meh.
A fifo or a socket works as both a notification mechanism and a
data transmission mechanism, and it's as simple as it gets.
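
A minimal sketch of how little it takes (the fifo path is an
illustrative assumption):

  #include <fcntl.h>
  #include <poll.h>
  #include <sys/stat.h>
  #include <unistd.h>

  void fifo_loop(void)
  {
      char buf[256];
      struct pollfd pfd;

      mkfifo("/run/initctl", 0600);
      /* O_RDWR keeps the fifo open while writers come and go */
      pfd.fd = open("/run/initctl", O_RDWR | O_NONBLOCK);
      pfd.events = POLLIN;
      while (poll(&pfd, 1, -1) > 0)
          while (read(pfd.fd, buf, sizeof buf) > 0)
              ; /* parse the commands */
  }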


>and i have not even mentioned posix message queues which can be
>used from a select/poll based event loop ...

Yes, they can... but on Linux, they are implemented via a virtual
filesystem, mqueue. And your goal, in using message queues, was to
avoid having to mount a read-write filesystem to perform IPC with
process 1 - so that eliminates them from contention, since mounting
a mqueue is just as heavy a requirement as mounting a tmpfs.

Also, it is not clear from the documentation, and I haven't
performed tests, but it's even possible that the Linux implementation
of SysV message queues requires a mqueue mount just like POSIX ones,
in which case this whole discussion would be moot anyway.


>* SysV ipc is just used as a backup ipc protocol where unix (abstract
>   on Linux) sockets are the preferred default method
>   (and signals of course).

You've lost me there. Why do you want several methods of IPCs in
your init system? Why don't you pick one and stick to it? Sockets
are available on every Unix system. So are FIFOs. If you're going
to use sockets by default, then use sockets, and you'll never need
to fall back on SysV IPC, because sockets work.


>just a claim, it is not too much work, now that you beg for it i consider
>adding it again, just for you as i would else have restricted myself to
>signal handling only which is absolutely sufficient in that case.

I have no idea what you are talking about.


>                      ^^^^^^^^^^ see ? how modern ? not that portable, right ?
>                      there was a reason why the initctl fifo was stored in /dev ...

Uh, yes, I'm writing an init system for 2019, not for 1992.
And *even* in 1992, there was a writable filesystem: /dev.
Now I'm not saying that creating fifos in /dev is good design, but
I am saying that if you need a writable place to create a fifo,
you always have one. Especially nowadays with /dev being a tmpfs,
so even if you're reluctant to mount an additional tmpfs at boot
time, you can always do stuff in /dev!

Needing a writable filesystem to create a fifo or a socket has
never been a serious limitation, and nowadays it is even less of a
limitation than before. The "must not use the filesystem at all"
constraint is artificial and puts a much greater burden on your
design than needing a rw fs does.


>no this is not portable but your assumption since s6 will not work
>without rw which is indeed very limiting and far from correct behaviour.

It's really not limiting, and the *only* correct behaviour.
The need to have a rw fs does not even come from the daemontools-like
architecture with a writable scandir. It comes from the need to store
init's logs.

Storing logs from an init system is not easy to do. Some systems,
including sysvinit, choose to not even attempt it, and keep writing
to either /dev/console (which is very transient and not accessible
remotely) or /dev/null. Some systems do unholy things with
temporary logging daemons. Daemontools (which needs a place to
write logs even though it doesn't naturally run as pid 1, and
doesn't assume that svscan's stderr points to something sensible)
does something that is better than /dev/null or /dev/console, but not
by much: it uses readproctitle, which reserves space in its friggin'
_argv_ to write rotated logs.

And some, including systemd of all things, do the right thing, and
actually realize that init's logs are *logs* and should be treated
as *logs*, i.e. stored and processed by a *logger program* in the
filesystem. And if there's no suitable filesystem at that time, well
they create one. Making a tmpfs is *easy*. And it's the only way
to get a unified log architecture.


>we use normal unix sockets on BSD

Oh, so it's not a problem to need a writable filesystem on BSD then?
So, why all the contortions to avoid it on other systems? If you're
fine with Unix domain sockets, then you're fine with Unix domain
sockets and that's it. And there's nothing wrong with that. And
you don't need a "portable backup/emergency method" - that just
bloats your init system for zero benefit. Z-e-r-o. Zero.


>why use semaphores ? they are primarily meant to ease the use
>of SysV shared memory, but epoch init uses shared memory without them.

It is true that I didn't back my claim on this page that SysV IPCs
have terrible interfaces. At the time of writing, I had tried to
use them for an unrelated project and found them unusable; and in
10 years of auditing C code with heavy focus on IPC, I had been
submitted *no* code using them, so I thought my opinion was pretty
common. That was 7ish years ago. Since then, I have had to work with
*one* project using SysV message queues, and my initial impressions
were confirmed. I managed to make it work, but it was really
convoluted, and a lot more complex than it needed to be; it's
*definitely* not an IPC I would choose for a notification mechanism
for a supervision suite.

I don't know, something about only being usable for data transmission
and needing another IPC mechanism to the side for notification makes
me think it wouldn't be a good mechanism to use for notification.
I just have weird opinions like that.

--
Laurent




* Re: emergency IPC with SysV message queues
@ 2019-05-16 21:54 Oliver Schad
From: Oliver Schad @ 2019-05-16 21:54 UTC (permalink / raw)
  Cc: supervision


Sorry for my side note, but I have to say that these discussions are
really great!

I thought it would just be boring when it started. But from my side I
can tell you that I am learning a lot, because you talk about the
reasons to do something or to refrain from it.

It's far better than "I believe" and "I like". Thank you very much for
sharing your reasons.

Best Regards
Oli

-- 
Automatic-Server AG •••••
Oliver Schad
Geschäftsführer
Turnerstrasse 2
9000 St. Gallen | Schweiz

www.automatic-server.com | oliver.schad@automatic-server.com
Tel: +41 71 511 31 11 | Mobile: +41 76 330 03 47



* Re: emergency IPC with SysV message queues
@ 2019-05-19 17:54 Jeff
From: Jeff @ 2019-05-19 17:54 UTC (permalink / raw)
  To: super


On Thu, May 16, 2019 at 09:25:09PM +0000, Laurent Bercot wrote:
> Oh? And the other complaints haven't given you a clue?
> We are a friendly community, and that includes choosing to follow
> widely adopted threading conventions in order to make your readers
> comfortable, instead of breaking them because you happen not to like
> them. Please behave accordingly and don't be a jerk.

breaking long threads that went in a direction that has nothing to do
with the original thread topic is in no way unfriendly or offensive,
nor does that make me a "jerk".

> Okay, so your IPC mechanism isn't just message queues, it's a mix
> of two different channels: message queues *plus* signals.

well, no. the mechanism is SysV msg queues, and the protocol that
clients use to communicate includes - among other things - notifying
the daemon (its PID is well known) by sending a signal to wake it up
and have it process the request input queue.
you do not use just fifos (the mechanism) either; there is also a
protocol involved that clients and server use.

> Signals for notification, message queues for data transmission. Yes,
> it can work, but it's more complex than it has to be, using two Unix
> facilities instead of one.

indeed, this is more complex than - say - just sockets. on the other
hand it does not involve any locking to protect the resource against
concurrent access, as a fifo would.

and again: it is just an emergency backup solution; the preferred way
is (Linux: abstract) unix sockets, of course. such complicated ipc is
not even necessary in my case, but for more complex and integrated
inits it is. that was why i suggested it: to make their ipc
independent of rw fs access.

and of course one can tell a reliable init by the way it does ipc.

> You basically need a small library for the client side. Meh.

right, the client has to know the protocol.
first try via the socket, then try to reach init via the msg queue.
for little things like shutdown requests, signaling suffices.

> A fifo or a socket works as both a notification mechanism and a
> data transmission mechanism,

true, but the protocol used by requests has to be designed as well.
and in the case of fifos: they have to be guarded against concurrent
writes by clients via locking (which requires rw fs access).

> and it's as simple as it gets.

the code used for the msg queueing is not complicated either.

> Yes, they can... but on Linux, they are implemented via a virtual
> filesystem, mqueue. And your goal, in using message queues, was to
> avoid having to mount a read-write filesystem to perform IPC with
> process 1 - so that eliminates them from contention, since mounting
> a mqueue is just as heavy a requirement as mounting a tmpfs.

indeed, they usually live in /dev/mqueue while posix shared memory
lives in /dev/shm.

that was the reason i did not mention them in the first place
(i dunno if OpenBSD has them, as they usually lag behind the other
unices when it comes to posix conformance).

i only mentioned them to point out that you can be notified about
events involving the posix successors of the SysV ipc mechanisms.
i never used them in any way since they require a tmpfs.

> Also, it is not clear from the documentation, and I haven't
> performed tests, but it's even possible that the Linux implementation
> of SysV message queues requires a mqueue mount just like POSIX ones,
> in which case this whole discussion would be moot anyway.

which is in fact not the case; try it with "ipcmk -Q". the same goes
for the other SysV ipc mechanisms like shared memory and semaphores.
you can see that easily when running firefox: it uses shared memory
without semaphores, akin to "epoch" (btw: if anyone uses "epoch" init
it would be interesting to see what ipcs(1) outputs).
this is in fact a very fast ipc mechanism (the fastest ??), though
a clever protocol must be used to avoid deadlocks, concurrent accesses
and such. msg queues have the advantage that messages are already
separated and sorted in order of arrival.

> You've lost me there. Why do you want several methods of IPCs in
> your init system? Why don't you pick one and stick to it?

because SysV msg queues are a quite portable ipc mechanism that does
not need any rw access, they make for a reliable backup/emergency
ipc method.

> Sockets are available on every Unix system.

these days (IoT comes to mind). but i guess SysV init (Linux) does
not use them since there might have been kernels in use without
socket support (?? dunno, just a guess).
on the BSDs this should be true since it was said that they implement
piping via socketpairs.

> So are FIFOs.

i do not like to use them at all, especially since they need rw
(is that true everywhere ??).

> If you're going to use sockets by default, then use sockets,
> and you'll never need to fall back on SysV IPC, because sockets work.

true for abstract sockets (where available), dunno what access rights
are needed to use unix sockets residing on a fs.

> Uh, yes, I'm writing an init system for 2019, not for 1992.
> And *even* in 1992, there was a writable filesystem: /dev.
> Now I'm not saying that creating fifos in /dev is good design, but
> I am saying that if you need a writable place to create a fifo,
> you always have one. Especially nowadays with /dev being a tmpfs,
> so even if you're reluctant to mount an additional tmpfs at boot
> time, you can always do stuff in /dev!

is /dev always writable in every case ?
what about platforms that have a static /dev residing on disk
(maybe in the root fs or as a separate partition) ?
chances are it is writable though, which is why people place unrelated
things into it.

> Needing a writable filesystem to create a fifo or a socket has
> never been a serious limitation,

really NEVER ??

> and nowadays it is even less of a
> limitation than before. The "must not use the filesystem at all"
> constraint is artificial and puts a much greater burden on your
> design than needing a rw fs does.

we were discussing correct/safe/reliable behaviour in preferably all
possible situations that might arise.

> It's really not limiting, and the *only* correct behaviour.

huh ? any proof here ?

> The need to have a rw fs does not even come from the daemontools-like
> architecture with a writable scandir. It comes from the need to store
> init's logs.

when the console device is not enough ...
in fact "working" with read-only access might not be very pleasant,
but it might be all that is available in some scenarios.

> Storing logs from an init system is not easy to do. Some systems,
> including sysvinit, choose to not even attempt it, and keep writing
> to either /dev/console (which is very transient and not accessible
> remotely) or /dev/null. Some systems do unholy things with
> temporary logging daemons.

bootlogd ? to be honest:
i agree with SysV init here; since when was logging process #1's duty ?
it is nice to have, but IMO not a requirement per se.
maybe you could enlighten me a bit in case you disagree,
as you already did in the case of respawning subprocesses:
placing such functionality into process #1 does indeed look safer, and
it exploits process #1 being protected against signaling by the kernel.

> Making a tmpfs is *easy*.

nowadays.

> Oh, so it's not a problem to need a writable filesystem on BSD then?
> So, why all the contortions to avoid it on other systems? If you're
> fine with Unix domain sockets, then you're fine with Unix domain
> sockets and that's it. And there's nothing wrong with that. And
> you don't need a "portable backup/emergency method" - that just
> bloats your init system for zero benefit. Z-e-r-o. Zero.

> It is true that I didn't back my claim on this page that SysV IPCs
> have terrible interfaces. At the time of writing, I had tried to
> use them for an unrelated project and found them unusable;

i have not advocated their general use.

> Since then, I have had to work with
> *one* project using SysV message queues, and my initial impressions
> were confirmed. I managed to make it work, but it was really
> convoluted, and a lot more complex than it needed to be; it's
> *definitely* not an IPC I would choose for a notification mechanism
> for a supervision suite.

that is true, but i recommended them just for process #1, and even
there only as an emergency backup.

i personally would not use fifos for anything either.
IMO their usage is convoluted and complex as well, except that they
offer the advantage of notification.

> I don't know, something about only being usable for data transmission
> and needing another IPC mechanism to the side for notification makes
> me think it wouldn't be a good mechanism to use for notification.

unix sockets (on Linux: abstract ones, in case one does not want them
to reside on a fs) are the solution i prefer for ipc between unrelated
processes. but in situations where one wants to bypass them with a
faster mechanism, i would suggest SysV shared memory (in fact its
usage is not so uncommon). of course, protecting the memory area
against concurrent accesses has to be ensured by a clever protocol ... :-/

(btw: firefox and epoch init seem not to use SysV semaphores for that
purpose)
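
a bare-bones sketch (key and size are illustrative assumptions; the
synchronization protocol on top - flag words or semaphores - is
exactly the part left out here):

  #include <sys/ipc.h>
  #include <sys/shm.h>

  char *attach_shared(void)
  {
      /* same reasoning as with msg queue keys: process #1 creates
         the segment first, so a well-known key cannot collide */
      int id = shmget((key_t)2, 4096, IPC_CREAT | 0600);
      char *p;

      if (id < 0)
          return 0;
      p = shmat(id, 0, 0);
      return p == (char *)-1 ? 0 : p;
  }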




* Re: emergency IPC with SysV message queues
@ 2019-05-19 18:38 Bob
From: Bob @ 2019-05-19 18:38 UTC (permalink / raw)
  To: super


BTW:

i assume runit-init was introduced when its author wanted to get rid
of SysV init without changing runsvdir, while s6-svscan's signal
handling functionality was motivated by the same desire and by previous
hacks that involved running svscan directly as process #1.

this explains the authors' design decisions and the different paths
both projects took.




* Fwd: emergency IPC with SysV message queues
@ 2019-05-19 19:59 Cameron Nemo
From: Cameron Nemo @ 2019-05-19 19:59 UTC (permalink / raw)
  To: supervision

On Sun, May 19, 2019 at 10:54 AM Jeff <sysinit@yandex.com> wrote:
> [...]
> On Thu, May 16, 2019 at 09:25:09PM +0000, Laurent Bercot wrote:
> > [...]
> > Okay, so your IPC mechanism isn't just message queues, it's a mix
> > of two different channels: message queues *plus* signals.
>
> well, no. the mechanism is SysV msg queues, and the protocol that
> clients use to communicate includes - among other things - notifying
> the daemon (its PID is well known) by sending a signal to wake it up
> and have it process the request input queue.
> you do not use just fifos (the mechanism) either; there is also a
> protocol involved that clients and server use.

What details need to be conveyed other than "stand up", "sit down",
and "roll over" (boot, sigpwr, sigint)?

> > Signals for notification, message queues for data transmission. Yes,
> > it can work, but it's more complex than it has to be, using two Unix
> > facilities instead of one.
>
> indeed, this is more complex than - say - just sockets. on the other
> hand it does not involve any locking to protect the resource against
> concurrent access, as a fifo would.
>
> and again: it is just an emergency backup solution; the preferred way
> is (Linux: abstract) unix sockets, of course. such complicated ipc is
> not even necessary in my case, but for more complex and integrated
> inits it is. that was why i suggested it: to make their ipc
> independent of rw fs access.

Abstract namespace sockets have three shortcomings:

* not portable beyond Linux
* no filesystem permissions, so access restrictions must be
  enforced by the server itself
* shared across different mount namespaces;
  one needs a new network namespace for different instances

I am considering dropping them in favor of a socket in /run in my supervisor.

--
Cameron




* Re: emergency IPC with SysV message queues
@ 2019-05-19 20:26 Jeff
From: Jeff @ 2019-05-19 20:26 UTC (permalink / raw)
  To: supervision


> What details need to be conveyed other than "stand up", "sit down",
> and "roll over" (boot, sigpwr, sigint)?

depends on what you plan to do. for a minimal init, handling SIGCHLD
(that is an interesting point indeed. is it really necessary ?
i still have to find out. it would be nice if one could run without it
though.) and, on Linux, SIGINT and SIGWINCH should be sufficient.
for the latter 2 it would be enough to run an external
executable to report their arrival and let the admin decide what
to do about them. maybe SIGPWR is of relevance too.
that suffices for init itself.

a supervisor needs more information, such as:
start this service, stop that one, disable another, restart one,
signal another one and so on, depending on what capabilities the
supervisor provides.

and this has to be encoded in a protocol that uses 2 ipc
mechanisms: SysV ipc plus a specific signal (SIGIO comes to mind)
to notify the daemon (and maybe a third: (abstract) sockets).

> Abstract namespace sockets have two shortcomings:
>
> * not portable beyond linux

true, but i would use them where available and standard unix sockets
elsewhere.

> * need to use access restrictions

don't you use credentials anyway ?
AFAIK all the BSDs and Solaris have them too.

> * shared across different mount namespaces;
>   one needs a new network namespace for different instances

so you need to care about network namespaces too. this can be an
advantage, though, since they are decoupled from mount namespaces.

i did not consider namespaces at all since i follow the systemd
development approach: works on my laptop, hence works everywhere. :-(

> I am considering dropping it for a socket in /run in my supervisor.

why not ? i would use standard unix sockets for everything with
PID > 1 too, but in the process #1 situation i guess abstract
sockets provide an advantage.



