* [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 10:39 UTC
To: supervision

Announcing the latest release of perp, perp-2.03, a persistent
process supervisor:

  http://b0llix.net/perp/

Tarball:

  http://b0llix.net/perp/distfiles/perp-2.03.tar.gz

What's New (As if You Care):

The big news for the "second generation" perp-2.* series:

* scanner/supervisor/controller runs as a single process
* all context switching for multiple supervisor processes is
  eliminated
* ipc for control/status clients now via single domain socket
* perpd(8) creates a mere two file system objects at startup --
  a lockfile and domain socket -- and otherwise generates no disk
  activity during runtime, perfect for read-only file systems and
  embedded applications!

About (The Usual Outrageous Claims and Assertions):

perp is a service supervisor similar in purpose to the venerable
daemontools package, providing a modern update with many
advantages:

* easy configuration: in place service activation and no symlinks!
* everything administered in /etc/perp
* fully FHS compatible
* service reset capability
* pretty good troff -man documentation
* colorized(!) service lister, readable timestamps...
* no slashpackage, no slashcommand, no slashdoc...

Contact (Hah!):

  perp[At Sign]b0llix[Dot]net
* Re: [announce] perp-2.03: persistent process supervision
From: Laurent Bercot @ 2011-03-14 13:17 UTC
To: supervision

 Hi Wayne,

> Announcing the latest release of perp, perp-2.03, a persistent
> process supervisor:

 Good news! :)
 I just have a question about your design:

> * easy configuration: in place service activation and no
>   symlinks!

 Does that mean that perpd stores all the service states in memory?
To control or check on services, perpctl and other utilities connect
to perpd via the Unix domain socket, right?

 So... the dreaded question... what happens if perpd dies?
Will perpboot restore a sane supervision tree?

--
 Laurent
* Re: [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 14:02 UTC
To: Laurent Bercot; +Cc: supervision

Hi Laurent,

>  I just have a question about your design:
>
> > * easy configuration: in place service activation and no
> >   symlinks!
>
>  Does that mean that perpd stores all the service states in
> memory? To control or check on services, perpctl and other
> utilities connect to perpd via the Unix domain socket, right?

Yes.

>  So... the dreaded question... what happens if perpd dies?
> Will perpboot restore a sane supervision tree?

Yes.

First, perpd(8) won't die :)

If perpd(8) receives SIGTERM, it runs a controlled shutdown of all
services under its supervision, and then terminates itself.

Under normal (default) configurations, whenever perpd(8) terminates
it is restarted by either perpboot(8), or init(8) with a "respawn"
configuration in inittab(5). perpd(8) then restarts all services
marked for activation in /etc/perp.

Wayne
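[Editor's sketch] The SIGTERM behaviour described above (shut down every supervised service, then exit) can be illustrated roughly as follows. This is a minimal Python sketch of the general pattern, not perpd's implementation. The function names, the placeholder "services" (idle Python child processes), and the five-second timeout are all assumptions made for illustration.

```python
import signal
import subprocess
import sys


def start_services(count):
    """Spawn placeholder long-running children standing in for services."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(count)]


def controlled_shutdown(children, timeout=5.0):
    """Ask every child to stop (SIGTERM), wait, and force-kill stragglers."""
    for child in children:
        child.terminate()
    for child in children:
        try:
            child.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Hypothetical escalation policy; the thread does not describe one.
            child.kill()
            child.wait()
    return [child.returncode for child in children]


def install_sigterm_handler(children):
    """On SIGTERM, stop all children first, then terminate the supervisor."""
    def handler(signum, frame):
        controlled_shutdown(children)
        sys.exit(0)
    signal.signal(signal.SIGTERM, handler)
```

A handler like this only helps for catchable signals. SIGKILL, which comes up later in the thread, cannot be caught, so no cleanup code of this kind would run in that case.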
* Re: [announce] perp-2.03: persistent process supervision
From: Robin Bowes @ 2011-03-14 14:23 UTC
To: supervision

On 14/03/11 14:02, Wayne Marshall wrote:

> Under normal (default) configurations, whenever perpd(8)
> terminates it is restarted by either perpboot(8), or init(8) with
> a "respawn" configuration in inittab(5). perpd(8) then
> restarts all services marked for activation in /etc/perp.

So, if I have a service that is normally running, i.e. starts at
boot, but I have taken it down manually for whatever reason, and
perpd dies, then my service will also be re-started?

R.

--
"Feed that ego and you starve the soul" - Colonel J.D. Wilkes
http://www.theshackshakers.com/
* Re: [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 14:34 UTC
To: Robin Bowes; +Cc: supervision

> > Under normal (default) configurations, whenever perpd(8)
> > terminates it is restarted by either perpboot(8), or init(8)
> > with a "respawn" configuration in inittab(5). perpd(8) then
> > restarts all services marked for activation in /etc/perp.
>
> So, if I have a service that is normally running, i.e. starts
> at boot, but I have taken it down manually for whatever
> reason, and perpd dies, then my service will also be
> re-started?

First, perpd(8) will not die (TM).

If you deactivate your service (chmod -t myservice), or delete it
from /etc/perp, or touch flag.down in the service directory, then
it will not be restarted.

If you take your service down with perpctl(8) -- without doing any
of the above -- and in the interim perpd(8) is restarted, then the
service will be restarted.

So it is up to you to decide the intent of taking a service down.

Wayne
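[Editor's sketch] The three conditions listed above can be written as a small check. This is a minimal Python sketch under the assumption that "activated" means the service directory exists with its sticky bit set (as `chmod -t` to deactivate implies) and that a file named `flag.down` in that directory suppresses the automatic start. The function names are hypothetical and are not part of perp.

```python
import os
import stat


def is_activated(service_dir):
    """True if the service directory exists and has its sticky bit set."""
    try:
        st = os.stat(service_dir)
    except FileNotFoundError:
        # Deleted from the base directory: nothing to supervise.
        return False
    return stat.S_ISDIR(st.st_mode) and bool(st.st_mode & stat.S_ISVTX)


def starts_on_perpd_restart(service_dir):
    """True if a restarted supervisor would bring this service up again.

    Assumed rule, taken from the message above: the service must be
    activated and must not contain a flag.down file.
    """
    flag_down = os.path.join(service_dir, "flag.down")
    return is_activated(service_dir) and not os.path.exists(flag_down)
```

Note what the check does not look at: whether an administrator last ran a control command to take the service down. That information lives only in the supervisor's memory, which is why a restart brings such a service back up.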
* Re: [announce] perp-2.03: persistent process supervision
From: Laurent Bercot @ 2011-03-14 16:47 UTC
To: supervision

> First, perpd(8) will not die (TM).

 Of course it will not - not in normal circumstances.
Neither will svscan, or runsvdir, or s6-svscan.
I trust your programming ability in that matter as much as mine -
this is not a concern at all.

 The concern is that you don't always have the say. There's this
playful thing called the Linux OOM killer. I hear the heuristics
have been fixed in recent kernel releases, but for a long time, the
OOM killer had the amusing habit of shooting processes at random,
and very much failing to locate the process that is actually
responsible for the memory outage. There are still a whole lot of
broken OOM killers out there.

 Of course, this is not a normal condition, and under careful
administration it never happens. But the point is, when you are
designing a supervision tool, you should assume that you can get a
random SIGKILL (Headshot. Do not pass Go. Do not call your cleanup
routines.) at any time. Because if a supervision tool can't recover
from an OOM event and keep vital services running until the sysadmin
finishes his coffee and can manually repair things, then what is it
good for?

 That is why I asked my question. In other supervision schemes,
tasks are de-centralized, so if one process randomly dies, it
generally does not have much impact on the rest of the system.
(If runsvdir dies, it's annoying, but things keep working until the
admin can come clean things up.) perpd, however, looks like a neural
hub, centralizing a lot of info into its memory. IOW, a SPOF, and
you can be sure that the next broken system tool will love to play
Doom with it.

 Is your supervision chain SIGKILL-resistant?

--
 Laurent
* Re: [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 17:39 UTC
To: Laurent Bercot; +Cc: supervision

On Mon, 14 Mar 2011 17:47:41 +0100
Laurent Bercot <ska-supervision@skarnet.org> wrote:

> > First, perpd(8) will not die (TM).
>
>  Of course it will not - not in normal circumstances.
> Neither will svscan, or runsvdir, or s6-svscan.
> I trust your programming ability in that matter as much as
> mine - this is not a concern at all.
>
>  The concern is that you don't always have the say. There's
> this playful thing called the Linux OOM killer. I hear the
> heuristics have been fixed in recent kernel releases, but for
> a long time, the OOM killer had the amusing habit of shooting
> processes at random, and very much failing to locate the
> process that is actually responsible for the memory outage.
> There are still a whole lot of broken OOM killers out there.

It is like worrying, what if init(8) should die?

>  Of course, this is not a normal condition, and under careful
> administration it never happens. But the point is, when you
> are designing a supervision tool, you should assume that you
> can get a random SIGKILL (Headshot. Do not pass Go. Do not
> call your cleanup routines.) at any time.

If a system is delivering random SIGKILL, one should select another
system. There is no peaceful, confident sleeping at night
otherwise, no matter what supervisory framework you choose.

>  That is why I asked my question. In other supervision
> schemes, tasks are de-centralized, so if one process randomly
> dies, it generally does not have much impact on the rest of
> the system. (If runsvdir dies, it's annoying, but things keep
> working until the admin can come clean things up.) perpd,
> however, looks like a neural hub, centralizing a lot of info
> into its memory. IOW, a SPOF, and you can be sure that the
> next broken system tool will love to play Doom with it.

If we talk in terms of daemontools, svscan(8) already keeps a table
of supervise(8) processes, and svscan itself functions as a
supervisor of those multiple supervise(8)s. So it is not much of a
conceptual jump, nor extra info, to simply eliminate the
supervise(8) "middlemen", and have svscan supervise the services
directly. This is all that perpd(8) does (as well as what
init/minit/ninit do, too.)

perpd does provide redundant supervision with perpboot/inittab by
default when installed with perp-setup(8). Imagining any extra
security from additional layers of supervision is merely a placebo,
but you are certainly welcome to it if your base system is so
fundamentally flawed.

For example, you can run one perpd instance per service if you
like. Or you can set up your perpetrate(5) service definitions to
exec services under supervision of rundeux(8).

Of course you can always revert to perp-0.00, too, if you prefer.
It has all the same perp usability, but with a supervisory
architecture that may be more familiar to you.

Wayne
* Re: [announce] perp-2.03: persistent process supervision
From: Paul Jarc @ 2011-03-14 17:52 UTC
To: supervision

Wayne Marshall <wcm@b0llix.net> wrote:
> It is like worrying, what if init(8) should die?

If process 1 dies, the system halts, and we reboot it. But perpd
doesn't run as process 1, right? So if it did receive SIGKILL, for
whatever reason, it's not so obvious what would happen.

> Imagining any extra security from additional layers of supervision
> is merely a placebo, but you are certainly welcome to it if your
> base system is so fundamentally flawed.

No one has suggested adding layers. The separation of duties in
daemontools and other systems doesn't determine the behavior when a
process dies; it just makes it easier for us to *know* what will
happen when a given process dies.

paul
* Re: [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 18:43 UTC
To: Paul Jarc; +Cc: supervision

On Mon, 14 Mar 2011 13:52:58 -0400
prj@po.cwru.edu (Paul Jarc) wrote:

> Wayne Marshall <wcm@b0llix.net> wrote:
> > It is like worrying, what if init(8) should die?
>
> If process 1 dies, the system halts, and we reboot it. But
> perpd doesn't run as process 1, right? So if it did receive
> SIGKILL, for whatever reason, it's not so obvious what would
> happen.

It is as deterministic as if svscan is SIGKILLed: the system is
unstable.

Wayne
* Re: [announce] perp-2.03: persistent process supervision
From: Laurent Bercot @ 2011-03-14 18:34 UTC
To: supervision

> It is like worrying, what if init(8) should die?

 No, not exactly, because init(8) is process 1. If perpd is meant to
be run as process 1, then I have no more questions - it just will
not die, as you say. But if it is not, it is legitimate to wonder
about it dying.

 And btw, I do worry about process 1 dying, not from murder, which
is not possible, but from simple illness, i.e. bugs. That is why I
do not trust Upstart, or MacOS X launchd, or even sysvinit's init:
those programs are too complex for anyone to be able to guarantee
that they can't die, and they don't leak memory, and they don't have
any other bug of the kind. Process 1, the ultimate long-lived
process on a system, should be *proven* to work, and complexity is
antagonistic to that.

> If a system is delivering random SIGKILL, one should select
> another system. There is no peaceful, confident sleeping at
> night otherwise, no matter what supervisory framework you choose.

 But the point of supervision is to make up for deficiencies in the
real world! I don't need the daemons I write to be supervised,
because I know how to write daemons, and they just Do Not Die (TM).
Who needs automatic respawning when there is no bug in your programs
and they just work? In a perfect world, none of the work we're
doing here would be necessary!

 Unfortunately, we're not living in a perfect world, and stuff
happens. We're just building additional safeguards to ensure that
even when the improbable happens, our systems keep working.

 If a SIGKILL hitting perpd is just too improbable for you and you
do not want to cover that case ("perp offers no guarantee against
acts of God, malevolent or stupid root account holders, or buggy
Linux OOMs"), that's perfectly fine, and perp will still be
basically usable about 100% of the time. But, well, I like to turn
the paranoia to the max and be able to say "it still works". :)

> If we talk in terms of daemontools, svscan(8) already keeps a
> table of supervise(8) processes, and svscan itself functions as a
> supervisor of those multiple supervise(8)s. So it is not much of
> a conceptual jump, nor extra info, to simply eliminate the
> supervise(8) "middlemen", and have svscan supervise the services
> directly.

 Oh, I am not attacking perp's design at all. I welcome alternatives
in the world of supervision suites. My own take on the matter, s6
(to be released as soon as the doc is written, which means...
someday), is a daemontools-like design, and we were lacking an
init-like design. Variety is good, and I don't think perpd's design
is a fundamental flaw - I have reasons for liking daemontools'
design better, but they're mostly maintainability- and
aesthetics-related.

 If every Unix admin in the world used perp, or runit, I would be a
happy man. Anything we have here is so much better than what
mainstream offers.

> perpd does provide redundant supervision with perpboot/inittab
> by default when installed with perp-setup(8). Imagining any
> extra security from additional layers of supervision is merely a
> placebo

 It's not about adding layers, it's about dividing responsibilities.
I'll elaborate on this later, right now I have a bus to catch.

--
 Laurent
* Re: [announce] perp-2.03: persistent process supervision
From: Charlie Brady @ 2011-03-14 15:03 UTC
To: Robin Bowes; +Cc: supervision

On Mon, 14 Mar 2011, Robin Bowes wrote:

> On 14/03/11 14:02, Wayne Marshall wrote:
>
> > Under normal (default) configurations, whenever perpd(8)
> > terminates it is restarted by either perpboot(8), or init(8) with
> > a "respawn" configuration in inittab(5). perpd(8) then
> > restarts all services marked for activation in /etc/perp.
>
> So, if I have a service that is normally running, i.e. starts at
> boot, but I have taken it down manually for whatever reason, and
> perpd dies, then my service will also be re-started?

And presumably the converse will apply as well. This is a problem
with runit (and daemontools) - if a service has a 'down' file, but
has been later started, a dying runsv (e.g. if killed by the OOM
killer, or by a service which kills its process group) will be
replaced by runsvdir, but the service will stay down.
* Re: [announce] perp-2.03: persistent process supervision
From: Wayne Marshall @ 2011-03-14 15:35 UTC
To: Charlie Brady; +Cc: supervision

On Mon, 14 Mar 2011 11:03:45 -0400 (EDT)
Charlie Brady <charlieb-supervision@budge.apana.org.au> wrote:

> On Mon, 14 Mar 2011, Robin Bowes wrote:
>
> > On 14/03/11 14:02, Wayne Marshall wrote:
> >
> > > Under normal (default) configurations, whenever perpd(8)
> > > terminates it is restarted by either perpboot(8), or
> > > init(8) with a "respawn" configuration in inittab(5).
> > > perpd(8) then restarts all services marked for activation
> > > in /etc/perp.
> >
> > So, if I have a service that is normally running, i.e. starts
> > at boot, but I have taken it down manually for whatever
> > reason, and perpd dies, then my service will also be
> > re-started?
>
> And presumably the converse will apply as well. This is a
> problem with runit (and daemontools) - if a service has a
> 'down' file, but has been later started, a dying runsv (e.g.
> if killed by the OOM killer, or by a service which kills its
> process group) will be replaced by runsvdir, but the service
> will stay down.

This is not so much a "problem" of design, but rather of
administrative clarity.

Use "flag.down" only when you don't want a service to start
immediately with perpd, but do want it activated and available to
perpctl administration.

As an example, I use a wpa_supplicant service definition on my
laptop. It is defined with "flag.down", because I don't care for a
wireless connection in all circumstances. Other network scripts may
then call:

  perpctl up wpa_supplicant

or

  perpctl down wpa_supplicant

as necessary.

Otherwise -- and generally for any truly persistent process service
-- administrators will avoid using the "flag.down" mechanism in
favor of the easy, in-place service activation/deactivation
mechanism that perpd provides with the service directory sticky bit.

Wayne
* Re: [announce] perp-2.03: persistent process supervision
From: Laurent Bercot @ 2011-03-14 17:02 UTC
To: supervision

>> So, if I have a service that is normally running, i.e. starts at
>> boot, but I have taken it down manually for whatever reason, and
>> perpd dies, then my service will also be re-started?
>
> And presumably the converse will apply as well. This is a problem
> with runit (and daemontools) - if a service has a 'down' file, but
> has been later started, a dying runsv (e.g. if killed by the OOM
> killer, or by a service which kills its process group) will be
> replaced by runsvdir, but the service will stay down.

 We've already discussed this. The default state of a service is
controlled by the absence or presence of a 'down' file. The actual
state of a service can be changed either manually or via a script,
but this state *cannot be strongly guaranteed* if it does not match
the default. This is an unavoidable limit of daemontools-like
supervision schemes; do not blame it on perp's design.

--
 Laurent
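[Editor's sketch] The distinction above between a service's default state (on disk) and its actual state (in the supervisor's memory) can be modelled in a few lines. This is a toy Python model with hypothetical class and method names, not code from any of the suites discussed. It assumes only what the message states: the default comes from the absence or presence of a 'down' file, and manual changes are not persisted.

```python
from dataclasses import dataclass, field


@dataclass
class Supervisor:
    # On-disk defaults: service name -> True if a 'down' file is present.
    down_files: dict
    # In-memory runtime state: service name -> True if currently up.
    running: dict = field(default_factory=dict)

    def boot(self):
        """Start or restart: runtime state is rebuilt from defaults only."""
        self.running = {
            name: not has_down for name, has_down in self.down_files.items()
        }

    def control(self, name, up):
        """Manual override, like a control command; kept in memory only."""
        self.running[name] = up
```

A short usage example: both manual overrides are forgotten when the supervisor is restarted, in either direction.

```python
sup = Supervisor(down_files={"web": False, "wifi": True})
sup.boot()                  # web up, wifi down (the defaults)
sup.control("web", False)   # admin takes web down by hand
sup.control("wifi", True)   # admin brings wifi up by hand
sup.boot()                  # supervisor dies and is restarted
# web is up again and wifi is down again
```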
* Re: [announce] perp-2.03: persistent process supervision
From: Charlie Brady @ 2011-03-14 17:42 UTC
To: Laurent Bercot; +Cc: supervision

On Mon, 14 Mar 2011, Laurent Bercot wrote:

> >> So, if I have a service that is normally running, i.e. starts
> >> at boot, but I have taken it down manually for whatever reason,
> >> and perpd dies, then my service will also be re-started?
> >
> > And presumably the converse will apply as well. This is a problem
> > with runit (and daemontools) - if a service has a 'down' file, but
> > has been later started, a dying runsv (e.g. if killed by the OOM
> > killer, or by a service which kills its process group) will be
> > replaced by runsvdir, but the service will stay down.
>
>  We've already discussed this. The default state of a service is
> controlled by the absence or presence of a 'down' file. The actual
> state of a service can be changed either manually or via a script,
> but this state *cannot be strongly guaranteed* if it does not match
> the default. This is an unavoidable limit of daemontools-like
> supervision schemes; do not blame it on perp's design.

I don't. I was just pointing it out, as one case of what can happen
when state is in memory.
Thread overview: 14 messages (end of thread, newest 2011-03-14 18:43 UTC)

2011-03-14 10:39 [announce] perp-2.03: persistent process supervision Wayne Marshall
2011-03-14 13:17 ` Laurent Bercot
2011-03-14 14:02   ` Wayne Marshall
2011-03-14 14:23     ` Robin Bowes
2011-03-14 14:34       ` Wayne Marshall
2011-03-14 16:47         ` Laurent Bercot
2011-03-14 17:39           ` Wayne Marshall
2011-03-14 17:52             ` Paul Jarc
2011-03-14 18:43               ` Wayne Marshall
2011-03-14 18:34             ` Laurent Bercot
2011-03-14 15:03       ` Charlie Brady
2011-03-14 15:35         ` Wayne Marshall
2011-03-14 17:02         ` Laurent Bercot
2011-03-14 17:42           ` Charlie Brady