supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* Trouble starting services on boot with s6-rc
@ 2022-08-20 11:01 dark.pen9108
  2022-08-20 12:57 ` Laurent Bercot
  0 siblings, 1 reply; 4+ messages in thread
From: dark.pen9108 @ 2022-08-20 11:01 UTC (permalink / raw)
  To: supervision

Hi,

Hoping this is the right place to ask for some help, as I am very new to s6 and not well versed in any init system.

I’m attempting to use what I understand to be the “new” s6-rc way to start some processes in a docker container.

Just to test what I want to do, I have a oneshot whose job is just to sleep, and a longrun that depends on the oneshot.

I have created the following files 

[Aug 19 18:42:24 root@s6 ~]# ll /etc/s6-overlay/s6-rc.d/longrun_test/
total 12
-rw-rw-r-- 1 root root 13 Aug 17 23:19 dependencies
-rw-rw-r-- 1 root root 53 Aug 19 18:29 run
-rw-rw-r-- 1 root root  8 Aug 17 23:19 type
[Aug 19 18:42:24 root@s6 ~]# cat /etc/s6-overlay/s6-rc.d/longrun_test/type
longrun
[Aug 19 18:42:24 root@s6 ~]# cat /etc/s6-overlay/s6-rc.d/longrun_test/run
#!/command/execlineb -P
/usr/local/bin/longrun_start
[Aug 19 18:42:24 root@s6 ~]# cat /etc/s6-overlay/s6-rc.d/longrun_test/dependencies
sleeponstart
[Aug 19 18:42:24 root@s6 ~]# ll /etc/s6-overlay/s6-rc.d/sleeponstart/
total 8
-rw-rw-r-- 1 root root  8 Aug 15 21:53 type
-rw-rw-r-- 1 root root 52 Aug 16 20:25 up
[Aug 19 18:42:24 root@s6 ~]# cat /etc/s6-overlay/s6-rc.d/sleeponstart/type
oneshot
[Aug 19 18:42:24 root@s6 ~]# cat /etc/s6-overlay/s6-rc.d/sleeponstart/up
#!/command/execlineb -P
/usr/local/bin/sleeponstart

I can see that my services are registered with s6-rc 

[Aug 19 18:43:04 root@s6 ~]# s6-rc-db list all | grep -e longrun_test -e sleeponstart
sleeponstart
longrun_test

The test services are very simple scripts just to prove they ran
[Aug 19 18:44:23 root@s6 ~]# cat /usr/local/bin/sleeponstart
#!/bin/bash
sleep_time=30
sleep ${sleep_time}
echo "I slept for ${sleep_time}." > /tmp/i_slept
[Aug 19 18:44:23 root@s6 ~]# cat /usr/local/bin/longrun_start
#!/bin/bash
echo "I started the longrun" > /tmp/longrun_started

But when I look in /tmp, I never see the files created.

[Aug 19 18:44:23 root@s6 ~]# ll /tmp/
total 652
-rw------- 1 root root   5516 Mar  7 20:55 s6-overlay-noarch.tar.xz
-rw------- 1 root root 656864 Mar  7 20:55 s6-overlay-x86_64.tar.xz

If I start the longrun manually it and its dependency do run.  

[Aug 19 18:50:00 root@s6 ~]# s6-rc -a change longrun_test
[Aug 19 18:52:08 root@s6 ~]# ll /tmp/
total 660
-rw-r--r-- 1 root root     16 Aug 19 18:50 i_slept
-rw-r--r-- 1 root root     22 Aug 19 18:52 longrun_started
-rw------- 1 root root   5516 Mar  7 20:55 s6-overlay-noarch.tar.xz
-rw------- 1 root root 656864 Mar  7 20:55 s6-overlay-x86_64.tar.xz

So I would normally just assume I am missing something needed to make them start on boot, BUT soon after boot I do see my oneshot running:
[Aug 19 18:53:54 root@s6 /]# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.5  0.0    200    64 ?        Ss   18:53   0:00 /package/admin/s6/command/s6-svscan -d4 -- /run/service
root          14  0.0  0.0   4152  3104 pts/0    Ss+  18:53   0:00 /bin/sh -e /run/s6/basedir/scripts/rc.init top
root          15  0.0  0.0    204    64 ?        S    18:53   0:00 s6-supervise s6-linux-init-shutdownd
root          16  0.0  0.0    196     4 ?        Ss   18:53   0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -c /run/s6/basedir -g 3000 -C -B
root          23  0.0  0.0    204    68 ?        S    18:53   0:00 s6-supervise s6rc-oneshot-runner
root          24  0.0  0.0    204    64 ?        S    18:53   0:00 s6-supervise longrun_test
root          25  0.0  0.0    204    60 ?        S    18:53   0:00 s6-supervise s6rc-fdholder
root          28  0.0  0.0    208    64 pts/0    S+   18:53   0:00 s6-rc -v2 -u -t 5000 -- change top
root          32  0.0  0.0    180     4 ?        Ss   18:53   0:00 /package/admin/s6/command/s6-ipcserverd -1 -- /package/admin/s6/command/s6-ipcserver-access -v0 -E -l0 -i data/rules
root          34  0.0  0.0    192    52 pts/0    S+   18:53   0:00 /package/admin/s6-2.11.1.0/command/s6-sudoc -e -t 30000 -T 4999 -- up 3
root          36  0.0  0.0    200    60 ?        S    18:53   0:00 /package/admin/s6/command/s6-sudod -t 30000 -- /package/admin/s6-rc/command/s6-rc-oneshot-run -l ../.. --
root          38  0.0  0.0   4152  3080 ?        S    18:53   0:00 /bin/bash /usr/local/bin/sleeponstart <--------------
root          40  0.0  0.0   2616   940 ?        S    18:53   0:00 sleep 30 <-----------------
root          53  1.0  0.0   4812  3836 pts/1    Ss   18:53   0:00 /bin/bash
root          74  0.0  0.0   7396  3268 pts/1    R+   18:53   0:00 ps aux

Sorry if this is much more info than is relevant. Any help would be great. Thanks so much!

-Benjamin

* Re: Trouble starting services on boot with s6-rc
  2022-08-20 11:01 Trouble starting services on boot with s6-rc dark.pen9108
@ 2022-08-20 12:57 ` Laurent Bercot
  2022-08-20 14:30   ` dark.pen9108
  0 siblings, 1 reply; 4+ messages in thread
From: Laurent Bercot @ 2022-08-20 12:57 UTC (permalink / raw)
  To: supervision

>Hoping this is the right place to ask for some help as I am very new to s6 and not well versed on any init system.

  s6-overlay questions are normally asked in the "Issues" section of the
s6-overlay GitHub repository, but since yours are really s6-rc questions,
it's fine :)  (even though the answers are related to s6-overlay!)


>I can see that my services are registered with s6-rc
>[Aug 19 18:43:04 root@s6 ~]# s6-rc-db list all | grep -e longrun_test -e sleeponstart
>sleeponstart
>longrun_test

  That proves your services are *in the database*, not that they're going
to get started. To make sure your services are started on boot, you
should also check that they're declared in the "user" bundle (that's
the bundle where users declare what services are part of the final
machine state, by s6-overlay policy).

$ s6-rc-db contents user | grep -e longrun_test -e sleeponstart
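
  For reference, here is a minimal sketch of how a service usually ends up
in that bundle, assuming an s6-overlay v3 source layout where the "user"
bundle has a contents.d directory holding one empty file per service. The
SRC variable is only an illustration and defaults to a scratch directory;
in a real image it would be /etc/s6-overlay/s6-rc.d.

```shell
#!/bin/sh
# Sketch: declare services in the "user" bundle by adding empty files,
# named after the services, to the bundle's contents.d directory.
# SRC is a hypothetical variable; it defaults to a scratch directory so
# the snippet can be tried without touching a real image.
SRC=${SRC:-$(mktemp -d)}
mkdir -p "$SRC/user/contents.d"
for svc in sleeponstart longrun_test; do
    : > "$SRC/user/contents.d/$svc"   # empty file, only the name matters
done
ls "$SRC/user/contents.d"
```

  Listing longrun_test alone would normally be enough, since its
dependency is pulled in with it; both are listed here only to mirror the
grep above.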

  But since they *are* started when you boot your container, it means
they're indeed declared there, so that's not the issue. The real issue
is here:

>root          34  0.0  0.0    192    52 pts/0    S+   18:53   0:00 /package/admin/s6-2.11.1.0/command/s6-sudoc -e -t 30000 -T 4999 -- up 3

  The "-T 4999" part means there's a 5 second timeout on running your
sleeponstart service. And since this service is waiting for 30 seconds
before succeeding, and 30 > 5... it means that it times out after 5
seconds and gets killed, so the transition fails, and longrun_test
(which depends on it) is never brought up.
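
  A rough way to see the same effect outside s6-rc, assuming a `timeout`
command (coreutils or busybox) is available: the sketch below runs a
command under a deadline and treats a timeout as a failed start.
run_with_deadline is a made-up helper for illustration, not an s6 tool,
and the durations are scaled down from 30 and 5 seconds so it finishes
quickly.

```shell
#!/bin/sh
# Illustration only: emulate a start-up deadline with `timeout`.
# run_with_deadline is a hypothetical helper, not part of s6 or s6-rc.
run_with_deadline() {
    deadline=$1; shift
    if timeout "$deadline" "$@"; then
        echo "up: finished within ${deadline}s"
    else
        echo "failed: did not finish successfully within ${deadline}s"
        return 1
    fi
}

# Scaled-down stand-ins for "sleep 30" under a 5 second timeout.
run_with_deadline 1 sleep 3 || echo "dependents would not be started"
# The same command with enough headroom succeeds.
run_with_deadline 3 sleep 1 && echo "dependents can be started"
```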

  s6-overlay-specific section:

  - You should see all that happening in your container's logs - s6-rc
prints informative messages when something fails. Unless you have
changed the S6_VERBOSITY value from the default 2 to something lower.

  - The 5-second timeout likely comes from the fact that you have not
modified the S6_CMD_WAIT_FOR_SERVICES_MAXTIME value, which is 5000
(milliseconds) by default, as is written in both the README[1] and the
migration guide[2] :P
If your real service needs it, you can disable the timeout by adding to
your Dockerfile:
ENV S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
or you can adjust it to a bigger value than 5000.

  Good luck,

--
  Laurent


[1]: https://github.com/just-containers/s6-overlay/blob/master/README.md
[2]: https://github.com/just-containers/s6-overlay/blob/master/MOVING-TO-V3.md


* Re: Trouble starting services on boot with s6-rc
  2022-08-20 12:57 ` Laurent Bercot
@ 2022-08-20 14:30   ` dark.pen9108
  2022-08-20 14:58     ` Laurent Bercot
  0 siblings, 1 reply; 4+ messages in thread
From: dark.pen9108 @ 2022-08-20 14:30 UTC (permalink / raw)
  To: supervision

> 
>> root          34  0.0  0.0    192    52 pts/0    S+   18:53   0:00 /package/admin/s6-2.11.1.0/command/s6-sudoc -e -t 30000 -T 4999 -- up 3
> 
>   The "-T 4999" part means there's a 5 second timeout on running your
> sleeponstart service. And since this service is waiting for 30 seconds
> before succeeding, and 30 > 5... it means that it times out after 5
> seconds and gets killed, so the transition fails, and longrun_test
> (which depends on it) is never brought up.
> 
>   s6-overlay-specific section:
> 
>   - You should see all that happening in your container's logs - s6-rc
> prints informative messages when something fails. Unless you have
> changed the S6_VERBOSITY value from the default 2 to something lower.
> 
>   - The 5 seconds timeout likely comes from the fact that you have not
> modified the S6_CMD_WAIT_FOR_SERVICES_MAXTIME value, which is 5000 by
> default. As is written in both the README[1] and the migration guide[2] 
> :P
> If your real service needs it, you can disable the timeout by adding to
> your Dockerfile:
> ENV S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
> or you can adjust it to a bigger value than 5000.

Ahh, very obvious. Sorry, I did read the docs (quite a few times), but starting from zero it was quite a lot to grok and I overlooked this (and, I'm sure, a number of other things). After adjusting S6_CMD_WAIT_FOR_SERVICES_MAXTIME, things are working as expected.

I am a bit ashamed to admit I cannot find the logs. From reading https://wiki.gentoo.org/wiki/S6_and_s6-rc-based_init_system#logger I thought maybe I should be looking for the file /run/uncaught-logs, but I could not find any such file in my docker instance (I understand, docker is not Gentoo).

While the docs did speak a lot to the directory structure used by s6, I am still finding it quite hard to figure out what the default directories are for some things (e.g. I was clear on where my uncompiled s6-rc service directories should go, but they seemed to "magically" get compiled on boot and show up in a scan_dir).

One additional item: it does not seem like a great idea to raise the timeout for all services. Is there any way to adjust it on a per-service basis? If not, consider me a +1 to kindly add it to a wishlist somewhere.

Thanks so much for the quick reply, help, and great free software!

n.b. You should put up a BTC lightning "tip bucket" somewhere :)  

-Benjamin

* Re: Trouble starting services on boot with s6-rc
  2022-08-20 14:30   ` dark.pen9108
@ 2022-08-20 14:58     ` Laurent Bercot
  0 siblings, 0 replies; 4+ messages in thread
From: Laurent Bercot @ 2022-08-20 14:58 UTC (permalink / raw)
  To: supervision


>I am a bit ashamed to admit I cannot find the logs. From reading https://wiki.gentoo.org/wiki/S6_and_s6-rc-based_init_system#logger I thought maybe I should be looking for the file /run/uncaught-logs, but I could not find any such file in my docker instance (I understand, docker is not Gentoo).

  By default, s6-overlay containers don't have a catch-all logger, so
all the logs fall through to the container's stdout/stderr.
  You can have an in-container catch-all logger that logs to
/run/uncaught-logs if you define S6_LOGGING to 1 or 2.


>While the docs did speak a lot to the directory structure used by s6, I am still finding it quite hard to figure out what the default directories are for some things (e.g. I was clear on where my uncompiled s6-rc service directories should go, but they seemed to "magically" get compiled on boot and show up in a scan_dir).

  That's a thing with s6, and even s6-rc: it does not define policies, but
only mechanism. In other words: it lets you place things wherever you want,
there's no default.
  Of course, higher-level software that uses the s6 bricks needs to define
policies in order to get things done; that's why s6-overlay is a set of
scripts putting directories in certain places and calling s6 binaries
with certain arguments and options.
  s6-overlay uses s6-rc-compile under the hood, so it compiles your
database for you at boot time, to keep things as simple as possible (even
though it's not the optimal way of doing it).



>One additional item: it does not seem like a great idea to raise the timeout for all services. Is there any way to adjust it on a per-service basis? If not,
>consider me a +1 to kindly add it to a wishlist somewhere.

  You can define timeout-up and timeout-down files in your s6-rc source
definition directories, they do just that. :)
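
  As a minimal sketch, assuming the service layout from earlier in this
thread: the file holds a number of milliseconds, and 60000 below is an
arbitrary example value. SRC is only an illustration and defaults to a
scratch directory; in the container it would be /etc/s6-overlay/s6-rc.d.

```shell
#!/bin/sh
# Sketch: give only the slow oneshot its own start-up timeout.
# SRC is a hypothetical variable; it defaults to a scratch directory so
# the snippet can be tried without touching a real image.
SRC=${SRC:-$(mktemp -d)}
mkdir -p "$SRC/sleeponstart"
# timeout-up holds a number of milliseconds (60000 is an example value).
echo 60000 > "$SRC/sleeponstart/timeout-up"
cat "$SRC/sleeponstart/timeout-up"
```

  Judging from the "-t 5000" on the "s6-rc ... change top" line in the ps
output, the global S6_CMD_WAIT_FOR_SERVICES_MAXTIME limit still appears to
bound the whole transition, so it would presumably have to be at least as
large as the longest per-service value (or 0).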


>n.b. You should put up a BTC lightning "tip bucket" somewhere :)

  Thank you. Please see this Twitter thread:
  https://twitter.com/laurentbercot/status/1209247040657674240

--
  Laurent

