Is this correct behaviour or are these just anomalies?
1. Use of backtick variable assignment on FreeBSD doesn't appear correct
2. Use of emptyenv results in a remnant "defunct" process
3. Should a bundle's contents file include the dependencies of its
contents, so that a down change to the bundle brings the service's
components down?

1. I expected to see the date in seconds since the epoch, but the result is
the variable name
# execlineb -Pc 'backtick D { date "+%s" } echo $D'
$D

Note: this isn't how I intend to use backtick, but I try to use the
simplest case to understand how things work.
---
2. When I use emptyenv within an execlineb script, I have a "defunct"
zombie process
89685  3  S<     0:00.01 |-- s6-supervise base:time-srv
 3020  -  S<s    0:00.03 | `-- /usr/local/sbin/ntpd -c /etc/ntp.conf -N -g -u ntpd --nofork
 3601  -  Z<     0:00.00 |   `-- <defunct>

The time server script is
#!/usr/local/bin/execlineb -P
emptyenv
multidefine -d " " "base time ntpd /usr/local/sbin/ntpd" { JAIL SERVICE USER PROGRAM }
background { echo Starting service $SERVICE using $PROGRAM on $JAIL under user $USER }
fdmove 2 1
redirfd -w 1 /m/base:time/fifo
$PROGRAM -c /etc/ntp.conf -N -g -u $USER --nofork

Removing emptyenv prevents the zombie from being created. Is this normal?
---
3. Is it normal/standard/good practice to include a dependency in a
bundle? For example, I have a "time" bundle whose contents are
time-srv. time-srv starts the ntpd service, and has as a dependency
time-log.

Using "s6-rc -u change time", everything behaves as documented, ie
starts "time" which starts time-log, then time-srv. However

# s6-rc -v 9 -d change base:time
s6-rc: info: bringing selected services down
s6-rc: info: processing service base:time-srv: stopping
s6-rc: info: service base:time-srv stopped successfully
# Starting logging service time for base with user s6log folder /var/log/time

and the time-log continues running. Admittedly
# s6-svstat /s/scan/base:time-srv ; s6-svstat /s/scan/base:time-log
down (exitcode 0) 6 seconds, ready 6 seconds   # This is time-srv
up (pid 85131) 6 seconds                       # This is time-log, so it has been restarted

To obtain the desired/expected behaviour and bring time-log down, must
it also be added to the bundle's contents?

These observations were made using FreeBSD 12.2-STABLE on amd64.

Apologies for still asking newbie questions, but I'm trying to embed s6
here, which means I need to understand it properly.
Regards, Dewayne.
Apologies, my earlier email, item 2, pointed to emptyenv as the cause of
zombie processes on FreeBSD 12.2S, actually it is due to background.

# execlineb -Pc 'background { echo hello } pipeline { ps -axw } grep defunct'
hello
30144  0  Z+       0:00.00 <defunct>

while the following tests both foreground and emptyenv
# execlineb -Pc 'emptyenv foreground { echo hello } pipeline { /bin/ps -axw } /usr/bin/grep defunct'
hello
#

Software revision level (as available in the FreeBSD ports system)
execline-2.6.0.1
s6-2.9.1.0
s6-rc-0.5.1.2
skalibs-2.9.2.1

Further detail:
# execlineb -Pc 'emptyenv background { echo hello } pipeline { /bin/ps -axwwdo pid,ppid,stat,command } /usr/bin/grep -B1 "defunct"'
hello
71212 70760 Ss   | | `-- -csh (csh)
16885 71212 S+   | |   `-- /usr/bin/grep -B1 defunct
17052 16885 Z+   | |     |-- <defunct>

I've also placed a ktrace and kdump of
execlineb -Pc 'ktrace -f /tmp/bgnd.kt /usr/local/bin/background { /bin/ps } echo a'
here
http://www.heuristicsystems.com/s6/
>1. I expected to see the date in seconds since time epoch, but result is
>variable name
># execlineb -Pc 'backtick D { date "+%s" } echo $D'
>$D

 Normal behaviour, since there's no shell to interpret $D as the
contents of variable D. Try using "importas D D" before the echo:
it will read the value of D and substitute $D with this value, so
echo will print the value. Yeah, execline is annoying like that, it's
just a habit to take.
 Also, you generally want "backtick -n", to chomp the newline at
the end of your input.

>---
>2. When I use emptyenv within an execlineb script, I have a "defunct"
>zombie process
>89685 3 S< 0:00.01 |-- s6-supervise base:time-srv
> 3020 - S<s 0:00.03 | `-- /usr/local/sbin/ntpd -c /etc/ntp.conf
>-N -g -u ntpd --nofork
> 3601 - Z< 0:00.00 | `-- <defunct>
>
>The time server script is
>#!/usr/local/bin/execlineb -P
>emptyenv
>multidefine -d " " "base time ntpd /usr/local/sbin/ntpd" { JAIL SERVICE
>USER PROGRAM }
>background { echo Starting service $SERVICE using $PROGRAM on $JAIL
>under user $USER }
>fdmove 2 1
>redirfd -w 1 /m/base:time/fifo
>$PROGRAM -c /etc/ntp.conf -N -g -u $USER --nofork
>
>removing emptyenv, prevents the zombie from being created. Is this normal?

 The zombie is the echo program in your background block, since it's a
direct child of your run script and there's nothing that reaps it
after it's forked (fdmove, redirfd, ntpd - those programs don't expect
to inherit a child). So the zombie is expected. To prevent that, use
"background -d", which will doublefork your echo program, so it will
be reparented to pid 1 which will reap it properly.

 The anomaly is that you *don't* have that zombie without emptyenv;
my first guess is that there's something in your environment that changes
the behaviour of ntpd and makes it reap the zombie somehow.

>---
>3. Is it normal/standard/good practice to include a dependency in a
>bundle. For example, I have a "time" bundle whose contents are
>time-srv. time-srv starts the ntpd service, and has as a dependency
>time-log.
>
>Using "s6-rc -u change time", everything behaves as documented, ie
>starts "time" which starts time-log, then time-srv. However
>
># s6-rc -v 9 -d change base:time
>s6-rc: info: bringing selected services down
>s6-rc: info: processing service base:time-srv: stopping
>s6-rc: info: service base:time-srv stopped successfully
># Starting logging service time for base with user s6log folder
>/var/log/time
>
>and the time-log continues running.

 If you only have time-srv in your 'time' bundle, then time-srv and
time are equivalent. Telling s6-rc to bring down time will do the
exact same thing as telling it to bring down time-srv. time-log is
not impacted. So the behaviour is expected.

 If you want "s6-rc -d change time" to also bring down time-log, then
yes, you should add time-log to the time bundle. Then 'time' will
address both time-srv and time-log.

>y 6 seconds # This is time-srv
>up (pid 85131) 6 seconds # This is time-log,so it
>has been restarted

 If you're using a manually created named pipe to transmit data
from time-srv to time-log, that pipe will close when time-srv exits,
and your logger will get EOF and probably exit, which is why it
stopped; but time-log's supervisor has received no instruction that
it should stop, so it will restart it. This is also expected.

 The simplest way of achieving the behaviour you want is s6-rc's
integrated pipeline feature. Get rid of your named pipe and of your
stdout (for time-srv) and stdin (for time-log) redirections; get rid
of your time bundle definition. Then declare time-log as a consumer
for time-srv and time-srv as a producer for time-log. In the
time-log source definition directory, write 'time' into the
pipeline-name file. Then recompile your database.

 This will automatically create a pipe between time-srv and time-log;
the pipe will be held open so it won't close even if one of the
processes exits; and it will automatically create a 'time' bundle
that contains both time-srv and time-log.

 You're on the right track. :)

--
 Laurent
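
For readers following along, the pipeline declaration described in the reply above could be sketched as the shell commands below. This is only an illustration under stated assumptions: the source directory ./s6-rc-source is a hypothetical path, the service names time-srv and time-log and the pipeline name 'time' are taken from the thread, and each service directory still needs its own run script (with the fifo redirections removed), which is not shown here.

```shell
# Hypothetical source directory; substitute the one you give to s6-rc-compile.
SRC=./s6-rc-source
mkdir -p "$SRC/time-srv" "$SRC/time-log"

# Both services are longruns (supervised daemons).
echo longrun > "$SRC/time-srv/type"
echo longrun > "$SRC/time-log/type"

# time-srv's stdout feeds the pipe that time-log reads on its stdin.
echo time-log > "$SRC/time-srv/producer-for"
echo time-srv > "$SRC/time-log/consumer-for"

# Written in the last consumer: the name of the bundle that will
# contain every service in the pipeline.
echo time > "$SRC/time-log/pipeline-name"
```

After adding the run scripts, the database would be recompiled with s6-rc-compile; any existing 'time' bundle definition has to be removed first, as noted above, since the pipeline creates a bundle of that name itself.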
>Apologies, my earlier email, item 2, pointed to emptyenv as the cause of
>zombie processes on FreeBSD 12.2S, actually it is due to background.
Ah, then everything is working as intended and there's no anomaly.
background spawns a process as a direct child, so if the parent execs
into a long-lived program that never reaps bastards (children it doesn't
know it has), then the zombie will hang around.
"background -d" was made for this situation, and will avoid the
zombie.
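
Applied to the time-srv run script from the first message, the change might look like the sketch below. The only difference from the original script is the -d flag; the service directory ./time-srv is a hypothetical location, and the heredoc is used here only so the file can be written in one step.

```shell
# Hypothetical service directory; adjust to your own layout.
SVC=./time-srv
mkdir -p "$SVC"

# The quoted 'EOF' keeps $SERVICE etc. literal, so that execline
# (multidefine), not sh, performs the substitution.
cat > "$SVC/run" <<'EOF'
#!/usr/local/bin/execlineb -P
emptyenv
multidefine -d " " "base time ntpd /usr/local/sbin/ntpd" { JAIL SERVICE USER PROGRAM }
# -d: doublefork, so the echo is never a child of the process that becomes ntpd
background -d { echo Starting service $SERVICE using $PROGRAM on $JAIL under user $USER }
fdmove 2 1
redirfd -w 1 /m/base:time/fifo
$PROGRAM -c /etc/ntp.conf -N -g -u $USER --nofork
EOF
chmod +x "$SVC/run"
```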
--
Laurent
On 4/10/2020 1:14 pm, Laurent Bercot wrote:
>> 1. I expected to see the date in seconds since time epoch, but result is
>> variable name
>> # execlineb -Pc 'backtick D { date "+%s" } echo $D'
>> $D
>
> Normal behaviour, since there's no shell to interpret $D as the
> contents of variable D. Try using "importas D D" before the echo:
> it will read the value of D and substitute $D with this value, so
> echo will print the value. Yeah, execline is annoying like that, it's
> just a habit to take.
> Also, you generally want "backtick -n", to chomp the newline at
> the end of your input.
>
>
>> ---
>> 2. When I use emptyenv within an execlineb script, I have a "defunct"
>> zombie process
>> 89685 3 S< 0:00.01 |-- s6-supervise base:time-srv
>> 3020 - S<s 0:00.03 | `-- /usr/local/sbin/ntpd -c /etc/ntp.conf
>> -N -g -u ntpd --nofork
>> 3601 - Z< 0:00.00 | `-- <defunct>
>>
>> The time server script is
>> #!/usr/local/bin/execlineb -P
>> emptyenv
>> multidefine -d " " "base time ntpd /usr/local/sbin/ntpd" { JAIL SERVICE
>> USER PROGRAM }
>> background { echo Starting service $SERVICE using $PROGRAM on $JAIL
>> under user $USER }
>> fdmove 2 1
>> redirfd -w 1 /m/base:time/fifo
>> $PROGRAM -c /etc/ntp.conf -N -g -u $USER --nofork
>>
>> removing emptyenv, prevents the zombie from being created. Is this
>> normal?
>
> The zombie is the echo program in your background block, since it's a
> direct child of your run script and there's nothing that reaps it
> after it's forked (fdmove, redirfd, ntpd - those programs don't expect
> to inherit a child). So the zombie is expected. To prevent that, use
> "background -d", which will doublefork your echo program, so it will
> be reparented to pid 1 which will reap it properly.
>
EDIT My error, the problem was background, and -d fixes this.

> The anomaly is that you *don't* have that zombie without emptyenv;
> my first guess is that there's something in your environment that changes
> the behaviour of ntpd and makes it reap the zombie somehow.
>
>
>> ---
>> 3. Is it normal/standard/good practice to include a dependency in a
>> bundle. For example, I have a "time" bundle whose contents are
>> time-srv. time-srv starts the ntpd service, and has as a dependency
>> time-log.
>>
>> Using "s6-rc -u change time", everything behaves as documented, ie
>> starts "time" which starts time-log, then time-srv. However
>>
>> # s6-rc -v 9 -d change base:time
>> s6-rc: info: bringing selected services down
>> s6-rc: info: processing service base:time-srv: stopping
>> s6-rc: info: service base:time-srv stopped successfully
>> # Starting logging service time for base with user s6log folder
>> /var/log/time
>>
>> and the time-log continues running.
>
> If you only have time-srv in your 'time' bundle, then time-srv and
> time are equivalent. Telling s6-rc to bring down time will do the
> exact same thing as telling it to bring down time-srv. time-log is
> not impacted. So the behaviour is expected.
>
> If you want "s6-rc -d change time" to also bring down time-log, then
> yes, you should add time-log to the time bundle. Then 'time' will
> address both time-srv and time-log.
>
>
>> y 6 seconds # This is time-srv
>> up (pid 85131) 6 seconds # This is time-log,so it
>> has been restarted
>
> If you're using a manually created named pipe to transmit data
> from time-srv to time-log, that pipe will close when time-srv exits,
> and your logger will get EOF and probably exit, which is why it
> stopped; but time-log's supervisor has received no instruction that
> it should stop, so it will restart it. This is also expected.
>
> The simplest way of achieving the behaviour you want is s6-rc's
> integrated pipeline feature. Get rid of your named pipe and of your
> stdout (for time-srv) and stdin (for time-log) redirections; get rid
> of your time bundle definition. Then declare time-log as a consumer
> for time-srv and time-srv as a producer for time-log. In the
> time-log source definition directory, write 'time' into the
> pipeline-name file. Then recompile your database.
>
> This will automatically create a pipe between time-srv and time-log;
> the pipe will be held open so it won't close even if one of the
> processes exits; and it will automatically create a 'time' bundle
> that contains both time-srv and time-log.
>
> You're on the right track. :)
>
> --
> Laurent
>
>
Laurent,
Thank you very much. Using your advice (re 1 & 2) I've redeployed our
testing platform and everything works as expected :)

re 3. Implementing the producer-for/consumer-for pair, we've gone from
(the application server in jail b3 to the log server in jail b2, Ref1):

# cat b3:named-setup2/up
#!/usr/local/bin/execlineb -P
define D /m/b3/fifo/named
foreground { if -n { test -p $D } foreground { /usr/bin/mkfifo $D } }
foreground { /usr/sbin/chown s6log:named $D }
foreground { /bin/chmod 720 $D }

# cat b3:named2/run
#!/usr/local/bin/execlineb -P
fdmove 2 1
redirfd -w 1 /m/b3/fifo/named
/usr/sbin/jexec b3 /usr/local/sbin/named -f -n 1 -U 1 -u bind -c /usr/local/etc/namedb/named.conf

# cat b3:named-log2/run
#!/usr/local/bin/execlineb -P
emptyenv
redirfd -r 0 /m/b3/fifo/named
/usr/sbin/jexec -U s6log b2 /usr/local/bin/s6-log -b n14 r7000 s100000 S3000000 !"/usr/bin/xz -7q" /var/log/named
#Read as: run in jail b2 as user s6log the s6-log program

TO

# cat b3:named3/run
#!/usr/local/bin/execlineb -P
/usr/sbin/jexec b3 /usr/local/sbin/named -f -n 1 -U 1 -u bind -c /usr/local/etc/namedb/named.conf

# cat b3:named-log3/run
#!/usr/local/bin/execlineb -P
emptyenv
/usr/sbin/jexec -U s6log b2 /usr/local/bin/s6-log -b n14 r7000 s100000 S3000000 !"/usr/bin/xz -7q" /var/log/named

A significant reduction in complexity. And now the reason for my
delay in replying. Magic happened! I was now transmitting data which
crossed jail barriers (from b3 "named" to b2 "named logging"). I needed
to consult with one of the FreeBSD developers to ensure that a security
hole wasn't occurring. :)

It appears (and I'm assuming) that s6 uses the pseudo-terminal subsystem to
communicate. In this specific case, shown below, via pts/3
# procstat -f 96796 95651 92390 | grep -E "text|pts"
96796 named         text v r  r-------   -     - -   /jails/b3/usr/local/sbin/named
96796 named            2 v c  rw------  71 10677 -   /dev/pts/3
95651 s6-log        text v r  r-------   -     - -   /jails/b2/usr/local/bin/s6-log
95651 s6-log           1 v c  rw------  71 10677 -   /dev/pts/3
95651 s6-log           2 v c  rw------  71 10677 -   /dev/pts/3
92390 s6-fdholderd  text v r  r-------   -     - -   /usr/local/bin/s6-fdholderd
92390 s6-fdholderd     2 v c  rw------  71 10677 -   /dev/pts/3

I don't know if this is a good thing, so further investigation is required.
But for now s6 continues to make magic happen.

Kind regards, Dewayne.

References:
1. https://www.freebsd.org/cgi/man.cgi?query=jail&apropos=0&sektion=0&manpath=FreeBSD+12.1-RELEASE&arch=default&format=html
2. https://www.freebsd.org/cgi/man.cgi?query=nullfs&apropos=0&sektion=0&manpath=FreeBSD+12.1-RELEASE&arch=default&format=html
 Glad it's working for you!

>A significant reduction in complexity. However, and the reason for my
>delay in replying. Magic happened! I was now transmitting data which
>crossed jail barriers (from b3 "named" to b2 "named logging"). I needed
>to consult with one of the FreeBSD developers to ensure that a security
>hole wasn't occurring. :)

 Well, that's also what you were doing with your former b3:named2 and
b3:named-log2, except you were transmitting the data via a named pipe
created in your run script explicitly instead of an anonymous pipe
created by s6-rc implicitly. The integrated pipe feature does not touch
your security model at all; if you were to consult with a FreeBSD
developer, you needed to do it before making the change. :)

>It appears (and I'm assuming) that s6 uses pseudo terminal sub-system to
>communicate. In this specific case below, per pts/3

 No, s6 does not use pseudo-terminals at all; all it does is let
processes inherit fds from their parent. In your case, /dev/pts/3
seems to be s6-svscan's stdout and stderr; if you don't want to have
pseudo-terminals, you should check the script that launches your
supervision tree, and redirect s6-svscan's outputs accordingly.

--
 Laurent
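
As a rough illustration of the last point, a launcher that detaches s6-svscan from the terminal might look like the sketch below. Everything specific in it is an assumption rather than the poster's actual setup: the launcher name ./start-svscan, the s6-svscan path /usr/local/bin/s6-svscan and the destination /var/log/s6-svscan.log are hypothetical; only the scan directory /s/scan comes from the thread.

```shell
# Hypothetical launcher script name; adjust to whatever starts your supervision tree.
LAUNCHER=./start-svscan

cat > "$LAUNCHER" <<'EOF'
#!/bin/sh
# Supervised processes inherit these fds from s6-svscan, so pointing
# stdout and stderr at a file keeps /dev/pts/N out of the whole tree.
# /var/log/s6-svscan.log is a hypothetical destination.
exec /usr/local/bin/s6-svscan /s/scan </dev/null >>/var/log/s6-svscan.log 2>&1
EOF
chmod +x "$LAUNCHER"
```

A plain file is used here only to keep the sketch short; any destination that is not a terminal would serve the same purpose.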