* supervising postfix
@ 2004-10-16 4:35 Vincent Danen
2004-10-16 19:11 ` Charlie Brady
0 siblings, 1 reply; 8+ messages in thread
From: Vincent Danen @ 2004-10-16 4:35 UTC (permalink / raw)
Ok, this is driving me nuts, and I'm hoping maybe someone else has
dealt with and solved this. In Annvix, we ship both exim and postfix
(exim being preferred... it runs awesome supervised). The same can't
be said of postfix, however.
I spent all day upgrading my rpm packages from 2.0.13 to 2.1.5 to find
that somewhere, somehow, it's operating a little differently, and I'm
not at all sure why. I've found some info regarding postfix under
daemontools, but it looks like it's all for version 2.0.x or earlier.
This is the run script I'm using:
#!/bin/sh
# this was originally posted at
# http://mandree.home.pages.de/postfix/daemontools.html
# but doesn't seem to be there anymore... thanks google!
set -e
PATH="/sbin:/usr/sbin:/bin:/usr/bin"
# this runs postfix supervised
command_directory=`postconf -h command_directory`
daemon_directory=`$command_directory/postconf -h daemon_directory`
# kill postfix if running to ensure we run supervised
$daemon_directory/master -t || $command_directory/postfix stop
# make consistency check
#$command_directory/postfix check >/dev/console 2>&1
$daemon_directory/master 2>&1
I can't use exec for master because if I do I get this written to my
mail.log:
Oct 9 14:31:46 test postfix/master[1941]: fatal: unable to set session
and process group ID: Operation not permitted
However, for some odd reason if I manually run the run script (ie. sh
-x ./run) the master process starts and starts the children properly,
etc. If I try to make /service/postfix available to runsv by itself,
runsv never seems to pick up. But if I do "runsv /service/postfix"
then it will run (but that's not how it should be). If I do "runsvctrl
u postfix" I get:
[root@test postfix]# runsvctrl u /service/postfix
runsvctrl: warning: /service/postfix: supervise not running.
I'm really stumped on this one... I've never seen runsv not respond to
a service like this. Personally, I wouldn't mind ditching postfix
entirely but I think I'd have some users upset with me, and I'd really
like to not have wasted an entire day on this (and I don't want to run
postfix by itself outside of runsv... that defeats much of the purpose
of the system).
Anyone have any ideas they could toss to me? I'm about ready for any
bones here and willing to try anything. If I had hair I'd be ripping
it out.
If there's more info I can provide, please ask... I think I've included
all the pertinent info, but I may have missed something.
--
Annvix - Secure Linux Server: http://annvix.org/
*Please note gpg keyid FE6F2AFD has been replaced with keyid FEE30AD4*
"lynx -source http://linsec.ca/vdanen.asc | gpg --import"
{FEE30AD4 : 7F6C A60C 06C2 4811 FA1C A2BC 2EBC 5E32 FEE3 0AD4}
* Re: supervising postfix
From: Charlie Brady @ 2004-10-16 19:11 UTC (permalink / raw)
Cc: <supervision@list.skarnet.org>
On Fri, 15 Oct 2004, Vincent Danen wrote:
> In Annvix, we ship both exim and postfix (exim being preferred... it
> runs awesome supervised). The same can't be said of postfix, however.
You can't expect any help from Postfix's author:
http://archives.neohapsis.com/archives/postfix/2001-08/1455.html
Postfix must be started with the postfix-script shell script that
is provided with the Postfix source code. Other startup procedures
are not supported. In other words, if you start Postfix in a
different manner, then you've broken the Postfix warranty. You do
so at your own risk, and I don't care why it breaks.
[In case you are not aware, there is a long running feud between DJB and
Wietse Venema.]
> $daemon_directory/master 2>&1
>
>
> I can't use exec for master because if I do I get this written to my
> mail.log:
>
> Oct 9 14:31:46 test postfix/master[1941]: fatal: unable to set session
> and process group ID: Operation not permitted
Wietse goes on to say:
That said, the problem described below could be evidence of a bug
in the implementation of the setsid() system call.
The Postfix master "super-server" calls setsid(). setsid() makes
the Postfix master the leader of a new process group. Any signals
sent by Postfix to the default process group are limited to processes
within that new process group. If such a signal kills the master's
parent process, then the kernel's implementation of setsid()
is broken and needs to be fixed.
Which is all very easy to say. Perhaps he made that statement before
checking the return value of setsid(), as postfix now appears to do.
"man 2 setsid" should help you:
ERRORS
On error, -1 will be returned. The only error which can happen is
EPERM. It is returned when the process group ID of any process equals
the PID of the calling process. Thus, in particular, setsid fails if
the calling process is already a process group leader.
I don't see the logic in postfix interpreting this as a fatal error.
Postfix wanted to be the process group leader. It already was. Where's
the fatal problem?
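The EPERM case from the man page excerpt can be reproduced outside Postfix. What follows is a minimal sketch, assuming a POSIX system. The function name setsid_errno_in_child is made up for this illustration and is not part of Postfix or runit. It forks a child, optionally makes that child a process group leader with setpgid(0, 0), and reports what setsid() did there.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child, optionally make it a process group
 * leader first, then call setsid() in it.
 * Returns the errno that setsid() produced in the child (0 on success),
 * or -1 if the experiment itself could not be carried out.
 */
int setsid_errno_in_child(int become_leader_first)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* A freshly forked child stays in its parent's process group,
         * so its PID differs from its PGID: it is not a leader yet. */
        if (become_leader_first && setpgid(0, 0) < 0)
            _exit(255);
        /* errno values are small, so one fits in an exit status. */
        _exit(setsid() < 0 ? errno : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

Called with 0, the child is not a group leader and setsid() succeeds, so the function returns 0. Called with 1, setsid() fails in the child and the function returns EPERM, which matches the "Operation not permitted" text in the log line quoted above.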
> However, for some odd reason if I manually run the run script (ie. sh
> -x ./run) the master process starts and starts the children properly,
> etc.
That'd be right, since your shell is not a process group leader.
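Whether a given process is a group or session leader can be checked rather than guessed. The following is a small sketch, assuming a POSIX system; the function names are hypothetical and only wrap getpid(), getpgrp() and getsid(). Something like report_ids() could be called at the top of a program started from a run script to see which IDs it inherits.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the calling process leads its process group, else 0. */
int is_group_leader(void)
{
    return getpid() == getpgrp();
}

/* Returns 1 if the calling process leads its session, else 0. */
int is_session_leader(void)
{
    return getpid() == getsid(0);
}

/* Writes a one-line summary of the caller's IDs to the given stream. */
void report_ids(FILE *out)
{
    fprintf(out, "pid=%ld pgid=%ld sid=%ld\n",
            (long) getpid(), (long) getpgrp(), (long) getsid(0));
}
```

A process for which is_group_leader() returns 1 is one where setsid() would be expected to fail with EPERM.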
> I'm really stumped on this one...
You'll either need to ensure that the run script is not a process group
leader (remove -P from runsvdir, and possibly add "chpst -P" to most other
run scripts), or fix postfix to turn the fatal error into a warning.
They're my guesses, anyway. Like you, I choose not to run postfix, so I
don't have first-hand experience with it and its foibles.
> Anyone have any ideas they could toss to me? I'm about ready for any
> bones here and willing to try anything. If I had hair I'd be ripping
> it out.
:-)
---
Charlie
* Re: supervising postfix
From: Vincent Danen @ 2004-10-16 19:28 UTC (permalink / raw)
Cc: <supervision@list.skarnet.org>
On 16-Oct-04, at 1:11 PM, Charlie Brady wrote:
>> In Annvix, we ship both exim and postfix (exim being preferred... it
>> runs awesome supervised). The same can't be said of postfix, however.
>
> You can't expect any help from Postfix's author:
>
> http://archives.neohapsis.com/archives/postfix/2001-08/1455.html
>
> Postfix must be started with the postfix-script shell script that
> is provided with the Postfix source code. Other startup procedures
> are not supported. In other words, if you start Postfix in a
> different manner, then you've broken the Postfix warranty. You do
> so at your own risk, and I don't care why it breaks.
>
> [In case you are not aware, there is a long running feud between DJB
> and
> Wietse Venema.]
Oh, I'm well aware of this feud... unfortunately, it's not limited to
Venema and DJB, but extends to the respective qmail and postfix users. I
used to be a (non-rabid) qmail user... I prefer the tranquility of exim now.
=)
>> $daemon_directory/master 2>&1
>>
>>
>> I can't use exec for master because if I do I get this written to my
>> mail.log:
>>
>> Oct 9 14:31:46 test postfix/master[1941]: fatal: unable to set
>> session
>> and process group ID: Operation not permitted
>
> Wietse goes on to say:
>
> That said, the problem described below could be evidence of a bug
> in the implementation of the setsid() system call.
>
> The Postfix master "super-server" calls setsid(). setsid() makes
> the Postfix master the leader of a new process group. Any signals
> sent by Postfix to the default process group are limited to processes
> within that new process group. If such a signal kills the master's
> parent process, then the kernel's implementation of setsid()
> is broken and needs to be fixed.
>
> Which is all very easy to say. Perhaps he made that statement before
> checking the return value of setsid(), as postfix now appears to do.
>
> "man 2 setsid" should help you:
>
> ERRORS
> On error, -1 will be returned. The only error which can
> happen is
> EPERM. It is returned when the process group ID of any
> process equals
> the PID of the calling process. Thus, in particular, setsid
> fails if
> the calling process is already a process group leader.
>
> I don't see the logic in postfix interpreting this as a fatal error.
> Postfix wanted to be the process group leader. It already was. Where's
> the fatal problem?
Hmmm... yeah... that doesn't make much sense.
>> However, for some odd reason if I manually run the run script (ie. sh
>> -x ./run) the master process starts and starts the children properly,
>> etc.
>
> That'd be right, since your shell is not a process group leader.
Ahhh... makes sense.
>> I'm really stumped on this one...
>
> You'll either need to ensure that the run script is not a process group
> leader (remove -P from runsvdir, and possibly add "chpst -P" to most
> other
> run scripts), or fix postfix to turn the fatal error into a warning.
runsvdir doesn't run with -P. I tried using chpst -P on postfix, but
that didn't work. I'm not too terribly interested in changing all the
runscripts to chpst -P every other service (I haven't had the need to
do it for any yet).
Patching postfix is not my idea of a good time, either. I'd prefer to
not mangle as much software as possible because it becomes a
maintenance nuisance.
> They're my guesses, anyway. Like you, I choose not to run postfix, so I
> don't have first-hand experience with it and its foibles.
Lucky us I guess... =)
I think what I may end up doing is calling "postfix start" from stage 2
if something like /etc/sysconfig/postfix contains "START=yes" or
something similar. Then in stage 3 I'll issue a "postfix stop". Goes
against how I like to do things, but it seems like "master" is doing a
bit of supervision on its own so instead of using (on Annvix anyways)
"srv stop postfix" one would have to issue "postfix stop". I dislike
that it needs to be different, but at least this way I don't have to
fall back to a traditional initscript. I could then have a runscript
for service postfix that just checks every few seconds to make sure
that master is still running, and if it is, sleep for another 5 seconds
and then do another check. If master doesn't seem to be running, then
just issue "postfix start" and sleep again.
A bit of a compromise, but I think it might be the best solution.
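The polling check described above boils down to asking the kernel whether a PID still exists. Here is a minimal sketch of that check in C, assuming a POSIX system. The names pid_is_alive and watch_pid are hypothetical, and how the PID of master is obtained is deliberately left out. A caller would react to a return value of 0 from watch_pid() by issuing "postfix start" again.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the process exists, 0 if it does not, -1 if unknown. */
int pid_is_alive(pid_t pid)
{
    if (pid <= 0)
        return -1;   /* 0 and negative values address groups, not one process */
    if (kill(pid, 0) == 0)
        return 1;    /* signal 0 only performs the existence and permission check */
    if (errno == EPERM)
        return 1;    /* the process exists but belongs to someone else */
    if (errno == ESRCH)
        return 0;    /* no such process */
    return -1;
}

/*
 * Check the process up to max_checks times, sleeping between checks.
 * Returns 0 as soon as it is gone, -1 if a check could not be made,
 * and 1 if it was still alive after the last check.
 */
int watch_pid(pid_t pid, unsigned interval_seconds, int max_checks)
{
    for (int i = 0; i < max_checks; i++) {
        int alive = pid_is_alive(pid);
        if (alive != 1)
            return alive;
        sleep(interval_seconds);
    }
    return 1;
}
```

Unlike a supervisor that is the parent of the service, a poller like this only notices a dead process at the next check, and it cannot tell a restarted process from an unrelated one that reused the PID.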
Thanks for the insight, Charlie.
* Re: supervising postfix
From: Charlie Brady @ 2004-10-16 20:11 UTC (permalink / raw)
Cc: supervision
On Sat, 16 Oct 2004, Vincent Danen wrote:
> >> $daemon_directory/master 2>&1
> >>
> >> I can't use exec for master because if I do I get this written to my
> >> mail.log:
> >>
> >> Oct 9 14:31:46 test postfix/master[1941]: fatal: unable to set
> >> session
> >> and process group ID: Operation not permitted
...
> >> I'm really stumped on this one...
> >
> > You'll either need to ensure that the run script is not a process group
> > leader (remove -P from runsvdir, and possibly add "chpst -P" to most
> > other
> > run scripts), or fix postfix to turn the fatal error into a warning.
>
> runsvdir doesn't run with -P. I tried using chpst -P on postfix, but
> that didn't work. I'm not too terribly interested in changing all the
> runscripts to chpst -P every other service (I haven't had the need to
> do it for any yet).
It's a defensive measure. You can't control when or if a process will kill
its own process group. And you don't want any of those processes taking
out all your stage 2. You won't have the need for it, until you have the
need for it!
> Patching postfix is not my idea of a good time, either. I'd prefer to
> not mangle as much software as possible because it becomes a
> maintenance nuisance.
Sure, but you already have a maintenance problem, right now. Postfix
doesn't run for you.
If you are not using -P anywhere, then maybe you've found a bug with
postfix, and it is trying multiple times to become process group leader or
something. Have you straced it, so you can see what is being called when?
> I think what I may end up doing is calling "postfix start" from stage 2
> if something like /etc/sysconfig/postfix contains "START=yes" or
> something similar. Then in stage 3 I'll issue a "postfix stop". Goes
> against how I like to do things, but it seems like "master" is doing a
> bit of supervision on its own so instead of using (on Annvix anyways)
> "srv stop postfix" one would have to issue "postfix stop". I dislike
> that it needs to be different, but at least this way I don't have to
> fall back to a traditional initscript. I could then have a runscript
> for service postfix that just checks every few seconds to make sure
> that master is still running, and if it is, sleep for another 5 seconds
> and then do another check. If master doesn't seem to be running, then
> just issue "postfix start" and sleep again.
>
> A bit of a compromise, but I think it might be the best solution.
Sounds awful :-(
---
Charlie
* Re: supervising postfix
From: Charlie Brady @ 2004-10-16 20:42 UTC (permalink / raw)
Cc: supervision
On Sat, 16 Oct 2004, Charlie Brady wrote:
> I don't see the logic in postfix interpreting this as a fatal error.
> Postfix wanted to be the process group leader. It already was. Where's
> the fatal problem?
I notice that elsewhere in postfix (in pipe_command.c) setsid failure is
not fatal:
...
        /*
         * Child. Run the child in a separate process group so that the
         * parent can kill not just the child but also its offspring.
         */
    case 0:
        set_ugid(args.uid, args.gid);
        if (setsid() < 0)
            msg_warn("setsid failed: %m");
...
You could do the same in master.c:
...
    /*
     * Run in a separate process group, so that "postfix stop" can terminate
     * all MTA processes cleanly. Give up if we can't separate from our
     * parent process. We're not supposed to blow away the parent.
     */
    if (setsid() == -1)
        msg_fatal("unable to set session and process group ID: %m");
...
The comment, of course, is bogus: one wouldn't blow away the parent if
setsid() fails, since that implies the parent is already in a separate
process group.
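A middle ground between "always fatal" and "always a warning" is to downgrade the failure only when the process can confirm that it already leads its own process group. The following is a sketch under that assumption, not Postfix code: detach_or_warn and the enum are hypothetical names, and plain fprintf() stands in for Postfix's msg_warn() and msg_fatal().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum detach_result {
    DETACH_NEW_SESSION = 0,     /* setsid() succeeded */
    DETACH_ALREADY_LEADER = 1,  /* setsid() failed, but we lead our own group */
    DETACH_FAILED = -1          /* setsid() failed and we are not a group leader */
};

enum detach_result detach_or_warn(void)
{
    if (setsid() != -1)
        return DETACH_NEW_SESSION;

    int saved_errno = errno;    /* keep it safe from the calls below */

    if (saved_errno == EPERM && getpgrp() == getpid()) {
        /* Signals sent to our process group cannot reach the parent. */
        fprintf(stderr, "warning: setsid failed: %s\n", strerror(saved_errno));
        return DETACH_ALREADY_LEADER;
    }

    fprintf(stderr, "fatal: unable to set session and process group ID: %s\n",
            strerror(saved_errno));
    return DETACH_FAILED;
}
```

Note that leading a process group is not the same as leading a session. In the DETACH_ALREADY_LEADER case the process may still share a session with whatever started it, so this only addresses the "blow away the parent" concern from the comment.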
BTW, don't expect to use multilog with postfix:
...
    /*
     * If started from a terminal, get rid of any tty association. This also
     * means that all errors and warnings must go to the syslog daemon.
     */
    for (fd = 0; fd < 3; fd++) {
        (void) close(fd);
        if (open("/dev/null", O_RDWR, 0) != fd)
            msg_fatal("open /dev/null: %m");
    }
...
The comment there is also bogus, since there is no test for being started
from a terminal.
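The effect on a logger attached to stdout can be shown with a pipe standing in for the logging process. This is a minimal sketch, assuming a POSIX system with /dev/null available. The names detach_stdio and bytes_seen_by_logger are hypothetical; only the loop inside detach_stdio mirrors the quoted master.c fragment.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same idea as the quoted loop: point fds 0, 1 and 2 at /dev/null. */
static int detach_stdio(void)
{
    for (int fd = 0; fd < 3; fd++) {
        (void) close(fd);
        /* open() returns the lowest free descriptor, which is fd here
         * because every descriptor below it is open again. */
        if (open("/dev/null", O_RDWR, 0) != fd)
            return -1;
    }
    return 0;
}

/*
 * Start a child whose stdout is a pipe, optionally detach its stdio,
 * then let it write five bytes to stdout.
 * Returns how many bytes arrived at the reading end, or -1 on error.
 */
long bytes_seen_by_logger(int detach_first)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(1);
            close(fds[1]);
        }
        if (detach_first && detach_stdio() < 0)
            _exit(1);
        if (write(STDOUT_FILENO, "hello", 5) != 5)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);              /* otherwise read() would never see EOF */
    long total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || n < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return total;
}
```

Without the detach step the reading end receives all five bytes. With it, the write lands in /dev/null and the reader sees nothing, which is why a logger reading the service's stdout would stay silent.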
---
Charlie
* Re: supervising postfix
From: Vincent Danen @ 2004-10-16 23:37 UTC (permalink / raw)
Cc: supervision
On 16-Oct-04, at 2:11 PM, Charlie Brady wrote:
>>> You'll either need to ensure that the run script is not a process
>>> group
>>> leader (remove -P from runsvdir, and possibly add "chpst -P" to most
>>> other
>>> run scripts), or fix postfix to turn the fatal error into a warning.
>>
>> runsvdir doesn't run with -P. I tried using chpst -P on postfix, but
>> that didn't work. I'm not too terribly interested in changing all the
>> runscripts to chpst -P every other service (I haven't had the need to
>> do it for any yet).
>
> It's a defensive measure. You can't control when or if a process will
> kill
> its own process group. And you don't want any of those processes taking
> out all your stage 2. You won't have the need for it, until you have
> the
> need for it!
Hmmm... so should I be running runsvdir with -P then? And if I do, do
I need to run chpst -P on all the other services?
Defensive measures are good, I'm just not sure of the best way to
implement it. Is running runsvdir with -P sufficient, I guess is what
I'm asking.
>> Patching postfix is not my idea of a good time, either. I'd prefer to
>> not mangle as much software as possible because it becomes a
>> maintenance nuisance.
>
> Sure, but you already have a maintenance problem, right now. Postfix
> doesn't run for you.
Well, it does. Not the way that I exactly want, but I can start
postfix from stage 1 and have it work. Of course, if I do it this way
I have to "exec chpst -P postfix start &" which isn't elegant.
I'm recompiling postfix now with the change to master.c you noted in
your next email and we'll see if I can make master run under
supervision and do the right thing.
> If you are not using -P anywhere, then maybe you've found a bug with
> postfix, and it is trying multiple times to become process group
> leader or
> something. Have you straced it, so you can see what is being called
> when?
Yeah, but most of that is Greek to me. =)
>> I think what I may end up doing is calling "postfix start" from stage
>> 2
>> if something like /etc/sysconfig/postfix contains "START=yes" or
>> something similar. Then in stage 3 I'll issue a "postfix stop". Goes
>> against how I like to do things, but it seems like "master" is doing a
>> bit of supervision on its own so instead of using (on Annvix anyways)
>> "srv stop postfix" one would have to issue "postfix stop". I dislike
>> that it needs to be different, but at least this way I don't have to
>> fall back to a traditional initscript. I could then have a runscript
>> for service postfix that just checks every few seconds to make sure
>> that master is still running, and if it is, sleep for another 5
>> seconds
>> and then do another check. If master doesn't seem to be running, then
>> just issue "postfix start" and sleep again.
>>
>> A bit of a compromise, but I think it might be the best solution.
>
> Sounds awful :-(
It's not, but not really what I want either. It works, which is
something, and it still doesn't rely on clumsy initscripts. It just
isn't quite the way I wanted it, but we'll see if making master warn on
setsid() failure makes it work "properly".
* Re: supervising postfix
From: Vincent Danen @ 2004-10-17 1:38 UTC (permalink / raw)
On 16-Oct-04, at 5:37 PM, Vincent Danen wrote:
>>> Patching postfix is not my idea of a good time, either. I'd prefer
>>> to
>>> not mangle as much software as possible because it becomes a
>>> maintenance nuisance.
>>
>> Sure, but you already have a maintenance problem, right now. Postfix
>> doesn't run for you.
>
> Well, it does. Not the way that I exactly want, but I can start
> postfix from stage 1 and have it work. Of course, if I do it this way
> I have to "exec chpst -P postfix start &" which isn't elegant.
>
> I'm recompiling postfix now with the change to master.c you noted in
> your next email and we'll see if I can make master run under
> supervision and do the right thing.
Ok, just patched master.c and it works properly supervised now. For
the archives, and anyone else looking to run postfix 2.1.5 supervised:
[vdanen@dionysus SPECS]$ bzcat ../SOURCES/postfix-2.1.5-avx-warnsetsid.patch.bz2
--- postfix-2.1.5/src/master/master.c.avx 2004-10-16 15:47:25.000000000 -0600
+++ postfix-2.1.5/src/master/master.c 2004-10-16 15:49:27.000000000 -0600
@@ -286,9 +286,10 @@
      * Run in a separate process group, so that "postfix stop" can terminate
      * all MTA processes cleanly. Give up if we can't separate from our
      * parent process. We're not supposed to blow away the parent.
+     * Annvix: to run master supervised, we change this from being fatal to being a warning
      */
-    if (setsid() == -1)
-        msg_fatal("unable to set session and process group ID: %m");
+    if (setsid() < 0)
+        msg_warn("setsid failed: %m");

     /*
      * Make some room for plumbing with file descriptors. XXX This breaks
Thanks, Charlie! That seemed to work quite well.
* Re: supervising postfix
From: Csillag Tamas @ 2004-11-01 21:45 UTC (permalink / raw)
First I apologize for a late reply in this thread.
On 10/16, Charlie Brady wrote:
>
> On Fri, 15 Oct 2004, Vincent Danen wrote:
>
> > In Annvix, we ship both exim and postfix (exim being preferred... it
> > runs awesome supervised). The same can't be said of postfix, however.
>
> You can't expect any help from Postfix's author:
>
> http://archives.neohapsis.com/archives/postfix/2001-08/1455.html
>
> Postfix must be started with the postfix-script shell script that
> is provided with the Postfix source code. Other startup procedures
> are not supported. In other words, if you start Postfix in a
> different manner, then you've broken the Postfix warranty. You do
> so at your own risk, and I don't care why it breaks.
>
I read the mail linked above, and it is all about Solaris' bad setsid
implementation, isn't it?
>
> > $daemon_directory/master 2>&1
> >
> >
> > I can't use exec for master because if I do I get this written to my
> > mail.log:
> >
> > Oct 9 14:31:46 test postfix/master[1941]: fatal: unable to set session
> > and process group ID: Operation not permitted
Hmm, interesting... it works for me:
$ cat /service/postfix/run
#!/bin/sh
exec 2>&1
exec /usr/lib/postfix/master
And in the log:
2004-11-01_21:24:00.34350 mail.info: postfix/master[15265]: daemon started -- version 2.2-20040829
That reveals the version I am using, what's yours?
--
cstamas