From: John Mudd
To: Rich Felker
Cc: musl
Subject: Re: ERROR: epoll_create1 failed: Function not implemented ?
Date: Tue, 26 Jun 2018 17:59:48 -0400
Newsgroups: gmane.linux.lib.musl.general
Reply-To: musl@lists.openwall.com
In-Reply-To: <20180626141434.GU1392@brightrain.aerifal.cx>

Thanks, I was able to turn off use of epoll. Here is part of my script. I
added the -UHAVE_SYS_EPOLL_H option and removed HAVE_SYS_EPOLL_H from the
pg_config.h file.

    JOPT=-j6
    POSTGRES=10.3
    cd $BUILD_DIR
    . update
    rm -rf postgresql*
    wget https://ftp.postgresql.org/pub/source/v$POSTGRES/postgresql-$POSTGRES.tar.bz2
    tar xf postgresql*.tar.*
    cd postgresql-$POSTGRES
    # Undefine the fallocate and epoll feature macros via CFLAGS, then
    # delete them from the generated pg_config.h as well.
    ./configure \
        --prefix=$(pwd).install \
        --with-python \
        --with-openssl \
        --with-libxml \
        --with-libxslt \
        --with-includes="$(ls -d ../openssl-*.install/include)" \
        LDFLAGS="-ltinfo -lncurses" CFLAGS='-UHAVE_POSIX_FALLOCATE -UHAVE_SYS_EPOLL_H' && \
    sed -i "/HAVE_POSIX_FALLOC/d" $BUILD_DIR/postgresql-$POSTGRES/src/include/pg_config.h && \
    sed -i "/HAVE_SYS_EPOLL_H/d" $BUILD_DIR/postgresql-$POSTGRES/src/include/pg_config.h && \
    make $JOPT world
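
One way to confirm that the header edit took effect is to check that neither macro is still defined afterwards. The sketch below runs the same two sed expressions on a throwaway sample file rather than the real pg_config.h. The sample contents and the temporary path are assumptions made for illustration only.

```shell
# Sketch: apply the same sed deletions to a sample header and verify the result.
# The sample lines are illustrative, not copied from a real pg_config.h.
hdr=$(mktemp)
cat > "$hdr" <<'EOF'
#define HAVE_POLL 1
#define HAVE_POSIX_FALLOCATE 1
#define HAVE_SYS_EPOLL_H 1
EOF

sed -i "/HAVE_POSIX_FALLOC/d" "$hdr"
sed -i "/HAVE_SYS_EPOLL_H/d" "$hdr"

# Count lines that still mention either macro; 0 means both were removed.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -c -E 'HAVE_POSIX_FALLOCATE|HAVE_SYS_EPOLL_H' "$hdr" || true)
echo "remaining definitions: $remaining"
```

Pointing the same grep at the real src/include/pg_config.h after configure and the sed steps should likewise report 0 if the build is no longer set up to use epoll.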

On Tue, Jun 26, 2018 at 10:14 AM Rich Felker <dalias@libc.org> wrote:
On Tue, Jun 26, 2018 at 01:46:15AM +0200, Szabolcs Nagy wrote:
> * John Mudd <johnbmudd@gmail.com> [2018-06-25 16:49:36 -0400]:
> > I build a dynamically linked version of Postgres using musl. It's been
> > working well for years. I just built a new version and I'm getting the
> > following Postgres error on some machines. Any suggestions?
> >
> >     ERROR:  epoll_create1 failed: Function not implemented
> >
>
> try to run it with strace to see how epoll_create1 is called
>
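
A minimal sketch of that suggestion, assuming strace is installed on the failing machine. The helper name trace_epoll is hypothetical, and the postgres invocation in the usage note is only an example.

```shell
# Sketch: run a command under strace, showing only epoll_create1 calls.
# -f follows child processes; -e trace= restricts output to the named syscall.
trace_epoll() {
    strace -f -e trace=epoll_create1 "$@"
}
```

It could be used as `trace_epoll postgres -D "$PGDATA"`. A call that the kernel does not implement would be expected to show up with an ENOSYS result.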
> > I build on 32-bit Linux Mint 18.3 Sylvia with 4.13.0-39-generic kernel.
> >
> > It runs on some machines such as 64-bit Ubuntu with 4.4.0-121-generic
> > kernel. But fails on CentOS release 5.4 (Final) with 2.6.18-416.el5 #1 SMP
> > kernel.
> >
> > My previous musl builds of Postgres run on all of my machines.
Linux 2.6.18 did not have the SYS_epoll_create1 syscall; it was added
in 2.6.27 (according to man 2 syscalls) which is around the time all
the O_CLOEXEC-family stuff was added. I suspect the new version of
Postgres you updated to is (correctly) passing the EPOLL_CLOEXEC flag
to make opening the epoll fd safe against fd leak races, and there is
fundamentally (well, without horrible hacks) no way to emulate this on
old kernels that lack the functionality.
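
A rough way to check a given machine against the 2.6.27 threshold is sketched below. The helper version_ge is hypothetical and relies on `sort -V` being available. Distribution kernels sometimes backport features, so this is a heuristic and not a definitive test.

```shell
# Sketch: report whether the running kernel is new enough for epoll_create1.
# version_ge is a hypothetical helper: true when $1 >= $2 in version order.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the vendor suffix, e.g. 2.6.18-416.el5 -> 2.6.18
kver=$(uname -r | sed 's/[^0-9.].*//')

if version_ge "$kver" 2.6.27; then
    echo "kernel $kver: epoll_create1 should be available"
else
    echo "kernel $kver: epoll_create1 is probably missing"
fi
```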

For some other interfaces we emulate the functionality non-atomically
with fcntl after the open, but this isn't really a good solution.

Really you should update the kernel to something capable of dealing
safely with fd-leak races. For correct behavior of many interfaces,
musl needs a minimum kernel version of around 2.6.28; behavior with
earlier versions will be best-effort.

If you really can't upgrade the kernel, consider patching Postgres to
remove the EPOLL_CLOEXEC flag (pass 0 for the flag) and possibly
adding a fcntl call to set the FD_CLOEXEC flag after epoll_create[1]
succeeds. Or you can see if there's an option to build without epoll
at all, using the standard poll instead which does not use a fd and is
not affected by this issue.

Rich