From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chuck Lever
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Re: nfs-utils broken with musl: "select: Bad file descriptor"
Date: Tue, 18 Aug 2015 18:44:46 -0700
Message-ID: <0165A67A-873B-4662-A1AF-F5B133204F50@oracle.com>
References: <94A25C39-C496-4D79-948A-18B64C2CDE1D@oracle.com> <20150819012445.GO32742@brightrain.aerifal.cx>
In-Reply-To: <20150819012445.GO32742@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: Rich Felker
Cc: musl@lists.openwall.com

On Aug 18, 2015, at 6:24 PM, Rich Felker wrote:

> On Tue, Aug 18, 2015 at 06:05:01PM -0700, Chuck Lever wrote:
>>>> i think this call goes wrong:
>>>>
>>>>
>>> http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=blob;f=utils/statd/rmtcall.c;hb=HEAD#l56
>>>
>>>>
>>>> it loops for 100 iterations and if all ports are used
>>>> according to getservbyport then it FD_SET(sockfd, &SVC_FDSET);
>>>> with some random high sockfd (eg. 105) that is closed.
>>>>
>>>> ...so should getservbyport fail there?
>>>>
>>>> (according to strace it tries ports 883 to 982)
>>>
>>> I think the application's expectation is that it fail rather than
>>> returning a decimal-string-only service entity. However it looks like
>>> the code is written to handle the case where all 100 iterations fail
>>> to get an anonymous port. The problem seems to be that, when the loop
>>> stops due to hitting the iteration count rather than exiting with
>>> break, i has already been incremented past the last tmp_socket slot,
>>> so the close loop closes the fd that they actually want to use, later
>>> causing EBADF. This is purely an application bug, but it happens not
>>> to get noticed if getservbyport fails anywhere along the way, which
>>> they expect to happen in the usual case.
>>
>> statd_get_socket() is hunting for a privileged source port that
>> is not just unused at the moment, but that is also not going to be
>> used by some other well-known service. This is a long-lived socket
>> that statd uses to communicate with the kernel. It must use a
>> privileged port.
>>
>> if getservbyport(3) is returning something for every port that
>> is tried, then statd_get_socket() will fail to find a usable
>> port.
>>
>> If it's returning 105, that suggests it has run out of retries.
>> It should return -1 in this case. That is a logic bug.
>>
>> But is it true that every port returned by bindresvport(3) is
>> actually defined in /etc/services? Surely there is one open
>> port that can be used. What port does bindresvport(3) start
>> with?
> The logic bug is the count-down loop that closes all the temp sockets.
> In the case where the loop terminates via break, it leaves the last
> one open and only closes the extras. But in the case where the
> loop terminates via the end condition in the for statement, the close
> loop closes all the sockets including the one it intends to use.

OK. Do you have a patch?

Still not clear why it would take 100 tries exactly.

--
Chuck Lever
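
For reference, the close-loop behavior discussed above can be sketched as follows. This is not a patch and not the code from rmtcall.c: it models descriptors with a simulated table instead of real sockets, and the names pick_socket, fake_socket, fake_close, first_free_try and check_found are all hypothetical. The real loop calls bindresvport(3) and getservbyport(3); here "attempt number first_free_try gets an unlisted port" stands in for getservbyport(3) returning NULL.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TRIES 100
#define TABLE_SIZE (MAX_TRIES + 8)

/* Simulated descriptor table: open_table[fd] is true while fd is "open". */
static bool open_table[TABLE_SIZE];
static int next_fd = 3;

void reset_table(void)
{
    for (int fd = 0; fd < TABLE_SIZE; fd++)
        open_table[fd] = false;
    next_fd = 3;
}

bool fd_is_open(int fd)
{
    return fd >= 0 && fd < TABLE_SIZE && open_table[fd];
}

/* Hypothetical stand-in for socket(2): hands out the next descriptor. */
int fake_socket(void)
{
    assert(next_fd < TABLE_SIZE);
    open_table[next_fd] = true;
    return next_fd++;
}

/* Hypothetical stand-in for close(2). */
void fake_close(int fd)
{
    open_table[fd] = false;
}

/*
 * first_free_try: 0-based attempt that gets a port not listed as a
 *                 service, or -1 if every attempt hits a listed port.
 * check_found:    false reproduces the behavior described in the thread;
 *                 true returns -1 when the retries are exhausted.
 */
int pick_socket(int first_free_try, bool check_found)
{
    int tmp[MAX_TRIES];
    int sockfd = -1;
    bool found = false;
    int i;

    for (i = 0; i < MAX_TRIES; i++) {
        sockfd = fake_socket();
        if (i == first_free_try) {
            found = true;
            break;              /* keep sockfd; it is not stored in tmp[] */
        }
        tmp[i] = sockfd;        /* rejected port: remember it for closing */
    }

    /*
     * Count-down close loop. After a break, tmp[0..i-1] holds only the
     * rejected sockets. After the loop runs to its end condition, tmp[]
     * also holds the last sockfd, so that one gets closed as well.
     */
    while (--i >= 0)
        fake_close(tmp[i]);

    if (check_found && !found)
        return -1;
    return sockfd;
}
```

With first_free_try = -1 and check_found = false, the function returns a descriptor that has already been closed, which matches the stale fd that later fails with EBADF. With check_found = true the same case returns -1 instead.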