From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 12193 invoked from network); 19 Jul 2022 02:57:41 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 19 Jul 2022 02:57:41 -0000 Received: (qmail 10190 invoked by uid 550); 19 Jul 2022 02:57:38 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 10149 invoked from network); 19 Jul 2022 02:57:37 -0000 Date: Mon, 18 Jul 2022 22:57:24 -0400 From: Rich Felker To: "Nieminen, Jussi" Cc: "musl@lists.openwall.com" Message-ID: <20220719025723.GB7074@brightrain.aerifal.cx> References: <20211123150537.GZ7074@brightrain.aerifal.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20211123150537.GZ7074@brightrain.aerifal.cx> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [musl] Bug in getaddrinfo causing spurious returns with wrong error values On Tue, Nov 23, 2021 at 10:05:37AM -0500, Rich Felker wrote: > On Tue, Nov 23, 2021 at 02:47:49PM +0000, Nieminen, Jussi wrote: > > Hi, > > > > I'm a developer from the performance monitoring company Dynatrace, and I've been > > recently investigating curious problems at our customers' environments where a > > call to musl's getaddrinfo appears to spuriously return ENOENT when called from > > a node.js application that is being monitored with the Dynatrace agent. > > > > I managed to pinpoint the problem to the code that performs the AI_ADDRCONFIG > > check. If an address family that is not enabled on the host is specified, a call > > to "connect" in that code fails, the socket fd is closed, and the value of > > "errno" is then evaluated. > > > > The problem is that the call to "close" can change the value of errno, which > > will break the switch-case that follows it. Especially if aio is used (which is > > the case when the Dynatrace agent is included in the application), the call to > > close will end up setting errno to ENOENT by default (even without a failure) > > within the "aio_cancel" function if an aio operation is active. In such a case > > getaddrinfo will then incorrectly return EAI_SYSTEM with errno set to ENOENT. > > > > (After some error code translations within libuv, node.js will then print an > > error message claiming that getaddrinfo failed with ENOENT which is rather > > confusing.) > > Indeed, this all makes sense. > > > Even if aio is not used, the code might fail whenever "close" gets interrupted > > and returns with errno set to EINTR. As the return value of close is not > > checked, the errno might thus "silently" change before getting evaluated with > > the assumption that it still contains the value set when "connect" failed. > > close can't return with EINTR but can return with EINPROGRESS which > would give the same result here, I think. I was mistaken about this. EINPROGRESS was removed back in commit 6bea5dc69892cd9ff0c222474e7dd468c29dfa75 (2015) and is no longer possible. However, as the original report stated, ENOENT comes from __aio_close where errno is clobbered even on success. On looking at this again, I'm thinking we might should rework the stuff causing that to happen, but the patch submitted still looks reasonable since close does not have any contract not to clobber errno. > > Below is a simple patch that should take care of this problem. Let me know if I > > can provide any more information or if there is anything else I can help with. > > I think this patch is probably okay. I wondered if close was in the > set of functions POSIX-future intends to require not to touch errno on > success, but it doesn't seem to be, and moreover the EINPROGRESS > semantics would undermine that anyway. So saving errno before calling > close is probably the right thing to do here. > > > Thanks, > > Jussi > > Thanks for the clear analysis and patch! If nobody objects/sees anything wrong, I'll go ahead and take this now. Sorry for the long delay! Rich