mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: "Nieminen, Jussi" <Jussi.Nieminen@dynatrace.com>
Cc: "musl@lists.openwall.com" <musl@lists.openwall.com>
Subject: Re: [musl] Bug in getaddrinfo causing spurious returns with wrong error values
Date: Tue, 19 Jul 2022 21:54:59 -0400
Message-ID: <20220720015457.GC7074@brightrain.aerifal.cx>
In-Reply-To: <DM5PR13MB421422A1B902BE2CF66AF6C69D609@DM5PR13MB4214.namprd13.prod.outlook.com>

On Tue, Nov 23, 2021 at 02:47:49PM +0000, Nieminen, Jussi wrote:
> Hi,
> 
> I'm a developer from the performance monitoring company Dynatrace, and I've been
> recently investigating curious problems at our customers' environments where a
> call to musl's getaddrinfo appears to spuriously return ENOENT when called from
> a node.js application that is being monitored with the Dynatrace agent.
> 
> I managed to pinpoint the problem to the code that performs the AI_ADDRCONFIG
> check. If an address family that is not enabled on the host is specified, a call
> to "connect" in that code fails, the socket fd is closed, and the value of
> "errno" is then evaluated.
> 
> The problem is that the call to "close" can change the value of errno, which
> will break the switch-case that follows it. Especially if aio is used (which is
> the case when the Dynatrace agent is included in the application), the call to
> close will end up setting errno to ENOENT by default (even without a failure)
> within the "aio_cancel" function if an aio operation is active. In such a case
> getaddrinfo will then incorrectly return EAI_SYSTEM with errno set to ENOENT.
> 
> (After some error code translations within libuv, node.js will then print an
> error message claiming that getaddrinfo failed with ENOENT which is rather
> confusing.)
> 
> Even if aio is not used, the code might fail whenever "close" gets interrupted
> and returns with errno set to EINTR. As the return value of close is not
> checked, the errno might thus "silently" change before getting evaluated with
> the assumption that it still contains the value set when "connect" failed.
> 
> Below is a simple patch that should take care of this problem. Let me know if I
> can provide any more information or if there is anything else I can help with.
> 
> Thanks,
> Jussi
> 
> 
> -------------------------------------------------------------------------------
> diff --git a/src/network/getaddrinfo.c b/src/network/getaddrinfo.c
> index efaab306..71809856 100644
> --- a/src/network/getaddrinfo.c
> +++ b/src/network/getaddrinfo.c
> @@ -16,6 +16,7 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>         char canon[256], *outcanon;
>         int nservs, naddrs, nais, canon_len, i, j, k;
>         int family = AF_UNSPEC, flags = 0, proto = 0, socktype = 0;
> +       int saved_errno = 0;
>         struct aibuf *out;
> 
>         if (!host && !serv) return EAI_NONAME;
> @@ -66,11 +67,14 @@ int getaddrinfo(const char *restrict host, const char *restrict serv, const stru
>                                 pthread_setcancelstate(
>                                         PTHREAD_CANCEL_DISABLE, &cs);
>                                 int r = connect(s, ta[i], tl[i]);
> +                               /* The call to "close" might change errno, especially if aio is in use;
> +                                * save the value set by "connect" for the later comparison. */
> +                               if (r < 0) saved_errno = errno;
>                                 pthread_setcancelstate(cs, 0);
>                                 close(s);
>                                 if (!r) continue;
>                         }
> -                       switch (errno) {
> +                       switch (saved_errno) {
>                         case EADDRNOTAVAIL:
>                         case EAFNOSUPPORT:
>                         case EHOSTUNREACH:
> -------------------------------------------------------------------------------

A couple minor problems with the patch:

- The errno from socket() is ignored when socket() itself is what
  failed, since saved_errno is only set after connect(). I'm not sure
  yet if that matters, but I think it may if IPv6 was disabled in a
  way that makes socket() fail.

- In the case where EAI_SYSTEM is returned, the error is not restored
  back into errno, so the caller cannot get the cause of the error if
  it was clobbered by close.

I'll work on a fixed version. I think the right thing to do is just
save/restore errno itself rather than switching on saved_errno.
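
For illustration, a rough sketch of the save/restore idea (not the
actual fix; close_keep_errno and probe_family are hypothetical names
made up here, and the probe is simplified from what getaddrinfo does):

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not musl code: close fd without letting
 * close() overwrite the errno value the caller is about to inspect. */
static void close_keep_errno(int fd)
{
	int saved = errno;
	close(fd);
	errno = saved;
}

/* Hypothetical helper: returns 0 if a UDP connect() to sa succeeds,
 * or -1 with errno describing the socket() or connect() failure. */
static int probe_family(const struct sockaddr *sa, socklen_t len)
{
	int s = socket(sa->sa_family, SOCK_DGRAM, IPPROTO_UDP);
	if (s < 0) return -1;      /* errno comes from socket() */
	int r = connect(s, sa, len);
	close_keep_errno(s);       /* errno still comes from connect() */
	return r;
}
```

With errno put back right after close(), the existing switch (errno)
can stay as it is, a socket() failure is still seen by it, and a
caller that gets EAI_SYSTEM finds the original cause in errno.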

Rich