From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_ADSP_CUSTOM_MED, DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FROM,HTML_MESSAGE,MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 26263 invoked from network); 19 Oct 2022 03:45:12 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 19 Oct 2022 03:45:12 -0000 Received: (qmail 24105 invoked by uid 550); 19 Oct 2022 03:45:08 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 24072 invoked from network); 19 Oct 2022 03:45:07 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=NcCEtmtwjgS02t/n7yRGA0qNDAYwmSS267lCDR4olIw=; b=KGm8gaWauxNW+FDo59hpT4EQ7JyVPl1AusF+QKL2zXg3dRvW0XYiXvB/Ky7+gYl/A3 DuA318FEg9L9588YyJe/9i9pavqOWQztt3urLMjDgnJ3uO+5ouG4RWIojeebLCcEkh/c 5kwoS7yAd96nZOiXiYTzULhIPc/+SsbnoziGnJB3rvk2tA+NKw3LZgJMjht/cFEap8hA 8D+62mpE2B0a2lkI3lF9QxjZy6GFuYHoI/X/TL7X+ZllpeJan+rFHzwBgKoddgVzMno0 YOkYtMmTZiK5BmBtUGhoVHHa1SmOdhg28OA6zrwR8D0AmMSPaYUu90z1HPcUNhPdcu6O TScw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=NcCEtmtwjgS02t/n7yRGA0qNDAYwmSS267lCDR4olIw=; b=qJTtyjpZkMLIxkXwWH+WQupqhc1LpVeZo7SBpK37jIqHJS9ViV9UKu+FWzB2WBFNjE jNrQbtWoiQ8JJqY/RO+qxslk0L5Bmz4cspO4wMNvELDPCIflwOnEooox/+u5PJDV/sEB SfSdgt8en2M20+jEkV/HmuLE8TAQBjY6EG56jE6uljmFf0DNMWGo3C438UZWvsavZvvc qLcOy2Uab/Bmf50cAeNGR/U7YahEZTFtQLpsHJPNpKL/1lXclZb1P452H/C5922Lg6PK xVqjINSj8Z/1pxXLjZ8nMxEHVZon4fZlq6ynWnyfrH/vU1sbaw7TV1GEAUTRIKo6zpkx K0ow== X-Gm-Message-State: ACrzQf28Afq+M7M40GcLXdDS5zQObURI25RhZqySVpjOjQN6XQWtNDML 8VVTPjE3Mp+1C1OinlGvAxmHa8fcP/j8gFdVDcm5THOOoKQ= X-Google-Smtp-Source: AMsMyM7iMyw4JBwyD33EcHyr2UtZSg7UYQ4sfE+PwbgZFxvRN4eqjTrhQNZoANxKuSq6Qc7aXnnfs9M1x18hVcNhm2I= X-Received: by 2002:a17:907:74e:b0:74f:83d4:cf58 with SMTP id xc14-20020a170907074e00b0074f83d4cf58mr5105742ejb.178.1666151096326; Tue, 18 Oct 2022 20:44:56 -0700 (PDT) MIME-Version: 1.0 References: <20221018172727.GK29905@brightrain.aerifal.cx> In-Reply-To: <20221018172727.GK29905@brightrain.aerifal.cx> From: Tom Shen Date: Wed, 19 Oct 2022 11:44:45 +0800 Message-ID: To: Rich Felker Cc: musl@lists.openwall.com Content-Type: multipart/alternative; boundary="0000000000003d212a05eb5b08d3" Subject: Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address --0000000000003d212a05eb5b08d3 Content-Type: text/plain; charset="UTF-8" Thanks for replying! > gethostbyname2 is legacy and has lots of inherent problems, so this > really should be changed to use getaddrinfo. If nothing else, such a > change would make it finish twice as fast. Ya, it is obsolete as stated in Linux doc. But a third-party service is using "getent hosts" which calls this API. > Are you saying that the configured nameserver (coredns) is responding > with an answer packet for a different question (A instead of AAAA) > than the query packet asked for? I'm confused why it would do that. > It sounds like this is also wrong behavior by coredns. If it's > rewriting AAAA queries to A for the sake of determining if the name > exists vs NxDomain, that's fine for the outgoing query it performs, > but instead of returning the answer to the rewritten query back to the > asker as an answer that mismatches the question, it should just be > removing the answer section to make it into a NODATA answer (if it's > not already NxDomain). Yes. In CoreDNS, we can configure it to rewrite the request type AAAA to A, so that the response will be IPv4 family and address. This is a CoreDNS feature. I cannot find anything in the DNS standards against it so far. > The expected behavior of a process receiving such an answer would be > to ignore it as bogus, but maybe we're not doing that, and then > wrongly returning answers that don't match the type we asked for? Unfortunately, we return invalid IPv6 address as stated in the previous email. But, maybe it's not good to mention it here, glibc can handle this situation. > I'd rather fix this at the layer the actual bug (missing validation) > is at. So, let me try to suggest a fix in __lookup_name or musl team would help to take a look? So far I cannot find a way to create a pull request to musl project. On Wed, Oct 19, 2022 at 1:27 AM Rich Felker wrote: > On Wed, Oct 19, 2022 at 12:02:36AM +0800, Tom Shen wrote: > > Hi Musl team, > > > > Recently we encountered an issue in requesting name resolution in Alpine > > Linux with Musl. > > > > Environment: > > * Alpine Linux 3.14.2 > > * IPv6 module disabled > > * musl-libc 1.2.2 > > * CoreDNS with "rewrite stop type AAAA A" configured > > > > Reproduce steps: > > 1. getent hosts > > 2. Check output > > > > Expected: Get correct IPv4 address, e.g. 10.96.36.74 > > > > Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the > > actual IPv4 32-bit address followed by zeros. > > > > Debugging: > > > > 1. Alpine Linux's getent ( > > > https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c) > > tries gethostbyname2 with AF_INET6 first, if no result, then calls with > > AF_INET. > > gethostbyname2 is legacy and has lots of inherent problems, so this > really should be changed to use getaddrinfo. If nothing else, such a > change would make it finish twice as fast. > > > 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name > > -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS. > > 3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in > > the RR_AAAA request type being rewritten to RR_A. So the response from > > CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below). > > > > > (gdb) p addrs > > > $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' > > times>, sortkey = 0} > > > (gdb) x/4xb addrs[0]->addr > > > 0x7fffffffe0d8: 0x0a 0x60 0x24 0x4a > > Are you saying that the configured nameserver (coredns) is responding > with an answer packet for a different question (A instead of AAAA) > than the query packet asked for? I'm confused why it would do that. > The expected behavior of a process receiving such an answer would be > to ignore it as bogus, but maybe we're not doing that, and then > wrongly returning answers that don't match the type we asked for? > > It sounds like this is also wrong behavior by coredns. If it's > rewriting AAAA queries to A for the sake of determining if the name > exists vs NxDomain, that's fine for the outgoing query it performs, > but instead of returning the answer to the rewritten query back to the > asker as an answer that mismatches the question, it should just be > removing the answer section to make it into a NODATA answer (if it's > not already NxDomain). > > > 4. In gethostbyname2_r, the result address family is assigned to the > > request's family, regardless of the family in the response. > > > > > h->h_addrtype = af; > > > h->h_length = af==AF_INET6 ? 16 : 4; > > > > > This results in the IPv4 address replied from CoreDNS is wrongly copied > to > > the result as IPv6 address which is a60:244a:: . > > OK, it looks like gethostbyname2_r is assuming the results match the > address type it asked for, which is mostly correct (except AF_UNSPEC? > that doesn't seem to be handled and it's probably invalid but we > should probably error out on it or something); it's __lookup_name > which is wrong to return results of the wrong type (as a consequence > of not validating the DNS responses). > > > 5. getent gets an IPv4 address, so it won't try AF_INET, rather directly > > return the invalid IPv6 address a60:244a:: . > > > > Suggested fix: > > In gethostbyname2_r when adding answered addresses into the result, we > need > > to filter out the addresses with mismatching address family. > > > > > for (i=0; i > > // if (addrs[i].family != h->h_addrtype) continue; // need to fix > the > > > index of h->h_addr_list too > > > h->h_addr_list[i] = (void *)buf; > > > buf += h->h_length; > > > memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length); > > > } > > > > > > > Could you review this issue? Thanks in advance! > > I'd rather fix this at the layer the actual bug (missing validation) > is at. > > Rich > --0000000000003d212a05eb5b08d3 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for replying!

> gethostbyname2 is legacy and has l= ots of inherent problems, so this
> really should be changed to use= getaddrinfo. If nothing else, such a
> change would make it finish t= wice as fast.

Ya, it is obsolete as stated in Lin= ux doc. But a third-party service is using "getent hosts" which c= alls this API.
<= div>
> Are you saying that the configured nameserver (core= dns) is responding
>=C2=A0=C2=A0 with an answer packet for a differ= ent question (A instead of AAAA)
>=C2=A0=C2=A0 than the query packet = asked for? I'm confused why it would do that.

>=C2=A0=C2=A0 It sounds like= this is also wrong behavior by coredns. If it's
>=C2=A0=C2=A0 re= writing AAAA queries to A for the sake of determining if the name
>= =C2=A0=C2=A0 exists vs NxDomain, that's fine for the outgoing query it = performs,
>=C2=A0=C2=A0 but instead of returning the answer to the re= written query back to the
>=C2=A0=C2=A0 asker as an answer that misma= tches the question, it should just be
>=C2=A0=C2=A0 removing the answ= er section to make it into a NODATA answer (if it's
>=C2=A0=C2=A0= not already NxDomain).
=C2=A0=C2=A0
Yes. In= CoreDNS, we can configure it to rewrite the request type AAAA to A, so tha= t the response will be IPv4 family and address. This is a CoreDNS feature. = I cannot find anything in the DNS standards against it so far.

>=C2=A0=C2=A0 Th= e expected behavior of a process receiving such an answer would be
>= =C2=A0=C2=A0 to ignore it as bogus, but maybe we're not doing that, and= then
>=C2=A0=C2=A0 wrongly returning answers that don't match th= e type we asked for?

Unfortunately, we return invalid IPv6 ad= dress as stated in the previous email. But, maybe it's not good to ment= ion it here, glibc can handle this situation.

> I'd rather fix thi= s at the layer the actual bug (missing validation)
> is at.
So, let me try to suggest a fix in __lookup_name or musl= team would help to take a look? So far I cannot find a way to create a pul= l request to musl project.

=
On Wed, Oct 19, 2022 at 1:27 AM Rich = Felker <dalias@libc.org> wrote= :
On Wed, Oct 19= , 2022 at 12:02:36AM +0800, Tom Shen wrote:
> Hi Musl team,
>
> Recently we encountered an issue in requesting name resolution in Alpi= ne
> Linux with Musl.
>
> Environment:
> * Alpine Linux 3.14.2
> * IPv6 module disabled
> * musl-libc 1.2.2
> * CoreDNS with "rewrite stop type AAAA A" configured
>
> Reproduce steps:
> 1. getent hosts <any-name-only-has-ipv4-record>
> 2. Check output
>
> Expected: Get correct IPv4 address, e.g. 10.96.36.74
>
> Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the<= br> > actual IPv4 32-bit address followed by zeros.
>
> Debugging:
>
> 1. Alpine Linux's getent (
> https://github.com/alp= inelinux/aports/blob/3.14-stable/main/musl/getent.c)
> tries gethostbyname2 with AF_INET6 first, if no result, then calls wit= h
> AF_INET.

gethostbyname2 is legacy and has lots of inherent problems, so this
really should be changed to use getaddrinfo. If nothing else, such a
change would make it finish twice as fast.

> 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_nam= e
> -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS= .
> 3. In CoreDNS, we configured "rewrite stop type AAAA A", whi= ch results in
> the RR_AAAA request type being rewritten to RR_A. So the response from=
> CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).<= br> >
> > (gdb) p addrs
> > $55 =3D {{family =3D 2, scopeid =3D 0, addr =3D "\n`$J"= , '\000' <repeats 11
> > times>, sortkey =3D 0}
> > (gdb) x/4xb addrs[0]->addr
> > 0x7fffffffe0d8: 0x0a=C2=A0 =C2=A0 0x60=C2=A0 =C2=A0 0x24=C2=A0 = =C2=A0 0x4a

Are you saying that the configured nameserver (coredns) is responding
with an answer packet for a different question (A instead of AAAA)
than the query packet asked for? I'm confused why it would do that.
The expected behavior of a process receiving such an answer would be
to ignore it as bogus, but maybe we're not doing that, and then
wrongly returning answers that don't match the type we asked for?

It sounds like this is also wrong behavior by coredns. If it's
rewriting AAAA queries to A for the sake of determining if the name
exists vs NxDomain, that's fine for the outgoing query it performs,
but instead of returning the answer to the rewritten query back to the
asker as an answer that mismatches the question, it should just be
removing the answer section to make it into a NODATA answer (if it's not already NxDomain).

> 4. In gethostbyname2_r, the result address family is assigned to the > request's family, regardless of the family in the response.
>
> > h->h_addrtype =3D af;
> > h->h_length =3D af=3D=3DAF_INET6 ? 16 : 4;
> >
>=C2=A0 This results in the IPv4 address replied from CoreDNS is wrongly= copied to
> the result as IPv6 address which is a60:244a:: .

OK, it looks like gethostbyname2_r is assuming the results match the
address type it asked for, which is mostly correct (except AF_UNSPEC?
that doesn't seem to be handled and it's probably invalid but we should probably error out on it or something); it's __lookup_name
which is wrong to return results of the wrong type (as a consequence
of not validating the DNS responses).

> 5. getent gets an IPv4 address, so it won't try AF_INET, rather di= rectly
> return the invalid IPv6 address a60:244a:: .
>
> Suggested fix:
> In gethostbyname2_r when adding answered addresses into the result, we= need
> to filter out the addresses with mismatching address family.
>
> > for (i=3D0; i<cnt; i++) {
> >=C2=A0 =C2=A0 =C2=A0// if (addrs[i].family !=3D h->h_addrtype) = continue; // need to fix the
> > index of h->h_addr_list too
> >=C2=A0 =C2=A0 =C2=A0h->h_addr_list[i] =3D (void *)buf;
> >=C2=A0 =C2=A0 =C2=A0buf +=3D h->h_length;
> >=C2=A0 =C2=A0 =C2=A0memcpy(h->h_addr_list[i], addrs[i].addr, h-= >h_length);
> > }
> >
>
> Could you review this issue? Thanks in advance!

I'd rather fix this at the layer the actual bug (missing validation) is at.

Rich
--0000000000003d212a05eb5b08d3--