mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
@ 2022-10-18 16:02 Tom Shen
  2022-10-18 17:27 ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Shen @ 2022-10-18 16:02 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2153 bytes --]

Hi Musl team,

Recently we encountered an issue in requesting name resolution in Alpine
Linux with Musl.

Environment:
* Alpine Linux 3.14.2
* IPv6 module disabled
* musl-libc 1.2.2
* CoreDNS with "rewrite stop type AAAA A" configured

Reproduce steps:
1. getent hosts <any-name-only-has-ipv4-record>
2. Check output

Expected: Get correct IPv4 address, e.g. 10.96.36.74

Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the
actual IPv4 32-bit address followed by zeros.

Debugging:

1. Alpine Linux's getent (
https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c)
tries gethostbyname2 with AF_INET6 first, if no result, then calls with
AF_INET.
2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name
-> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS.
3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in
the RR_AAAA request type being rewritten to RR_A. So the response from
CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).

> (gdb) p addrs
> $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' <repeats 11
> times>, sortkey = 0}
> (gdb) x/4xb addrs[0]->addr
> 0x7fffffffe0d8: 0x0a    0x60    0x24    0x4a
>

4. In gethostbyname2_r, the result address family is assigned to the
request's family, regardless of the family in the response.

> h->h_addrtype = af;
> h->h_length = af==AF_INET6 ? 16 : 4;
>
 This results in the IPv4 address replied from CoreDNS is wrongly copied to
the result as IPv6 address which is a60:244a:: .
5. getent gets an IPv4 address, so it won't try AF_INET, rather directly
return the invalid IPv6 address a60:244a:: .

Suggested fix:
In gethostbyname2_r when adding answered addresses into the result, we need
to filter out the addresses with mismatching address family.

> for (i=0; i<cnt; i++) {
>     // if (addrs[i].family != h->h_addrtype) continue; // need to fix the
> index of h->h_addr_list too
>     h->h_addr_list[i] = (void *)buf;
>     buf += h->h_length;
>     memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length);
> }
>

Could you review this issue? Thanks in advance!

Best regards!
Tom Shen

[-- Attachment #2: Type: text/html, Size: 3166 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-18 16:02 [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address Tom Shen
@ 2022-10-18 17:27 ` Rich Felker
  2022-10-19  3:44   ` Tom Shen
  0 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2022-10-18 17:27 UTC (permalink / raw)
  To: Tom Shen; +Cc: musl

On Wed, Oct 19, 2022 at 12:02:36AM +0800, Tom Shen wrote:
> Hi Musl team,
> 
> Recently we encountered an issue in requesting name resolution in Alpine
> Linux with Musl.
> 
> Environment:
> * Alpine Linux 3.14.2
> * IPv6 module disabled
> * musl-libc 1.2.2
> * CoreDNS with "rewrite stop type AAAA A" configured
> 
> Reproduce steps:
> 1. getent hosts <any-name-only-has-ipv4-record>
> 2. Check output
> 
> Expected: Get correct IPv4 address, e.g. 10.96.36.74
> 
> Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the
> actual IPv4 32-bit address followed by zeros.
> 
> Debugging:
> 
> 1. Alpine Linux's getent (
> https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c)
> tries gethostbyname2 with AF_INET6 first, if no result, then calls with
> AF_INET.

gethostbyname2 is legacy and has lots of inherent problems, so this
really should be changed to use getaddrinfo. If nothing else, such a
change would make it finish twice as fast.

> 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name
> -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS.
> 3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in
> the RR_AAAA request type being rewritten to RR_A. So the response from
> CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).
> 
> > (gdb) p addrs
> > $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' <repeats 11
> > times>, sortkey = 0}
> > (gdb) x/4xb addrs[0]->addr
> > 0x7fffffffe0d8: 0x0a    0x60    0x24    0x4a

Are you saying that the configured nameserver (coredns) is responding
with an answer packet for a different question (A instead of AAAA)
than the query packet asked for? I'm confused why it would do that.
The expected behavior of a process receiving such an answer would be
to ignore it as bogus, but maybe we're not doing that, and then
wrongly returning answers that don't match the type we asked for?

It sounds like this is also wrong behavior by coredns. If it's
rewriting AAAA queries to A for the sake of determining if the name
exists vs NxDomain, that's fine for the outgoing query it performs,
but instead of returning the answer to the rewritten query back to the
asker as an answer that mismatches the question, it should just be
removing the answer section to make it into a NODATA answer (if it's
not already NxDomain).

> 4. In gethostbyname2_r, the result address family is assigned to the
> request's family, regardless of the family in the response.
> 
> > h->h_addrtype = af;
> > h->h_length = af==AF_INET6 ? 16 : 4;
> >
>  This results in the IPv4 address replied from CoreDNS is wrongly copied to
> the result as IPv6 address which is a60:244a:: .

OK, it looks like gethostbyname2_r is assuming the results match the
address type it asked for, which is mostly correct (except AF_UNSPEC?
that doesn't seem to be handled and it's probably invalid but we
should probably error out on it or something); it's __lookup_name
which is wrong to return results of the wrong type (as a consequence
of not validating the DNS responses).

> 5. getent gets an IPv4 address, so it won't try AF_INET, rather directly
> return the invalid IPv6 address a60:244a:: .
> 
> Suggested fix:
> In gethostbyname2_r when adding answered addresses into the result, we need
> to filter out the addresses with mismatching address family.
> 
> > for (i=0; i<cnt; i++) {
> >     // if (addrs[i].family != h->h_addrtype) continue; // need to fix the
> > index of h->h_addr_list too
> >     h->h_addr_list[i] = (void *)buf;
> >     buf += h->h_length;
> >     memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length);
> > }
> >
> 
> Could you review this issue? Thanks in advance!

I'd rather fix this at the layer the actual bug (missing validation)
is at.

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-18 17:27 ` Rich Felker
@ 2022-10-19  3:44   ` Tom Shen
  2022-10-19 14:24     ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Shen @ 2022-10-19  3:44 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 5871 bytes --]

Thanks for replying!

> gethostbyname2 is legacy and has lots of inherent problems, so this
> really should be changed to use getaddrinfo. If nothing else, such a
> change would make it finish twice as fast.

Ya, it is obsolete as stated in Linux doc. But a third-party service is
using "getent hosts" which calls this API.

> Are you saying that the configured nameserver (coredns) is responding
>   with an answer packet for a different question (A instead of AAAA)
>   than the query packet asked for? I'm confused why it would do that.

>   It sounds like this is also wrong behavior by coredns. If it's
>   rewriting AAAA queries to A for the sake of determining if the name
>   exists vs NxDomain, that's fine for the outgoing query it performs,
>   but instead of returning the answer to the rewritten query back to the
>   asker as an answer that mismatches the question, it should just be
>   removing the answer section to make it into a NODATA answer (if it's
>   not already NxDomain).

Yes. In CoreDNS, we can configure it to rewrite the request type AAAA to A,
so that the response will be IPv4 family and address. This is a CoreDNS
feature. I cannot find anything in the DNS standards against it so far.

>   The expected behavior of a process receiving such an answer would be
>   to ignore it as bogus, but maybe we're not doing that, and then
>   wrongly returning answers that don't match the type we asked for?

Unfortunately, we return invalid IPv6 address as stated in the previous
email. But, maybe it's not good to mention it here, glibc can handle this
situation.

> I'd rather fix this at the layer the actual bug (missing validation)
> is at.

So, let me try to suggest a fix in __lookup_name or musl team would help to
take a look? So far I cannot find a way to create a pull request to musl
project.

On Wed, Oct 19, 2022 at 1:27 AM Rich Felker <dalias@libc.org> wrote:

> On Wed, Oct 19, 2022 at 12:02:36AM +0800, Tom Shen wrote:
> > Hi Musl team,
> >
> > Recently we encountered an issue in requesting name resolution in Alpine
> > Linux with Musl.
> >
> > Environment:
> > * Alpine Linux 3.14.2
> > * IPv6 module disabled
> > * musl-libc 1.2.2
> > * CoreDNS with "rewrite stop type AAAA A" configured
> >
> > Reproduce steps:
> > 1. getent hosts <any-name-only-has-ipv4-record>
> > 2. Check output
> >
> > Expected: Get correct IPv4 address, e.g. 10.96.36.74
> >
> > Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the
> > actual IPv4 32-bit address followed by zeros.
> >
> > Debugging:
> >
> > 1. Alpine Linux's getent (
> >
> https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c)
> > tries gethostbyname2 with AF_INET6 first, if no result, then calls with
> > AF_INET.
>
> gethostbyname2 is legacy and has lots of inherent problems, so this
> really should be changed to use getaddrinfo. If nothing else, such a
> change would make it finish twice as fast.
>
> > 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name
> > -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS.
> > 3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in
> > the RR_AAAA request type being rewritten to RR_A. So the response from
> > CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).
> >
> > > (gdb) p addrs
> > > $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' <repeats 11
> > > times>, sortkey = 0}
> > > (gdb) x/4xb addrs[0]->addr
> > > 0x7fffffffe0d8: 0x0a    0x60    0x24    0x4a
>
> Are you saying that the configured nameserver (coredns) is responding
> with an answer packet for a different question (A instead of AAAA)
> than the query packet asked for? I'm confused why it would do that.
> The expected behavior of a process receiving such an answer would be
> to ignore it as bogus, but maybe we're not doing that, and then
> wrongly returning answers that don't match the type we asked for?
>
> It sounds like this is also wrong behavior by coredns. If it's
> rewriting AAAA queries to A for the sake of determining if the name
> exists vs NxDomain, that's fine for the outgoing query it performs,
> but instead of returning the answer to the rewritten query back to the
> asker as an answer that mismatches the question, it should just be
> removing the answer section to make it into a NODATA answer (if it's
> not already NxDomain).
>
> > 4. In gethostbyname2_r, the result address family is assigned to the
> > request's family, regardless of the family in the response.
> >
> > > h->h_addrtype = af;
> > > h->h_length = af==AF_INET6 ? 16 : 4;
> > >
> >  This results in the IPv4 address replied from CoreDNS is wrongly copied
> to
> > the result as IPv6 address which is a60:244a:: .
>
> OK, it looks like gethostbyname2_r is assuming the results match the
> address type it asked for, which is mostly correct (except AF_UNSPEC?
> that doesn't seem to be handled and it's probably invalid but we
> should probably error out on it or something); it's __lookup_name
> which is wrong to return results of the wrong type (as a consequence
> of not validating the DNS responses).
>
> > 5. getent gets an IPv4 address, so it won't try AF_INET, rather directly
> > return the invalid IPv6 address a60:244a:: .
> >
> > Suggested fix:
> > In gethostbyname2_r when adding answered addresses into the result, we
> need
> > to filter out the addresses with mismatching address family.
> >
> > > for (i=0; i<cnt; i++) {
> > >     // if (addrs[i].family != h->h_addrtype) continue; // need to fix
> the
> > > index of h->h_addr_list too
> > >     h->h_addr_list[i] = (void *)buf;
> > >     buf += h->h_length;
> > >     memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length);
> > > }
> > >
> >
> > Could you review this issue? Thanks in advance!
>
> I'd rather fix this at the layer the actual bug (missing validation)
> is at.
>
> Rich
>

[-- Attachment #2: Type: text/html, Size: 7576 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-19  3:44   ` Tom Shen
@ 2022-10-19 14:24     ` Rich Felker
  2022-10-19 16:00       ` Tom Shen
  0 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2022-10-19 14:24 UTC (permalink / raw)
  To: Tom Shen; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 3156 bytes --]

On Wed, Oct 19, 2022 at 11:44:45AM +0800, Tom Shen wrote:
> Thanks for replying!
> 
> > gethostbyname2 is legacy and has lots of inherent problems, so this
> > really should be changed to use getaddrinfo. If nothing else, such a
> > change would make it finish twice as fast.
> 
> Ya, it is obsolete as stated in Linux doc. But a third-party service is
> using "getent hosts" which calls this API.

Right. I'm saying the maintainers of this version of getent should be
informed that they're using a backwards nonstandard function here and
that it would work better with getaddrinfo.

> > Are you saying that the configured nameserver (coredns) is responding
> >   with an answer packet for a different question (A instead of AAAA)
> >   than the query packet asked for? I'm confused why it would do that.
> 
> >   It sounds like this is also wrong behavior by coredns. If it's
> >   rewriting AAAA queries to A for the sake of determining if the name
> >   exists vs NxDomain, that's fine for the outgoing query it performs,
> >   but instead of returning the answer to the rewritten query back to the
> >   asker as an answer that mismatches the question, it should just be
> >   removing the answer section to make it into a NODATA answer (if it's
> >   not already NxDomain).
> 
> Yes. In CoreDNS, we can configure it to rewrite the request type AAAA to A,
> so that the response will be IPv4 family and address. This is a CoreDNS
> feature. I cannot find anything in the DNS standards against it so far.

I inquired about this and got the response that there might not be an
explicit rule against it, "much like no RFC prohibits you from eating
glass". :-) I don't see how the behavior makes any sense though. There
is a risk that strict stub resolvers could consider these packets
non-answers and timeout/fail rather than returning a response, and
like you found, a chance that buggy stub resolvers not doing
sufficient validation could misinterpret the answers. I'm not clear
where these wrong-type answers could give useful results, especially
not anything more-useful than just returning NODATA in place of the v6
result you want to suppress.

> >   The expected behavior of a process receiving such an answer would be
> >   to ignore it as bogus, but maybe we're not doing that, and then
> >   wrongly returning answers that don't match the type we asked for?
> 
> Unfortunately, we return invalid IPv6 address as stated in the previous
> email. But, maybe it's not good to mention it here, glibc can handle this
> situation.
> 
> > I'd rather fix this at the layer the actual bug (missing validation)
> > is at.
> 
> So, let me try to suggest a fix in __lookup_name or musl team would help to
> take a look? So far I cannot find a way to create a pull request to musl
> project.

You can send patches to the list in plain diff or git format-patch
form. Attachments are preferred over inline but inline is okay too if
you're sure your mailer won't mess them up. But I have a proposed fix
here. The last hunk needs to be applied manually since it has context
affected by other DNS work I'm about to push (reversal of the for loop
direction).

Rich

[-- Attachment #2: badrrtypefix.diff --]
[-- Type: text/plain, Size: 1946 bytes --]

diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
index 3eda65a7..57585b8e 100644
--- a/src/network/lookup_name.c
+++ b/src/network/lookup_name.c
@@ -102,6 +102,7 @@ struct dpc_ctx {
 	struct address *addrs;
 	char *canon;
 	int cnt;
+	int rrtype;
 };
 
 #define RR_A 1
@@ -115,12 +116,14 @@ static int dns_parse_callback(void *c, int rr, const void *data, int len, const
 	if (ctx->cnt >= MAXADDRS) return -1;
 	switch (rr) {
 	case RR_A:
+		if (rr != ctx->rrtype) return 0;
 		if (len != 4) return -1;
 		ctx->addrs[ctx->cnt].family = AF_INET;
 		ctx->addrs[ctx->cnt].scopeid = 0;
 		memcpy(ctx->addrs[ctx->cnt++].addr, data, 4);
 		break;
 	case RR_AAAA:
+		if (rr != ctx->rrtype) return 0;
 		if (len != 16) return -1;
 		ctx->addrs[ctx->cnt].family = AF_INET6;
 		ctx->addrs[ctx->cnt].scopeid = 0;
@@ -140,7 +143,7 @@ static int name_from_dns(struct address buf[static MAXADDRS], char canon[static
 	unsigned char qbuf[2][280], abuf[2][768];
 	const unsigned char *qp[2] = { qbuf[0], qbuf[1] };
 	unsigned char *ap[2] = { abuf[0], abuf[1] };
-	int qlens[2], alens[2];
+	int qlens[2], alens[2], qtypes[2];
 	int i, nq = 0;
 	struct dpc_ctx ctx = { .addrs = buf, .canon = canon };
 	static const struct { int af; int rr; } afrr[2] = {
@@ -154,6 +157,7 @@ static int name_from_dns(struct address buf[static MAXADDRS], char canon[static
 				0, 0, 0, qbuf[nq], sizeof *qbuf);
 			if (qlens[nq] == -1)
 				return 0;
+			qtypes[nq] = afrr[i].rr;
 			qbuf[nq][3] = 0; /* don't need AD flag */
 			/* Ensure query IDs are distinct. */
 			if (nq && qbuf[nq][0] == qbuf[0][0])
@@ -171,8 +175,10 @@ static int name_from_dns(struct address buf[static MAXADDRS], char canon[static
 		if ((abuf[i][3] & 15) != 0) return EAI_FAIL;
 	}
 
-	for (i=nq-1; i>=0; i--)
+	for (i=nq-1; i>=0; i--) {
+		ctx.rrtype = qtypes[i];
 		__dns_parse(abuf[i], alens[i], dns_parse_callback, &ctx);
+	}
 
 	if (ctx.cnt) return ctx.cnt;
 	return EAI_NODATA;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-19 14:24     ` Rich Felker
@ 2022-10-19 16:00       ` Tom Shen
  2022-10-20  0:52         ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Shen @ 2022-10-19 16:00 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 5535 bytes --]

> Right. I'm saying the maintainers of this version of getent should be
> informed that they're using a backwards nonstandard function here and
> that it would work better with getaddrinfo.

The getent Linux manual states that when using database hosts,
gethostbyname2 would be called.
https://man7.org/linux/man-pages/man1/getent.1.html . Maybe we can persuade
the third party service to use getent ahosts.

> I inquired about this and got the response that there might not be an
> explicit rule against it, "much like no RFC prohibits you from eating
> glass". :-) I don't see how the behavior makes any sense though. There
> is a risk that strict stub resolvers could consider these packets
> non-answers and timeout/fail rather than returning a response, and
> like you found, a chance that buggy stub resolvers not doing
> sufficient validation could misinterpret the answers. I'm not clear
> where these wrong-type answers could give useful results, especially
> not anything more-useful than just returning NODATA in place of the v6
> result you want to suppress.

True. Maybe we should not use this "rewrite type AAAA A" feature in
CoreDNS. Just on Internet there are some posts suggested it to fix the name
resolution issues caused by unexpected IPv6 logic. We already removed it
from our system since an incident. I am not an expert in DNS though, so
cannot comment more about the standards.

Anyway, although rewriting the DNS request type may make no sense, we
should either return correct IPv6 address (e.g. ::ffff:10.96.36.74) or IPv4
address (10.96.36.74) or NODATA, rather than an invalid IPv6 address
(a60:244a::) :-). I got your point in last email, and I also have the same
idea, returning NODATA is a good (or only) solution. In getent hosts
command, when getting empty result for AF_INET6, it will call
gethostbyname2 with AF_INET (this is also the behavior of glibc's getent
https://codebrowser.dev/glibc/glibc/nss/getent.c.html), then we will get
correct IPv4 address.

> But I have a proposed fix here.

Thank you so much, Rich! Your change looks better than what I would do. I
will try it out. If tested good, would we merge the fix into later releases?

Tom Shen

On Wed, Oct 19, 2022, 22:24 Rich Felker <dalias@libc.org> wrote:

> On Wed, Oct 19, 2022 at 11:44:45AM +0800, Tom Shen wrote:
> > Thanks for replying!
> >
> > > gethostbyname2 is legacy and has lots of inherent problems, so this
> > > really should be changed to use getaddrinfo. If nothing else, such a
> > > change would make it finish twice as fast.
> >
> > Ya, it is obsolete as stated in Linux doc. But a third-party service is
> > using "getent hosts" which calls this API.
>
> Right. I'm saying the maintainers of this version of getent should be
> informed that they're using a backwards nonstandard function here and
> that it would work better with getaddrinfo.
>
> > > Are you saying that the configured nameserver (coredns) is responding
> > >   with an answer packet for a different question (A instead of AAAA)
> > >   than the query packet asked for? I'm confused why it would do that.
> >
> > >   It sounds like this is also wrong behavior by coredns. If it's
> > >   rewriting AAAA queries to A for the sake of determining if the name
> > >   exists vs NxDomain, that's fine for the outgoing query it performs,
> > >   but instead of returning the answer to the rewritten query back to
> the
> > >   asker as an answer that mismatches the question, it should just be
> > >   removing the answer section to make it into a NODATA answer (if it's
> > >   not already NxDomain).
> >
> > Yes. In CoreDNS, we can configure it to rewrite the request type AAAA to
> A,
> > so that the response will be IPv4 family and address. This is a CoreDNS
> > feature. I cannot find anything in the DNS standards against it so far.
>
> I inquired about this and got the response that there might not be an
> explicit rule against it, "much like no RFC prohibits you from eating
> glass". :-) I don't see how the behavior makes any sense though. There
> is a risk that strict stub resolvers could consider these packets
> non-answers and timeout/fail rather than returning a response, and
> like you found, a chance that buggy stub resolvers not doing
> sufficient validation could misinterpret the answers. I'm not clear
> where these wrong-type answers could give useful results, especially
> not anything more-useful than just returning NODATA in place of the v6
> result you want to suppress.
>
> > >   The expected behavior of a process receiving such an answer would be
> > >   to ignore it as bogus, but maybe we're not doing that, and then
> > >   wrongly returning answers that don't match the type we asked for?
> >
> > Unfortunately, we return invalid IPv6 address as stated in the previous
> > email. But, maybe it's not good to mention it here, glibc can handle this
> > situation.
> >
> > > I'd rather fix this at the layer the actual bug (missing validation)
> > > is at.
> >
> > So, let me try to suggest a fix in __lookup_name or musl team would help
> to
> > take a look? So far I cannot find a way to create a pull request to musl
> > project.
>
> You can send patches to the list in plain diff or git format-patch
> form. Attachments are preferred over inline but inline is okay too if
> you're sure your mailer won't mess them up. But I have a proposed fix
> here. The last hunk needs to be applied manually since it has context
> affected by other DNS work I'm about to push (reversal of the for loop
> direction).
>
> Rich
>

[-- Attachment #2: Type: text/html, Size: 7033 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-19 16:00       ` Tom Shen
@ 2022-10-20  0:52         ` Rich Felker
  2022-10-20 17:25           ` Tom Shen
  0 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2022-10-20  0:52 UTC (permalink / raw)
  To: Tom Shen; +Cc: musl

On Thu, Oct 20, 2022 at 12:00:46AM +0800, Tom Shen wrote:
> > Right. I'm saying the maintainers of this version of getent should be
> > informed that they're using a backwards nonstandard function here and
> > that it would work better with getaddrinfo.
> 
> The getent Linux manual states that when using database hosts,
> gethostbyname2 would be called.
> https://man7.org/linux/man-pages/man1/getent.1.html . Maybe we can persuade
> the third party service to use getent ahosts.

Uhg, lovely that it's specified like that... But really, it only needs
to behave "as if" it did that, which can be done better using
getaddrinfo with the right options to get the behavior the legacy APIs
would have given, but in a better way. So I think it's still worth
pursuing this kind of improvement too. Orthogonal to your problem
report, though.

> > I inquired about this and got the response that there might not be an
> > explicit rule against it, "much like no RFC prohibits you from eating
> > glass". :-) I don't see how the behavior makes any sense though. There
> > is a risk that strict stub resolvers could consider these packets
> > non-answers and timeout/fail rather than returning a response, and
> > like you found, a chance that buggy stub resolvers not doing
> > sufficient validation could misinterpret the answers. I'm not clear
> > where these wrong-type answers could give useful results, especially
> > not anything more-useful than just returning NODATA in place of the v6
> > result you want to suppress.
> 
> True. Maybe we should not use this "rewrite type AAAA A" feature in
> CoreDNS. Just on Internet there are some posts suggested it to fix the name
> resolution issues caused by unexpected IPv6 logic. We already removed it
> from our system since an incident. I am not an expert in DNS though, so
> cannot comment more about the standards.
> 
> Anyway, although rewriting the DNS request type may make no sense, we
> should either return correct IPv6 address (e.g. ::ffff:10.96.36.74) or IPv4
> address (10.96.36.74) or NODATA, rather than an invalid IPv6 address
> (a60:244a::) :-). I got your point in last email, and I also have the same
> idea, returning NODATA is a good (or only) solution. In getent hosts
> command, when getting empty result for AF_INET6, it will call
> gethostbyname2 with AF_INET (this is also the behavior of glibc's getent
> https://codebrowser.dev/glibc/glibc/nss/getent.c.html), then we will get
> correct IPv4 address.
> 
> > But I have a proposed fix here.
> 
> Thank you so much, Rich! Your change looks better than what I would do. I
> will try it out. If tested good, would we merge the fix into later releases?

Yes, I'm getting it merged now. Thanks for the report. In some sense
this was a long-known issue (that "malicious" nameservers could give
wrong-RR-type answers -- see CVE-2017-15650) and it's nice to finally
have good reason to put in the attention to fixing it.

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-20  0:52         ` Rich Felker
@ 2022-10-20 17:25           ` Tom Shen
  2022-10-20 23:55             ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Shen @ 2022-10-20 17:25 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl


[-- Attachment #1.1: Type: text/plain, Size: 3990 bytes --]

I tested the fix in your earlier email locally with my own test code and
getent in Alpine Linux 3.16.2. They both crashed. After debugging, I found
in gethostbyname2_r we should return non-zero if no address is returned.
Then gethostbyname2 will return NULL. I also check the Linux doc, it says:

> Return Value
> The gethostbyname() and gethostbyaddr() functions return the hostent
structure or *a NULL pointer if an error occurs*. On error, the h_errno
variable holds an error number. When non-NULL, the return value may point
at static data, see the notes below.

Based on your patch (except the "for (i=nq-1; i>=0; i--)"), I made a minor
change to address it. Tested with command getent hosts, it works well with
my CoreDNS. Although the h_errno is better to be NO_DATA rather than
HOST_NOT_FOUND, I think it's not a big issue.

The diff file attached.

Tom Shen

On Thu, Oct 20, 2022 at 8:52 AM Rich Felker <dalias@libc.org> wrote:

> On Thu, Oct 20, 2022 at 12:00:46AM +0800, Tom Shen wrote:
> > > Right. I'm saying the maintainers of this version of getent should be
> > > informed that they're using a backwards nonstandard function here and
> > > that it would work better with getaddrinfo.
> >
> > The getent Linux manual states that when using database hosts,
> > gethostbyname2 would be called.
> > https://man7.org/linux/man-pages/man1/getent.1.html . Maybe we can
> persuade
> > the third party service to use getent ahosts.
>
> Uhg, lovely that it's specified like that... But really, it only needs
> to behave "as if" it did that, which can be done better using
> getaddrinfo with the right options to get the behavior the legacy APIs
> would have given, but in a better way. So I think it's still worth
> pursuing this kind of improvement too. Orthogonal to your problem
> report, though.
>
> > > I inquired about this and got the response that there might not be an
> > > explicit rule against it, "much like no RFC prohibits you from eating
> > > glass". :-) I don't see how the behavior makes any sense though. There
> > > is a risk that strict stub resolvers could consider these packets
> > > non-answers and timeout/fail rather than returning a response, and
> > > like you found, a chance that buggy stub resolvers not doing
> > > sufficient validation could misinterpret the answers. I'm not clear
> > > where these wrong-type answers could give useful results, especially
> > > not anything more-useful than just returning NODATA in place of the v6
> > > result you want to suppress.
> >
> > True. Maybe we should not use this "rewrite type AAAA A" feature in
> > CoreDNS. Just on Internet there are some posts suggested it to fix the
> name
> > resolution issues caused by unexpected IPv6 logic. We already removed it
> > from our system since an incident. I am not an expert in DNS though, so
> > cannot comment more about the standards.
> >
> > Anyway, although rewriting the DNS request type may make no sense, we
> > should either return correct IPv6 address (e.g. ::ffff:10.96.36.74) or
> IPv4
> > address (10.96.36.74) or NODATA, rather than an invalid IPv6 address
> > (a60:244a::) :-). I got your point in last email, and I also have the
> same
> > idea, returning NODATA is a good (or only) solution. In getent hosts
> > command, when getting empty result for AF_INET6, it will call
> > gethostbyname2 with AF_INET (this is also the behavior of glibc's getent
> > https://codebrowser.dev/glibc/glibc/nss/getent.c.html), then we will get
> > correct IPv4 address.
> >
> > > But I have a proposed fix here.
> >
> > Thank you so much, Rich! Your change looks better than what I would do. I
> > will try it out. If tested good, would we merge the fix into later
> releases?
>
> Yes, I'm getting it merged now. Thanks for the report. In some sense
> this was a long-known issue (that "malicious" nameservers could give
> wrong-RR-type answers -- see CVE-2017-15650) and it's nice to finally
> have good reason to put in the attention to fixing it.
>
> Rich
>

[-- Attachment #1.2: Type: text/html, Size: 4981 bytes --]

[-- Attachment #2: gethostbyname2_r.diff --]
[-- Type: application/octet-stream, Size: 410 bytes --]

diff --git a/src/network/gethostbyname2_r.c b/src/network/gethostbyname2_r.c
index c9f3acc4..4653aac8 100644
--- a/src/network/gethostbyname2_r.c
+++ b/src/network/gethostbyname2_r.c
@@ -22,7 +22,7 @@ int gethostbyname2_r(const char *name, int af,
 	if (cnt<0) switch (cnt) {
 	case EAI_NONAME:
 		*err = HOST_NOT_FOUND;
-		return 0;
+		return ENODATA;
 	case EAI_AGAIN:
 		*err = TRY_AGAIN;
 		return EAGAIN;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address
  2022-10-20 17:25           ` Tom Shen
@ 2022-10-20 23:55             ` Rich Felker
  0 siblings, 0 replies; 8+ messages in thread
From: Rich Felker @ 2022-10-20 23:55 UTC (permalink / raw)
  To: Tom Shen; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]

On Fri, Oct 21, 2022 at 01:25:50AM +0800, Tom Shen wrote:
> I tested the fix in your earlier email locally with my own test code and
> getent in Alpine Linux 3.16.2. They both crashed. After debugging, I found
> in gethostbyname2_r we should return non-zero if no address is returned.
> Then gethostbyname2 will return NULL. I also check the Linux doc, it says:
> 
> > Return Value
> > The gethostbyname() and gethostbyaddr() functions return the hostent
> structure or *a NULL pointer if an error occurs*. On error, the h_errno
> variable holds an error number. When non-NULL, the return value may point
> at static data, see the notes below.
> 
> Based on your patch (except the "for (i=nq-1; i>=0; i--)"), I made a minor
> change to address it. Tested with command getent hosts, it works well with
> my CoreDNS. Although the h_errno is better to be NO_DATA rather than
> HOST_NOT_FOUND, I think it's not a big issue.
> 
> The diff file attached.

This is a different bug you've found in gethostbyname2. It seems it
was not updated when commit f081d5336a80b68d3e1bed789cc373c5c3d6699b
fixed the return behavior of gethostbyname2_r to treat NODATA and
NxDomain as success conditions rather than errors. Attached patch
should fix it right.

Rich

[-- Attachment #2: 0001-fix-return-value-of-gethostby-name-2-addr-with-no-re.patch --]
[-- Type: text/plain, Size: 1713 bytes --]

From 8f9259450aa43a6fd539e428e61e2961b725fbae Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Thu, 20 Oct 2022 19:48:32 -0400
Subject: [PATCH] fix return value of gethostby{name[2],addr} with no result
 but no error

commit f081d5336a80b68d3e1bed789cc373c5c3d6699b fixed
gethostbyname[2]_r to treat negative results as a non-error, leaving
gethostbyname[2] wrongly returning a pointer to the unfilled result
buffer rather than a null pointer. since, as documented with commit
fe82bb9b921be34370e6b71a1c6f062c20999ae0, the caller of
gethostby{name[2],addr}_r can always rely on the result pointer being
set, use that consistently rather than trying to duplicate logic about
whether we have a result or not in gethostby{name[2],addr}.
---
 src/network/gethostbyaddr.c  | 2 +-
 src/network/gethostbyname2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/network/gethostbyaddr.c b/src/network/gethostbyaddr.c
index 598e2241..c3cacaac 100644
--- a/src/network/gethostbyaddr.c
+++ b/src/network/gethostbyaddr.c
@@ -20,5 +20,5 @@ struct hostent *gethostbyaddr(const void *a, socklen_t l, int af)
 		err = gethostbyaddr_r(a, l, af, h,
 			(void *)(h+1), size-sizeof *h, &res, &h_errno);
 	} while (err == ERANGE);
-	return err ? 0 : h;
+	return res;
 }
diff --git a/src/network/gethostbyname2.c b/src/network/gethostbyname2.c
index dc9d6621..bd0da7f8 100644
--- a/src/network/gethostbyname2.c
+++ b/src/network/gethostbyname2.c
@@ -21,5 +21,5 @@ struct hostent *gethostbyname2(const char *name, int af)
 		err = gethostbyname2_r(name, af, h,
 			(void *)(h+1), size-sizeof *h, &res, &h_errno);
 	} while (err == ERANGE);
-	return err ? 0 : h;
+	return res;
 }
-- 
2.21.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-10-20 23:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-18 16:02 [musl] gethostbyname2_r returns invalid IPv6 address if DNS server replies IPv4 address Tom Shen
2022-10-18 17:27 ` Rich Felker
2022-10-19  3:44   ` Tom Shen
2022-10-19 14:24     ` Rich Felker
2022-10-19 16:00       ` Tom Shen
2022-10-20  0:52         ` Rich Felker
2022-10-20 17:25           ` Tom Shen
2022-10-20 23:55             ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).