I tested the fix in your earlier email locally with my own test code and getent in Alpine Linux 3.16.2. They both crashed. After debugging, I found in gethostbyname2_r we should return non-zero if no address is returned. Then gethostbyname2 will return NULL. I also check the Linux doc, it says: > Return Value > The gethostbyname() and gethostbyaddr() functions return the hostent structure or *a NULL pointer if an error occurs*. On error, the h_errno variable holds an error number. When non-NULL, the return value may point at static data, see the notes below. Based on your patch (except the "for (i=nq-1; i>=0; i--)"), I made a minor change to address it. Tested with command getent hosts, it works well with my CoreDNS. Although the h_errno is better to be NO_DATA rather than HOST_NOT_FOUND, I think it's not a big issue. The diff file attached. Tom Shen On Thu, Oct 20, 2022 at 8:52 AM Rich Felker wrote: > On Thu, Oct 20, 2022 at 12:00:46AM +0800, Tom Shen wrote: > > > Right. I'm saying the maintainers of this version of getent should be > > > informed that they're using a backwards nonstandard function here and > > > that it would work better with getaddrinfo. > > > > The getent Linux manual states that when using database hosts, > > gethostbyname2 would be called. > > https://man7.org/linux/man-pages/man1/getent.1.html . Maybe we can > persuade > > the third party service to use getent ahosts. > > Uhg, lovely that it's specified like that... But really, it only needs > to behave "as if" it did that, which can be done better using > getaddrinfo with the right options to get the behavior the legacy APIs > would have given, but in a better way. So I think it's still worth > pursuing this kind of improvement too. Orthogonal to your problem > report, though. > > > > I inquired about this and got the response that there might not be an > > > explicit rule against it, "much like no RFC prohibits you from eating > > > glass". :-) I don't see how the behavior makes any sense though. There > > > is a risk that strict stub resolvers could consider these packets > > > non-answers and timeout/fail rather than returning a response, and > > > like you found, a chance that buggy stub resolvers not doing > > > sufficient validation could misinterpret the answers. I'm not clear > > > where these wrong-type answers could give useful results, especially > > > not anything more-useful than just returning NODATA in place of the v6 > > > result you want to suppress. > > > > True. Maybe we should not use this "rewrite type AAAA A" feature in > > CoreDNS. Just on Internet there are some posts suggested it to fix the > name > > resolution issues caused by unexpected IPv6 logic. We already removed it > > from our system since an incident. I am not an expert in DNS though, so > > cannot comment more about the standards. > > > > Anyway, although rewriting the DNS request type may make no sense, we > > should either return correct IPv6 address (e.g. ::ffff:10.96.36.74) or > IPv4 > > address (10.96.36.74) or NODATA, rather than an invalid IPv6 address > > (a60:244a::) :-). I got your point in last email, and I also have the > same > > idea, returning NODATA is a good (or only) solution. In getent hosts > > command, when getting empty result for AF_INET6, it will call > > gethostbyname2 with AF_INET (this is also the behavior of glibc's getent > > https://codebrowser.dev/glibc/glibc/nss/getent.c.html), then we will get > > correct IPv4 address. > > > > > But I have a proposed fix here. > > > > Thank you so much, Rich! Your change looks better than what I would do. I > > will try it out. If tested good, would we merge the fix into later > releases? > > Yes, I'm getting it merged now. Thanks for the report. In some sense > this was a long-known issue (that "malicious" nameservers could give > wrong-RR-type answers -- see CVE-2017-15650) and it's nice to finally > have good reason to put in the attention to fixing it. > > Rich >