* [PATCH] resolver: only exit the search path loop there are a positive number of results given @ 2018-03-30 18:52 William Pitcock 2018-03-30 19:14 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread
From: William Pitcock @ 2018-03-30 18:52 UTC (permalink / raw)
To: musl; +Cc: William Pitcock

In the event of no results being given by any of the lookup modules, EAI_NONAME will still be thrown.

This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, such as the one Cloudflare have implemented, and appear in the search path.
---
 src/network/lookup_name.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
index 209c20f0..b068bb92 100644
--- a/src/network/lookup_name.c
+++ b/src/network/lookup_name.c
@@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[
 		memcpy(canon+l+1, p, z-p);
 		canon[z-p+1+l] = 0;
 		int cnt = name_from_dns(buf, canon, canon, family, &conf);
-		if (cnt) return cnt;
+		if (cnt > 0) return cnt;
 	}
 }
-- 
2.16.2
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 18:52 [PATCH] resolver: only exit the search path loop there are a positive number of results given William Pitcock @ 2018-03-30 19:14 ` Rich Felker 2018-03-30 19:19 ` William Pitcock 0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2018-03-30 19:14 UTC (permalink / raw)
To: musl

On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote:
> In the event of no results being given by any of the lookup modules, EAI_NONAME will still
> be thrown.
>
> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers,
> such as the one Cloudflare have implemented, and appear in the search path.
> ---
>  src/network/lookup_name.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
> index 209c20f0..b068bb92 100644
> --- a/src/network/lookup_name.c
> +++ b/src/network/lookup_name.c
> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[
>  		memcpy(canon+l+1, p, z-p);
>  		canon[z-p+1+l] = 0;
>  		int cnt = name_from_dns(buf, canon, canon, family, &conf);
> -		if (cnt) return cnt;
> +		if (cnt > 0) return cnt;
>  	}
>  }

This patch is incorrect, and the reason should be an FAQ item if it's not already. Only a return value of 0 means that the requested name does not exist and that it's permissible to continue search. Other nonpositive return values indicate either that the name does exist but does not have a record of the requested type, or that a transient error occurred, making it impossible to determine whether the search can be continued and thus requiring the error to be reported to the caller. Anything else results in one or both of the following bugs:

- Nondeterministically returning different results for the same query depending on transient unavailability of the nameservers to answer on time.

- Returning inconsistent results (for different search components) depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested.

I'm aware that at least rancher-dns and Cloudflare's nameservers have had bugs related to this issue. I'm not sure what the status on getting them fixed is, and for Cloudflare I don't know exactly what it is they're doing wrong or why. But I do know the problem is that they're returning semantically incorrect dns replies.

Rich
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 19:14 ` Rich Felker @ 2018-03-30 19:19 ` William Pitcock 2018-03-30 19:35 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: William Pitcock @ 2018-03-30 19:19 UTC (permalink / raw) To: musl Hello, On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@libc.org> wrote: > On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote: >> In the event of no results being given by any of the lookup modules, EAI_NONAME will still >> be thrown. >> >> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, >> such as the one Cloudflare have implemented, and appear in the search path. >> --- >> src/network/lookup_name.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c >> index 209c20f0..b068bb92 100644 >> --- a/src/network/lookup_name.c >> +++ b/src/network/lookup_name.c >> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[ >> memcpy(canon+l+1, p, z-p); >> canon[z-p+1+l] = 0; >> int cnt = name_from_dns(buf, canon, canon, family, &conf); >> - if (cnt) return cnt; >> + if (cnt > 0) return cnt; >> } >> } > > This patch is incorrect, and the reason should be an FAQ item if it's > not already. Only a return value of 0 means that the requested name > does not exist and that it's permissible to continue search. Other > nonpositive return values indicate either that the name does exist but > does not have a record of the quested type, or that a transient error > occurred, making it impossible to determine whether the search can be > continued and thus requiring the error to be reported to the caller. 
> Anything else results in one or both of the following bugs: > > - Nondeterministically returning different results for the same query > depending on transient unavailability of the nameservers to answer > on time. > > - Returning inconsistent results (for different search components) > depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested. > > I'm aware that at least rancher-dns and Cloudflare's nameservers have > had bugs related to this issue. I'm not sure what the status on > getting them fixed is, and for Cloudflare I don't know exactly what it > is they're doing wrong or why. But I do know the problem is that > they're returning semantically incorrect dns replies.

Kubernetes imposes a default search path with the cluster domain last, so:

- local.prod.svc.whatever
- prod.svc.whatever
- svc.whatever
- yourdomain.com

The Cloudflare issue is that they send SUCCESS code with 0 replies, which causes musl to error when it hits the yourdomain.com. Do you have any suggestions on a mitigation which would be more palatable? We need to ship a mitigation for this in Alpine 3.8 regardless. I would much rather carry a patch that is upstreamable, but I am quite willing to carry one that isn't, in order to solve this problem.

William
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 19:19 ` William Pitcock @ 2018-03-30 19:35 ` Rich Felker 2018-03-30 19:44 ` William Pitcock 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2018-03-30 19:35 UTC (permalink / raw) To: musl On Fri, Mar 30, 2018 at 02:19:48PM -0500, William Pitcock wrote: > Hello, > > On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@libc.org> wrote: > > On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote: > >> In the event of no results being given by any of the lookup modules, EAI_NONAME will still > >> be thrown. > >> > >> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, > >> such as the one Cloudflare have implemented, and appear in the search path. > >> --- > >> src/network/lookup_name.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c > >> index 209c20f0..b068bb92 100644 > >> --- a/src/network/lookup_name.c > >> +++ b/src/network/lookup_name.c > >> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[ > >> memcpy(canon+l+1, p, z-p); > >> canon[z-p+1+l] = 0; > >> int cnt = name_from_dns(buf, canon, canon, family, &conf); > >> - if (cnt) return cnt; > >> + if (cnt > 0) return cnt; > >> } > >> } > > > > This patch is incorrect, and the reason should be an FAQ item if it's > > not already. Only a return value of 0 means that the requested name > > does not exist and that it's permissible to continue search. Other > > nonpositive return values indicate either that the name does exist but > > does not have a record of the quested type, or that a transient error > > occurred, making it impossible to determine whether the search can be > > continued and thus requiring the error to be reported to the caller. 
> > Anything else results in one or both of the following bugs: > > > > - Nondeterministically returning different results for the same query > > depending on transient unavailability of the nameservers to answer > > on time. > > > > - Returning inconsistent results (for different search components) > > depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested. > > > > I'm aware that at least rancher-dns and Cloudflare's nameservers have > > had bugs related to this issue. I'm not sure what the status on > > getting them fixed is, and for Cloudflare I don't know exactly what it > > is they're doing wrong or why. But I do know the problem is that > > they're returning semantically incorrect dns replies. > > Kubernetes imposes a default search path with the cluster domain last, so: > > - local.prod.svc.whatever > - prod.svc.whatever > - svc.whatever > - yourdomain.com > > The cloudflare issue is that they send SUCCESS code with 0 replies, > which causes musl to error when it hits the yourdomain.com. Yes, that makes sense. Do you know why they're doing it? If they refuse to fix it, the only clean fix I know is a local proxy configured to fix the records for the specific broken domains you care about. But of course that's not convenient. > Do you have any suggestions on a mitigation which would be more > palatable? We need to ship a mitigation for this in Alpine 3.8 > regardless. I would much rather carry a patch that is upstreamable, > but I am quite willing to carry one that isn't, in order to solve this > problem. A theoretically-non-horrible (but somewhat costly) solution is to always query both A and AAAA, rather than only doing it for AF_UNSPEC. Then if you see a reply with 0 (total, between both) records, you can opt to interpret that the same way as NxDomain without breaking consistency properties. If Cloudflare refuses to fix the bug, maybe we should consider adding an _option_ (in the resolv.conf options line) to do this. 
I don't think it should be the default behavior because it mildly slows down lookups, especially if you have nontrivial packet loss since probability of failure is now 1-(1-p)²=2p-p² rather than p (where p is the packet loss rate). Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 19:35 ` Rich Felker @ 2018-03-30 19:44 ` William Pitcock 2018-03-30 20:24 ` Szabolcs Nagy ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: William Pitcock @ 2018-03-30 19:44 UTC (permalink / raw) To: musl Hello, On Fri, Mar 30, 2018 at 2:35 PM, Rich Felker <dalias@libc.org> wrote: > On Fri, Mar 30, 2018 at 02:19:48PM -0500, William Pitcock wrote: >> Hello, >> >> On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@libc.org> wrote: >> > On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote: >> >> In the event of no results being given by any of the lookup modules, EAI_NONAME will still >> >> be thrown. >> >> >> >> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, >> >> such as the one Cloudflare have implemented, and appear in the search path. >> >> --- >> >> src/network/lookup_name.c | 2 +- >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c >> >> index 209c20f0..b068bb92 100644 >> >> --- a/src/network/lookup_name.c >> >> +++ b/src/network/lookup_name.c >> >> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[ >> >> memcpy(canon+l+1, p, z-p); >> >> canon[z-p+1+l] = 0; >> >> int cnt = name_from_dns(buf, canon, canon, family, &conf); >> >> - if (cnt) return cnt; >> >> + if (cnt > 0) return cnt; >> >> } >> >> } >> > >> > This patch is incorrect, and the reason should be an FAQ item if it's >> > not already. Only a return value of 0 means that the requested name >> > does not exist and that it's permissible to continue search. 
Other >> > nonpositive return values indicate either that the name does exist but >> > does not have a record of the quested type, or that a transient error >> > occurred, making it impossible to determine whether the search can be >> > continued and thus requiring the error to be reported to the caller. >> > Anything else results in one or both of the following bugs: >> > >> > - Nondeterministically returning different results for the same query >> > depending on transient unavailability of the nameservers to answer >> > on time. >> > >> > - Returning inconsistent results (for different search components) >> > depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested. >> > >> > I'm aware that at least rancher-dns and Cloudflare's nameservers have >> > had bugs related to this issue. I'm not sure what the status on >> > getting them fixed is, and for Cloudflare I don't know exactly what it >> > is they're doing wrong or why. But I do know the problem is that >> > they're returning semantically incorrect dns replies. >> >> Kubernetes imposes a default search path with the cluster domain last, so: >> >> - local.prod.svc.whatever >> - prod.svc.whatever >> - svc.whatever >> - yourdomain.com >> >> The cloudflare issue is that they send SUCCESS code with 0 replies, >> which causes musl to error when it hits the yourdomain.com. > > Yes, that makes sense. Do you know why they're doing it? If they > refuse to fix it, the only clean fix I know is a local proxy > configured to fix the records for the specific broken domains you care > about. But of course that's not convenient. My contacts at cloudflare indicate that their environment depends on this behaviour, so they have no interest in fixing it. A local proxy isn't going to be workable, because most people are going to just say "but Debian or Fedora doesn't require this," and then just go use a glibc distribution. 
There is a talk in a few weeks at Kubecon (the Kubernetes conference), explicitly titled "Don't Use Alpine If You Care About DNS." The talk largely centers around how musl's overly strict behaviour makes Alpine a bad choice for "the real world." I would like to turn this into a story where we can announce that Alpine 3.8 mitigates this problem instead, doing such will be good for both Alpine and the musl ecosystem as a whole, as it is defanging a point of possible FUD. > >> Do you have any suggestions on a mitigation which would be more >> palatable? We need to ship a mitigation for this in Alpine 3.8 >> regardless. I would much rather carry a patch that is upstreamable, >> but I am quite willing to carry one that isn't, in order to solve this >> problem. > > A theoretically-non-horrible (but somewhat costly) solution is to > always query both A and AAAA, rather than only doing it for AF_UNSPEC. > Then if you see a reply with 0 (total, between both) records, you can > opt to interpret that the same way as NxDomain without breaking > consistency properties. If Cloudflare refuses to fix the bug, maybe we > should consider adding an _option_ (in the resolv.conf options line) > to do this. I don't think it should be the default behavior because it > mildly slows down lookups, especially if you have nontrivial packet > loss since probability of failure is now 1-(1-p)²=2p-p² rather than p > (where p is the packet loss rate). It seems to me we could just send ANY and filter out the records we don't care about. This is what I did with charybdis's asynchronous DNS resolver when it had a similar problem. What are your thoughts on that? William ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 19:44 ` William Pitcock @ 2018-03-30 20:24 ` Szabolcs Nagy 2018-03-30 20:33 ` William Pitcock 2018-03-30 20:35 ` Rich Felker 2018-03-31 10:42 ` Florian Weimer 2 siblings, 1 reply; 12+ messages in thread From: Szabolcs Nagy @ 2018-03-30 20:24 UTC (permalink / raw) To: musl * William Pitcock <nenolod@dereferenced.org> [2018-03-30 14:44:44 -0500]: > There is a talk in a few weeks at Kubecon (the Kubernetes conference), > explicitly titled "Don't Use Alpine If You Care About DNS." The talk > largely centers around how musl's overly strict behaviour makes Alpine > a bad choice for "the real world." I would like to turn this into a > story where we can announce that Alpine 3.8 mitigates this problem > instead, doing such will be good for both Alpine and the musl > ecosystem as a whole, as it is defanging a point of possible FUD. musl has had a good track record catching bugs in dns implementations and in dns setups, i don't think this should be mitigated, it's good to know that cloudflare has a broken dns server, or that kubernetes had dns setup bugs, these are real reliability and performance issues affecting all systems not just musl. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 20:24 ` Szabolcs Nagy @ 2018-03-30 20:33 ` William Pitcock 0 siblings, 0 replies; 12+ messages in thread From: William Pitcock @ 2018-03-30 20:33 UTC (permalink / raw) To: musl Hello, On Fri, Mar 30, 2018 at 3:24 PM, Szabolcs Nagy <nsz@port70.net> wrote: > * William Pitcock <nenolod@dereferenced.org> [2018-03-30 14:44:44 -0500]: >> There is a talk in a few weeks at Kubecon (the Kubernetes conference), >> explicitly titled "Don't Use Alpine If You Care About DNS." The talk >> largely centers around how musl's overly strict behaviour makes Alpine >> a bad choice for "the real world." I would like to turn this into a >> story where we can announce that Alpine 3.8 mitigates this problem >> instead, doing such will be good for both Alpine and the musl >> ecosystem as a whole, as it is defanging a point of possible FUD. > > musl has had a good track record catching bugs in dns > implementations and in dns setups, i don't think this > should be mitigated, it's good to know that cloudflare > has a broken dns server, or that kubernetes had dns > setup bugs, these are real reliability and performance > issues affecting all systems not just musl. We can add a strict option to turn off the mitigation, but Alpine 3.8 will ship with a mitigation regardless. William ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 19:44 ` William Pitcock 2018-03-30 20:24 ` Szabolcs Nagy @ 2018-03-30 20:35 ` Rich Felker 2018-03-30 21:09 ` William Pitcock 2018-03-31 10:42 ` Florian Weimer 2 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2018-03-30 20:35 UTC (permalink / raw) To: musl On Fri, Mar 30, 2018 at 02:44:44PM -0500, William Pitcock wrote: > Hello, > > On Fri, Mar 30, 2018 at 2:35 PM, Rich Felker <dalias@libc.org> wrote: > > On Fri, Mar 30, 2018 at 02:19:48PM -0500, William Pitcock wrote: > >> Hello, > >> > >> On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@libc.org> wrote: > >> > On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote: > >> >> In the event of no results being given by any of the lookup modules, EAI_NONAME will still > >> >> be thrown. > >> >> > >> >> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, > >> >> such as the one Cloudflare have implemented, and appear in the search path. > >> >> --- > >> >> src/network/lookup_name.c | 2 +- > >> >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> >> > >> >> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c > >> >> index 209c20f0..b068bb92 100644 > >> >> --- a/src/network/lookup_name.c > >> >> +++ b/src/network/lookup_name.c > >> >> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[ > >> >> memcpy(canon+l+1, p, z-p); > >> >> canon[z-p+1+l] = 0; > >> >> int cnt = name_from_dns(buf, canon, canon, family, &conf); > >> >> - if (cnt) return cnt; > >> >> + if (cnt > 0) return cnt; > >> >> } > >> >> } > >> > > >> > This patch is incorrect, and the reason should be an FAQ item if it's > >> > not already. Only a return value of 0 means that the requested name > >> > does not exist and that it's permissible to continue search. 
Other > >> > nonpositive return values indicate either that the name does exist but > >> > does not have a record of the quested type, or that a transient error > >> > occurred, making it impossible to determine whether the search can be > >> > continued and thus requiring the error to be reported to the caller. > >> > Anything else results in one or both of the following bugs: > >> > > >> > - Nondeterministically returning different results for the same query > >> > depending on transient unavailability of the nameservers to answer > >> > on time. > >> > > >> > - Returning inconsistent results (for different search components) > >> > depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested. > >> > > >> > I'm aware that at least rancher-dns and Cloudflare's nameservers have > >> > had bugs related to this issue. I'm not sure what the status on > >> > getting them fixed is, and for Cloudflare I don't know exactly what it > >> > is they're doing wrong or why. But I do know the problem is that > >> > they're returning semantically incorrect dns replies. > >> > >> Kubernetes imposes a default search path with the cluster domain last, so: > >> > >> - local.prod.svc.whatever > >> - prod.svc.whatever > >> - svc.whatever > >> - yourdomain.com > >> > >> The cloudflare issue is that they send SUCCESS code with 0 replies, > >> which causes musl to error when it hits the yourdomain.com. > > > > Yes, that makes sense. Do you know why they're doing it? If they > > refuse to fix it, the only clean fix I know is a local proxy > > configured to fix the records for the specific broken domains you care > > about. But of course that's not convenient. > > My contacts at cloudflare indicate that their environment depends on > this behaviour, so they have no interest in fixing it. > > A local proxy isn't going to be workable, because most people are > going to just say "but Debian or Fedora doesn't require this," and > then just go use a glibc distribution. 
> > There is a talk in a few weeks at Kubecon (the Kubernetes conference), > explicitly titled "Don't Use Alpine If You Care About DNS." The talk > largely centers around how musl's overly strict behaviour makes Alpine > a bad choice for "the real world." I would like to turn this into a > story where we can announce that Alpine 3.8 mitigates this problem > instead, doing such will be good for both Alpine and the musl > ecosystem as a whole, as it is defanging a point of possible FUD.

I understand that this is a frustrating position to be in, where someone who (from your description so far, at least) seems like a troll, is planning to publicly malign your distro. I don't think rushing quick, wrong workarounds helps. Rather it makes it look like you knew there was a problem and only acted on it because someone threatened to make you look bad in public. The reality is that Cloudflare is returning wrong DNS results that aren't suitable for the search settings k8s users want to use.

I think a better response would be demonstrating why the glibc behavior they're relying on is buggy. This should be possible by configuring two search domains:

ex1.example.com
ex2.example.com

and adding the following records to them:

foo.ex1.example.com. A 10.0.0.1
foo.ex2.example.com. A 192.168.0.1
foo.ex2.example.com. AAAA fd00::1

Then, try glibc's getaddrinfo for the name "foo" with AF_INET, AF_INET6, and AF_UNSPEC. The results I expect you'll see are:

AF_INET: 10.0.0.1
AF_INET6: 192.168.0.1, fd00::1
AF_UNSPEC: 10.0.0.1 (possibly also fd00::1 but doubtful)

That is, *which fqdn you see results for* depends on which address family you made the query with. This is clearly wrong. With musl you'll consistently see results for foo.ex1.example.com:

AF_INET: 10.0.0.1
AF_INET6: no results
AF_UNSPEC: 10.0.0.1

since it's the first definition of foo in the search domains.

> >> Do you have any suggestions on a mitigation which would be more > >> palatable? 
We need to ship a mitigation for this in Alpine 3.8 > >> regardless. I would much rather carry a patch that is upstreamable, > >> but I am quite willing to carry one that isn't, in order to solve this > >> problem. > > > > A theoretically-non-horrible (but somewhat costly) solution is to > > always query both A and AAAA, rather than only doing it for AF_UNSPEC. > > Then if you see a reply with 0 (total, between both) records, you can > > opt to interpret that the same way as NxDomain without breaking > > consistency properties. If Cloudflare refuses to fix the bug, maybe we > > should consider adding an _option_ (in the resolv.conf options line) > > to do this. I don't think it should be the default behavior because it > > mildly slows down lookups, especially if you have nontrivial packet > > loss since probability of failure is now 1-(1-p)²=2p-p² rather than p > > (where p is the packet loss rate). > > It seems to me we could just send ANY and filter out the records we > don't care about. This is what I did with charybdis's asynchronous > DNS resolver when it had a similar problem. What are your thoughts on > that? If ANY requests actually worked we would have done this years ago. They don't. Not only is the ANY result unlikely to fit in a UDP reply (and lacking any reason to include the records you need if it doesn't fit); A and AAAA queries are also special in that they resolve CNAMEs and automatically include the A or AAAA records for the name the CNAME points to in the reply. With ANY you'd have to reimplement the whole CNAME logic as a second (slow) query from the stub resolver rather than having the recursive resolver give you the answer it already knew when you made the first query. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given 2018-03-30 20:35 ` Rich Felker @ 2018-03-30 21:09 ` William Pitcock 0 siblings, 0 replies; 12+ messages in thread From: William Pitcock @ 2018-03-30 21:09 UTC (permalink / raw) To: musl Hello, On Fri, Mar 30, 2018 at 3:35 PM, Rich Felker <dalias@libc.org> wrote: > On Fri, Mar 30, 2018 at 02:44:44PM -0500, William Pitcock wrote: >> Hello, >> >> On Fri, Mar 30, 2018 at 2:35 PM, Rich Felker <dalias@libc.org> wrote: >> > On Fri, Mar 30, 2018 at 02:19:48PM -0500, William Pitcock wrote: >> >> Hello, >> >> >> >> On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@libc.org> wrote: >> >> > On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote: >> >> >> In the event of no results being given by any of the lookup modules, EAI_NONAME will still >> >> >> be thrown. >> >> >> >> >> >> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers, >> >> >> such as the one Cloudflare have implemented, and appear in the search path. >> >> >> --- >> >> >> src/network/lookup_name.c | 2 +- >> >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> >> >> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c >> >> >> index 209c20f0..b068bb92 100644 >> >> >> --- a/src/network/lookup_name.c >> >> >> +++ b/src/network/lookup_name.c >> >> >> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[ >> >> >> memcpy(canon+l+1, p, z-p); >> >> >> canon[z-p+1+l] = 0; >> >> >> int cnt = name_from_dns(buf, canon, canon, family, &conf); >> >> >> - if (cnt) return cnt; >> >> >> + if (cnt > 0) return cnt; >> >> >> } >> >> >> } >> >> > >> >> > This patch is incorrect, and the reason should be an FAQ item if it's >> >> > not already. Only a return value of 0 means that the requested name >> >> > does not exist and that it's permissible to continue search. 
Other >> >> > nonpositive return values indicate either that the name does exist but >> >> > does not have a record of the quested type, or that a transient error >> >> > occurred, making it impossible to determine whether the search can be >> >> > continued and thus requiring the error to be reported to the caller. >> >> > Anything else results in one or both of the following bugs: >> >> > >> >> > - Nondeterministically returning different results for the same query >> >> > depending on transient unavailability of the nameservers to answer >> >> > on time. >> >> > >> >> > - Returning inconsistent results (for different search components) >> >> > depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested. >> >> > >> >> > I'm aware that at least rancher-dns and Cloudflare's nameservers have >> >> > had bugs related to this issue. I'm not sure what the status on >> >> > getting them fixed is, and for Cloudflare I don't know exactly what it >> >> > is they're doing wrong or why. But I do know the problem is that >> >> > they're returning semantically incorrect dns replies. >> >> >> >> Kubernetes imposes a default search path with the cluster domain last, so: >> >> >> >> - local.prod.svc.whatever >> >> - prod.svc.whatever >> >> - svc.whatever >> >> - yourdomain.com >> >> >> >> The cloudflare issue is that they send SUCCESS code with 0 replies, >> >> which causes musl to error when it hits the yourdomain.com. >> > >> > Yes, that makes sense. Do you know why they're doing it? If they >> > refuse to fix it, the only clean fix I know is a local proxy >> > configured to fix the records for the specific broken domains you care >> > about. But of course that's not convenient. >> >> My contacts at cloudflare indicate that their environment depends on >> this behaviour, so they have no interest in fixing it. 
>>
>> A local proxy isn't going to be workable, because most people are going to just say "but Debian or Fedora doesn't require this," and then just go use a glibc distribution.
>>
>> There is a talk in a few weeks at KubeCon (the Kubernetes conference), explicitly titled "Don't Use Alpine If You Care About DNS." The talk largely centers around how musl's overly strict behaviour makes Alpine a bad choice for "the real world." I would like to turn this into a story where we can announce that Alpine 3.8 mitigates this problem instead; doing so will be good for both Alpine and the musl ecosystem as a whole, as it defangs a point of possible FUD.
>
> I understand that this is a frustrating position to be in, where someone who (from your description so far, at least) seems like a troll is planning to publicly malign your distro. I don't think rushing quick, wrong workarounds helps. Rather, it makes it look like you knew there was a problem and only acted on it because someone threatened to make you look bad in public. The reality is that Cloudflare is returning wrong DNS results that aren't suitable for the search settings k8s users want to use.

I agree with this conclusion, but also think that mitigating the k8s problem is worthwhile, too, as many people have complained about it. Strangely, no bug report has ever been filed with Alpine for this, though.

In IRC, we talked about conditionally enabling a mitigation based on split A/AAAA lookup. I believe this is the way to go, and am working on a patch for it. I believe we should enable the mitigation where ndots > 1 or n_search_domains > 1, as discussed in IRC.

> I think a better response would be demonstrating why the glibc behavior they're relying on is buggy. This should be possible by configuring two search domains:
>
> ex1.example.com
> ex2.example.com
>
> and adding the following records to them:
>
> foo.ex1.example.com. A    10.0.0.1
> foo.ex2.example.com. A    192.168.0.1
> foo.ex2.example.com. AAAA fd00::1
>
> Then, try glibc's getaddrinfo for the name "foo" with AF_INET, AF_INET6, and AF_UNSPEC. The results I expect you'll see are:
>
> AF_INET:   10.0.0.1
> AF_INET6:  192.168.0.1, fd00::1
> AF_UNSPEC: 10.0.0.1 (possibly also fd00::1 but doubtful)
>
> That is, *which fqdn you see results for* depends on which address family you made the query with. This is clearly wrong.

Yes, I agree with this as a counter-example. I intend to write a blog post about it, and that is a good scenario to demonstrate.

> With musl you'll consistently see results for foo.ex1.example.com:
>
> AF_INET:   10.0.0.1
> AF_INET6:  no results
> AF_UNSPEC: 10.0.0.1
>
> since it's the first definition of foo in the search domains.

Yes, I agree that the consistency is good to demonstrate also.

>> >> Do you have any suggestions on a mitigation which would be more palatable? We need to ship a mitigation for this in Alpine 3.8 regardless. I would much rather carry a patch that is upstreamable, but I am quite willing to carry one that isn't, in order to solve this problem.
>> >
>> > A theoretically-non-horrible (but somewhat costly) solution is to always query both A and AAAA, rather than only doing it for AF_UNSPEC. Then if you see a reply with 0 (total, between both) records, you can opt to interpret that the same way as NxDomain without breaking consistency properties. If Cloudflare refuses to fix the bug, maybe we should consider adding an _option_ (in the resolv.conf options line) to do this. I don't think it should be the default behavior because it mildly slows down lookups, especially if you have nontrivial packet loss, since the probability of failure is now 1-(1-p)² = 2p-p² rather than p (where p is the packet loss rate).
>>
>> It seems to me we could just send ANY and filter out the records we don't care about.
>> This is what I did with charybdis's asynchronous DNS resolver when it had a similar problem. What are your thoughts on that?
>
> If ANY requests actually worked we would have done this years ago. They don't. Not only is the ANY result unlikely to fit in a UDP reply (and lacking any reason to include the records you need if it doesn't fit); A and AAAA queries are also special in that they resolve CNAMEs and automatically include the A or AAAA records for the name the CNAME points to in the reply. With ANY you'd have to reimplement the whole CNAME logic as a second (slow) query from the stub resolver rather than having the recursive resolver give you the answer it already knew when you made the first query.

Right, I forgot about CNAMEs. In charybdis, we wound up just using A/AAAA conditionally, because of them.

DNS gives me a headache.

William
* Re: [PATCH] resolver: only exit the search path loop when there are a positive number of results given
  2018-03-30 19:44 ` William Pitcock
  2018-03-30 20:24   ` Szabolcs Nagy
  2018-03-30 20:35   ` Rich Felker
@ 2018-03-31 10:42 ` Florian Weimer
  2018-03-31 14:01   ` Rich Felker
  2 siblings, 1 reply; 12+ messages in thread

From: Florian Weimer @ 2018-03-31 10:42 UTC (permalink / raw)
To: William Pitcock; +Cc: musl

* William Pitcock:

> A local proxy isn't going to be workable, because most people are going to just say "but Debian or Fedora doesn't require this," and then just go use a glibc distribution.

Some parts of the glibc behavior are clearly wrong and not even internally consistent. Rich is right that for correctness, you can only proceed on the search path if you have received a successful reply. However, making changes in this area is difficult, both due to the current state of the glibc code and due to existing deployments depending on corner cases which are not well understood.

I'm not entirely convinced that using different search path domains for different address families is necessarily wrong. Historically, the NODATA/NXDOMAIN signaling has been really inconsistent anyway, and I suspect it still is for some users.

What Cloudflare is doing appears to be some kind of protection against NSEC-based zone enumeration, and that requires synthesizing a NODATA response. They are unlikely to change that, and they won't be the only ones doing this.

(going further upthread)

> Kubernetes imposes a default search path with the cluster domain last, so:
>
> - local.prod.svc.whatever
> - prod.svc.whatever
> - svc.whatever
> - yourdomain.com

Do you have a source for that?

Considering that glibc for a long time had a hard limit of six entries, I find that approach rather surprising. This leaves just three domains in the end user's context. That's not going to be sufficient for many users. Anyway …

> The cloudflare issue is that they send SUCCESS code with 0 replies, which causes musl to error when it hits yourdomain.com.

Is the long search path really the problem here? Isn't it ndots:5? It means that queries destined for the general DNS tree hit the organizational tree first, where the search stops due to the NODATA response. So you never get the expected response from the public tree.

Is this what's happening?
* Re: [PATCH] resolver: only exit the search path loop when there are a positive number of results given
  2018-03-31 10:42 ` Florian Weimer
@ 2018-03-31 14:01 ` Rich Felker
  2018-03-31 16:08   ` Florian Weimer
  0 siblings, 1 reply; 12+ messages in thread

From: Rich Felker @ 2018-03-31 14:01 UTC (permalink / raw)
To: Florian Weimer; +Cc: William Pitcock, musl

On Sat, Mar 31, 2018 at 12:42:17PM +0200, Florian Weimer wrote:
> * William Pitcock:
>
> > A local proxy isn't going to be workable, because most people are going to just say "but Debian or Fedora doesn't require this," and then just go use a glibc distribution.
>
> Some parts of the glibc behavior are clearly wrong and not even internally consistent. Rich is right that for correctness, you can only proceed on the search path if you have received a successful reply. However, making changes in this area is difficult, both due to the current state of the glibc code and due to existing deployments depending on corner cases which are not well understood.

The behavior of path search on failures is a separate issue from the behavior on "NODATA", so we can probably stick to the latter for now.

> I'm not entirely convinced that using different search path domains for different address families is necessarily wrong.

It breaks the completely reasonable application expectation that the results produced by AF_INET and AF_INET6 queries are subsets of the results produced by AF_UNSPEC. The proper application idiom is to use AF_UNSPEC (or no hints) and respect the order the results are returned in, in order to honor RFC 3484/gai.conf or any other means by which getaddrinfo determines which order results should be tried in. It's (IMO at least) utterly wrong to try to merge results from different search domains, but I can see applications trying both queries separately when they encounter the inconsistency...
> Historically, the NODATA/NXDOMAIN signaling has been really inconsistent anyway, and I suspect it still is for some users.

Do you have a reference for this? AFAIK it was very consistent in all historical implementations. It's also documented (in RFC-????...I forget where, but I looked it up during this).

> What Cloudflare is doing appears to be some kind of protection against NSEC-based zone enumeration, and that requires synthesizing a NODATA response. They are unlikely to change that, and they won't be the only ones doing this.

Thanks for the explanation.

> > Kubernetes imposes a default search path with the cluster domain last, so:
> >
> > - local.prod.svc.whatever
> > - prod.svc.whatever
> > - svc.whatever
> > - yourdomain.com
>
> Do you have a source for that?
>
> Considering that glibc had for a long time a hard limit at six entries, I find that approach rather surprising. This leaves just three domains in the end user's context. That's not going to be sufficient for many users. Anyway …

k8s isn't software you install as a package on your user system. It's cloud/container stuff, where it wouldn't make sense to add more search domains beyond the ones for your application.

> > The cloudflare issue is that they send SUCCESS code with 0 replies, which causes musl to error when it hits yourdomain.com.
>
> Is the long search path really the problem here? Isn't it ndots:5? It means that queries destined for the general DNS tree hit the organizational tree first, where the search stops due to the NODATA response. So you never get the expected response from the public tree.
>
> Is this what's happening?

Yes. ndots>1 is utterly awful -- it greatly increases latency of every lookup, and has failure modes like what we're seeing now -- but the k8s folks designed stuff around it.
Based on conversations when musl added search domains, I think there are people on the k8s side that realize this was a bad design choice and want to fix it, but that probably won't be easy to roll out to everyone, and I have no idea if it's really going to happen.

FWIW, if ndots<=1 and there is only one search domain, the NODATA/NXDOMAIN issue does not make any difference to the results (assuming no TLDs with top-level A/AAAA records :). But if ndots>1 or there are at least 2 search domains, the result does change. In the former case, global lookups get broken; in the latter, subsequent search domains get missed.

Rich
* Re: [PATCH] resolver: only exit the search path loop when there are a positive number of results given
  2018-03-31 14:01 ` Rich Felker
@ 2018-03-31 16:08 ` Florian Weimer
  0 siblings, 0 replies; 12+ messages in thread

From: Florian Weimer @ 2018-03-31 16:08 UTC (permalink / raw)
To: Rich Felker; +Cc: William Pitcock, musl

* Rich Felker:

>> I'm not entirely convinced that using different search path domains for different address families is necessarily wrong.
>
> It breaks the completely reasonable application expectation that the results produced by AF_INET and AF_INET6 queries are subsets of the results produced by AF_UNSPEC. The proper application idiom is to use AF_UNSPEC (or no hints) and respect the order the results are returned in, in order to honor RFC 3484/gai.conf or any other means by which getaddrinfo determines which order results should be tried in. It's (IMO at least) utterly wrong to try to merge results from different search domains, but I can see applications trying both queries separately when they encounter the inconsistency...

Well, yes, but I'm not sure you can get consistent behavior without always issuing two queries. At least not if you want to stay compatible with certain forms of DNSSEC online signing.

>> Historically, the NODATA/NXDOMAIN signaling has been really inconsistent anyway, and I suspect it still is for some users.
>
> Do you have a reference for this? AFAIK it was very consistent in all historical implementations. It's also documented (in RFC-????...I forget where but I looked it up during this).

Today, I expect that it is consistent among the major implementations, mainly due to DNSSEC influence. Some load balancers returned NXDOMAIN for AAAA queries.
I'm not sure if F5 was one of them, but this document suggests something in this direction:

https://support.f5.com/csp/article/K2345

Here's a report of this issue:

https://www.nanog.org/mailinglist/mailarchives/old_archive/2002-04/msg00559.html

Here's a more concrete bug report about MaraDNS:

http://maradns.samiam.org/old-list-archive/2009-October/000476.html

(Which is surprisingly recent, but then, non-lower-case domain names are probably quite rare.)

Peter van Dijk reports something else, some form of NODATA-to-NXDOMAIN escalation:

https://blog.powerdns.com/2015/03/02/from-noerror-to-refused/

Although that doesn't happen on the stub resolver interface, it shows that the behavior still isn't as uniform as we would like it to be.

>> > Kubernetes imposes a default search path with the cluster domain last, so:
>> >
>> > - local.prod.svc.whatever
>> > - prod.svc.whatever
>> > - svc.whatever
>> > - yourdomain.com
>>
>> Do you have a source for that?
>>
>> Considering that glibc had for a long time a hard limit at six entries, I find that approach rather surprising. This leaves just three domains in the end user's context. That's not going to be sufficient for many users. Anyway …
>
> k8s isn't software you install as a package on your user system. It's cloud/container stuff, where it wouldn't make sense to add more search domains beyond the ones for your application.

From what I've heard, quite a few people use it to run older software which interacts with the corporate network. Even before, the six-domain limit was quite low for some deployments (and some sites apparently stuck to NIS because of its server-side search list processing). All I'm saying is that it's a curious choice due to the compatibility and performance issues involved.

> Yes. ndots>1 is utterly awful -- it greatly increases latency of every lookup, and has failure modes like what we're seeing now -- but the k8s folks designed stuff around it.
> Based on conversations when musl added search domains, I think there are people on the k8s side that realize this was a bad design choice and want to fix it, but that probably won't be easy to roll out to everyone and I have no idea if it's really going to happen.

They probably do not want to maintain an NSS module for that. 8->

In the past, in the container context, there have also been reports about injecting the recursive resolver endpoint, so that it appears in the container under a 127.0.0.0/8 address. I don't know if that has been solved completely. I suspect a DNS transport over a UNIX domain socket would help here.

For the search path problems, we would need a DNS protocol extension which transfers search path expansion to the recursive resolver. I'm not sure if this is worth the effort, and how many glibc-based distributions would be willing to backport a patch for that in a relatively timely fashion.
end of thread, other threads:[~2018-03-31 16:08 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-30 18:52 [PATCH] resolver: only exit the search path loop when there are a positive number of results given William Pitcock
2018-03-30 19:14 ` Rich Felker
2018-03-30 19:19   ` William Pitcock
2018-03-30 19:35     ` Rich Felker
2018-03-30 19:44       ` William Pitcock
2018-03-30 20:24         ` Szabolcs Nagy
2018-03-30 20:33           ` William Pitcock
2018-03-30 20:35         ` Rich Felker
2018-03-30 21:09           ` William Pitcock
2018-03-31 10:42       ` Florian Weimer
2018-03-31 14:01         ` Rich Felker
2018-03-31 16:08           ` Florian Weimer