mailing list of musl libc
* [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
@ 2024-06-22  9:54 Lance Yang
  2024-06-22 13:06 ` Jan Mercl
  2024-06-22 14:37 ` Rich Felker
  0 siblings, 2 replies; 11+ messages in thread
From: Lance Yang @ 2024-06-22  9:54 UTC (permalink / raw)
  To: musl; +Cc: Lance Yang

From: Lance Yang <ioworker0@gmail.com>

musl’s resolver queries some configured nameservers in parallel and accepts
the first response. However, if the first response's RCODE indicates
NXDOMAIN, the resolver terminates the resolution process too early,
potentially missing valid responses from other nameservers.

There is a DNS issue that is reproducible under specific conditions. For
instance, it occurs when one of the nameservers does not have the domain
name and responds first. Even worse, if this nameserver consistently
responds the fastest, the domain name will never be resolved successfully.

This commit introduces a 'send_tracker' counter to track the number of
queries sent. The resolver now continues waiting for responses from other
nameservers unless only one query was sent, ensuring more robust DNS
resolution.

Signed-off-by: Lance Yang <ioworker0@gmail.com>
---
 src/network/res_msend.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/network/res_msend.c b/src/network/res_msend.c
index 86c2fcf4..29f1ce0b 100644
--- a/src/network/res_msend.c
+++ b/src/network/res_msend.c
@@ -98,6 +98,7 @@ int __res_msend_rc(int nqueries, const unsigned char *const *queries,
 	unsigned char alen_buf[nqueries][2];
 	int r;
 	unsigned long t0, t1, t2;
+	int send_tracker = 0;
 
 	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
 
@@ -185,7 +186,7 @@ int __res_msend_rc(int nqueries, const unsigned char *const *queries,
 			/* Query all configured namservers in parallel */
 			for (i=0; i<nqueries; i++)
 				if (!alens[i])
-					for (j=0; j<nns; j++)
+					for (j=0; j<nns; j++, send_tracker++)
 						sendto(fd, queries[i],
 							qlens[i], MSG_NOSIGNAL,
 							(void *)&ns[j], sl);
@@ -228,14 +229,19 @@ int __res_msend_rc(int nqueries, const unsigned char *const *queries,
 			 * all other codes such as refusal. */
 			switch (answers[next][3] & 15) {
 			case 0:
-			case 3:
 				break;
+			case 3:
+				if (send_tracker <= 1)
+					break;
 			case 2:
-				if (servfail_retry && servfail_retry--)
+				if (servfail_retry && servfail_retry--) {
 					sendto(fd, queries[i],
 						qlens[i], MSG_NOSIGNAL,
 						(void *)&ns[j], sl);
+					send_tracker++;
+				}
 			default:
+				send_tracker--;
 				continue;
 			}
 
-- 
2.45.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-22  9:54 [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries Lance Yang
@ 2024-06-22 13:06 ` Jan Mercl
  2024-06-23  3:39   ` Lance Yang
  2024-06-22 14:37 ` Rich Felker
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Mercl @ 2024-06-22 13:06 UTC (permalink / raw)
  To: musl; +Cc: Lance Yang

On Sat, Jun 22, 2024 at 2:51 PM Lance Yang <lance.yang@linux.dev> wrote:

> musl’s resolver queries some configured nameservers in parallel and accepts
> the first response. However, if the first response's RCODE indicates
> NXDOMAIN, the resolver terminates the resolution process too early,
> potentially missing valid responses from other nameservers.

Linux uses the first valid response, even if it is NXDOMAIN. So it's
not clear that terminating the resolution process in that case is
"too early". I think that continuing the search after getting NXDOMAIN
could be considered a security risk.

Source, possibly outdated:
https://www.unix.com/ip-networking/133552-howto-linux-multihomed-dns-client.html

-j


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-22  9:54 [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries Lance Yang
  2024-06-22 13:06 ` Jan Mercl
@ 2024-06-22 14:37 ` Rich Felker
  2024-06-23  4:09   ` Lance Yang
  1 sibling, 1 reply; 11+ messages in thread
From: Rich Felker @ 2024-06-22 14:37 UTC (permalink / raw)
  To: Lance Yang; +Cc: musl, Lance Yang

On Sat, Jun 22, 2024 at 05:54:29PM +0800, Lance Yang wrote:
> From: Lance Yang <ioworker0@gmail.com>
> 
> musl’s resolver queries some configured nameservers in parallel and accepts
> the first response. However, if the first response's RCODE indicates
> NXDOMAIN, the resolver terminates the resolution process too early,
> potentially missing valid responses from other nameservers.
> 
> There is a DNS issue that is reproducible under specific conditions. For
> instance, it occurs when one of the nameservers does not have the domain
> name and responds first. Even worse, if this nameserver consistently
> responds the fastest, the domain name will never be resolved successfully.
> 
> This commit introduces a 'send_tracker' counter to track the number of
> queries sent. The resolver now continues waiting for responses from other
> nameservers unless only one query was sent, ensuring more robust DNS
> resolution.
> 
> Signed-off-by: Lance Yang <ioworker0@gmail.com>

The behavior you're trying to "fix" is intentional and necessary. See
the recent question here on the list:

https://www.openwall.com/lists/musl/2024/06/14/2

and the answer:

https://www.openwall.com/lists/musl/2024/06/14/3

If you don't accept a (semantically conclusive) NxDomain response but
keep waiting for replies from other nameservers, you necessarily
undermine the whole redundancy purpose of the resolver allowing more
than one nameserver. Negative results at least stall until all servers
respond or time out, and if any of them do time out, you're forced
either to report a temporary failure (making the redundancy breakage
not just a slow response but a functional difference) or to reverse
your decision to treat the NxDomain as inconclusive (making it so that
an attacker who can disrupt the network controls how a name resolves).
Neither of these is acceptable.

It sounds like what you want is unioning of multiple disjoint DNS
namespaces served by different nameservers. Doing this in any reliable
and consistent way depends on a lot of policy that's completely
outside the scope of what a libc stub resolver could let you define. You
need an actual proxy nameserver running on localhost or somewhere else
you control that performs the unioning according to the particular
policy you want.

Rich
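
For readers following the patch, the pre-patch RCODE handling that the
reply above calls intentional can be sketched in isolation. This is a
minimal illustration reconstructed from the context lines of the diff in
the first message, not musl's actual code: the names `verdict`,
`dns_rcode` and `classify` are hypothetical, and the real loop resends
on a socket and keeps waiting rather than returning a verdict.

```c
/* Hypothetical names for illustration only; musl exposes none of these. */
enum verdict { ACCEPT, RETRY, IGNORE };

/* RCODE is the low four bits of the fourth header byte
 * (RFC 1035, section 4.1.1). */
static int dns_rcode(const unsigned char *msg)
{
	return msg[3] & 15;
}

/* Decide what to do with one reply. servfail_retry is a shared retry
 * budget, mirroring the variable of the same name in the diff. */
static enum verdict classify(const unsigned char *msg, int *servfail_retry)
{
	switch (dns_rcode(msg)) {
	case 0: /* NOERROR */
	case 3: /* NXDOMAIN: treated as a conclusive answer */
		return ACCEPT;
	case 2: /* SERVFAIL: ask again while the budget lasts */
		if (*servfail_retry > 0) {
			(*servfail_retry)--;
			return RETRY;
		}
		return IGNORE;
	default: /* e.g. REFUSED: wait for another server's reply */
		return IGNORE;
	}
}
```

Under these assumptions, a reply whose fourth header byte is 0x83 (RA
set, RCODE 3) is accepted immediately, while 0x85 (REFUSED) is ignored
and the resolver keeps waiting for the remaining nameservers.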


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-22 13:06 ` Jan Mercl
@ 2024-06-23  3:39   ` Lance Yang
  2024-06-23 18:52     ` Thorsten Glaser
  0 siblings, 1 reply; 11+ messages in thread
From: Lance Yang @ 2024-06-23  3:39 UTC (permalink / raw)
  To: Jan Mercl, musl; +Cc: Lance Yang

June 22, 2024 at 9:06 PM, "Jan Mercl" <0xjnml@gmail.com> wrote:



> On Sat, Jun 22, 2024 at 2:51 PM Lance Yang <lance.yang@linux.dev> wrote:
>
> > musl’s resolver queries some configured nameservers in parallel and accepts
> > the first response. However, if the first response's RCODE indicates
> > NXDOMAIN, the resolver terminates the resolution process too early,
> > potentially missing valid responses from other nameservers.
>
> Linux uses the first valid response, even if it is NXDOMAIN. So it's
> not clear that terminating the resolution process in that case is
> "too early". I think that continuing the search after getting NXDOMAIN
> could be considered a security risk.
>
> Source, possibly outdated:
> https://www.unix.com/ip-networking/133552-howto-linux-multihomed-dns-client.html
>
> -j

Hi Jan,

Thanks for paying attention and sharing this information!

I understand your concern that continuing the search after receiving an
NXDOMAIN response might pose a security risk. Will look into this issue
further.

Thanks again!
Lance


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-22 14:37 ` Rich Felker
@ 2024-06-23  4:09   ` Lance Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Lance Yang @ 2024-06-23  4:09 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Lance Yang

June 22, 2024 at 10:37 PM, "Rich Felker" <dalias@libc.org> wrote:



> On Sat, Jun 22, 2024 at 05:54:29PM +0800, Lance Yang wrote:
> > From: Lance Yang <ioworker0@gmail.com>
> >
> > musl’s resolver queries some configured nameservers in parallel and accepts
> > the first response. However, if the first response's RCODE indicates
> > NXDOMAIN, the resolver terminates the resolution process too early,
> > potentially missing valid responses from other nameservers.
> >
> > There is a DNS issue that is reproducible under specific conditions. For
> > instance, it occurs when one of the nameservers does not have the domain
> > name and responds first. Even worse, if this nameserver consistently
> > responds the fastest, the domain name will never be resolved successfully.
> >
> > This commit introduces a 'send_tracker' counter to track the number of
> > queries sent. The resolver now continues waiting for responses from other
> > nameservers unless only one query was sent, ensuring more robust DNS
> > resolution.
> >
> > Signed-off-by: Lance Yang <ioworker0@gmail.com>

Hi Rich,

Thanks for taking time to review!


> The behavior you're trying to "fix" is intentional and necessary. See
> the recent question here on the list:
>
> https://www.openwall.com/lists/musl/2024/06/14/2
>
> and the answer:
>
> https://www.openwall.com/lists/musl/2024/06/14/3

Missed that, thanks.

> If you don't accept a (semantically conclusive) NxDomain response but
> keep waiting for replies from other nameservers, you necessarily
> undermine the whole redundancy purpose of the resolver allowing more
> than one nameserver. Negative results at least stall until all servers
> respond or time out, and if any of them do time out, you're forced
> either to report a temporary failure (making the redundancy breakage
> not just a slow response but a functional difference) or to reverse
> your decision to treat the NxDomain as inconclusive (making it so that
> an attacker who can disrupt the network controls how a name resolves).
> Neither of these is acceptable.
>
> It sounds like what you want is unioning of multiple disjoint DNS
> namespaces served by different nameservers. Doing this in any reliable
> and consistent way depends on a lot of policy that's completely
> outside the scope of what a libc stub resolver could let you define. You
> need an actual proxy nameserver running on localhost or somewhere else
> you control that performs the unioning according to the particular
> policy you want.

Thanks again for clarifying!

It seems that ensuring reliability across multiple separate DNS namespaces
probably requires a different approach, such as using a proxy nameserver
that we control locally to apply specific policies.

Have a good weekend!
Lance



* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-23  3:39   ` Lance Yang
@ 2024-06-23 18:52     ` Thorsten Glaser
  2024-06-23 19:23       ` Rich Felker
  2024-06-24 11:56       ` Lance Yang
  0 siblings, 2 replies; 11+ messages in thread
From: Thorsten Glaser @ 2024-06-23 18:52 UTC (permalink / raw)
  To: musl; +Cc: Jan Mercl, Lance Yang

Lance Yang dixit:

>I understand your concern that continuing the search after receiving an
>NXDOMAIN response might pose a security risk. Will look into this issue

It’s not (just) a security risk, it’s how DNS works.

NXDOMAIN means “I am a nameserver responsible for resolving your
query, and I can state with confidence that the entry you requested
does not exist” so no other responsible nameserver’s response can
rightly differ.

If you need to merge different zones together, the normal method is
running a caching nameserver like dnscache from DJBDNS and configuring
it to ask specific upstream nameservers for specific zones. For example,
after “echo 192.168.178.1 >/service/dnscache/root/servers/box” it will
ask the normal root zone for normal requests, but for *.box it’ll ask
a local Fritz!box instead.

bye,
//mirabilos
-- 
As long as you don't pull any dirty tricks, and I mean *really*
dirty tricks, like XORing both pointers of a doubly linked list
and storing them in just one word, Boehm works quite
splendidly.		-- Andreas Bogk on boehm-gc in d.a.s.r
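
The per-zone forwarding described in the message above comes down to
choosing an upstream server by the longest matching zone suffix. The
following is a minimal sketch of that selection step only, under stated
assumptions: `zone_route`, `in_zone` and `pick_upstream` are hypothetical
names, this is not how dnscache itself is implemented, and names with a
trailing dot are not handled.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical forwarding table entry: zone suffix -> upstream server. */
struct zone_route {
	const char *zone;     /* "" matches every name (default route) */
	const char *upstream; /* address of the nameserver to ask */
};

/* Nonzero if name equals zone or ends in "." followed by zone,
 * compared case-insensitively. */
static int in_zone(const char *name, const char *zone)
{
	size_t nl = strlen(name), zl = strlen(zone);
	if (zl == 0) return 1;
	if (zl > nl) return 0;
	if (strcasecmp(name + (nl - zl), zone) != 0) return 0;
	return nl == zl || name[nl - zl - 1] == '.';
}

/* Return the upstream of the longest matching zone, or NULL if no
 * route matches at all. */
static const char *pick_upstream(const struct zone_route *routes,
                                 size_t n, const char *name)
{
	const char *best = NULL;
	size_t best_len = 0;
	int found = 0;
	for (size_t i = 0; i < n; i++) {
		size_t zl = strlen(routes[i].zone);
		if (in_zone(name, routes[i].zone) && (!found || zl > best_len)) {
			best = routes[i].upstream;
			best_len = zl;
			found = 1;
		}
	}
	return best;
}
```

With a table holding a default route and a "box" route pointing at
192.168.178.1, a lookup for "fritz.box" would select 192.168.178.1,
while "sandbox" would fall through to the default because the match
must start at a label boundary.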


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-23 18:52     ` Thorsten Glaser
@ 2024-06-23 19:23       ` Rich Felker
  2024-06-24  4:35         ` Lance Yang
  2024-06-24 11:56       ` Lance Yang
  1 sibling, 1 reply; 11+ messages in thread
From: Rich Felker @ 2024-06-23 19:23 UTC (permalink / raw)
  To: Thorsten Glaser; +Cc: musl, Jan Mercl, Lance Yang

On Sun, Jun 23, 2024 at 06:52:54PM +0000, Thorsten Glaser wrote:
> Lance Yang dixit:
> 
> >I understand your concern that continuing the search after receiving an
> >NXDOMAIN response might pose a security risk. Will look into this issue
> 
> It’s not (just) a security risk, it’s how DNS works.
> 
> NXDOMAIN means “I am a nameserver responsible for resolving your
> query, and I can state with confidence that the entry you requested
> does not exist” so no other responsible nameserver’s response can
> rightly differ.

Moreover, if you're using a nameserver that validates DNSSEC it means
"I am a nameserver.... and I have witnessed cryptographic proof that
the name you requested does not exist or that the delegating authority
at one level of the hierarchy made a delegation that opts out of
further cryptographic validation."

Rich


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-23 19:23       ` Rich Felker
@ 2024-06-24  4:35         ` Lance Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Lance Yang @ 2024-06-24  4:35 UTC (permalink / raw)
  To: Rich Felker, Thorsten Glaser; +Cc: musl, Jan Mercl, Lance Yang

June 24, 2024 at 3:23 AM, "Rich Felker" <dalias@libc.org> wrote:



> On Sun, Jun 23, 2024 at 06:52:54PM +0000, Thorsten Glaser wrote:
> > Lance Yang dixit:
> >
> > >I understand your concern that continuing the search after receiving an
> > >NXDOMAIN response might pose a security risk. Will look into this issue
> >
> > It’s not (just) a security risk, it’s how DNS works.
> >
> > NXDOMAIN means “I am a nameserver responsible for resolving your
> > query, and I can state with confidence that the entry you requested
> > does not exist” so no other responsible nameserver’s response can
> > rightly differ.

Yep, I got it wrong, thanks for clarifying!

> Moreover, if you're using a nameserver that validates DNSSEC it means
> "I am a nameserver.... and I have witnessed cryptographic proof that
> the name you requested does not exist or that the delegating authority
> at one level of the hierarchy made a delegation that opts out of
> further cryptographic validation."

Thanks again for the lesson!
Lance


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-23 18:52     ` Thorsten Glaser
  2024-06-23 19:23       ` Rich Felker
@ 2024-06-24 11:56       ` Lance Yang
  2024-06-24 14:57         ` Rich Felker
  1 sibling, 1 reply; 11+ messages in thread
From: Lance Yang @ 2024-06-24 11:56 UTC (permalink / raw)
  To: Thorsten Glaser, musl; +Cc: Jan Mercl, Lance Yang

June 24, 2024 at 2:52 AM, "Thorsten Glaser" <tg@mirbsd.de> wrote:



> Lance Yang dixit:
>
> >I understand your concern that continuing the search after receiving an
> >NXDOMAIN response might pose a security risk. Will look into this issue
>
> It’s not (just) a security risk, it’s how DNS works.
>
> NXDOMAIN means “I am a nameserver responsible for resolving your
> query, and I can state with confidence that the entry you requested
> does not exist” so no other responsible nameserver’s response can
> rightly differ.

Sorry to bother you again. Could you please let me know from which
document or standard this description is taken?

Any details about the specific RFC, technical documentation, or other
authoritative sources would be greatly appreciated.

Thanks,
Lance

> If you need to merge different zones together, the normal method is
> running a caching nameserver like dnscache from DJBDNS and configuring
> it to ask specific upstream nameservers for specific zones. For example,
> after “echo 192.168.178.1 >/service/dnscache/root/servers/box” it will
> ask the normal root zone for normal requests, but for *.box it’ll ask
> a local Fritz!box instead.
>
> bye,
> //mirabilos
> -- 
> As long as you don't pull any dirty tricks, and I mean *really*
> dirty tricks, like XORing both pointers of a doubly linked list
> and storing them in just one word, Boehm works quite
> splendidly. -- Andreas Bogk on boehm-gc in d.a.s.r


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-24 11:56       ` Lance Yang
@ 2024-06-24 14:57         ` Rich Felker
  2024-06-25  1:57           ` Lance Yang
  0 siblings, 1 reply; 11+ messages in thread
From: Rich Felker @ 2024-06-24 14:57 UTC (permalink / raw)
  To: Lance Yang; +Cc: Thorsten Glaser, musl, Jan Mercl, Lance Yang

On Mon, Jun 24, 2024 at 11:56:01AM +0000, Lance Yang wrote:
> June 24, 2024 at 2:52 AM, "Thorsten Glaser" <tg@mirbsd.de> wrote:
> >
> > Lance Yang dixit:
> >
> > >I understand your concern that continuing the search after receiving an
> > >NXDOMAIN response might pose a security risk. Will look into this issue
> >
> > It’s not (just) a security risk, it’s how DNS works.
> >
> > NXDOMAIN means “I am a nameserver responsible for resolving your
> > query, and I can state with confidence that the entry you requested
> > does not exist” so no other responsible nameserver’s response can
> > rightly differ.
> 
> Sorry to bother you again. Could you please let me know from which
> document or standard this description is taken?
> 
> Any details about the specific RFC, technical documentation, or other
> authoritative sources would be greatly appreciated.

RFC 2308 is the main source I can think of for clarifying the meaning
and expected behavior for NxDomain. The only relevant amendments I can
find are RFC 8020 and RFC 9520, but neither of them changes anything
related to the basic meaning.

Rich


* Re: [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries
  2024-06-24 14:57         ` Rich Felker
@ 2024-06-25  1:57           ` Lance Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Lance Yang @ 2024-06-25  1:57 UTC (permalink / raw)
  To: Rich Felker; +Cc: Thorsten Glaser, musl, Jan Mercl, Lance Yang

June 24, 2024 at 10:57 PM, "Rich Felker" <dalias@libc.org> wrote:



> On Mon, Jun 24, 2024 at 11:56:01AM +0000, Lance Yang wrote:
> > June 24, 2024 at 2:52 AM, "Thorsten Glaser" <tg@mirbsd.de> wrote:
> > >
> > > Lance Yang dixit:
> > >
> > > > I understand your concern that continuing the search after receiving an
> > > > NXDOMAIN response might pose a security risk. Will look into this issue
> > >
> > > It’s not (just) a security risk, it’s how DNS works.
> > >
> > > NXDOMAIN means “I am a nameserver responsible for resolving your
> > > query, and I can state with confidence that the entry you requested
> > > does not exist” so no other responsible nameserver’s response can
> > > rightly differ.
> >
> > Sorry to bother you again. Could you please let me know from which
> > document or standard this description is taken?
> >
> > Any details about the specific RFC, technical documentation, or other
> > authoritative sources would be greatly appreciated.
>
> RFC 2308 is the main source I can think of for clarifying the meaning
> and expected behavior for NxDomain. The only relevant amendments I can
> find are RFC 8020 and RFC 9520, but neither of them changes anything
> related to the basic meaning.

Thanks a lot for taking time to reply!

I will refer to these documents for a deeper understanding of the
NxDomain behavior ;)

Have a good one,
Lance




Thread overview: 11+ messages
2024-06-22  9:54 [musl] [PATCH 1/1] improve DNS resolution logic for parallel queries Lance Yang
2024-06-22 13:06 ` Jan Mercl
2024-06-23  3:39   ` Lance Yang
2024-06-23 18:52     ` Thorsten Glaser
2024-06-23 19:23       ` Rich Felker
2024-06-24  4:35         ` Lance Yang
2024-06-24 11:56       ` Lance Yang
2024-06-24 14:57         ` Rich Felker
2024-06-25  1:57           ` Lance Yang
2024-06-22 14:37 ` Rich Felker
2024-06-23  4:09   ` Lance Yang
