mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <>
Subject: Re: [musl] [PATCH] [draft] dns tcp fallback
Date: Sun, 25 Sep 2022 23:12:38 -0400	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

[-- Attachment #1: Type: text/plain, Size: 1113 bytes --]

On Thu, Sep 22, 2022 at 04:28:34PM -0400, Rich Felker wrote:
> Here is my first draft of the DNS TCP fallback support that seems to
> be basically working. There may of course be bugs, and there are cases
> where I probably don't want it to fall back like it's doing, but it's
> at a stage where review and [smoke-]testing would be helpful.
> Some known issues:
> - getaddrinfo should use larger buffers (probably 640-768 bytes) or
>   the "huge CNAME" problem iss not solved.
> - If res_send is used to send EDNS0 queries crafted by the caller and
>   the caller provides a buffer smaller than what the OPT header
>   advertises, the answer will be silently truncated to the caller's
>   buffer size.
> I've tested querying TXT for (>7k RRset) and
> basic non-fallback usage. Further large-answer query test cases would
> be appreciated, especially names where one or both of the A or AAAA
> RRset are large so that fallback in parallel lookup can be tested, and
> later so that getaddrinfo buffer size increase can be tested.

OK, I have patches for both of the above now.

[-- Attachment #2: 0004-getaddrinfo-dns-lookup-use-larger-answer-buffer-to-h.patch --]
[-- Type: text/plain, Size: 2282 bytes --]

From 4b5570f57e0784611225968d47cfbe7babe02da4 Mon Sep 17 00:00:00 2001
From: Rich Felker <>
Date: Thu, 22 Sep 2022 19:11:48 -0400
Subject: [PATCH 4/5] getaddrinfo dns lookup: use larger answer buffer to
 handle long CNAMEs

the size of 512 is not sufficient to get at least one address in the
worst case where the name is at or near max length and resolves to a
CNAME at or near max length. prior to tcp fallback, there was nothing
we could do about this case anyway, but now it's fixable.

the new limit 768 is chosen so as to admit roughly the number of
addresses with a worst-case CNAME as could fit for a worst-case name
that's not a CNAME in the old 512-byte limit. outside of this
worst-case, the number of addresses that might be obtained is

MAXADDRS (48) was originally chosen as an upper bound on the combined
number of A and AAAA records that could fit in 512-byte packets (31
and 17, respectively). it is not increased at this time.

so as to prevent a situation where the A records consume almost all of
these slots (at 768 bytes, a "best-case" name can fit almost 47 A
records), the order of parsing is swapped to process AAAA first. this
ensures roughly half of the slots are available to each address
 src/network/lookup_name.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
index 37d481f9..3eda65a7 100644
--- a/src/network/lookup_name.c
+++ b/src/network/lookup_name.c
@@ -137,7 +137,7 @@ static int dns_parse_callback(void *c, int rr, const void *data, int len, const
 static int name_from_dns(struct address buf[static MAXADDRS], char canon[static 256], const char *name, int family, const struct resolvconf *conf)
-	unsigned char qbuf[2][280], abuf[2][512];
+	unsigned char qbuf[2][280], abuf[2][768];
 	const unsigned char *qp[2] = { qbuf[0], qbuf[1] };
 	unsigned char *ap[2] = { abuf[0], abuf[1] };
 	int qlens[2], alens[2];
@@ -171,7 +171,7 @@ static int name_from_dns(struct address buf[static MAXADDRS], char canon[static
 		if ((abuf[i][3] & 15) != 0) return EAI_FAIL;
-	for (i=0; i<nq; i++)
+	for (i=nq-1; i>=0; i--)
 		__dns_parse(abuf[i], alens[i], dns_parse_callback, &ctx);
 	if (ctx.cnt) return ctx.cnt;

[-- Attachment #3: 0005-dns-query-core-detect-udp-truncation-at-recv-time.patch --]
[-- Type: text/plain, Size: 2414 bytes --]

From 28960a5de11ecb39ce2d50dd1e3e97d55139052e Mon Sep 17 00:00:00 2001
From: Rich Felker <>
Date: Sun, 25 Sep 2022 22:48:12 -0400
Subject: [PATCH 5/5] dns query core: detect udp truncation at recv time

we already attempt to preclude this case by having res_send use a
sufficiently large temporary buffer even if the caller did not provide
one as large as or larger than the udp dns max of 512 bytes. however,
it's possible that the caller passed a custom-crafted query packet
using EDNS0, e.g. to get detailed DNSSEC results, with a larger udp
size allowance.

I have also seen claims that there are some broken nameservers in the
wild that do not honor the dns udp limit of 512 and send large answers
without the TC bit set, when the query was not using EDNS.

we generally don't aim to support broken nameservers, but in this case
both problems, if the latter is even real, have a common solution:
using recvmsg instead of recvfrom so we can examine the MSG_TRUNC
 src/network/res_msend.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/network/res_msend.c b/src/network/res_msend.c
index 1e76886a..11c6aa0e 100644
--- a/src/network/res_msend.c
+++ b/src/network/res_msend.c
@@ -194,9 +194,18 @@ int __res_msend_rc(int nqueries, const unsigned char *const *queries,
 		/* Wait for a response, or until time to retry */
 		if (poll(pfd, nqueries+1, t1+retry_interval-t2) <= 0) continue;
-		while (next < nqueries &&
-		  (rlen = recvfrom(fd, answers[next], asize, 0,
-		  (void *)&sa, (socklen_t[1]){sl})) >= 0) {
+		while (next < nqueries) {
+			struct msghdr mh = {
+				.msg_name = (void *)&sa,
+				.msg_namelen = sl,
+				.msg_iovlen = 1,
+				.msg_iov = (struct iovec []){
+					{ .iov_base = (void *)answers[next],
+					  .iov_len = asize }
+				}
+			};
+			rlen = recvmsg(fd, &mh, 0);
+			if (rlen < 0) break;
 			/* Ignore non-identifiable packets */
 			if (rlen < 4) continue;
@@ -240,7 +249,7 @@ int __res_msend_rc(int nqueries, const unsigned char *const *queries,
 			if (next == nqueries) pfd[nqueries].events = 0;
 			/* If answer is truncated (TC bit), fallback to TCP */
-			if (answers[i][2] & 2) {
+			if ((answers[i][2] & 2) || (mh.msg_flags & MSG_TRUNC)) {
 				alens[i] = -1;
 				pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0);
 				r = start_tcp(pfd+i, family, ns+j, sl, queries[i], qlens[i]);

      reply	other threads:[~2022-09-26  3:12 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22 20:28 Rich Felker
2022-09-26  3:12 ` Rich Felker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).