mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] DNS answer buffer is too small
@ 2023-07-04  0:17 Hamish Forbes
  2023-07-04  3:29 ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Hamish Forbes @ 2023-07-04  0:17 UTC (permalink / raw)
  To: musl

In lookup_name.c the DNS answer buffer (ABUF_SIZE) is currently
hardcoded to 768 bytes, increased from 512 after TCP DNS was
implemented.

The DNS RFC is, unfortunately, not very explicit about the size of TCP
responses.
However it does have this little line:

> RFC 1035 4.2.2.
> The message is prefixed with a two byte length field which gives the message length, excluding the two byte length field.

Which you could interpret as the TCP response length is a 16bit
unsigned int, so 64K.
But that's also the raw (potentially compressed) DNS response, the
uncompressed response could be larger still.

As best I can tell (I am not a C programmer!) glibc is using a scratch
buffer which grows when the parsing function returns an ERANGE error.

If that's more complex than you want, maybe a good compromise would be
allocating 768 (or 512?) for UDP queries and expanding to 64k when a
query falls back to TCP?

Hamish

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] DNS answer buffer is too small
  2023-07-04  0:17 [musl] DNS answer buffer is too small Hamish Forbes
@ 2023-07-04  3:29 ` Rich Felker
       [not found]   ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-07-04  3:29 UTC (permalink / raw)
  To: Hamish Forbes; +Cc: musl

Thanks for sending an email where we can document this.

On Tue, Jul 04, 2023 at 12:17:05PM +1200, Hamish Forbes wrote:
> In lookup_name.c the DNS answer buffer (ABUF_SIZE) is currently
> hardcoded to 768 bytes, increased from 512 after TCP DNS was
> implemented.
> 
> The DNS RFC is, unfortunately, not very explicit about the size of TCP
> responses.
> However it does have this little line:
> 
> > RFC 1035 4.2.2.
> > The message is prefixed with a two byte length field which gives the message length, excluding the two byte length field.
> 
> Which you could interpret as the TCP response length is a 16bit
> unsigned int, so 64K.
> But that's also the raw (potentially compressed) DNS response, the
> uncompressed response could be larger still.

"Compression" is not relevant; there is no "decompressed state" the
message is converted into. Everything is processed in the original DNS
protocol form, with "compression" (backpointers).

> As best I can tell (I am not a C programmer!) glibc is using a scratch
> buffer which grows when the parsing function returns an ERANGE error.
> 
> If that's more complex than you want, maybe a good compromise would be
> allocating 768 (or 512?) for UDP queries and expanding to 64k when a
> query falls back to TCP?

Your report here is missing the motivation for why you might care to
have more than 768 bytes of response, which, as I understand it, is
because of CNAME chains. Otherwise, the buffer size is chosen to hold
the number of answer records the stub resolver is willing to accept,
and there is no problem.

Long CNAME chains are rather hostile and are not guaranteed to be well
supported -- AIUI recursive nameservers already impose their own
limits on the number of redirections in a chain, though I cannot find
any specification in the RFCs for this behavior or suggesting a value
for that limit, so if you can dig up what they actually do, this would
be useful to know. But it seems there are chains longer than what we
currently support out in the wild, which most software does support.
So the next step is nailing down exactly what the "requirement" here
is, so we can figure out what's the most reasonable and least costly
way to do it.

If there's some moderately small limit on the number of redirections
that recursive software supports, it may make sense to just increase
the buffer size to match that. If there really can be very large
chains, this is a mess.

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [musl] DNS answer buffer is too small
       [not found]   ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com>
@ 2023-07-04  4:19     ` Hamish Forbes
  2023-07-04 16:07       ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Hamish Forbes @ 2023-07-04  4:19 UTC (permalink / raw)
  To: musl

On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote:
> Your report here is missing the motivation for why you might care to
> have more than 768 bytes of response, which, as I understand it, is
> because of CNAME chains.

Yes, sorry, a response can contain a chain of CNAMEs like

a.example.com 10 IN CNAME b.example.com.
b.example.com 10 IN CNAME c.example.com.
...
n.example.com 10 IN A 127.0.0.1.

With a max length per CNAME of 255 this could quickly blow out the 768
byte limit.

> Otherwise, the buffer size is chosen to hold
> the number of answer records the stub resolver is willing to accept,
> and there is no problem.

Are you referring to the 48 MAXADDRS limit?
That's also possibly a problem in edge cases though, isn't it?
AIUI that was chosen as the maximum number of IP responses that can
fit in a 512 byte UDP response.
With TCP DNS implemented now, is that still a safe assumption?

It seems nuts to be returning more than 48 addresses but the DNS spec
doesn't prevent it,
so I bet it's being done somewhere!

This post[0] I came across while investigating all this stuff is
somewhat related

> Long CNAME chains are rather hostile and are not guaranteed to be well
> supported -- AIUI recursive nameservers already impose their own
> limits on the number of redirections in a chain, though I cannot find
> any specification in the RFCs for this behavior or suggesting a value
> for that limit, so if you can dig up what they actually do, this would
> be useful to know. But it seems there are chains longer than what we
> currently support out in the wild, which most software does support.
> So the next step is nailing down exactly what the "requirement" here
> is, so we can figure out what's the most reasonable and least costly
> way to do it.

Bind defaults[1] to 7, although that appears[2] to only be when it has
to switch to another NS.
So if the CNAME chain is all in the same zone... that's only  a depth of 1

Looks like Unbound defaults[3] to 11.
Although it also has a slightly confusing 'target-fetch-policy' which
looks to limit depth in a similar way to Bind, maybe...

> If there's some moderately small limit on the number of redirections
> that recursive software supports, it may make sense to just increase
> the buffer size to match that. If there really can be very large
> chains, this is a mess.

I have a strong suspicion that this is, in fact, going to be a mess :)


> Rich

[0] https://www.netmeister.org/blog/dns-size.html
[1] https://bind9.readthedocs.io/en/latest/reference.html#namedconf-statement-max-recursion-depth
[2] https://groups.google.com/g/comp.protocols.dns.bind/c/1GoKujM2Ylw/m/l_DuNxWFiCUJ
[3] https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound.conf.html#unbound-conf-max-query-restarts

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] DNS answer buffer is too small
  2023-07-04  4:19     ` Hamish Forbes
@ 2023-07-04 16:07       ` Rich Felker
  2023-07-05  0:18         ` Hamish Forbes
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-07-04 16:07 UTC (permalink / raw)
  To: Hamish Forbes; +Cc: musl

On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote:
> On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote:
> > Your report here is missing the motivation for why you might care to
> > have more than 768 bytes of response, which, as I understand it, is
> > because of CNAME chains.
> 
> Yes, sorry, a response can contain a chain of CNAMEs like
> 
> a.example.com 10 IN CNAME b.example.com.
> b.example.com 10 IN CNAME c.example.com.
> ....
> n.example.com 10 IN A 127.0.0.1.
> 
> With a max length per CNAME of 255 this could quickly blow out the 768
> byte limit.
> 
> > Otherwise, the buffer size is chosen to hold
> > the number of answer records the stub resolver is willing to accept,
> > and there is no problem.
> 
> Are you referring to the 48 MAXADDRS limit?
> That's also possibly a problem in edge cases though, isn't it?
> AIUI that was chosen as the maximum number of IP responses that can
> fit in a 512 byte UDP response.
> With TCP DNS implemented now, is that still a safe assumption?

It's as safe as it was before, and it's always been the intended
behavior. Aside from the CNAME chain issue which wasn't even realized
at the time, use of TCP for getaddrinfo is not about getting more
answers than fit in the UDP response size. It's about handling the
case where the recursive server returns a truncated response with zero
answer records instead of the max number that fit. It turned out this
could also occur with a single CNAME where both the queried name and
the CNAME target take up nearly the full 255 length.

As for why not to care about more results, getaddrinfo does not
provide a precise view of DNS space. It takes a hostname and gives you
a set of addresses you can use to attempt to connect to that host (or
bind if that name is your own, etc.). There's very little utility in
timing out more than 47 times then continuing to try more addresses
rather than just failing. "Our name resolves to 100 addresses and you
have to try all of them to find the one that works" is not a viable
configuration. (A lot of software does not even iterate and try
fallbacks at all, but only attempts to use the first one, typically
round-robin rotated by the nameserver.)

Anyway, if there are objections to this behavior, it's a completely
separate issue from handling long CNAME chains.

> It seems nuts to be returning more than 48 addresses but the DNS spec
> doesn't prevent it,
> so I bet it's being done somewhere!
> 
> This post[0] I came across while investigating all this stuff is
> somewhat related
> 
> > Long CNAME chains are rather hostile and are not guaranteed to be well
> > supported -- AIUI recursive nameservers already impose their own
> > limits on the number of redirections in a chain, though I cannot find
> > any specification in the RFCs for this behavior or suggesting a value
> > for that limit, so if you can dig up what they actually do, this would
> > be useful to know. But it seems there are chains longer than what we
> > currently support out in the wild, which most software does support.
> > So the next step is nailing down exactly what the "requirement" here
> > is, so we can figure out what's the most reasonable and least costly
> > way to do it.
> 
> Bind defaults[1] to 7, although that appears[2] to only be when it has
> to switch to another NS.
> So if the CNAME chain is all in the same zone... that's only  a depth of 1
> 
> Looks like Unbound defaults[3] to 11.
> Although it also has a slightly confusing 'target-fetch-policy' which
> looks to limit depth in a similar way to Bind, maybe...

From my reading of your links, and

https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI

I don't think max-recursion-depth is related to CNAMEs. It's the depth
of delegation recursion. The max CNAME chain length is separate, and
in unbound terminology is the number of "restarts". Unbound's limit as
you've found is 11. BIND's is supposedly hard-coded at 16.

Assuming the recursive server uses pointers properly, max size of a
length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a
little over 4k for the BIND limit, and that's assuming max-length
names with no further redundancy. I would expect the real-world need
is considerably lower than this, and that the Unbound default limit on
chain length also suffices in practice (or it wouldn't be the default
for a widely used recursive server). So, for example, using a 4k
buffer (adding a little over 3k to what we have now, which already had
enough for one CNAME) should solve the problem entirely.

Does this sound like an okay fix to you?

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] DNS answer buffer is too small
  2023-07-04 16:07       ` Rich Felker
@ 2023-07-05  0:18         ` Hamish Forbes
  2023-07-05  3:41           ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Hamish Forbes @ 2023-07-05  0:18 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Wed, 5 Jul 2023 at 04:06, Rich Felker <dalias@libc.org> wrote:
>
> On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote:
> > On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote:
> It's as safe as it was before, and it's always been the intended
> behavior. Aside from the CNAME chain issue which wasn't even realized
> at the time, use of TCP for getaddrinfo is not about getting more
> answers than fit in the UDP response size. It's about handling the
> case where the recursive server returns a truncated response with zero
> answer records instead of the max number that fit. It turned out this
> could also occur with a single CNAME where both the queried name and
> the CNAME target take up nearly the full 255 length.
>
> As for why not to care about more results, getaddrinfo does not
> provide a precise view of DNS space. It takes a hostname and gives you
> a set of addresses you can use to attempt to connect to that host (or
> bind if that name is your own, etc.). There's very little utility in
> timing out more than 47 times then continuing to try more addresses
> rather than just failing. "Our name resolves to 100 addresses and you
> have to try all of them to find the one that works" is not a viable
> configuration. (A lot of software does not even iterate and try
> fallbacks at all, but only attempts to use the first one, typically
> round-robin rotated by the nameserver.)
>
> Anyway, if there are objections to this behavior, it's a completely
> separate issue from handling long CNAME chains.

Ah yeah, ok that makes sense.
I wasn't thinking about it as "we just need any address".
No objections to that from me!

> From my reading of your links, and
>
> https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI
>
> I don't think max-recursion-depth is related to CNAMEs. It's the depth
> of delegation recursion. The max CNAME chain length is separate, and
> in unbound terminology is the number of "restarts". Unbound's limit as
> you've found is 11. BIND's is supposedly hard-coded at 16.
>
> Assuming the recursive server uses pointers properly, max size of a
> length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a
> little over 4k for the BIND limit, and that's assuming max-length
> names with no further redundancy. I would expect the real-world need
> is considerably lower than this, and that the Unbound default limit on
> chain length also suffices in practice (or it wouldn't be the default
> for a widely used recursive server). So, for example, using a 4k
> buffer (adding a little over 3k to what we have now, which already had
> enough for one CNAME) should solve the problem entirely.
>
> Does this sound like an okay fix to you?

Sounds good to me!

>
> Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] DNS answer buffer is too small
  2023-07-05  0:18         ` Hamish Forbes
@ 2023-07-05  3:41           ` Rich Felker
  2023-07-05 11:24             ` A. Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-07-05  3:41 UTC (permalink / raw)
  To: Hamish Forbes; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 3069 bytes --]

On Wed, Jul 05, 2023 at 12:18:11PM +1200, Hamish Forbes wrote:
> On Wed, 5 Jul 2023 at 04:06, Rich Felker <dalias@libc.org> wrote:
> >
> > On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote:
> > > On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote:
> > It's as safe as it was before, and it's always been the intended
> > behavior. Aside from the CNAME chain issue which wasn't even realized
> > at the time, use of TCP for getaddrinfo is not about getting more
> > answers than fit in the UDP response size. It's about handling the
> > case where the recursive server returns a truncated response with zero
> > answer records instead of the max number that fit. It turned out this
> > could also occur with a single CNAME where both the queried name and
> > the CNAME target take up nearly the full 255 length.
> >
> > As for why not to care about more results, getaddrinfo does not
> > provide a precise view of DNS space. It takes a hostname and gives you
> > a set of addresses you can use to attempt to connect to that host (or
> > bind if that name is your own, etc.). There's very little utility in
> > timing out more than 47 times then continuing to try more addresses
> > rather than just failing. "Our name resolves to 100 addresses and you
> > have to try all of them to find the one that works" is not a viable
> > configuration. (A lot of software does not even iterate and try
> > fallbacks at all, but only attempts to use the first one, typically
> > round-robin rotated by the nameserver.)
> >
> > Anyway, if there are objections to this behavior, it's a completely
> > separate issue from handling long CNAME chains.
> 
> Ah yeah, ok that makes sense.
> I wasn't thinking about it as "we just need any address".
> No objections to that from me!
> 
> > From my reading of your links, and
> >
> > https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI
> >
> > I don't think max-recursion-depth is related to CNAMEs. It's the depth
> > of delegation recursion. The max CNAME chain length is separate, and
> > in unbound terminology is the number of "restarts". Unbound's limit as
> > you've found is 11. BIND's is supposedly hard-coded at 16.
> >
> > Assuming the recursive server uses pointers properly, max size of a
> > length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a
> > little over 4k for the BIND limit, and that's assuming max-length
> > names with no further redundancy. I would expect the real-world need
> > is considerably lower than this, and that the Unbound default limit on
> > chain length also suffices in practice (or it wouldn't be the default
> > for a widely used recursive server). So, for example, using a 4k
> > buffer (adding a little over 3k to what we have now, which already had
> > enough for one CNAME) should solve the problem entirely.
> >
> > Does this sound like an okay fix to you?
> 
> Sounds good to me!

Great. Writing up the commit message, I figured it's not much larger
to go with supporting the full 16, and easier to justify. Proposed
patch attached.

Rich

[-- Attachment #2: 0001-dns-stub-resolver-increase-buffer-size-to-handle-cha.patch --]
[-- Type: text/plain, Size: 1966 bytes --]

From a4ecaf89a9b88df76e8bf9f28e1cc6cb89e4bfa8 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Tue, 4 Jul 2023 23:11:17 -0400
Subject: [PATCH] dns stub resolver: increase buffer size to handle chained
 CNAMEs

in the event of chained CNAMEs, the answer to a query will contain the
entire CNAME chain, not just one CNAME record. previously, the answer
buffer size had been chosen to admit a maximal-length CNAME, but only
one. a moderate-length chain could fill the available 768 bytes
leaving no room for an actual address answering the query.

while the DNS RFCs do not specify any limit on the length of a CNAME
chain, or any reasonable behavior is the chain exceeds the entire 64k
possible message size, actual recursive servers have to impose a
limit, and a such, for all practical purposes, chains longer than this
limit are not usable. it turns out BIND has a hard-coded limit of 16,
and Unbound has a default limit of 11.

assuming the recursive server makes use of "compression" (pointers),
each maximal-length CNAME record takes at most 268 bytes, and thus any
chain up to length 16 fits in at most 4288 bytes.

this patch increases the answer buffer size to preserve the original
intent of having 512 bytes available for address answers, plus space
needed for a maximal CNAME chain, for a total of 4800 bytes. the
resulting size of 9600 bytes for two queries (A+AAAA) is still well
within what is reasonable to place in automatic storage.
---
 src/network/lookup_name.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
index 4281482e..35218185 100644
--- a/src/network/lookup_name.c
+++ b/src/network/lookup_name.c
@@ -109,7 +109,7 @@ struct dpc_ctx {
 #define RR_CNAME 5
 #define RR_AAAA 28
 
-#define ABUF_SIZE 768
+#define ABUF_SIZE 4800
 
 static int dns_parse_callback(void *c, int rr, const void *data, int len, const void *packet, int plen)
 {
-- 
2.21.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [musl] DNS answer buffer is too small
  2023-07-05  3:41           ` Rich Felker
@ 2023-07-05 11:24             ` A. Wilcox
  0 siblings, 0 replies; 7+ messages in thread
From: A. Wilcox @ 2023-07-05 11:24 UTC (permalink / raw)
  To: musl; +Cc: Rich Felker

On Jul 4, 2023, at 10:41 PM, Rich Felker <dalias@libc.org> wrote:

> 
> while the DNS RFCs do not specify any limit on the length of a CNAME
> chain, or any reasonable behavior is the chain exceeds the entire 64k



Is -> if, maybe?  Otherwise, this is a fragment and perhaps it should
start “the DNS RFCs” and the comma should be removed.
Then the sentence should end at “any reasonable behaviour."


> possible message size, actual recursive servers have to impose a
> limit, and a such, for all practical purposes, chains longer than this


…and *as* such…


> limit are not usable. it turns out BIND has a hard-coded limit of 16,
> and Unbound has a default limit of 11.


The code change looks good; just trying to fix up the commit msg.

Best,
-A.

--
A. Wilcox (they/them)
SW Engineering: C++/Rust, DevOps, POSIX, Py/Ruby
Wilcox Technologies Inc.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-07-05 11:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-04  0:17 [musl] DNS answer buffer is too small Hamish Forbes
2023-07-04  3:29 ` Rich Felker
     [not found]   ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com>
2023-07-04  4:19     ` Hamish Forbes
2023-07-04 16:07       ` Rich Felker
2023-07-05  0:18         ` Hamish Forbes
2023-07-05  3:41           ` Rich Felker
2023-07-05 11:24             ` A. Wilcox

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).