* [musl] DNS answer buffer is too small
@ 2023-07-04 0:17 Hamish Forbes
2023-07-04 3:29 ` Rich Felker
0 siblings, 1 reply; 7+ messages in thread
From: Hamish Forbes @ 2023-07-04 0:17 UTC (permalink / raw)
To: musl
In lookup_name.c the DNS answer buffer (ABUF_SIZE) is currently
hardcoded to 768 bytes, increased from 512 after TCP DNS was
implemented.
The DNS RFC is, unfortunately, not very explicit about the size of TCP
responses.
However it does have this little line:
> RFC 1035 4.2.2.
> The message is prefixed with a two byte length field which gives the message length, excluding the two byte length field.
Which you could interpret as the TCP response length is a 16bit
unsigned int, so 64K.
But that's also the raw (potentially compressed) DNS response, the
uncompressed response could be larger still.
As best I can tell (I am not a C programmer!) glibc is using a scratch
buffer which grows when the parsing function returns an ERANGE error.
If that's more complex than you want, maybe a good compromise would be
allocating 768 (or 512?) for UDP queries and expanding to 64k when a
query falls back to TCP?
Hamish
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [musl] DNS answer buffer is too small 2023-07-04 0:17 [musl] DNS answer buffer is too small Hamish Forbes @ 2023-07-04 3:29 ` Rich Felker [not found] ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com> 0 siblings, 1 reply; 7+ messages in thread From: Rich Felker @ 2023-07-04 3:29 UTC (permalink / raw) To: Hamish Forbes; +Cc: musl Thanks for sending an email where we can document this. On Tue, Jul 04, 2023 at 12:17:05PM +1200, Hamish Forbes wrote: > In lookup_name.c the DNS answer buffer (ABUF_SIZE) is currently > hardcoded to 768 bytes, increased from 512 after TCP DNS was > implemented. > > The DNS RFC is, unfortunately, not very explicit about the size of TCP > responses. > However it does have this little line: > > > RFC 1035 4.2.2. > > The message is prefixed with a two byte length field which gives the message length, excluding the two byte length field. > > Which you could interpret as the TCP response length is a 16bit > unsigned int, so 64K. > But that's also the raw (potentially compressed) DNS response, the > uncompressed response could be larger still. "Compression" is not relevant; there is no "decompressed state" the message is converted into. Everything is processed in the original DNS protocol form, with "compression" (backpointers). > As best I can tell (I am not a C programmer!) glibc is using a scratch > buffer which grows when the parsing function returns an ERANGE error. > > If that's more complex than you want, maybe a good compromise would be > allocating 768 (or 512?) for UDP queries and expanding to 64k when a > query falls back to TCP? Your report here is missing the motivation for why you might care to have more than 768 bytes of response, which, as I understand it, is because of CNAME chains. Otherwise, the buffer size is chosen to hold the number of answer records the stub resolver is willing to accept, and there is no problem. Long CNAME chains are rather hostile and are not guaranteed to be well supported -- AIUI recursive nameservers already impose their own limits on the number of redirections in a chain, though I cannot find any specification in the RFCs for this behavior or suggesting a value for that limit, so if you can dig up what they actually do, this would be useful to know. But it seems there are chains longer than what we currently support out in the wild, which most software does support. So the next step is nailing down exactly what the "requirement" here is, so we can figure out what's the most reasonable and least costly way to do it. If there's some moderately small limit on the number of redirections that recursive software supports, it may make sense to just increase the buffer size to match that. If there really can be very large chains, this is a mess. Rich ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com>]
* RE: [musl] DNS answer buffer is too small [not found] ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com> @ 2023-07-04 4:19 ` Hamish Forbes 2023-07-04 16:07 ` Rich Felker 0 siblings, 1 reply; 7+ messages in thread From: Hamish Forbes @ 2023-07-04 4:19 UTC (permalink / raw) To: musl On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote: > Your report here is missing the motivation for why you might care to > have more than 768 bytes of response, which, as I understand it, is > because of CNAME chains. Yes, sorry, a response can contain a chain of CNAMEs like a.example.com 10 IN CNAME b.example.com. b.example.com 10 IN CNAME c.example.com. ... n.example.com 10 IN A 127.0.0.1. With a max length per CNAME of 255 this could quickly blow out the 768 byte limit. > Otherwise, the buffer size is chosen to hold > the number of answer records the stub resolver is willing to accept, > and there is no problem. Are you referring to the 48 MAXADDRS limit? That's also possibly a problem in edge cases though, isn't it? AIUI that was chosen as the maximum number of IP responses that can fit in a 512 byte UDP response. With TCP DNS implemented now, is that still a safe assumption? It seems nuts to be returning more than 48 addresses but the DNS spec doesn't prevent it, so I bet it's being done somewhere! This post[0] I came across while investigating all this stuff is somewhat related > Long CNAME chains are rather hostile and are not guaranteed to be well > supported -- AIUI recursive nameservers already impose their own > limits on the number of redirections in a chain, though I cannot find > any specification in the RFCs for this behavior or suggesting a value > for that limit, so if you can dig up what they actually do, this would > be useful to know. But it seems there are chains longer than what we > currently support out in the wild, which most software does support. > So the next step is nailing down exactly what the "requirement" here > is, so we can figure out what's the most reasonable and least costly > way to do it. Bind defaults[1] to 7, although that appears[2] to only be when it has to switch to another NS. So if the CNAME chain is all in the same zone... that's only a depth of 1 Looks like Unbound defaults[3] to 11. Although it also has a slightly confusing 'target-fetch-policy' which looks to limit depth in a similar way to Bind, maybe... > If there's some moderately small limit on the number of redirections > that recursive software supports, it may make sense to just increase > the buffer size to match that. If there really can be very large > chains, this is a mess. I have a strong suspicion that this is, in fact, going to be a mess :) > Rich [0] https://www.netmeister.org/blog/dns-size.html [1] https://bind9.readthedocs.io/en/latest/reference.html#namedconf-statement-max-recursion-depth [2] https://groups.google.com/g/comp.protocols.dns.bind/c/1GoKujM2Ylw/m/l_DuNxWFiCUJ [3] https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound.conf.html#unbound-conf-max-query-restarts ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [musl] DNS answer buffer is too small 2023-07-04 4:19 ` Hamish Forbes @ 2023-07-04 16:07 ` Rich Felker 2023-07-05 0:18 ` Hamish Forbes 0 siblings, 1 reply; 7+ messages in thread From: Rich Felker @ 2023-07-04 16:07 UTC (permalink / raw) To: Hamish Forbes; +Cc: musl On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote: > On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote: > > Your report here is missing the motivation for why you might care to > > have more than 768 bytes of response, which, as I understand it, is > > because of CNAME chains. > > Yes, sorry, a response can contain a chain of CNAMEs like > > a.example.com 10 IN CNAME b.example.com. > b.example.com 10 IN CNAME c.example.com. > .... > n.example.com 10 IN A 127.0.0.1. > > With a max length per CNAME of 255 this could quickly blow out the 768 > byte limit. > > > Otherwise, the buffer size is chosen to hold > > the number of answer records the stub resolver is willing to accept, > > and there is no problem. > > Are you referring to the 48 MAXADDRS limit? > That's also possibly a problem in edge cases though, isn't it? > AIUI that was chosen as the maximum number of IP responses that can > fit in a 512 byte UDP response. > With TCP DNS implemented now, is that still a safe assumption? It's as safe as it was before, and it's always been the intended behavior. Aside from the CNAME chain issue which wasn't even realized at the time, use of TCP for getaddrinfo is not about getting more answers than fit in the UDP response size. It's about handling the case where the recursive server returns a truncated response with zero answer records instead of the max number that fit. It turned out this could also occur with a single CNAME where both the queried name and the CNAME target take up nearly the full 255 length. As for why not to care about more results, getaddrinfo does not provide a precise view of DNS space. It takes a hostname and gives you a set of addresses you can use to attempt to connect to that host (or bind if that name is your own, etc.). There's very little utility in timing out more than 47 times then continuing to try more addresses rather than just failing. "Our name resolves to 100 addresses and you have to try all of them to find the one that works" is not a viable configuration. (A lot of software does not even iterate and try fallbacks at all, but only attempts to use the first one, typically round-robin rotated by the nameserver.) Anyway, if there are objections to this behavior, it's a completely separate issue from handling long CNAME chains. > It seems nuts to be returning more than 48 addresses but the DNS spec > doesn't prevent it, > so I bet it's being done somewhere! > > This post[0] I came across while investigating all this stuff is > somewhat related > > > Long CNAME chains are rather hostile and are not guaranteed to be well > > supported -- AIUI recursive nameservers already impose their own > > limits on the number of redirections in a chain, though I cannot find > > any specification in the RFCs for this behavior or suggesting a value > > for that limit, so if you can dig up what they actually do, this would > > be useful to know. But it seems there are chains longer than what we > > currently support out in the wild, which most software does support. > > So the next step is nailing down exactly what the "requirement" here > > is, so we can figure out what's the most reasonable and least costly > > way to do it. > > Bind defaults[1] to 7, although that appears[2] to only be when it has > to switch to another NS. > So if the CNAME chain is all in the same zone... that's only a depth of 1 > > Looks like Unbound defaults[3] to 11. > Although it also has a slightly confusing 'target-fetch-policy' which > looks to limit depth in a similar way to Bind, maybe... From my reading of your links, and https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI I don't think max-recursion-depth is related to CNAMEs. It's the depth of delegation recursion. The max CNAME chain length is separate, and in unbound terminology is the number of "restarts". Unbound's limit as you've found is 11. BIND's is supposedly hard-coded at 16. Assuming the recursive server uses pointers properly, max size of a length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a little over 4k for the BIND limit, and that's assuming max-length names with no further redundancy. I would expect the real-world need is considerably lower than this, and that the Unbound default limit on chain length also suffices in practice (or it wouldn't be the default for a widely used recursive server). So, for example, using a 4k buffer (adding a little over 3k to what we have now, which already had enough for one CNAME) should solve the problem entirely. Does this sound like an okay fix to you? Rich ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [musl] DNS answer buffer is too small 2023-07-04 16:07 ` Rich Felker @ 2023-07-05 0:18 ` Hamish Forbes 2023-07-05 3:41 ` Rich Felker 0 siblings, 1 reply; 7+ messages in thread From: Hamish Forbes @ 2023-07-05 0:18 UTC (permalink / raw) To: Rich Felker; +Cc: musl On Wed, 5 Jul 2023 at 04:06, Rich Felker <dalias@libc.org> wrote: > > On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote: > > On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote: > It's as safe as it was before, and it's always been the intended > behavior. Aside from the CNAME chain issue which wasn't even realized > at the time, use of TCP for getaddrinfo is not about getting more > answers than fit in the UDP response size. It's about handling the > case where the recursive server returns a truncated response with zero > answer records instead of the max number that fit. It turned out this > could also occur with a single CNAME where both the queried name and > the CNAME target take up nearly the full 255 length. > > As for why not to care about more results, getaddrinfo does not > provide a precise view of DNS space. It takes a hostname and gives you > a set of addresses you can use to attempt to connect to that host (or > bind if that name is your own, etc.). There's very little utility in > timing out more than 47 times then continuing to try more addresses > rather than just failing. "Our name resolves to 100 addresses and you > have to try all of them to find the one that works" is not a viable > configuration. (A lot of software does not even iterate and try > fallbacks at all, but only attempts to use the first one, typically > round-robin rotated by the nameserver.) > > Anyway, if there are objections to this behavior, it's a completely > separate issue from handling long CNAME chains. Ah yeah, ok that makes sense. I wasn't thinking about it as "we just need any address". No objections to that from me! > From my reading of your links, and > > https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI > > I don't think max-recursion-depth is related to CNAMEs. It's the depth > of delegation recursion. The max CNAME chain length is separate, and > in unbound terminology is the number of "restarts". Unbound's limit as > you've found is 11. BIND's is supposedly hard-coded at 16. > > Assuming the recursive server uses pointers properly, max size of a > length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a > little over 4k for the BIND limit, and that's assuming max-length > names with no further redundancy. I would expect the real-world need > is considerably lower than this, and that the Unbound default limit on > chain length also suffices in practice (or it wouldn't be the default > for a widely used recursive server). So, for example, using a 4k > buffer (adding a little over 3k to what we have now, which already had > enough for one CNAME) should solve the problem entirely. > > Does this sound like an okay fix to you? Sounds good to me! > > Rich ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [musl] DNS answer buffer is too small 2023-07-05 0:18 ` Hamish Forbes @ 2023-07-05 3:41 ` Rich Felker 2023-07-05 11:24 ` A. Wilcox 0 siblings, 1 reply; 7+ messages in thread From: Rich Felker @ 2023-07-05 3:41 UTC (permalink / raw) To: Hamish Forbes; +Cc: musl [-- Attachment #1: Type: text/plain, Size: 3069 bytes --] On Wed, Jul 05, 2023 at 12:18:11PM +1200, Hamish Forbes wrote: > On Wed, 5 Jul 2023 at 04:06, Rich Felker <dalias@libc.org> wrote: > > > > On Tue, Jul 04, 2023 at 04:19:16PM +1200, Hamish Forbes wrote: > > > On Tue, 4 Jul 2023 at 15:29, Rich Felker <dalias@libc.org> wrote: > > It's as safe as it was before, and it's always been the intended > > behavior. Aside from the CNAME chain issue which wasn't even realized > > at the time, use of TCP for getaddrinfo is not about getting more > > answers than fit in the UDP response size. It's about handling the > > case where the recursive server returns a truncated response with zero > > answer records instead of the max number that fit. It turned out this > > could also occur with a single CNAME where both the queried name and > > the CNAME target take up nearly the full 255 length. > > > > As for why not to care about more results, getaddrinfo does not > > provide a precise view of DNS space. It takes a hostname and gives you > > a set of addresses you can use to attempt to connect to that host (or > > bind if that name is your own, etc.). There's very little utility in > > timing out more than 47 times then continuing to try more addresses > > rather than just failing. "Our name resolves to 100 addresses and you > > have to try all of them to find the one that works" is not a viable > > configuration. (A lot of software does not even iterate and try > > fallbacks at all, but only attempts to use the first one, typically > > round-robin rotated by the nameserver.) > > > > Anyway, if there are objections to this behavior, it's a completely > > separate issue from handling long CNAME chains. > > Ah yeah, ok that makes sense. > I wasn't thinking about it as "we just need any address". > No objections to that from me! > > > From my reading of your links, and > > > > https://groups.google.com/g/comp.protocols.dns.bind/c/rXici9NvIqI > > > > I don't think max-recursion-depth is related to CNAMEs. It's the depth > > of delegation recursion. The max CNAME chain length is separate, and > > in unbound terminology is the number of "restarts". Unbound's limit as > > you've found is 11. BIND's is supposedly hard-coded at 16. > > > > Assuming the recursive server uses pointers properly, max size of a > > length-N CNAME chain is (N+1)*(255+epsilon). This comes out to a > > little over 4k for the BIND limit, and that's assuming max-length > > names with no further redundancy. I would expect the real-world need > > is considerably lower than this, and that the Unbound default limit on > > chain length also suffices in practice (or it wouldn't be the default > > for a widely used recursive server). So, for example, using a 4k > > buffer (adding a little over 3k to what we have now, which already had > > enough for one CNAME) should solve the problem entirely. > > > > Does this sound like an okay fix to you? > > Sounds good to me! Great. Writing up the commit message, I figured it's not much larger to go with supporting the full 16, and easier to justify. Proposed patch attached. Rich [-- Attachment #2: 0001-dns-stub-resolver-increase-buffer-size-to-handle-cha.patch --] [-- Type: text/plain, Size: 1966 bytes --] From a4ecaf89a9b88df76e8bf9f28e1cc6cb89e4bfa8 Mon Sep 17 00:00:00 2001 From: Rich Felker <dalias@aerifal.cx> Date: Tue, 4 Jul 2023 23:11:17 -0400 Subject: [PATCH] dns stub resolver: increase buffer size to handle chained CNAMEs in the event of chained CNAMEs, the answer to a query will contain the entire CNAME chain, not just one CNAME record. previously, the answer buffer size had been chosen to admit a maximal-length CNAME, but only one. a moderate-length chain could fill the available 768 bytes leaving no room for an actual address answering the query. while the DNS RFCs do not specify any limit on the length of a CNAME chain, or any reasonable behavior is the chain exceeds the entire 64k possible message size, actual recursive servers have to impose a limit, and a such, for all practical purposes, chains longer than this limit are not usable. it turns out BIND has a hard-coded limit of 16, and Unbound has a default limit of 11. assuming the recursive server makes use of "compression" (pointers), each maximal-length CNAME record takes at most 268 bytes, and thus any chain up to length 16 fits in at most 4288 bytes. this patch increases the answer buffer size to preserve the original intent of having 512 bytes available for address answers, plus space needed for a maximal CNAME chain, for a total of 4800 bytes. the resulting size of 9600 bytes for two queries (A+AAAA) is still well within what is reasonable to place in automatic storage. --- src/network/lookup_name.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c index 4281482e..35218185 100644 --- a/src/network/lookup_name.c +++ b/src/network/lookup_name.c @@ -109,7 +109,7 @@ struct dpc_ctx { #define RR_CNAME 5 #define RR_AAAA 28 -#define ABUF_SIZE 768 +#define ABUF_SIZE 4800 static int dns_parse_callback(void *c, int rr, const void *data, int len, const void *packet, int plen) { -- 2.21.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [musl] DNS answer buffer is too small 2023-07-05 3:41 ` Rich Felker @ 2023-07-05 11:24 ` A. Wilcox 0 siblings, 0 replies; 7+ messages in thread From: A. Wilcox @ 2023-07-05 11:24 UTC (permalink / raw) To: musl; +Cc: Rich Felker On Jul 4, 2023, at 10:41 PM, Rich Felker <dalias@libc.org> wrote: > > while the DNS RFCs do not specify any limit on the length of a CNAME > chain, or any reasonable behavior is the chain exceeds the entire 64k Is -> if, maybe? Otherwise, this is a fragment and perhaps it should start “the DNS RFCs” and the comma should be removed. Then the sentence should end at “any reasonable behaviour." > possible message size, actual recursive servers have to impose a > limit, and a such, for all practical purposes, chains longer than this …and *as* such… > limit are not usable. it turns out BIND has a hard-coded limit of 16, > and Unbound has a default limit of 11. The code change looks good; just trying to fix up the commit msg. Best, -A. -- A. Wilcox (they/them) SW Engineering: C++/Rust, DevOps, POSIX, Py/Ruby Wilcox Technologies Inc. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-07-05 11:24 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-07-04 0:17 [musl] DNS answer buffer is too small Hamish Forbes 2023-07-04 3:29 ` Rich Felker [not found] ` <CAO3VF8TnAX97Mt7h3MdzHScE-5660dcrZVn45p4vKXkYpbuicw@mail.gmail.com> 2023-07-04 4:19 ` Hamish Forbes 2023-07-04 16:07 ` Rich Felker 2023-07-05 0:18 ` Hamish Forbes 2023-07-05 3:41 ` Rich Felker 2023-07-05 11:24 ` A. Wilcox
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).