mailing list of musl libc
* Malformed DNS requests for single-label hostnames with `search .`
From: Luke Shumaker @ 2019-05-07 16:29 UTC
  To: musl

In some scenarios, musl libc generates invalid DNS queries that are
discarded by the DNS server, particularly when `resolv.conf` says
`search .` and we attempt to resolve a single-label hostname.

    / # cat /etc/resolv.conf
    search .
    nameserver 1.1.1.1

For context of "what it should do", if I have a trailing `.` to tell
it to ignore the `search`-path, it makes the request correctly:

    / # time strace -f -e trace=sendto,sendmsg,sendmmsg getent hosts label.
    sendto(3, "\214\302\1\0\0\1\0\0\0\0\0\0\5label\0\0\34\0\1", 23,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 23
    sendto(3, "\355b\1\0\0\1\0\0\0\0\0\0\5label\0\0\1\0\1", 23,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 23
    +++ exited with 2 +++
    Command exited with non-zero status 2
    real    0m 0.03s
    user    0m 0.00s
    sys     0m 0.00s

But if I allow it to use the `search`-path, the query is invalid:

    / # time strace -f -e trace=sendto,sendmsg,sendmmsg getent hosts label
    sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 24
    sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 24
    sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 24
    sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24,
        MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
        sin_addr=inet_addr("1.1.1.1")}, 16) = 24
    +++ exited with 2 +++
    Command exited with non-zero status 2
    real    0m 10.01s
    user    0m 0.00s
    sys     0m 0.00s

We see it take 10s to time-out waiting for a reply from the DNS server
that will never come (because the server ignored the query as
malformed).  To annotate the queries a bit:

    Good request:

        sendto(3, "\214\302\1\0\0\1\0\0\0\0\0\0\5label\0\0\34\0\1",
            23, MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
            sin_addr=inet_addr("1.1.1.1")}, 16) = 23
                   [      header-section       [question-section]
                                               [-----][][---][--]
                                               ^      ^ ^    ^
              QNAME[0] = octet[5]{"label"}  --'       | |    |
              QNAME[1] = end  -----------------------'  |    |
              QTYPE    = AAAA  ------------------------'     |
              QCLASS   = IN  -------------------------------'

    Bad request (as seen by a parser):

        sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
            MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
            sin_addr=inet_addr("1.1.1.1")}, 16) = 24
                   [    header-section     [question-section ]
                                           [-----][----------- - - -
                                           ^      ^
       QNAME[0] = octet[5]{"label"}  -----'       |
       QNAME[1] = octet[46]{"\0\34\0\1\0"...}  --'
       QNAME[n] = end  --------------------------------------- - - -
       QTYPE    = ???  --------------------------------------- - - -
       QCLASS   = ???  --------------------------------------- - - -

    Bad request (as seen by a human):

        sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
            MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
            sin_addr=inet_addr("1.1.1.1")}, 16) = 24
                   [    header-section     [question-section ]
                                           [-----]|[---][--][]
                                           ^      ^^    ^   ^
       QNAME[0] = octet[5]{"label"}  -----'       ||    |   |
       QNAME[1] = should-be-end -----------------' |    |   |
       QTYPE    = AAAA  --------------------------'     |   |
       QCLASS   = IN  ---------------------------------'    |
       garbage  = garbage  --------------------------------'

So there are 2 pieces of corruption going on here:

 1. Instead of getting the \0 terminator indicating that there are no
    more labels in the QNAME, it gets an ASCII '.', indicating another
    label of length 46.
 2. An extra byte is allocated, which appears at the end of the
    message.
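
The "as seen by a parser" view can be reproduced with a small sketch.
The Python below is a minimal question-section parser written only for
illustration; it is not taken from any DNS server, it ignores name
compression, and `parse_question`, `GOOD`, and `BAD` are names made up
here.  The two payloads are copied from the strace output above.

```python
import struct

# Payloads copied from the strace output (octal escapes kept as-is).
GOOD = b"\214\302\1\0\0\1\0\0\0\0\0\0\5label\0\0\34\0\1"
BAD = b"\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0"


def parse_question(msg: bytes):
    """Parse the first question of a DNS message (illustrative sketch).

    Returns (labels, qtype, qclass, trailing_bytes).
    """
    offset = 12  # the DNS header is always 12 bytes
    labels = []
    while True:
        if offset >= len(msg):
            raise ValueError("QNAME is not terminated")
        length = msg[offset]
        offset += 1
        if length == 0:  # zero-length label ends the QNAME
            break
        if length > 63:
            raise ValueError("compression is not handled in this sketch")
        if offset + length > len(msg):
            raise ValueError(
                f"label of length {length} runs past the end of the message"
            )
        labels.append(msg[offset:offset + length].decode("ascii"))
        offset += length
    if offset + 4 > len(msg):
        raise ValueError("truncated QTYPE/QCLASS")
    qtype, qclass = struct.unpack_from("!HH", msg, offset)
    return labels, qtype, qclass, len(msg) - (offset + 4)


if __name__ == "__main__":
    print(parse_question(GOOD))
    try:
        parse_question(BAD)
    except ValueError as err:
        print("bad request rejected:", err)
```

For `GOOD` this yields the single label `label` with QTYPE 28 (AAAA)
and QCLASS 1 (IN).  For `BAD` it raises, because the '.' byte (46) is
read as a label length while only 5 bytes remain in the message.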

I have verified that the error happens with:

 - Alpine 3.9's musl 1.1.20-r3 on x86_64
 - Alpine 3.9's musl 1.1.20-r4 on x86_64
 - Alpine edge's musl 1.1.21-r2 on x86_64
 - Alpine edge's musl 1.1.22-r0 on x86_64

-- 
Happy hacking,
~ Luke Shumaker



* Re: Malformed DNS requests for single-label hostnames with `search .`
From: Rich Felker @ 2019-05-07 18:04 UTC
  To: musl

On Tue, May 07, 2019 at 12:29:43PM -0400, Luke Shumaker wrote:
> In some scenarios, musl libc generates invalid DNS queries that are
> discarded by the DNS server, particularly when `resolv.conf` says
> `search .` and we attempt to resolve a single-label hostname.
> 
>     / # cat /etc/resolv.conf
>     search .
>     nameserver 1.1.1.1
> 
> For context of "what it should do", if I have a trailing `.` to tell
> it to ignore the `search`-path, it makes the request correctly:

Note that this is not a good idea, even if it weren't buggy, as it
will just perform all your queries twice. If you don't want to search,
omit the search option or leave it blank.

>     / # time strace -f -e trace=sendto,sendmsg,sendmmsg getent hosts label.
>     sendto(3, "\214\302\1\0\0\1\0\0\0\0\0\0\5label\0\0\34\0\1", 23,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 23
>     sendto(3, "\355b\1\0\0\1\0\0\0\0\0\0\5label\0\0\1\0\1", 23,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 23
>     +++ exited with 2 +++
>     Command exited with non-zero status 2
>     real    0m 0.03s
>     user    0m 0.00s
>     sys     0m 0.00s
> 
> But if I allow it to use the `search`-path, the query is invalid:
> 
>     / # time strace -f -e trace=sendto,sendmsg,sendmmsg getent hosts label
>     sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>     sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>     sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>     sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>     +++ exited with 2 +++
>     Command exited with non-zero status 2
>     real    0m 10.01s
>     user    0m 0.00s
>     sys     0m 0.00s
> 
> We see it take 10s to time-out waiting for a reply from the DNS server
> that will never come (because the server ignored the query as
> malformed).  To annotate the queries a bit:
> 
>     Good request:
> 
>         sendto(3, "\214\302\1\0\0\1\0\0\0\0\0\0\5label\0\0\34\0\1",
> 23, MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 23
>                    [      header-section       [question-section]
>                                                [-----][][---][--]
>                                                ^      ^ ^    ^
>               QNAME[0] = octet[5]{"label"}  --'       | |    |
>               QNAME[1] = end  -----------------------'  |    |
>               QTYPE    = AAAA  ------------------------'     |
>               QCLASS   = IN  -------------------------------'
> 
>     Bad request (as seen by a parser)
> 
>         sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>                    [    header-section     [question-section ]
>                                            [-----][----------- - - -
>                                            ^      ^
>        QNAME[0] = octet[5]{"label"}  -----'       |
>        QNAME[1] = octet[46]{"\0\34\0\1\0"...}  --'
>        QNAME[n] = end  --------------------------------------- - - -
>        QTYPE    = ???  --------------------------------------- - - -
>        QCLASS   = ???  --------------------------------------- - - -
> 
>     Bad request (as seen by a human):
> 
>         sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24,
> MSG_NOSIGNAL, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 16) = 24
>                    [    header-section     [question-section ]
>                                            [-----]|[---][--][]
>                                            ^      ^^    ^   ^
>        QNAME[0] = octet[5]{"label"}  -----'       ||    |   |
>        QNAME[1] = should-be-end -----------------' |    |   |
>        QTYPE    = AAAA  --------------------------'     |   |
>        QCLASS   = IN  ---------------------------------'    |
>        garbage  = garbage  --------------------------------'
> 
> So there are 2 pieces of corruption going on here:
> 
>  1. Instead of getting the \0 terminator indicating that there are no
>     more labels in the QNAME, it gets an ASCII '.', indicating another
>     label of length 46.
>  2. An extra byte is allocated, which appears at the end of the
>     message.
> 
> I have verified that the error happens with:
> 
>  - Alpine 3.9's musl 1.1.20-r3 on x86_64
>  - Alpine 3.9's musl 1.1.20-r4 on x86_64
>  - Alpine edge's musl 1.1.21-r2 on x86_64
>  - Alpine edge's musl 1.1.22-r0 on x86_64

Yes, this is probably a bug, if search is expected to accept trailing
dots, which seems like reasonable-ish functionality. Around line 203
of lookup_name.c, we'd need to detect this case and replace the search
component with a zero-length one. I don't recall right off if we'd
also need to strip the . separating the query from the search
component; that depends on whether name_from_dns accepts a trailing
dot, which I think it does, so such stripping probably isn't needed.

Again, I think it's a really bad idea to configure your resolv.conf
like this. As you've done, it will repeat the same query twice in the
case of NxDomain, for no benefit. This will only happen for queries
with fewer than ndots dots in them, which, unless you've increased
ndots (which has a lot of other problems), will always be NxDomain.
And in the case where you have other nontrivial search components
*after* ".", it will produce a situation where appearance of new
domains in the global namespace will mask local names you might be
using.

I wonder if it would make more sense to just skip/ignore "." in the
search path...
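
A minimal sketch of the "skip `.`" idea, in Python for brevity rather
than musl's C.  `search_candidates` is a hypothetical helper, not a
function in musl, and the ordering it uses (search suffixes first for
names with fewer than `ndots` dots, then the bare name) is an
assumption made for illustration.

```python
def search_candidates(name, search, ndots=1):
    """Return the fully-qualified names to try, in order.

    Hypothetical sketch: a search entry consisting only of dots is
    skipped instead of being appended to the name.
    """
    if name.endswith("."):
        # A trailing dot means "already fully qualified": no search.
        return [name]
    out = []
    if name.count(".") < ndots:
        for domain in search:
            suffix = domain.rstrip(".")
            if not suffix:
                continue  # skip "." rather than building "name.."
            out.append(f"{name}.{suffix}.")
    out.append(name + ".")  # finally, the name as requested
    return out


if __name__ == "__main__":
    print(search_candidates("label", ["."]))
    print(search_candidates("label", [".", "whatever.com"]))
```

With `search .` alone, this produces exactly one candidate, so the
query is neither malformed nor repeated.  The trade-off is that the
position of `.` in the list is ignored: the bare name is always tried
last.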

Rich



* Re: Malformed DNS requests for single-label hostnames with `search .`
From: Luke Shumaker @ 2019-05-08 15:57 UTC
  To: Rich Felker; +Cc: musl

Rich Felker wrote:
> On Tue, May 07, 2019 at 12:29:43PM -0400, Luke Shumaker wrote:
> > In some scenarios, musl libc generates invalid DNS queries that are
> > discarded by the DNS server, particularly when `resolv.conf` says
> > `search .` and we attempt to resolve a single-label hostname.
> >
> >     / # cat /etc/resolv.conf
> >     search .
> >     nameserver 1.1.1.1
>
> Note that this is not a good idea, even if it weren't buggy, as it
> will just perform all your queries twice. If you don't want to search,
> omit the search option or leave it blank.

Use-case 1: On macOS, disabling `search`/`domain`-from-DHCP results in
`search .`.  musl isn't really used on macOS, but it _is_ used in
Alpine-on-Docker-on-macOS (by default, Docker just copies the
resolv.conf from the host).

Use-case 2: Since `search` defaults to `domain`, setting `domain .`
will also trigger the bug.  Setting `domain .` is a thing that is
explicitly mentioned as valid in the Linux man-pages project
<https://www.kernel.org/doc/man-pages/> page for resolv.conf.

Use-case 3: Some users wish to use a search suffix, but to try the
name-as-requested first and only then fall back to the suffix; they
can do this by explicitly listing `.` before the suffix, setting

    search . whatever.com

> > But if I allow it to use the `search`-path, the query is invalid:
> >
> >     / # time strace -f -e trace=sendto,sendmsg,sendmmsg getent hosts label
> >     sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24, ...
> >     sendto(3, "\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0", 24, ...
> >     sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24, ...
> >     sendto(3, "\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0", 24, ...
> >     +++ exited with 2 +++
> >     Command exited with non-zero status 2
> >     real    0m 10.01s
> >     user    0m 0.00s
> >     sys     0m 0.00s
> ...
> > So there are 2 pieces of corruption going on here:
> >
> >  1. Instead of getting the \0 terminator indicating that there are no
> >     more labels in the QNAME, it gets an ASCII '.', indicating another
> >     label of length 46.
> >  2. An extra byte is allocated, which appears at the end of the
> >     message.

Oh, I forgot to mention a 3rd piece of corruption: It submits the same
request (with the same ID!) twice.
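
The repeated IDs can be read straight off the trace: the ID is the
first two bytes of each payload, in network byte order.  A small
sketch (the variable names are made up here; the payloads are the four
from the strace output above):

```python
from collections import Counter

# The four payloads from the failing run, in the order they were sent.
payloads = [
    b"\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0",
    b"\16s\1\0\0\1\0\0\0\0\0\0\5label.\0\34\0\1\0",
    b"\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0",
    b"\363\365\1\0\0\1\0\0\0\0\0\0\5label.\0\1\0\1\0",
]

# The DNS ID field is the first 16 bits of the header, big-endian.
ids = [int.from_bytes(p[:2], "big") for p in payloads]
id_counts = Counter(ids)

if __name__ == "__main__":
    for query_id, count in id_counts.items():
        print(f"ID 0x{query_id:04x} sent {count} time(s)")
```

Only two distinct IDs appear across the four packets, and the payloads
sharing an ID are byte-for-byte identical.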

> Yes, this is probably a bug, if search is expected to accept trailing
> dots, which seems like reasonable-ish functionality. Around line 203
> of lookup_name.c, we'd need to detect this case and replace the search
> component with a zero-length one. I don't recall right off if we'd
> also need to strip the . separating the query from the search
> component; that depends on whether name_from_dns accepts a trailing
> dot, which I think it does, so such stripping probably isn't needed.

Thanks for the pointers!

> Again, I think it's a really bad idea to configure your resolv.conf
> like this. As you've done, it will repeat the same query twice in the
> case of NxDomain, for no benefit. This will only happen for queries
> with fewer than ndots dots in them, which, unless you've increased
> ndots (which has a lot of other problems), will always be NxDomain.
> And in the case where you have other nontrivial search components
> *after* ".", it will produce a situation where appearance of new
> domains in the global namespace will mask local names you might be
> using.
>
> I wonder if it would make more sense to just skip/ignore "." in the
> search path...

It looks like what GNU libc does is keep track of whether "." appeared
in `search`, and skip the no-suffix lookup if it reaches the end of
the list (tracking `root_on_list` in `res_query.c`).  On the other
hand, bind-tools will perform the lookup twice, as you described.
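
A rough sketch of the tracking approach described above, in Python.
This is modeled only on that description, not on glibc's source;
`candidates_tracking_root` is a hypothetical name, and `ndots`
handling is left out to keep the example short.

```python
def candidates_tracking_root(name, search):
    """Return the names to try, remembering whether "." was listed.

    Hypothetical sketch: "." in the search list means "try the bare
    name at this position", and the final bare lookup is then skipped.
    """
    if name.endswith("."):
        return [name]  # already fully qualified, no search
    bare = name + "."
    out = []
    root_on_list = False
    for domain in search:
        suffix = domain.rstrip(".")
        if suffix:
            candidate = f"{name}.{suffix}."
        else:
            root_on_list = True  # "." was listed explicitly
            candidate = bare
        if candidate not in out:
            out.append(candidate)
    if not root_on_list:
        out.append(bare)  # the usual no-suffix fallback
    return out


if __name__ == "__main__":
    print(candidates_tracking_root("label", ["."]))
    print(candidates_tracking_root("label", [".", "whatever.com"]))
    print(candidates_tracking_root("label", ["whatever.com"]))
```

Unlike simply ignoring `.`, this keeps its position in the list
meaningful, so `search . whatever.com` tries the bare name first, and
it still avoids sending the same query twice.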

-- 
Happy hacking,
~ Luke Shumaker

