mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: Sascha Braun <sascha.braun.lpz@googlemail.com>
Cc: musl@lists.openwall.com
Subject: Re: [musl] Problably Issue in name_from_dns // __res_msend_rc (lookup_name.c)
Date: Fri, 3 Jun 2022 08:27:04 -0400
Message-ID: <20220603122704.GZ7074@brightrain.aerifal.cx>
In-Reply-To: <CALATqH=0MkjYhTYcuMxpu+nKmQxn25N+gmQDha_E3EUdxnwWkQ@mail.gmail.com>

On Fri, Jun 03, 2022 at 12:28:40AM +0200, Sascha Braun wrote:
> Hi Rich,
> 
> I think I narrowed the thing down. Below is a dump of what happens in a
> 'normal' situation and what happened when the sporadic issue appeared.
> 
> Normal:
> begin lookup3...
> __syscall_send_internal 020000350808040400000000000000008.8.4.4:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d0000010001
> __syscall_send_internal 02000035d043dede0000000000000000208.67.222.222:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d0000010001
> __syscall_send_internal 020000350909090900000000000000009.9.9.9:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d0000010001
> __syscall_send_internal 020000350808040400000000000000008.8.4.4:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d00001c0001
> __syscall_send_internal 02000035d043dede0000000000000000208.67.222.222:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d00001c0001
> __syscall_send_internal 020000350909090900000000000000009.9.9.9:53]
> ecea010000010000000000000377777706676f6f676c6503636f6d00001c0001
> __syscall_recv begin EP[0.0.0.0:0]
> __syscall_recv'd_internal [020000000000000000000000000000009.9.9.9:53]
> ecea818000010001000000000377777706676f6f676c6503636f6d0000010001c00c00010001000000390004d83ad4840000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> __syscall_recv begin EP[9.9.9.9:53]
> __syscall_recv'd_internal [020000350909090900000000000000009.9.9.9:53]
> ecea818000010001000000000377777706676f6f676c6503636f6d00001c0001c00c001c00010000003400102a0014504001080800000000000020040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> connect:0200ffffd83ad4840000000000000000
> connect:0a00ffff000000002a00145040010808000000000000200400000000
> 
> So we sent the bytes
> ecea010000010000000000000377777706676f6f676c6503636f6d0000010001 and the
> other sequence (hex encoded, 2 characters per byte) to each of the DNS
> servers, and we received back one 'short' and one 'long' reply from 9.9.9.9.
> I guess the short one is IPv4 and the long one IPv6(?). That's the case with
> all successful lookups, i.e. the 99% that are OK: (at least) one short and
> (at least) one long.
> 
> Now the problematic one:
> 
> begin lookup1...
> __syscall_send_internal 020000350808040400000000000000008.8.4.4:53]
> 94d90100000100000000000003777777037765620264650000010001
> __syscall_send_internal 02000035d043dede0000000000000000208.67.222.222:53]
> 94d90100000100000000000003777777037765620264650000010001
> __syscall_send_internal 020000350909090900000000000000009.9.9.9:53]
> 94d90100000100000000000003777777037765620264650000010001
> __syscall_send_internal 020000350808040400000000000000008.8.4.4:53]
> 94d901000001000000000000037777770377656202646500001c0001
> __syscall_send_internal 02000035d043dede0000000000000000208.67.222.222:53]
> 94d901000001000000000000037777770377656202646500001c0001
> __syscall_send_internal 020000350909090900000000000000009.9.9.9:53]
> 94d901000001000000000000037777770377656202646500001c0001
> 
> __syscall_recv begin EP[0.0.0.0:0]
> __syscall_recv'd_internal [020000000000000000000000000000009.9.9.9:53]
> 94d981800001000100010000037777770377656202646500001c0001c00c000500010000004e000f0377777708672d68612d776562c014c02c00060001000000340031036e733102706f0675692d646e73c0140a686f73746d6173746572c04378860f1200002a3000000e1000093a800000003c000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> __syscall_recv begin EP[9.9.9.9:53]
> __syscall_recv'd_internal [020000350909090900000000000000008.8.4.4:53]
> 94d981800001000100010000037777770377656202646500001c0001c00c0005000100000114000f0377777708672d68612d776562c014c02c000600010000002c0031036e733102706f0675692d646e73c0140a686f73746d6173746572c04378860f1200002a3000000e1000093a800000003c000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> getaddrinfo1: Address in use
> 
> We received two 'long' responses, one from 9.9.9.9 and one from 8.8.4.4.
> 
> All occurrences of the problem show this constellation - two 'long'
> responses received.
> As a note, my implementation of recv does of course return the correct
> number of bytes received. The zeros you see come only from the dump
> function, which dumps the whole 512-byte buffer.
> 
> I hope this is helpful in some manner.
> 
> I came across this; it seems glibc had a similar issue (I did not look
> in depth, I just want to share the links):
> https://bugzilla.redhat.com/show_bug.cgi?id=1044628
> https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00321.html
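
For reference when reading the dumps quoted above: each hex line is a raw
DNS message whose first two bytes are the query ID and whose question
section ends with a two-byte QTYPE (1 for A, 28 for AAAA). A minimal C
sketch of pulling those two fields out of a dump line follows. The names
hex_decode and dns_summarize are hypothetical helpers, not musl functions,
and the parser assumes the question name is stored as plain uncompressed
labels.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Decode a hex string into out[]. Returns the number of bytes written,
 * or -1 if the input is malformed or does not fit in cap bytes. */
static int hex_decode(const char *hex, unsigned char *out, size_t cap)
{
	size_t n = strlen(hex);
	if (n % 2 || n / 2 > cap) return -1;
	for (size_t i = 0; i < n; i += 2) {
		int hi = hexval(hex[i]), lo = hexval(hex[i + 1]);
		if (hi < 0 || lo < 0) return -1;
		out[i / 2] = (unsigned char)(hi << 4 | lo);
	}
	return (int)(n / 2);
}

/* Hypothetical summary of the fields relevant to this thread. */
struct dns_summary {
	unsigned id;    /* 16-bit message ID from the header */
	unsigned qtype; /* QTYPE of the first question */
};

/* Extract the ID and the first question's QTYPE from a DNS message.
 * Returns 0 on success, -1 if the message is too short, has no
 * question, or uses a compressed question name. */
static int dns_summarize(const unsigned char *p, size_t len,
                         struct dns_summary *s)
{
	if (len < 12) return -1;                  /* fixed header is 12 bytes */
	if ((p[4] << 8 | p[5]) < 1) return -1;    /* QDCOUNT must be >= 1 */
	s->id = (unsigned)(p[0] << 8 | p[1]);

	size_t i = 12;                            /* question name starts here */
	while (i < len && p[i] != 0) {
		if (p[i] & 0xc0) return -1;       /* compression not handled */
		i += 1 + (size_t)p[i];            /* skip length byte + label */
	}
	if (i + 5 > len) return -1;               /* zero label + QTYPE + QCLASS */
	s->qtype = (unsigned)(p[i + 1] << 8 | p[i + 2]);
	return 0;
}
```

Applied to the two queries of the failing lookup, this gives ID 0x94d9 for
both, with QTYPE 1 and 28 respectively. Both replies dumped for that lookup
also decode to ID 0x94d9 with QTYPE 28.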

OK, I found your problem. It's that the query ids for both the A and
AAAA queries are the same, probably because you have a low-resolution or
non-working clock_gettime. If the host environment does not provide a
way to get a high-resolution clock, I think you should still add a
monotonically increasing increment to the nanoseconds on each call where
the host environment's time did not increase, so that the clock is
strictly monotonic. However, musl's resolver should also deal with this
case, since it's always possible to get identical query ids (with low
probability). We should just check if they're equal, and if so,
increment the second one. I'll write a patch to do this.
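
The two ideas above can be sketched in C roughly as follows. This is only
an illustration under stated assumptions, not the patch mentioned above:
strictly_monotonic_ns and make_ids_distinct are hypothetical names, the
clock wrapper is not thread-safe, and the 16-bit id is assumed to sit
big-endian in the first two bytes of each query buffer, as in the DNS wire
format.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical wrapper for a host with a coarse or stuck clock: returns
 * nanoseconds, and bumps the value by one whenever the underlying clock
 * did not advance, so successive results are strictly increasing.
 * Uses static state, so it is not thread-safe as written. */
static uint64_t strictly_monotonic_ns(void)
{
	static uint64_t last;
	struct timespec ts = {0};
	uint64_t now = 0;

	if (clock_gettime(CLOCK_REALTIME, &ts) == 0)
		now = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
	if (now <= last)
		now = last + 1; /* clock did not move forward: force progress */
	last = now;
	return now;
}

/* Read the 16-bit big-endian id at the start of a DNS query buffer. */
static unsigned get_id(const unsigned char *q)
{
	return (unsigned)(q[0] << 8 | q[1]);
}

/* If two queries ended up with the same id, increment the id of the
 * second one (wrapping at 16 bits) so replies can be told apart. */
static void make_ids_distinct(const unsigned char *q0, unsigned char *q1)
{
	if (q0[0] == q1[0] && q0[1] == q1[1]) {
		unsigned id = (get_id(q1) + 1) & 0xffff;
		q1[0] = (unsigned char)(id >> 8);
		q1[1] = (unsigned char)(id & 0xff);
	}
}
```

With the A and AAAA query buffers from the failing lookup, both starting
with 94 d9, make_ids_distinct would leave the first untouched and change
the second to start with 94 da.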

Rich

Thread overview: 12+ messages
2022-05-31 10:24 Sascha Braun
2022-06-01 14:14 ` Rich Felker
2022-06-01 20:35   ` Sascha Braun
2022-06-01 21:30     ` Rich Felker
2022-06-01 21:52       ` Sascha Braun
2022-06-02 14:25         ` Rich Felker
2022-06-02 22:28           ` Sascha Braun
2022-06-03 12:27             ` Rich Felker [this message]
2022-06-03 12:34               ` Rich Felker
2022-06-03 13:44                 ` Sascha Braun
2022-06-05 18:14                 ` Sascha Braun
2022-06-05 18:47                   ` Rich Felker
