From: g1pi@libero.it
To: musl@lists.openwall.com
Date: Sat, 17 Feb 2024 12:08:12 +0100
Subject: [musl] dns resolution failure in virtio-net guest

Hi all.

I stumbled on a weird instance of domain resolution failure in a
virtualization scenario involving a MUSL-based guest. A little
investigation turned up results that are puzzling, at least for me.

This is the scenario:

Host:
- debian 12 x86_64
- kernel 6.1.0-18-amd64, qemu 7.2
- caching nameserver listening on 127.0.0.1

Guest:
- void linux x86_64
- kvm acceleration
- virtio netdev, configured in (default) user-mode
- kernel 6.1.71_1, musl-1.1.24_20
- /etc/resolv.conf:
    nameserver 10.0.2.2       (the caching dns in the host)
    nameserver 192.168.1.123  (non existent)

In this scenario, "getent hosts example.com" consistently fails.

The problem vanishes when I do any of these:
- strace the command (!)
- replace 10.0.2.2 with another working dns across a physical
  cable/wifi (e.g. 192.168.1.1)
- remove the non-existent dns
- swap the nameservers in /etc/resolv.conf

I wrote a short test program (see below) to perform the same system
calls done by the MUSL resolver, and it turns out that

- when all sendto() calls are performed in short order, the (unique)
  response packet is never received

    $ ./a.out 10.0.2.2 192.168.1.123
    poll: 0 1 0
    recvfrom() -1
    recvfrom() -1

- if a short delay (16 msec) is inserted between the calls, all is fine

    $ ./a.out 10.0.2.2 delay 192.168.1.123
    poll: 1 1 1
    recvfrom() 45
    recvfrom() -1

The program's output is the same in several guests with different
kernel/libc combinations (linux/glibc, linux/musl, freebsd, openbsd).
Only when the emulated netdev was switched from virtio to pcnet did
the problem go away.

I guess that, when there is no delay between the sendto() calls, the
second one happens exactly while the kernel is receiving the response
packet, and the latter is silently dropped.
A short delay before the second sendto(), or a random delay in the
response (because the working dns is "far away"), apparently solves
the issue.

I don't know what the UDP standard mandates, and especially what should
happen when a packet is received on a socket at the exact time another
packet is sent out on the same socket.

If the kernel is allowed to drop the packet, then the MUSL resolver
could be modified to introduce some minimal delay between calls, at
least when retrying.

Otherwise, there could be a race condition in the network layer.
Perhaps in the host linux/kvm/qemu. Perhaps in virtio-net, since the
problem shows up in guests with different kernels, and only when they
use virtio-net; but it might just be that other emulated devices mask
the issue simply by adding a little overhead.

Please CC me in replies.

Best regards,
        g.b.

===== cut here =====

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Print a buffer, escaping non-printable bytes (and '\') in octal. */
static void dump(const char *s, size_t len) {
    while (len--) {
        char t = *s++;
        if (' ' <= t && t <= '~' && t != '\\')
            printf("%c", t);
        else
            printf("\\%o", t & 0xff);
    }
    printf("\n");
}

int main(int argc, char *argv[]) {
    int sock, rv, n;
    /* DNS query: A record for example.com */
    const char req[] =
        "\202\254\1\0\0\1\0\0\0\0\0\0\7example\3com\0\0\1\0\1";
    struct timespec delay_l = { 1, 0 }; /* 1 sec */
    struct pollfd pfs;
    struct sockaddr_in me = { 0 };

    sock = socket(AF_INET, SOCK_DGRAM | SOCK_CLOEXEC | SOCK_NONBLOCK,
                  IPPROTO_IP);
    assert(sock >= 0);

    /* Bind to any local address and an ephemeral port. */
    me.sin_family = AF_INET;
    me.sin_port = 0;
    me.sin_addr.s_addr = inet_addr("0.0.0.0");
    rv = bind(sock, (struct sockaddr *) &me, sizeof me);
    assert(0 == rv);

    /* Each argument is either a nameserver address or "delay". */
    for (n = 1; n < argc; n++) {
        if (0 == strcmp("delay", argv[n])) {
            struct timespec delay_s = { 0, (1 << 24) }; /* ~ 16 msec */
            nanosleep(&delay_s, NULL);
        } else {
            struct sockaddr_in dst = { 0 };
            dst.sin_family = AF_INET;
            dst.sin_port = htons(53);
            dst.sin_addr.s_addr = inet_addr(argv[n]);
            rv = sendto(sock, req, sizeof req - 1, MSG_NOSIGNAL,
                        (struct sockaddr *) &dst, sizeof dst);
            assert(rv >= 0);
        }
    }

    /* Give responses time to arrive, then check for readability. */
    nanosleep(&delay_l, NULL);
    pfs.fd = sock;
    pfs.events = POLLIN;
    rv = poll(&pfs, 1, 2000);
    printf("poll: %d %d %d\n", rv, pfs.events, pfs.revents);

    /* Try one (non-blocking) read per nameserver queried. */
    for (n = 1; n < argc; n++) {
        char resp[4000];
        if (0 == strcmp("delay", argv[n]))
            continue;
        rv = recvfrom(sock, resp, sizeof resp, 0, NULL, NULL);
        printf("recvfrom() %d\n", rv);
        if (rv > 0)
            dump(resp, rv);
    }
    return 0;
}

===== cut here =====
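
For readers who wonder what the req string in the test program above
contains: it is a hand-encoded DNS query in the RFC 1035 wire format
(12-byte header with the RD flag set, one question, QNAME as
length-prefixed labels, QTYPE A, QCLASS IN). The sketch below rebuilds
the same 29 bytes. Note that build_query() is a hypothetical helper
written for illustration only; it is not taken from musl or from the
test program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * build_query(): hypothetical helper, for illustration only.
 * Writes a DNS query for an A record (class IN) into out[] and returns
 * its length, or -1 if a label is invalid or the buffer is too small.
 */
static int build_query(unsigned char *out, size_t cap, uint16_t id,
                       const char *name)
{
    size_t pos = 0;

    assert(out != NULL && name != NULL);
    if (cap < 12)
        return -1;

    /* Header: ID, flags (only RD set), QDCOUNT = 1, other counts = 0 */
    out[pos++] = (unsigned char)(id >> 8);
    out[pos++] = (unsigned char)(id & 0xff);
    out[pos++] = 0x01;               /* flags, high byte: RD */
    out[pos++] = 0x00;               /* flags, low byte */
    out[pos++] = 0x00;
    out[pos++] = 0x01;               /* QDCOUNT = 1 */
    memset(out + pos, 0, 6);         /* ANCOUNT, NSCOUNT, ARCOUNT */
    pos += 6;

    /* QNAME: every label is preceded by its length byte */
    while (*name) {
        const char *dot = strchr(name, '.');
        size_t len = dot ? (size_t)(dot - name) : strlen(name);

        if (len == 0 || len > 63 || pos + 1 + len > cap)
            return -1;
        out[pos++] = (unsigned char)len;
        memcpy(out + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;
    }

    if (pos + 5 > cap)
        return -1;
    out[pos++] = 0;                  /* root label ends the name */
    out[pos++] = 0;
    out[pos++] = 1;                  /* QTYPE = A */
    out[pos++] = 0;
    out[pos++] = 1;                  /* QCLASS = IN */
    return (int)pos;
}
```

Calling build_query(buf, sizeof buf, 0x82ac, "example.com") should
yield the same bytes as req, so the test program could fill a buffer
this way to query other names.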