Date: Fri, 24 Jun 2022 10:59:36 -0400
From: Rich Felker
To: Markus Geiger
Cc: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Message-ID: <20220624145936.GP7074@brightrain.aerifal.cx>
Subject: Re: [musl] [BUG] Non-FQDN domain resolving failure on musl-1.2.x

On Fri, Jun 24, 2022 at 12:28:24PM +0200, Markus Geiger wrote:
> Hej!
>
> First, I love MUSL (and alpine linux). Great project!
>
> We encountered a bug in our CI pipeline using alpine images in conjunction
> with AWS DNS servers - and it seems to be related to MUSL:
>
> $ curl -fsSL https://slack.com
> curl: (6) Could not resolve host: slack.com
>
> Usually that should return some HTML. It seems to affect only non-FQDN
> domains. As a workaround we use now full FQDN api.slack.com. But there is a
> bug in resolvement! It seems if an AAAA domain is queried over an IPV4
> IP/DNS and doesn’t not return a record the overall resolvement of the
> domain fails.

That's not non-FQDN. Non-FQDN would be "api" as short for api.slack.com.
slack.com is just the apex of a zone, but there's nothing special
about that for resolving; it's likely just a difference in the records
for it vs api, or something fishy the recursive nameserver you're
using is doing...

> *DEBUG LOG*
>
> We try several alpine images and musl libs on an EC2 host with docker and
> AWS DNS exclusivly:
>
> - alpine 3.12 with musl-1.1.24-r10 is last known to work
> - alpine 3.13 with musl-1.2.2-r1 starts failing (something introduced in
>   musl-1.2 ?)
> - current alpine 3.16 with current musl-1.2.3-r0 still fails
>
> alpine 3.12 with musl-1.1.24-r10 is last known to work (see string
> “success”)
>
> docker run -it --rm --dns=10.204.109.209 alpine:3.12 ash -c 'apk add
> curl bind-tools;set -x;curl -fsSL https://slack.com 1>/dev/null &&
> echo success;host -4 -AAAA slack.com;apk list | grep musl' ✓
> ns-watch-attribution-nonprod 12:13
> fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
> fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
> (1/21) Installing fstrm (0.6.0-r1)
> (2/21) Installing krb5-conf (1.0-r2)
> (3/21) Installing libcom_err (1.45.6-r0)
> (4/21) Installing keyutils-libs (1.6.1-r1)
> (5/21) Installing libverto (0.3.1-r1)
> (6/21) Installing krb5-libs (1.18.5-r0)
> (7/21) Installing json-c (0.14-r1)
> (8/21) Installing libgcc (9.3.0-r2)
> (9/21) Installing libstdc++ (9.3.0-r2)
> (10/21) Installing libprotobuf (3.12.2-r0)
> (11/21) Installing libprotoc (3.12.2-r0)
> (12/21) Installing protobuf-c (1.3.3-r1)
> (13/21) Installing libuv (1.38.1-r0)
> (14/21) Installing xz-libs (5.2.5-r1)
> (15/21) Installing libxml2 (2.9.14-r0)
> (16/21) Installing bind-libs (9.16.27-r1)
> (17/21) Installing bind-tools (9.16.27-r1)
> (18/21) Installing ca-certificates (20211220-r0)
> (19/21) Installing nghttp2-libs (1.41.0-r0)
> (20/21) Installing libcurl (7.79.1-r1)
> (21/21) Installing curl (7.79.1-r1)
> Executing busybox-1.31.1-r22.trigger
> Executing ca-certificates-20211220-r0.trigger
> OK: 20 MiB in 35 packages
> + curl -fsSL https://slack.com
> + echo success
> success
> + host -4 -AAAA slack.com
             ^^^^
This does not request AAAA. It (-A repeated redundantly 4 times)
requests ANY, which is deprecated. So the output is not terribly helpful
in figuring out what's going on.

Can you provide tcpdump of port 53 traffic when curl makes the query,
and/or full strace of the curl execution? This would show what wrong
responses the nameserver is giving that's causing curl to fail.
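
As a complement to the packet capture, the A and AAAA lookups can also be separated at the getaddrinfo level, without depending on host(1) flags. The following is only a minimal sketch under stated assumptions: the helper name lookup_by_family and the default port 443 are made up for illustration, and it assumes that the Python interpreter in the container resolves names through the C library's getaddrinfo (so through musl on Alpine), which should be the case for a stock build but is worth confirming.

```python
import socket
import sys

# Address families to try, one getaddrinfo() call each.
# The labels are illustrative only.
FAMILIES = {
    "A (AF_INET)": socket.AF_INET,
    "AAAA (AF_INET6)": socket.AF_INET6,
    "both (AF_UNSPEC)": socket.AF_UNSPEC,
}


def lookup_by_family(host, port=443):
    """Resolve host once per family; map each label to addresses or an error string.

    Hypothetical helper, not part of musl, curl, or any existing tool.
    """
    results = {}
    for label, family in FAMILIES.items():
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
            # info[4] is the sockaddr tuple; its first element is the address.
            results[label] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[label] = f"error: {exc}"
    return results


if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "slack.com"
    for label, outcome in lookup_by_family(host).items():
        print(f"{label}: {outcome}")
```

If the A-only lookup succeeds while the AF_UNSPEC lookup fails, that would be consistent with the AAAA response being the problem, but it would not show what the nameserver actually sent; the tcpdump or strace output is still needed for that.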