Date: Thu, 17 Feb 2022 11:05:01 -0500
From: Rich Felker
To: Satadru Pramanik
Cc: musl@lists.openwall.com
Message-ID: <20220217160501.GS7074@brightrain.aerifal.cx>
References: <20220216213335.GO7074@brightrain.aerifal.cx>
 <20220217132434.GP7074@brightrain.aerifal.cx>
 <20220217134651.GQ7074@brightrain.aerifal.cx>
 <20220217155351.GR7074@brightrain.aerifal.cx>
In-Reply-To: <20220217155351.GR7074@brightrain.aerifal.cx>
Subject: Re: [musl] Re: musl getaddr info breakage on older kernels

On Thu, Feb 17, 2022 at 10:53:52AM -0500, Rich Felker wrote:
> On Thu, Feb 17, 2022 at 09:49:45AM -0500, Satadru Pramanik wrote:
> > Apologies for not being as familiar with gdb as I ought to be.
> > I used the __clock_gettime64 breakpoint and did a backtrace and finish
> > repeatedly.
> > I couldn't figure out how to best get the timespec struct info.
> >
> > Alternately if you want to throw out a sample test program for me to build
> > and run, and what gdb commands to run to get the right info, happy to do
> > that too.
> >
> > gdb output is attached.
>
> If gdb reported it correctly, clock_gettime returned 403, which should
> be impossible. It can only return 0 or -1. Incidentally, 403 is the
> syscall number for SYS_clock_gettime64, which suggests your kernel is
> simply *returning the syscall number* instead of -ENOSYS for syscalls
> that don't exist on it. Is this a stock kernel (3.8 IIRC) or does it
> have any sort of weird vendor patching? Any LSMs loaded?
>
> If you'd like to run a test just to make sure we're accurately seeing
> what's happening, the attached should work. It should print 0 followed
> by the current time in seconds and nanoseconds.

It looks like you hit the bug introduced in commit
554086d85e71f30abe46fc014fea31929a7c6a8a and fixed in commit
8142b215501f8b291a108a202b3a053a265b03dd. Since the former was a CVE
fix, somebody apparently backported it to the kernel you're using but
failed to backport the fix-for-the-fix, so you have a kernel that
behaves dangerously incorrectly for syscall numbers it's unaware of:
it reports a bogus success instead of failing with -ENOSYS.

This really needs to be fixed in the kernel if you can manage it. On
our side (musl) we probably need to find out if such kernels are
actually out in the wild, and if so, whether there's any reasonable
way to detect the false success and treat it as failure.

> > On Thu, Feb 17, 2022 at 8:46 AM Rich Felker wrote:
> >
> > > On Thu, Feb 17, 2022 at 08:30:47AM -0500, Satadru Pramanik wrote:
> > > > *This is a failure:*
> > > > tcpdump -i any -vvv host 192.168.0.115
> > > > tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
> > > > 08:29:38.043849 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 56)
> > > >     192.168.0.115.60625 > office.lan.53: [udp sum ok] 0+ A? google.com. (28)
> > > > 08:29:38.044237 IP (tos 0x0, ttl 64, id 11463, offset 0, flags [DF], proto UDP (17), length 72)
> > > >     office.lan.53 > 192.168.0.115.60625: [bad udp cksum 0x820a -> 0x5c7d!] 0 q: A? google.com. 1/0/0 google.com. [2m15s] A 142.250.80.110 (44)
> > > > 08:29:38.047754 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 56)
> > > >     192.168.0.115.60625 > office.lan.53: [udp sum ok] 0+ AAAA? google.com. (28)
> > > > 08:29:38.048078 IP (tos 0x0, ttl 64, id 11464, offset 0, flags [DF], proto UDP (17), length 84)
> > > >     office.lan.53 > 192.168.0.115.60625: [bad udp cksum 0x8216 -> 0xb42f!] 0 q: AAAA? google.com. 1/0/0 google.com. [4m26s] AAAA 2607:f8b0:4006:80d::200e (56)
> > > > 08:29:38.048955 IP (tos 0xc0, ttl 64, id 59728, offset 0, flags [none], proto ICMP (1), length 112)
> > > >     192.168.0.115 > office.lan: ICMP 192.168.0.115 udp port 60625 unreachable, length 92
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > OK, this shows that the client has requested both answers and the
> > > nameserver replied almost immediately (about 0.5ms later), but when
> > > the second reply arrives (to the AAAA), the client has already closed
> > > the listening port, despite only a few ms having passed. The only way
> > > I see this could happen is by "timing out". This suggests that
> > > something is wrong with telling time.
> > >
> > > Can you either put a breakpoint in __clock_gettime64 (this is the name
> > > you have to use for a breakpoint -- sorry I messed it up last time)
> > > and then see what it returns when you "finish" it and what's in the
> > > timespec struct after that? Or just write a test program to call
> > > clock_gettime(CLOCK_REALTIME, &ts) (note: you do NOT need or want to
> > > use the time64 symbol name here) and print the results (return value
> > > and contents of the timespec struct).
> > >
> > >
> > >
> > > > IP (tos 0x0, ttl 64, id 11464, offset 0, flags [DF], proto UDP (17), length 84)
> > > >     office.lan.53 > 192.168.0.115.60625: [udp sum ok] 0 q: AAAA? google.com.
> > > >     1/0/0 google.com. [4m26s] AAAA 2607:f8b0:4006:80d::200e (56)
> > > > 08:29:39.476101 IP (tos 0x0, ttl 64, id 12690, offset 0, flags [DF], proto TCP (6), length 52)
> > > >     192.168.0.115.51204 > lga34s35-in-f3.1e100.net.80: Flags [.], cksum 0xa666 (correct), seq 1466707759, ack 3358943837, win 115, options [nop,nop,TS val 198422160 ecr 2351261566], length 0
> > > > 08:29:39.478914 IP (tos 0x80, ttl 122, id 6227, offset 0, flags [none], proto TCP (6), length 52)
> > > >     lga34s35-in-f3.1e100.net.80 > 192.168.0.115.51204: Flags [.], cksum 0xa5b7 (correct), seq 1, ack 1, win 282, options [nop,nop,TS val 2351306585 ecr 198377148], length 0
> > > > ^C
> > > > 7 packets captured
> > > > 7 packets received by filter
> > > > 0 packets dropped by kernel
> > >
> >
> #include <stdio.h>
> #include <time.h>
> int main()
> {
>     struct timespec ts;
>     printf("%d", clock_gettime(CLOCK_REALTIME, &ts));
>     printf(" %lld %.9ld\n", (long long)ts.tv_sec, ts.tv_nsec);
> }