From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_MSPIKE_H2 autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 23182 invoked from network); 8 Oct 2022 07:07:27 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 8 Oct 2022 07:07:27 -0000 Received: (qmail 30331 invoked by uid 550); 8 Oct 2022 07:07:23 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 30296 invoked from network); 8 Oct 2022 07:07:23 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gmx.net; s=badeba3b8450; t=1665212831; bh=Uv2gZFASdob1tciBBbGGIpNNfoq53pemoG0jW/KDLQY=; h=X-UI-Sender-Class:Date:From:To:Cc:Subject:References:In-Reply-To; b=jHEgf059GbyAeYjo74egY1+SQgVQdiCrMMhB+nYeiRKpvTpSeF75ppwx/n3cUCiy1 1egRxgPDk0MWeSzo0JI8bd+SFIj8WiNb0B77DVpL0IQSZihzR+myI8+xd5Kf8KRQnx Bu+Qj0DTr3hsA5gMILIWZBMy8mtSm9nBraK8uStg= X-UI-Sender-Class: 01bb95c1-4bf8-414a-932a-4f6e2808ef9c Date: Sat, 8 Oct 2022 09:07:10 +0200 From: Markus Wichmann To: musl@lists.openwall.com Cc: Uwe =?iso-8859-1?Q?Kleine-K=F6nig?= Message-ID: <20221008070710.GA2442@voyager> References: <0175978a-476c-e5f3-1da0-12bb78de7f54@kleine-koenig.org> <20221007232540.GC29905@brightrain.aerifal.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-Provags-ID: V03:K1:KYVcI4wfE/W1OOcyFX0v7xbdDon+HJW0xb4068YhAllOcDX7zQW NuJXSlP5oFDl2W/oFSEB3CNmUJlvlKhQAwYr3yjPm7tDC9mjzuTc7veNLkkAezFgmsQ2NwI yx+fLLCheryzAl7djmK6Zss9gqOdlAfJUQuYZze6JVbx0lG6/iXjNh6U9SJ819236i++fa6 c/no370BVVSgbmiAnDuUA== X-UI-Out-Filterresults: notjunk:1;V03:K0:rA9DRZNkViQ=:9TQ1NJmBvVMDfKm9LSeVif yIdEaMpu8NbPPVwYpfYkvLJwzU/l/2YEZLMVRykiEd0jabmuWyUGDvzBYUj1aFZYJRH6YZZq3 mc/cpDTh2RSAaCzuXH5f4LBPUXrwRCN0MxGTn40iD/JbimUO1dZsHO6IOwKy0ELxgS3VFreKS 2IlpDsOWHQ59oLv55gnkvdzaJLj6tIfqZ0uioyrae0wa+tYxQ2t/p3cnXB8pkf2UCDvFgSJWL rjQtwyuOgzuPTtZAkjQ8JLkU5KtqO7wAM+DvTSL+8wxmgU6z3ITM9H7OdLD5NIM+1vit3pcBS XO+0+9i9FUwcSi953fu+9VMNlXJsDc3jMSwoxVOfFIakwDQtRHmkXjHOuMWIjTb5r5yGkEJne uG66dftm/qtpILjXk5Q/tUgcwZncgnul7rd8OPRj6paBhAO1s5kBxlGlRxWMjrtuYhcnuoM2H 8X5XyvXm9aUeUWwJp5LCcLV7+lHjIle9FbsiCDV2n6KCtIC83Ig1eGqX5zYWsWhrBI1xrJaQ5 gfgWOZ87rL/wgzVGmZ1WrUUzHpCGhzZaCFJOc75/MRQR5rmfxxz8zR3lRhtJ+2uziPzsfzzDz +DKzrlWqQ95DFjKzCpheZb+GQzWJm9kTBKcIqdH3ZMw3XrsLWB2MTx+AIuIKBOZUbwD5bYpvE j/2XfQitWmzRr+XATPbJle1tMfezw/SGT26d2BsOF7Pe9zFeN+/EktKufG3hdcuIw7koPwShC QiwGj1oWbQw123/1Ksor64gT9JutrCFZ7ZW/ZkAAAPwrcnHQpEiGoJXAqzvwnkQuu8oL3p5hr ofiPuxNzN7tEXjUlPJpfmXDq8iBBQXakZfGgn4ePr840qg6yuYiuWyZE6vOfzm6M5SFeLMnc6 KpgPuXZ1gsfDdXNBo5AK97VSba6/lE0hQolhVQLJz7q5ttCALX12ztoyh7x3iuJXUo8mu+ydy 5/jR6pDinuuhxg0OZLdz1k8jMojNkkNYa9uv6dnnRU3O3pHSs8yYzZ3ZQBgkhEib24h2mEsf3 2mIPqMIAfidXL8IdgoZ28UWfX2PKtA+StIYi3CEMNLvNnJ3Mb4dtaiicTLnkWuY86WsYcIj3M O2Ch1e0I/I9m/jWqYbDm76oE0Ij3J4j2nL/NWuiL9A3ZP80yXyjkoi/JS9ssA17dJJBJLN0Ec N8WwWMR1npWNDB+3T4j9MWjkI7XKzCtKh2ly3v8rGVzZMn7rQiS3TAt7p8OhqKpMC5ny230sH 4gNOk0f1SaKaCx3UDBcUmUtv5AM6J6OcGtOonLTVGP1J7WN69n7674nsaHvDlJsdoPWRhjGNa BtXSoRHR3p6AVu+FT67PDH2LxdAUprnaZNFo5KwyqIP+VX9MjsuScUBIB9ZqJQ0gqHfyoX5f4 5AGgkJT7KKCEOca6OIekV9qo2++2gPf8dOkaUd4b5cCMZdL84GQ1rk/iKgOY3iDNC6aVS9QAB xCHzxunfjFgJ4HDAKONCiL8DwuX1c/Ic1/Pw43ekaItPZ1umogyLg4PJC27B9c3Eo26UC3ETl GNcIIpMsL4z1tC5vRiwTKyNJGRKO9LOjesQiSd/PLHKeB Subject: Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC On Sat, Oct 08, 2022 at 02:37:30AM +0200, Uwe Kleine-K=F6nig wrote: > On 10/8/22 01:25, Rich Felker wrote: > > On Sat, Oct 08, 2022 at 01:04:25AM +0200, Uwe Kleine-K=F6nig wrote: > > > To improve the situation I suggest something like: > > > > > > diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c > > > index 614bf7864b48..78b3095fe959 100644 > > > --- a/src/network/res_mkquery.c > > > +++ b/src/network/res_mkquery.c > > > @@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int > > > class, int type, > > > struct timespec ts; > > > size_t l =3D strnlen(dname, 255); > > > int n; > > > + static unsigned int querycnt; > > > > > > if (l && dname[l-1]=3D=3D'.') l--; > > > if (l && dname[l-1]=3D=3D'.') return -1; > > > @@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int > > > class, int type, > > > > > > /* Make a reasonably unpredictable id */ > > > clock_gettime(CLOCK_REALTIME, &ts); > > > + /* force a different ID if mkquery was called twice during > > > the same tick */ > > > + ts.tv_nsec +=3D querycnt++; > > > id =3D ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff; > > > q[0] =3D id/256; > > > q[1] =3D id; > > > > > > > This isn't acceptable as-is because it introduces a data race. > > Huh. You mean if there is a race the pseudo random IDs are changed in a > platform dependent way? Doesn't sound sooo bad to me, but probably I'm t= oo > naive here. Also res_mkquery is documented to be not thread-safe. I didn= 't > check deeply, but I guess the counter should be moved to struct __res_st= ate. > (Assuming the manpage for res_mkquery also applies to musl.) A data race and a race condition are two different things. A data race is when the same memory location is accessed from different threads unsynchronized at the same time, and at least one of those is writing, and at least one of those is not atomic. Data races are undefined behavior in C (and as such must be avoided). Race conditions on the other hand are logic errors that occur when the code is failing to take into account changes to globally visible state made concurrently by other threads. If multiple threads increment the same variable at the same time, you have a data race and a race condition. If the threads are changed to use an atomic load and an atomic store instead, then the data race is alleviated, but the race condition remains. Notably, if you have one thread spinning in a loop until a global flag is set, and another thread setting that flag, you have a data race and no race condition. __res_mkquery() is used in __lookup_name(), which is used in getaddrinfo(), which notably *is* defined as thread-safe, so the concern stands. It could be remedied by simply replacing "querycnt++" with "a_inc(&querycnt)". However, the patch is still not desirable for reasons set out by the others. > > Despite your concerns, I updated the libc on my RE200 with the above pat= ch > now, and I cannot observe the failure any more. > Well, yes, it works around the bug in the original application. However, nothing (in theory) stops an application from generating thousands of queries at the same time and sending them simultaneously, and you are bound to have at least a few ID collisions in there. > Of course updating nslookup to use a better API is a nicer solution. > The API is fine, it just needs to be used correctly. Simple solution in this case would be to always overwrite the ID of the second query with one more than the ID of the original query, or its bitwise inverse or something. Something that is always different. res_mkquery() cannot know what other queries are outstanding at the time it creates the ID, only the application can do that. See name_from_dns() for an example. We do create multiple queries to the same server (potentially) and have to avoid ID collision manually. Ciao, Markus