mailing list of musl libc
* [musl] nslookup failures with coarse CLOCK_MONOTONIC
@ 2022-10-07 23:04 Uwe Kleine-König
  2022-10-07 23:25 ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Uwe Kleine-König @ 2022-10-07 23:04 UTC (permalink / raw)
  To: openwrt-devel, musl

Hello,

on a TP-Link RE200 v1 (platform = ramips/mt7620) I often see this failure:

  root@ares:~# nslookup www.openwrt.org
  Server:		127.0.0.1
  Address:	127.0.0.1:53

  Non-authoritative answer:
  www.openwrt.org	canonical name = wiki-01.infra.openwrt.org
  Name:	wiki-01.infra.openwrt.org
  Address: 2a03:b0c0:3:d0::1af1:1

  *** Can't find www.openwrt.org: No answer

I narrowed the problem down to the following:

nslookup creates and sends two queries (for A and AAAA) using
res_mkquery(). Each query has a more or less random ID, and nslookup
uses these IDs to match the received responses to the sent queries.

Looking at the sent queries using tcpdump, I saw that in the above 
scenario the two IDs are identical. Then nslookup matches the first 
received answer to the first query and discards the second reply, as 
it's matched to the already handled first query, too.

In a few cases where both lookups succeed, I saw the following pairs of IDs:

17372 37373
40961 60961
45955 419
47302 1766

Musl does the following to create the 16 bit ID:

          /* Make a reasonably unpredictable id */
          clock_gettime(CLOCK_REALTIME, &ts);
          id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
          q[0] = id/256;
          q[1] = id;

(from musl's src/network/res_mkquery.c) My hypothesis now is that the 
monotonic clock has a resolution of 20 µs only. So if the two 
res_mkquery() calls are called within the same 20 µs tick, the IDs end 
up being identical. If they happen in two consecutive ticks, the IDs 
have a delta of 20000 or 20001 which matches the four cases observed above.
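
The arithmetic behind that delta can be checked in isolation. The
following is only a sketch, not musl code: the helper names mkquery_id()
and id_delta() are made up for illustration, and the nanosecond value is
passed in as a parameter instead of being read from the clock. The
expression itself is the one quoted above, with the precedence ('+'
binds tighter than '&') written out explicitly.

```c
#include <stdint.h>

/* Same expression as in the quoted res_mkquery.c snippet, with the
 * implicit operator precedence made explicit. */
static uint16_t mkquery_id(unsigned long nsec)
{
	return (uint16_t)((nsec + nsec / 65536UL) & 0xffff);
}

/* Difference between the IDs of two timestamps, modulo 2^16. */
static uint16_t id_delta(unsigned long nsec_a, unsigned long nsec_b)
{
	return (uint16_t)(mkquery_id(nsec_b) - mkquery_id(nsec_a));
}
```

Two calls with the same nanosecond value trivially give the same ID.
For timestamps 20000 ns apart, id_delta(0, 20000) is 20000, while
id_delta(50000, 70000) is 20001 because the nsec/65536 term steps from
0 to 1 in between. The pair 45955/419 above is the wrapped case:
(419 - 45955) mod 65536 is 20000.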

To improve the situation I suggest something like:

diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c
index 614bf7864b48..78b3095fe959 100644
--- a/src/network/res_mkquery.c
+++ b/src/network/res_mkquery.c
@@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int class, int type,
         struct timespec ts;
         size_t l = strnlen(dname, 255);
         int n;
+       static unsigned int querycnt;

         if (l && dname[l-1]=='.') l--;
         if (l && dname[l-1]=='.') return -1;
@@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int class, int type,

         /* Make a reasonably unpredictable id */
         clock_gettime(CLOCK_REALTIME, &ts);
+       /* force a different ID if mkquery was called twice during the same tick */
+       ts.tv_nsec += querycnt++;
         id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
         q[0] = id/256;
         q[1] = id;

Would that make sense?

Note I'm not subscribed to the musl mailing list, so please Cc: me on 
replies.

Best regards
Uwe


* Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
  2022-10-07 23:04 [musl] nslookup failures with coarse CLOCK_MONOTONIC Uwe Kleine-König
@ 2022-10-07 23:25 ` Rich Felker
  2022-10-07 23:53   ` Jo-Philipp Wich
  2022-10-08  0:37   ` Uwe Kleine-König
  0 siblings, 2 replies; 6+ messages in thread
From: Rich Felker @ 2022-10-07 23:25 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: openwrt-devel, musl

On Sat, Oct 08, 2022 at 01:04:25AM +0200, Uwe Kleine-König wrote:
> Hello,
> 
> on a TP-Link RE200 v1 (platform = ramips/mt7620) I often see this failure:
> 
>  root@ares:~# nslookup www.openwrt.org
>  Server:		127.0.0.1
>  Address:	127.0.0.1:53
> 
>  Non-authoritative answer:
>  www.openwrt.org	canonical name = wiki-01.infra.openwrt.org
>  Name:	wiki-01.infra.openwrt.org
>  Address: 2a03:b0c0:3:d0::1af1:1
> 
>  *** Can't find www.openwrt.org: No answer
> 
> I narrowed the problem down to the following:
> 
> nslookup creates and sends two queries (for A and AAAA) using
> res_mkquery(). Each query has a more or less random ID, and nslookup
> uses these IDs to match the received responses to the sent queries.

This was fixed for the libc stub resolver in commit
6c858d6fd4df8b5498ef2cae66c8f3c3eff1587b, which is not present in any
release yet but in mainline git. However, it looks like you've hit it
with code directly using the res_* API, which would not get the fix.

> Looking at the sent queries using tcpdump, I saw that in the above
> scenario the two IDs are identical. Then nslookup matches the first
> received answer to the first query and discards the second reply, as
> it's matched to the already handled first query, too.
> 
> In a few cases where both lookups succeed, I saw the following pairs of IDs:
> 
> 17372 37373
> 40961 60961
> 45955 419
> 47302 1766
> 
> Musl does the following to create the 16 bit ID:
> 
>          /* Make a reasonably unpredictable id */
>          clock_gettime(CLOCK_REALTIME, &ts);
>          id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
>          q[0] = id/256;
>          q[1] = id;
> 
> (from musl's src/network/res_mkquery.c) My hypothesis now is that
> the monotonic clock has a resolution of 20 µs only. So if the two
> res_mkquery() calls are called within the same 20 µs tick, the IDs
> end up being identical. If they happen in two consecutive ticks, the
> IDs have a delta of 20000 or 20001 which matches the four cases
> observed above.
> 
> To improve the situation I suggest something like:
> 
> diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c
> index 614bf7864b48..78b3095fe959 100644
> --- a/src/network/res_mkquery.c
> +++ b/src/network/res_mkquery.c
> @@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int class, int type,
>         struct timespec ts;
>         size_t l = strnlen(dname, 255);
>         int n;
> +       static unsigned int querycnt;
> 
>         if (l && dname[l-1]=='.') l--;
>         if (l && dname[l-1]=='.') return -1;
> @@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int class, int type,
> 
>         /* Make a reasonably unpredictable id */
>         clock_gettime(CLOCK_REALTIME, &ts);
> +       /* force a different ID if mkquery was called twice during the same tick */
> +       ts.tv_nsec += querycnt++;
>         id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
>         q[0] = id/256;
>         q[1] = id;
> 
> Would that make sense?
> 
> Note I'm not subscribed to the musl mailing list, so please Cc: me
> on replies.

This isn't acceptable as-is because it introduces a data race.

That could be fixed in various ways, but I'm not sure if it's even our
responsibility to fix it. If a caller of res_mkquery is going to send
multiple queries on the same source port, it really needs to be making
sure on its own that they have distinct query IDs. Any 16-bit
random-ish identity is going to have collisions given enough attempts,
even if the clock is not low-resolution. Being time-based probably
makes it slightly less bad than pure random here, but I still suspect
it's a problem. The res_mkquery function simply doesn't and can't know
when you're going to use the results and what domain their IDs need to
be unique in.

My view of the randomness here is that it wasn't put in to avoid
collisions (space is too small) but to help (along with random ports)
make spoofing results less successful.

Which implementation of nslookup is this? Busybox? It would probably
be useful to hear thoughts on it from their side.

Rich


* Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
  2022-10-07 23:25 ` Rich Felker
@ 2022-10-07 23:53   ` Jo-Philipp Wich
  2022-10-08  0:05     ` Rich Felker
  2022-10-08  0:37   ` Uwe Kleine-König
  1 sibling, 1 reply; 6+ messages in thread
From: Jo-Philipp Wich @ 2022-10-07 23:53 UTC (permalink / raw)
  To: Rich Felker, Uwe Kleine-König; +Cc: openwrt-devel, musl


Hi,

> [...]
> Which implementation of nslookup is this? Busybox? It would probably
> be useful to hear thoughts on it from their side.
assuming the OP is using standard OpenWrt nslookup, it is the "big" busybox
nslookup implementation, which is using the res_*() api and name lookup logic
borrowed from musl libc instead of the original "small" version fiddling with
the `_res` state directly (and being broken on musl libc due to that).

The proper course of action here is likely adapting the solution in
6c858d6fd4df8b5498ef2cae66c8f3c3eff1587b and porting it to the busybox "big"
nslookup code itself.

I agree that musl libc itself cannot do much more to ensure uniqueness of the
IDs generated by res_mkquery() and that it should be solved in the application
code itself in this case.

Regards,
Jo


* Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
  2022-10-07 23:53   ` Jo-Philipp Wich
@ 2022-10-08  0:05     ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2022-10-08  0:05 UTC (permalink / raw)
  To: Jo-Philipp Wich; +Cc: Uwe Kleine-König, openwrt-devel, musl

On Sat, Oct 08, 2022 at 01:53:29AM +0200, Jo-Philipp Wich wrote:
> Hi,
> 
> > [...]
> > Which implementation of nslookup is this? Busybox? It would probably
> > be useful to hear thoughts on it from their side.
> assuming the OP is using standard OpenWrt nslookup, it is the "big" busybox
> nslookup implementation, which is using the res_*() api and name lookup logic
> borrowed from musl libc instead of the original "small" version fiddling with
> the `_res` state directly (and being broken on musl libc due to that).
> 
> The proper course of action here is likely adapting the solution in
> 6c858d6fd4df8b5498ef2cae66c8f3c3eff1587b and porting it to the busybox "big"
> nslookup code itself.
> 
> I agree that musl libc itself cannot do much more to ensure uniqueness of the
> IDs generated by res_mkquery() and that it should be solved in the application
> code itself in this case.

While it won't be as fast (not parallel unless you do threads) it
might be worth just using res_send in the Busybox "big" nslookup. At
present neither busybox nor musl supports TCP fallback for large
records, but the next release of musl will, and the fact that it's
using its own query loop with UDP derived from the musl code means it
won't get that benefit.

Alternatively, busybox could copy our parallel TCP fallback code if
they like. :-)

Rich


* Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
  2022-10-07 23:25 ` Rich Felker
  2022-10-07 23:53   ` Jo-Philipp Wich
@ 2022-10-08  0:37   ` Uwe Kleine-König
  2022-10-08  7:07     ` Markus Wichmann
  1 sibling, 1 reply; 6+ messages in thread
From: Uwe Kleine-König @ 2022-10-08  0:37 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Hello Rich,

On 10/8/22 01:25, Rich Felker wrote:
> On Sat, Oct 08, 2022 at 01:04:25AM +0200, Uwe Kleine-König wrote:
>>
>> Musl does the following to create the 16 bit ID:
>>
>>           /* Make a reasonably unpredictable id */
>>           clock_gettime(CLOCK_REALTIME, &ts);
>>           id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
>>           q[0] = id/256;
>>           q[1] = id;
>>
>> (from musl's src/network/res_mkquery.c) My hypothesis now is that
>> the monotonic clock has a resolution of 20 µs only. So if the two
>> res_mkquery() calls are called within the same 20 µs tick, the IDs
>> end up being identical. If they happen in two consecutive ticks, the
>> IDs have a delta of 20000 or 20001 which matches the four cases
>> observed above.
>>
>> To improve the situation I suggest something like:
>>
>> diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c
>> index 614bf7864b48..78b3095fe959 100644
>> --- a/src/network/res_mkquery.c
>> +++ b/src/network/res_mkquery.c
>> @@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int class, int type,
>>          struct timespec ts;
>>          size_t l = strnlen(dname, 255);
>>          int n;
>> +       static unsigned int querycnt;
>>
>>          if (l && dname[l-1]=='.') l--;
>>          if (l && dname[l-1]=='.') return -1;
>> @@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int class, int type,
>>
>>          /* Make a reasonably unpredictable id */
>>          clock_gettime(CLOCK_REALTIME, &ts);
>> +       /* force a different ID if mkquery was called twice during the same tick */
>> +       ts.tv_nsec += querycnt++;
>>          id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
>>          q[0] = id/256;
>>          q[1] = id;
>>
>> Would that make sense?
> 
> This isn't acceptable as-is because it introduces a data race.

Huh. You mean if there is a race the pseudo random IDs are changed in a 
platform dependent way? Doesn't sound sooo bad to me, but probably I'm 
too naive here. Also res_mkquery is documented to be not thread-safe. I 
didn't check deeply, but I guess the counter should be moved to struct 
__res_state. (Assuming the manpage for res_mkquery also applies to musl.)

Despite your concerns, I updated the libc on my RE200 with the above 
patch now, and I cannot observe the failure any more.

Of course updating nslookup to use a better API is a nicer solution.

> Which implementation of nslookup is this? Busybox? It would probably
> be useful to hear thoughts on it from their side.

Jo already replied here, nothing to add from my side.

BTW, glibc also generates the IDs from the monotonic clock only. It 
makes a bit bigger effort to scramble the bits in there, but if the time 
doesn't advance, it also generates identical IDs.

Best regards and thanks for your quick response
Uwe


* Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
  2022-10-08  0:37   ` Uwe Kleine-König
@ 2022-10-08  7:07     ` Markus Wichmann
  0 siblings, 0 replies; 6+ messages in thread
From: Markus Wichmann @ 2022-10-08  7:07 UTC (permalink / raw)
  To: musl; +Cc: Uwe Kleine-König

On Sat, Oct 08, 2022 at 02:37:30AM +0200, Uwe Kleine-König wrote:
> On 10/8/22 01:25, Rich Felker wrote:
> > On Sat, Oct 08, 2022 at 01:04:25AM +0200, Uwe Kleine-König wrote:
> > > To improve the situation I suggest something like:
> > >
> > > diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c
> > > index 614bf7864b48..78b3095fe959 100644
> > > --- a/src/network/res_mkquery.c
> > > +++ b/src/network/res_mkquery.c
> > > @@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int class, int type,
> > >          struct timespec ts;
> > >          size_t l = strnlen(dname, 255);
> > >          int n;
> > > +       static unsigned int querycnt;
> > >
> > >          if (l && dname[l-1]=='.') l--;
> > >          if (l && dname[l-1]=='.') return -1;
> > > @@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int class, int type,
> > >
> > >          /* Make a reasonably unpredictable id */
> > >          clock_gettime(CLOCK_REALTIME, &ts);
> > > +       /* force a different ID if mkquery was called twice during the same tick */
> > > +       ts.tv_nsec += querycnt++;
> > >          id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
> > >          q[0] = id/256;
> > >          q[1] = id;
> > >
> >
> > This isn't acceptable as-is because it introduces a data race.
>
> Huh. You mean if there is a race the pseudo random IDs are changed in a
> platform dependent way? Doesn't sound sooo bad to me, but probably I'm too
> naive here. Also res_mkquery is documented to be not thread-safe. I didn't
> check deeply, but I guess the counter should be moved to struct __res_state.
> (Assuming the manpage for res_mkquery also applies to musl.)

A data race and a race condition are two different things. A data race
is when the same memory location is accessed from different threads
unsynchronized at the same time, and at least one of those is writing,
and at least one of those is not atomic. Data races are undefined
behavior in C (and as such must be avoided).

Race conditions on the other hand are logic errors that occur when the
code is failing to take into account changes to globally visible state
made concurrently by other threads. If multiple threads increment the
same variable at the same time, you have a data race and a race
condition. If the threads are changed to use an atomic load and an
atomic store instead, then the data race is eliminated, but the race
condition remains.

Notably, if you have one thread spinning in a loop until a global flag
is set, and another thread setting that flag, you have a data race and
no race condition.

__res_mkquery() is used in __lookup_name(), which is used in
getaddrinfo(), which notably *is* defined as thread-safe, so the concern
stands. It could be remedied by simply replacing "querycnt++" with
"a_inc(&querycnt)". However, the patch is still not desirable for
reasons set out by the others.
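
To make the atomic variant concrete, here is a rough sketch using C11
atomics. It is an assumption-laden illustration, not musl code: a_inc()
is a musl-internal primitive, and <stdatomic.h> is used here only as a
portable stand-in. The names "querycnt_sketch" and "next_query_salt"
are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the patch's "querycnt" counter. */
static atomic_uint querycnt_sketch;

/* Atomically increments the counter and returns its previous value.
 * The read-modify-write is a single atomic operation, so concurrent
 * callers do not form a data race on the counter. */
static unsigned next_query_salt(void)
{
	return atomic_fetch_add_explicit(&querycnt_sketch, 1u,
	                                 memory_order_relaxed);
}
```

Relaxed ordering is enough in this sketch because the counter does not
guard any other data. Note that this only removes the undefined
behavior; it does nothing about the uniqueness problem discussed below.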

>
> Despite your concerns, I updated the libc on my RE200 with the above patch
> now, and I cannot observe the failure any more.
>

Well, yes, it works around the bug in the original application. However,
nothing (in theory) stops an application from generating thousands of
queries at the same time and sending them simultaneously, and you are
bound to have at least a few ID collisions in there.
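
For a rough sense of scale, the birthday problem gives the collision
odds. The sketch below assumes IDs drawn uniformly at random from all
65536 values, which is an idealization (the time-based IDs are not
uniform), and the function name is made up for illustration.

```c
/* Probability that n IDs drawn uniformly at random from 65536 possible
 * values contain at least one duplicate. */
static double id_collision_probability(unsigned n)
{
	double p_distinct = 1.0;

	/* Multiply the chances that each new ID misses all earlier ones. */
	for (unsigned i = 0; i < n && p_distinct > 0.0; i++)
		p_distinct *= (65536.0 - i) / 65536.0;

	return 1.0 - p_distinct;
}
```

Under that assumption two queries collide with probability 1/65536,
roughly 300 outstanding queries already have about even odds of
containing a duplicate, and at 1000 queries a duplicate is all but
certain.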

> Of course updating nslookup to use a better API is a nicer solution.
>

The API is fine, it just needs to be used correctly. Simple solution in
this case would be to always overwrite the ID of the second query with
one more than the ID of the original query, or its bitwise inverse or
something. Something that is always different. res_mkquery() cannot know
what other queries are outstanding at the time it creates the ID, only
the application can do that. See name_from_dns() for an example. We do
create multiple queries to the same server (potentially) and have to
avoid ID collision manually.
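
On the application side, the "one more than the first ID" idea could
look roughly like this. It is a minimal sketch, not taken from busybox
or musl: the helper names are hypothetical, and it relies only on the
ID being stored big-endian in the first two bytes of each query, which
is how the snippet quoted earlier writes it (q[0] = id/256; q[1] = id).

```c
/* Read the 16-bit ID from the first two bytes of a DNS query. */
static unsigned query_id(const unsigned char *q)
{
	return ((unsigned)q[0] << 8) | q[1];
}

/* Hypothetical helper: overwrite the ID of the second query with the
 * first query's ID plus one (mod 2^16), so the two always differ. */
static void set_successor_id(const unsigned char *first,
                             unsigned char *second)
{
	unsigned id = (query_id(first) + 1u) & 0xffffu;

	second[0] = (unsigned char)(id >> 8);
	second[1] = (unsigned char)(id & 0xffu);
}
```

The application would call set_successor_id() on the two buffers
filled by res_mkquery() before sending either of them. With more than
two outstanding queries the same idea extends to assigning consecutive
IDs starting from the first one.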

Ciao,
Markus
