mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Markus Wichmann <nullplan@gmx.net>
To: musl@lists.openwall.com
Cc: "Uwe Kleine-König" <uwe+openwrt@kleine-koenig.org>
Subject: Re: [musl] nslookup failures with coarse CLOCK_MONOTONIC
Date: Sat, 8 Oct 2022 09:07:10 +0200	[thread overview]
Message-ID: <20221008070710.GA2442@voyager> (raw)
In-Reply-To: <c77d5fd0-63b6-1156-7c44-f5ac2b816839@kleine-koenig.org>

On Sat, Oct 08, 2022 at 02:37:30AM +0200, Uwe Kleine-König wrote:
> On 10/8/22 01:25, Rich Felker wrote:
> > On Sat, Oct 08, 2022 at 01:04:25AM +0200, Uwe Kleine-König wrote:
> > > To improve the situation I suggest something like:
> > >
> > > diff --git a/src/network/res_mkquery.c b/src/network/res_mkquery.c
> > > index 614bf7864b48..78b3095fe959 100644
> > > --- a/src/network/res_mkquery.c
> > > +++ b/src/network/res_mkquery.c
> > > @@ -11,6 +11,7 @@ int __res_mkquery(int op, const char *dname, int
> > > class, int type,
> > >          struct timespec ts;
> > >          size_t l = strnlen(dname, 255);
> > >          int n;
> > > +       static unsigned int querycnt;
> > >
> > >          if (l && dname[l-1]=='.') l--;
> > >          if (l && dname[l-1]=='.') return -1;
> > > @@ -34,6 +35,8 @@ int __res_mkquery(int op, const char *dname, int
> > > class, int type,
> > >
> > >          /* Make a reasonably unpredictable id */
> > >          clock_gettime(CLOCK_REALTIME, &ts);
> > > +       /* force a different ID if mkquery was called twice during
> > > the same tick */
> > > +       ts.tv_nsec += querycnt++;
> > >          id = ts.tv_nsec + ts.tv_nsec/65536UL & 0xffff;
> > >          q[0] = id/256;
> > >          q[1] = id;
> > >
> >
> > This isn't acceptable as-is because it introduces a data race.
>
> Huh. You mean if there is a race the pseudo random IDs are changed in a
> platform dependent way? Doesn't sound sooo bad to me, but probably I'm too
> naive here. Also res_mkquery is documented to be not thread-safe. I didn't
> check deeply, but I guess the counter should be moved to struct __res_state.
> (Assuming the manpage for res_mkquery also applies to musl.)

A data race and a race condition are two different things. A data race
is when the same memory location is accessed from different threads
unsynchronized at the same time, and at least one of those is writing,
and at least one of those is not atomic. Data races are undefined
behavior in C (and as such must be avoided).

Race conditions on the other hand are logic errors that occur when the
code is failing to take into account changes to globally visible state
made concurrently by other threads. If multiple threads increment the
same variable at the same time, you have a data race and a race
condition. If the threads are changed to use an atomic load and an
atomic store instead, then the data race is alleviated, but the race
condition remains.

Notably, if you have one thread spinning in a loop until a global flag
is set, and another thread setting that flag, you have a data race and
no race condition.

__res_mkquery() is used in __lookup_name(), which is used in
getaddrinfo(), which notably *is* defined as thread-safe, so the concern
stands. It could be remedied by simply replacing "querycnt++" with
"a_inc(&querycnt)". However, the patch is still not desirable for
reasons set out by the others.

>
> Despite your concerns, I updated the libc on my RE200 with the above patch
> now, and I cannot observe the failure any more.
>

Well, yes, it works around the bug in the original application. However,
nothing (in theory) stops an application from generating thousands of
queries at the same time and sending them simultaneously, and you are
bound to have at least a few ID collisions in there.

> Of course updating nslookup to use a better API is a nicer solution.
>

The API is fine, it just needs to be used correctly. Simple solution in
this case would be to always overwrite the ID of the second query with
one more than the ID of the original query, or its bitwise inverse or
something. Something that is always different. res_mkquery() cannot know
what other queries are outstanding at the time it creates the ID, only
the application can do that. See name_from_dns() for an example. We do
create multiple queries to the same server (potentially) and have to
avoid ID collision manually.

Ciao,
Markus

      reply	other threads:[~2022-10-08  7:07 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-07 23:04 Uwe Kleine-König
2022-10-07 23:25 ` Rich Felker
2022-10-07 23:53   ` Jo-Philipp Wich
2022-10-08  0:05     ` Rich Felker
2022-10-08  0:37   ` Uwe Kleine-König
2022-10-08  7:07     ` Markus Wichmann [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221008070710.GA2442@voyager \
    --to=nullplan@gmx.net \
    --cc=musl@lists.openwall.com \
    --cc=uwe+openwrt@kleine-koenig.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).