From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
From: Florian Weimer <fw@deneb.enyo.de>
To: Rich Felker <dalias@libc.org>
Cc: musl@lists.openwall.com
References: <20200413183522.GX11469@brightrain.aerifal.cx>
	<20200413190412.GF41308@straasha.imrryr.org>
	<20200413193505.GY11469@brightrain.aerifal.cx>
	<20200413214138.GG41308@straasha.imrryr.org>
	<20200414035303.GZ11469@brightrain.aerifal.cx>
	<87v9m0hdjk.fsf@mid.deneb.enyo.de>
	<20200415180149.GH11469@brightrain.aerifal.cx>
	<87imi0haf7.fsf@mid.deneb.enyo.de>
	<20200417034059.GF11469@brightrain.aerifal.cx>
	<878siucvqd.fsf@mid.deneb.enyo.de>
	<20200417160726.GG11469@brightrain.aerifal.cx>
Date: Sat, 18 Apr 2020 19:14:24 +0200
In-Reply-To: <20200417160726.GG11469@brightrain.aerifal.cx> (Rich Felker's
	message of "Fri, 17 Apr 2020 12:07:26 -0400")
Message-ID: <87o8ro67in.fsf_-_@mid.deneb.enyo.de>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [musl] TCP support in the stub resolver (was: Re: Outgoing DANE not working)

* Rich Felker:

> On Fri, Apr 17, 2020 at 11:22:34AM +0200, Florian Weimer wrote:
>> >> > However it's not clear how "fallback to tcp" logic should interact
>> >> > with such concurrent requests -- switch to tcp for everything and
>> >> > just one nameserver as soon as we get any TC response?
>> >> 
>> >> It's TCP for this query only, not all subsequent queries.  It makes
>> >> sense to query the name server that provided the TC response: It
>> >> reduces latency because that server is more likely to have the large
>> >> response in its cache.
>> >
>> > I'm not talking about future queries but other unfinished queries that
>> > are part of the same operation (presently just concurrent A and AAAA
>> > lookups).
>> 
>> If the second response has TC set (but not the first), you can keep
>> the first response.  Re-querying both over TCP increases the
>> likelihood that you get a response from the same cluster node (so more
>> consistency), but you won't get that over UDP, ever, so I don't think
>> it matters.
>> 
>> If the first response has TC set, you have an open TCP connection you
>> could use for the second query as well.  Pipelining of DNS requests
>> has compatibility issues because there is no application-layer
>> connection teardown (an equivalent to HTTP's Connection: close).  If
>> the server closes the connection after sending the response to the
>> first query, without reading the second, this is a TCP data loss
>> event, which results in an RST segment and potentially, loss of the
>> response to the first query.  Ideally, a client would wait for the
>> second UDP response and the TCP response to arrive.  If the second UDP
>> response is TC as well, the TCP query should be delayed until the
>> first TCP response came back.

> Indeed it sounds like one TCP connection would be needed per request,
> so switchover would just be per-request if done.

No, you can reuse the connection for the second query (in most cases).
However, for maximum robustness, you should not send the second query
until the first response has arrived (no pipelining).  You may still
need a new connection for the second query if the TCP stream ends
without a response, though.

> My leaning is probably not to do fallback at all (complex logic,
> potential for unexpected slowness, not needed by vast majority of
> users) and just add TCP support with option use-vc for users who
> really want complete replies. All of this would be contingent anyway
> on making internal mechanisms able to handle variable result size
> rather than fixed-size 512 bytes so it's not happening right away.
> Doing it carelessly would create possibly dangerous bugs.

I still think leaving out fallback is wrong.  The protocol says that you
must perform TCP fallback.  If you don't, the behavior of the libresolv
interfaces becomes rather confusing.
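
For reference, detecting that fallback is called for is cheap: TC is bit
0x02 of the third byte of the 12-byte DNS header (RFC 1035, section
4.1.1).  A minimal sketch; dns_is_truncated() is a made-up name, not an
existing musl or libresolv function:

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if msg holds a complete DNS header with
 * the TC (truncated) flag set, 0 otherwise.  Byte 2 of the header is
 * laid out as QR | Opcode(4 bits) | AA | TC | RD, so TC is mask 0x02. */
static int dns_is_truncated(const unsigned char *msg, size_t len)
{
    if (len < 12) return 0;   /* shorter than a DNS header */
    return (msg[2] & 0x02) != 0;
}
```

A resolver that sees 1 here for a UDP response would repeat that one
query over TCP, preferably to the server that sent the TC response.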

> I'm still also somewhat of the opinion that users who want a resolver
> library (res_* API) with lots of features should just link BIND's, but
> it would be nice not to have to do that.

You could drop the res_* interfaces from musl.  They are mostly needed
for non-address queries, and those are the ones that tend to be larger
than 512 bytes.

Then it's possible that no one would notice the missing TCP fallback.