From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Requirements for new dns backend, factoring considerations
Date: Sun, 1 Jun 2014 10:53:52 -0400
Message-ID: <20140601145352.GF507@brightrain.aerifal.cx>
References: <20140601063103.GA12091@brightrain.aerifal.cx> <538B0C50.8060005@skarnet.org>
In-Reply-To: <538B0C50.8060005@skarnet.org>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sun, Jun 01, 2014 at 12:19:44PM +0100, Laurent Bercot wrote:
>
>  Hi Rich,
>  Great work, as usual.
>
>
> >The problem however is implementing this on top of something that
> >looks like res_send. Even if not for search paths, res_search and
> >res_query need parallel A and AAAA queries, whereas res_send has no
> >means to request both.
> >We could imagine implementing res_send on top
> >of a hypothetical "res_multisend" with a count of 1. While this would
> >work, it's not a very friendly interface for implementing res_search
> >or res_query since they would have to provide a large number of
> >pre-generated query packets (6 search domains * 2 RR types * up to 280
> >bytes per query packet = 3360 bytes of stack usage) despite it being
> >trivial to generate the Nth packet in just 280 bytes of storage. The
> >storage requirements for storing all the results are even worse
> >(6*2*512 = 6144) compared to what's actually needed (2*512 = 1024).
>
>  I actually never thought about that. Since s6-dns stores answers in the
> heap, it doesn't have to pre-allocate storage for them, so it happily
> sends everything in parallel.

Well, the important part is the same: dirty pages. Whether it's on the
heap or the stack, an extra 9k of temp data (3360 + 6144 = 9504 bytes)
means touching 2-3 extra pages that may have previously been untouched.
If this just happens at startup, the memory usage persists for the rest
of the process's lifetime and it's essentially wasted.

Also the 9k is just _additional_ here. There are already several
0.5k-1k buffers for accessing files, storing address results, storing
the canonical name, etc. and it quickly adds up. It doesn't reach the
threshold where I'd say "this isn't reasonable to assume we have
available on the stack" but it's a cost, and it probably makes
getaddrinfo "musl's biggest stack user" by a nontrivial margin
(otherwise printf with float is the biggest).

> >The alternative I see is some sort of "res_multisend" that uses a
> >callback to let the caller generate the Nth packet and a callback to
> >notify the caller of replies. Then res_send could be implemented with
> >callbacks that just feed in and save out the single query/response.
> >And res_search would generate all the query packets but only save the
> >"current best match" for each address family.
>
>  That sounds reasonable.

It's definitely reasonable from an efficiency standpoint, but it's
also the most complex approach (storing the working set in structs and
passing a context back and forth, how to pass buffers back and forth
without wasteful copying, etc.), and possibly has the largest code
size too.

> >As another alternative, we could drop the goal of doing search
> >suffixes in parallel.
>
>  The best choice depends on your timeout values and retry policy.

Current retry time is 1s and failure timeout is 5s. These should be
configurable, but to do that, we need a way of reinterpreting
resolv.conf's timeout settings for the way musl does things: musl does
not wait for one nameserver to time out and then fall back to the next,
but queries them in parallel. (I know some people have doubts about
this, but in practice it results in a massive performance improvement
for resolving, especially if you have several nameservers with
different latency and caching properties, such as localhost,
isp-nameserver, and 8.8.8.8.)

> So if your timeouts are very short, sure, serial search will work. But

Timeout is not really the relevant factor unless your nameservers are
misconfigured. A properly configured nameserver returns a negative
response rather than just timing out. It might not cache negative
results (or might not cache them for long), so the request may have
higher latency than a typical request because the whole recursive
lookup is repeated, but it should still be nowhere near as long as a
timeout.

> >negligible, I think) and make a trivial res_multisend that does N(=2)
> >queries in parallel using packets provided by the caller.
>
>  That's what the synchronous s6-dns resolver does, and I think that's
> the least amount of parallelism that's still acceptable in a v4+v6 world.

That's what musl's current implementation does too, and this is not a
feature I'd want to drop. In fact I consider it essential to adoption
of IPv6; otherwise everyone will disable IPv6 because it "makes DNS
lookups slower".

Rich
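
For reference, the callback-based "res_multisend" idea discussed above
could look roughly like the sketch below. This is only an illustration
of the interface shape, not musl code: the names multisend_sketch,
gen_query, got_reply, struct search_ctx and demo_search are all
hypothetical, the queries are readable text rather than DNS wire
format, and the "transport" is a loop that echoes each query back in
reverse order to mimic replies arriving out of order. The point is the
storage: one 280-byte scratch query buffer, and one saved 512-byte
reply per address family (2*512) instead of one per query (6*2*512).

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

#define MAX_QUERY 280   /* per-query bound quoted in the thread */
#define MAX_REPLY 512   /* per-reply bound quoted in the thread */

/* Hypothetical callback types: generate the Nth query, receive a reply. */
typedef int (*gen_query_fn)(void *ctx, int n, unsigned char *buf, size_t len);
typedef void (*got_reply_fn)(void *ctx, int n, const unsigned char *r, size_t len);

/* Fake transport: no network I/O. Each query is generated into a single
 * scratch buffer and echoed back as its own "reply", highest n first. */
static int multisend_sketch(int nqueries, gen_query_fn gen,
                            got_reply_fn got, void *ctx)
{
    unsigned char q[MAX_QUERY], r[MAX_REPLY];
    for (int n = nqueries - 1; n >= 0; n--) {
        int qlen = gen(ctx, n, q, sizeof q);
        if (qlen < 0 || (size_t)qlen > sizeof r) return -1;
        memcpy(r, q, (size_t)qlen);
        got(ctx, n, r, (size_t)qlen);
    }
    return 0;
}

/* Caller state for a search: only the best reply per family is kept. */
struct search_ctx {
    const char *name;
    const char *const *domains;
    int ndomains;
    int best_idx[2];                   /* search-domain index, -1 if none */
    size_t best_len[2];
    unsigned char best[2][MAX_REPLY];  /* [0] = A, [1] = AAAA */
};

/* Query n maps to search domain n/2 and record type n%2. */
static int gen_query(void *p, int n, unsigned char *buf, size_t len)
{
    struct search_ctx *c = p;
    if (n < 0 || n / 2 >= c->ndomains) return -1;
    int r = snprintf((char *)buf, len, "%s.%s %s", c->name,
                     c->domains[n / 2], n % 2 ? "AAAA" : "A");
    return (r < 0 || (size_t)r >= len) ? -1 : r;
}

/* Keep a reply only if it comes from an earlier search domain than the
 * one already saved for that family. */
static void got_reply(void *p, int n, const unsigned char *r, size_t len)
{
    struct search_ctx *c = p;
    int fam = n % 2, idx = n / 2;
    if (len >= MAX_REPLY) return;
    if (c->best_idx[fam] >= 0 && c->best_idx[fam] <= idx) return;
    memcpy(c->best[fam], r, len);
    c->best[fam][len] = 0;             /* text demo only: NUL-terminate */
    c->best_len[fam] = len;
    c->best_idx[fam] = idx;
}

/* Run a two-domain search for the made-up name "host". */
static int demo_search(struct search_ctx *c)
{
    static const char *const domains[] = { "example.com", "example.net" };
    memset(c, 0, sizeof *c);
    c->name = "host";
    c->domains = domains;
    c->ndomains = 2;
    c->best_idx[0] = c->best_idx[1] = -1;
    return multisend_sketch(2 * c->ndomains, gen_query, got_reply, c);
}
```

Calling demo_search() on a struct search_ctx leaves the replies for the
first search domain in best[0] and best[1], even though the fake
transport delivered the second domain's replies first. A single-query
res_send would use the same entry point with a count of 1 and callbacks
that just copy the caller's packet in and the answer out.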