From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Mon, 16 Jan 2012 12:13:59 -0500
To: 9fans@9fans.net
Message-ID: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: [9fans] small dns improvements
Topicbox-Message-UUID: 5d45cd46-ead7-11e9-9d60-3106f5b1d025

it must be that time of year. dns is driving folks bats. :-)

i've been spending some time looking at why ndb/dns fails. as is well known,
there are very long-standing locking problems. in the past, i've gotten hung up
on those and not made any progress. while imho, the long-term strategy should
be to replace ndb/dns with an easier-to-maintain structure, i only have a few
weeks to fix as much as possible. so i decided to see if there were simple
things we could do to improve things.

geoff has made a few big improvements. some sites which were broken for a long
time are now working. tomshardware.com is one that i've used as a test, and it
finally works. (although the results don't seem worth the effort. ☺)

but there are a number of other lookups that are still broken for me, and
there seem to be some straightforward reasons that i think i've fixed:

1. we're sending the RD (recursion desired) bit when we ourselves are acting
as a recursive server. this looks okay by the standard, but many servers return
Srvfail (code 2, Rserver in the dns code) rather than ignoring this bit.
turning this off helps a lot (example: ocsp.netsolssl.com).

2. we're ignoring status codes that we should be treating as fatal (like
Srvfail)

3. we're not using edns0. this is kind of a sticky bit. some servers insist on
sending enormous answers but don't answer via tcp. on the other hand, some
servers insist on sending enormous answers, but return nasty errors when given
edns0 queries. what seems to work best is to send udp/no edns0, udp/edns0 and
finally tcp.

4. we get confused attaching the name servers to an answer for an
out-of-bailiwick cname record. (this is largely a problem with logging, but
has the potential to corrupt the database.)

if anyone would like to try a 386 executable (amd64 available on request),
i've put a copy at

	http://ftp.quanstro.net/other/^(dns dnsdebug)

i'd be happy to hear of any dns lookup problems. please let me know which
version of dns you're using.

thanks,

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net>
References: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net>
Date: Mon, 16 Jan 2012 18:02:30 +0000
Message-ID:
From: Charles Forsyth
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=0015175cfb7eab005704b6a9047a
Subject: Re: [9fans] small dns improvements
Topicbox-Message-UUID: 5d4e034e-ead7-11e9-9d60-3106f5b1d025

--0015175cfb7eab005704b6a9047a
Content-Type: text/plain; charset=UTF-8

that one's not inherently fatal, in the sense that it shouldn't stop the
search.

On 16 January 2012 17:13, erik quanstrom wrote:

> 2. we're ignoring status codes that we should be treating as fatal (like
> Srvfail)


--0015175cfb7eab005704b6a9047a--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Mon, 16 Jan 2012 13:05:38 -0500
To: 9fans@9fans.net
Message-ID: <6e3600609acfa1120c83f85812891f3a@chula.quanstro.net>
In-Reply-To:
References: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] small dns improvements
Topicbox-Message-UUID: 5dd9ee0e-ead7-11e9-9d60-3106f5b1d025

On Mon Jan 16 13:03:38 EST 2012, charles.forsyth@gmail.com wrote:

> that one's not inherently fatal, in the sense that it shouldn't stop the
> search.
>
> On 16 January 2012 17:13, erik quanstrom wrote:
>
> > 2. we're ignoring status codes that we should be treating as fatal (like
> > Srvfail)

not clear enough. we were persisting in asking the same question in the
same manner of a server returning srvfail, thus preventing us from asking
the same question in a different way, or of a different server. we persisted
long enough that we timed out the query before asking a reasonable question
of a capable server.

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Mon, 16 Jan 2012 13:07:49 -0500
To: 9fans@9fans.net
Message-ID: <420ea21bfa6b17dde6a39c10116de187@coraid.com>
In-Reply-To:
References: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] small dns improvements
Topicbox-Message-UUID: 5d5241f2-ead7-11e9-9d60-3106f5b1d025

On Mon Jan 16 13:03:20 EST 2012, charles.forsyth@gmail.com wrote:

> that one's not inherently fatal, in the sense that it shouldn't stop the
> search.
>
> On 16 January 2012 17:13, erik quanstrom wrote:
>
> > 2. we're ignoring status codes that we should be treating as fatal (like
> > Srvfail)

also, i forgot that it's possible to return Srvfail and return some RRs.
these all need to be ignored. we weren't ignoring them in the past.

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <6e3600609acfa1120c83f85812891f3a@chula.quanstro.net>
References: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net> <6e3600609acfa1120c83f85812891f3a@chula.quanstro.net>
Date: Mon, 16 Jan 2012 18:13:13 +0000
Message-ID:
From: Charles Forsyth
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=000e0ce0f306f5a00404b6a92a52
Subject: Re: [9fans] small dns improvements
Topicbox-Message-UUID: 5def0ece-ead7-11e9-9d60-3106f5b1d025

--000e0ce0f306f5a00404b6a92a52
Content-Type: text/plain; charset=UTF-8

ah.

On 16 January 2012 18:05, erik quanstrom wrote:

> we were persisting in asking the same question in the
> same manner of a server returning srvfail, thus preventing us from asking
> the same question in a different way, or of a different server.
>


--000e0ce0f306f5a00404b6a92a52--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Mon, 16 Jan 2012 13:19:34 -0500
To: 9fans@9fans.net
Message-ID: <5dfa8244b787c9188fae5a8ea9d90b8a@chula.quanstro.net>
In-Reply-To:
References: <81ab20f06c1b8b20cf869565b01e2e29@chula.quanstro.net> <6e3600609acfa1120c83f85812891f3a@chula.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] small dns improvements
Topicbox-Message-UUID: 5df334fe-ead7-11e9-9d60-3106f5b1d025

On Mon Jan 16 13:14:01 EST 2012, charles.forsyth@gmail.com wrote:

> ah.
>
> On 16 January 2012 18:05, erik quanstrom wrote:
>
> > we were persisting in asking the same question in the
> > same manner of a server returning srvfail, thus preventing us from asking
> > the same question in a different way, or of a different server.

thanks for asking the question. the way i wrote it wasn't very clear.

here are just a few domains that i've had trouble with that work for
me now:

reject queries with the RD flag
	ocsp.netsolssl.com
	ocsp.trust-secure.com
hangs
	c.l.britecove.com
	world-100.bc.gapx.yahoodns.net

if you have a linux box, dig +trace is similar to dnsdebug. if dig +trace
fails for a query, there's no point in debugging it.

- erik
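
The query strategy discussed in this thread (RD bit clear, then udp/no edns0, udp/edns0 and finally tcp, moving on when a server answers Srvfail and ignoring any RRs in that reply) might be sketched roughly as follows. This is an illustration only, not the ndb/dns code: the names build_query, rcode, resolve and the send hook are hypothetical, the order in which manners and servers are tried is a guess, and sending tcp without edns0 is an assumption the thread does not settle. The header layout and the OPT pseudo-record follow RFC 1035 and RFC 6891.

```python
import struct

RD = 0x0100       # recursion-desired bit in the header flags word (RFC 1035)
SERVFAIL = 2      # rcode 2, the "Srvfail" discussed in the thread
OPT = 41          # edns0 OPT pseudo-record type (RFC 6891)

# order taken from the thread; tcp without edns0 is an assumption of this sketch
ATTEMPTS = [("udp", False), ("udp", True), ("tcp", False)]


def build_query(name, qtype=1, qid=0, rd=False, edns0=False, udp_size=4096):
    """Encode a single class-IN question as wire-format bytes."""
    flags = RD if rd else 0
    header = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, 1 if edns0 else 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    packet = header + qname + struct.pack("!HH", qtype, 1)
    if edns0:
        # root owner name, type OPT, class = udp payload size, ttl 0, empty rdata
        packet += b"\x00" + struct.pack("!HHIH", OPT, udp_size, 0, 0)
    return packet


def rcode(reply):
    """Return the 4-bit response code from a reply's header."""
    (flags,) = struct.unpack("!H", reply[2:4])
    return flags & 0x000F


def resolve(name, servers, send):
    """Try each server in each manner, always with RD clear.

    send is a hypothetical transport hook: send(server, transport, packet)
    returns the reply bytes, or None on timeout.
    """
    for server in servers:
        for transport, edns0 in ATTEMPTS:
            reply = send(server, transport, build_query(name, edns0=edns0))
            if reply is None:
                continue          # timeout: ask in the next manner
            if rcode(reply) == SERVFAIL:
                continue          # drop the reply, RRs included, and move on
            return server, transport, edns0, reply
    return None
```

The point of the loop is only that a Srvfail or a timeout costs one attempt rather than the whole query budget, so a capable server further down the list still gets asked. A real resolver would also need per-attempt timeouts, truncation handling and caching, none of which is shown here.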