From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9a5d188e7b220e2d8ce41bf79f8081b6@quanstro.net>
From: erik quanstrom
Date: Mon, 13 Aug 2007 07:56:30 -0400
To: 9fans@cse.psu.edu
Subject: Re: [9fans] lsub.org
In-Reply-To: <599f06db0708122047p67c975a6y1f355eb0cdf767e5@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Topicbox-Message-UUID: a76c9670-ead2-11e9-9d60-3106f5b1d025

assuming things are broken without external help ....

i've been having trouble with dns infinitely extending the life of
queries when a "srvfail" is returned by an authoritative server.
eventually one query to a broken ns will hold up all the threads
available on the server. this happens a lot on reverse lookups.

i fire this script every 10 minutes to help ease the pain until
i have the time to figure out exactly what's going wrong.

- erik

#!/bin/rc
rfork en

# address that gets the report when dns is restarted
mailuser=guywhogetstocheckonthisstuff
fflag=0
nl='
'

fn usage{
	echo 'usage: restartdns [-f]' >[1=2]
	exit usage
}

# say why dns is being restarted: broken processes, or deadlocked queries
fn why{
	if(! ~ $#nbroken 0)
		echo getting mediæval on $#nbroken broken dns processes.
	if not{
		echo getting mediæval on $#nwait deadlocked dns processes.
		for(i in $nwait)
			echo $i
	}
}

for(i)switch($i){
case -f
	fflag=1
case *
	usage
}

# without -f, restart only if something looks wrong
if(~ $fflag 0){
	nbroken=`{ps -a | grep dns | grep Broken}
	# one list element per line: names with more than 2 processes waiting on them
	ifs=$nl
	nwait=`{ps -a |sed -n 's/.* +dns \[query lock wait for(.*)\]/\1/gp' | sort | uniq -c | awk '$1>2'}
	if(~ $#nbroken 0 && ~ $#nwait 0)
		exit 'none broken'
	why
	if(~ $service rx)
		{date; echo; why; echo; ps -a | grep dns}| mail $mailuser
}

slay dns | rc
ndb/dns -s
ndb/dns -Rrsx /net.alt -f /lib/ndb/external
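
for the 10-minute schedule mentioned above, a cron(8) entry along these
lines is one way to do it. this is only a sketch: the install path
/rc/bin/restartdns and the use of 'local' as the host field are
assumptions, not taken from the setup described here. the fields are
minute, hour, day, month, weekday, host and command, and the line would
go in /cron/$user/cron:

	0,10,20,30,40,50 * * * * local /rc/bin/restartdns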