From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
From: erik quanstrom
Date: Mon, 9 Apr 2007 12:23:40 -0400
To: 9fans@cse.psu.edu
Subject: Re: [9fans] bell-labs website and plan9
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 42495d00-ead2-11e9-9d60-3106f5b1d025

On Mon Apr 9 11:50:41 EDT 2007, rsc@swtch.com wrote:
> erik:
> > i have also noticed that replica/applylog has a problem. when i started
> > experimenting with copying history from our old fileserver to the new
> > one, i started using replica/updatedb and replica/applylog. updatedb
> > worked very well, but applylog hung for me pretty consistantly.
>
> Did you ever use acid to get a stack trace from the `hung' applylogs?
> The only threading in applylog is an implementation of something
> like fcp to copy files using multiple outstanding 9P read requests.
> Since no one else seems to have had problems, I would guess that
> there were just some requests that made your file server thrash.
> But stack traces would make the answer very clear.

i apologize for not having a backtrace; they looked uninformative
at the time.

what i remember is that applylog was not doing any i/o at the time
(unless it was reading the same blocks over and over). in one
instance, applylog had the same /proc/$pid/fd for 4 hrs and generated
no system load at all.

the one problem i do see, though it was not my case (i was working on
two successive days from the dump), is that there is no maximum number
of tries when a file keeps changing underfoot. a log file competing
with a slow link could be problematic.

restarting applylog where it left off (with an initial line number)
generally fixed the problem.

i didn't mention it at the time because i didn't get to the bottom
of the problem. i'll try to recreate the problem with a backtrace,
but anyone else is welcome to beat me to it.

- erik
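
The concern above about a file changing underfoot with no maximum number of tries can be illustrated with a small sketch. This is not applylog's code (applylog is written in C, and its actual retry logic is not shown in this thread); it is a minimal Python illustration, assuming a change is detected by comparing the source's modification time and size before and after the copy. The names `copy_stable`, `stamp`, `MAX_TRIES` and the `on_copied` hook are hypothetical.

```python
import os
import shutil

MAX_TRIES = 5  # hypothetical cap; the message above says applylog has none


def stamp(path):
    """Return (mtime_ns, size) so a change during the copy can be detected."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def copy_stable(src, dst, max_tries=MAX_TRIES, on_copied=None):
    """Copy src to dst, retrying while src changes during the copy.

    Returns the number of attempts used. Raises RuntimeError once
    max_tries is exhausted instead of retrying forever.
    on_copied is an optional hook called after each copy attempt
    (useful for simulating a writer that keeps appending).
    """
    for attempt in range(1, max_tries + 1):
        before = stamp(src)
        shutil.copyfile(src, dst)
        if on_copied is not None:
            on_copied(attempt)
        if stamp(src) == before:
            return attempt  # source was stable for the whole copy
    raise RuntimeError(f"{src} kept changing after {max_tries} tries")
```

With a cap like this, a log file that grows faster than a slow link can copy it produces an error after a bounded number of attempts rather than an indefinite loop. Whether failing, skipping the file, or accepting the last copy is the right response is a separate design choice.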