9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] More venti sync woes.
@ 2007-09-26 19:32 Anthony Sorace
  2007-09-26 19:39 ` Steve Simon
  2007-09-27 21:53 ` Russ Cox
  0 siblings, 2 replies; 15+ messages in thread
From: Anthony Sorace @ 2007-09-26 19:32 UTC
  To: Fans of the OS Plan 9 from Bell Labs

I've had a cpu server running off a non-venti-backed fossil for a few
weeks now. the same machine has also been running venti (but the
fossil wasn't talking to it, intentionally). I'd confirmed the venti
was working by doing direct dumps and mounting the results from vacfs.
All was well.

Yesterday I modified my fossil config to use the venti. Edited the
config with fossil/conf, rebooted, and all was well. At boot time, the
"sync..." message stayed for about 10 seconds (I didn't time it, but
that's the right order of magnitude), as it had on every previous
reboot (before fossil was using it), and then it moved on and booted
as normal.

Last night something outside my house got struck by lightning and we
lost power for a few seconds. On boot, it hung at the "sync..."
message. It's now been double-digit hours. The disk is slowish, and
lacks supported DMA, but that still seems ridiculous, especially on a
system with now one day's worth of dumps (with less than 50MB data
beyond the stock plan9 install).

On the up side, my microwave, which has been broken for months, is now
working properly again. Go figure.

So I've got questions. First, I was under the impression that venti's
structure made it more or less immune to abrupt shutdown. In that
case, assuming no damage to the actual hardware, is it safe to factor
the power outage out of the equation and just treat this as a reboot?

And the big one: what's going on? I've had this sync issue in a couple
different setups. In the earlier ones, I wrote it off to having
re-used oventi partitions and that confusing nventi. But this has been
all nventi throughout. A handful of folks on IRC have observed
indefinite stalls at the same place. Aside from the clock time theory
proposed just a little bit ago (which is not the case for me; I
checked), I've not heard any good working theories.

My next step is going to be to try booting off some other medium and
rebuild the index partitions, assuming the actual arenas are unharmed.
Any bets on whether that's likely to pay off?

Anthony



* Re: [9fans] More venti sync woes.
  2007-09-26 19:32 [9fans] More venti sync woes Anthony Sorace
@ 2007-09-26 19:39 ` Steve Simon
  2007-09-26 19:41   ` erik quanstrom
  2007-09-27 21:53 ` Russ Cox
  1 sibling, 1 reply; 15+ messages in thread
From: Steve Simon @ 2007-09-26 19:39 UTC
  To: 9fans

I have had similar problems but not had the time or enthusiasm to look
for it - I found disabling one of the CPUs in my twin cpu box solved this
for me, not ideal but it got me going.

I am running nventi on oventi partitions, and it works fine with one CPU.

-Steve



* Re: [9fans] More venti sync woes.
  2007-09-26 19:39 ` Steve Simon
@ 2007-09-26 19:41   ` erik quanstrom
  2007-09-26 19:49     ` Anthony Sorace
  0 siblings, 1 reply; 15+ messages in thread
From: erik quanstrom @ 2007-09-26 19:41 UTC
  To: 9fans

> I have had similar problems but not had the time or enthusiasm to look
> for it - I found disabling one of the CPUs in my twin cpu box solved this
> for me, not ideal but it got me going.
>
> I am running nventi on oventi partitions, and it works fine with one CPU.
>
> -Steve

since venti is a user-level process, i would think that rather than
solving the problem, disabling a cpu makes it less likely.

- erik



* Re: [9fans] More venti sync woes.
  2007-09-26 19:41   ` erik quanstrom
@ 2007-09-26 19:49     ` Anthony Sorace
  0 siblings, 0 replies; 15+ messages in thread
From: Anthony Sorace @ 2007-09-26 19:49 UTC
  To: Fans of the OS Plan 9 from Bell Labs

On 9/26/07, erik quanstrom <quanstro@coraid.com> wrote:

// since venti is a user-level process, i would think that rather than
// solving the problem, disabling a cpu makes it less likely.

agree.

also certainly not the problem in any of the cases where I've
encountered this, as it's all been on one of two (significantly
different) single-proc systems.

a



* Re: [9fans] More venti sync woes.
  2007-09-26 19:32 [9fans] More venti sync woes Anthony Sorace
  2007-09-26 19:39 ` Steve Simon
@ 2007-09-27 21:53 ` Russ Cox
  2007-09-27 22:49   ` Anthony Sorace
  2007-09-27 23:54   ` Charles Forsyth
  1 sibling, 2 replies; 15+ messages in thread
From: Russ Cox @ 2007-09-27 21:53 UTC
  To: Fans of the OS Plan 9 from Bell Labs

i don't understand what printed sync... - venti or fossil?

boot from a boot cd and then run venti/venti -c config
by hand, and then you can run ps and acid to see what
it is spending its time doing.

i half-doubt that venti is actually the one sitting around.
it could be that fossil is trying to write the initial set of
blocks out to venti, and that that's what is taking a while.
but booting from a cd and being able to run ps, etc.
will tell you for sure.

finally, don't underestimate the slowness of non-dma disk.
venti is *very* disk-intensive.  i was measuring some
changes i had made to venti recently and was very surprised
that i was only getting under 1MB/s, and then i realized
that dma was off.  it matters.

what's your venti config?

russ



* Re: [9fans] More venti sync woes.
  2007-09-27 21:53 ` Russ Cox
@ 2007-09-27 22:49   ` Anthony Sorace
  2007-09-27 23:03     ` erik quanstrom
  2007-09-28  3:16     ` Russ Cox
  2007-09-27 23:54   ` Charles Forsyth
  1 sibling, 2 replies; 15+ messages in thread
From: Anthony Sorace @ 2007-09-27 22:49 UTC
  To: Fans of the OS Plan 9 from Bell Labs

it's almost certainly venti sitting there; i don't think fossil is
even running yet. the last two lines on my screen are:

root is from (tcp, local)[local!#S/sdC0/fossil]: time...
venti...2007/0926 17:57:23 venti: conf...httpd tcp!127.1!8000...init...sync...

that sequence matches my read of the venti source. also, ^t^tp shows a
few venti procs that just keep racking up cpu time:

 8:     venti pc f0100366 dbgpc    2557f    Rendez (Running) ut 1547792 st 2058589 bss 6650000 qpc 0 nl 1 nd 0 lpc f01bb8ec pri 2
14:     venti pc   2557f dbgpc    2557f    Rendez (Ready) ut 2477463 st 2354871 bss 6650000 qpc 0 nl 0 nd 0 lpc f01c7bc7 pri 2
16:     venti pc f01c88ae dbgpc    1f141     Pread (Ready) ut 467438 st 1254296 bss 6650000 qpc f01c35e5 nl 0 nd 0 lpc f01bb8ec pri 1

they're definitely increasing more-or-less regularly. everything else
(well, i can't see above 8) had 0-2 for ut and st.

as for my venti config: i'll confirm when i get this thing booted off
another medium, but from memory: i've got a ~30GB fossil partition, a
64MB bloom filter, a 5-10GB index, and a ~120GB arenas partition.
there's also a 9fat and swap in there somewhere. it's all on one disk.

i certainly appreciate the fact that non-dma disks can be dog slow
under load. but at this point, whatever it's doing it's been doing for
over 24 hours; even a factor of 50 puts that at just under half an
hour to reboot, which seems like an unreasonable amount of time for a
server to spin on an unclean reboot. what was the rough improvement
factor you observed?
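
As a quick check of that arithmetic, here is a minimal C sketch. The
function name and its parameters are made up for illustration; the 24
hours and the factor of 50 are simply the figures quoted above, not
measurements.

```c
#include <assert.h>

/*
 * Minutes the same work would take with DMA enabled, given the
 * hours observed without DMA and an assumed speedup factor.
 * Both inputs are illustrative, not measured values.
 */
double
minutes_with_dma(double hours_without_dma, double speedup)
{
	assert(speedup > 0);
	return hours_without_dma * 60.0 / speedup;
}
```

Calling minutes_with_dma(24.0, 50.0) gives 28.8 minutes, which is the
"just under half an hour" above. A smaller assumed speedup only makes
the estimate longer.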

this machine has no optical drive, so i'll have to make a floppy and
get it to boot off the net. i'll report back when i have info from
that.



* Re: [9fans] More venti sync woes.
  2007-09-27 22:49   ` Anthony Sorace
@ 2007-09-27 23:03     ` erik quanstrom
  2007-09-28  3:16     ` Russ Cox
  1 sibling, 0 replies; 15+ messages in thread
From: erik quanstrom @ 2007-09-27 23:03 UTC
  To: 9fans

> as for my venti config: i'll confirm when i get this thing booted off
> another medium, but from memory: i've got a ~30GB fossil partition, a
> 64MB bloom filter, a 5-10GB index, and a ~120GB arenas partition.
> there's also a 9fat and swap in there somewhere. it's all on one disk.

> i certainly appreciate the fact that non-dma disks can be dog slow
> under load. but at this point, whatever it's doing it's been doing for
> over 24 hours; even a factor of 50 puts that at just under half an
> hour to reboot, which seems like an unreasonable amount of time for a
> server to spin on an unclean reboot. what was the rough improvement
> factor you observed?

generally, a sata disk is good for 30-50 MB/s using sequential
IDE dma transfers on the outer tracks.  /non-sequential access
can be as slow as non-dma access./

(i fixed a similar problem recently with the on-disk cache of ken's fs.
throughput went up by a factor of 20 or so.)

russ would have to answer questions about disk access patterns with
venti, but you could be generating constant seeks between the various
partitions.

- erik




* Re: [9fans] More venti sync woes.
  2007-09-27 21:53 ` Russ Cox
  2007-09-27 22:49   ` Anthony Sorace
@ 2007-09-27 23:54   ` Charles Forsyth
  2007-09-28  0:10     ` erik quanstrom
  1 sibling, 1 reply; 15+ messages in thread
From: Charles Forsyth @ 2007-09-27 23:54 UTC
  To: 9fans

> then i realized
> that dma was off.  it matters.

it matters greatly.
in some cases, if there's a lot to do, don't even bother until dma is on.
otherwise, it takes simply ages.




* Re: [9fans] More venti sync woes.
  2007-09-27 23:54   ` Charles Forsyth
@ 2007-09-28  0:10     ` erik quanstrom
  2007-09-28  0:19       ` Charles Forsyth
  2007-09-28  4:01       ` Bruce Ellis
  0 siblings, 2 replies; 15+ messages in thread
From: erik quanstrom @ 2007-09-28  0:10 UTC
  To: 9fans

> it matters greatly.
> in some cases, if there's a lot to do, don't even bother until dma is on.
> otherwise, it takes simply ages.

regardless of the need to get things onto disk due to memory pressure,
the data's not safe from a momentary power problem until it hits
the disk.

my experience has been that even with upses, power is less reliable than
disks.

- erik




* Re: [9fans] More venti sync woes.
  2007-09-28  0:19       ` Charles Forsyth
@ 2007-09-28  0:19         ` erik quanstrom
  0 siblings, 0 replies; 15+ messages in thread
From: erik quanstrom @ 2007-09-28  0:19 UTC
  To: 9fans

>> my experience has been that even with upses, power is less reliable than
>> disks.
>
> you're suggesting that power corrupts?

lack of power corrupts dram; absolute lack of power corrupts dram absolutely.

hardware has always been a bit odd.

- erik




* Re: [9fans] More venti sync woes.
  2007-09-28  0:10     ` erik quanstrom
@ 2007-09-28  0:19       ` Charles Forsyth
  2007-09-28  0:19         ` erik quanstrom
  2007-09-28  4:01       ` Bruce Ellis
  1 sibling, 1 reply; 15+ messages in thread
From: Charles Forsyth @ 2007-09-28  0:19 UTC
  To: 9fans

> my experience has been that even with upses, power is less reliable than
> disks.

you're suggesting that power corrupts?




* Re: [9fans] More venti sync woes.
  2007-09-27 22:49   ` Anthony Sorace
  2007-09-27 23:03     ` erik quanstrom
@ 2007-09-28  3:16     ` Russ Cox
  2007-09-28  9:01       ` Kernel Panic
  1 sibling, 1 reply; 15+ messages in thread
From: Russ Cox @ 2007-09-28  3:16 UTC
  To: Fans of the OS Plan 9 from Bell Labs

dma is worth around 10x, certainly less than 50.
i agree that your venti server is taking a very long
time to come back.  i reboot mine all the time
and don't have this problem.

i am at a loss for what could be taking it so long.
it's probably not going to hurt any to stop it.
it could take forever -- maybe it's looping!

when you manage to boot by other means,
it would be nice to see what ps -a|grep venti
says.  venti sets its proc args that show up in ps -a
to tell you what each proc does.

the new venti is very careful both about the
consistency of what is stored on disk and about
recovering quickly after a disk failure
(there's not a lot to do -- just pick up the unindexed
arena entries from the arena tocs and toss them
back into the index write buffer where they were
when you restarted the system).

what you're describing could happen if you were
running a new venti (which buffers index updates
quite aggressively) and then on reboot managed
to start an old venti (which would then process the
unindexed new blocks one at a time instead of
buffering the updates, with about 3 seeks per block).
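
To put a rough number on the one-block-at-a-time case, the sketch
below multiplies blocks by seeks by seek time. Only the figure of
about 3 seeks per block comes from the message; the function name,
the block count, and the seek time are assumptions for illustration.

```c
#include <assert.h>

/*
 * Rough seconds for an index update that handles unindexed blocks
 * one at a time.  seeks_per_block = 3 is the figure from the
 * message; nblocks and seek_ms are hypothetical inputs.
 */
double
rebuild_seconds(long nblocks, int seeks_per_block, double seek_ms)
{
	assert(nblocks >= 0 && seeks_per_block >= 0 && seek_ms >= 0);
	return (double)nblocks * seeks_per_block * seek_ms / 1000.0;
}
```

With assumed inputs of 6400 blocks (about 50MB at 8KB per block) and
10ms per seek, rebuild_seconds(6400, 3, 10.0) comes to 192 seconds.
How many blocks were actually unindexed on the machine in question is
not known from the thread, so this only bounds the order of magnitude.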

without more information i'm afraid i have no good answers.

russ



* Re: [9fans] More venti sync woes.
  2007-09-28  0:10     ` erik quanstrom
  2007-09-28  0:19       ` Charles Forsyth
@ 2007-09-28  4:01       ` Bruce Ellis
  1 sibling, 0 replies; 15+ messages in thread
From: Bruce Ellis @ 2007-09-28  4:01 UTC
  To: Fans of the OS Plan 9 from Bell Labs

absurd .. a UPS is your friend.  there is something else wrong.

brucee

On 9/28/07, erik quanstrom <quanstro@quanstro.net> wrote:
> > it matters greatly.
> > in some cases, if there's a lot to do, don't even bother until dma is on.
> > otherwise, it takes simply ages.
>
> regardless of the need to get things onto disk due to memory pressure,
> the data's not safe from a momentary power problem until it hits
> the disk.
>
> my experience has been that even with upses, power is less reliable than
> disks.
>
> - erik
>
>



* Re: [9fans] More venti sync woes.
  2007-09-28  3:16     ` Russ Cox
@ 2007-09-28  9:01       ` Kernel Panic
  2007-09-28 16:35         ` Anthony Sorace
  0 siblings, 1 reply; 15+ messages in thread
From: Kernel Panic @ 2007-09-28  9:01 UTC
  To: Fans of the OS Plan 9 from Bell Labs

Russ Cox wrote:

>dma is worth around 10x, certainly less than 50.
>i agree that your venti server is taking a very long
>time to come back.  i reboot mine all the time
>and don't have this problem.
>
>i am at a loss for what could be taking it so long.
>it's probably not going to hurt any to stop it.
>it could take forever -- maybe it's looping!
>
>
It is...

while(1){
proc main: kick icache
work icachewritecoord: start
proc icachewritecoord: icachewritecoord kick dcache
work flushproc: start
proc flushproc: build t=131
proc flushproc: writeblocks t=991
proc flushproc: writeblocks.1 t=1632
proc flushproc: writeblocks.2 t=2296
proc flushproc: writeblocks.3 t=2944
proc flushproc: undirty.4 t=3564
work flushproc: finish
proc icachewritecoord: kick dcache
proc icachewritecoord: icachewritecoord kicked dcache
proc icachewritecoord: icachewritecoord start flush
proc icachewritecoord: icachedirty enter
proc icachewritecoord: icachedirty exit
proc icachewritecoord: icachewritecoord sleep
proc main: kick icache
}

the main proc loops in icachealloc():

while(icache.ndirty == icache.entries){
	/*
	 * This is a bit suspect.  Kickicache will wake up the
	 * icachewritecoord, but if all the index entries are for
	 * unflushed disk blocks, icachewritecoord won't be
	 * able to do much.  It always rewakes everyone when
	 * it thinks it is done, though, so at least we'll go around
	 * the while loop again.  Also, if icachewritecoord sees
	 * that the disk state hasn't change at all since the last
	 * time around, it kicks the disk.  This needs to be
	 * rethought, but it shouldn't deadlock anymore.
	 */
	kickicache();
	rsleep(&icache.full);
}

but icache.ndirty never changes... so it hangs forever in
"sync..." because it can't allocate ientries.
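
The shape of that hang can be reproduced with a toy model. This is a
sketch only: ToyIcache, kicks_until_free, and flushed_per_kick are
hypothetical names, not venti's, and the real loop sleeps on a
rendezvous rather than counting kicks. It just shows that the loop
cannot exit unless each kick lets the writer clean at least one entry.

```c
#include <assert.h>

/* Toy stand-in for the index cache counters; not venti's struct. */
typedef struct ToyIcache ToyIcache;
struct ToyIcache {
	int ndirty;	/* entries waiting to be written to the index */
	int entries;	/* total entries in the cache */
};

/*
 * Model the wait loop: while every entry is dirty, kick the writer
 * and go around again.  flushed_per_kick is how many entries the
 * (hypothetical) writer cleans per kick.  Returns the number of
 * kicks needed to free an entry, or -1 if max_kicks pass with the
 * cache still full.
 */
int
kicks_until_free(ToyIcache *ic, int flushed_per_kick, int max_kicks)
{
	int kicks;

	assert(ic != 0 && flushed_per_kick >= 0);
	kicks = 0;
	while(ic->ndirty == ic->entries){
		if(kicks == max_kicks)
			return -1;	/* still full: the hang described above */
		kicks++;
		ic->ndirty -= flushed_per_kick;	/* writer's progress, if any */
		if(ic->ndirty < 0)
			ic->ndirty = 0;
	}
	return kicks;
}
```

With a full cache and flushed_per_kick set to 0, the function gives up
and returns -1 however large max_kicks is; with flushed_per_kick set
to 1 it returns after a single kick.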





* Re: [9fans] More venti sync woes.
  2007-09-28  9:01       ` Kernel Panic
@ 2007-09-28 16:35         ` Anthony Sorace
  0 siblings, 0 replies; 15+ messages in thread
From: Anthony Sorace @ 2007-09-28 16:35 UTC
  To: Fans of the OS Plan 9 from Bell Labs

agreed. 'ps -a | grep venti' shows the following after about 15 minutes:

glenda          198    3:04   3:44   104508K Rendez   venti [main]
glenda          199    0:00   0:00   104508K Rendez   venti
glenda          200    0:00   0:00   104508K Sleep    venti
glenda          201    0:00   0:00   104508K Rendez   venti [icachewriteproc:/dev/sdC0/isect]
glenda          202    4:49   4:23   104508K Rendez   venti [icachewritecoord]
glenda          203    0:00   0:00   104508K Sleep    venti [delaykickproc icache]
glenda          204    0:23   1:11   104508K Rendez   venti [flushproc]
glenda          205    0:00   0:00   104508K Rendez   venti [delaykickproc dcache]
glenda          206    0:00   0:00   104508K Rendez   venti
glenda          206    0:00   0:00   104508K Rendez   venti [bloomwriteproc]

once it hits "sync...", load, context, and syscall are pegged in stats;
memory ramps up a bit over the first ~ half minute, but levels out.
For the big three processes, here's everything over 3% in tprof:

:; tprof 198
total: 3040
TEXT 00001000
    ms      %   sym
   480	 15.7	_tas
   240	  7.8	runthread
   230	  7.5	lock
   180	  5.9	_threadrendezvous
   170	  5.5	rendezvous
   140	  4.6	qlock
   130	  4.2	_sched
   110	  3.6	trace
   110	  3.6	_threadready
   100	  3.2	waitforkick
   100	  3.2	icachewritecoord

:; tprof 202
total: 7570
TEXT 00001000
    ms      %   sym
  1090	 14.3	_tas
   520	  6.8	runthread
   500	  6.6	rendezvous
   490	  6.4	_threadrendezvous
   490	  6.4	lock
   290	  3.8	icachewritecoord
   280	  3.6	_sched
   260	  3.4	qlock
   240	  3.1	trace
   230	  3.0	_threadready

:; tprof 204
total: 14040
TEXT 00001000
    ms      %   sym
  2020	 14.3	_tas
  1010	  7.1	_threadrendezvous
   950	  6.7	rendezvous
   930	  6.6	runthread
   920	  6.5	lock
   590	  4.2	icachewritecoord
   540	  3.8	trace
   510	  3.6	_sched
   470	  3.3	_threadready

tight loops with most of its time in the thread library. poking around
with acid now to get more info.

