9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] venti+fossil, the boot process, and 48-bit lba
@ 2003-10-22 15:43 Christopher Nielsen
  2003-10-22 15:57 ` Charles Forsyth
  2003-10-22 16:08 ` ron minnich
  0 siblings, 2 replies; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-22 15:43 UTC (permalink / raw)
  To: 9fans

I've spent the night working to recover/rebuild my venti+fossil
fileserver. The recover part was due to something hosing the
superblock in fossil. No idea what it was. The rebuild part was
to add another disc for use as the venti index so I can use the
much larger disc the index was on for more arenas. I also wanted
to set up the configs on the discs using [venti,fossil]/conf.

I thought this would be easy enough. For recovery, just reformat
fossil with a venti score. Then I could go about rebuilding the
index on the new disc and creating more arenas.

This has led to nothing but trouble. After much pulling of hair
and head-scratching, it turns out that 9load doesn't know how
to handle discs over 128G because it doesn't know anything about
48-bit lba. As a result, when boot tries to start venti and venti
tries to find the arenas, it all goes to hell.
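
For scale, the arithmetic behind the 128G figure can be sketched as follows. This is only an illustration and assumes 512-byte sectors, which the thread does not state; the function name is made up for the example.

```python
# Illustration only: capacity reachable with 28-bit vs 48-bit LBA,
# assuming 512-byte sectors (an assumption, not stated in the thread).
SECTOR_BYTES = 512
GIB = 1 << 30
PIB = 1 << 50


def lba_limit_bytes(bits: int) -> int:
    """Largest capacity addressable with `bits`-bit sector addresses."""
    return (1 << bits) * SECTOR_BYTES


limit28 = lba_limit_bytes(28)
limit48 = lba_limit_bytes(48)

print(limit28 // GIB, "GiB with 28-bit LBA")   # 128 GiB
print(limit48 // PIB, "PiB with 48-bit LBA")   # 128 PiB
```

Under that assumption, anything past sector 2^28 on a larger disc is simply out of reach for code that only speaks 28-bit LBA.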

Patches to 9load should be done later today or early tomorrow.

Ron, weren't you working on eliminating 9load from the picture?

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 15:43 [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
@ 2003-10-22 15:57 ` Charles Forsyth
  2003-10-22 17:10   ` Christopher Nielsen
  2003-10-22 17:29   ` Christopher Nielsen
  2003-10-22 16:08 ` ron minnich
  1 sibling, 2 replies; 12+ messages in thread
From: Charles Forsyth @ 2003-10-22 15:57 UTC (permalink / raw)
  To: 9fans

>>and head-scratching, it turns out that 9load doesn't know how
	...
>>As a result, when boot tries to start venti and venti
>>tries to find the arenas, it all goes to hell.

how does 9load cause that to happen?
the thing that starts venti is a normal kernel, which
should handle 48-bit LBA.  does 9load (not knowing
about them) leave the ATA in a state where the kernel
can't use 48-bits either?

i can see that 9load not knowing 48-bit addressing wouldn't
be able to see boot/9fat partitions beyond a certain point,
or perhaps with the right geometry.



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 15:43 [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
  2003-10-22 15:57 ` Charles Forsyth
@ 2003-10-22 16:08 ` ron minnich
  1 sibling, 0 replies; 12+ messages in thread
From: ron minnich @ 2003-10-22 16:08 UTC (permalink / raw)
  To: 9fans

On Wed, 22 Oct 2003, Christopher Nielsen wrote:

> Ron, weren't you working on eliminating 9load from the picture?


yep, I'm still trying to recover my auth server from the switch to
fossil+venti. I may reload from scratch, things did not go well.

ron




* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 15:57 ` Charles Forsyth
@ 2003-10-22 17:10   ` Christopher Nielsen
  2003-10-22 17:29   ` Christopher Nielsen
  1 sibling, 0 replies; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-22 17:10 UTC (permalink / raw)
  To: 9fans

On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
>
> how does 9load cause that to happen?
> the thing that starts venti is a normal kernel, which
> should handle 48-bit LBA.  does 9load (not knowing
> about them) leave the ATA in a state where the kernel
> can't use 48-bits either?

no idea. my conclusion was a semi-educated guess.

> i can see that 9load not knowing 48-bit addressing wouldn't
> be able to see boot/9fat partitions beyond a certain point,
> or perhaps with the right geometry.

something weird is going on. i originally had the venti conf
on one of the 167G drives, it would get to 'time...' in the
boot sequence, and then suicide with an invalid address. so
i moved the venti conf to one of the smaller discs, and then
it found the venti conf and barfed because it couldn't open
the arenas on the 167G discs. based on that, i am assuming
something is hating those large discs.

do you have any other ideas?

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 15:57 ` Charles Forsyth
  2003-10-22 17:10   ` Christopher Nielsen
@ 2003-10-22 17:29   ` Christopher Nielsen
  2003-10-22 17:58     ` Charles Forsyth
  1 sibling, 1 reply; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-22 17:29 UTC (permalink / raw)
  To: 9fans

On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
> >>and head-scratching, it turns out that 9load doesn't know how
> 	...
> >>As a result, when boot tries to start venti and venti
> >>tries to find the arenas, it all goes to hell.
>
> how does 9load cause that to happen?
> the thing that starts venti is a normal kernel, which
> should handle 48-bit LBA.  does 9load (not knowing
> about them) leave the ATA in a state where the kernel
> can't use 48-bits either?
>
> i can see that 9load not knowing 48-bit addressing wouldn't
> be able to see boot/9fat partitions beyond a certain point,
> or perhaps with the right geometry.

one more bit of information. when 9load detects the discs,
it says it can't add the partitions on the 167G drives
because the boundaries are out of range. the upper end of
the boundary that it prints is the upper end of 28-bit
addressing. that's why i concluded it had to do with 48-bit
lba.

this is transcribed by hand, but it looks something like
this:

cannot add sdF0!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
cannot add sdF1!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
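
The numbers in those messages can be checked with a small model of the range test. This is a hypothetical sketch, not 9load's actual code (which does not appear in this thread); the `fits` helper and the 512-byte sector size are assumptions.

```python
# Hypothetical model of the range check behind the 9load message.
# The real 9load source is not shown in the thread.
def fits(part_start, part_end, disk_start, disk_end):
    """True if half-open [part_start, part_end) lies inside [disk_start, disk_end)."""
    return disk_start <= part_start <= part_end <= disk_end


disk = (0, 268435455)      # disk range as printed by 9load
plan9 = (63, 351646785)    # plan9 partition range from the same message

print(disk[1] == 2**28 - 1)                  # True: the 28-bit ceiling
print(fits(*plan9, *disk))                   # False: hence "out of range"
# Partition end in GiB, assuming 512-byte sectors:
print(round(plan9[1] * 512 / 2**30, 1))      # 167.7
```

The partition end works out to roughly 167.7 GiB, consistent with the "167G drives" mentioned above, while the disk size 9load reports stops just short of 2^28 sectors.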

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 17:29   ` Christopher Nielsen
@ 2003-10-22 17:58     ` Charles Forsyth
  2003-10-22 18:06       ` Russ Cox
  2003-10-22 19:59       ` Christopher Nielsen
  0 siblings, 2 replies; 12+ messages in thread
From: Charles Forsyth @ 2003-10-22 17:58 UTC (permalink / raw)
  To: 9fans

i see. i'd wondered after i'd posted whether this might be it:

the kernel driver doesn't work out partitions on its own any more.
the devices' partition tables are set, for instance in /bin/termrc, by
disk/fdisk -p and disk/prep -p, into /dev/sd??/ctl, but that's not much help if you need the
file systems to get termrc (let alone fdisk or prep), and you need the partitions
to find the file system.  thus, boot adds the results of its own probe to the
internal copy of plan9.ini passed to the kernel, and so the partitions are
defined in time for boot to find them.

/sys/src/9/port/devsd.c has
		 * Use partitions passed from boot program,
		 *	sdC0part=dos 63 123123/plan9 123123 456456
		 * This happens before /boot sets hostname so the
		 * partitions will have the null-string for user.
		...
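
To make the format in that comment concrete, here is a minimal sketch of splitting such a line into partitions. The layout (a device name followed by `part=`, then `name start end` entries separated by `/`) is inferred only from the example in the comment; `parse_part_line` is a made-up helper, not the kernel's parser.

```python
# Sketch only: the field layout is inferred from the devsd.c comment
# quoted above; this is not the kernel's own parsing code.
def parse_part_line(line):
    """Split 'sdC0part=dos 63 123123/plan9 ...' into (device, [(name, start, end), ...])."""
    key, sep, value = line.partition("=")
    if not sep or not key.endswith("part") or not value:
        raise ValueError("not a partition line: %r" % line)
    device = key[: -len("part")]
    parts = []
    for field in value.split("/"):
        name, start, end = field.split()
        parts.append((name, int(start), int(end)))
    return device, parts


device, parts = parse_part_line("sdC0part=dos 63 123123/plan9 123123 456456")
print(device)   # sdC0
print(parts)    # [('dos', 63, 123123), ('plan9', 123123, 456456)]
```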

you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
easiest change for you might be to stop 9load from checking the values (temporarily)
until something better is done.


* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 17:58     ` Charles Forsyth
@ 2003-10-22 18:06       ` Russ Cox
  2003-10-22 20:07         ` vdharani
  2003-10-22 19:59       ` Christopher Nielsen
  1 sibling, 1 reply; 12+ messages in thread
From: Russ Cox @ 2003-10-22 18:06 UTC (permalink / raw)
  To: 9fans

> you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
> easiest change for you might be to stop 9load from checking the values (temporarily)
> until something better is done.

yes, this.  change 9load so that if the max disk address is 2^28,
it doesn't sanity check the upper bounds in the partition table.
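
A rough sketch of that suggestion, under stated assumptions: the names are hypothetical, 9load's real check is not shown in the thread, and a reported size of either 2^28 or 2^28-1 sectors (the value in the transcribed error) is treated as "probably clamped".

```python
# Hypothetical sketch of the relaxed check; not 9load's actual code.
# A reported size at the 28-bit ceiling is assumed to be a clamped value.
CLAMPED_SIZES = (2**28 - 1, 2**28)


def accept_partition(start, end, disk_sectors):
    """Accept [start, end), skipping the upper-bound test on clamped disks."""
    if start < 0 or start > end:
        return False              # malformed range is always rejected
    if disk_sectors in CLAMPED_SIZES:
        return True               # reported size is not trustworthy; skip the test
    return end <= disk_sectors


print(accept_partition(63, 351646785, 268435455))  # True: check skipped
print(accept_partition(63, 351646785, 1000000))    # False: genuinely too big
```

As the follow-up further down the thread reports, silencing the check quieted 9load but was not enough on its own for venti to open the partition.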

russ



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 17:58     ` Charles Forsyth
  2003-10-22 18:06       ` Russ Cox
@ 2003-10-22 19:59       ` Christopher Nielsen
  2003-10-23  2:16         ` Christopher Nielsen
  1 sibling, 1 reply; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-22 19:59 UTC (permalink / raw)
  To: 9fans

I modified 9load as Russ suggested. I cannot test it just yet,
but I will do so when I get home.

It was a very simple, one-line change that, should it work, will
suffice until a better solution is available.

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 18:06       ` Russ Cox
@ 2003-10-22 20:07         ` vdharani
  0 siblings, 0 replies; 12+ messages in thread
From: vdharani @ 2003-10-22 20:07 UTC (permalink / raw)
  To: 9fans

hi,

>> you could configure fdisk and prep into /boot, and have boot use them,
>> but perhaps the easiest change for you might be to stop 9load from
>> checking the values (temporarily) until something better is done.
>
> yes, this.  change 9load so that if the max disk address is 2^28,
> it doesn't sanity check the upper bounds in the partition table.

is someone changing 9load source centrally also? that way it would be
useful for other similar instances as well.

and maybe, if required, the sanity check could be changed from
display-error-and-exit to warn-and-proceed.

thanks
dharani






* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-22 19:59       ` Christopher Nielsen
@ 2003-10-23  2:16         ` Christopher Nielsen
  2003-10-24  4:32           ` [9fans] trouble down t'labs George Michaelson
  2003-10-27  7:48           ` [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
  0 siblings, 2 replies; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-23  2:16 UTC (permalink / raw)
  To: 9fans

That didn't work. It shut up 9load about the partitions, but
venti still couldn't open the partition.

Unless someone has a better idea, I'm going to proceed with
incorporating fdisk and prep into /boot and teach boot to
use them to sniff out the partitions.

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin



* [9fans] trouble down t'labs
  2003-10-23  2:16         ` Christopher Nielsen
@ 2003-10-24  4:32           ` George Michaelson
  2003-10-27  7:48           ` [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
  1 sibling, 0 replies; 12+ messages in thread
From: George Michaelson @ 2003-10-24  4:32 UTC (permalink / raw)
  To: 9fans


AT&T have admitted to severe spam-related slowdowns. It might be that
congestion along the path is causing so much packet loss that encrypted
and other loss-intolerant paths are not working well.

how much of the 9 protocol runs over IP or UDP rather than TCP? I suppose
even over TCP it could suffer massive delay if the window size is kept
down by fragment loss.

Only a hypotenuse. Somebody more in-the-know can kybosh it.

-George



* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
  2003-10-23  2:16         ` Christopher Nielsen
  2003-10-24  4:32           ` [9fans] trouble down t'labs George Michaelson
@ 2003-10-27  7:48           ` Christopher Nielsen
  1 sibling, 0 replies; 12+ messages in thread
From: Christopher Nielsen @ 2003-10-27  7:48 UTC (permalink / raw)
  To: 9fans

On Wed, Oct 22, 2003 at 07:16:28PM -0700, Christopher Nielsen wrote:
> That didn't work. It shut up 9load about the partitions, but
> venti still couldn't open the partition.
>
> Unless someone has a better idea, I'm going to proceed with
> incorporating fdisk and prep into /boot and teach boot to
> use them to sniff out the partitions.

Everything is back in working order.

However, the solution is a complete hack, specific to my
configuration, and not worth sending in patches. I'll have
a go at fixing it properly when I have the time, but that
may not be before ron has a better solution replacing 9load.
BTW, my idea of a proper solution is teaching 9load about
48-bit LBA.

I like having the configuration info on the partition. As
expected, it makes for much simpler management of the
fileserver.

Thanks to everyone who had suggestions.

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin


