* [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-22 15:43 UTC
To: 9fans

I've spent the night working to recover/rebuild my venti+fossil
fileserver.

The recover part was due to something hosing the superblock
in fossil. No idea what it was.

The rebuild part was to add another disc for use as the
venti index so I can use the much larger disc the index
was on for more arenas. I also wanted to setup the configs
on the discs using [venti,fossil]/conf.

I thought this would be easy enough. For recovery, just
reformat fossil with a venti score. Then I could go about
rebuilding the index on the new disc and creating more
arenas.

This has led to nothing but trouble. After much pulling of hair
and head-scratching, it turns out that 9load doesn't know how
to handle discs over 128G because it doesn't know anything
about 48-bit lba.

As a result, when boot tries to start venti and venti
tries to find the arenas, it all goes to hell.

Patches to 9load should be done later today or early
tomorrow.

Ron, weren't you working on eliminating 9load from the
picture?

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Charles Forsyth @ 2003-10-22 15:57 UTC
To: 9fans

>>and head-scratching, it turns out that 9load doesn't know how
...
>>As a result, when boot tries to start venti and venti
>>tries to find the arenas, it all goes to hell.

how does 9load cause that to happen?
the thing that starts venti is a normal kernel, which
should handle 48-bit LBA. does 9load (not knowing
about them) leave the ATA in a state where the kernel
can't use 48-bits either?

i can see that 9load not knowing 48-bit addressing wouldn't
be able to see boot/9fat partitions beyond a certain point,
or perhaps with the right geometry.

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-22 17:10 UTC
To: 9fans

On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
>
> how does 9load cause that to happen?
> the thing that starts venti is a normal kernel, which
> should handle 48-bit LBA. does 9load (not knowing
> about them) leave the ATA in a state where the kernel
> can't use 48-bits either?

no idea. my conclusion was a semi-educated guess.

> i can see that 9load not knowing 48-bit addressing wouldn't
> be able to see boot/9fat partitions beyond a certain point,
> or perhaps with the right geometry.

something weird is going on. i originally had the venti
conf on one of the 167G drives, it would get to 'time...'
in the boot sequence, and then suicide with an invalid
address. so i moved the venti conf to one of the smaller
discs, and then it found the venti conf and barfed because
it couldn't open the arenas on the 167G discs.

based on that, i am assuming something is hating those
large discs.

do you have any other ideas?

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-22 17:29 UTC
To: 9fans

On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
> >>and head-scratching, it turns out that 9load doesn't know how
> ...
> >>As a result, when boot tries to start venti and venti
> >>tries to find the arenas, it all goes to hell.
>
> how does 9load cause that to happen?
> the thing that starts venti is a normal kernel, which
> should handle 48-bit LBA. does 9load (not knowing
> about them) leave the ATA in a state where the kernel
> can't use 48-bits either?
>
> i can see that 9load not knowing 48-bit addressing wouldn't
> be able to see boot/9fat partitions beyond a certain point,
> or perhaps with the right geometry.

one more bit of information. when 9load detects the discs,
it says it can't add the partitions on the 167G drives
because the boundaries are out of range. the upper end of
the boundary that it prints is the upper end of 28-bit
addressing. that's why i concluded it had to do with 48-bit
lba.

this is transcribed by hand, but it looks something like
this:

cannot add sdF0!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
cannot add sdF1!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

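The numbers in the transcribed 9load messages can be checked with a little arithmetic. The following is only a sketch, assuming 512-byte sectors (the thread never states the sector size); the function names are made up for illustration and are not taken from 9load.

```python
SECTOR_BYTES = 512  # assumption: sector size is not stated in the thread

LBA28_LIMIT = 2**28 - 1   # 268435455, the disk bound printed by 9load
PART_END = 351646785      # end of sdF0!plan9 as transcribed in the message


def sectors_to_gib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * sector_bytes / 2**30


def fits(start, end, disk_end):
    """True if the half-open range [start,end) lies inside [0,disk_end)."""
    return 0 <= start <= end <= disk_end


if __name__ == "__main__":
    print("28-bit limit: %.2f GiB" % sectors_to_gib(LBA28_LIMIT))
    print("plan9 partition end: %.2f GiB" % sectors_to_gib(PART_END))
    print("partition fits:", fits(63, PART_END, LBA28_LIMIT))
```

Under that assumption the 28-bit bound comes to just under 128 GiB and the partition end to a little under 168 GiB, which matches the "128G" and "167G" figures used in the thread.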
* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Charles Forsyth @ 2003-10-22 17:58 UTC
To: 9fans

i see. i'd wondered after i'd posted whether this might be it:

the kernel driver doesn't work out partitions on its own any more.
the devices' partition tables are set, for instance in /bin/termrc, by
disk/fdisk -p and disk/prep -p, into /dev/sd??/ctl, but that's not much help if you need the
file systems to get termrc (let alone fdisk or prep), and you need the partitions
to find the file system. thus, boot adds the results of its own probe to the
internal copy of plan9.ini passed to the kernel, and so the partitions are
defined in time for boot to find them.

/sys/src/9/port/devsd.c has
	 * Use partitions passed from boot program,
	 * sdC0part=dos 63 123123/plan9 123123 456456
	 * This happens before /boot sets hostname so the
	 * partitions will have the null-string for user.
	...

you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
easiest change for you might be to stop 9load from checking the values (temporarily)
until something better is done.

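The devsd.c comment quoted in this message shows the shape of the partition string handed to the kernel: a `sdXXpart=` key followed by `name start end` triples separated by `/`. A minimal sketch of reading that string follows. The format is inferred only from the one example in the comment, `parse_sdpart` is a hypothetical helper, and the kernel's real parsing may well differ.

```python
def parse_sdpart(line):
    """Parse e.g. 'sdC0part=dos 63 123123/plan9 123123 456456'.

    Returns (disk, [(name, start, end), ...]).
    Format assumed from the single example in the devsd.c comment.
    """
    key, sep, value = line.partition("=")
    if not sep or not key.endswith("part"):
        raise ValueError("not a partition line: %r" % line)
    disk = key[:-len("part")]
    partitions = []
    for field in value.split("/"):
        name, start, end = field.split()
        partitions.append((name, int(start), int(end)))
    return disk, partitions


if __name__ == "__main__":
    print(parse_sdpart("sdC0part=dos 63 123123/plan9 123123 456456"))
```

Each triple is taken here as a partition name with start and end sector numbers, in the same order as the `[63,351646785)` ranges in the 9load messages earlier in the thread.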
* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Russ Cox @ 2003-10-22 18:06 UTC
To: 9fans

> you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
> easiest change for you might be to stop 9load from checking the values (temporarily)
> until something better is done.

yes, this. change 9load so that if the max disk address is 2^28,
it doesn't sanity check the upper bounds in the partition table.

russ

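The suggested relaxation can be pictured roughly as follows. This is a sketch of the idea only, not 9load's code: `may_add_partition` and `LBA28_CAPPED` are hypothetical names, and treating both 2^28 - 1 (the value in the transcribed error) and 2^28 as "capped" is an assumption made here to reconcile the two figures in the thread.

```python
# Assumption: a loader without 48-bit LBA support reports a large disk as
# ending at the 28-bit ceiling, so either value is treated as "capped".
LBA28_CAPPED = (2**28 - 1, 2**28)


def may_add_partition(start, end, disk_sectors):
    """Decide whether partition [start,end) is accepted for the disk."""
    if start < 0 or start > end:
        return False          # malformed range, always rejected
    if disk_sectors in LBA28_CAPPED:
        return True           # size may be truncated: skip upper-bound check
    return end <= disk_sectors


if __name__ == "__main__":
    print(may_add_partition(63, 351646785, 268435455))
```

With this rule the `[63,351646785)` partition from the transcribed error would be accepted on a disk reported as `[0,268435455)`, while smaller disks keep the normal upper-bound check.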
* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: vdharani @ 2003-10-22 20:07 UTC
To: 9fans

hi,

>> you could configure fdisk and prep into /boot, and have boot use them,
>> but perhaps the easiest change for you might be to stop 9load from
>> checking the values (temporarily) until something better is done.
>
> yes, this. change 9load so that if the max disk address is 2^28,
> it doesn't sanity check the upper bounds in the partition table.

is someone changing 9load source centrally also? that way it would be
useful for other similar instances as well. and, may be, if reqd, the
sanity check could be changed from display-error-and-exit to
warn-and-proceed.

thanks
dharani

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-22 19:59 UTC
To: 9fans

I modified 9load as Russ suggested. I cannot test it just yet,
but I will do so when I get home.

It was a very simple, one-line change that, should it work, will
suffice until a better solution is available.

On Wed, Oct 22, 2003 at 06:58:56PM +0100, Charles Forsyth wrote:
> i see. i'd wondered after i'd posted whether this might be it:
>
> the kernel driver doesn't work out partitions on its own any more.
> the devices' partition tables are set, for instance in /bin/termrc, by
> disk/fdisk -p and disk/prep -p, into /dev/sd??/ctl, but that's not much help if you need the
> file systems to get termrc (let alone fdisk or prep), and you need the partitions
> to find the file system. thus, boot adds the results of its own probe to the
> internal copy of plan9.ini passed to the kernel, and so the partitions are
> defined in time for boot to find them.
>
> /sys/src/9/port/devsd.c has
>	 * Use partitions passed from boot program,
>	 * sdC0part=dos 63 123123/plan9 123123 456456
>	 * This happens before /boot sets hostname so the
>	 * partitions will have the null-string for user.
>	...
>
> you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
> easiest change for you might be to stop 9load from checking the values (temporarily)
> until something better is done.

> From: Christopher Nielsen <cnielsen@pobox.com>
> To: 9fans@cse.psu.edu
> Date: Wed, 22 Oct 2003 10:29:53 -0700
> Subject: Re: [9fans] venti+fossil, the boot process, and 48-bit lba
>
> On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
> > >>and head-scratching, it turns out that 9load doesn't know how
> > ...
> > >>As a result, when boot tries to start venti and venti
> > >>tries to find the arenas, it all goes to hell.
> >
> > how does 9load cause that to happen?
> > the thing that starts venti is a normal kernel, which
> > should handle 48-bit LBA. does 9load (not knowing
> > about them) leave the ATA in a state where the kernel
> > can't use 48-bits either?
> >
> > i can see that 9load not knowing 48-bit addressing wouldn't
> > be able to see boot/9fat partitions beyond a certain point,
> > or perhaps with the right geometry.
>
> one more bit of information. when 9load detects the discs,
> it says it can't add the partitions on the 167G drives
> because the boundaries are out of range. the upper end of
> the boundary that it prints is the upper end of 28-bit
> addressing. that's why i concluded it had to do with 48-bit
> lba.
>
> this is transcribed by hand, but it looks something like
> this:
>
> cannot add sdF0!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
> cannot add sdF1!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
>
> --
> Christopher Nielsen
> "They who can give up essential liberty for temporary
> safety, deserve neither liberty nor safety." --Benjamin Franklin

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-23 2:16 UTC
To: 9fans

That didn't work. It shut up 9load about the partitions, but
venti still couldn't open the partition.

Unless someone has a better idea, I'm going to proceed with
incorporating fdisk and prep into /boot and teach boot to
use them to sniff out the partitions.

On Wed, Oct 22, 2003 at 12:59:40PM -0700, Christopher Nielsen wrote:
> I modified 9load as Russ suggested. I cannot test it just yet,
> but I will do so when I get home.
>
> It was a very simple, one-line change that, should it work, will
> suffice until a better solution is available.
>
> On Wed, Oct 22, 2003 at 06:58:56PM +0100, Charles Forsyth wrote:
> > i see. i'd wondered after i'd posted whether this might be it:
> >
> > the kernel driver doesn't work out partitions on its own any more.
> > the devices' partition tables are set, for instance in /bin/termrc, by
> > disk/fdisk -p and disk/prep -p, into /dev/sd??/ctl, but that's not much help if you need the
> > file systems to get termrc (let alone fdisk or prep), and you need the partitions
> > to find the file system. thus, boot adds the results of its own probe to the
> > internal copy of plan9.ini passed to the kernel, and so the partitions are
> > defined in time for boot to find them.
> >
> > /sys/src/9/port/devsd.c has
> >	 * Use partitions passed from boot program,
> >	 * sdC0part=dos 63 123123/plan9 123123 456456
> >	 * This happens before /boot sets hostname so the
> >	 * partitions will have the null-string for user.
> >	...
> >
> > you could configure fdisk and prep into /boot, and have boot use them, but perhaps the
> > easiest change for you might be to stop 9load from checking the values (temporarily)
> > until something better is done.
>
> > From: Christopher Nielsen <cnielsen@pobox.com>
> > To: 9fans@cse.psu.edu
> > Date: Wed, 22 Oct 2003 10:29:53 -0700
> > Subject: Re: [9fans] venti+fossil, the boot process, and 48-bit lba
> >
> > On Wed, Oct 22, 2003 at 04:57:55PM +0100, Charles Forsyth wrote:
> > > >>and head-scratching, it turns out that 9load doesn't know how
> > > ...
> > > >>As a result, when boot tries to start venti and venti
> > > >>tries to find the arenas, it all goes to hell.
> > >
> > > how does 9load cause that to happen?
> > > the thing that starts venti is a normal kernel, which
> > > should handle 48-bit LBA. does 9load (not knowing
> > > about them) leave the ATA in a state where the kernel
> > > can't use 48-bits either?
> > >
> > > i can see that 9load not knowing 48-bit addressing wouldn't
> > > be able to see boot/9fat partitions beyond a certain point,
> > > or perhaps with the right geometry.
> >
> > one more bit of information. when 9load detects the discs,
> > it says it can't add the partitions on the 167G drives
> > because the boundaries are out of range. the upper end of
> > the boundary that it prints is the upper end of 28-bit
> > addressing. that's why i concluded it had to do with 48-bit
> > lba.
> >
> > this is transcribed by hand, but it looks something like
> > this:
> >
> > cannot add sdF0!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
> > cannot add sdF1!plan9 [63,351646785) to disk [0,268435455): partition boundaries out of range
> >
> > --
> > Christopher Nielsen
> > "They who can give up essential liberty for temporary
> > safety, deserve neither liberty nor safety." --Benjamin Franklin
>
> --
> Christopher Nielsen
> "They who can give up essential liberty for temporary
> safety, deserve neither liberty nor safety." --Benjamin Franklin

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

* [9fans] trouble down t'labs
From: George Michaelson @ 2003-10-24 4:32 UTC
To: 9fans

AT&T have admitted to severe SPAM related slowdowns. It might be that the data
congestion along the path is causing so much packetloss, crypted and other
loss-intolerant paths are not working well.

how much of 9 protocol is over IP or UDP rather than TCP? I suppose even if TCP,
it could score massive delay as the windowsize kept down due to fragloss.

Only a hypotenuse. Somebody more in-the-know can kybosh it.

-George

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: Christopher Nielsen @ 2003-10-27 7:48 UTC
To: 9fans

On Wed, Oct 22, 2003 at 07:16:28PM -0700, Christopher Nielsen wrote:
> That didn't work. It shut up 9load about the partitions, but
> venti still couldn't open the partition.
>
> Unless someone has a better idea, I'm going to proceed with
> incorporating fdisk and prep into /boot and teach boot to
> use them to sniff out the partitions.

Everything is back in working order. However, the solution
is a complete hack, specific to my configuration, and not
worth sending in patches. I'll have a go at fixing it
properly when I have the time, but that may not be before
ron has a better solution replacing 9load.

BTW, my idea of a proper solution is teaching 9load about
48-bit LBA.

I like having the configuration info on the partition. As
expected, it makes for much simpler management of the
fileserver.

Thanks to everyone that had suggestions.

--
Christopher Nielsen
"They who can give up essential liberty for temporary
safety, deserve neither liberty nor safety." --Benjamin Franklin

* Re: [9fans] venti+fossil, the boot process, and 48-bit lba
From: ron minnich @ 2003-10-22 16:08 UTC
To: 9fans

On Wed, 22 Oct 2003, Christopher Nielsen wrote:

> Ron, weren't you working on eliminating 9load from the
> picture?

yep, I'm still trying to recover my auth server from the switch to
fossil+venti. I may reload from scratch, things did not go well.

ron

end of thread, other threads:[~2003-10-27 7:48 UTC | newest]

Thread overview: 12+ messages
2003-10-22 15:43 [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
2003-10-22 15:57 ` Charles Forsyth
2003-10-22 17:10   ` Christopher Nielsen
2003-10-22 17:29   ` Christopher Nielsen
2003-10-22 17:58     ` Charles Forsyth
2003-10-22 18:06       ` Russ Cox
2003-10-22 20:07         ` vdharani
2003-10-22 19:59       ` Christopher Nielsen
2003-10-23  2:16         ` Christopher Nielsen
2003-10-24  4:32           ` [9fans] trouble down t'labs George Michaelson
2003-10-27  7:48           ` [9fans] venti+fossil, the boot process, and 48-bit lba Christopher Nielsen
2003-10-22 16:08 ` ron minnich