From mboxrd@z Thu Jan 1 00:00:00 1970 Mime-Version: 1.0 (Apple Message framework v752.2) To: 9fans@cse.psu.edu Message-Id: Content-Type: multipart/alternative; boundary=Apple-Mail-1-382446015 From: Gregory Pavelcak Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 09:08:25 -0400 Topicbox-Message-UUID: 969dab50-ead1-11e9-9d60-3106f5b1d025 --Apple-Mail-1-382446015 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed I finally got to check on this. First thanks to Geoff for his informative replies and to jmk for taking the time to let me know Geoff was away. That saved me from checking my email every 5 seconds looking for his posts. I tried copyworm yesterday. I wasn't sure what device I was supposed to use. First I made main fh0 and output fw2, but I got a message saying something like `no blocks to copy from worm'. So, then I just made main h0 and output w2. Copyworm ran and ended with (and I actually wrote the message down this time): wreniocmd out of range a=w2 b=6105859 wrenwrite: w2(6105859) bad status 0040 out block 6105859: write error; bailing copied 615859 blocks from h0 to w2 sync: wormcopy looping: reset the machine at any time h0 is an 80G IDE drive (Western Digital something), and w2 is a Seagate 50GB U2WCS 3HH ST150176LC 1.6-IN HIGH 80-PIN according to the seller's blurb. I wouldn't think 50 is just too small given that I only had a few GB of stuff and have only been running since December. I need to try the p(w2)1.99. I didn't have any intention of having an MBR but maybe something there is the problem. Do you think that might help? It takes almost 20hrs to complete the copy, so it's not an easy experiment to do. Anyway, any thoughts appreciated. Greg -=-=-=-=-=-=-=-=-=-=-=-=- It sounds like cannot add sd12!plan9 [63,156296385] to disk [0,976937355): partition boundaries out of range is coming from 9load. 
I haven't seen this, but it does sound like garbled partition tables or a bad disk. The first block of a file system contains its configuration block, so if you plan to have an MBR or other DOS/FAT stuff on the disk, you'll need to skip it. I do that on my main file server to allow a small DOS partition. To do so, use p(w0)1.99 instead of w0, for example. `panic: fworm: checktag 6105775' is quite serious. The fake-worm bitmap isn't initialised. Copying fworms is tricky, particularly if the output device is larger than the input device. You probably ought to use copyworm instead of copydev, since copydev just blindly copies blocks, assuming input and output are the same size, but copyworm knows how to copy fworms, including reaming the output fworm, thus creating its bitmap. --Apple-Mail-1-382446015 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=ISO-8859-1

= --Apple-Mail-1-382446015-- From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <149389885664fccd49bafd429c7c5e31@quanstro.net> From: erik quanstrom Date: Sat, 5 Aug 2006 08:57:31 -0500 To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 96ad5460-ead1-11e9-9d60-3106f5b1d025 i don't have a worm or pseudoworm, but it's interesting that 50G/8k blocksize is approximately 6105859 (i get 6103515+5/8). is your total size of your original worm >50G? i'm pretty sure copyworm just does a block-for-block copy and doesn't check to see if blocks are used or not. also, the error code matches this code from wreniocmd: /sys/src/fs/dev/wren.c:83,86; #1566,#1668 if(b >= dr->max) { print("wreniocmd out of range a=%Z b=%lld\n", d, (Wideoff)b); return 0x40; } so i think it's pretty likely that your pseudoworm is too large for the new disk. - erik From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 10:38:23 -0400 From: geoff@plan9.bell-labs.com In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 96f294bc-ead1-11e9-9d60-3106f5b1d025 At the beginning of the copyworm operation, writtensize() should do this: print("limit(%Z) = %lld\n", worm, (Wideoff)lim); It would help to know what that line was when you ran copyworm. 
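The 50G/8k estimate and the range test quoted from wreniocmd above can be checked with a few lines of standard C. This is only a sketch, not the file server's code: BLOCKSIZE is assumed to be 8192 bytes (taken from the "8k blocksize" estimate, not confirmed in the thread), and blocks_in() and range_status() are hypothetical helper names. The drive's real dr->max is not stated in the thread; the usage note assumes it equals the first failing block number.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed block size, taken from the "8k blocksize" estimate in the thread. */
enum { BLOCKSIZE = 8192 };

/* Number of whole blocks that fit in a device of nbytes bytes. */
int64_t
blocks_in(int64_t nbytes)
{
	assert(nbytes >= 0);
	return nbytes / BLOCKSIZE;
}

/*
 * Same shape as the test quoted from wreniocmd: a block number is out
 * of range when it is at or beyond the device's block count (max).
 * Returns 0x40 to echo the quoted status, 0 when the block is in range.
 */
int
range_status(int64_t b, int64_t max)
{
	if(b >= max)
		return 0x40;
	return 0;
}
```

With these assumptions, blocks_in(50000000000) is 6103515, which matches the 6103515+5/8 figure once the fraction is dropped, and range_status(6105859, 6105859) is 0x40, the status reported by wrenwrite.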
From mboxrd@z Thu Jan 1 00:00:00 1970 Mime-Version: 1.0 (Apple Message framework v752.2) In-Reply-To: References: Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: Content-Transfer-Encoding: 7bit From: Gregory Pavelcak Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 14:11:59 -0400 To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu> Topicbox-Message-UUID: 9760ed72-ead1-11e9-9d60-3106f5b1d025 limit(h0) = 9768447 I copied all of the devinit stuff if you need anything else, but I won't bother typing it unless you want it. Thanks. Greg On Aug 5, 2006, at 10:38 AM, geoff@plan9.bell-labs.com wrote: > At the beginning of the copyworm operation, writtensize() should do > this: > > print("limit(%Z) = %lld\n", worm, (Wideoff)lim); > > It would help to know what that line was when you ran copyworm. > From mboxrd@z Thu Jan 1 00:00:00 1970 Mime-Version: 1.0 (Apple Message framework v752.2) In-Reply-To: <149389885664fccd49bafd429c7c5e31@quanstro.net> References: <149389885664fccd49bafd429c7c5e31@quanstro.net> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: Content-Transfer-Encoding: 7bit From: Gregory Pavelcak Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 14:14:52 -0400 To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu> Topicbox-Message-UUID: 97696ad8-ead1-11e9-9d60-3106f5b1d025 That would be funny if that's all it was. And surprising to me. How fast does a worm grow under "ordinary" usage? Greg On Aug 5, 2006, at 9:57 AM, erik quanstrom wrote: > i don't have a worm or pseudoworm, but it's interesting that > 50G/8k blocksize is approximately 6105859 (i get 6103515+5/8). > is your total size of your original worm >50G? i'm pretty sure > copyworm just does a block-for-block copy and doesn't check to see > if blocks are used or not. 
also, the error code matches this code > from > wreniocmd: > > /sys/src/fs/dev/wren.c:83,86; #1566,#1668 > if(b >= dr->max) { > print("wreniocmd out of range a=%Z b=%lld\n", d, (Wideoff)b); > return 0x40; > } > > so i think it's pretty likely that your pseudoworm is too large for > the new disk. > > - erik From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <509071940608051232w13c7b89w1776297d83e3de73@mail.gmail.com> Date: Sat, 5 Aug 2006 15:32:57 -0400 From: "Anthony Sorace" To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu> Subject: Re: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <149389885664fccd49bafd429c7c5e31@quanstro.net> Topicbox-Message-UUID: 977584a8-ead1-11e9-9d60-3106f5b1d025 defining "ordinary" is, of course, impossible, but about yea fast: http://www.cs.bell-labs.com/who/seanq/p9trace.html From mboxrd@z Thu Jan 1 00:00:00 1970 To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu> From: Gregory Pavelcak Subject: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: Your message of "Sat, 05 Aug 2006 15:32:57 EDT." <509071940608051232w13c7b89w1776297d83e3de73@mail.gmail.com> Date: Sat, 5 Aug 2006 16:19:57 -0400 Message-Id: <20060805202005.6F647A9C0@mail.cse.psu.edu> Topicbox-Message-UUID: 977ac030-ead1-11e9-9d60-3106f5b1d025 Well, don't I feel silly. I thought I had stumbled upon a problem that required an fs wizard, but, unless Geoff says otherwise, I'm now convinced that I was just trying to get an elephant into a hamster cage. I'll just have to rethink my disk usage. Or maybe I'll try fossil/venti. Fs has been great, but the graph indicates that venti grows more slowly, and I don't want to give up my mirror for concatenation. Now, the question is, if I switch to venti, how do I do it without losing my history? 
Seems to me that I've seen posts saying something like "Nemo posted a script", but I've never found the script. But that's just a vague memory. I'll have to do some 9fans archive searching. Thanks. Greg From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <1eb60af701688ea28c07b2ca3ad595d0@quanstro.net> From: erik quanstrom Date: Sat, 5 Aug 2006 15:24:40 -0500 To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: <20060805202005.6F647A9C0@mail.cse.psu.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 978988fe-ead1-11e9-9d60-3106f5b1d025 i'd really be surprised if you had actually used up 50G. i think that the problem is that copyworm tries to copy the whole worm --- even empty blocks. the relevant check in getbuf combined with writtensize() seems to do this: static Devsize writtensize() { Devsize lim; for(lim = devsize(worm); lim > 0; lim--) if(blocknum(lim) is "active" OR blocknum(lim) can be read) return lim+1; return 0; } am i missing something? it's not obvious to me how to determine the last block that's actually got data in it. - erik On Sat Aug 5 15:22:54 CDT 2006, g.pavelcak@comcast.net wrote: > > Well, don't I feel silly. I thought I had stumbled upon a problem > that required an fs wizard, but, unless Geoff says otherwise, I'm > now convinced that I was just trying to get an elephant into a > hamster cage. I'll just have to rethink my disk usage. Or maybe I'll > try fossil/venti. Fs has been great, but the graph indicates that > venti grows more slowly, and I don't want to give up my mirror for > concatenation. > > Now, the question is, if I switch to venti, how do I do it without > losing my history? Seems to me that I've seen posts saying something > like "Nemo posted a script", but I've never found the script. But > that's just a vague memory. I'll have to do some 9fans archive > searching. > > Thanks.
> > Greg From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 19:13:05 -0400 From: geoff@plan9.bell-labs.com In-Reply-To: <1eb60af701688ea28c07b2ca3ad595d0@quanstro.net> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 97945928-ead1-11e9-9d60-3106f5b1d025 I believe Charles wrote the original version of what is now dowormcopy() and I probably modified it; Nigel may have also. It copes with real and fake worms. On a fake worm, the block-allocation bitmap is at the end of the file system, but all blocks can be read. On a real worm, only already-written blocks can be read and there is no allocation bitmap (or at least none visible outside the jukebox hardware). Because the file server kernel gradually extends the upper limit of blocks it will allocate on the worm, there will tend to be both written and unwritten blocks just before the current nominal end of the file system (but the end is increased when cwgrow() is called). After the current end there should only be unwritten blocks, and almost all of the blocks before it should be written. All sizes are in blocks of RBUFSIZE bytes. writtensize() gets the device's size from devsize(), which calls fwormsize() for fake worms or cwsize() for real ones. cwsize() just gets the current end of the file system from the cache device, where it's conveniently stored. fwormsize() subtracts the size of the allocation bitmap from the size of the underlying device. writtensize() works backward from the end of the device, using the size returned by devsize() as its nominal size, reading blocks until it succeeds. 
This should be pretty quick; for fake worms, fwormread() just consults the allocation bitmap to see if a block is allocated and thus readable, so writtensize() will see a long series of failed getbuf() attempts before reading the last allocated block before the allocation bitmap. Real worms are similar, but starting at the cache's notion of where the file system currently ends will be much closer to the last-written block than starting at the end of the non-bitmap fake worm device, so the backward scan will be short. From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <7dd00e3a7067afeaad6074b0a0bc64ec@plan9.bell-labs.com> To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 19:21:48 -0400 From: geoff@plan9.bell-labs.com In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 979af062-ead1-11e9-9d60-3106f5b1d025 I didn't explain that as clearly as I could have: all blocks on the disk underlying a fake worm can be read from the disk device, but only already-written blocks can be read via the fake-worm device, and it's the fake worm's allocation bitmap that determines which blocks have been written. This in turn emulates the real worm hardware, which only permits reading written blocks. From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sat, 5 Aug 2006 19:34:11 -0400 From: geoff@plan9.bell-labs.com In-Reply-To: <801d46846a08dbbcb038a6cfb7692f9e@quanstro.net> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 97abeb60-ead1-11e9-9d60-3106f5b1d025 Good; when I reread my first message, it looked imprecise enough to confuse. /sys/src/fs/dev/fworm.c contains the fake worm implementation and fwormread() and fwormwrite() consult and maintain the bitmap of written blocks. 
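The backward scan described above (start at the nominal end, step back until a block can be read, with the fake worm's bitmap deciding what is readable) might be sketched as follows. This is a simplified illustration in standard C, not the code in /sys/src/fs/dev/fworm.c: the Fakeworm struct, the one-bit-per-block layout, and the names written() and written_limit() are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a fake worm: data blocks plus a written-block bitmap. */
typedef struct Fakeworm Fakeworm;
struct Fakeworm {
	int64_t nblocks;		/* data blocks, bitmap excluded */
	const unsigned char *map;	/* assumed layout: one bit per block, 1 = written */
};

/* A block is readable through the fake worm only if its bit is set. */
static int
written(const Fakeworm *w, int64_t b)
{
	return (w->map[b/8] >> (b%8)) & 1;
}

/*
 * Scan backward from the nominal end of the device and return the
 * number of the last written block plus one, or 0 if none is written.
 */
int64_t
written_limit(const Fakeworm *w)
{
	int64_t b;

	for(b = w->nblocks - 1; b >= 0; b--)
		if(written(w, b))
			return b + 1;
	return 0;
}
```

For a 32-block device whose bitmap bytes are 0xff, 0x05, 0, 0, blocks 0 through 8 and block 10 are written, so written_limit() returns 11. Blocks past that point never need to be copied, which is why the scan matters for a copy onto a smaller device.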
From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <801d46846a08dbbcb038a6cfb7692f9e@quanstro.net> From: erik quanstrom Date: Sat, 5 Aug 2006 23:27:08 -0500 To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: <7dd00e3a7067afeaad6074b0a0bc64ec@plan9.bell-labs.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 97a6a786-ead1-11e9-9d60-3106f5b1d025 your explanation made perfect sense to me. where is the bitmap consulted? - erik On Sat Aug 5 18:22:14 CDT 2006, geoff@plan9.bell-labs.com wrote: > I didn't explain that as clearly as I could have: all blocks on the > disk underlying a fake worm can be read from the disk device, but only > already-written blocks can be read via the fake-worm device, and it's > the fake worm's allocation bitmap that determines which blocks have > been written. This in turn emulates the real worm hardware, which > only permits reading written blocks. > From mboxrd@z Thu Jan 1 00:00:00 1970 To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu> From: Gregory Pavelcak Subject: Re: [9fans] Fs64 file server, partition boundaries out of range In-Reply-To: Your message of "Sat, 05 Aug 2006 19:13:05 EDT." Date: Sun, 6 Aug 2006 11:06:56 -0400 Message-Id: <20060806150717.6A4E3EAFD@mail.cse.psu.edu> Topicbox-Message-UUID: 97ef41c6-ead1-11e9-9d60-3106f5b1d025 I hate to be the slowest guy in the room, but it's not clear where this leaves me. If I understand the paragraph below, the upshot is that there may be some unwritten blocks that copyworm thinks it has to copy, but it's not the case that it is just trying to copy all 80G worth, written or not, onto my 50G disk. So, I either really do have over 50G of blocks that it's trying to copy, or there is some other problem. I see that the limit() value Geoff asked for is basically devsize() for my source disk. Does that mean it's just too much stuff? Thanks for all the help.
Greg > > On a real worm, only already-written blocks can be read and there is > no allocation bitmap (or at least none visible outside the jukebox > hardware). Because the file server kernel gradually extends the upper > limit of blocks it will allocate on the worm, there will tend to be > both written and unwritten blocks just before the current nominal end > of the file system (but the end is increased when cwgrow() is called). > After the current end there should only be unwritten blocks, and > almost all of the blocks before it should be written. > From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: To: 9fans@cse.psu.edu Subject: Re: [9fans] Fs64 file server, partition boundaries out of range Date: Sun, 6 Aug 2006 21:06:08 -0400 From: geoff@plan9.bell-labs.com In-Reply-To: <20060806150717.6A4E3EAFD@mail.cse.psu.edu> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Topicbox-Message-UUID: 9800c66c-ead1-11e9-9d60-3106f5b1d025 copyworm shouldn't copy unwritten blocks. I'm still not sure that I know what the problem here is. Could you run these commands on your file server console and send us the output? cfs main check printconf The file server will be unresponsive until check finishes, so don't be alarmed.
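For reference while waiting on the check and printconf output, the two block counts reported in the thread can be put side by side. This is plain arithmetic in standard C, not file server code: BLOCKSIZE is assumed to be 8192 bytes (from the "8k blocksize" estimate), block_bytes() and overflow_blocks() are hypothetical names, and the output device's capacity is assumed to equal the first block number that wreniocmd rejected.

```c
#include <stdint.h>

/* Assumed block size, taken from the "8k blocksize" estimate in the thread. */
enum { BLOCKSIZE = 8192 };

/* Bytes spanned by nblocks blocks. */
int64_t
block_bytes(int64_t nblocks)
{
	return nblocks * BLOCKSIZE;
}

/*
 * Number of blocks below limit that have no place on an output device
 * holding outmax blocks; 0 when everything below limit fits.
 */
int64_t
overflow_blocks(int64_t limit, int64_t outmax)
{
	return limit > outmax ? limit - outmax : 0;
}
```

With limit(h0) = 9768447, block_bytes() gives 80023117824 bytes, about the full 80G of h0, and overflow_blocks(9768447, 6105859) is 3662588. That only says the reported limit lies well past the end of w2; it says nothing about how many blocks below the limit are actually written.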