From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <509071940709181528h29854b52jf1856b5bbfde7259@mail.gmail.com>
Date: Tue, 18 Sep 2007 18:28:14 -0400
From: "Anthony Sorace"
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_19040_26724548.1190154494053"
Subject: [9fans] cwfs(4) failing: phase error after recover or suicide after normal startup
Topicbox-Message-UUID: c0ba4816-ead2-11e9-9d60-3106f5b1d025

------=_Part_19040_26724548.1190154494053
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Having played around with cwfs for a week or so now, I'm trying to use it to migrate my old kenfs. The relevant config line is 'filsys main cw4f[w<0-3>]'; w4 no longer works. I've gotten w0-3 hooked up to my cpu server and have created a devmap mapping w4 to a 30GB disk file (the original w4 disk was slightly larger than that).

Starting up cwfs with -f (and other appropriate invocations) seems to work fine; I give it the config, it reports the correct mapping, and I end the conversation with 'recover main' and 'end'. The recover seems to work fine - it reports the block numbers of a few hundred dumps - but the process ends with this (I have CHAT(cp) always return true):

next dump at Wed Sep 19 05:00:00 2007
c_session 0
c_attach 0
	fid = 1
	uid = adm
	arg = main
fworm: read 1400715
error: phase error -- directory entry not allocated
panic: FID1 attach to root
halted at Tue Sep 18 14:16:07 2007.

That phase error is Ealloc in 9p1.c^f_attach.

If I omit the -f from cwfs's invocation after doing the recover, I get this, instead:

next dump at Wed Sep 19 05:00:00 2007
c_session 0
c_attach 0
	fid = 1
	uid = adm
	arg = main
cwfs:10685: suicide: sys: trap: divide error pc=0x000131cd

PC there points to /sys/src/cmd/cwfs/cw.c:562 - cwio(); I've got acid traces in, but I don't see anything obviously wrong (no divide-by-zero, h->msize is positive, &c).
In this case, only the first cwfs proc has suicided; the rest are running along, getting 9p requests (although not managing to actually do anything with them).

I'm going to do more tracing after dinner or tomorrow, but I'm reasonably stumped at this point. Any pointers or other help is greatly appreciated. Particularly intriguing is why the behavior differs, when the first case successfully completes the recover and moves on.

Attached is a summary of an acid debugging run on the suicided process from the second form, should anyone want to take a look.

------=_Part_19040_26724548.1190154494053
Content-Type: text/plain; name="cwfs.acid.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="cwfs.acid.txt"
X-Attachment-Id: f_7ezzq5

OiBzb3BoaWE7IGFjaWQgMTA2ODUKL3Byb2MvMTA2ODUvdGV4dDozODYgcGxhbiA5IGV4ZWN1dGFi
bGUKCi9zeXMvbGliL2FjaWQvcG9ydAovc3lzL2xpYi9hY2lkLzM4NgphY2lkOiBsc3RrKCkKY3dp
byhkZXY9MHgxNmY0MDgsYWRkcj0weDAsYnVmPTB4ODI1MGNjMCxvcGNvZGU9MHg4KSsweDk4IC9z
eXMvc3JjL2NtZC9jd2ZzL2N3LmM6NTYyCgljdz0weDg1ZmIxODgKCWNiPTB4YmZkNzUwCgloPTB4
ODBhOGNjMAoJYTE9MHgyNWRkOQoJYm49MHhiZmQ3NTAKCWEyPTB4MzQwMDAwMDEKCW1heD0weDAK
CW5ld21heD0weDEKCXA9MHhiZmQ3NTAKCWI9MHgyMDI2NwoJYz0weDM3NWJjCglzdGF0ZT0weDEK
CXAxPTB4MjVkZDkKCXAyPTB4YmZkNzUwCmN3cmVhZChkZXY9MHgxNmY0MDgsYj0weDAsYz0weDgy
NTBjYzApKzB4MjggL3N5cy9zcmMvY21kL2N3ZnMvY3cuYzo0OTYKZGV2cmVhZChjPTB4ODI1MGNj
MCxiPTB4MCxkPTB4MTZmNDA4KSsweDFiNiAvc3lzL3NyYy9jbWQvY3dmcy9zdWIuYzo5ODgKCWU9
MHgyOTc0MwpnZXRidWYoZD0weDE2ZjQwOCxhZGRyPTB4MCxmbGFnPTB4MSkrMHgxZGUgL3N5cy9z
cmMvY21kL2N3ZnMvaW9idWYuYzoxMDkKCWhwPTB4YjVhMjU4CglwPTB4YmZmYmMwCmZfYXR0YWNo
KGNwPTB4ODVmYWQyOCxpbj0weGRmZmZlYmNjLG91PTB4ZGZmZmViMzApKzB4MjliIC9zeXMvc3Jj
L2NtZC9jd2ZzLzlwMS5jOjI0MQoJcD0weDAKCWY9MHgxNzYyZTAKCWZzPTB4NGQzYjAKCXJhZGRy
PTB4MAoJZD0weDIwMmFmCmZjYWxsOXAxKGluPTB4ZGZmZmViY2Msb3U9MHhkZmZmZWIzMCxjcD0w
eDg1ZmFkMjgpKzB4OTUgL3N5cy9zcmMvY21kL2N3ZnMvY29uc29sZS5jOjIxCgl0PTB4NTYKY29u
X2F0dGFjaChmaWQ9MHgxLHVpZD0weDQwNThmLGFyZz0weDE3MzRlOCkrMHg4NCAvc3lzL3NyYy9j
bWQvY3dmcy9jb25zb2xlLmM6NDgKCWluPTB4MWVjNTYKCW91PTB4MWVhNTcKY21kX2Nmcyhhcmdj PTB4MSxhcmd2PTB4ZGZmZmVjYTgpKzB4NzAgL3N5cy9zcmMvY21kL2N3ZnMvY29uLmM6NjE4Cglu YW1lPTB4NDA1NzEKCWZzPTB4NGQzYjAKY21kX2V4ZWMoYXJnPTB4NDAxYTApKzB4ZGMgL3N5cy9z cmMvY21kL2N3ZnMvY29uLmM6MTE4CglsaW5lPTB4NzM2NjYzCglhcmd2PTB4ZGZmZmVjZDQKCWFy Z2M9MHgxCglpPTB4MQpjb25zc2VydmUoKSsweDNkIC9zeXMvc3JjL2NtZC9jd2ZzL2Nvbi5jOjIw CglpPTB4MjlkMAptYWluKGFyZ3Y9MHhkZmZmZWY4OCxhcmdjPTB4MSkrMHgyYjAgL3N5cy9zcmMv Y21kL2N3ZnMvbWFpbi5jOjMzMQoJbmV0cz0weDEKCV9hcmdjPTB4NmQKCV9hcmdzPTB4M2ZkNjQK CWFubj0weDAKCWk9MHhmCl9tYWluKzB4MzEgL3N5cy9zcmMvbGliYy8zODYvbWFpbjkuczoxNgph Y2lkOiBwcmludChwY2ZpbGUoMHgwMDAxMzFjZCkpCi9zeXMvc3JjL2NtZC9jd2ZzL2N3LmMKYWNp ZDogcHJpbnQocGNsaW5lKDB4MDAwMTMxY2QpKQo1NjIKYWNpZDogaW5jbHVkZSgiL3N5cy9zcmMv Y21kL2N3ZnMvYXJraXZlL2N3LmFjaWQiKQphY2lkOiBjd2lvOmRldgoJdHlwZQkweDA4Cglpbml0 CTB4ZjQKCWxpbmsJMHgwMDAwMDAwMAoJZGxpbmsJMHgwODI1MGNjMAoJcHJpdmF0ZQkweDAwMDAw MDA4CglzaXplCTU4MjMyODg0NTg2MDg2NApfMTBfIHsKXzJfIHdyZW4gewoJY3RybAkxNTA0MjY0 Cgl0YXJnCTAKCWx1bgkxMzY2NDU4MjQKCW1hcHBlZAkxMTkwMzU4MAoJZmlsZQkweDAwMDI5ODQ1 CglmZAkxMjU4MTgyNAoJc2RkaXIJMHgwMDAyOTc0MwoJc2RkYXRhCTB4MDAwMTg0MWYKfQpfM18g Y2F0IHsKCWZpcnN0CTB4MDAxNmY0MDgKCWxhc3QJMHgwMDAwMDAwMAoJbmRldgkxMzY2NDU4MjQK fQpfNF8gY3cgewoJYwkweDAwMTZmNDA4Cgl3CTB4MDAwMDAwMDAKCXJvCTB4MDgyNTBjYzAKfQpf NV8gaiB7CglqCTB4MDAxNmY0MDgKCW0JMHgwMDAwMDAwMAp9Cl82XyBybyB7CglwYXJlbnQJMHgw MDE2ZjQwOAp9Cl83XyBmdyB7CglmdwkweDAwMTZmNDA4Cn0KXzhfIHBhcnQgewoJZAkweDAwMTZm NDA4CgliYXNlCTAKCXNpemUJMTM2NjQ1ODI0Cn0KXzlfIHN3YWIgewoJZAkweDAwMTZmNDA4Cn0K fQoKYWNpZDogY3dpbzpoCgltYWRkcgkxMzQ5MDkxMjAKCW1zaXplCTEyNTcyNDk2CgljYWRkcgkx MjU3MjQ5NgoJY3NpemUJMTU1MDk3Cglmc2l6ZQkxMjU3MjQ5NgoJd3NpemUJNzc3MTgKCXdtYXgJ MTUwNDI2NAoJc2JhZGRyCTAKCWN3cmFkZHIJMTM2NjQ1ODI0Cglyb3JhZGRyCTgKCXRveXRpbWUJ MAoJdGltZQkxMzU1ODQKCmFjaWQ6IGN3aW86YWRkcgoweGRmZmZlYTYwCmFjaWQ6IGN3aW86Ym4K MHhkZmZmZWEzOAoKYWNpZDogcmMoImNhdCAvZGV2L3RleHQgPiAvbW50L3Rlcm0vVXNlcnMvYW50 aG9ueS9EZXNrdG9wL2N3ZnMuYWNpZC50eHQiKQo= 
------=_Part_19040_26724548.1190154494053--