From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Haertel
Message-Id: <200205262040.g4QKeqbC094464@ducky.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] bug in disk/format
In-Reply-To: <895b1e01e565d3dae52f32ef59bfee96@plan9.bell-labs.com>
Date: Sun, 26 May 2002 13:40:52 -0700
Topicbox-Message-UUID: 9cd82d4e-eaca-11e9-9e20-41e7f4b1d025

>If you update, you'll get a new format that
>has both your patch and my fix from earlier
>tonight. Do you still see the problem when
>you run that format? If so, run format with
>the -v flag and let me know what you see.

Yup, the problem is still there. And, curiously enough, the file
offset at which the miscomparison occurs has gone back to match
that in my *original* bug report.

This suggests that:

1) in my original bug report, I had neglected to first zero the
   previous contents of the 9fat partition, so the old version of
   disk/format was getting its idea of the geometry from sector 0
   of the 9fat partition, which at the time held the correct
   geometry (255 heads, 63 sectors) for the disk.

2) the behavior of the bug does indeed depend on what format's
   idea of the geometry is.

So, on the theory that the problem has something to do with format's
notion of the geometry, I tried to set up a fake "sd01" directory
containing ctl, data, and 9fat files constructed to fool format into
thinking it is looking at a real disk, and run format on that.

I discovered that in this case the problem does not occur.

Just to be sure that I had enough bits and pieces in my fake sd01
directory to fool format into thinking it was looking at a real disk,
I ran format on the real disk drive under "iostats -d" to see all
its I/O.

I discovered in this case the problem also does not occur.

So I can only see the bug when format is *directly* accessing the real
SCSI driver.

Watch this:

# first, we run format -v on the real scsi disk and record its output.
# just to make things r
term% dd -if /dev/zero -of /dev/sd01/9fat -count 1
1+0 records in
1+0 records out
term% dd -if /dev/zero -of /dev/sd01/9fat -seek 2 -count 20480
20480+0 records in
20480+0 records out
term% disk/format -v -b /386/pbs -d -r 2 /dev/sd01/9fat /386/9load /386/9pcdisk /tmp/plan9.ini > format.out1 >[2] format.err1

(at this point, starting up a fresh dossrv and running cmp verifies
that /n/9fat/9load has bogus data starting at byte 55809 and
/n/9fat/9pcdisk has bogus data starting at byte 1)

# now, we run the exact same command under "iostats -d"
term% dd -if /dev/zero -of /dev/sd01/9fat -count 1
1+0 records in
1+0 records out
term% dd -if /dev/zero -of /dev/sd01/9fat -seek 2 -count 20480
20480+0 records in
20480+0 records out
term% iostats -d disk/format -v -b /386/pbs -d -r 2 /dev/sd01/9fat /386/9load /386/9pcdisk /tmp/plan9.ini > format.out2 >[2] format.err2

(at this point, starting up a fresh dossrv and running cmp verifies
that /n/9fat/9load and /n/9fat/9pcdisk were both copied correctly)

# and here we see that disk/format produces different output when it
# is directly accessing the real scsi disk vs. when it is accessing
# the scsi disk proxied through iostats -d.
# (we also see that iostats eats standard error entirely; grr)
term% diff format.out1 format.out2
33c33
< plan9.ini @1BEE00
---
> plan9.ini @1DEC00
term% diff format.err1 format.err2
1,6d0
< add 9load at clust 2
< add 9pcdisk at clust 59
< add plan9.ini at clust 3ad
< add 9load at clust 2
< add 9pcdisk at clust 59
< add plan9.ini at clust 3ad

So at this point my hypothesis is that there is no problem with
disk/format at all, but rather that there is a bug of some kind in
the SCSI driver.
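
As a quick sanity check on the numbers in the transcript, here is a
minimal Python sketch that expresses the two plan9.ini offsets from
the diff, and the first bad byte of 9load reported by cmp, in units of
sectors. Two things are assumptions and not stated in the report: a
512-byte sector size, and that cmp counts bytes starting from 1. The
variable names are made up for illustration.

```python
# Assumption: 512-byte sectors (the report does not state the sector size).
SECTOR = 512

# plan9.ini offsets taken from "diff format.out1 format.out2".
direct_offset = 0x1BEE00   # format run directly against /dev/sd01/9fat
proxied_offset = 0x1DEC00  # same command run under "iostats -d"

# How far apart the two runs placed plan9.ini, in bytes and in sectors.
delta = proxied_offset - direct_offset
delta_sectors, delta_rem = divmod(delta, SECTOR)

# cmp reported bogus data in 9load starting at byte 55809.
# Assumption: that count is 1-based, so subtract 1 for a zero-based offset.
first_bad = 55809 - 1
bad_sectors, bad_rem = divmod(first_bad, SECTOR)

print("plan9.ini delta: %#x bytes = %d sectors, remainder %d"
      % (delta, delta_sectors, delta_rem))
print("9load first bad offset: %d bytes = %d sectors, remainder %d"
      % (first_bad, bad_sectors, bad_rem))
```

Under those assumptions both values come out as whole numbers of
sectors (255 and 109, each with remainder 0). That is only an
observation about the figures; by itself it does not say where the
fault lies.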