From mboxrd@z Thu Jan 1 00:00:00 1970 From: Artem Letko Subject: Re: [9fans] exception/interrupt 14 bug To: 9fans@cse.psu.edu Message-id: <000701c1fad8$71fef9e0$1c742544@rosele01.nj.comcast.net> MIME-version: 1.0 Content-type: text/plain; charset=koi8-r Content-transfer-encoding: 7BIT References: <293d086081456bfda9777d45dfa18ba1@plan9.bell-labs.com> Date: Mon, 13 May 2002 19:46:46 -0400 Topicbox-Message-UUID: 90639170-eaca-11e9-9e20-41e7f4b1d025 The fix doesn't seem to be related to my problem though - exception/interrupt 14 happens much earlier in my case, before the kernel is fully loaded. The new floppy still panics. Art. ----- Original Message ----- From: "Russ Cox" To: <9fans@cse.psu.edu> Sent: Monday, May 13, 2002 3:58 PM Subject: [9fans] exception/interrupt 14 bug > The short version is that I think this bug is fixed; at > least, I think the bug that cause the two reports with > register dumps are fixed. I don't know about the others > that were mentioned without full panic messages, but it's > worth trying again. Thanks to Marc Rocas for reporting the > first instance of the bug and to Kevin Adams of VMware for > reporting the fix five days ago (though I didn't realize it > then). > > The long version is the story of the actual bug, just > because it's fun. > > Here's the first transcript I got of the bug in action: > > PBS... Plan 9 from Bell Labs > ELCR: 0E00 > apm ax=f000 cx=f000 dx=40 di=ffff ebx=57c7 esi=-1 > using fd0!dos!plan9.ini > dev A0 port 1F0 config 045A capabilities 2F00 mwdma 0007 udma 0407 > dev A0 port 170 config 80A0 capabilities 0E00 mwdam 0000 > > found 9PCFLOP .GZ attr 0x0 start 0xf0 len 1162928 > ...gz..... 1162928 -> 637473+992156+63892 = 1693521 > entry:80100020 > FLAGS=10286 TRAP=e ECODE=0 PC=8001c50b > AX d3c04c57 BX 000006e0 CX 80317718 DX 00000000 > SI 8003b580 DI 8003b55c BP 0000000a > CS 0010 DS 0008 ES 0008 FS 0008 GS 0008 > CR0 80000011 CR2 d3c04c6b CR3 000c000 > panic: exception/interrupt 14 > > After printing the entry line and before jumping to the > loaded kernel, 9load calls _sddetach to turn off any devices > that might otherwise interrupt the nascent kernel. > > static void > _sddetach(void) > { > SDev *sdev; > > for(sdev = sdlist; sdev != nil; sdev = sdev->next){ > if(sdev->enabled == 0) > continue; > if(sdev->ifc->disable) > sdev->ifc->disable(sdev); > sdev->enabled = 0; > } > } > > Unfortunately, sddetach has died because sdev->ifc does not > contain &sdataifc as expected, but rather d3c04c57 (AX). We > know that sdev is the right pointer because it's in CX and > booting another 9load with some debugging prints in malloc > shows that the first sdev always ends up at 80317718. > > The interesting question now is what d3c04c57 is. It's not > a pointer or a small integer. It really looks like random > noise; it could conceivably be a clock tick count except > that the clock runs at 100Hz and is initialized to zero. > Still, it's distinctive enough to look for. It's not in the > 9load binary nor in 9pcflop.gz, but it _is_ in the > uncompressed 9pcflop, at location 125159 (hex). The > kernel size is (decimal) 637473+992156+63892, so > that puts it in the data segment. The kernel file layout > looks like: > > 00000 exec header > 00020 text > 9ba41 data > > while the layout in memory will eventually look like > > 80100000 exec header > 80100020 text > 8019c000 data > > with the data shifted to begin on the next page boundary > after the end of the text. That means our magic value > d3c04c57 should have ended up at 80225718, but we > found it at 80317718 instead. The identical page alignment > of the expected and actual locations of the value is very > suspicious. Their difference is f2000, which doesn't ring > any bells. > > Meanwhile, we have this puzzle. > > if(gunzip((uchar*)PADDR(entry)-sizeof(Exec), sizeof(Exec)+text+data, > (uchar*)b->bp, b->wp-b->bp) < sizeof(Exec)+text+data) { > print("error uncompressing kernel\n"); > return FAIL; > } > > /* relocate data to start at page boundary */ > memmove((void*)PGROUND(entry+text), (void*)(entry+text), data); > > print("entry: %lux\n", entry); > warp9(PADDR(entry)); > > The magic value came into existence in memory as a result > of the gunzip call above, but it would have been in an unshifted > text segment, so it would be at 80225159 rather than 80225718. > The memmove is the only way the data could have been shifted > into the right alignment, but how did it end up at 80317718? > > But first, a different story. Five days ago Kevin Adams of > VMware pointed out that our protected mode entry code > looked like: > > 0x1000:0x0066 | 0f 01 16 6d 01 LGDT (0x16d) > 0x1000:0x006b | b8 01 00 MOV %ax,0x1 > 0x1000:0x006e | 00 00 ADD 0x0(%bx+%si),%al > 0x1000:0x0070 | 0f 01 f0 LMSW %cr0,%ax > 0x1000:0x0073 | eb 00 JMP 0x0 ;0x75 > > the weird part being the ADD instruction. Of course, this > is because the mov %eax, 0x1 that precedes it was > assembled with a 32-bit immediate rather than a 16-bit > immediate. > > This is a fairly new bug. I introduced it while fiddling with the > VMware port a few months ago. The effect is to increment > the byte at (BX+SI), whatever that may be. Maybe, just maybe, > some bit in a page table is getting whacked just right to double > map one page at both 80225000 and 80317000. > > I put a check in warp9 to look for page table corruption > and sent out a new boot disk, but the check didn't fire; > the crash remained the same. So the page tables are okay. > > Of course, if I could do decimal/hex conversion a little > better in my head I would have noticed that f2000 (the difference > twixt 80225000 and 80317000) is almost exactly the size of > the data segment itself: 992156 = f239c. > > If the memmove call being executed were really > > memmove((void*)PGROUND(entry+text+data), (void*)(entry+text), data); > > then that would break things in exactly the manner we > observed. But clearly the memmove isn't that, and further > it works on most people's machines. > > The disassembly looks perfectly fine: > > bootpass+0x210 0x80015798 MOVL entry+0x28(SP),DX > bootpass+0x214 0x8001579c MOVL data+0x24(SP),CX > bootpass+0x218 0x800157a0 MOVL text+0x20(SP),BX > bootpass+0x21c 0x800157a4 MOVL CX,BP > bootpass+0x21e 0x800157a6 ADDL BX,CX > bootpass+0x220 0x800157a8 ADDL $0x20,CX > bootpass+0x223 0x800157ab CMPL CX,AX > bootpass+0x225 0x800157ad JCC bootpass+0x23d(SB) > ... > bootpass+0x23d 0x800157c5 MOVL BX,AX > bootpass+0x23f 0x800157c7 ADDL DX,AX > bootpass+0x241 0x800157c9 ADDL $0xfff,AX > bootpass+0x246 0x800157ce ANDL $0xfffff000,AX > bootpass+0x24b 0x800157d3 MOVL AX,0x0(SP) > bootpass+0x24e 0x800157d6 LEA 0x0(BX)(DX*1),AX > bootpass+0x251 0x800157d9 MOVL AX,0x4(SP) > bootpass+0x255 0x800157dd MOVL BP,0x8(SP) > bootpass+0x259 0x800157e1 CALL memmove(SB) > > The first bit is doing the comparison of the gunzip return > value, and then we set the arguments for memmove. > > But wait. We don't have to guess at what byte got incremented > by the other bug. The last thing before the jump to protected > mode is the BIOS call to initialize APM. The registers upon > return were saved for the kernel, and even printed in the > transcript: > > apm ax=f000 cx=f000 dx=40 di=ffff ebx=57c7 esi=-1 > > And there it is. BX+SI = 57c6, and the data segment > at the time of the increment is 1000, making the byte > 157c6 aka 80157c6. The MOVL BX,AX at 80157c5 gets > assembled as 89 D8; after the second byte is incremented > it becomes 89 D9, which is MOVL BX, CX. > > So in fact, the memmove call that was being executed was > > memmove((void*)PGROUND(entry+text+data+sizeof(Exec)), > (void*)(entry+text), data); > > which ends up being close enough. > > The fix is in /sys/src/boot/pc/l.s. In the section marked > > /* > * goto protected mode > */ > > change > > MOVL $1, AX > > into > > LWI(1, rAX) > > I put up a fixed boot disk, although the distribution archive > still contains a bad 9load. Hopefully we can get a new > distribution archive together soon. That takes a bit longer > to test. > > Russ > >