From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 29 Dec 1997 19:20:46 -0600
From: G. David Butler gdb@dbSystems.com
Subject: [9fans] Re: etherelnk3.c
Topicbox-Message-UUID: 6ec74d6a-eac8-11e9-9e20-41e7f4b1d025
Message-ID: <19971230012046._o6X-syQmm9KyIQeTu04PH8j-s0Z_uA3Db01brTOcA4@z>

From: eld@jewel.ucsd.edu (Eric Dorman)
>Definitely worth a look; I've got a 9pcfs that allows the use
>of IDE disks for filesystems

Why? I agree IDE disks are very attractive from a $$/MB point of
view, but you can't get enough of them on a machine. The overall
$$/MB is less with SCSI since you can aggregate the cost of the
CPU/RAM over more disks. Also, IDE is PIO and that is a bad place
to waste CPU resources.

> and have found that the network
>is far and away the limiting factor (10Mbit/sec 10BaseT on NE2000s)

If you use two transmit buffers, sending one and filling the other,
you will find much more "network" in your NE2000. [I don't use the
NE2000 anymore now that I have the 3c515 working! Not for the 100bT,
but for the 64k of RAM and busmaster transfers!]

>and 100BaseT is practically free these days; I'd love to use 100BaseT.

Not when you look at the price/performance of 10bT full duplex
switches and 100bT hubs. 100bT full duplex switches are nice, and
expensive. (I'm looking at many nodes on the network, all *very*
busy.)

If you follow the thread a while ago about the 3Com cards and the
big packet problem, you will see it's pretty easy to get the 3C509
PCI 10/100 card up and running (in PIO mode). I still haven't ported
the Brazil driver because of the ongoing discussion about
ringbufs/blocks/msgbufs. (I'm still leaning towards using the
ringbufs.) But, once that is done, you'll have that option.

> Might even be able to interleave across primary and
>secondary IDEs (if the braindead chipsets will support it..).

No need; filsys main [h0h1] will interleave for you.

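A rough sketch of what interleaving across [h0h1] amounts to. This
assumes plain round-robin placement with a one-block interleave unit;
the names Stripe and interleave are made up for illustration and are
not taken from the fs source.

```c
#include <assert.h>

/* Hypothetical result type: which member device, and which block on it. */
typedef struct Stripe Stripe;
struct Stripe {
	int		dev;	/* index into the interleaved set, 0..ndev-1 */
	unsigned long	blk;	/* block number on that member device */
};

/*
 * Map logical block b of an ndev-way interleaved set onto a member
 * device and a block on that member.
 * Assumption: round-robin by block number, one block per stripe unit.
 */
Stripe
interleave(unsigned long b, int ndev)
{
	Stripe s;

	assert(ndev > 0);
	s.dev = (int)(b % (unsigned long)ndev);
	s.blk = b / (unsigned long)ndev;
	return s;
}
```

With two members, consecutive logical blocks alternate between the
two disks, so sequential I/O is spread over both without any help
from the IDE chipset.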
>So far I've had to scramble around in the fs code changing 'long's
>to 'ulong's in bytewise size computations and changing the type
>of disk block addresses to ulongs; the matched-pair of 3.5G disks
>breaks 'long's. I'm worrying though that this may have caused
>my block tag bug.

Very possible. I looked at doing a global long to ulong change but
found some places that weren't easy to fix (I don't remember where
now), so I left it alone. I was thinking of reducing the block size
and so needed more blocks. If you leave the block size at 4K and
handle the multiplier like devwren does, then you shouldn't need to
make that change.

[snip]

>reads a block, expecting it to have tag 3 (IND1 block) but instead
>got a file data block (tag 5, err DFile); I'm pretty sure the block

What is the path set to? Is it the first file or the second? If it
is the second, then you are overwriting the block.

>in question *should* have been an IND1 block but I haven't actually
>seen the fs scribble on that particular block any time after it
>gets flushed for the first time. (Grr) It appears the right

Was it in memory correctly before being flushed?

>I would like some comments on changing the way the fs knows
>about available physical disks. As it stands the fs knows
>about 'Devwren', 'Devworm' and etc. which is fine. The choices
>for adding an IDE interface were to utilize a new dev type
>(Devide) and codeletter (h) for IDE disks, (requires rewiring
>stuff in fs/port/sub.c)

Yes!

> or somehow patching into the scsi
>stuff below the Devwren level.

No!

> I chose the former as an
>easier solution but it's, well, icky; changing stuff in
>fs/port is evil since I'd have to stub out 'ideread/idewrite'
>in all other architectures.

Why? Use different names. You need to handle the translation of
RBUFSIZE blocks to real sectors somewhere.

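Something along these lines is what I mean by handling the
multiplier. It is only a sketch: it assumes 4K fs blocks and 512
byte sectors, and SECTSIZE, MULT and blk2sect are illustrative names,
not the ones devwren actually uses.

```c
#include <assert.h>

enum {
	RBUFSIZE = 4096,		/* fs block size (the 4K case) */
	SECTSIZE = 512,			/* assumed disk sector size */
	MULT = RBUFSIZE / SECTSIZE	/* sectors per fs block */
};

/*
 * First sector of fs block b.  Working in sectors rather than byte
 * offsets keeps the numbers small: no byte count is ever formed.
 */
long
blk2sect(long b)
{
	assert(b >= 0);
	return b * MULT;
}
```

A pair of 3.5G disks is roughly 1.7 million 4K blocks, or under 14
million sectors, so both fit comfortably in a signed 32 bit long.
Only the byte offset (about 7 billion) overflows, and this way it is
never computed.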
>Seems to me a better solution would be to have the hardware-specific
>initialization stuff build a table describing the disks connected
>to the box (complete with codeletters, size, traps into the
>hardware driver, etc) and have fs/port/sub.c go indirect
>through the table to the hardware.

Maybe. I like the way it is now because my mirror code likes to
know that a disk is missing (for whatever reason) and then know that
it is available again later (a reboot cleared the error, or the
drive was replaced). The config block is written to a mirror set
with "config {w0.0w1.0}" and it tells you what the system is
supposed to look like even if it doesn't look like that now. I have
caused drive and controller failures and the system just takes the
drive (or all the drives on a failed controller) offline and keeps
running. When the system is fixed and rebooted, it finds the mirrors
needing recovery and does it. Even if it is booted with the drives
sick, it does the right thing.

When I added log support to the system, I thought about going
directly to scsiio so I could write less data to the log (you have
to go down to that level before 512 byte blocks are visible). But I
also wanted to use the striping [] and mirror {} capabilities, so I
created another special file system "log". (Special in the way that
"main" is special.) This decision made many other things easier,
e.g. I can use the buffer cache for recovery (getbuf/putbuf) and
bypass it otherwise (devwrite).

I would recommend using what is there; it works pretty well.

David Butler
gdb@dbSystems.com