From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <44b45b851b4bde52572c4ef5ec703f86@plan9.bell-labs.com>
From: presotto@plan9.bell-labs.com
To: afrayedknot@thefrayedknot.armory.com, 9fans@cse.psu.edu
Subject: Re: [9fans] setup
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Date: Tue, 10 Sep 2002 00:16:03 -0400
Topicbox-Message-UUID: e7cf0390-eaca-11e9-9e20-41e7f4b1d025

>
> Next Ive been noticing that my auth server periodically loses touch with
> the file server and is unable to run any commands or renegotiate the
> connection. When at a prompt everything yields "i/o on hungup channel"
> the only thing ive found is to reboot the auth server. This seems
> strange to me because the two systems are on the same hub connected
> through 100megabit ethernet. Is it just from too many collisions, or il
> not handling it? Can i use aan to ensure a perminant connection? If so
> how do get it to work on the remotely imported root from bootup? Would
> adding a cache file system help/resolve the problem?

The file server won't run aan. However, you shouldn't be getting
hung up. I take it the terminal doesn't have these problems?
Similar hardware or not? If not, I'd switch which machine is which
and see if the problem moves with the machine or stays with the auth
server.

Cron is probably running periodically on the auth server. Is there
any correlation between auth doing something and the connection
going south? Is there anything going on when you get hung up?

Last but not least, are you on a switch? Do the switch and cpu
agree about the line being full duplex or half duplex? Our track
record in that department is fuzzy.

If a cache file system helps, it will be because you're sidestepping
the real problem, and it'll come back and bite you in the backside.

>
> Im also confused about how /rc/bin/service works.
> I understand that
> all the files there are just shell scripts, but is it 'ok' if i have
> all my cpu servers looking at the same /rc/bin/service are there any
> services in there that are specific to the auth server or are all those
> in service.auth? Are there any services in /rc/bin/service that are
> best left to just one system to be listening for instead of all the cpu
> servers? (if im missing something please tell me).

We normally run as few services on the auth server as possible.
However, since your auth server and cpu server are the same
machine... Once you have your n cpu servers, you can scale back the
auth server to do just a few things. We normally run them
standalone so that they can't easily be hacked because we left
something unprotected on the mail file server.

/rc/bin/service.auth has the things that need to run as the
hostowner. It contains things that cpu servers run (imap4d, pop3,
and ssh servers). This should eventually become the null set
because of factotum. It also includes the authentication server
(il566 and tcp567). This needs to run as hostowner because it needs
to access the key database.

>
> What ports should be open for the cpu command (and what do they do?). The
> manuals say one thing and then the wiki says something slightly different,
> I dont want to leave a bunch of ports open when i dont have to. As
> far as authenticating cpu connections Im not sure if I have everything
> set up right. In order for the command to work I have to set the cpu
> variable to tcp!cpuserver!cpu, so in my case tcp!ra!cpu (thats correct
> right?). I looked at the debug output from a cpu command on the cpu
> server and I noticed that there are a lot of complaints about messages
> being the wrong size and that it tries to authenticate as the hostowner
> of the cpu server, then fails and tries the username of who executed the
> command and finishes the handshaking. Is that what is supposed to happen?

The cpu command now uses only tcp port 17010/ncpu.
17013 is only for an older version of cpu, and 17006 for an older
one still. They're both obsolete. If you're running cron, you
probably want to leave the rexexec port open also, tcp port
17009/rexexec. Finally, both systems dialing the cpu will have to
authenticate to your auth server, so they need to get to tcp port
567 on the auth server.

You should be able to just have the variable

	cpu=ra

Or you could just use '-h ra' whenever you run cpu. By adding the
!cpu you're trying to connect to an old-version cpu server. I'm
surprised anything answers at all. Check that your /lib/ndb/common
matches what's on sources.cs.bell-labs.com. It could be that you
have some odd mix of old and new style cpu clients and servers.

I don't understand the short messages unless they're just aborted
attempts. Could you send us some examples of the debugging output?

>
> Sorry for the long post, one more thing. Is there a way to 're-login'
> on a terminal without rebooting it? I know you can use auth/login to
> change the namespace of one window, but is there a way to do it for rio's
> namespace? Am i better off having everyone reboot or having it log in
> as a guest account and then running a nested rio as a specific user?

No, just reboot.
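
For the port list given earlier (17010/ncpu and 17009/rexexec on the
cpu server, 567 on the auth server), a small probe run from another
machine on the same network might help confirm what is actually
reachable. This is only a sketch under stated assumptions: Python is
an arbitrary choice, `authsrv` is a hypothetical name for the auth
server, and the function names are made up for illustration. Only the
port numbers and the host name `ra` come from the discussion above.

```python
import socket

# Port numbers as listed in the reply; the descriptions are informal notes.
CPU_SERVER_PORTS = {
    17010: "ncpu, used by the current cpu command",
    17009: "rexexec, worth leaving open if cron is running",
}
AUTH_SERVER_PORTS = {
    567: "authentication server over tcp",
}


def required_ports(cpu_host, auth_host):
    """Return (host, port, purpose) tuples for every port that should be reachable."""
    targets = [(cpu_host, port, why) for port, why in sorted(CPU_SERVER_PORTS.items())]
    targets += [(auth_host, port, why) for port, why in sorted(AUTH_SERVER_PORTS.items())]
    return targets


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


def report(cpu_host, auth_host):
    """Probe every required port and return (host, port, purpose, reachable) tuples."""
    return [(host, port, why, is_open(host, port))
            for host, port, why in required_ports(cpu_host, auth_host)]
```

Calling `report("ra", "authsrv")` would return one tuple per port with
a final True/False for reachability. The obsolete ports 17013 and
17006 are deliberately left out of the tables, so anything still
answering on them would have to be checked separately.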