On Sun, Nov 16, 2008 at 09:24:19PM +0000, Eris Discordia wrote:
>> That isn't happening. All we have is one TCP connection and one small
>> program exporting file service.
>
> I see. But then, is it the "small program exporting file service" that
> does the multiplexing? I mean, if two machines import a gateway's /net
> and both run HTTP servers binding to and listening on *:80 what takes
> care of which packet belongs to which HTTP server?

I don't think you've quite got it yet.... also I swore I wouldn't post
in this thread. Oh well, here goes.

First, let me draw a picture, just to introduce the characters:

  +----------+              +---------+
  | Internal |<--Ethernet-->| Gateway |<--Ethernet-->(Internet)
  | Computer |       ^      +---------+
  +----------+       |
                     |
  +----------+       |
  | Other    |<------+
  | Internal |
  | Computer |
  +----------+

OK. Here, we have two internal computers (IC and OIC) and a gateway G.
There are two Ethernet networks in play: one connecting IC, OIC, and G,
and the other connecting G to the internet at large, somehow (e.g.
ADSL).

IC and OIC both initialize somehow, say using DHCP from G, and bring
their network stacks up using this information. Their kernels now
provide the services that will be mounted, by convention, at /net,
again using this private information. G initializes statically,
bringing its IP stack up on two interfaces, with two routes: one for
internal traffic, and one default route.

So far, it's all very similar between Plan 9 and Linux. Here's where
our story diverges.

In Linux, IC and OIC are both taught that their default route to the
Internet At Large runs through G's internal ethernet interface, so they
send their outbound IP datagrams there. How this works in more detail,
they don't care. G will also send them corresponding IP datagrams from
its interface; that's all they care about. G works furiously to decode
the network traffic bound for the internet and NA(P)T it out to its
gateway, maintaining very detailed tables about TCP connections, UDP
datagrams, etc, etc.
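To make the contrast concrete, here's a toy sketch of the sort of
bookkeeping G's NA(P)T must do. Everything here (the class, the port
numbers, the 40000 starting port) is invented for illustration; it is
not Linux's actual conntrack code, just the shape of the idea:

```python
# Toy sketch of a NA(P)T gateway's connection-tracking table.
# Illustrative only; real implementations also track state and timeouts.

class NatTable:
    """Rewrites (internal_ip, internal_port) pairs to unique external ports."""

    def __init__(self, external_ip, first_port=40000):
        self.external_ip = external_ip
        self.next_port = first_port
        self.out = {}   # (proto, int_ip, int_port) -> ext_port
        self.back = {}  # (proto, ext_port) -> (int_ip, int_port)

    def outbound(self, proto, int_ip, int_port):
        """Map an outgoing flow to an external (ip, port)."""
        key = (proto, int_ip, int_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[(proto, self.next_port)] = (int_ip, int_port)
            self.next_port += 1
        return self.external_ip, self.out[key]

    def inbound(self, proto, ext_port):
        """Find which internal host a reply packet belongs to."""
        return self.back[(proto, ext_port)]

nat = NatTable("4.2.2.2")
print(nat.outbound("tcp", "192.168.1.2", 1025))  # IC's flow -> port 40000
print(nat.outbound("tcp", "192.168.1.3", 1025))  # OIC, same local port -> 40001
print(nat.inbound("tcp", 40001))                 # reply demuxed back to OIC
```

Note that G must keep this table for every single flow, and that
protocols which hide addresses inside their payloads (FTP, ESP) need
extra special-casing on top of it. Hold that thought for what follows.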
How could it be any other way?

Let's redraw the picture, as things stand now, under Plan 9, just so
things are clear:

  (OIC similar to IC)
  +----Internal Computer--------------------+
  |                                         |
  | bind '#I' /net                          |
  |                                         |
  | #I : Kernel IP stack (192.168.1.2)      |
  |                                         |
  | #l : Kernel ethernet driver             |<---Ethernet---+
  +-----------------------------------------+               |
                                                            |
  +----Gateway------------------------------+               |
  |                                         |               |
  | bind '#I' /net                          |               |
  |                                         |               |
  | #I : Kernel IP stack (192.168.1.1)      |               |
  |                (4.2.2.2)                |               |
  |                                         |               |
  | #l : Kernel ethernet driver (ether0)    |<---Ethernet---+
  |                             (ether1)    |<---Ethernet----->(Internet)
  +-----------------------------------------+

In Plan 9, G behaves like any other machine, building a /net out of
pieces exported by its kernel, including the bits that know how to
reach the internet at large through the appropriate interface. Good so
far?

Let's have G run an exportfs, exposing its /net on the internal IP
address. This /net knows how to talk to the internal addresses and the
external ones. Meanwhile, IC can reach out and import G's /net, binding
it at /net.alt, let's say. Now, programs can talk to the Internet by
opening files in /net.alt. These open requests will be carried by IC's
mount driver, and then IC's network stack, to G, whereupon the exportfs
(in G's userland) will forward them to its idea of /net (by open()ing,
read()ing, write()ing, etc.), which is the one built on G's kernel,
which knows how to reach the Internet. Tada!
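If it helps, the file operations involved in dialing out can be
sketched as a toy in-memory model. The class below is invented for this
post; only the layout (tcp/clone, a numbered connection directory, a
"connect" control message) mirrors the real /net:

```python
# Toy, in-memory model of Plan 9's file-based dial protocol.
# Not real Plan 9 code; just the sequence of operations made concrete.

class NetStack:
    """Stands in for a kernel #I device bound at /net."""

    def __init__(self):
        self.conns = {}

    def open_clone(self):
        """Opening tcp/clone allocates a new connection directory tcp/n."""
        n = len(self.conns)
        self.conns[n] = {"state": "Closed", "remote": None}
        return n

    def write_ctl(self, n, msg):
        """Writing 'connect host!port' to tcp/n/ctl dials out."""
        verb, addr = msg.split(" ", 1)
        if verb == "connect":
            self.conns[n]["state"] = "Established"
            self.conns[n]["remote"] = addr

# G's stack, reached through exportfs; IC sees it as /net.alt.
gateway_net = NetStack()

# What a program's open("/net.alt/tcp/clone") + ctl write ultimately
# causes on G, after the mount driver and exportfs relay the requests:
n = gateway_net.open_clone()
gateway_net.write_ctl(n, "connect 4.2.2.2!80")
print(n, gateway_net.conns[n]["state"])  # prints: 0 Established
```

The point is that IC never touches a packet of the resulting TCP
connection; it only performs 9P file operations, and G's stack does the
rest exactly as it would for a local program.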
Picture time:

  (OIC)
  +----Internal Computer-------------------------+
  |                                              |
  | abaco: open /net.alt/tcp/clone               |
  |                                              |
  | import tcp!192.168.1.1!9fs /net.alt (devmnt) |
  |                                              |
  | bind '#I' /net                               |
  |                                              |
  | #I : Kernel IP stack (192.168.1.2)           |
  |                                              |
  | #l : Kernel ethernet driver                  |<---------+
  +----------------------------------------------+          |
                                                            |
  +----Gateway------------------------------+               |
  |                                         |               |
  | exportfs -a -r /net                     |               |
  |                                         |               |
  | bind '#I' /net                          |               |
  |                                         |               |
  | #I : Kernel IP stack (192.168.1.1)      |               |
  |                (4.2.2.2)                |               |
  |                                         |               |
  | #l : Kernel ethernet driver (ether0)    |<---Ethernet---+
  |                             (ether1)    |<---Ethernet----->(Internet)
  +-----------------------------------------+

This works perfectly for making connections: IC's IP stack is aware
only of devmnt requests, and G's IP stack is aware of some traffic to
and from a normal process called exportfs, and that that process
happens to be making network requests via #I bound at /net. The beauty
of this design is just how well it works, everywhere, for everything
you'd ever want.

Now, suppose IC goes to listen on TCP:80, by opening
/net.alt/tcp/clone. The same flow of events happens, and to a certain
extent, G's network stack thinks that the exportfs program (running on
G) is listening on TCP:80. exportfs dutifully copies the /net data back
to its client.

Naturally, if another program on G were already listening on TCP:80, or
the same program (possibly exportfs) attempted to listen twice (if,
say, OIC played the same game and also tried to listen on G's TCP:80),
it would be told that the port was busy. This error would be carried
back along the exportfs path just as any other.

So as you see, there is no need to take care of "which packet belongs
to which server" since there can, of course, be only one server
listening on TCP:80. That server is running on G, and behaves like any
other program. That it just so happens to be an exportfs program,
relaying data to and from another computer, is immaterial.
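The "port was busy" case can be modeled the same way. The helper below
is invented for illustration (on real Plan 9 you'd write an "announce"
message to a ctl file); the behavior is the point: G has exactly one
stack, so the second announce fails no matter which importer asks:

```python
# Toy model of the port-collision case: one stack on G, two importers.

listeners = set()  # ports G's one #I stack has bound

def announce(port):
    """Mimic asking G's stack to listen on a TCP port; fail if taken."""
    if port in listeners:
        raise OSError("announce: address in use")
    listeners.add(port)
    return "ok"

print(announce(80))      # IC, via its imported /net.alt: succeeds
try:
    announce(80)         # OIC plays the same game: refused
except OSError as e:
    print(e)             # the error rides back over 9P like any other reply
```

There is no demultiplexing problem to solve, because there is nothing
to demultiplex: the second request simply never binds.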
This also works for FTP, and UDP, and ESP (which are notorious problems
in the NAT world), and for IL, and for IPv6, and for ethernet frames
(!), and ... you get the idea. It does this with no special tools, no
complex code, and a very minimal per-connection overhead (just the IC
and G kernels and G's exportfs tracking a file descriptor). There are
no connection tracking tables anywhere in this design. There are just
normal IP requests over normal ethernet frames, and a few more TCP (or
IL) connections transporting 9P data.

> On a UNIX clone, or on Windows, because there is exactly one TCP/IP
> protocol stack in usual setups no two programs can bind to the same
> port at the same time. I thought Plan 9's approach eliminated that by
> keeping a distinct instance of the stack for each imported /net.

There can, in fact, be multiple IP stacks on a Plan 9 box, bound to
multiple Ethernet cards, as you claim. (In fact, one can import another
computer's ethernet cards and run snoopy, or build a network stack,
using those instead.) I don't think that's relevant to the example at
hand, though, as should be clear from the above.

--nwf;