From: Richard Salz
> Any view on this?
> https://www.microsoft.com/en-us/research/publication/a-fork-in-the-road/

Yes.

First, I dislike the presentation of the paper. From the pithy title to the snarky section headings to the overly informal register of the writing, I think the authors did themselves a disservice by writing in a style that is far too colloquial. The argument is presented with too much editorial comment, as if it's trying to be "edgy" or something. I find that off-putting and annoying at best.

But stylistic issues aside, the substance of the paper is largely on point. The fact is that, for better or worse, fork() has not aged gracefully into the modern age of multi-threaded programs, big graphical applications, and the like, and the problems the authors point out are very, very real. An argument can be made along the lines of, "well, just don't write programs that way..." but the universe of interesting and useful userspace programs is now large enough that I'm afraid we're well inside the event horizon of increasing entropy.

It's interesting that they make an oblique reference to Austin Clements's dissertation, or at least one of the papers that came out of his work (Clements et al.), but they don't go all the way in exploring the implications.

On Fri, Apr 12, 2019 at 10:51 AM Noel Chiappa wrote:
> Having read this, and seen the subsequent discussion, I think both sides
> have good points.
>
> What I perceive to be happening is something I've described previously,
> but never named, which is that as a system scales up, it can be necessary
> to take one subsystem which did two things, and split it up so there's a
> custom subsystem for each.
>
> I've seen this a lot in networking; I've been trying to remember some of
> the examples I've seen, and here's the best one I can come up with at the
> moment: having the routing track 'unique-ID network interface names'
> (i.e. interface 'addresses') - think 48-bit IEEE interface IDs - directly.
> In a small network, this works fine for routing traffic, and as a
> side-benefit, gives you mobility. Doesn't scale, though - you have to
> build an 'interface ID to location name mapping system', and use
> 'location names' (i.e. 'addresses') in the routing.
>
> So classic Unix 'fork' does two things: i) creates a new process, and
> ii) replicates the environment/etc of an existing process. (In early
> Unix, the latter was pretty simple, but as the paper points out, it has
> now become a) complex and b) expensive.)
>
> I think the answer has to include decomposing the functionality of old
> fork() into several separate sub-primitives (albeit not all necessarily
> directly accessible to the user): a new-process primitive, which can be
> bundled with a number of different alternatives (e.g. i) exec(), ii)
> environment replication, iii) address-space replication, etc.) - perhaps
> more than one at once.

This is approximately what was done in Akaros. The observation, in the MSR paper and elsewhere, that fork() is a poor abstraction because it tries to do too much and doesn't interact well with threads (which Akaros was all about) is essentially correct, and it created undue burdens on the system's implementors. The solution there was to split process creation into two steps: creation proper, and marking a process runnable for the first time. This gave a parent an opportunity to create a child process and then populate its file descriptors, environment variables, and so forth before setting it loose on the system: one got much of the elegance of the fork() model, but without the bad behavior.

The critical observation is that a hard distinction between fork()/exec() on one side and spawn() on the other as the only two possible designs is unnecessary; an intermediate step in the spawn() model, preserving the two-step nature of the fork()/exec() model, yields many of the benefits of both without many of the problems of either.
> So the shell would want a form of fork() which bundled in i) and ii), but
> large applications might want something else. And there might be several
> variants of ii), e.g. one might replicate only environment variables,
> another might add I/O channels, etc.
>
> In a larger system, there's just no 'one size fits all' answer, I think.

This is, I think, the crux of the argument: Unix fork, as introduced on the PDP-7, wasn't designed for "large" systems. I'd be curious to know how much intention was behind fork()'s behavior; that is, to what extent its consequences were known in advance. As Dennis Ritchie's document on early Unix history (well known to most of the denizens of this list) pointed out, the implementation was an expedient, and the construct was loosely based on prior art (Genie, etc.). Was the idea of the implicit dup()'ing of open file descriptors something that people thought about consciously at the time? Certainly, once it was there, people found out that they could exploit it in clever ways for e.g. IO redirection, pipes (which came later, of course), etc., but I wonder to what extent that was discovered as opposed to being an explicit design objective.

Put another way, the thought process may have been along the lines of, "look at the neat properties that fall out of this thing we've already got..." as opposed to, "we designed this thing to have all these neat properties..." Much of the current literature and extant course material seems to take the latter tack, but it's not at all clear (to me, anyway) that that's an accurate reflection of the historical reality.

I briefly mentioned Clements's dissertation above. In essence, the scalable commutativity rule says that interfaces that commute scale better than those that do not, because they do not _require_ serialization and so can be parallelized, etc.
Many of the algorithms in early Unix do not commute: the behavior of returning the lowest available number when allocating a file descriptor is an example (consider IO redirection here and the familiar sequence close(1); open("output", O_WRONLY): this doesn't work if one does the close after the open, etc.). But fork() and exec() might be the ultimate example. Obviously if I exec() before I fork() my semantics are very, very different, but more generally the commutativity must be considered in some context: operations on file descriptors don't commute with one another, but they do commute with, say, memory writes. On the other hand, fork() doesn't commute with much of anything at all.

The early Unix algorithms are so incredibly simple (one can easily imagine the loop over a process's file descriptor table, for example), but one can't help but wonder if there was any sense of what the consequences of the details of those algorithms leaking out to user programs would be 50 years later. Surely not?

- Dan C.