* [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-08 3:38 Lucio De Re
2018-10-08 4:29 ` Digby R.S. Tarvin
2018-10-08 8:09 ` Nils M Holm
0 siblings, 2 replies; 89+ messages in thread
From: Lucio De Re @ 2018-10-08 3:38 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On 10/8/18, Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
>
> So the question is... is plan9 still lean and mean enough to fit onto a
> machine with a 64K address space? Doing a port would certainly provide
> plenty of opportunity to tinker with the lights and switches on front
> panel, and if the port was initially limited to being a CPU server,
> there would be no need to worry about displays and mass storage.... just
> the compiler back end and low level kernel support.
>
You really must be thinking of Inferno, native, running in a host with
1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
Even MPM won't cut it, I don't think.
Lucio.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 3:38 [9fans] PDP11 (Was: Re: what heavy negativity!) Lucio De Re
@ 2018-10-08 4:29 ` Digby R.S. Tarvin
2018-10-08 7:20 ` hiro
2018-10-08 8:12 ` Nils M Holm
2018-10-08 8:09 ` Nils M Holm
1 sibling, 2 replies; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08 4:29 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
A native Inferno port would certainly be a lot easier, but I think you
might be a bit pessimistic about what can fit into a 64K address space
machine. The 11/70 certainly managed to run a very respectable V7 Unix
supporting 20-30 simultaneous active users in its day, and I wouldn't
have thought plan 9, arriving about a decade later, would have been
hugely bigger than V7 Unix.

I recall a demo of Plan9 (I think it also included the source) being
given by Rob Pike at UNSW which he carried on a 1.44MB floppy disc. By
its open source release in 2002 the distribution was 65MB.

The smallest Linux system I have used recently had 256K RAM and 512K
flash. A rather stripped down busybox based system, but it did include
a full TCP/IP stack and a web server. That's comparable to a PDP11
except for the limitation on the largest individual process.

Bear in mind that 16 bit executables are smaller, and whilst the 11/70
had a 64KB address space, physical memory could be somewhat larger, and
an individual process could have 128K of memory if using separate
instruction and data space.

I am used to thinking of Plan9 as very compact, but I haven't really
looked to see if it has grown much since the 80s, and perhaps it is
only next to the astronomical expansion of other systems that it still
looks small. It would be an interesting exercise to find out, if only
to get a better feel for how compact Plan9 actually is ...

DigbyT

On Mon, 8 Oct 2018 at 14:38, Lucio De Re <lucio.dere@gmail.com> wrote:
> On 10/8/18, Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
> >
> > So the question is... is plan9 still lean and mean enough to fit onto a
> > machine with a 64K address space? Doing a port would certainly provide
> > plenty of opportunity to tinker with the lights and switches on front
> > panel, and if the port was initially limited to being a CPU server,
> > there would be no need to worry about displays and mass storage.... just
> > the compiler back end and low level kernel support.
> >
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
> Even MPM won't cut it, I don't think.
>
> Lucio.
>
^ permalink raw reply [flat|nested] 89+ messages in thread
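The address-space figures in the message above can be checked with a
little arithmetic. This is only a sketch: the 16-bit virtual address
width follows from the 64K figure in the thread, while the 22-bit
physical address width for the 11/70 is an assumption added here, not
something stated by any poster. All names are invented for illustration.

```python
# Back-of-the-envelope PDP-11/70 address-space arithmetic.
# Assumption (not from the thread): the 11/70 uses 22-bit physical addresses.

VIRTUAL_ADDRESS_BITS = 16    # yields the 64K address space discussed above
PHYSICAL_ADDRESS_BITS = 22   # assumed value for the 11/70


def address_space_bytes(bits: int) -> int:
    """Number of bytes addressable with an address of the given width."""
    return 1 << bits


def per_process_limit(split_instruction_data: bool) -> int:
    """Largest virtual footprint of a single process, in bytes.

    With separate instruction and data spaces a process gets two
    independent virtual address spaces instead of one.
    """
    spaces = 2 if split_instruction_data else 1
    return spaces * address_space_bytes(VIRTUAL_ADDRESS_BITS)


if __name__ == "__main__":
    print(per_process_limit(False) // 1024, "KiB per process, shared I/D")
    print(per_process_limit(True) // 1024, "KiB per process, split I/D")
    print(address_space_bytes(PHYSICAL_ADDRESS_BITS) // 1024, "KiB physical")
```

The point of the split is that the per-process ceiling doubles while
each individual pointer still only reaches 64K.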
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 4:29 ` Digby R.S. Tarvin
@ 2018-10-08 7:20 ` hiro
2018-10-08 12:03 ` Charles Forsyth
2018-10-08 8:12 ` Nils M Holm
1 sibling, 1 reply; 89+ messages in thread
From: hiro @ 2018-10-08 7:20 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
saving every bit of memory has costs in coding, the pressure wasn't as
strong any more.
the earned flexibility can be used for more elegant design.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 7:20 ` hiro
@ 2018-10-08 12:03 ` Charles Forsyth
2018-10-08 17:20 ` hiro
0 siblings, 1 reply; 89+ messages in thread
From: Charles Forsyth @ 2018-10-08 12:03 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Ideally, anyway.

On Mon, 8 Oct 2018 at 11:20, hiro <23hiro@gmail.com> wrote:
> saving every bit of memory has costs in coding, the pressure wasn't as
> strong any more.
> the earned flexibility can be used for more elegant design.
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 12:03 ` Charles Forsyth
@ 2018-10-08 17:20 ` hiro
2018-10-08 21:55 ` Digby R.S. Tarvin
0 siblings, 1 reply; 89+ messages in thread
From: hiro @ 2018-10-08 17:20 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
i should have said could, not can :)
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 17:20 ` hiro
@ 2018-10-08 21:55 ` Digby R.S. Tarvin
2018-10-08 23:03 ` Dan Cross
0 siblings, 1 reply; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08 21:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Does anyone know what platform Plan9 was initially implemented on?

My guess is that there is no reason in principle that it could not fit
comfortably into the constraints of a PDP11/70, but if the initial
implementation was done targeting a machine with significantly more
resources, it would be easy to make design decisions that would be
entirely incompatible.

Certainly Richard Millar's comment suggests that might be the case. If
it is heavily dependent on VM, then the necessary rewrite is likely to
be substantial.

I'm not sure how the kernel design has changed since the first release.
The earliest version I have is the release I bought through Harcourt
Brace back in 1995. But I won't be home till December so it will be a
while before I can look at it, and probably won't have time to
experiment before then in any case.

For what it is worth, I don't think the embarrassment of riches
presented to programmers by current hardware has tended to produce more
elegant designs. If more resources resulted in elegance, Windows would
be a thing of beauty. Perhaps Plan9 is an exception. It certainly
excels in elegance and design simplicity, even if it does turn out to
be more resource hungry than I imagined. I will admit that the evils of
excessively constrained environments are generally worse in terms of
coding elegance - especially when it leads to overlays and self
modifying code.

PDP11's don't support virtual memory, so there doesn't seem to be any
elegant way to overcome that fundamental limitation on the size of a
single executable.

So I don't think it would be worth a substantial rewrite to get it
going. It is a shame that there don't seem to have been any more
powerful machines with a comparably elegant architecture and attractive
front panel :)

It is sounding like Inferno is going to be the more practical option. I
believe gcc can still generate PDP-11 code, so it shouldn't be too hard
to try.

DigbyT

On Tue, 9 Oct 2018 at 04:53, hiro <23hiro@gmail.com> wrote:
> i should have said could, not can :)
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 21:55 ` Digby R.S. Tarvin
@ 2018-10-08 23:03 ` Dan Cross
2018-10-09 0:14 ` Bakul Shah
2018-10-09 3:08 ` Digby R.S. Tarvin
0 siblings, 2 replies; 89+ messages in thread
From: Dan Cross @ 2018-10-08 23:03 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Mon, Oct 8, 2018 at 6:25 PM Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
> Does anyone know what platform Plan9 was initially implemented on?
>
My understanding is that the earliest experiments involved a VAX, but
development quickly shifted to MIPS and 68020-based machines (the
"gnot" was, IIRC, a 68020-based computer).

> My guess is that there is no reason in principle that it could not fit
> comfortably into the constraints of a PDP11/70, but if the initial
> implementation was done targeting a machine with significantly more
> resources, it would be easy to make design decisions that would be entirely
> incompatible.
>
I find this unlikely.

The PDP-11, while a respectable machine for its day, required too many
tradeoffs to make it attractive as a development platform for a
next-generation research operating system in the late 1980s: be it
electrical power consumption vs computational oomph or dollar cost vs
available memory, the -11 had fallen from the attractive position it
held a decade prior. Perhaps slimming a plan9 kernel down sufficiently
so that it COULD run on a PDP-11 was possible in the early days, but I
can't see any reason one would have WANTED to do so: particularly as
part of the impetus behind plan9 was to exploit advances in
contemporary hardware: lower-cost, higher-performance, RISC-based
multiprocessors; ubiquitous networking; common high-resolution
bitmapped graphical displays; even magneto-optical storage (one bet
that didn't pan out); etc.

> Certainly Richard Millar's comment suggests that might be the case. If it
> is heavily dependent on VM, then the necessary rewrite is likely to be
> substantial.
>
As a demonstration project, getting a slimmed-down plan9 kernel to boot
on a PDP-11/70-class machine would be a nifty hack, but it would be
quite a tour de force and most likely the result would not be generally
useful. I think that, as has been suggested, the conceptual simplicity
of plan9 paradoxically means that resource utilization is higher than
it might otherwise be on either a more elaborate OR more constrained
system (such as one targeting e.g. the PDP-11). When you can afford not
to care about a few bytes here or a couple of cycles there and you're
not obsessed with scraping out the very last drop of performance, you
can employ a simpler (some might say 'naive') algorithm or data
structure.

> I'm not sure how the kernel design has changed since the first release. The
> earliest version I have is the release I bought through Harcourt Brace back
> in 1995. But I won't be home till December so it will be a while before I
> can look at it, and probably won't have time to experiment before then in
> any case.
>
The kernel evolved substantially over its life; something like doubling
in size. I remember vaguely having a discussion with Sape where he said
he felt it had grown bloated. That was probably close to 20 years ago
now.

> For what it is worth, I don't think the embarrassment of riches presented
> to programmers by current hardware has tended to produce more elegant
> designs. If more resources resulted in elegance, Windows would be a thing
> of beauty. Perhaps Plan9 is an exception. It certainly excels in elegance
> and design simplicity, even if it does turn out to be more resource hungry
> than I imagined. I will admit that the evils of excessively constrained
> environments are generally worse in terms of coding elegance - especially
> when it leads to overlays and self modifying code.
>
plan9 is breathtakingly elegant, but this is in no small part because
as a research system it had the luxury of simply ignoring many thorny
problems that would have marred that beauty but that the developers
chose not to tackle. Some of these problems have non-trivial domain
complexity and, while "modern" systems are far too complex by far, that
doesn't mean that all solutions can be recast as elegantly simple
pearls in the plan9 style. Whether we like those problems or not, they
exist and real-world solutions have to at least attempt to deal with
them (I'm looking at you, web x.0 for x >= 2...but curse you you aren't
alone).

> PDP11's don't support virtual memory, so there doesn't seem to be any
> elegant way to overcome that fundamental limitation on the size of a
> single executable.
>
No, they do: there is paging hardware on the PDP-11 that's used for
address translation and memory protection (recall that PDP-11 kept the
kernel at the top of the address space, the per-process "user"
structure is at a fixed virtual address, and the system could trap a
bus error and kill a misbehaving user-space process). What they may not
support is the sort of trap handling that would let them recover from a
page fault (though I haven't looked) and in any case, the address space
is too small to make demand-paging with reclamation cost-effective.

> So I don't think it would be worth a substantial rewrite to get it
> going. It is a shame that there don't seem to have been any more powerful
> machines with a comparably elegant architecture and attractive front panel
> :)
>
An attractive front panel for nearly any machine is just a soldering
iron, LEDs and some logic chips away. As far as elegant architectures,
some are very nice: MIPS is kind of retro but elegant, RISC-V is nice,
680x0 machines can be had at reasonable prices, and POWER is kind of
cool. I know I shouldn't, but I have a soft spot for ARM.

> It is sounding like Inferno is going to be the more practical option. I
> believe gcc can still generate PDP-11 code, so it shouldn't be too hard to
> try.
>
Sounds like a nifty hack. Fitting Dis into a 64k/64k split I/D space is
the challenge.

- Dan C.

> On Tue, 9 Oct 2018 at 04:53, hiro <23hiro@gmail.com> wrote:
>
>> i should have said could, not can :)
>>
^ permalink raw reply [flat|nested] 89+ messages in thread
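The address translation described in the message above can be sketched
as follows. This is a simplified model under stated assumptions, not a
full description of the hardware: it assumes eight pages per address
space selected by the top three bits of a 16-bit virtual address, a
13-bit offset within the page, and page address registers that count in
64-byte units. Page length checks, access modes and split I/D spaces
are left out. The names `translate` and `page_address_registers` are
made up for this example.

```python
# Simplified model of PDP-11 style relocation (assumptions in the lead-in).

PAGE_OFFSET_BITS = 13            # 8 KiB pages
PAGE_COUNT = 8                   # top 3 bits of a 16-bit address
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1
RELOCATION_UNIT = 64             # bytes per unit held in a page address register


def translate(virtual_address: int, page_address_registers: list[int]) -> int:
    """Map a 16-bit virtual address to a physical address."""
    if not 0 <= virtual_address < (1 << 16):
        raise ValueError("virtual address must fit in 16 bits")
    if len(page_address_registers) != PAGE_COUNT:
        raise ValueError("expected one register per page")
    page = virtual_address >> PAGE_OFFSET_BITS       # which of the 8 pages
    offset = virtual_address & OFFSET_MASK           # byte within that page
    base = page_address_registers[page] * RELOCATION_UNIT
    return base + offset


if __name__ == "__main__":
    # Hypothetical register contents: page 1 relocated to physical 16384.
    registers = [0, 256, 0, 0, 0, 0, 0, 0]
    print(oct(translate(0o020010, registers)))
```

Because the virtual address is still only 16 bits wide, relocation lets
physical memory exceed 64K without letting any single mapping do so,
which is the limitation the thread keeps returning to.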
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 23:03 ` Dan Cross
@ 2018-10-09 0:14 ` Bakul Shah
2018-10-09 1:34 ` Christopher Nielsen
` (2 more replies)
2018-10-09 3:08 ` Digby R.S. Tarvin
1 sibling, 3 replies; 89+ messages in thread
From: Bakul Shah @ 2018-10-09 0:14 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross <crossd@gmail.com> wrote:
>
> plan9 is breathtakingly elegant, but this is in no small part because as a
> research system it had the luxury of simply ignoring many thorny problems
> that would have marred that beauty but that the developers chose not to
> tackle. Some of these problems have non-trivial domain complexity and,
> while "modern" systems are far too complex by far, that doesn't mean that
> all solutions can be recast as elegantly simple pearls in the plan9 style.

One thing I have mused about is recasting plan9 as a
microkernel and pushing out a lot of its kernel code into user
mode code. It is already half way there -- it is basically a
mux for 9p calls, low level device drivers, VM support & some
process related code. Such a redesign can be made more secure
and more resilient. The kind of problems you mention are
easier to fix in user code. Different application domains may
have different needs which are better handled as optional user
mode components.

Said another way, keep the good parts of the plan9 design and
rearchitect/reimplement the kernel + essential drivers/usermode
daemons. This is unlikely to happen (without some serious
funding) but still fun to think about! If done, this would be
a more radical departure than Oberon-7 compared to Oberon but
in the same spirit.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 0:14 ` Bakul Shah
@ 2018-10-09 1:34 ` Christopher Nielsen
2018-10-09 3:28 ` Lucio De Re
2018-10-09 17:45 ` Lyndon Nerenberg
2 siblings, 0 replies; 89+ messages in thread
From: Christopher Nielsen @ 2018-10-09 1:34 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Mon, Oct 8, 2018, 17:15 Bakul Shah <bakul@bitblocks.com> wrote:
> On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross <crossd@gmail.com> wrote:
> >
> > plan9 is breathtakingly elegant, but this is in no small part because as a
> > research system it had the luxury of simply ignoring many thorny problems
> > that would have marred that beauty but that the developers chose not to
> > tackle. Some of these problems have non-trivial domain complexity and,
> > while "modern" systems are far too complex by far, that doesn't mean that
> > all solutions can be recast as elegantly simple pearls in the plan9 style.
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code. It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code. Such a redesign can be made more secure
> and more resilient. The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons. This is unlikely to happen (without some serious
> funding) but still fun to think about! If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>
I've mused about that also. My problem has been finding the time. I
think it would be a worthwhile project.

Not entirely unrelated, I've been tinkering with seL4.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 0:14 ` Bakul Shah
2018-10-09 1:34 ` Christopher Nielsen
@ 2018-10-09 3:28 ` Lucio De Re
2018-10-09 8:23 ` hiro
` (2 more replies)
2018-10-09 17:45 ` Lyndon Nerenberg
2 siblings, 3 replies; 89+ messages in thread
From: Lucio De Re @ 2018-10-09 3:28 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code. It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code. Such a redesign can be made more secure
> and more resilient. The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
There are religious reasons not to go there and, perhaps not very
widely advertised, Minix-3 already does that, although I confess that
all my best efforts have not yet created the space for my own
experimentation with it.

You won't believe what kind of madnesses I need to deal with to
consume my few and short remaining years - I'm with Dan in cursing the
modern technological trends, but one of these days I'm going to lock
myself in someone's attic or basement (or a prison cell, if that's
what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
Riff-box - is that really what this black object is called? - and
build an OS from the accumulated wisdom of the last forty years. It
will probably look more like MS-DOS, though! :-(

> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons. This is unlikely to happen (without some serious
> funding) but still fun to think about! If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>
Surely, the targets for experimentation should be the ubiquitous
smart-mobile and the insane arithmetic power of GPUs? All neatly
networked over SDLC (or HDLC: AoH, anyone, for persistent storage?).

Lucio.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 3:28 ` Lucio De Re
@ 2018-10-09 8:23 ` hiro
2018-10-09 9:45 ` Ethan Gardener
2018-10-10 7:32 ` Giacomo Tesio
2 siblings, 0 replies; 89+ messages in thread
From: hiro @ 2018-10-09 8:23 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
we already have a lot of user filesystems. feel free to add other useful ones.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 3:28 ` Lucio De Re
2018-10-09 8:23 ` hiro
@ 2018-10-09 9:45 ` Ethan Gardener
2018-10-09 17:50 ` Bakul Shah
2018-10-10 7:32 ` Giacomo Tesio
2 siblings, 1 reply; 89+ messages in thread
From: Ethan Gardener @ 2018-10-09 9:45 UTC (permalink / raw)
To: 9fans
On Tue, Oct 9, 2018, at 4:28 AM, Lucio De Re wrote:
> On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.
> >
> There are religious reasons not to go there

I'm trying to forget all the religious beliefs I once held with regard
to computers, but I've had these lines in my head for a long time, and
probably won't get a better opportunity to post them:

One day, Uriel met a man who explained very
convincingly that the Plan 9 kernel is a microkernel.
On another day, Uriel met a man who explained very
convincingly that the Plan 9 kernel is a macrokernel.
Uriel was enlightened.

Based on a true story. ;)

> You won't believe what kind of madnesses I need to deal with to
> consume my few and short remaining years - I'm with Dan in cursing the
> modern technological trends, but one of these days I'm going to lock
> myself in someone's attic or basement (or a prison cell, if that's
> what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
> Riff-box - is that really what this black object is called? - and
> build an OS from the accumulated wisdom of the last forty years. It
> will probably look more like MS-DOS, though! :-(

I've started already, but I keep getting sidetracked by my need for
entertainment, which often comes down to spending my energies on things
which don't require such deep design work. I'm hoping it'll get easier
as my health improves; I'm still too stressed too often. The trouble
with this stress is I forget my goals, which are things I've learned
from Plan 9 and other conclusions I've come to.

--
Progress might have been all right once, but it has gone on too long
-- Ogden Nash
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 9:45 ` Ethan Gardener
@ 2018-10-09 17:50 ` Bakul Shah
2018-10-09 18:57 ` Ori Bernstein
0 siblings, 1 reply; 89+ messages in thread
From: Bakul Shah @ 2018-10-09 17:50 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> On Oct 9, 2018, at 2:45 AM, Ethan Gardener <eekee57@fastmail.fm> wrote:
>
> One day, Uriel met a man who explained very
> convincingly that the Plan 9 kernel is a microkernel.
> On another day, Uriel met a man who explained very
> convincingly that the Plan 9 kernel is a macrokernel.
> Uriel was enlightened.

Exactly! No point in being scared by labels! I am really
only talking about distilling plan9 further. At least as a
thought experiment.

Isn’t it more fun to discuss this than all the “heavy
negativity”? :-)
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 17:50 ` Bakul Shah
@ 2018-10-09 18:57 ` Ori Bernstein
0 siblings, 0 replies; 89+ messages in thread
From: Ori Bernstein @ 2018-10-09 18:57 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Tue, 9 Oct 2018 10:50:08 -0700 Bakul Shah <bakul@bitblocks.com> wrote:
> Exactly! No point in being scared by labels! I am really
> only talking about distilling plan9 further. At least as a
> thought experiment.
>
> Isn’t it more fun to discuss this than all the “heavy
> negativity”? :-)

It's much better with patches.

--
Ori Bernstein <ori@eigenstate.org>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 3:28 ` Lucio De Re
2018-10-09 8:23 ` hiro
2018-10-09 9:45 ` Ethan Gardener
@ 2018-10-10 7:32 ` Giacomo Tesio
2 siblings, 0 replies; 89+ messages in thread
From: Giacomo Tesio @ 2018-10-10 7:32 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Tue, 9 Oct 2018 at 05:33, Lucio De Re <lucio.dere@gmail.com> wrote:
>
> On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
> >
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code. It is already half way there -- it is basically a
> > mux for 9p calls, low level device drivers,
> >
> There are religious reasons not to go there

Indeed, as a heretic, one of the first things I did with Jehanne was
to move the console filesystem out of the kernel.
Then I moved several syscalls into userspace. Or turned them into files
or into operations on existing files.

More syscalls/kernel services will move to user space as I have time
to hack on it again.
You know... heretics ruin everything!

I'm not going to turn Jehanne into a microkernel, but I'm looking for
the simplest possible set of kernel abstractions that can support a
distributed operating system able to replace the mainstream Web+OS
mess.

You know... heretics are crazy, too!

Giacomo
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 0:14 ` Bakul Shah
2018-10-09 1:34 ` Christopher Nielsen
2018-10-09 3:28 ` Lucio De Re
@ 2018-10-09 17:45 ` Lyndon Nerenberg
2018-10-09 18:49 ` hiro
2018-10-09 19:09 ` Bakul Shah
2 siblings, 2 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 17:45 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
Bakul Shah writes:
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code. It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.

Somewhat related to this ... after reading some papers on
TCP-in-user-space implementations, I've been thinking about how an
interface that supported fast/secure page flipping between the kernel
and process address space would change how we do things.

E.g. right now Plan 9 suffers from a *lot* of data copying between
the kernel and processes, and between processes themselves. If we
could eliminate most of that copying, things would get a lot faster.
Dealing with the security issues isn't trivial, but the programmer
time going into eking out the last bit of I/O throughput of the
current scheme could be redirected.

If it works, this would reduce the kernel back to handling
process/memory management, and talking to the hardware. Not a
micro-kernel, but just as good from a practical standpoint.

And no, this wouldn't get us to running on the 11/70. But by taking
advantage of modern large virtual memory spaces by using page
flipping, we could cut down on physical memory usage in the kernel.

--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 17:45 ` Lyndon Nerenberg
@ 2018-10-09 18:49 ` hiro
2018-10-09 19:14 ` Lyndon Nerenberg
` (2 more replies)
2018-10-09 19:09 ` Bakul Shah
1 sibling, 3 replies; 89+ messages in thread
From: hiro @ 2018-10-09 18:49 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> E.g. right now Plan 9 suffers from a *lot* of data copying between
> the kernel and processes, and between processes themselves.

Huh? What exactly do you mean?
Can you describe the scenario and the measurements you made?

> If we could eliminate most of that copying, things would get a lot faster.

Which things would get faster?

> Dealing with the security issues isn't trivial

what security issues?
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 18:49 ` hiro
@ 2018-10-09 19:14 ` Lyndon Nerenberg
2018-10-09 22:05 ` erik quanstrom
2018-10-10 10:42 ` Ethan Gardener
2018-10-09 19:23 ` Lyndon Nerenberg
2018-10-09 22:42 ` Dan Cross
2 siblings, 2 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 19:14 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
hiro writes:
> Huh? What exactly do you mean? Can you describe the scenario and the
> measurements you made?

The big one is USB. disk/radio->kernel->user-space-usbd->kernel->application.
Four copies.

I would like to start playing with software defined radio on Plan 9,
but that amount of data copying is going to put a lot of pressure on
the kernel to keep up. UNIX/Linux suffers the same copy bloat, and
it's having trouble keeping up, too.

--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
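To put rough numbers on the copy chain in the message above: the memory
traffic caused by copying is simply the stream rate multiplied by the
number of hops. The sketch below does only that multiplication. The
sample rate and sample size are hypothetical values picked for
illustration; they are not measurements from the thread, and the hop
names just restate the chain as written.

```python
# Rough estimate of bytes moved per second by repeated copying.
# All numeric inputs below are hypothetical, chosen only for illustration.

# The four hops named in the message above.
HOPS = [
    "device -> kernel",
    "kernel -> usbd (user space)",
    "usbd -> kernel",
    "kernel -> application",
]


def copy_traffic(stream_bytes_per_second: int, copies: int) -> int:
    """Bytes per second moved when each byte of the stream is copied `copies` times."""
    if stream_bytes_per_second < 0 or copies < 0:
        raise ValueError("inputs must be non-negative")
    return stream_bytes_per_second * copies


if __name__ == "__main__":
    samples_per_second = 2_400_000   # hypothetical SDR sample rate
    bytes_per_sample = 2             # hypothetical: one byte each for I and Q
    stream = samples_per_second * bytes_per_sample
    total = copy_traffic(stream, len(HOPS))
    print(f"stream: {stream} B/s, copied {len(HOPS)} times: {total} B/s")
```

Whether that total matters in practice depends on the machine's memory
bandwidth, which is exactly what the replies in this subthread dispute.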
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 19:14 ` Lyndon Nerenberg
@ 2018-10-09 22:05 ` erik quanstrom
2018-10-11 17:54 ` Lyndon Nerenberg
2018-10-10 10:42 ` Ethan Gardener
1 sibling, 1 reply; 89+ messages in thread
From: erik quanstrom @ 2018-10-09 22:05 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/html, Size: 219 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 22:05 ` erik quanstrom
@ 2018-10-11 17:54 ` Lyndon Nerenberg
2018-10-11 18:04 ` Kurt H Maier
` (2 more replies)
0 siblings, 3 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 17:54 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
> I have been able to copy 1 GiB/s to userspace from an nvme device. I should
> think a radio should be no problem.

The problem is when you have multiple decoder blocks implemented as
individual processes (i.e. the GNU radio model). Once you have
everything debugged, you can put it into a single threaded process
and eliminate the copy overhead. But it's completely impractical
to prototype or debug real applications this way. And it's the
prototyping case I'm interested in here.

So I'm *curious* to know if page flipping a 'protocol buffer' like
object between processes provides an optimization over copying
through the kernel. Not so much for the speed aspect, but to free
up CPU cycles that can be devoted to actual SDR work.

Since when did curiosity become a capital crime? Oh, wait, that
was January 20, 2017. My bad.

--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 17:54 ` Lyndon Nerenberg
@ 2018-10-11 18:04 ` Kurt H Maier
2018-10-11 19:23 ` hiro
2018-10-11 19:26 ` Skip Tavakkolian
2 siblings, 0 replies; 89+ messages in thread
From: Kurt H Maier @ 2018-10-11 18:04 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Thu, Oct 11, 2018 at 10:54:22AM -0700, Lyndon Nerenberg wrote:
>
> Since when did curiosity become a capital crime? Oh, wait, that
> was January 20, 2017. My bad.

Turns out it's not, so you can climb down off your cross. It's just
that it helps to be a little clearer about your meaning, that's all.
Otherwise you might do something embarrassing, like posting SAS
controller code into an NVMe discussion.

khm
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 17:54 ` Lyndon Nerenberg 2018-10-11 18:04 ` Kurt H Maier @ 2018-10-11 19:23 ` hiro 2018-10-11 19:24 ` hiro 2018-10-11 19:26 ` Skip Tavakkolian 2 siblings, 1 reply; 89+ messages in thread From: hiro @ 2018-10-11 19:23 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs > through the kernel. Not so much for the speed aspect, but to free > up CPU cycles that can be devoted to actual SDR work. those 2x25kHz channels would hardly need many cycles. rather it's just a matter of selecting the right CPU that can actually do the FFT with some software floating point implementation :) i don't see memory bandwidth or even random memory access latency affecting this scenario in the slightest. ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:23 ` hiro @ 2018-10-11 19:24 ` hiro 2018-10-11 19:25 ` hiro 0 siblings, 1 reply; 89+ messages in thread From: hiro @ 2018-10-11 19:24 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs i meant without having to resort to some soft fp. On 10/11/18, hiro <23hiro@gmail.com> wrote: >> through the kernel. Not so much for the speed aspect, but to free >> up CPU cycles that can be devoted to actual SDR work. > > those 2x25kHz channels would hardly need many cycles. rather it's just > a matter of selecting the right CPU that can actually do the FFT with > some software floating point implementation :) > > i don't see memory bandwidth or even random memory access latency > affecting this scenario in the slightest. > ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:24 ` hiro @ 2018-10-11 19:25 ` hiro 0 siblings, 0 replies; 89+ messages in thread From: hiro @ 2018-10-11 19:25 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs We also have CPU extensions that can help make fast FFT, because it's such a generic problem, and in the worst case you can use fpgas, asics, in any case dedicated hardware. ^ permalink raw reply [flat|nested] 89+ messages in thread
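A back-of-envelope sketch of hiro's point that two 25 kHz channels need few cycles. Everything numeric here is an assumption made for illustration, not something stated in the thread: a complex sample rate of 50 kS/s per channel, a 1024-point FFT with no overlap between blocks, and the conventional 5*N*log2(N) operation count for a radix-2 complex FFT.

```python
import math

def fft_flops_per_second(sample_rate_hz, fft_size):
    """Estimate the floating-point operations per second needed to FFT a
    continuous sample stream in back-to-back, non-overlapping blocks.

    Uses the textbook approximation of 5*N*log2(N) operations for one
    radix-2 complex FFT of size N (an assumption, not a measurement).
    """
    flops_per_fft = 5 * fft_size * math.log2(fft_size)
    ffts_per_second = sample_rate_hz / fft_size
    return flops_per_fft * ffts_per_second

# Assumed figures: 50 kS/s per channel, 1024-point FFT, two channels.
per_channel = fft_flops_per_second(50_000, 1024)
both_channels = 2 * per_channel
```

Under those assumptions the FFT load is a few million operations per second, which is consistent with the claim that the choice of CPU (hardware floating point or not) matters more than memory bandwidth for this workload.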
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 17:54 ` Lyndon Nerenberg 2018-10-11 18:04 ` Kurt H Maier 2018-10-11 19:23 ` hiro @ 2018-10-11 19:26 ` Skip Tavakkolian 2018-10-11 19:39 ` Lyndon Nerenberg 2 siblings, 1 reply; 89+ messages in thread From: Skip Tavakkolian @ 2018-10-11 19:26 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 1094 bytes --] I was able to use dump1090 (same author as redis) to get ADSB data reliably on RPi/Linux a while back. On Thu, Oct 11, 2018, 10:54 AM Lyndon Nerenberg <lyndon@orthanc.ca> wrote: > > I have been able to copy 1 GiB/s to userspace from an nvme device. I > should > > think a radio should be no problem. > > The problem is when you have multiple decoder blocks implemented > as individual processes (i.e. the GNU radio model). Once you have > everything debugged, you can put it into a single threaded process > and eliminate the copy overhead. But it's completely impractical > to prototype or debug real applications this way. And it's the > prototyping case I'm interested in here. > > So I'm *curious* to know if page flipping a 'protocol buffer' like > object between processes provides an optimization over copying > through the kernel. Not so much for the speed aspect, but to free > up CPU cycles that can be devoted to actual SDR work. > > Since when did curiosity become a capital crime? Oh, wait, that > was January 20, 2017. My bad. > > --lyndon > > [-- Attachment #2: Type: text/html, Size: 1405 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:26 ` Skip Tavakkolian @ 2018-10-11 19:39 ` Lyndon Nerenberg 2018-10-11 19:44 ` Skip Tavakkolian 0 siblings, 1 reply; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-11 19:39 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg > I was able to use dump1090 (same author as redis) to get ADSB data reliably > on RPi/Linux a while back. I have a pair of Flightbox ADS-B receivers I am using as references. While mostly reliable, they can and do stutter along with the rest of the alternatives on occasion. --lyndon ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:39 ` Lyndon Nerenberg @ 2018-10-11 19:44 ` Skip Tavakkolian 2018-10-11 19:47 ` Lyndon Nerenberg 0 siblings, 1 reply; 89+ messages in thread From: Skip Tavakkolian @ 2018-10-11 19:44 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 454 bytes --] I assumed you were using an RTL2832U (rtlsdr library). On Thu, Oct 11, 2018, 12:40 PM Lyndon Nerenberg <lyndon@orthanc.ca> wrote: > > I was able to use dump1090 (same author as redis) to get ADSB data > reliably > > on RPi/Linux a while back. > > I have a pair of Flightbox ADS-B receivers I am using as references. > While mostly reliable, they can and do stutter along with the rest > of the alternatives on occasion. > > --lyndon > > [-- Attachment #2: Type: text/html, Size: 744 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:44 ` Skip Tavakkolian @ 2018-10-11 19:47 ` Lyndon Nerenberg 2018-10-11 19:57 ` hiro 0 siblings, 1 reply; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-11 19:47 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg Skip Tavakkolian writes: > I assumed you were using an RTL2832U (rtlsdr library). I'm pretty sure they all do, under the hood. ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:47 ` Lyndon Nerenberg @ 2018-10-11 19:57 ` hiro 2018-10-11 20:23 ` Lyndon Nerenberg 0 siblings, 1 reply; 89+ messages in thread From: hiro @ 2018-10-11 19:57 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs >> I assumed you were using an RTL2832U (rtlsdr library). > > I'm pretty sure they all do, under the hood. > > don't you need sending ability, too for AIS? ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 19:57 ` hiro @ 2018-10-11 20:23 ` Lyndon Nerenberg 0 siblings, 0 replies; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-11 20:23 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg hiro writes: > don't you need sending ability, too for AIS? No, a receive-only setup is very useful on a small boat. Where I would like to go with this is to take the decoded AIS data as input for "ARPA" style collision plots. I'm interested in the big boats sailing through the strait. They can't turn fast, and rarely change course. If I can derive their intentions, I can plot a path between them that requires the least amount of tacking. The big boats, in turn, have no interest in us little critters. They actively filter out the "class B" (I think that's the term) noise that are AIS transmissions from the small craft. Even if we hit them, we can't sink them, so they don't care about us. Therefore there is no incentive for small boats to transmit AIS. Unless you're trying to locate your buddies for a tie-up somewhere. (That can be a very valid reason for transmitting!) --lyndon ^ permalink raw reply [flat|nested] 89+ messages in thread
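The "ARPA" style collision plot Lyndon describes rests on a closest-point-of-approach calculation. A minimal sketch follows, assuming positions have already been projected from AIS latitude/longitude onto a local flat plane in metres and that both vessels hold course and speed; the function name and argument layout are hypothetical, not taken from any AIS library.

```python
import math

def closest_point_of_approach(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (t_cpa, d_cpa) for two vessels on straight-line tracks.

    Positions are (x, y) tuples in metres, velocities are (vx, vy) in m/s.
    t_cpa is the time in seconds until the vessels are closest (0 if they
    are already diverging); d_cpa is their separation in metres at that time.
    """
    # Work in the frame of our own boat: relative position and velocity.
    rx = tgt_pos[0] - own_pos[0]
    ry = tgt_pos[1] - own_pos[1]
    vx = tgt_vel[0] - own_vel[0]
    vy = tgt_vel[1] - own_vel[1]

    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # Same velocity: the separation never changes.
        t_cpa = 0.0
    else:
        # Minimise |r + v*t|; clamp to 0 because the past is not a hazard.
        t_cpa = max(0.0, -(rx * vx + ry * vy) / speed_sq)

    d_cpa = math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)
    return t_cpa, d_cpa
```

Running this against every decoded AIS target and sorting by d_cpa would give the list of ships worth plotting a path around; because big ships rarely change course, the constant-velocity assumption is probably least wrong for exactly the targets of interest.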
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:14 ` Lyndon Nerenberg 2018-10-09 22:05 ` erik quanstrom @ 2018-10-10 10:42 ` Ethan Gardener 1 sibling, 0 replies; 89+ messages in thread From: Ethan Gardener @ 2018-10-10 10:42 UTC (permalink / raw) To: 9fans On Tue, Oct 9, 2018, at 8:14 PM, Lyndon Nerenberg wrote: > hiro writes: > > > Huh? What exactly do you mean? Can you describe the scenario and the > > measurements you made? > > The big one is USB. disk/radio->kernel->user-space-usbd->kernel->application. > Four copies. > > I would like to start playing with software defined radio on Plan > 9, but that amount of data copying is going to put a lot of pressure > on the kernel to keep up. UNIX/Linux suffers the same copy bloat, > and it's having trouble keeping up, too. References, please. Programmers are notoriously bad at determining the cause of performance problems. Examining the source will help to see if "copy bloat" is the actual problem. > > --lyndon > -- Progress might have been all right once, but it has gone on too long -- Ogden Nash ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 18:49 ` hiro 2018-10-09 19:14 ` Lyndon Nerenberg @ 2018-10-09 19:23 ` Lyndon Nerenberg 2018-10-09 19:34 ` hiro 2018-10-09 22:06 ` erik quanstrom 2018-10-09 22:42 ` Dan Cross 2 siblings, 2 replies; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-09 19:23 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg hiro writes: > > Dealing with the security issues isn't trivial > what security issues? Passing protocol buffer like objects around user space, that might affect how the kernel talks to hardware. E.g. IPsec offload into hardware. You don't want user-space messing with that sort of context, but you want to tag it with the data buffer as it gets passed up and down through the user/kernel gate. Practical page flipping needs a kernel-read-only context attached to the non-kernel user data part of the page. A quick solution is to pair pages, one half of which the kernel owns, the other being the data payload. But that's just a start. And that's all I'm saying: this might be an approach to a better/faster I/O paradigm, but it needs interested people to explore it ... --lyndon ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:23 ` Lyndon Nerenberg @ 2018-10-09 19:34 ` hiro 2018-10-09 19:36 ` hiro ` (2 more replies) 2018-10-09 22:06 ` erik quanstrom 1 sibling, 3 replies; 89+ messages in thread From: hiro @ 2018-10-09 19:34 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs from what i see in linux people have been more than just exploring it, they've gone absolutely nuts. it makes everything complex, not just the fast path. ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:34 ` hiro @ 2018-10-09 19:36 ` hiro 2018-10-09 19:40 ` Lyndon Nerenberg 2018-10-10 0:18 ` Dan Cross 2 siblings, 0 replies; 89+ messages in thread From: hiro @ 2018-10-09 19:36 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs also, if all you care about is throughput, i don't see how those 4 copies you identified make a difference. especially with something slow like USB. ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:34 ` hiro 2018-10-09 19:36 ` hiro @ 2018-10-09 19:40 ` Lyndon Nerenberg 2018-10-10 0:18 ` Dan Cross 2 siblings, 0 replies; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-09 19:40 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg hiro writes: > from what i see in linux people have been more than just exploring it, > they've gone absolutely nuts. it makes everything complex, not just > the fast path. And those are the Linux folks doing their thing. The reading I'm doing right now is related to the pessimizations page flipping throws at the CPU caches. It looks scary ... --lyndon ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:34 ` hiro 2018-10-09 19:36 ` hiro 2018-10-09 19:40 ` Lyndon Nerenberg @ 2018-10-10 0:18 ` Dan Cross 2018-10-10 5:45 ` hiro 2 siblings, 1 reply; 89+ messages in thread From: Dan Cross @ 2018-10-10 0:18 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 540 bytes --] On Tue, Oct 9, 2018 at 7:24 PM hiro <23hiro@gmail.com> wrote: > from what i see in linux people have been more than just exploring it, > they've gone absolutely nuts. it makes everything complex, not just > the fast path. > To whom are you responding? Your email is devoid of context, so it is not clear. However your statement appears to be based on an unstated assumption that there is a plan9 school of thought, and a Linux school of thought, and no other school of thought. If so, that is incorrect. - Dan C. [-- Attachment #2: Type: text/html, Size: 865 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-10 0:18 ` Dan Cross @ 2018-10-10 5:45 ` hiro 0 siblings, 0 replies; 89+ messages in thread From: hiro @ 2018-10-10 5:45 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs I was responding to lyndon's comment on certain "experiments" that should have to be done here, 2 messages up. But what he described sounded exactly like the zero-copying stuff that linux is trying to shove into everything. I have not made any statement about non-linux systems, and I'm not even saying these experiments couldn't be done on plan9, it's just that the linux people are way busier going down that path. On 10/10/18, Dan Cross <crossd@gmail.com> wrote: > On Tue, Oct 9, 2018 at 7:24 PM hiro <23hiro@gmail.com> wrote: > >> from what i see in linux people have been more than just exploring it, >> they've gone absolutely nuts. it makes everything complex, not just >> the fast path. >> > > To whom are you responding? Your email is devoid of context, so it is not > clear. > > However your statement appears to be based on an unstated assumption that > there is a plan9 school of thought, and a Linux school of thought, and no > other school of thought. If so, that is incorrect. > > - Dan C. > ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:23 ` Lyndon Nerenberg 2018-10-09 19:34 ` hiro @ 2018-10-09 22:06 ` erik quanstrom 2018-10-10 6:24 ` Bakul Shah 1 sibling, 1 reply; 89+ messages in thread From: erik quanstrom @ 2018-10-09 22:06 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/html, Size: 205 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 22:06 ` erik quanstrom @ 2018-10-10 6:24 ` Bakul Shah 2018-10-10 13:58 ` erik quanstrom 0 siblings, 1 reply; 89+ messages in thread From: Bakul Shah @ 2018-10-10 6:24 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs On Oct 9, 2018, at 3:06 PM, erik quanstrom <quanstro@quanstro.net> wrote: > > with meltdown/Spectre mitigations in place, I would like to see evidence that flip is faster than copy. If your system is well balanced, you should be able to stream data as fast as memory allows[1]. In such a system copying things N times will reduce throughput by similar factor. It may be that plan9 underperforms so much this doesn't matter normally. But the reason I want this is to reduce latency to the first access, especially for very large files. With read() I have to wait until the read completes. With mmap() processing can start much earlier and can be interleaved with background data fetch or prefetch. With read() a lot more resources are tied down. If I need random access and don't need to read all of the data, the application has to do pread(), pwrite() a lot thus complicating it. With mmap() I can just map in the whole file and excess reading (beyond what the app needs) will not be a large fraction. The default assumption here seems to be that doing this will be very complicated and be as bad as on Linux. But Linux is not a good model of what to do and examples of what not to do are not useful guides in system design. There are other OSes such as the old Apollo Aegis (AKA Apollo/Domain), KeyKOS & seL4 that avoid copying[2]. Though none of this matters right now as we don't even have a paper design so please put down your clubs and swords :-) [1] See: https://code.kx.com/q/cloud/aws/benchmarking/ A single q process can ingest data at 1.9GB/s from a single drive. 16 can achieve 2.7GB/s, with theoretical max being 2.8GB/s. 
[2] Liedtke's original L4 evolved into a provably secure seL4 and in the process it became very much like KeyKOS. Capability systems do pass around pages as protected objects and avoid copying. Sort of like how in a program you'd pass a huge array by reference and not by value to a function. ^ permalink raw reply [flat|nested] 89+ messages in thread
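Bakul's contrast between pread()-style random access and mapping the whole file can be sketched on a POSIX system as below. This is an illustration only: Plan 9 has no mmap(), the helper names are hypothetical, and Python's slice of a mapping still copies the requested bytes out, so the sketch shows the difference in programming model rather than a copy-free path.

```python
import mmap
import os
import tempfile

def read_record_pread(path, offset, size):
    """Fetch one record with an explicit positioned read (one syscall per record)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, size, offset)
    finally:
        os.close(fd)

def read_record_mmap(path, offset, size):
    """Map the whole file and index into it; pages are faulted in on first touch,
    so only the parts actually accessed are read from storage."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + size]

def make_sample_file():
    """Create a small temporary file of known contents and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * 16)  # 4096 bytes
    return path
```

With the mapping, an application that needs scattered records just indexes into one object; with pread() it has to track offsets and issue a call per access, which is the complication Bakul refers to. Whether the mapped version is actually faster depends on page-fault cost, which is erik's objection.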
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-10 6:24 ` Bakul Shah @ 2018-10-10 13:58 ` erik quanstrom 0 siblings, 0 replies; 89+ messages in thread From: erik quanstrom @ 2018-10-10 13:58 UTC (permalink / raw) To: 9fans > > with meltdown/Spectre mitigations in place, I would like to see evidence that flip is faster than copy. > > If your system is well balanced, you should be able to > stream data as fast as memory allows[1]. In such a system > copying things N times will reduce throughput by similar > factor. It may be that plan9 underperforms so much this > doesn't matter normally. sure. but flipping page tables is also not free. there is a huge cost in processor stalls, etc. spectre and meltdown mitigations make this worse as each page flip has to be accompanied by a complete pipeline flush or other costly mitigation. (not that this was cheap to begin with) it's also not an object to move data as fast as possible. the object is to do work as fast as possible. > [1] See: https://code.kx.com/q/cloud/aws/benchmarking/ > A single q process can ingest data at 1.9GB/s from a > single drive. 16 can achieve 2.7GB/s, with theoretical > max being 2.8GB/s. with my same crappy un-optimized nvme driver, i was able to hit 2.5-2.6 GiB/s with two very crappy nvme drives. (are your numbers really GB rather than GiB?) i am sure i could scale that linearly. there's plenty of memory bandwidth left, but i haven't got any more nvme. :-) similarly coraid built an appliance that did copying (due to cache) and hit 1 million 4k iops. this was in 2011 or so. but, so what. all this proves is that with copying or without, we can ingest enough data for even the most hungry programs. unless you have data that shows otherwise. :-) - erik ^ permalink raw reply [flat|nested] 89+ messages in thread
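erik's GB-versus-GiB question and the "1 million 4k iops" figure are easier to compare once everything is in the same unit. A small worked conversion, assuming "GB" means 10^9 bytes, "GiB" means 2^30 bytes, and "4k" means 4096-byte I/Os (the thread does not say which was meant):

```python
GB = 10**9    # decimal gigabyte, in bytes
GIB = 2**30   # binary gibibyte, in bytes

def gb_to_gib(gigabytes):
    """Convert a figure quoted in decimal GB to binary GiB."""
    return gigabytes * GB / GIB

def iops_to_gib_per_s(iops, block_bytes):
    """Convert an I/O rate at a fixed block size into GiB/s of throughput."""
    return iops * block_bytes / GIB

kx_max = gb_to_gib(2.8)                          # the quoted theoretical max
coraid = iops_to_gib_per_s(1_000_000, 4096)      # the 2011-era appliance
```

If the benchmark figures really are decimal GB, the quoted 2.8 GB/s ceiling works out to roughly 2.6 GiB/s, which puts it in the same range as the 2.5-2.6 GiB/s erik reports from his copying driver.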
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 18:49 ` hiro 2018-10-09 19:14 ` Lyndon Nerenberg 2018-10-09 19:23 ` Lyndon Nerenberg @ 2018-10-09 22:42 ` Dan Cross 2 siblings, 0 replies; 89+ messages in thread From: Dan Cross @ 2018-10-09 22:42 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 4338 bytes --] On Tue, Oct 9, 2018 at 5:28 PM hiro <23hiro@gmail.com> wrote: > > E.g. right now Plan 9 suffers from a *lot* of data copying between > > the kernel and processes, and between processes themselves. > > Huh? What exactly do you mean? The current plan9 architecture relies heavily on copying data within a process between userspace and the kernel for e.g. IO. This should be well known to anyone who's rummaged around in the kernel, as it's pretty evident. Because of the simple system call and VM interfaces, things like scatter/gather IO or memory-mapped direct hardware access aren't really options. `iowritev`, for example, coalesces its arguments into a single buffer that it then pwrite()'s to its destination. Can you describe the scenario and the > measurements you made? > This is a different issue. I don't know if copying is as significant an overhead as Lyndon suggested, but there are plenty of slow code paths in plan9. For example, when we ported the plan9 network stack to Akaros, we made a number of enhancements that combined sped things up by 50% or greater. Most of these were pretty simple: optimizing checksum calculations, alignment of IP and TCP headers on natural word boundaries meaning that we could read an IP address with a 32-bit load (I think that one netted a gigabit increase in throughput), using optimized memcpy instead of memmove in performance critical code paths, etc. We went from about 7Gbps on a 10Gbps interface to saturating the NIC. Those measurements were made between dedicated test machines on a dedicated network using netperf. 
Drew Gallatin, now at Netflix working on FreeBSD's network stack, did most of the optimization work. If that experience in that one section of the kernel is any indicator, plan9 undoubtedly has lots of room for optimization in other parts of the system. Lots of aspects of the system were optimized for much smaller machines than are common now and many of those optimizations no longer make much sense on modern machines; the allocator is slow, for example, though very good at not wasting RAM. Compare to a vmem-style allocator, that can allocate any requested size in constant-time, but with up to a factor of two waste of memory. Lots of plan9 code is also buggy, or at least racy: consider the seemingly random valued timeouts to "give other threads 5 seconds to get out" in ipselffree() and iplinkfree() before "deallocating" an Iplink/Ipself. Something like RCU, even a naive RCU, would be more robust here, particularly under heavy load. Device drivers are atrophied and often buggy, or at least susceptible to hardware bugs that are fixed by the vendor-provided drivers. When I put in the plan9 networks to support Akaros development, we ran into a bug in the i218 ethernet controller that caused the NIC to wedge. We got Geoff Collyer to fix the i82563 driver and we sent a patch to 9legacy, but it's symptomatic of an aging code base with a shrinking developer population. > If we could eliminate most of that copying, things would get a lot faster. > > Which things would get faster? > Presumably bulk data transfer between devices and the user portion of an address space. If copying were eliminated (or just reduced) these would certainly get fast*er*. Whether they would be sufficiently faster as to make a perceptible performance difference to a real workload is another matter. > Dealing with the security issues isn't trivial > > what security issues? 
> Presumably the bread-and-butter security issues that arise whenever the user portion of an address space is being concurrently accessed by hardware. As a trivial example, imagine scheduling a DMA transfer from some device into a buffer in the user portion of an address space and then exit()'ing the process. What do you do with the pages the device was writing into? They had better be pinned in some way until the IO operation completes before they're reallocated to something else that isn't expecting it to be clobbered. I wouldn't be surprised if the raft of currently popular speculative execution bugs could be exacerbated by the kernel playing around with data in the user address space in a naive way. It doesn't look like plan9 has any serious mitigations for those. - Dan C. [-- Attachment #2: Type: text/html, Size: 5200 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
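Dan's description of a vectored write that coalesces its arguments into a single buffer before calling pwrite() can be contrasted with a true gather write. The sketch below is a POSIX/Python illustration of the two shapes, not the Plan 9 source; both function names are hypothetical.

```python
import os

def writev_coalesced(fd, buffers):
    """Emulate a vectored write by joining the pieces first.

    The join is an extra user-space copy of every byte, made before the
    kernel copies the data again on write().
    """
    joined = b"".join(buffers)
    return os.write(fd, joined)

def writev_gather(fd, buffers):
    """Hand the list of buffers to the kernel in one call (POSIX writev),
    so no intermediate user-space buffer is built."""
    return os.writev(fd, buffers)
```

Both calls put the same bytes on the descriptor; the difference is only where the copying happens. Whether removing the extra copy matters for a real workload is exactly the open question in this thread.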
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 17:45 ` Lyndon Nerenberg 2018-10-09 18:49 ` hiro @ 2018-10-09 19:09 ` Bakul Shah 2018-10-09 19:30 ` Lyndon Nerenberg 1 sibling, 1 reply; 89+ messages in thread From: Bakul Shah @ 2018-10-09 19:09 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg On Tue, 09 Oct 2018 10:45:37 -0700 Lyndon Nerenberg <lyndon@orthanc.ca> wrote: Lyndon Nerenberg writes: > Bakul Shah writes: > > > One thing I have mused about is recasting plan9 as a > > microkernel and pushing out a lot of its kernel code into user > > mode code. It is already half way there -- it is basically a > > mux for 9p calls, low level device drivers, VM support & some > > process related code. > > Somewhat related to this ... after reading some papers on > TCP-in-user-space implementations, I've been thinking about how an > interface that supported fast/secure page flipping between the > kernel and process address space would change how we do things. > > E.g. right now Plan 9 suffers from a *lot* of data copying between > the kernel and processes, and between processes themselves. If we > could eliminate most of that copying, things would get a lot faster. > Dealing with the security issues isn't trivial, but the programmer > time going into eking out the last bit of I/O throughput of the > current scheme could be redirected. Funny you say this. I wrote I wanted memory mapping to avoid having to copy data multiple times but then deleted it, thinking it would detract from the main point. Actually I want this even without any major redesign! > If it works, this would reduce the kernel back to handling > process/memory management, and talking to the hardware. Not a > micro-kernel, but just as good from a practical standpoint. Some of this process/memory management can be delegated to user code as well. ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 19:09 ` Bakul Shah @ 2018-10-09 19:30 ` Lyndon Nerenberg 0 siblings, 0 replies; 89+ messages in thread From: Lyndon Nerenberg @ 2018-10-09 19:30 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg Bakul Shah writes: And funny you should mention this! > Some of this process/memory management can be delegated to > user code as well. At $DAYJOB we would really like to have application process control over the kernel scheduler, as this seems to be the only realistic way to avoid the (kernel) resource starvation issues we run into. Our back end servers don't go down often. But when they do, it's for reasons entirely out of our control. Because those resource allocation policies have been pushed into the kernel, and beyond our control. --lyndon ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-08 23:03 ` Dan Cross 2018-10-09 0:14 ` Bakul Shah @ 2018-10-09 3:08 ` Digby R.S. Tarvin 2018-10-09 3:16 ` [9fans] PDP11 David Arnold ` (2 more replies) 1 sibling, 3 replies; 89+ messages in thread From: Digby R.S. Tarvin @ 2018-10-09 3:08 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 9490 bytes --] On Tue, 9 Oct 2018 at 10:07, Dan Cross <crossd@gmail.com> wrote: > My guess is that there is no reason in principle that it could not fit >> comfortably into the constraints of a PDP11/70, but if the initial >> implementation was done targeting a machine with significantly more >> resources, it would be easy to make design decisions that would be entirely >> incompatible. >> > > I find this unlikely. > > The PDP-11, while a respectable machine for its day, required too many > tradeoffs to make it attractive as a development platform for a > next-generation research operating system in the late 1980s: be it > electrical power consumption vs computational oomph or dollar cost vs > available memory, the -11 had fallen from the attractive position it held a > decade prior. Perhaps slimming a plan9 kernel down sufficiently so that it > COULD run on a PDP-11 was possible in the early days, but I can't see any > reason one would have WANTED to do so: particularly as part of the impetus > behind plan9 was to exploit advances in contemporary hardware: lower-cost, > higher-performance, RISC-based multiprocessors; ubiquitous networking; > common high-resolution bitmapped graphical displays; even magneto-optical > storage (one bet that didn't pan out); etc. > If you mean that you find it unlikely that that development would have been done on a PDP11, then I agree, for the reasons you mentioned. Not sure that I can see why it wouldn't have been feasible, but I can see why it wouldn't have been desirable. 
I thought there might have been a chance of an early attempt to target the x86 because of its ubiquity and low cost - which could be useful for a networked operating system. And those were 16 bit address constrained in the early days. But it's probably not an architecture you would choose to work with if you had a choice.. 68K is what I would have gone for.. > Certainly Richard Millar's comment suggests that might be the case. If it >> is heavily dependent on VM, then the necessary rewrite is likely to be >> substantial. >> > > As a demonstration project, getting a slimmed-down plan9 kernel to boot on > a PDP-11/70-class machine would be a nifty hack, but it would be quite a > tour de force and most likely the result would not be generally useful. I > think that, as has been suggested, the conceptual simplicity of plan9 > paradoxically means that resource utilization is higher than it might > otherwise be on either a more elaborate OR more constrained system (such as > one targeting e.g. the PDP-11). When you can afford not to care about a few > bytes here or a couple of cycles there and you're not obsessed with > scraping out the very last drop of performance, you can employ a simpler > (some might say 'naive') algorithm or data structure. > > I'm not sure how the kernel design has changed since the first release. >> The earliest version I have is the release I bought through Harcourt Brace >> back in 1995. But I won't be home till December so it will be a while >> before I can look at it, and probably won't have time to experiment before >> then in any case. >> > > The kernel evolved substantially over its life; something like doubling in > size. I remember vaguely having a discussion with Sape where he said he > felt it had grown bloated. That was probably close to 20 years ago now. > I guess kernel size wasn't a priority. 
I did a bit of searching back through the old papers, and whilst there is a lot of talk about lines of code and numbers of system calls, I didn't find any reference to kernel size or memory requirements. > For what it is worth, I don't think the embarrassment of riches presented >> to programmers by current hardware has tended to produce more elegant >> designs. If more resources resulted in elegance, Windows would be a thing >> of beauty. Perhaps Plan9 is an exception. It certainly excels in elegance >> and design simplicity, even if it does turn out to be more resource hungry >> than I imagined. I will admit that the evils of excessively constrained >> environments are generally worse in terms of coding elegance - especially >> when it leads to overlays and self modifying code. >> > > plan9 is breathtakingly elegant, but this is in no small part because as a > research system it had the luxury of simply ignoring many thorny problems > that would have marred that beauty but that the developers chose not to > tackle. Some of these problems have non-trivial domain complexity and, > while "modern" systems are far too complex by far, that doesn't mean that > all solutions can be recast as elegantly simple pearls in the plan9 style. > Whether we like those problems or not, they exist and real-world solutions > have to at least attempt to deal with them (I'm looking at you, web x.0 for > x >= 2...but curse you you aren't alone). > > PDP11's don't support virtual memory, so there doesn't seem any elegant >> way to overcome that fundamental limitation on size of a single executable. >> > > No, they do: there is paging hardware on the PDP-11 that's used for > address translation and memory protection (recall that PDP-11 kept the > kernel at the top of the address space, the per-process "user" structure is > at a fixed virtual address, and the system could trap a bus error and kill > a misbehaving user-space process). 
What they may not support is the sort of > trap handling that would let them recover from a page fault (though I > haven't looked) and in any case, the address space is too small to make > demand-paging with reclamation cost-effective. > From what I recall, PDP11 hardware memory management was based on segmentation rather than paging (64K divided into 16 variable sized segments), and Unix did swapping rather than paging (a process is either completely in memory or completely on disk). It does relocation and protection, but I think it was limited in its ability to restart trapped instructions. I suppose there was a sort of embryonic virtual memory in the way the process stack was able to expand dynamically. I believe that was handled by generating a trap when an address 'near' the top of the stack was accessed. The instruction could complete, then a trap would allow the operating system to add a bit more memory to the stack before returning to allow the user process to continue. Unix kept the 'per process data area' at the top of the 'process image' in physical memory, but it was not mapped into the process address space. It was, however, mapped into a fixed address in kernel space whenever the corresponding process was the 'current' process. But you are right - even if it could do virtual memory/paging, it wouldn't help much because of the limited size of the virtual address space. > So I don't think it i would be worth a substantial rewrite to get it >> going. It is a shame that there don't seem to have been any more powerful >> machines with a comparably elegant architecture and attractive front panel >> :) >> > > An attractive front panel for nearly any machine is just a soldering iron, > LEDs and some logic chips away. As far as elegant architectures, some are > very nice: MIPS is kind of retro but elegant, RISC-V is nice, 680x0 > machines can be had a reasonable prices, and POWER is kind of cool. I know > I shouldn't, but I have a soft spot for ARM. 
> I have thought about it, but there are a couple of problems (in addition to my lack of artistic talent when it comes to building physically attractive enclosures). One is the sheer number of LEDs required to display all of the address and data lines in a modern architecture. Mainly an issue if I want to use the old PDP11/70 front panel that I had saved for the purpose, I suppose. The other problem is getting access to all of the machine state that was displayable on a mini computer console. Virtual addresses, User/Kernel mode, register contents etc are all hard to get at. I have toyed with using JTAG etc, but there always seems to be something that I can't get to. So it is hard to do more than resort to a software controlled front panel. I used to have a little box of LEDs and switches that I plugged into the parallel port on PCs, and had my BSDi kernel modified to update it as part of the clock interrupt. But now the parallel ports are becoming rare and you can't update LEDs connected via USB in a single instruction... :-/ Oh, and sure, you can find reasonable architectures. I spent a long time using 680x0 after learning on PDP11s, and found it equally comfortable for those occasions when I had to work in assembly language (I dread having to disassemble Intel binaries). It's just that packaging is so uninteresting now, it gives no indication of what is going on inside. Even lights on the hard drives when they were externally visible gave some signs of what was going on. Especially when there were different lights for read and write. It is sounding like Inferno is going to be the more practical option. I >> believe gcc can still generate PDP-11 code, so it shouldn't be too hard to >> try. >> > > Sounds like a nifty hack. Fitting Dis into a 64k/64k split I/D space is > the challenge. > Yes indeed. I guess that is something I can try to find the time to test in the short term. It should provide a pretty good indication of viability. 
Regards, DigbyT [-- Attachment #2: Type: text/html, Size: 11801 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
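To make the relocation scheme discussed in the message above a little more concrete, here is a minimal sketch in Python of how a PDP-11 style memory management unit turns a 16-bit virtual address into a wider physical one. It assumes the layout as I understand it: the top three bits of the address select one of eight page registers, and page base and length are counted in 64-byte blocks. The names `translate`, `par` and `plf` are hypothetical, the length check only models upward-growing pages, and access modes and separate I/D spaces are left out entirely.

```python
PAGE_SHIFT = 13    # bits 15..13 of the virtual address select the page register
BLOCK_SHIFT = 6    # page base and length are expressed in 64-byte blocks
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def translate(vaddr, par, plf):
    """Translate a 16-bit virtual address to a physical address.

    par -- eight page base values, each in units of 64-byte blocks
    plf -- eight page lengths, each the highest valid block number (0..127)
    """
    if not 0 <= vaddr <= 0xFFFF:
        raise ValueError("virtual address must fit in 16 bits")
    page = vaddr >> PAGE_SHIFT
    offset = vaddr & OFFSET_MASK
    block = offset >> BLOCK_SHIFT
    if block > plf[page]:
        # Real hardware would abort the access and trap to the kernel.
        raise MemoryError(f"page {page}: block {block} is beyond the page length")
    return (par[page] << BLOCK_SHIFT) + offset


# Hypothetical mapping: page 1 starts at physical 0x8000, page 7 at 2 MiB.
par = [0, 0x0200, 0, 0, 0, 0, 0, 0x8000]
plf = [127] * 8                          # every page at its full 8 KiB
print(hex(translate(0x2010, par, plf)))  # 0x8010
print(hex(translate(0xE000, par, plf)))  # 0x200000, well beyond 64 KiB
```

The second lookup is the point of the exercise: a process never sees more than 64K at once, but the pages it does see can sit anywhere in a much larger physical memory.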
* Re: [9fans] PDP11 2018-10-09 3:08 ` Digby R.S. Tarvin @ 2018-10-09 3:16 ` David Arnold 2018-10-09 4:52 ` Digby R.S. Tarvin 2018-10-09 11:58 ` [9fans] PDP11 (Was: Re: what heavy negativity!) Ethan Gardener 2018-10-09 14:02 ` erik quanstrom 2 siblings, 1 reply; 89+ messages in thread From: David Arnold @ 2018-10-09 3:16 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs; +Cc: David Arnold [-- Attachment #1.1: Type: text/plain, Size: 2054 bytes --] > On 9 Oct 2018, at 14:08, Digby R.S. Tarvin <digbyt42@gmail.com> wrote: <…> > So I don't think it i would be worth a substantial rewrite to get it going. It is a shame that there don't seem to have been any more powerful machines with a comparably elegant architecture and attractive front panel :) > > An attractive front panel for nearly any machine is just a soldering iron, LEDs and some logic chips away. As far as elegant architectures, some are very nice: MIPS is kind of retro but elegant, RISC-V is nice, 680x0 machines can be had a reasonable prices, and POWER is kind of cool. I know I shouldn't, but I have a soft spot for ARM. > > I have thought about it, but there are a couple of problems (in addition to my lack artistic talent when it comes to building physically attractive enclosures).. One is the sheer number of LEDs required to display all of the address and data lines in a modern architecture. Mainly an issue if I want to use the old PDP11/70 front panel that I had saved for the purpose, I suppose. The other problem is getting access to the all of the machine state that was displayable on a mini computer console. Virtual addresses, User/Kernel mode, register contents etc are all hard to get at. I have toyed with using JTAG etc, but there always seems to be something that I can't get to. So it is hard to do more than resort to a software controlled front panel. 
I used to have a little box of LEDs and switches that I plugged into the parallel port on PCs, and had my BSDi kernel modified to update it as part of the clock interrupt. But now the parallel ports are becoming rare and you can't update LEDs connected via USB in a single instruction... :-/ Probably not quite what you’re after, but the PiDP8 and PiDP11 kits will get you an (arguably) attractive front panel without requiring artistic talent. http://obsolescence.wixsite.com/obsolescence/pidp-11 I’ve not looked into how the front-panel is driven (from SIMH, I guess?), but perhaps it could be suitably massaged? d [-- Attachment #1.2: Type: text/html, Size: 3271 bytes --] [-- Attachment #2: Message signed with OpenPGP --] [-- Type: application/pgp-signature, Size: 849 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 2018-10-09 3:16 ` [9fans] PDP11 David Arnold @ 2018-10-09 4:52 ` Digby R.S. Tarvin 0 siblings, 0 replies; 89+ messages in thread From: Digby R.S. Tarvin @ 2018-10-09 4:52 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 2930 bytes --] Yes, that is exactly what prompted the thinking about Plan9 on a PDP11/70. I have already organized a PiDP11 kit to be shipped to me when I get home in December - so that I can experiment without running the risk of blowing up my old original 11/70 front panel. But a (simulated) 11/70 with a nice front panel isn't so interesting unless I have some interesting PDP11 software to run on it. A small Plan9/Inferno implementation could be integrated into a larger network and allow the old hardware to integrate seamlessly with other things. Such as exporting a device that lets other hosts write to the lights and read from the switches, for example.. Regards. DigbyT On Tue, 9 Oct 2018 at 14:23, David Arnold <davida@pobox.com> wrote: > On 9 Oct 2018, at 14:08, Digby R.S. Tarvin <digbyt42@gmail.com> wrote: > > > <…> > > So I don't think it i would be worth a substantial rewrite to get it >>> going. It is a shame that there don't seem to have been any more powerful >>> machines with a comparably elegant architecture and attractive front panel >>> :) >>> >> >> An attractive front panel for nearly any machine is just a soldering >> iron, LEDs and some logic chips away. As far as elegant architectures, some >> are very nice: MIPS is kind of retro but elegant, RISC-V is nice, 680x0 >> machines can be had a reasonable prices, and POWER is kind of cool. I know >> I shouldn't, but I have a soft spot for ARM. >> > > I have thought about it, but there are a couple of problems (in addition > to my lack artistic talent when it comes to building physically attractive > enclosures).. 
One is the sheer number of LEDs required to display all of > the address and data lines in a modern architecture. Mainly an issue if I > want to use the old PDP11/70 front panel that I had saved for the purpose, > I suppose. The other problem is getting access to the all of the machine > state that was displayable on a mini computer console. Virtual addresses, > User/Kernel mode, register contents etc are all hard to get at. I have > toyed with using JTAG etc, but there always seems to be something that I > can't get to. So it is hard to do more than resort to a software controlled > front panel. I used to have a little box of LEDs and switches that I > plugged into the parallel port on PCs, and had my BSDi kernel modified to > update it as part of the clock interrupt. But now the parallel ports are > becoming rare and you can't update LEDs connected via USB in a single > instruction... :-/ > > > Probably not quite what you’re after, but the PiDP8 and PiDP11 kits will > get you an (arguably) attractive front panel without requiring artistic > talent. > > http://obsolescence.wixsite.com/obsolescence/pidp-11 > > I’ve not looked into how the front-panel is driven (from SIMH, I guess?), > but perhaps it could be suitably massaged? > > > > d > > [-- Attachment #2: Type: text/html, Size: 3964 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 3:08 ` Digby R.S. Tarvin 2018-10-09 3:16 ` [9fans] PDP11 David Arnold @ 2018-10-09 11:58 ` Ethan Gardener 2018-10-09 13:59 ` erik quanstrom 2018-10-09 22:22 ` Digby R.S. Tarvin 2018-10-09 14:02 ` erik quanstrom 2 siblings, 2 replies; 89+ messages in thread From: Ethan Gardener @ 2018-10-09 11:58 UTC (permalink / raw) To: 9fans On Tue, Oct 9, 2018, at 4:08 AM, Digby R.S. Tarvin wrote: > I thought there might have been a chance of an early attempt to target the x86 because of its ubiquity and low cost - which could be useful for a networked operating system. And those were 16 bit address constrained in the early days. But its probably not an architecture you would choose to work with if you had a choice.. 68K is what I would have gone for.. Fascinating thread, but I think you're off by a decade with the 16-bit address bus comment, unless you're not actually talking about Plan 9. The 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 respectively. The IBM PC, launched in 1982, had its ROM at the top of that 1MByte space, so it couldn't have been constrained in that way. By the end of the 80s, all my schoolmates had 68k-powered computers from Commodore and Atari, showing hardware with a 24-bit address space was very much affordable and ubiquitous at the time Plan 9 development started. Almost all of them had 512KB at the time. A few flashy gits had 1MB machines. :) I still wish I'd kept the better of the Atari STs which made their way down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier "STFM" models. I remember wanting to try to run Plan 9 on it. Let's estimate how tight it would be... I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86. I kept running out of image memory! 
The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :) 1 bit per pixel would obviously improve matters by a factor of 16 compared to my setup, and 640x400 (Atari ST high resolution) would be another 5 times smaller than my screen. Putting these numbers together with my experience, you'd have to be careful to use images sparingly on a machine with 800KB free RAM after the kernel is loaded. That's better than I thought, probably achievable on that Atari I had, but it couldn't be used as intensively as I used Plan 9 back then. How could it be used? I think it would be a good idea to push the draw device back to user space and make very sure to have it check for failing malloc! I certainly wouldn't want a terminal with a filesystem and graphics all on a single 1MByte 68000-powered computer, because a filesystem on a terminal runs in user space, and thus requires some free memory to run the programs to shut it down. Actually, Plan 9's separation of terminal from filesystem seems quite the obvious choice when I look at it like this. :) ^ permalink raw reply [flat|nested] 89+ messages in thread
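The back-of-the-envelope estimate in the message above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the factor of 16 is taken to mean 16 bits per pixel against 1, the factor of 5 is the message's own figure for screen area, and `framebuffer_bytes` is a hypothetical helper rather than anything from Plan 9.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one full-screen image, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


pool = 64 * 1024 * 1024   # image memory in the setup described: 64MB
depth_ratio = 16          # assumed: 16 bits per pixel versus 1
area_ratio = 5            # the message's estimate for screen area versus 640x400

equivalent = pool // (depth_ratio * area_ratio)
print(equivalent)                      # 838860 bytes, i.e. about 800KB
print(framebuffer_bytes(640, 400, 1))  # 32000 bytes for one mono screen
```

So a workload that exhausted 64MB at the larger size and depth would, scaled down, want roughly the 800KB mentioned, or about 26 full-screen monochrome images.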
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 11:58 ` [9fans] PDP11 (Was: Re: what heavy negativity!) Ethan Gardener @ 2018-10-09 13:59 ` erik quanstrom 2018-10-09 22:22 ` Digby R.S. Tarvin 1 sibling, 0 replies; 89+ messages in thread From: erik quanstrom @ 2018-10-09 13:59 UTC (permalink / raw) To: 9fans > I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86. I kept running out of image memory! The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :) this was changed long ago. image memory can now be much bigger. i never had a problem when a 4e terminal was my daily driver. - erik ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 11:58 ` [9fans] PDP11 (Was: Re: what heavy negativity!) Ethan Gardener 2018-10-09 13:59 ` erik quanstrom @ 2018-10-09 22:22 ` Digby R.S. Tarvin 2018-10-10 10:38 ` Ethan Gardener 1 sibling, 1 reply; 89+ messages in thread From: Digby R.S. Tarvin @ 2018-10-09 22:22 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 4379 bytes --] On Tue, 9 Oct 2018 at 23:00, Ethan Gardener <eekee57@fastmail.fm> wrote: > > Fascinating thread, but I think you're off by a decade with the 16-bit > address bus comment, unless you're not actually talking about Plan 9. The > 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 > respectively. The IBM PC, launched in 1982, had its ROM at the top of that > 1MByte space, so it couldn't have been constrained in that way. By the end > of the 80s, all my schoolmates had 68k-powered computers from Commodore and > Atari, showing hardware with a 24-bit address space was very much > affordable and ubiquitous at the time Plan 9 development started. Almost > all of them had 512KB at the time. A few flashy gits had 1MB machines. :) > Not sure I would agree with that. The 20 bit addressing of the 8086 and 8088 did not change their 16 bit nature. They were still 16 bit program counter, with segmentation to provide access to a larger memory - similar in principle to the PDP11 with MMU. The first 32 bit x86 processor was the 386, which I think came out in 1985, very close to when work on Plan9 was rumored to have started. So it seemed not impossible that work might have started on an older 16 bit machine, but at Bell Labs probably a long shot. > I still wish I'd kept the better of the Atari STs which made their way > down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the > earlier "STFM" models. I remember wanting to try to run Plan 9 on it. > Let's estimate how tight it would be... 
> > I think it would be terrible, because I got frustrated enough trying to > run a 4e CPU server with graphics on a 2GB x86. I kept running out of > image memory! The trouble was the draw device in 4th edition stores images > in the same "image memory" the kernel loads programs into, and the 386 CPU > kernel 'only' allocates 64MB of that. :) > > 1 bit per pixel would obviously improve matters by a factor of 16 compared > to my setup, and 640x400 (Atari ST high resolution) would be another 5 > times smaller than my screen. Putting these numbers together with my > experience, you'd have to be careful to use images sparingly on a machine > with 800KB free RAM after the kernel is loaded. That's better than I > thought, probably achievable on that Atari I had, but it couldn't be used > as intensively as I used Plan 9 back then. > > How could it be used? I think it would be a good idea to push the draw > device back to user space and make very sure to have it check for failing > malloc! I certainly wouldn't want a terminal with a filesystem and > graphics all on a single 1MByte 64000-powered computer, because a > filesystem on a terminal runs in user space, and thus requires some free > memory to run the programs to shut it down. Actually, Plan 9's separation > of terminal from filesystem seems quite the obvious choice when I look at > it like this. :) > I went Commodore Amiga at about that time - because it at least supported some form of multi-tasking out of the box, and I spent many happy hours getting OS9 running on it. An interesting architecture, capable of some impressive graphics, but subject to quite severe limitations which made general purpose graphics difficult. (Commodore later released SVR4 Unix for the A3000, but limited X11 to monochrome when using the inbuilt graphics). But being 32 bit didn't give it a huge advantage over the 16 bit x86 systems for tinkering with operating systems, because the 68000 had no MMU. 
It was easier to get a Unix like system going with 16 bit segmentation than with a 32 bit linear space and no hardware support for run time relocation. (OS9 used position independent code throughout to work without an MMU, but didn't try to implement fork() semantics). It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really did everything I wanted. The 68020 with an optional MMU was equivalent, but not so common in consumer machines. Hardware progress seems to have been rather uninteresting since then. Sure, hardware is *much* faster and *much* bigger, but fundamentally the same architecture. Intel had a brief flirtation with a novel architecture with the iAPX 432 in '81, but obviously found it was more profitable to make the familiar architecture bigger and faster. [-- Attachment #2: Type: text/html, Size: 4918 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
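The two sides of the 8086 argument above (20-bit physical addresses, but 16-bit registers and program counter) both show up in the way the processor forms an address. A minimal sketch in Python follows; `physical_address` is a hypothetical name, and the final mask models the original 8086's 20 address lines.

```python
def physical_address(segment, offset):
    """20-bit physical address for an 8086 segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF   # only 20 address lines


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xF000, 0xFFF0)))  # 0xffff0, near the top of 1MB
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0, wraps past 1MB
```

Any single segment register still only reaches 64K, which is why code or data larger than that needs segment reloads, much as a PDP-11 program needs its mapping registers changed to reach beyond its 64K window.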
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-09 22:22 ` Digby R.S. Tarvin @ 2018-10-10 10:38 ` Ethan Gardener 2018-10-10 23:15 ` Digby R.S. Tarvin 0 siblings, 1 reply; 89+ messages in thread From: Ethan Gardener @ 2018-10-10 10:38 UTC (permalink / raw) To: 9fans On Tue, Oct 9, 2018, at 11:22 PM, Digby R.S. Tarvin wrote: > > > On Tue, 9 Oct 2018 at 23:00, Ethan Gardener <eekee57@fastmail.fm> wrote: >> >> Fascinating thread, but I think you're off by a decade with the 16-bit address bus comment, unless you're not actually talking about Plan 9. The 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 respectively. The IBM PC, launched in 1982, had its ROM at the top of that 1MByte space, so it couldn't have been constrained in that way. By the end of the 80s, all my schoolmates had 68k-powered computers from Commodore and Atari, showing hardware with a 24-bit address space was very much affordable and ubiquitous at the time Plan 9 development started. Almost all of them had 512KB at the time. A few flashy gits had 1MB machines. :) > > Not sure I would agree with that. The 20 bit addressing of the 8086 and 8088 did not change their 16 bit nature. They were still 16 bit program counter, with segmentation to provide access to a larger memory - similar in principle to the PDP11 with MMU. That's not at all the same as being constrained to 64KB memory. Are we communicating at cross purposes here? If we're not, if I haven't misunderstood you, you might want to read up on creating .exe files for MS-DOS. > The first 32 bit x86 processor was the 386, which I think came out in 1985, very close to when work on Plan9 was rumored to have started. So it seemed not impossible that work might have started on an older 16 bit machine, but at Bell Labs probably a long shot. Mmh, rumors. I read they were starting to think about Plan 9 in 1985, but I haven't read anything about it being up and running until '89 or '90. There's not much to go on. 
>> I still wish I'd kept the better of the Atari STs which made their way down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier "STFM" models. I remember wanting to try to run Plan 9 on it. Let's estimate how tight it would be... >> >> I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86. I kept running out of image memory! The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :) >> >> 1 bit per pixel would obviously improve matters by a factor of 16 compared to my setup, and 640x400 (Atari ST high resolution) would be another 5 times smaller than my screen. Putting these numbers together with my experience, you'd have to be careful to use images sparingly on a machine with 800KB free RAM after the kernel is loaded. That's better than I thought, probably achievable on that Atari I had, but it couldn't be used as intensively as I used Plan 9 back then. >> >> How could it be used? I think it would be a good idea to push the draw device back to user space and make very sure to have it check for failing malloc! I certainly wouldn't want a terminal with a filesystem and graphics all on a single 1MByte 64000-powered computer, because a filesystem on a terminal runs in user space, and thus requires some free memory to run the programs to shut it down. Actually, Plan 9's separation of terminal from filesystem seems quite the obvious choice when I look at it like this. :) > > I went Commodore Amiga at about that time - because it at least supported some form of multi-tasking out out the box, and I spent many happy hours getting OS9 running on it.. An interesting architecture, capable of some impressive graphics, but subject to quite severe limitations which made general purpose graphics difficult. 
(Commodore later released SVR4 Unix for the A3000, but limited X11 to monochrome when using the inbuilt graphics). It does sound like fun. :) I'm not surprised by the monochrome graphics limitation after my calculations. Still, X11 or any other window system which lacks a backing store may do better in low-memory environments than Plan 9's present draw device. It's a shame, a backing store is a great simplification for programmers. > But being 32 bit didn't give it a huge advantage over the 16 bit x86 systems for tinkering with operating system, because the 68000 had no MMU. It was easier to get a Unix like system going with 16 bit segmentation than a 32 bit linear space and no hardware support for run time relocation. > (OS9 used position independent code throughout to work without an MMU, but didn't try to implement fork() semantics). I'm sometimes tempted to think that fork() is freakishly high-level crazy stuff. :) Still, like backing store, it's very nice to have. > It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really did everything I wanted. The 68020 with an optional MMU was equivalent, but not so common in consumer machines. > > Hardware progress seems to have been rather uninteresting since then. Sure, hardware is *much* faster and *much* bigger, but fundamentally the same architecture. Intel had a brief flirtation with a novel architecture with the iAPX 432 in 81, but obviously found that was more profitable making the familiar architecture bigger and faster . I rather agree. Multi-core and hyperthreading don't bring in much from an operating system designer's perspective, and I think all the interesting things about caches are means of working around their problems. I would very much like to get my hands on a ga144 to see what sort of operating system structure would work well on 144 processors with 64KW RAM each. :) There's 64KW ROM per processor too, a lot of stock code could go in that. 
Both the RAM and ROM operate at the full speed of the processor, no caches to worry about. A little rant about MMUs, sort-of saying "unix and C are not without complexifying nonsense": I'm sure the MMU itself is uninteresting or even harmful to many who prefer other languages and system designs. Just look at that other discussion about the penalties of copying versus the cache penalties of page flipping. If that doesn't devolve into "heavy negativity," it'll only be because those who know don't write much, or those who write much don't want to provide actual figures or references to argue about. What about all those languages which don't even give the programmer access to pointers in the first place. Many have run directly on hardware in the past, some can now. Do they need MMUs? Then there's Forth, which relies on pointers even more than C does. I haven't read *anything* about MMUs in relation to Forth, and yet Forth is in practice as much an operating system as a language. It runs directly on hardware. I'm not sure of some details yet, but it looks like many operating system features either "fall out of" the language design (to use a phrase from Ken Thompson & co.), or are trivial to implement. There were multitasking Forth systems in the 70s. No MMU. The full power of pointers *at the prompt*. Potential for stack under- and over-runs too. And yet these were working systems, and the language hasn't been consigned to the graveyard of computing history. My big project includes exploring how this is possible. :) A likely possibility is the power to redefine words (functions) without affecting previous definitions. Pointer store and fetch can trivially be redefined to check bounds. Check your code doesn't go out of bounds, then "empty", and load it without the bounds-checking store and fetch. -- Progress might have been all right once, but it has gone on too long -- Ogden Nash ^ permalink raw reply [flat|nested] 89+ messages in thread
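The idea at the end of the message above, swapping in bounds-checking store and fetch while testing and then reloading with the plain ones, can be sketched outside Forth as well. The Python below is only an analogy under stated assumptions: `make_memory` and `fill` are hypothetical names, and passing the store function as an argument stands in for redefining a word before loading the code.

```python
def make_memory(size):
    """Return (fetch, store) pairs, unchecked and checked, over one cell array."""
    cells = [0] * size

    def raw_fetch(addr):
        return cells[addr]

    def raw_store(value, addr):
        cells[addr] = value

    def check(addr):
        if not 0 <= addr < size:
            raise IndexError(f"address {addr} outside 0..{size - 1}")

    def checked_fetch(addr):
        check(addr)
        return cells[addr]

    def checked_store(value, addr):
        check(addr)
        cells[addr] = value

    return (raw_fetch, raw_store), (checked_fetch, checked_store)


def fill(store, start, count, value):
    """Example program: it only sees whichever store it is handed."""
    for addr in range(start, start + count):
        store(value, addr)


(raw_fetch, raw_store), (checked_fetch, checked_store) = make_memory(8)

fill(checked_store, 0, 8, 7)        # development run: every access is checked
try:
    fill(checked_store, -2, 4, 1)   # a buggy call is caught at once
except IndexError as err:
    print("caught:", err)

fill(raw_store, 0, 8, 3)            # once trusted, run with the unchecked words
print(raw_fetch(7))                 # 3
```

The unchecked pair is not safe in any absolute sense (a negative address quietly lands at the other end of the Python list), which mirrors the trade being described: the check is paid for only while the code is still under suspicion.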
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-10 10:38 ` Ethan Gardener @ 2018-10-10 23:15 ` Digby R.S. Tarvin 2018-10-11 18:10 ` Lyndon Nerenberg 0 siblings, 1 reply; 89+ messages in thread From: Digby R.S. Tarvin @ 2018-10-10 23:15 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 10530 bytes --] On Wed, 10 Oct 2018 at 21:40, Ethan Gardener <eekee57@fastmail.fm> wrote: > > > > Not sure I would agree with that. The 20 bit addressing of the 8086 and > 8088 did not change their 16 bit nature. They were still 16 bit program > counter, with segmentation to provide access to a larger memory - similar > in principle to the PDP11 with MMU. > > That's not at all the same as being constrained to 64KB memory. Are we > communicating at cross purposes here? If we're not, if I haven't > misunderstood you, you might want to read up on creating .exe files for > MS-DOS. Agreed, but the PDP11/70 was not constrained to 64KB memory either. I do recall the MS-DOS small/large/medium etc models that used the segmentation in various ways to mitigate the limitations of being a 16 bit computer. Similar techniques were possible on the PDP11, for example Modula-2/VRS under RT-11 used the MMU to transparently support 4MB programs back in 1984 (it used trap instructions to implement subroutine calls). It wasn't possible under Unix, of course, because there were no system calls for manipulating the mmu. Understandable, as it would have complicated the security model in a multi-tasking system. Something neither MS-DOS or RT-11 had to deal with. Address space manipulation was more convenient with Intel segmentation because the instruction set included procedure call/return instructions that manipulated the segmentation registers, but the situation was not fundamentally different. They were both 16 bit machines with hacks to give access to a larger than 64K physical memory. 
The OS9 operating system allowed some control of application memory maps in a unix like environment by supporting dynamic (but explicit) link and unlink of subroutine and data modules - which would be added and removed from your 64K address space as required. So more analogous to memory based overlays. > > I went Commodore Amiga at about that time - because it at least > supported some form of multi-tasking out out the box, and I spent many > happy hours getting OS9 running on it.. An interesting architecture, > capable of some impressive graphics, but subject to quite severe > limitations which made general purpose graphics difficult. (Commodore later > released SVR4 Unix for the A3000, but limited X11 to monochrome when using > the inbuilt graphics). > > It does sound like fun. :) I'm not surprised by the monochrome graphics > limitation after my calculations. Still, X11 or any other window system > which lacks a backing store may do better in low-memory environments than > Plan 9's present draw device. It's a shame, a backing store is a great > simplification for programmers. > X11 does, of course, support the concept of a backing store. It just doesn't mandate it. It was an expensive thing to provide back when X11 was young, so pretty rare. I remember finding the need to be able to re-create windows on demand rather annoying when I first learned to program in Xlib, but once you get used to it I find it can lead to benefits when you have to retain a knowledge of how an image is created, not just the end result. > > But being 32 bit didn't give it a huge advantage over the 16 bit x86 > systems for tinkering with operating system, because the 68000 had no MMU. > It was easier to get a Unix like system going with 16 bit segmentation than > a 32 bit linear space and no hardware support for run time relocation. > > (OS9 used position independent code throughout to work without an MMU, > but didn't try to implement fork() semantics). 
>
> I'm sometimes tempted to think that fork() is freakishly high-level crazy
> stuff. :) Still, like backing store, it's very nice to have.
>
I agree. Very elegant when you compare it to the hoops you have to jump
through to initialize the child process environment in systems with the
more common combined 'forkexec' semantics, but a real sticking point for
low end hardware.
>
> It wasn't till the 68030 based Amiga 3000 came out in 1990 that it
> really did everything I wanted. The 68020 with an optional MMU was
> equivalent, but not so common in consumer machines.
>
> Hardware progress seems to have been rather uninteresting since then.
> Sure, hardware is *much* faster and *much* bigger, but fundamentally the
> same architecture. Intel had a brief flirtation with a novel architecture
> with the iAPX 432 in 81, but obviously found that it was more profitable
> making the familiar architecture bigger and faster.
>
> I rather agree. Multi-core and hyperthreading don't bring in much from an
> operating system designer's perspective, and I think all the interesting
> things about caches are means of working around their problems.
>
I don't think anyone would bother with multiple cores or caches if that
same performance could be achieved without them. They just buy a bit more
performance at the cost of additional software complexity.
>
> I would very much like to get my hands on a ga144 to see what sort of
> operating system structure would work well on 144 processors with 64KW RAM
> each. :) There's 64KW ROM per processor too, a lot of stock code could go
> in that. Both the RAM and ROM operate at the full speed of the processor,
> no caches to worry about.
>
Interesting. I hadn't come across those before....
>
> A little rant about MMUs, sort-of saying "unix and C are not without
> complexifying nonsense": I'm sure the MMU itself is uninteresting or even
> harmful to many who prefer other languages and system designs. Just look
> at that other discussion about the penalties of copying versus the cache
> penalties of page flipping. If that doesn't devolve into "heavy
> negativity," it'll only be because those who know don't write much, or
> those who write much don't want to provide actual figures or references to
> argue about.
>
I think if you are going to postulate about this, we need to refine our
terms a bit. The term MMU encompasses too many quite different concepts.
For example:
1. run time relocation
2. virtual address space expansion (use more memory than can be directly
addressed)
3. virtual memory expansion (appear to use more memory than you physically
have)
4. process interference protection
5. process privacy protection
I'm sure there are more.
For example, on the 68K processors, OS-9/68K had no support for the first
three - virtual and physical addresses were always the same. For Unix style
timesharing it required all compilers to generate position independent
code. There was no swapping or virtual memory. It did use an MMU via a
module called SPU - the 'System Protection Unit', which mapped all of your
program code as read only, your data as read/write, and made everything
else inaccessible. That sort of functionality is invaluable while
developing, because you don't want faulty programs to change the kernel or
other programs, and trapping attempts to do so makes it easier to identify
faults. However, with suitable privileges, any process could request that
arbitrary memory addresses be made readable or writable, and if desired,
the SPU could be omitted from the system, either removing an MMU
performance penalty or allowing the application to run on cheaper,
non-MMU-equipped hardware.
The executables were re-entrant and position independent, but once an
instance started executing it could not be moved (calculated pointers
stored in data etc). So this software solution would not have been enough
to support efficient swapping or paging.
It was ok in this case, because it was intended as a real-time system and
you don't do swapping in that situation.
>
> What about all those languages which don't even give the programmer access
> to pointers in the first place. Many have run directly on hardware in the
> past, some can now. Do they need MMUs?
>
> Then there's Forth, which relies on pointers even more than C does. I
> haven't read *anything* about MMUs in relation to Forth, and yet Forth is
> in practice as much an operating system as a language. It runs directly on
> hardware. I'm not sure of some details yet, but it looks like many
> operating system features either "fall out of" the language design (to use
> a phrase from Ken Thompson & co.), or are trivial to implement.
>
> There were multitasking Forth systems in the 70s. No MMU. The full power
> of pointers *at the prompt*. Potential for stack under- and over-runs
> too. And yet these were working systems, and the language hasn't been
> consigned to the graveyard of computing history. My big project includes
> exploring how this is possible. :) A likely possibility is the power to
> redefine words (functions) without affecting previous definitions. Pointer
> store and fetch can trivially be redefined to check bounds. Check your
> code doesn't go out of bounds, then "empty", and load it without the
> bounds-checking store and fetch.
>
MMUs are probably more important in multi-user than just multi-tasking
systems. The Amigas, as I mentioned, were multi-tasking without the need
for an MMU. The result was a very fast system, resulting in some of the
impressive graphics and games. But also the need for frequent reboots when
developing.
But you are right, if you try hard enough, you can replace hardware memory
management with software - a sandboxed environment with program development
tools with strong typing and no low level access.
Look, for example, at the old Burroughs B6700, which had a security
paradigm based on making it impossible for unprivileged users to generate
machine code. Compilers had to be blessed with trusted status, and only
code generated by trusted compilers would be executed.. I don't recall many
details, other than it had an interesting tagged architecture.
An extreme example would be an emulator - sandboxing users without any
actual hardware protection, albeit at significant performance cost.
Forth, Basic and all the other development environments common on personal
computers before MMUs became available were fine because there was
generally only one user, the language provided a lot of
protection/restriction (and was often interpreted), and if you managed to
crash it, it was generally quick to restart.
So I still tend to feel that MMUs were a valuable advance in computer
architecture that I would hate to have to live without..
[-- Attachment #2: Type: text/html, Size: 12162 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 23:15 ` Digby R.S. Tarvin
@ 2018-10-11 18:10 ` Lyndon Nerenberg
2018-10-11 20:55 ` Digby R.S. Tarvin
0 siblings, 1 reply; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 18:10 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
Digby R.S. Tarvin writes:
> Agreed, but the PDP11/70 was not constrained to 64KB memory either.
> I do recall the MS-DOS small/large/medium etc models that used the
> segmentation in various ways to mitigate the limitations of being a 16 bit
> computer. Similar techniques were possible on the PDP11, for example
Coincidental to this conversation, I'm currently reading "The Apollo
Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
(ISBN 978-1-4419-0876-6) Very interesting to see what you can do with
a 15 bit architecture when sufficiently motivated.
--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 18:10 ` Lyndon Nerenberg
@ 2018-10-11 20:55 ` Digby R.S. Tarvin
2018-10-11 21:03 ` Lyndon Nerenberg
0 siblings, 1 reply; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-11 20:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 917 bytes --]
Oh yes, I read Eldon Hall's book on that quite a few years ago. Meetings
held to discuss competing potential uses for a word of memory that had
become free.
That one would be a challenging Plan9 port..
On Fri, 12 Oct 2018 at 05:13, Lyndon Nerenberg <lyndon@orthanc.ca> wrote:
> Digby R.S. Tarvin writes:
>
> > Agreed, but the PDP11/70 was not constrained to 64KB memory either.
>
> > I do recall the MS-DOS small/large/medium etc models that used the
> > segmentation in various ways to mitigate the limitations of being a 16 bit
> > computer. Similar techniques were possible on the PDP11, for example
>
> Coincidental to this conversation, I'm currently reading "The Apollo
> Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
> (ISBN 978-1-4419-0876-6) Very interesting to see what you can do with
> a 15 bit architecture when sufficiently motivated.
>
> --lyndon
>
[-- Attachment #2: Type: text/html, Size: 1232 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 20:55 ` Digby R.S. Tarvin
@ 2018-10-11 21:03 ` Lyndon Nerenberg
0 siblings, 0 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 21:03 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
Digby R.S. Tarvin writes:
> Oh yes, I read Eldon Hall's book on that quite a few years ago. Meetings
> held to discuss competing potential uses for a word of memory that had
> become free.
> That one would be a challenging Plan9 port..
And yet Plan9 was not there to save the day. Such a pity.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 3:08 ` Digby R.S. Tarvin
2018-10-09 3:16 ` [9fans] PDP11 David Arnold
2018-10-09 11:58 ` [9fans] PDP11 (Was: Re: what heavy negativity!) Ethan Gardener
@ 2018-10-09 14:02 ` erik quanstrom
2 siblings, 0 replies; 89+ messages in thread
From: erik quanstrom @ 2018-10-09 14:02 UTC (permalink / raw)
To: 9fans
> From what I recall, PDP11 hardware memory management was based on
> segmentation rather than paging (64K divided into 16 variable sized
> segments), and Unix did swapping rather than paging (a process is either
> completely in memory or completely on disk).
It does relocation and completely in memory /and running/. or swapped out.
- erik
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 4:29 ` Digby R.S. Tarvin
2018-10-08 7:20 ` hiro
@ 2018-10-08 8:12 ` Nils M Holm
2018-10-08 9:12 ` Digby R.S. Tarvin
1 sibling, 1 reply; 89+ messages in thread
From: Nils M Holm @ 2018-10-08 8:12 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> A native Inferno port would certainly be a lot easier, but I think you
> might be a bit pessimistic about what can fit into a 64K address space
> machine. The 11/70 certainly managed to run a very respectable V7 Unix
> supporting 20-30 simultaneous active users in its day, [...]
The 11/70 was a completely different beast than, say, an 11/03.
The 70 had a backplane with 22 address lines, an MMU, and up to
4M bytes of memory. So while its processes were limited to
64K+64K bytes, I would not consider it to be a typical 16-bit
machine.
--
Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 8:12 ` Nils M Holm
@ 2018-10-08 9:12 ` Digby R.S. Tarvin
0 siblings, 0 replies; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08 9:12 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 3358 bytes --]
I quite agree - the PDP 11/70 was quite a high end 16 bit machine, but it
was the machine that I was talking about and the one I would most like to
revisit (although I wouldn't turn down an 11/40 if somebody offered me a
working one). I don't think I would contemplate putting Plan9 on a machine
with no MMU or a 64K physical memory limit.
My first reasonable multi-user, multi-tasking computer system (back in the
early 80s) was a home-made 6809 machine with a 6829 MMU and eventually 1MB
of RAM, running OS-9/6809. It initially ran with 64K for programs and the
rest of memory was a big RAM disk - because what else could you do with
such a ridiculous amount of memory. It did pretty well at providing a
personal Unix like environment, although it couldn't reproduce the fork()
semantics and there was no memory protection, and the memory constraints
meant always running the C compiler one pass at a time..
But we eventually ported 'Level 2' OS-9 which could use a mapping RAM/MMU,
and with that I had a quite robust multi-user system, with up to 64K
available per process, and 64K available for the kernel. I was able to get
most Unix programs running on it (except for a few with big tables that
compiled to larger than 64K) and no longer had to worry about exiting the
editor before doing a compile. Most of the core system utilities were
written in assembly language - so the equivalent of 'ls' for example,
required no more than a 256 byte memory allocation. And all executables
were loaded read-only and re-entrant (shared text) which helped. The only
real Achilles heel was that the 6809 had no illegal instruction trapping,
so executing data could occasionally result in an unrecoverable freeze..
I never liked the 68K version of OS-9 quite as much. Because of the larger
address space it used the MMU for protection only, with no address
translation - so the kernel was mapped into the same address space as the
user programs but just not accessible in user mode. It just didn't seem as
elegant.
Anyway, that's why I don't see 64K per process as necessarily being
inadequate for a lean operating system, although it would be easy enough
to write extravagant code that would not run in 64K, or a design that
relied on a large virtual address space - especially if you were used to
relying on virtual memory. I just don't know how small Plan9 can go, and
unless someone has already explored those limits, I suppose rather than
speculating I'll just have to plan on a little experimentation when I get
a bit of spare time.
Regards,
Digby
On Mon, 8 Oct 2018 at 19:13, Nils M Holm <nmh@t3x.org> wrote:
> On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> > A native Inferno port would certainly be a lot easier, but I think you
> > might be a bit pessimistic about what can fit into a 64K address space
> > machine. The 11/70 certainly managed to run a very respectable V7 Unix
> > supporting 20-30 simultaneous active users in its day, [...]
>
> The 11/70 was a completely different beast than, say, an 11/03.
> The 70 had a backplane with 22 address lines, an MMU, and up to
> 4M bytes of memory. So while its processes were limited to
> 64K+64K bytes, I would not consider it to be a typical 16-bit
> machine.
>
> --
> Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org
>
[-- Attachment #2: Type: text/html, Size: 3866 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-08 3:38 [9fans] PDP11 (Was: Re: what heavy negativity!) Lucio De Re
2018-10-08 4:29 ` Digby R.S. Tarvin
@ 2018-10-08 8:09 ` Nils M Holm
1 sibling, 0 replies; 89+ messages in thread
From: Nils M Holm @ 2018-10-08 8:09 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On 2018-10-08T05:38:07+0200, Lucio De Re wrote:
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
> Even MPM won't cut it, I don't think.
There were several UNIX 6th Edition-based "Mini Unix" variants
for the PDP-11/03 and other 16-bit systems. Then there is UZI, the
Unix Z80 Implementation, which can run multiple processes (with
swapping) in 64K bytes of RAM. CP/M ran in much less than 64KB.
--
Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-09 19:47 cinap_lenrek
2018-10-09 22:01 ` erik quanstrom
2018-10-09 23:43 ` Lyndon Nerenberg
0 siblings, 2 replies; 89+ messages in thread
From: cinap_lenrek @ 2018-10-09 19:47 UTC (permalink / raw)
To: 9fans
> The big one is USB. disk/radio->kernel->user-space-usbd->kernel->application.
> Four copies.
that sounds wrong.
usbd is not involved in the data transfer. it mainly is just responsible for
enumerating devices and instantiating drivers and registering the endpoints
in devusb. after that you access the endpoint files from devusb which goes
directly to the kernel. devusb also allows you to create an alias for an
endpoint file which then appears directly under /dev. usb audio uses this
mechanism. the usb driver just activates the device and provides the ctl/volume
files, while audio data is handled by the kernel's devusb.
on another remark regarding zero copy. the reason plan9 drivers are small comes
from NOT doing these "optimizations". identity mapping the low part of memory
in the kernel avoids a lot of trouble and allows you to get DMA capable memory
with just wrapping a pointer in PADDR(va). no page lists needed. no MMU tricks
needed in the drivers. you can use any kernel memory va for DMA... even your
kernel stack! it's never paged out. you can be sure it is not changed while the
device looks at it etc. do not underestimate the impact of this "simplification".
linux block layer is broken in that regard btw. it just hands user pages into
the drivers without making sure they do not change while the i/o is in flight,
which results in all kinds of false-negatives when you actually start verifying
your raid arrays as different snapshots in time got written out to the raid
members. they know about this and ignore it because benchmarks are more important.
--
cinap
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 19:47 cinap_lenrek
@ 2018-10-09 22:01 ` erik quanstrom
2018-10-09 23:43 ` Lyndon Nerenberg
1 sibling, 0 replies; 89+ messages in thread
From: erik quanstrom @ 2018-10-09 22:01 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/html, Size: 156 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 19:47 cinap_lenrek
2018-10-09 22:01 ` erik quanstrom
@ 2018-10-09 23:43 ` Lyndon Nerenberg
2018-10-10 5:52 ` hiro
2018-10-10 5:57 ` hiro
1 sibling, 2 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 23:43 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
cinap_lenrek@felloff.net writes:
> > The big one is USB. disk/radio->kernel->user-space-usbd->kernel->application.
> > Four copies.
>
> that sounds wrong.
>
> usbd is not involved in the data transfer.
You're right, I was wrong about 'usbd'. In the bits of testing I've
done with this, 'usbd' is replaced with a user space file server
that abstracts the hardware and presents a useful file system
interface. (E.g. along the lines of the gps filesystem interface.)
To address Hiro's comments, I have no benchmarks on Plan 9, because
the SDR code I run does not exist there. But I do have experience
with running SDR on Linux and FreeBSD with hardware like the HackRF
One. That hardware can easily saturate a USB2 interface/driver on
both of those operating systems.
Given my experience with USB on Plan 9 to date, it's a safe bet
that all the variants would die when presented with that amount of
traffic. (I can knock down a Plan9 system with 56 Kb/s USB serial
traffic.)
I can see about twisting up some code that would read the raw I/Q
data from the SDR via USB and see how it stands up. But the real
question is what kind of delay, latency, and jitter will there be,
getting that raw I/Q data from the USB interface up to the consuming
application?
Eliminating as much of the copy in/out WRT the kernel cannot but
help, especially when you're doing SDR decoding near the radios
using low-powered compute hardware (think Pies and the like).
--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 23:43 ` Lyndon Nerenberg
@ 2018-10-10 5:52 ` hiro
2018-10-10 8:13 ` Digby R.S. Tarvin
2018-10-11 17:43 ` Lyndon Nerenberg
2018-10-10 5:57 ` hiro
1 sibling, 2 replies; 89+ messages in thread
From: hiro @ 2018-10-10 5:52 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).
Does this include demodulation on the pi? cause even when i dumped the
pi i was given for that purpose (with a <2Mbit I/Q stream) and
replaced it with some similar ARM platform that at least had neon cpu
instruction extensions for faster floating point operations, I was
barely able to run a small FFT.
My conclusion was that these low-powered ARM systems are just good
enough for gathering low-bandwidth, non-critical USB traffic, like
those raw I/Q samples from a dongle, but unfit for anything else.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 5:52 ` hiro
@ 2018-10-10 8:13 ` Digby R.S. Tarvin
2018-10-10 9:14 ` hiro
2018-10-11 17:43 ` Lyndon Nerenberg
1 sibling, 1 reply; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-10 8:13 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 1965 bytes --]
I don't know which other ARM board you tried, but I have always found the
terrible I/O performance of the Pi to be a bigger problem than the ARM
speed. The USB2 interface is really slow, and there aren't really many
other (documented) alternative options. The Ethernet goes through the same
slow USB interface, and there is only so much that you can do bit bashing
data with GPIOs. The sdCard interface seems to be the only non-USB
filesystem I/O available. And that in turn limits the viability of
relieving the RAM constraints with virtual memory. So the ARM processor
itself is not usually the problem for me.
In general I find the pi a nice little device for quite a few things -
like low power, low bandwidth, low cost servers or displays with plenty of
open source compatibility.. Or hacking/prototyping where I don't want to
have to worry too much about blowing things up. But it is not good for high
throughput I/O, memory intensive applications, or anything requiring a lot
of processing power.
The validity of your conclusion regarding low power ARM in general
probably depends on what the other board you tried was..
DigbyT
On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:
> > Eliminating as much of the copy in/out WRT the kernel cannot but
> > help, especially when you're doing SDR decoding near the radios
> > using low-powered compute hardware (think Pies and the like).
>
> Does this include demodulation on the pi? cause even when i dumped the
> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> replaced it with some similar ARM platform that at least had neon cpu
> instruction extensions for faster floating point operations, I was
> barely able to run a small FFT.
>
> My conclusion was that these low-powered ARM systems are just good
> enough for gathering low-bandwidth, non-critical USB traffic, like
> those raw I/Q samples from a dongle, but unfit for anything else.
>
[-- Attachment #2: Type: text/html, Size: 2326 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 8:13 ` Digby R.S. Tarvin
@ 2018-10-10 9:14 ` hiro
2018-10-10 13:59 ` Steve Simon
2018-10-10 21:32 ` Digby R.S. Tarvin
0 siblings, 2 replies; 89+ messages in thread
From: hiro @ 2018-10-10 9:14 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]
I agree, if you have a choice avoid rpi at all costs.
Even if the software side of that other board was less pleasant at least
it worked with my mouse and keyboard!! :)
As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
But my point is that even though this number is low, the rpi is too
limited to do any meaningful processing anyway (ignoring the usb troubles
and lack of ethernet). It's a mobile phone soc after all, where the
modulation is done by dedicated chips, not on cpu! :)
On Wednesday, October 10, 2018, Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
> I don't know which other ARM board you tried, but I have always found the
> terrible I/O performance of the Pi to be a bigger problem than the ARM
> speed. The USB2 interface is really slow, and there aren't really many
> other (documented) alternative options. The Ethernet goes through the same
> slow USB interface, and there is only so much that you can do bit bashing
> data with GPIOs. The sdCard interface seems to be the only non-USB
> filesystem I/O available. And that in turn limits the viability of
> relieving the RAM constraints with virtual memory. So the ARM processor
> itself is not usually the problem for me.
> In general I find the pi a nice little device for quite a few things -
> like low power, low bandwidth, low cost servers or displays with plenty of
> open source compatibility.. Or hacking/prototyping where I don't want to
> have to worry too much about blowing things up. But it is not good for high
> throughput I/O, memory intensive applications, or anything requiring a lot
> of processing power.
> The validity of your conclusion regarding low power ARM in general
> probably depends on what the other board you tried was..
> DigbyT
> On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:
>>
>> > Eliminating as much of the copy in/out WRT the kernel cannot but
>> > help, especially when you're doing SDR decoding near the radios
>> > using low-powered compute hardware (think Pies and the like).
>>
>> Does this include demodulation on the pi? cause even when i dumped the
>> pi i was given for that purpose (with a <2Mbit I/Q stream) and
>> replaced it with some similar ARM platform that at least had neon cpu
>> instruction extensions for faster floating point operations, I was
>> barely able to run a small FFT.
>>
>> My conclusion was that these low-powered ARM systems are just good
>> enough for gathering low-bandwidth, non-critical USB traffic, like
>> those raw I/Q samples from a dongle, but unfit for anything else.
>>
>
[-- Attachment #2: Type: text/html, Size: 2853 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 9:14 ` hiro
@ 2018-10-10 13:59 ` Steve Simon
2018-10-10 21:32 ` Digby R.S. Tarvin
1 sibling, 0 replies; 89+ messages in thread
From: Steve Simon @ 2018-10-10 13:59 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
people come down very hard on the pi.
here are my times for building the pi kernel.
i rebuilt it a few times to push data into any caches available.
pi3+ with a high-ish spec sd card: 23 secs
dual intel atom 1.8GHz with an SSD: 9 secs
the pi is slower, but not 10 times slower. However it does cost a 10th of
the price and consumes a 10th of the electricity.
i use the order of magnitude test as that is (in my experience) what you
need to make a really noticeable difference (to stuff in general).
i use one daily as a plan9 terminal, for which i feel it's ideal.
-Steve
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 9:14 ` hiro
2018-10-10 13:59 ` Steve Simon
@ 2018-10-10 21:32 ` Digby R.S. Tarvin
1 sibling, 0 replies; 89+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-10 21:32 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 3664 bytes --]
Well, I think 'avoid at all costs' is a bit strong.
The Raspberry Pi is a good little platform for the right applications, so
long as you are aware of its limitations. I use one as my 'always on' home
server to give me access to files when travelling (the networking is slow
by LAN standards, but ok for WAN), and another for my energy monitoring
system. It is good for experimenting with OSes, especially networking
OSes like Plan9 where price is important if you want to try a large number
of hosts. It's good for teaching/learning. Or for running/trying different
operating systems without having to spend time and resources setting up
VMs (downloading and flashing an sd card image is quick and takes up no
space on my main systems).
Just don't plan on deploying RPis for mission critical applications that
have demanding I/O or processing requirements. It was never intended to
compete in that market.
On Wed, 10 Oct 2018 at 20:54, hiro <23hiro@gmail.com> wrote:
> I agree, if you have a choice avoid rpi at all costs.
> Even if the software side of that other board was less pleasant at least
> it worked with my mouse and keyboard!! :)
>
> As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
> But my point is that even though this number is low, the rpi is too limited
> to do any meaningful processing anyway (ignoring the usb troubles and lack
> of ethernet). It's a mobile phone soc after all, where the modulation is
> done by dedicated chips, not on cpu! :)
>
> On Wednesday, October 10, 2018, Digby R.S. Tarvin <digbyt42@gmail.com>
> wrote:
> > I don't know which other ARM board you tried, but I have always found
> > the terrible I/O performance of the Pi to be a bigger problem than the
> > ARM speed. The USB2 interface is really slow, and there aren't really
> > many other (documented) alternative options. The Ethernet goes through
> > the same slow USB interface, and there is only so much that you can do
> > bit bashing data with GPIOs. The sdCard interface seems to be the only
> > non-USB filesystem I/O available. And that in turn limits the viability
> > of relieving the RAM constraints with virtual memory. So the ARM
> > processor itself is not usually the problem for me.
> > In general I find the pi a nice little device for quite a few things -
> > like low power, low bandwidth, low cost servers or displays with plenty
> > of open source compatibility.. Or hacking/prototyping where I don't want
> > to have to worry too much about blowing things up. But it is not good
> > for high throughput I/O, memory intensive applications, or anything
> > requiring a lot of processing power.
> > The validity of your conclusion regarding low power ARM in general
> > probably depends on what the other board you tried was..
> > DigbyT
> > On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:
> >>
> >> > Eliminating as much of the copy in/out WRT the kernel cannot but
> >> > help, especially when you're doing SDR decoding near the radios
> >> > using low-powered compute hardware (think Pies and the like).
> >>
> >> Does this include demodulation on the pi? cause even when i dumped the
> >> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> >> replaced it with some similar ARM platform that at least had neon cpu
> >> instruction extensions for faster floating point operations, I was
> >> barely able to run a small FFT.
> >>
> >> My conclusion was that these low-powered ARM systems are just good
> >> enough for gathering low-bandwidth, non-critical USB traffic, like
> >> those raw I/Q samples from a dongle, but unfit for anything else.
> >>
> >
[-- Attachment #2: Type: text/html, Size: 4174 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 5:52 ` hiro
2018-10-10 8:13 ` Digby R.S. Tarvin
@ 2018-10-11 17:43 ` Lyndon Nerenberg
2018-10-11 19:11 ` hiro
1 sibling, 1 reply; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 17:43 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
hiro writes:
> Does this include demodulation on the pi?
Yes. At least to a certain extent. The idea is to get from the
high-bitrate I/Q data to something more amenable to transmission
over an RS-422 (or -485) serial drop.
One example is for an AIS transceiver on a boat. By putting the
radio and decoder at the top of the mast, the backhaul can be a
cat-3 twisted pair cable, rather than a much heavier coax run from
the antenna at the top of the mast to the receiver below decks.
Reducing the weight at the top of the mast reduces the moment arm
acting on the boat, significantly enhancing the stability of a
sailboat (which is how I got started down this road to begin with).
--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 17:43 ` Lyndon Nerenberg
@ 2018-10-11 19:11 ` hiro
2018-10-11 19:27 ` Lyndon Nerenberg
0 siblings, 1 reply; 89+ messages in thread
From: hiro @ 2018-10-11 19:11 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> One example is for an AIS transceiver on a boat. By putting the
> radio and decoder at the top of the mast, the backhaul can be a
> cat-3 twisted pair cable, rather than a much heavier coax run from
> the antenna at the top of the mast to the receiver below decks.
Yeah, I've been sending 3Mbit I/Q samples over ethernet to a more
beefy computer. For non-technical crowds I described the rpi as a
passable USB->ethernet gateway for SDR tasks in that bandwidth.
But given the alternatives available back then, even the armv5 in the
kirkwood, which was cheaper even before the rpi became popular, did
the same job more stably, which is why i would never actually
recommend the pi. And there are even more alternatives now.
Even the rpi itself is proof that better alternatives exist (as they
did even back then when the first one came out), because the newer rpi
revision (i think) has finally gained neon cpu extensions, which
surprisingly have been supported by gnuradio long before this, and a
reason why my bachelor thesis back then was an easy success :)
In general all limits that occurred to me on the rpi were due to
stability (usb power and compatibility issues), but more concretely
for our discussion: lack of cpu power, mainly for the FFT. There were
no throughput, delay or memory copy bottlenecks for me. This was using
linux, because my mouse didn't work on the old rpi plan9 image and
sadly there was a time-limit...
Are you doing the AIS demodulation on plan9 on rpi? It would be a
great showcase. Wish I had been given the opportunity to find an
excuse to build something like that on plan9 instead :)
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 19:11 ` hiro
@ 2018-10-11 19:27 ` Lyndon Nerenberg
2018-10-11 19:56 ` hiro
0 siblings, 1 reply; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 19:27 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
hiro writes:
> But given the alternatives available back then, even the armv5 in the
> kirkwood, which was cheaper even before the rpi became popular, did
> the same job more stably, which is why i would never actually
> recommend the pi. And there are even more alternatives now.
I get that. But the actual hardware driving this conversation isn't
particularly relevant, and devolving to a hardware bikeshed isn't
helpful. (Not picking on you specifically.)
> Are you doing the AIS demodulation on plan9 on rpi? It would be a
> great showcase. Wish I had been given the opportunity to find an
> excuse to build something like that on plan9 instead :)
Not yet. First I need to prove it can be done with the usual
suspects (GNU radio, on the Pi -- the native fft libraries seem fast
enough to make this viable). If the pessimized case works, then
porting the code from the GNU radio python modules to C is a
mechanical process for the most part.
This week I am ENOTIME with getting the boat tarped up in preparation
for the winter monsoon season :-P.
--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-11 19:27 ` Lyndon Nerenberg
@ 2018-10-11 19:56 ` hiro
0 siblings, 0 replies; 89+ messages in thread
From: hiro @ 2018-10-11 19:56 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> need to prove it can be done with the usual
> suspects (GNU radio, on the Pi -- the native fft libraries seem fast
> enough to make this viable).
be assured i've demodulated 25 kHz signals in real-time and it's a
walk in the park, as long as your revision has the neon stuff i
mentioned, otherwise the fft becomes the bottleneck.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 23:43 ` Lyndon Nerenberg
2018-10-10 5:52 ` hiro
@ 2018-10-10 5:57 ` hiro
1 sibling, 0 replies; 89+ messages in thread
From: hiro @ 2018-10-10 5:57 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> via USB and see how it stands up. But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?
How is your proposal of zero-copy going to help latency?
IIRC we have some real-time thingy, might be able to reduce jitter...
But then I might also ask why you're not doing the most critical path
on an fpga anyway?
Start with identifying your worst bottleneck.
> Eliminating as much of the copy in/out WRT the kernel cannot but
> help
wrong, this design change requires resources, too, and might gain you
higher complexity. measure first.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-09 19:49 cinap_lenrek
2018-10-09 19:56 ` hiro
0 siblings, 1 reply; 89+ messages in thread
From: cinap_lenrek @ 2018-10-09 19:49 UTC (permalink / raw)
To: 9fans
also, i wonder how much is the actual copy overhead you claim is the issue.
maybe the impact for copying is more dominated by the memory allocator used
for allocb(). have you measured?
--
cinap
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-09 19:49 cinap_lenrek
@ 2018-10-09 19:56 ` hiro
0 siblings, 0 replies; 89+ messages in thread
From: hiro @ 2018-10-09 19:56 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
he has ignored my questions about measurement, so i'm sure he hasn't
On 10/9/18, cinap_lenrek@felloff.net <cinap_lenrek@felloff.net> wrote:
> also, i wonder how much is the actual copy overhead you claim is the issue.
> maybe the impact for copying is more dominated by the memory allocator used
> for allocb(). have you measured?
>
> --
> cinap
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 0:15 cinap_lenrek
2018-10-10 0:22 ` Lyndon Nerenberg
0 siblings, 1 reply; 89+ messages in thread
From: cinap_lenrek @ 2018-10-10 0:15 UTC (permalink / raw)
To: 9fans
> To address Hiro's comments, I have no benchmarks on Plan 9, because
> the SDR code I run does not exist there. But I do have experience
> with running SDR on Linux and FreeBSD with hardware like the HackRF
> One. That hardware can easily saturate a USB2 interface/driver on
> both of those operating systems. Given my experience with USB on
> Plan 9 to date, it's a safe bet that all the variants would die
> when presented with that amount of traffic.
why? the *HOST CONTROLLER* schedules the data transfers. if the
program doesn't do a read() there's nothing to schedule... (unless
it's an isochronous endpoint, in which case the controller dma's for
you in the background at the specified sampling rate).
> (I can knock down a Plan9 system with 56 Kb/s USB serial traffic.)
that sounds seriously screwed up. i have no issues here reading
a usb stick on my x230 with xhci at 32MB/s, not using any fancy
streaming optimization. no load at all. and this is just some
garbage from the supermarket.
> I can see about
> twisting up some code that would read the raw I/Q data from the SDR
> via USB and see how it stands up. But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?
is this an isochronous endpoint? in that case you would not have to
worry much as the controller does all the timing for you in hardware.
> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).
ahhhh! we're talking about some crappy raspi here... probably with all
caches disabled... never mind.
> --lyndon
--
cinap
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 0:15 cinap_lenrek
@ 2018-10-10 0:22 ` Lyndon Nerenberg
0 siblings, 0 replies; 89+ messages in thread
From: Lyndon Nerenberg @ 2018-10-10 0:22 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg
cinap_lenrek@felloff.net writes:
> why? the *HOST CONTROLLER* schedules the data transfers.
I *DON'T KNOW*. It's just observed behaviour.
> ahhhh! we're talking about some crappy raspi here... probably with all
> caches disabled... never mind.
Hah. An Rpi tips over with 1200 baud USB serial. I was talking about
"real" (Intel :-P) hardware for the other tippy-over behaviour.
--lyndon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 16:14 cinap_lenrek
0 siblings, 0 replies; 89+ messages in thread
From: cinap_lenrek @ 2018-10-10 16:14 UTC (permalink / raw)
To: 9fans
oh! you wrote an nvme driver TOO? where can i find it? maybe we can
share some knowledge. especially regarding some quirks. i don't own
hardware myself, so i wrote it using an emulator over a weekend and
tested it on a work machine after work.
http://code.9front.org/hg/plan9front/log/9df9ef969856/sys/src/9/pc/sdnvme.c
--
cinap
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 17:34 cinap_lenrek
2018-10-10 21:54 ` Steven Stallion
0 siblings, 1 reply; 89+ messages in thread
From: cinap_lenrek @ 2018-10-10 17:34 UTC (permalink / raw)
To: 9fans
> But the reason I want this is to reduce latency to the first
> access, especially for very large files. With read() I have
> to wait until the read completes. With mmap() processing can
> start much earlier and can be interleaved with background
> data fetch or prefetch. With read() a lot more resources
> are tied down. If I need random access and don't need to
> read all of the data, the application has to do pread(),
> pwrite() a lot thus complicating it. With mmap() I can just
> map in the whole file and excess reading (beyond what the
> app needs) will not be a large fraction.
you think doing single 4K page sized reads in the pagefault
handler is better than doing precise >4K reads from your
application? possibly in a background thread so you can
overlap processing with data fetching?
the advantage of mmap is not prefetch. it's about not doing
any I/O when data is already in the *SHARED* buffer cache!
which plan9 does not have (except the mntcache, but that is
optional and only works for the disk fileservers that maintain
their file qid ver info consistently). it *IS* really a linux
thing where all block device i/o goes thru the buffer cache.
--
cinap
^ permalink raw reply [flat|nested] 89+ messages in thread
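A minimal sketch of the alternative cinap describes above: large explicit
reads issued from a background thread while the main thread processes
whatever has already arrived. It is an illustration in Python for a POSIX
host, not Plan 9 code. The names CHUNK, reader and checksum_file are made
up for the example, the 64K read size is an arbitrary assumption, and the
byte sum only stands in for real processing.

```python
import os
import queue
import threading

CHUNK = 64 * 1024  # assumed read size, deliberately larger than a 4K page


def reader(fd, size, q):
    """Background thread: issue large pread() calls and queue the chunks."""
    off = 0
    while off < size:
        buf = os.pread(fd, min(CHUNK, size - off), off)
        if not buf:  # unexpected end of file
            break
        q.put((off, buf))
        off += len(buf)
    q.put(None)  # sentinel: no more data


def checksum_file(path):
    """Process chunks as they arrive, overlapping processing with fetching."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        q = queue.Queue(maxsize=4)  # bounded queue limits read-ahead memory
        t = threading.Thread(target=reader, args=(fd, size, q))
        t.start()
        total = 0
        while True:
            item = q.get()
            if item is None:
                break
            _, buf = item
            total = (total + sum(buf)) & 0xFFFFFFFF  # stand-in for real work
        t.join()
        return total
    finally:
        os.close(fd)
```

Processing can begin as soon as the first chunk is queued, which is the
"latency to first access" benefit claimed for mmap, and the bounded queue
keeps the amount of data read ahead under the application's control.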
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 17:34 cinap_lenrek
@ 2018-10-10 21:54 ` Steven Stallion
2018-10-10 22:29 ` Kurt H Maier
` (2 more replies)
0 siblings, 3 replies; 89+ messages in thread
From: Steven Stallion @ 2018-10-10 21:54 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
As the guy who wrote the majority of the code that pushed those 1M 4K
random IOPS erik mentioned, this thread annoys the shit out of me. You
don't get an award for writing a driver. In fact, it's probably better
not to be known at all considering the bloody murder one has to commit
to marry hardware and software together.
Let's be frank, the I/O handling in the kernel is anachronistic. To
hit those rates, I had to add support for asynchronous and vectored
I/O not to mention a sizable bit of work by a co-worker to properly
handle NUMA on our appliances to hit those speeds. As I recall, we had
to rewrite the scheduler and re-implement locking, which even Charles
Forsyth had a hand in. Had we the time and resources to implement
something like zero-copy we'd have done it in a heartbeat.
In the end, it doesn't matter how "fast" a storage driver is in Plan 9
- as soon as you put a 9P-based filesystem on it, it's going to be
limited to a single outstanding operation. This is the tyranny of 9P.
We (Coraid) got around this by avoiding filesystems altogether.
Go solve that problem first.
On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
>
> > But the reason I want this is to reduce latency to the first
> > access, especially for very large files. With read() I have
> > to wait until the read completes. With mmap() processing can
> > start much earlier and can be interleaved with background
> > data fetch or prefetch. With read() a lot more resources
> > are tied down. If I need random access and don't need to
> > read all of the data, the application has to do pread(),
> > pwrite() a lot thus complicating it. With mmap() I can just
> > map in the whole file and excess reading (beyond what the
> > app needs) will not be a large fraction.
>
> you think doing single 4K page sized reads in the pagefault
> handler is better than doing precise >4K reads from your
> application? possibly in a background thread so you can
> overlap processing with data fetching?
>
> the advantage of mmap is not prefetch. its about not to do
> any I/O when data is already in the *SHARED* buffer cache!
> which plan9 does not have (except the mntcache, but that is
> optional and only works for the disk fileservers that maintain
> ther file qid ver info consistently). its *IS* really a linux
> thing where all block device i/o goes thru the buffer cache.
>
> --
> cinap
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 21:54 ` Steven Stallion
@ 2018-10-10 22:29 ` Kurt H Maier
2018-10-10 22:55 ` Steven Stallion
2018-10-11 0:26 ` Skip Tavakkolian
2018-10-14 9:46 ` Ole-Hjalmar Kristensen
2 siblings, 1 reply; 89+ messages in thread
From: Kurt H Maier @ 2018-10-10 22:29 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
On Wed, Oct 10, 2018 at 04:54:22PM -0500, Steven Stallion wrote:
> As the guy
might be worth keeping in mind the current most common use case for nvme
is laptop storage and not building jet engines in coraid's basement
so the nvme driver that cinap wrote works on my thinkpad today and is
about infinity times faster than the one you guys locked up in the
warehouse at the end of raiders of the lost ark, because my laptop can't
seem to boot off nostalgia.
so no, nobody gets an award for writing a driver. but cinap won the
9front Order of Valorous Service (with bronze oak leaf cluster,
signifying working code) for *releasing* one. I was there when field
marshal aiju presented the award; it was a very nice ceremony.
anyway, someone once said communication is not a zero-sum game. the
hyperspecific use case you describe is fine but there are other reasons
to care about how well this stuff works, you know?
khm
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 22:29 ` Kurt H Maier
@ 2018-10-10 22:55 ` Steven Stallion
2018-10-11 11:19 ` Aram Hăvărneanu
0 siblings, 1 reply; 89+ messages in thread
From: Steven Stallion @ 2018-10-10 22:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
Posted August 15th, 2013:
https://9p.io/sources/contrib/stallion/src/sdmpt2.c
Corresponding announcement:
https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ
On Wed, Oct 10, 2018 at 5:31 PM Kurt H Maier <khm@sciops.net> wrote:
>
> On Wed, Oct 10, 2018 at 04:54:22PM -0500, Steven Stallion wrote:
> > As the guy
>
> might be worth keeping in mind the current most common use case for nvme
> is laptop storage and not building jet engines in coraid's basement
>
> so the nvme driver that cinap wrote works on my thinkpad today and is
> about infinity times faster than the one you guys locked up in the
> warehouse at the end of raiders of the lost ark, because my laptop can't
> seem to boot off nostalgia.
>
> so no, nobody gets an award for writing a driver. but cinap won the
> 9front Order of Valorous Service (with bronze oak leaf cluster,
> signifying working code) for *releasing* one. I was there when field
> marshal aiju presented the award; it was a very nice ceremony.
>
> anyway, someone once said communication is not a zero-sum game. the
> hyperspecific use case you describe is fine but there are other reasons
> to care about how well this stuff works, you know?
>
> khm
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 22:55 ` Steven Stallion
@ 2018-10-11 11:19 ` Aram Hăvărneanu
0 siblings, 0 replies; 89+ messages in thread
From: Aram Hăvărneanu @ 2018-10-11 11:19 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> Posted August 15th, 2013:
> https://9p.io/sources/contrib/stallion/src/sdmpt2.c Corresponding
> announcement:
> https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ
This is not an NVMe driver.
--
Aram Hăvărneanu
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-10 21:54 ` Steven Stallion 2018-10-10 22:29 ` Kurt H Maier @ 2018-10-11 0:26 ` Skip Tavakkolian 2018-10-11 1:03 ` Steven Stallion 2018-10-14 9:46 ` Ole-Hjalmar Kristensen 2 siblings, 1 reply; 89+ messages in thread From: Skip Tavakkolian @ 2018-10-11 0:26 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs [-- Attachment #1: Type: text/plain, Size: 2786 bytes --] For operations that matter in this context (read, write), there can be multiple outstanding tags. A while back rsc implemented fcp, partly to prove this point. On Wed, Oct 10, 2018 at 2:54 PM Steven Stallion <sstallion@gmail.com> wrote: > As the guy who wrote the majority of the code that pushed those 1M 4K > random IOPS erik mentioned, this thread annoys the shit out of me. You > don't get an award for writing a driver. In fact, it's probably better > not to be known at all considering the bloody murder one has to commit > to marry hardware and software together. > > Let's be frank, the I/O handling in the kernel is anachronistic. To > hit those rates, I had to add support for asynchronous and vectored > I/O not to mention a sizable bit of work by a co-worker to properly > handle NUMA on our appliances to hit those speeds. As I recall, we had > to rewrite the scheduler and re-implement locking, which even Charles > Forsyth had a hand in. Had we the time and resources to implement > something like zero-copy we'd have done it in a heartbeat. > > In the end, it doesn't matter how "fast" a storage driver is in Plan 9 > - as soon as you put a 9P-based filesystem on it, it's going to be > limited to a single outstanding operation. This is the tyranny of 9P. > We (Coraid) got around this by avoiding filesystems altogether. > > Go solve that problem first. > On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote: > > > > > But the reason I want this is to reduce latency to the first > > > access, especially for very large files. 
With read() I have > > > to wait until the read completes. With mmap() processing can > > > start much earlier and can be interleaved with background > > > data fetch or prefetch. With read() a lot more resources > > > are tied down. If I need random access and don't need to > > > read all of the data, the application has to do pread(), > > > pwrite() a lot thus complicating it. With mmap() I can just > > > map in the whole file and excess reading (beyond what the > > > app needs) will not be a large fraction. > > > > you think doing single 4K page sized reads in the pagefault > > handler is better than doing precise >4K reads from your > > application? possibly in a background thread so you can > > overlap processing with data fetching? > > > > the advantage of mmap is not prefetch. its about not to do > > any I/O when data is already in the *SHARED* buffer cache! > > which plan9 does not have (except the mntcache, but that is > > optional and only works for the disk fileservers that maintain > > ther file qid ver info consistently). its *IS* really a linux > > thing where all block device i/o goes thru the buffer cache. > > > > -- > > cinap > > > > [-- Attachment #2: Type: text/html, Size: 3340 bytes --] ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-11 0:26 ` Skip Tavakkolian @ 2018-10-11 1:03 ` Steven Stallion 0 siblings, 0 replies; 89+ messages in thread From: Steven Stallion @ 2018-10-11 1:03 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs Interesting - was this ever generalized? It's been several years since I last looked, but I seem to recall that unless you went out of your way to write your own 9P implementation, you were limited to a single tag. On Wed, Oct 10, 2018 at 7:51 PM Skip Tavakkolian <skip.tavakkolian@gmail.com> wrote: > > For operations that matter in this context (read, write), there can be multiple outstanding tags. A while back rsc implemented fcp, partly to prove this point. > > On Wed, Oct 10, 2018 at 2:54 PM Steven Stallion <sstallion@gmail.com> wrote: >> >> As the guy who wrote the majority of the code that pushed those 1M 4K >> random IOPS erik mentioned, this thread annoys the shit out of me. You >> don't get an award for writing a driver. In fact, it's probably better >> not to be known at all considering the bloody murder one has to commit >> to marry hardware and software together. >> >> Let's be frank, the I/O handling in the kernel is anachronistic. To >> hit those rates, I had to add support for asynchronous and vectored >> I/O not to mention a sizable bit of work by a co-worker to properly >> handle NUMA on our appliances to hit those speeds. As I recall, we had >> to rewrite the scheduler and re-implement locking, which even Charles >> Forsyth had a hand in. Had we the time and resources to implement >> something like zero-copy we'd have done it in a heartbeat. >> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9 >> - as soon as you put a 9P-based filesystem on it, it's going to be >> limited to a single outstanding operation. This is the tyranny of 9P. >> We (Coraid) got around this by avoiding filesystems altogether. >> >> Go solve that problem first. 
>> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote: >> > >> > > But the reason I want this is to reduce latency to the first >> > > access, especially for very large files. With read() I have >> > > to wait until the read completes. With mmap() processing can >> > > start much earlier and can be interleaved with background >> > > data fetch or prefetch. With read() a lot more resources >> > > are tied down. If I need random access and don't need to >> > > read all of the data, the application has to do pread(), >> > > pwrite() a lot thus complicating it. With mmap() I can just >> > > map in the whole file and excess reading (beyond what the >> > > app needs) will not be a large fraction. >> > >> > you think doing single 4K page sized reads in the pagefault >> > handler is better than doing precise >4K reads from your >> > application? possibly in a background thread so you can >> > overlap processing with data fetching? >> > >> > the advantage of mmap is not prefetch. its about not to do >> > any I/O when data is already in the *SHARED* buffer cache! >> > which plan9 does not have (except the mntcache, but that is >> > optional and only works for the disk fileservers that maintain >> > ther file qid ver info consistently). its *IS* really a linux >> > thing where all block device i/o goes thru the buffer cache. >> > >> > -- >> > cinap >> > >> ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-10 21:54 ` Steven Stallion
2018-10-10 22:29 ` Kurt H Maier
2018-10-11 0:26 ` Skip Tavakkolian
@ 2018-10-14 9:46 ` Ole-Hjalmar Kristensen
2018-10-14 10:37 ` hiro
2 siblings, 1 reply; 89+ messages in thread
From: Ole-Hjalmar Kristensen @ 2018-10-14 9:46 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
[-- Attachment #1: Type: text/plain, Size: 3416 bytes --]
I'm not going to argue with someone who has got his hands dirty by actually
doing this but I don't really get this about the tyranny of 9p. Isn't the
point of the tag field to identify the request? What is stopping the client
from issuing multiple requests and matching the replies based on the tag?
From the manual:
    Each T-message has a tag field, chosen and used by the
    client to identify the message. The reply to the message
    will have the same tag. Clients must arrange that no two
    outstanding messages on the same connection have the same
    tag. An exception is the tag NOTAG, defined as (ushort)~0
    in <fcall.h>: the client can use it, when establishing a
    connection, to override tag matching in version messages.
On Wed, 10 Oct 2018, 23:56, Steven Stallion <sstallion@gmail.com> wrote:
> As the guy who wrote the majority of the code that pushed those 1M 4K
> random IOPS erik mentioned, this thread annoys the shit out of me. You
> don't get an award for writing a driver. In fact, it's probably better
> not to be known at all considering the bloody murder one has to commit
> to marry hardware and software together.
>
> Let's be frank, the I/O handling in the kernel is anachronistic. To
> hit those rates, I had to add support for asynchronous and vectored
> I/O not to mention a sizable bit of work by a co-worker to properly
> handle NUMA on our appliances to hit those speeds. As I recall, we had
> to rewrite the scheduler and re-implement locking, which even Charles
> Forsyth had a hand in. Had we the time and resources to implement
> something like zero-copy we'd have done it in a heartbeat.
>
> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> - as soon as you put a 9P-based filesystem on it, it's going to be
> limited to a single outstanding operation. This is the tyranny of 9P.
> We (Coraid) got around this by avoiding filesystems altogether.
>
> Go solve that problem first.
> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
> >
> > > But the reason I want this is to reduce latency to the first
> > > access, especially for very large files. With read() I have
> > > to wait until the read completes. With mmap() processing can
> > > start much earlier and can be interleaved with background
> > > data fetch or prefetch. With read() a lot more resources
> > > are tied down. If I need random access and don't need to
> > > read all of the data, the application has to do pread(),
> > > pwrite() a lot thus complicating it. With mmap() I can just
> > > map in the whole file and excess reading (beyond what the
> > > app needs) will not be a large fraction.
> >
> > you think doing single 4K page sized reads in the pagefault
> > handler is better than doing precise >4K reads from your
> > application? possibly in a background thread so you can
> > overlap processing with data fetching?
> >
> > the advantage of mmap is not prefetch. its about not to do
> > any I/O when data is already in the *SHARED* buffer cache!
> > which plan9 does not have (except the mntcache, but that is
> > optional and only works for the disk fileservers that maintain
> > ther file qid ver info consistently). its *IS* really a linux
> > thing where all block device i/o goes thru the buffer cache.
> >
> > --
> > cinap
> >
>
>
[-- Attachment #2: Type: text/html, Size: 4264 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
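The tag rule quoted from the manual in the message above can be sketched as
a small bookkeeping table. This is an illustration only, not a 9P
implementation: nothing is encoded or sent on a wire, and the names
TagTable, send and receive are made up for the example. The only detail
taken from the quoted text is that NOTAG is (ushort)~0 and is reserved for
version messages.

```python
NOTAG = 0xFFFF  # (ushort)~0, reserved for version messages per the manual


class TagTable:
    """Tracks outstanding requests on one connection, keyed by tag."""

    def __init__(self):
        self.outstanding = {}  # tag -> request description
        self.free = []         # tags released by completed requests
        self.next_tag = 0      # next never-used tag

    def send(self, request):
        """Pick a tag that is not in use and record the request under it."""
        if self.free:
            tag = self.free.pop()
        else:
            if self.next_tag == NOTAG:
                raise RuntimeError("out of tags")
            tag = self.next_tag
            self.next_tag += 1
        self.outstanding[tag] = request
        return tag

    def receive(self, tag, reply):
        """Match a reply to its request by tag; arrival order is irrelevant."""
        request = self.outstanding.pop(tag)  # KeyError on an unknown tag
        self.free.append(tag)                # the tag may now be reused
        return request, reply
```

A client built this way can keep several reads outstanding on one
connection and accept the replies in whatever order the server produces
them, which is the behaviour asked about above.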
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!) 2018-10-14 9:46 ` Ole-Hjalmar Kristensen @ 2018-10-14 10:37 ` hiro 2018-10-14 17:34 ` Ole-Hjalmar Kristensen 0 siblings, 1 reply; 89+ messages in thread From: hiro @ 2018-10-14 10:37 UTC (permalink / raw) To: Fans of the OS Plan 9 from Bell Labs there's no tyranny involved. a client that is fine with the *responses* coming in reordered could remember the tag obviously and do whatever you imagine. the problem is potential reordering of the messages in the kernel before responding, even if the 9p transport has guaranteed ordering. On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com> wrote: > I'm not going to argue with someone who has got his hands dirty by actually > doing this but I don't really get this about the tyranny of 9p. Isn't the > point of the tag field to identify the request? What is stopping the client > from issuing multiple requests and match the replies based on the tag? From > the manual: > > Each T-message has a tag field, chosen and used by the > client to identify the message. The reply to the message > will have the same tag. Clients must arrange that no two > outstanding messages on the same connection have the same > tag. An exception is the tag NOTAG, defined as (ushort)~0 > in <fcall.h>: the client can use it, when establishing a > connection, to override tag matching in version messages. > > > > Den ons. 10. okt. 2018, 23.56 skrev Steven Stallion <sstallion@gmail.com>: > >> As the guy who wrote the majority of the code that pushed those 1M 4K >> random IOPS erik mentioned, this thread annoys the shit out of me. You >> don't get an award for writing a driver. In fact, it's probably better >> not to be known at all considering the bloody murder one has to commit >> to marry hardware and software together. >> >> Let's be frank, the I/O handling in the kernel is anachronistic. 
To >> hit those rates, I had to add support for asynchronous and vectored >> I/O not to mention a sizable bit of work by a co-worker to properly >> handle NUMA on our appliances to hit those speeds. As I recall, we had >> to rewrite the scheduler and re-implement locking, which even Charles >> Forsyth had a hand in. Had we the time and resources to implement >> something like zero-copy we'd have done it in a heartbeat. >> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9 >> - as soon as you put a 9P-based filesystem on it, it's going to be >> limited to a single outstanding operation. This is the tyranny of 9P. >> We (Coraid) got around this by avoiding filesystems altogether. >> >> Go solve that problem first. >> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote: >> > >> > > But the reason I want this is to reduce latency to the first >> > > access, especially for very large files. With read() I have >> > > to wait until the read completes. With mmap() processing can >> > > start much earlier and can be interleaved with background >> > > data fetch or prefetch. With read() a lot more resources >> > > are tied down. If I need random access and don't need to >> > > read all of the data, the application has to do pread(), >> > > pwrite() a lot thus complicating it. With mmap() I can just >> > > map in the whole file and excess reading (beyond what the >> > > app needs) will not be a large fraction. >> > >> > you think doing single 4K page sized reads in the pagefault >> > handler is better than doing precise >4K reads from your >> > application? possibly in a background thread so you can >> > overlap processing with data fetching? >> > >> > the advantage of mmap is not prefetch. its about not to do >> > any I/O when data is already in the *SHARED* buffer cache! 
>> > which plan9 does not have (except the mntcache, but that is >> > optional and only works for the disk fileservers that maintain >> > ther file qid ver info consistently). its *IS* really a linux >> > thing where all block device i/o goes thru the buffer cache. >> > >> > -- >> > cinap >> > >> >> > ^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-14 10:37 ` hiro
@ 2018-10-14 17:34 ` Ole-Hjalmar Kristensen
2018-10-14 19:17 ` hiro
2018-10-15 9:29 ` Giacomo Tesio
0 siblings, 2 replies; 89+ messages in thread
From: Ole-Hjalmar Kristensen @ 2018-10-14 17:34 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

OK, that makes sense. So it would not stop a client from, for example, first
reading an index block in a B-tree, waiting for the result, and then issuing
read operations for all the data blocks in parallel. That's exactly the same
as any asynchronous disk subsystem I am acquainted with. Reordering is the
norm.

On Sun, Oct 14, 2018 at 1:21 PM hiro <23hiro@gmail.com> wrote:

> there's no tyranny involved.
>
> a client that is fine with the *responses* coming in reordered could
> remember the tag obviously and do whatever you imagine.
>
> the problem is potential reordering of the messages in the kernel
> before responding, even if the 9p transport has guaranteed ordering.
>
> On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com>
> wrote:
> > I'm not going to argue with someone who has got his hands dirty by
> > actually doing this but I don't really get this about the tyranny of 9p.
> > Isn't the point of the tag field to identify the request? What is
> > stopping the client from issuing multiple requests and matching the
> > replies based on the tag? From the manual:
> >
> >     Each T-message has a tag field, chosen and used by the
> >     client to identify the message. The reply to the message
> >     will have the same tag. Clients must arrange that no two
> >     outstanding messages on the same connection have the same
> >     tag. An exception is the tag NOTAG, defined as (ushort)~0
> >     in <fcall.h>: the client can use it, when establishing a
> >     connection, to override tag matching in version messages.
> >
> > On Wed, Oct 10, 2018, 23:56 Steven Stallion <sstallion@gmail.com> wrote:
> >
> >> As the guy who wrote the majority of the code that pushed those 1M 4K
> >> random IOPS erik mentioned, this thread annoys the shit out of me. You
> >> don't get an award for writing a driver. In fact, it's probably better
> >> not to be known at all considering the bloody murder one has to commit
> >> to marry hardware and software together.
> >>
> >> Let's be frank, the I/O handling in the kernel is anachronistic. To
> >> hit those rates, I had to add support for asynchronous and vectored
> >> I/O not to mention a sizable bit of work by a co-worker to properly
> >> handle NUMA on our appliances to hit those speeds. As I recall, we had
> >> to rewrite the scheduler and re-implement locking, which even Charles
> >> Forsyth had a hand in. Had we the time and resources to implement
> >> something like zero-copy we'd have done it in a heartbeat.
> >>
> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> >> - as soon as you put a 9P-based filesystem on it, it's going to be
> >> limited to a single outstanding operation. This is the tyranny of 9P.
> >> We (Coraid) got around this by avoiding filesystems altogether.
> >>
> >> Go solve that problem first.
> >> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
> >> >
> >> > > But the reason I want this is to reduce latency to the first
> >> > > access, especially for very large files. With read() I have
> >> > > to wait until the read completes. With mmap() processing can
> >> > > start much earlier and can be interleaved with background
> >> > > data fetch or prefetch. With read() a lot more resources
> >> > > are tied down. If I need random access and don't need to
> >> > > read all of the data, the application has to do pread(),
> >> > > pwrite() a lot thus complicating it. With mmap() I can just
> >> > > map in the whole file and excess reading (beyond what the
> >> > > app needs) will not be a large fraction.
> >> >
> >> > you think doing single 4K page sized reads in the pagefault
> >> > handler is better than doing precise >4K reads from your
> >> > application? possibly in a background thread so you can
> >> > overlap processing with data fetching?
> >> >
> >> > the advantage of mmap is not prefetch. it's about not doing
> >> > any I/O when data is already in the *SHARED* buffer cache!
> >> > which plan9 does not have (except the mntcache, but that is
> >> > optional and only works for the disk fileservers that maintain
> >> > their file qid ver info consistently). it *IS* really a linux
> >> > thing where all block device i/o goes thru the buffer cache.
> >> >
> >> > --
> >> > cinap

^ permalink raw reply [flat|nested] 89+ messages in thread
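The tag mechanism quoted from the manual can be illustrated with a toy model. This is only a sketch under stated assumptions: `TagPool`, `toy_server`, and `read_blocks` are hypothetical names invented for illustration, and no real 9P messages are encoded or sent. It shows a client issuing several requests before any reply arrives and then pairing each reply with its request purely by tag, so the order in which replies come back does not matter.

```python
import random

NOTAG = 0xFFFF  # (ushort)~0, reserved for version messages per the manual text


class TagPool:
    """Hypothetical helper: hands out tags so no two outstanding requests share one."""

    def __init__(self):
        self.outstanding = {}  # tag -> request still awaiting its reply

    def send(self, request):
        # Pick the lowest tag not currently in flight; NOTAG is never handed out.
        tag = next(t for t in range(NOTAG) if t not in self.outstanding)
        self.outstanding[tag] = request
        return tag

    def receive(self, tag, payload):
        # A reply carries the tag of the request it answers, so arrival order is irrelevant.
        request = self.outstanding.pop(tag)
        return request, payload


def toy_server(messages, seed=0):
    """Hypothetical stand-in for a server that answers in arbitrary order."""
    replies = [(tag, "data for " + request) for tag, request in messages]
    random.Random(seed).shuffle(replies)
    return replies


def read_blocks(blocks):
    pool = TagPool()
    # Issue every request before waiting for any reply.
    sent = [(pool.send(block), block) for block in blocks]
    results = {}
    for tag, payload in toy_server(sent):
        request, data = pool.receive(tag, payload)
        results[request] = data
    return results, pool
```

Once a reply has been received its tag is free again, which is why the pool only tracks tags that are still outstanding.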
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-14 17:34 ` Ole-Hjalmar Kristensen
@ 2018-10-14 19:17 ` hiro
2018-10-15 9:29 ` Giacomo Tesio
1 sibling, 0 replies; 89+ messages in thread
From: hiro @ 2018-10-14 19:17 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

also read what has been written before about fcp. and read the source of fcp.

On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com> wrote:
> OK, that makes sense. So it would not stop a client from, for example, first
> reading an index block in a B-tree, waiting for the result, and then issuing
> read operations for all the data blocks in parallel. That's exactly the same
> as any asynchronous disk subsystem I am acquainted with. Reordering is the
> norm.
>
> On Sun, Oct 14, 2018 at 1:21 PM hiro <23hiro@gmail.com> wrote:
>
>> there's no tyranny involved.
>>
>> a client that is fine with the *responses* coming in reordered could
>> remember the tag obviously and do whatever you imagine.
>>
>> the problem is potential reordering of the messages in the kernel
>> before responding, even if the 9p transport has guaranteed ordering.
>>
>> On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com>
>> wrote:
>> > I'm not going to argue with someone who has got his hands dirty by
>> > actually doing this but I don't really get this about the tyranny of 9p.
>> > Isn't the point of the tag field to identify the request? What is
>> > stopping the client from issuing multiple requests and matching the
>> > replies based on the tag? From the manual:
>> >
>> >     Each T-message has a tag field, chosen and used by the
>> >     client to identify the message. The reply to the message
>> >     will have the same tag. Clients must arrange that no two
>> >     outstanding messages on the same connection have the same
>> >     tag. An exception is the tag NOTAG, defined as (ushort)~0
>> >     in <fcall.h>: the client can use it, when establishing a
>> >     connection, to override tag matching in version messages.
>> >
>> > On Wed, Oct 10, 2018, 23:56 Steven Stallion <sstallion@gmail.com> wrote:
>> >
>> >> As the guy who wrote the majority of the code that pushed those 1M 4K
>> >> random IOPS erik mentioned, this thread annoys the shit out of me. You
>> >> don't get an award for writing a driver. In fact, it's probably better
>> >> not to be known at all considering the bloody murder one has to commit
>> >> to marry hardware and software together.
>> >>
>> >> Let's be frank, the I/O handling in the kernel is anachronistic. To
>> >> hit those rates, I had to add support for asynchronous and vectored
>> >> I/O not to mention a sizable bit of work by a co-worker to properly
>> >> handle NUMA on our appliances to hit those speeds. As I recall, we had
>> >> to rewrite the scheduler and re-implement locking, which even Charles
>> >> Forsyth had a hand in. Had we the time and resources to implement
>> >> something like zero-copy we'd have done it in a heartbeat.
>> >>
>> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
>> >> - as soon as you put a 9P-based filesystem on it, it's going to be
>> >> limited to a single outstanding operation. This is the tyranny of 9P.
>> >> We (Coraid) got around this by avoiding filesystems altogether.
>> >>
>> >> Go solve that problem first.
>> >> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
>> >> >
>> >> > > But the reason I want this is to reduce latency to the first
>> >> > > access, especially for very large files. With read() I have
>> >> > > to wait until the read completes. With mmap() processing can
>> >> > > start much earlier and can be interleaved with background
>> >> > > data fetch or prefetch. With read() a lot more resources
>> >> > > are tied down. If I need random access and don't need to
>> >> > > read all of the data, the application has to do pread(),
>> >> > > pwrite() a lot thus complicating it. With mmap() I can just
>> >> > > map in the whole file and excess reading (beyond what the
>> >> > > app needs) will not be a large fraction.
>> >> >
>> >> > you think doing single 4K page sized reads in the pagefault
>> >> > handler is better than doing precise >4K reads from your
>> >> > application? possibly in a background thread so you can
>> >> > overlap processing with data fetching?
>> >> >
>> >> > the advantage of mmap is not prefetch. it's about not doing
>> >> > any I/O when data is already in the *SHARED* buffer cache!
>> >> > which plan9 does not have (except the mntcache, but that is
>> >> > optional and only works for the disk fileservers that maintain
>> >> > their file qid ver info consistently). it *IS* really a linux
>> >> > thing where all block device i/o goes thru the buffer cache.
>> >> >
>> >> > --
>> >> > cinap

^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
2018-10-14 17:34 ` Ole-Hjalmar Kristensen
2018-10-14 19:17 ` hiro
@ 2018-10-15 9:29 ` Giacomo Tesio
1 sibling, 0 replies; 89+ messages in thread
From: Giacomo Tesio @ 2018-10-15 9:29 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

On Sun, Oct 14, 2018 at 19:39, Ole-Hjalmar Kristensen
<ole.hjalmar.kristensen@gmail.com> wrote:
>
> OK, that makes sense. So it would not stop a client from for example first
> read an index block in a B-tree, wait for the result, and then issue read
> operations for all the data blocks in parallel.

If the client is the kernel, that's true.
If the client is directly speaking 9P, that's true again.
But if the client is a userspace program using pread/pwrite, that wouldn't
work unless it forks a new process for each read, as the syscalls block.
Which is what fcp does, actually:
https://github.com/brho/plan9/blob/master/sys/src/cmd/fcp.c

Giacomo

^ permalink raw reply [flat|nested] 89+ messages in thread
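The point about blocking pread calls can be sketched roughly as follows. This is not fcp's code and makes several assumptions: it runs on a POSIX host rather than Plan 9, it uses a thread pool where the message describes forked processes, and `parallel_read`, `nworkers`, and the `BLOCK` size are hypothetical names and values chosen for illustration. The idea it shows is the same one: each blocking positional read is handed to its own worker, so several reads can be outstanding at once even though every individual call blocks its caller.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = 8192  # illustrative chunk size, an assumption and not fcp's actual value


def parallel_read(path, nworkers=4, block=BLOCK):
    """Read a whole file with several positional reads in flight at once."""
    size = os.path.getsize(path)
    offsets = range(0, size, block)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=nworkers) as pool:
            # Each os.pread blocks only the worker that issued it, so up to
            # nworkers reads are outstanding at any moment. pread takes an
            # explicit offset, so the workers do not fight over a shared
            # file position. A short read is not retried in this sketch.
            chunks = pool.map(lambda off: os.pread(fd, block, off), offsets)
            # map() yields results in offset order regardless of completion order.
            return b"".join(chunks)
    finally:
        os.close(fd)
```

A single-threaded loop over the same offsets would give the same bytes but with only one read outstanding at a time, which is the limitation the thread is discussing.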
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 22:19 cinap_lenrek
0 siblings, 0 replies; 89+ messages in thread
From: cinap_lenrek @ 2018-10-10 22:19 UTC (permalink / raw)
To: 9fans

hahahahahahahaha

--
cinap

^ permalink raw reply [flat|nested] 89+ messages in thread