9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 17:34 cinap_lenrek
  2018-10-10 21:54 ` Steven Stallion
  0 siblings, 1 reply; 104+ messages in thread
From: cinap_lenrek @ 2018-10-10 17:34 UTC (permalink / raw)
  To: 9fans

> But the reason I want this is to reduce latency to the first
> access, especially for very large files. With read() I have
> to wait until the read completes. With mmap() processing can
> start much earlier and can be interleaved with background
> data fetch or prefetch. With read() a lot more resources
> are tied down. If I need random access and don't need to
> read all of the data, the application has to do pread(),
> pwrite() a lot thus complicating it. With mmap() I can just
> map in the whole file and excess reading (beyond what the
> app needs) will not be a large fraction.

you think doing single 4K page-sized reads in the pagefault
handler is better than doing precise >4K reads from your
application? possibly in a background thread so you can
overlap processing with data fetching?

the advantage of mmap is not prefetch. it's about not doing
any I/O when the data is already in the *SHARED* buffer cache!
which plan9 does not have (except the mntcache, but that is
optional and only works for the disk fileservers that maintain
their file qid version info consistently). it *IS* really a linux
thing, where all block device i/o goes through the buffer cache.

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 17:34 [9fans] PDP11 (Was: Re: what heavy negativity!) cinap_lenrek
@ 2018-10-10 21:54 ` Steven Stallion
  2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
                     ` (3 more replies)
  0 siblings, 4 replies; 104+ messages in thread
From: Steven Stallion @ 2018-10-10 21:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

As the guy who wrote the majority of the code that pushed those 1M 4K
random IOPS erik mentioned, this thread annoys the shit out of me. You
don't get an award for writing a driver. In fact, it's probably better
not to be known at all considering the bloody murder one has to commit
to marry hardware and software together.

Let's be frank, the I/O handling in the kernel is anachronistic. To
hit those rates, I had to add support for asynchronous and vectored
I/O not to mention a sizable bit of work by a co-worker to properly
handle NUMA on our appliances to hit those speeds. As I recall, we had
to rewrite the scheduler and re-implement locking, which even Charles
Forsyth had a hand in. Had we the time and resources to implement
something like zero-copy we'd have done it in a heartbeat.

In the end, it doesn't matter how "fast" a storage driver is in Plan 9
- as soon as you put a 9P-based filesystem on it, it's going to be
limited to a single outstanding operation. This is the tyranny of 9P.
We (Coraid) got around this by avoiding filesystems altogether.

Go solve that problem first.

* [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 21:54 ` Steven Stallion
@ 2018-10-10 22:26   ` Bakul Shah
  2018-10-10 22:52     ` Steven Stallion
  2018-10-11 20:43     ` Lyndon Nerenberg
  2018-10-10 22:29   ` [9fans] " Kurt H Maier
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 104+ messages in thread
From: Bakul Shah @ 2018-10-10 22:26 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Excellent response! Just what I was hoping for!

On Oct 10, 2018, at 2:54 PM, Steven Stallion <sstallion@gmail.com> wrote:
>
> As the guy who wrote the majority of the code that pushed those 1M 4K
> random IOPS erik mentioned, this thread annoys the shit out of me. You
> don't get an award for writing a driver. In fact, it's probably better
> not to be known at all considering the bloody murder one has to commit
> to marry hardware and software together.
>
> Let's be frank, the I/O handling in the kernel is anachronistic. To
> hit those rates, I had to add support for asynchronous and vectored
> I/O not to mention a sizable bit of work by a co-worker to properly
> handle NUMA on our appliances to hit those speeds. As I recall, we had
> to rewrite the scheduler and re-implement locking, which even Charles
> Forsyth had a hand in. Had we the time and resources to implement
> something like zero-copy we'd have done it in a heartbeat.
>
> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> - as soon as you put a 9P-based filesystem on it, it's going to be
> limited to a single outstanding operation. This is the tyranny of 9P.
> We (Coraid) got around this by avoiding filesystems altogether.
>
> Go solve that problem first.

You seem to be saying zero-copy wouldn't buy anything until these
other problems are solved, right?

Suppose you could replace 9p based FS with something of your choice.
Would it have made your jobs easier? Code less grotty? In other
words, is the complexity of the driver to achieve high throughput
due to the complexity of hardware or is it due to 9p's RPC model?
For streaming data you pretty much have to have some sort of
windowing protocol (data prefetch or write behind with mmap is a
similar thing).

Looks like people who have worked on the plan9 kernel have learned
a lot of lessons and have a lot of good advice to offer. I'd love
to learn from that, yet I rarely see anyone criticizing plan9.



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 21:54 ` Steven Stallion
  2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
@ 2018-10-10 22:29   ` Kurt H Maier
  2018-10-10 22:55     ` Steven Stallion
  2018-10-11  0:26   ` Skip Tavakkolian
  2018-10-14  9:46   ` Ole-Hjalmar Kristensen
  3 siblings, 1 reply; 104+ messages in thread
From: Kurt H Maier @ 2018-10-10 22:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Oct 10, 2018 at 04:54:22PM -0500, Steven Stallion wrote:
> As the guy

might be worth keeping in mind the current most common use case for nvme
is laptop storage and not building jet engines in coraid's basement

so the nvme driver that cinap wrote works on my thinkpad today and is
about infinity times faster than the one you guys locked up in the
warehouse at the end of raiders of the lost ark, because my laptop can't
seem to boot off nostalgia.

so no, nobody gets an award for writing a driver.  but cinap won the
9front Order of Valorous Service (with bronze oak leaf cluster,
signifying working code) for *releasing* one.  I was there when field
marshal aiju presented the award; it was a very nice ceremony.

anyway, someone once said communication is not a zero-sum game.  the
hyperspecific use case you describe is fine but there are other reasons
to care about how well this stuff works, you know?

khm




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
@ 2018-10-10 22:52     ` Steven Stallion
  2018-10-11 20:43     ` Lyndon Nerenberg
  1 sibling, 0 replies; 104+ messages in thread
From: Steven Stallion @ 2018-10-10 22:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> On Oct 10, 2018, Bakul Shah wrote:
>
> You seem to be saying zero-copy wouldn't buy anything until these
> other problems are solved, right?

Fundamentally zero-copy requires that the kernel and user process
share the same virtual address space mapped for the given operation.
This can't always be done and the kernel will be forced to perform a
copy anyway. To wit, one of the things I added to the exynos kernel
early on was a 1:1 mapping of the virtual kernel address space such
that something like zero-copy could be possible in the future (it was
also very convenient to limit MMU swaps on the Cortex-A15). That said,
the problem gets harder when you're working on something more general
that can handle the entire address space. In the end, you trade the
complexity/performance hit of MMU management versus making a copy.
Believe it or not, sometimes copies can be faster, especially on
larger NUMA systems.

> Suppose you could replace 9p based FS with something of your choice.
> Would it have made your jobs easier? Code less grotty? In other
> words, is the complexity of the driver to achieve high throughput
> due to the complexity of hardware or is it due to 9p's RPC model?
> For streaming data you pretty much have to have some sort of
> windowing protocol (data prefetch or write behind with mmap is a
> similar thing).

This is one of those problems that afflicts storage more than any
other subsystem, but like most things it's a tradeoff. Having a
filesystem that doesn't support 9P doesn't seem to make much sense on
Plan 9 given the ubiquity of the protocol. Dealing with the multiple
outstanding issue does make filesystem support much more complex and
would have a far-reaching effect on existing code (not to mention the
kernel).

It's completely possible to support prefetch and/or streaming I/O
using existing kernel interfaces. cinap's comment about read not
returning until the entire buffer is read is an implementation detail
of the underlying device. A read call is free to return fewer bytes
than requested; it's not uncommon for a driver to return partial data
to favor latency over throughput. In other words, there's no magic
behind mmap - it's a convenience interface. If you look at how other
kernels tend to implement I/O, there are generally fundamental calls
to a read/write interface - there are no special provisions for
mmap beyond the syscall layer.

The beauty of 9P is you can wrap driver filesystems for added
functionality. Want a block caching interface? Great! Slap a kernel
device on top of a storage driver that handles caching and prefetch.
I'm sure you can see where this is going...

> Looks like people who have worked on the plan9 kernel have learned
> a lot of lessons and have a lot of good advice to offer. I'd love
> to learn from that. Except usually I rarely see anyone criticizing
> plan9.

Something, something, in polite company :-)




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 22:29   ` [9fans] " Kurt H Maier
@ 2018-10-10 22:55     ` Steven Stallion
  2018-10-11 11:19       ` Aram Hăvărneanu
  0 siblings, 1 reply; 104+ messages in thread
From: Steven Stallion @ 2018-10-10 22:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Posted August 15th, 2013: https://9p.io/sources/contrib/stallion/src/sdmpt2.c
Corresponding announcement:
https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 21:54 ` Steven Stallion
  2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
  2018-10-10 22:29   ` [9fans] " Kurt H Maier
@ 2018-10-11  0:26   ` Skip Tavakkolian
  2018-10-11  1:03     ` Steven Stallion
  2018-10-14  9:46   ` Ole-Hjalmar Kristensen
  3 siblings, 1 reply; 104+ messages in thread
From: Skip Tavakkolian @ 2018-10-11  0:26 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

For operations that matter in this context (read, write), there can be
multiple outstanding tags. A while back rsc implemented fcp, partly to
prove this point.


* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11  0:26   ` Skip Tavakkolian
@ 2018-10-11  1:03     ` Steven Stallion
  0 siblings, 0 replies; 104+ messages in thread
From: Steven Stallion @ 2018-10-11  1:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Interesting - was this ever generalized? It's been several years since
I last looked, but I seem to recall that unless you went out of your
way to write your own 9P implementation, you were limited to a single
tag.

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 22:55     ` Steven Stallion
@ 2018-10-11 11:19       ` Aram Hăvărneanu
  0 siblings, 0 replies; 104+ messages in thread
From: Aram Hăvărneanu @ 2018-10-11 11:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Posted August 15th, 2013:
>   https://9p.io/sources/contrib/stallion/src/sdmpt2.c Corresponding
> announcement:
>   https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ

This is not an NVMe driver.

-- 
Aram Hăvărneanu




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
  2018-10-10 22:52     ` Steven Stallion
@ 2018-10-11 20:43     ` Lyndon Nerenberg
  2018-10-11 22:28       ` hiro
  2018-10-12  6:04       ` Ori Bernstein
  1 sibling, 2 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 20:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Another case to ponder ...   We're handling the incoming I/Q data
stream, but need to fan that out to many downstream consumers.  If
we already read the data into a page, then flip it to the first
consumer, is there a benefit to adding a reference counter to that
read-only page and leaving the page live until the counter expires?

Hiro clamours for benchmarks.  I agree.  Some basic searches I've
done don't show anyone trying this out with P9 (and publishing
their results).  Anybody have hints/references to prior work?

--lyndon




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 20:43     ` Lyndon Nerenberg
@ 2018-10-11 22:28       ` hiro
  2018-10-12  6:04       ` Ori Bernstein
  1 sibling, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-11 22:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

i'm not saying you should measure a lot, even. just trying to make you
verify my point that this is not your bottleneck: just check whether you
hit a cpu limit already with that single processing stage (my guess
was FFT).

the reason i think my guess is right is because of experience with
the low bandwidth of SEQUENTIAL data you're claiming could create
problems.

in contrast, i'm happy stallion at least brought up something more
demanding earlier, like finding true limits during small block-size
random access.


* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 20:43     ` Lyndon Nerenberg
  2018-10-11 22:28       ` hiro
@ 2018-10-12  6:04       ` Ori Bernstein
  2018-10-13 18:01         ` Charles Forsyth
  1 sibling, 1 reply; 104+ messages in thread
From: Ori Bernstein @ 2018-10-12  6:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg <lyndon@orthanc.ca> wrote:

> Another case to ponder ...   We're handling the incoming I/Q data
> stream, but need to fan that out to many downstream consumers.  If
> we already read the data into a page, then flip it to the first
> consumer, is there a benefit to adding a reference counter to that
> read-only page and leaving the page live until the counter expires?
>
> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
> done don't show anyone trying this out with P9 (and publishing
> their results).  Anybody have hints/references to prior work?
>
> --lyndon
>

I don't believe anyone has done the work yet. I'd be interested
to see what you come up with.


--
    Ori Bernstein




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-12  6:04       ` Ori Bernstein
@ 2018-10-13 18:01         ` Charles Forsyth
  2018-10-13 21:11           ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: Charles Forsyth @ 2018-10-13 18:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

I did several versions of one part of zero copy, inspired by several things
in x-kernel, replacing Blocks with another structure throughout the network
stacks and kernel, then made messages visible to user level. Nemo did
another part, on his way to Clive.


* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-13 18:01         ` Charles Forsyth
@ 2018-10-13 21:11           ` hiro
  2018-10-14  5:25             ` FJ Ballesteros
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-13 21:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

and, did it improve anything noticeably?


* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-13 21:11           ` hiro
@ 2018-10-14  5:25             ` FJ Ballesteros
  2018-10-14  7:34               ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: FJ Ballesteros @ 2018-10-14  5:25 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

yes. bugs, on my side at least.
The copy isolates from others.
But some experiments in nix and in a thing I wrote for leanxcale show that some things can be much faster.
It’s fun either way.


* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-14  5:25             ` FJ Ballesteros
@ 2018-10-14  7:34               ` hiro
  2018-10-14  7:38                 ` Francisco J Ballesteros
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-14  7:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

well, finding bugs is always good :)
but since i got curious: could you also tell which things exactly got
much faster, so that we know what might be possible?

On 10/14/18, FJ Ballesteros <nemo@lsub.org> wrote:
> yes. bugs, on my side at least.
> The copy isolates from others.
> But some experiments in nix and in a thing I wrote for leanxcale show that
> some things can be much faster.
> It’s fun either way.
>
>> El 13 oct 2018, a las 23:11, hiro <23hiro@gmail.com> escribió:
>>
>> and, did it improve anything noticeably?
>>
>>> On 10/13/18, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>>> I did several versions of one part of zero copy, inspired by several
>>> things
>>> in x-kernel, replacing Blocks by another structure throughout the
>>> network
>>> stacks and kernel, then made messages visible to user level. Nemo did
>>> another part, on his way to Clive
>>>
>>>> On Fri, 12 Oct 2018, 07:05 Ori Bernstein, <ori@eigenstate.org> wrote:
>>>>
>>>> On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg
>>>> <lyndon@orthanc.ca>
>>>> wrote:
>>>>
>>>>> Another case to ponder ...   We're handling the incoming I/Q data
>>>>> stream, but need to fan that out to many downstream consumers.  If
>>>>> we already read the data into a page, then flip it to the first
>>>>> consumer, is there a benefit to adding a reference counter to that
>>>>> read-only page and leaving the page live until the counter expires?
>>>>>
>>>>> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
>>>>> done don't show anyone trying this out with P9 (and publishing
>>>>> their results).  Anybody have hints/references to prior work?
>>>>>
>>>>> --lyndon
>>>>>
>>>>
>>>> I don't believe anyone has done the work yet. I'd be interested
>>>> to see what you come up with.
>>>>
>>>>
>>>> --
>>>>    Ori Bernstein
>>>>
>>>>
>>>
>>
>
>
>




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-14  7:34               ` hiro
@ 2018-10-14  7:38                 ` Francisco J Ballesteros
  2018-10-14  8:00                   ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: Francisco J Ballesteros @ 2018-10-14  7:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Pure "producer/consumer" stuff, like sending things through a pipe, as long as the source didn't need to touch the data anymore.
Regarding bugs, I meant "producing bugs", not "fixing bugs", btw.

> On 14 Oct 2018, at 09:34, hiro <23hiro@gmail.com> wrote:
> 
> well, finding bugs is always good :)
> but since i got curious could you also tell which things exactly got
> much faster, so that we know what might be possible?
> 
> On 10/14/18, FJ Ballesteros <nemo@lsub.org> wrote:
>> yes. bugs, on my side at least.
>> The copy isolates from others.
>> But some experiments in nix and in a thing I wrote for leanxcale show that
>> some things can be much faster.
>> It’s fun either way.
>> 
>>> El 13 oct 2018, a las 23:11, hiro <23hiro@gmail.com> escribió:
>>> 
>>> and, did it improve anything noticeably?
>>> 
>>>> On 10/13/18, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>>>> I did several versions of one part of zero copy, inspired by several
>>>> things
>>>> in x-kernel, replacing Blocks by another structure throughout the
>>>> network
>>>> stacks and kernel, then made messages visible to user level. Nemo did
>>>> another part, on his way to Clive
>>>> 
>>>>> On Fri, 12 Oct 2018, 07:05 Ori Bernstein, <ori@eigenstate.org> wrote:
>>>>> 
>>>>> On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg
>>>>> <lyndon@orthanc.ca>
>>>>> wrote:
>>>>> 
>>>>>> Another case to ponder ...   We're handling the incoming I/Q data
>>>>>> stream, but need to fan that out to many downstream consumers.  If
>>>>>> we already read the data into a page, then flip it to the first
>>>>>> consumer, is there a benefit to adding a reference counter to that
>>>>>> read-only page and leaving the page live until the counter expires?
>>>>>> 
>>>>>> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
>>>>>> done don't show anyone trying this out with P9 (and publishing
>>>>>> their results).  Anybody have hints/references to prior work?
>>>>>> 
>>>>>> --lyndon
>>>>>> 
>>>>> 
>>>>> I don't believe anyone has done the work yet. I'd be interested
>>>>> to see what you come up with.
>>>>> 
>>>>> 
>>>>> --
>>>>>   Ori Bernstein
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
> 





* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-14  7:38                 ` Francisco J Ballesteros
@ 2018-10-14  8:00                   ` hiro
  2018-10-15 16:48                     ` Charles Forsyth
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-14  8:00 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

thanks, this will allow us to know where to look more closely.

On 10/14/18, Francisco J Ballesteros <nemo@lsub.org> wrote:
> Pure "producer/cosumer" stuff, like sending things through a pipe as long as
> the source didn't need to touch the data ever more.
> Regarding bugs, I meant "producing bugs" not "fixing bugs", btw.
>
>> On 14 Oct 2018, at 09:34, hiro <23hiro@gmail.com> wrote:
>>
>> well, finding bugs is always good :)
>> but since i got curious could you also tell which things exactly got
>> much faster, so that we know what might be possible?
>>
>> On 10/14/18, FJ Ballesteros <nemo@lsub.org> wrote:
>>> yes. bugs, on my side at least.
>>> The copy isolates from others.
>>> But some experiments in nix and in a thing I wrote for leanxcale show
>>> that
>>> some things can be much faster.
>>> It’s fun either way.
>>>
>>>> El 13 oct 2018, a las 23:11, hiro <23hiro@gmail.com> escribió:
>>>>
>>>> and, did it improve anything noticeably?
>>>>
>>>>> On 10/13/18, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>>>>> I did several versions of one part of zero copy, inspired by several
>>>>> things
>>>>> in x-kernel, replacing Blocks by another structure throughout the
>>>>> network
>>>>> stacks and kernel, then made messages visible to user level. Nemo did
>>>>> another part, on his way to Clive
>>>>>
>>>>>> On Fri, 12 Oct 2018, 07:05 Ori Bernstein, <ori@eigenstate.org> wrote:
>>>>>>
>>>>>> On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg
>>>>>> <lyndon@orthanc.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> Another case to ponder ...   We're handling the incoming I/Q data
>>>>>>> stream, but need to fan that out to many downstream consumers.  If
>>>>>>> we already read the data into a page, then flip it to the first
>>>>>>> consumer, is there a benefit to adding a reference counter to that
>>>>>>> read-only page and leaving the page live until the counter expires?
>>>>>>>
>>>>>>> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
>>>>>>> done don't show anyone trying this out with P9 (and publishing
>>>>>>> their results).  Anybody have hints/references to prior work?
>>>>>>>
>>>>>>> --lyndon
>>>>>>>
>>>>>>
>>>>>> I don't believe anyone has done the work yet. I'd be interested
>>>>>> to see what you come up with.
>>>>>>
>>>>>>
>>>>>> --
>>>>>>   Ori Bernstein
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 21:54 ` Steven Stallion
                     ` (2 preceding siblings ...)
  2018-10-11  0:26   ` Skip Tavakkolian
@ 2018-10-14  9:46   ` Ole-Hjalmar Kristensen
  2018-10-14 10:37     ` hiro
  3 siblings, 1 reply; 104+ messages in thread
From: Ole-Hjalmar Kristensen @ 2018-10-14  9:46 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


I'm not going to argue with someone who has got his hands dirty by actually
doing this, but I don't really get this about the tyranny of 9P. Isn't the
point of the tag field to identify the request? What is stopping the client
from issuing multiple requests and matching the replies based on the tag? From
the manual:

Each T-message has a tag field, chosen and used by the
          client to identify the message.  The reply to the message
          will have the same tag.  Clients must arrange that no two
          outstanding messages on the same connection have the same
          tag.  An exception is the tag NOTAG, defined as (ushort)~0
          in <fcall.h>: the client can use it, when establishing a
          connection, to override tag matching in version messages.



Den ons. 10. okt. 2018, 23.56 skrev Steven Stallion <sstallion@gmail.com>:

> As the guy who wrote the majority of the code that pushed those 1M 4K
> random IOPS erik mentioned, this thread annoys the shit out of me. You
> don't get an award for writing a driver. In fact, it's probably better
> not to be known at all considering the bloody murder one has to commit
> to marry hardware and software together.
>
> Let's be frank, the I/O handling in the kernel is anachronistic. To
> hit those rates, I had to add support for asynchronous and vectored
> I/O not to mention a sizable bit of work by a co-worker to properly
> handle NUMA on our appliances to hit those speeds. As I recall, we had
> to rewrite the scheduler and re-implement locking, which even Charles
> Forsyth had a hand in. Had we the time and resources to implement
> something like zero-copy we'd have done it in a heartbeat.
>
> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> - as soon as you put a 9P-based filesystem on it, it's going to be
> limited to a single outstanding operation. This is the tyranny of 9P.
> We (Coraid) got around this by avoiding filesystems altogether.
>
> Go solve that problem first.
> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
> >
> > > But the reason I want this is to reduce latency to the first
> > > access, especially for very large files. With read() I have
> > > to wait until the read completes. With mmap() processing can
> > > start much earlier and can be interleaved with background
> > > data fetch or prefetch. With read() a lot more resources
> > > are tied down. If I need random access and don't need to
> > > read all of the data, the application has to do pread(),
> > > pwrite() a lot thus complicating it. With mmap() I can just
> > > map in the whole file and excess reading (beyond what the
> > > app needs) will not be a large fraction.
> >
> > you think doing single 4K page sized reads in the pagefault
> > handler is better than doing precise >4K reads from your
> > application? possibly in a background thread so you can
> > overlap processing with data fetching?
> >
> > the advantage of mmap is not prefetch. its about not to do
> > any I/O when data is already in the *SHARED* buffer cache!
> > which plan9 does not have (except the mntcache, but that is
> > optional and only works for the disk fileservers that maintain
> > ther file qid ver info consistently). its *IS* really a linux
> > thing where all block device i/o goes thru the buffer cache.
> >
> > --
> > cinap
> >
>
>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-14  9:46   ` Ole-Hjalmar Kristensen
@ 2018-10-14 10:37     ` hiro
  2018-10-14 17:34       ` Ole-Hjalmar Kristensen
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-14 10:37 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

there's no tyranny involved.

a client that is fine with the *responses* coming back reordered can
obviously remember the tags and do whatever you imagine.

the problem is potential reordering of the messages in the kernel
before responding, even if the 9p transport has guaranteed ordering.

On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com> wrote:
> I'm not going to argue with someone who has got his hands dirty by actually
> doing this but I don't really get this about the tyranny of 9p. Isn't the
> point of the tag field to identify the request? What is stopping the client
> from issuing multiple requests and match the replies based on the tag? From
> the manual:
>
> Each T-message has a tag field, chosen and used by the
>           client to identify the message.  The reply to the message
>           will have the same tag.  Clients must arrange that no two
>           outstanding messages on the same connection have the same
>           tag.  An exception is the tag NOTAG, defined as (ushort)~0
>           in <fcall.h>: the client can use it, when establishing a
>           connection, to override tag matching in version messages.
>
>
>
> Den ons. 10. okt. 2018, 23.56 skrev Steven Stallion <sstallion@gmail.com>:
>
>> As the guy who wrote the majority of the code that pushed those 1M 4K
>> random IOPS erik mentioned, this thread annoys the shit out of me. You
>> don't get an award for writing a driver. In fact, it's probably better
>> not to be known at all considering the bloody murder one has to commit
>> to marry hardware and software together.
>>
>> Let's be frank, the I/O handling in the kernel is anachronistic. To
>> hit those rates, I had to add support for asynchronous and vectored
>> I/O not to mention a sizable bit of work by a co-worker to properly
>> handle NUMA on our appliances to hit those speeds. As I recall, we had
>> to rewrite the scheduler and re-implement locking, which even Charles
>> Forsyth had a hand in. Had we the time and resources to implement
>> something like zero-copy we'd have done it in a heartbeat.
>>
>> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
>> - as soon as you put a 9P-based filesystem on it, it's going to be
>> limited to a single outstanding operation. This is the tyranny of 9P.
>> We (Coraid) got around this by avoiding filesystems altogether.
>>
>> Go solve that problem first.
>> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
>> >
>> > > But the reason I want this is to reduce latency to the first
>> > > access, especially for very large files. With read() I have
>> > > to wait until the read completes. With mmap() processing can
>> > > start much earlier and can be interleaved with background
>> > > data fetch or prefetch. With read() a lot more resources
>> > > are tied down. If I need random access and don't need to
>> > > read all of the data, the application has to do pread(),
>> > > pwrite() a lot thus complicating it. With mmap() I can just
>> > > map in the whole file and excess reading (beyond what the
>> > > app needs) will not be a large fraction.
>> >
>> > you think doing single 4K page sized reads in the pagefault
>> > handler is better than doing precise >4K reads from your
>> > application? possibly in a background thread so you can
>> > overlap processing with data fetching?
>> >
>> > the advantage of mmap is not prefetch. its about not to do
>> > any I/O when data is already in the *SHARED* buffer cache!
>> > which plan9 does not have (except the mntcache, but that is
>> > optional and only works for the disk fileservers that maintain
>> > ther file qid ver info consistently). its *IS* really a linux
>> > thing where all block device i/o goes thru the buffer cache.
>> >
>> > --
>> > cinap
>> >
>>
>>
>




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-14 10:37     ` hiro
@ 2018-10-14 17:34       ` Ole-Hjalmar Kristensen
  2018-10-14 19:17         ` hiro
  2018-10-15  9:29         ` Giacomo Tesio
  0 siblings, 2 replies; 104+ messages in thread
From: Ole-Hjalmar Kristensen @ 2018-10-14 17:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


OK, that makes sense. So it would not stop a client from, for example, first
reading an index block in a B-tree, waiting for the result, and then issuing
read operations for all the data blocks in parallel. That's exactly how any
asynchronous disk subsystem I am acquainted with behaves. Reordering is the
norm.

On Sun, Oct 14, 2018 at 1:21 PM hiro <23hiro@gmail.com> wrote:

> there's no tyranny involved.
>
> a client that is fine with the *responses* coming in reordered could
> remember the tag obviously and do whatever you imagine.
>
> the problem is potential reordering of the messages in the kernel
> before responding, even if the 9p transport has guaranteed ordering.
>
> On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com>
> wrote:
> > I'm not going to argue with someone who has got his hands dirty by
> actually
> > doing this but I don't really get this about the tyranny of 9p. Isn't the
> > point of the tag field to identify the request? What is stopping the
> client
> > from issuing multiple requests and match the replies based on the tag?
> From
> > the manual:
> >
> > Each T-message has a tag field, chosen and used by the
> >           client to identify the message.  The reply to the message
> >           will have the same tag.  Clients must arrange that no two
> >           outstanding messages on the same connection have the same
> >           tag.  An exception is the tag NOTAG, defined as (ushort)~0
> >           in <fcall.h>: the client can use it, when establishing a
> >           connection, to override tag matching in version messages.
> >
> >
> >
> > Den ons. 10. okt. 2018, 23.56 skrev Steven Stallion <sstallion@gmail.com
> >:
> >
> >> As the guy who wrote the majority of the code that pushed those 1M 4K
> >> random IOPS erik mentioned, this thread annoys the shit out of me. You
> >> don't get an award for writing a driver. In fact, it's probably better
> >> not to be known at all considering the bloody murder one has to commit
> >> to marry hardware and software together.
> >>
> >> Let's be frank, the I/O handling in the kernel is anachronistic. To
> >> hit those rates, I had to add support for asynchronous and vectored
> >> I/O not to mention a sizable bit of work by a co-worker to properly
> >> handle NUMA on our appliances to hit those speeds. As I recall, we had
> >> to rewrite the scheduler and re-implement locking, which even Charles
> >> Forsyth had a hand in. Had we the time and resources to implement
> >> something like zero-copy we'd have done it in a heartbeat.
> >>
> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> >> - as soon as you put a 9P-based filesystem on it, it's going to be
> >> limited to a single outstanding operation. This is the tyranny of 9P.
> >> We (Coraid) got around this by avoiding filesystems altogether.
> >>
> >> Go solve that problem first.
> >> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
> >> >
> >> > > But the reason I want this is to reduce latency to the first
> >> > > access, especially for very large files. With read() I have
> >> > > to wait until the read completes. With mmap() processing can
> >> > > start much earlier and can be interleaved with background
> >> > > data fetch or prefetch. With read() a lot more resources
> >> > > are tied down. If I need random access and don't need to
> >> > > read all of the data, the application has to do pread(),
> >> > > pwrite() a lot thus complicating it. With mmap() I can just
> >> > > map in the whole file and excess reading (beyond what the
> >> > > app needs) will not be a large fraction.
> >> >
> >> > you think doing single 4K page sized reads in the pagefault
> >> > handler is better than doing precise >4K reads from your
> >> > application? possibly in a background thread so you can
> >> > overlap processing with data fetching?
> >> >
> >> > the advantage of mmap is not prefetch. its about not to do
> >> > any I/O when data is already in the *SHARED* buffer cache!
> >> > which plan9 does not have (except the mntcache, but that is
> >> > optional and only works for the disk fileservers that maintain
> >> > ther file qid ver info consistently). its *IS* really a linux
> >> > thing where all block device i/o goes thru the buffer cache.
> >> >
> >> > --
> >> > cinap
> >> >
> >>
> >>
> >
>
>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-14 17:34       ` Ole-Hjalmar Kristensen
@ 2018-10-14 19:17         ` hiro
  2018-10-15  9:29         ` Giacomo Tesio
  1 sibling, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-14 19:17 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

also read what has been written before about fcp. and read the source of fcp.

On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com> wrote:
> OK, that makes sense. So it would not stop a client from for example first
> read an index block in a B-tree, wait for the result, and then issue read
> operations for all the data blocks in parallel. That's exactly the same as
> any asynchronous disk subsystem I am acquainted with. Reordering is the
> norm.
>
> On Sun, Oct 14, 2018 at 1:21 PM hiro <23hiro@gmail.com> wrote:
>
>> there's no tyranny involved.
>>
>> a client that is fine with the *responses* coming in reordered could
>> remember the tag obviously and do whatever you imagine.
>>
>> the problem is potential reordering of the messages in the kernel
>> before responding, even if the 9p transport has guaranteed ordering.
>>
>> On 10/14/18, Ole-Hjalmar Kristensen <ole.hjalmar.kristensen@gmail.com>
>> wrote:
>> > I'm not going to argue with someone who has got his hands dirty by
>> actually
>> > doing this but I don't really get this about the tyranny of 9p. Isn't
>> > the
>> > point of the tag field to identify the request? What is stopping the
>> client
>> > from issuing multiple requests and match the replies based on the tag?
>> From
>> > the manual:
>> >
>> > Each T-message has a tag field, chosen and used by the
>> >           client to identify the message.  The reply to the message
>> >           will have the same tag.  Clients must arrange that no two
>> >           outstanding messages on the same connection have the same
>> >           tag.  An exception is the tag NOTAG, defined as (ushort)~0
>> >           in <fcall.h>: the client can use it, when establishing a
>> >           connection, to override tag matching in version messages.
>> >
>> >
>> >
>> > Den ons. 10. okt. 2018, 23.56 skrev Steven Stallion
>> > <sstallion@gmail.com
>> >:
>> >
>> >> As the guy who wrote the majority of the code that pushed those 1M 4K
>> >> random IOPS erik mentioned, this thread annoys the shit out of me. You
>> >> don't get an award for writing a driver. In fact, it's probably better
>> >> not to be known at all considering the bloody murder one has to commit
>> >> to marry hardware and software together.
>> >>
>> >> Let's be frank, the I/O handling in the kernel is anachronistic. To
>> >> hit those rates, I had to add support for asynchronous and vectored
>> >> I/O not to mention a sizable bit of work by a co-worker to properly
>> >> handle NUMA on our appliances to hit those speeds. As I recall, we had
>> >> to rewrite the scheduler and re-implement locking, which even Charles
>> >> Forsyth had a hand in. Had we the time and resources to implement
>> >> something like zero-copy we'd have done it in a heartbeat.
>> >>
>> >> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
>> >> - as soon as you put a 9P-based filesystem on it, it's going to be
>> >> limited to a single outstanding operation. This is the tyranny of 9P.
>> >> We (Coraid) got around this by avoiding filesystems altogether.
>> >>
>> >> Go solve that problem first.
>> >> On Wed, Oct 10, 2018 at 12:36 PM <cinap_lenrek@felloff.net> wrote:
>> >> >
>> >> > > But the reason I want this is to reduce latency to the first
>> >> > > access, especially for very large files. With read() I have
>> >> > > to wait until the read completes. With mmap() processing can
>> >> > > start much earlier and can be interleaved with background
>> >> > > data fetch or prefetch. With read() a lot more resources
>> >> > > are tied down. If I need random access and don't need to
>> >> > > read all of the data, the application has to do pread(),
>> >> > > pwrite() a lot thus complicating it. With mmap() I can just
>> >> > > map in the whole file and excess reading (beyond what the
>> >> > > app needs) will not be a large fraction.
>> >> >
>> >> > you think doing single 4K page sized reads in the pagefault
>> >> > handler is better than doing precise >4K reads from your
>> >> > application? possibly in a background thread so you can
>> >> > overlap processing with data fetching?
>> >> >
>> >> > the advantage of mmap is not prefetch. its about not to do
>> >> > any I/O when data is already in the *SHARED* buffer cache!
>> >> > which plan9 does not have (except the mntcache, but that is
>> >> > optional and only works for the disk fileservers that maintain
>> >> > ther file qid ver info consistently). its *IS* really a linux
>> >> > thing where all block device i/o goes thru the buffer cache.
>> >> >
>> >> > --
>> >> > cinap
>> >> >
>> >>
>> >>
>> >
>>
>>
>




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-14 17:34       ` Ole-Hjalmar Kristensen
  2018-10-14 19:17         ` hiro
@ 2018-10-15  9:29         ` Giacomo Tesio
  1 sibling, 0 replies; 104+ messages in thread
From: Giacomo Tesio @ 2018-10-15  9:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Il giorno dom 14 ott 2018 alle ore 19:39 Ole-Hjalmar Kristensen
<ole.hjalmar.kristensen@gmail.com> ha scritto:
>
> OK, that makes sense. So it would not stop a client from for example first read an index block in a B-tree, wait for the result, and then issue read operations for all the data blocks in parallel.

If the client is the kernel that's true.
If the client is directly speaking 9P that's true again.

But if the client is a userspace program using pread/pwrite, that
wouldn't work unless it forks a new process for each read, since the
syscall blocks.
Which is what fcp does, actually:
https://github.com/brho/plan9/blob/master/sys/src/cmd/fcp.c


Giacomo




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-14  8:00                   ` hiro
@ 2018-10-15 16:48                     ` Charles Forsyth
  2018-10-15 17:01                       ` hiro
                                         ` (3 more replies)
  0 siblings, 4 replies; 104+ messages in thread
From: Charles Forsyth @ 2018-10-15 16:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


It's useful internally in protocol implementation, specifically to avoid
copying in transport protocols (for later retransmission), and the
modifications aren't vast.
A few changes were trickier, often because of small bugs in the original
code. icmp does some odd things i think.

Btw, "zero copy" isn't the right term and I preferred another term that
I've now forgotten. Minimal copying, perhaps.
For one thing, messages can eventually end up being copied to contiguous
blocks for devices without decent scatter-gather DMA.

Messages are a tuple (mutable header stack, immutable slices of immutable
data).
Originally the data was organised as a tree, but nemo suggested using just
an array, so I changed it.
It's important that it's (logically) immutable. Headers are pushed onto and
popped from the header stack, and the current stack top is mutable.

There were new readmsg and writemsg system calls to carry message
structures between kernel and user level.
The message was immutable on writemsg. Between processes in the same
program, message transfers could be done by exchanging pointers into a
shared region.

I'll see if I wrote up some of it. I think there were manual pages for the
Messages replacing Blocks.

My mcs lock implementation was probably more useful, and I use that in my
copy of the kernel known as 9k

Also, NUMA effects are more important in practice on big multicores. Some
of the off-chip delays are brutal.

On Sun, 14 Oct 2018 at 09:50, hiro <23hiro@gmail.com> wrote:

> thanks, this will allow us to know where to look more closely.
>
> On 10/14/18, Francisco J Ballesteros <nemo@lsub.org> wrote:
> > Pure "producer/cosumer" stuff, like sending things through a pipe as
> long as
> > the source didn't need to touch the data ever more.
> > Regarding bugs, I meant "producing bugs" not "fixing bugs", btw.
> >
> >> On 14 Oct 2018, at 09:34, hiro <23hiro@gmail.com> wrote:
> >>
> >> well, finding bugs is always good :)
> >> but since i got curious could you also tell which things exactly got
> >> much faster, so that we know what might be possible?
> >>
> >> On 10/14/18, FJ Ballesteros <nemo@lsub.org> wrote:
> >>> yes. bugs, on my side at least.
> >>> The copy isolates from others.
> >>> But some experiments in nix and in a thing I wrote for leanxcale show
> >>> that
> >>> some things can be much faster.
> >>> It’s fun either way.
> >>>
> >>>> El 13 oct 2018, a las 23:11, hiro <23hiro@gmail.com> escribió:
> >>>>
> >>>> and, did it improve anything noticeably?
> >>>>
> >>>>> On 10/13/18, Charles Forsyth <charles.forsyth@gmail.com> wrote:
> >>>>> I did several versions of one part of zero copy, inspired by several
> >>>>> things
> >>>>> in x-kernel, replacing Blocks by another structure throughout the
> >>>>> network
> >>>>> stacks and kernel, then made messages visible to user level. Nemo did
> >>>>> another part, on his way to Clive
> >>>>>
> >>>>>> On Fri, 12 Oct 2018, 07:05 Ori Bernstein, <ori@eigenstate.org>
> wrote:
> >>>>>>
> >>>>>> On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg
> >>>>>> <lyndon@orthanc.ca>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Another case to ponder ...   We're handling the incoming I/Q data
> >>>>>>> stream, but need to fan that out to many downstream consumers.  If
> >>>>>>> we already read the data into a page, then flip it to the first
> >>>>>>> consumer, is there a benefit to adding a reference counter to that
> >>>>>>> read-only page and leaving the page live until the counter expires?
> >>>>>>>
> >>>>>>> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
> >>>>>>> done don't show anyone trying this out with P9 (and publishing
> >>>>>>> their results).  Anybody have hints/references to prior work?
> >>>>>>>
> >>>>>>> --lyndon
> >>>>>>>
> >>>>>>
> >>>>>> I don't believe anyone has done the work yet. I'd be interested
> >>>>>> to see what you come up with.
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>>   Ori Bernstein
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >
> >
> >
>
>



* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-15 16:48                     ` Charles Forsyth
@ 2018-10-15 17:01                       ` hiro
  2018-10-15 17:29                       ` hiro
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-15 17:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Btw, "zero copy" isn't the right term and I preferred another term that I've now forgotten. Minimal copying, perhaps.

I like that. "zero-copy" makes me think of other linux-specifics, and
those are hard not to get emotional about.




* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-15 16:48                     ` Charles Forsyth
  2018-10-15 17:01                       ` hiro
@ 2018-10-15 17:29                       ` hiro
  2018-10-15 23:06                         ` Charles Forsyth
  2018-10-16  0:09                       ` erik quanstrom
  2018-10-17 18:14                       ` Charles Forsyth
  3 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-15 17:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Also, NUMA effects are more important in practice on big multicores. Some
> of the off-chip delays are brutal.

yeah, we've been talking about this on #cat-v. even inside one CPU
package amd puts multiple dies nowadays, and the cross-die cpu cache
access delays are approaching the same order of magnitude as memory
access!

also on each die they have what they call a ccx (cpu complex), a
grouping of 4 cores, which are connected much faster internally than
to the other ccx



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-15 17:29                       ` hiro
@ 2018-10-15 23:06                         ` Charles Forsyth
  0 siblings, 0 replies; 104+ messages in thread
From: Charles Forsyth @ 2018-10-15 23:06 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]

They are machines designed to run programs most people do not write!

On Mon, 15 Oct 2018 at 19:20, hiro <23hiro@gmail.com> wrote:

> > Also, NUMA effects are more important in practice on big multicores. Some
> > of the off-chip delays are brutal.
>
> yeah, we've been talking about this on #cat-v. even inside one CPU
> package amd puts multiple dies nowadays, and the cross-die cpu cache
> access delays are approaching the same dimensions as memory-access!
>
> also on each die, they have what they call ccx (cpu complex),
> groupings of 4 cores, which are connected much faster internally than
> towards the other ccx
>
>

[-- Attachment #2: Type: text/html, Size: 909 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-15 16:48                     ` Charles Forsyth
  2018-10-15 17:01                       ` hiro
  2018-10-15 17:29                       ` hiro
@ 2018-10-16  0:09                       ` erik quanstrom
  2018-10-17 18:14                       ` Charles Forsyth
  3 siblings, 0 replies; 104+ messages in thread
From: erik quanstrom @ 2018-10-16  0:09 UTC (permalink / raw)
  To: 9fans

> It's useful internally in protocol implementation, specifically to avoid
> copying in transport protocols (for later retransmission), and the
> modifications aren't vast.
> A few changes were trickier, often because of small bugs in the original
> code. icmp does some odd things i think.

that makes sense.  likewise, if it were essentially free to add file systems in the i/o path,
from user space, one could build micro file systems that took care of small details without
incurring much cost.  ramfs is enough of a file system if you have other programs to do
other things like dump.

> I'll see if I wrote up some of it. I think there were manual pages for the
> Messages replacing Blocks.

that would be great.  thanks.


> My mcs lock implementation was probably more useful, and I use that in my
> copy of the kernel known as 9k

indeed.  i've seen great performance with mcs in my kernel.

- erik



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] zero copy & 9p (was Re: PDP11 (Was: Re: what heavy negativity!)
  2018-10-15 16:48                     ` Charles Forsyth
                                         ` (2 preceding siblings ...)
  2018-10-16  0:09                       ` erik quanstrom
@ 2018-10-17 18:14                       ` Charles Forsyth
  3 siblings, 0 replies; 104+ messages in thread
From: Charles Forsyth @ 2018-10-17 18:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 263 bytes --]

> I'll see if I wrote up some of it. I think there were manual pages for the
>> Messages replacing Blocks.
>
>
Here are the three manual pages  https://goo.gl/Qykprf
It's not obvious from them, but internally a Fragment can represent a slice
of a Segment*

[-- Attachment #2: Type: text/html, Size: 764 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 20:55                         ` Digby R.S. Tarvin
@ 2018-10-11 21:03                           ` Lyndon Nerenberg
  0 siblings, 0 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 21:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Digby R.S. Tarvin writes:

> Oh yes, I read Eldon Halls book on that quite a few years ago. Meetings
> held to discuss competing potential uses for a word of memory that had
> become free.

> That one would be a challenging Plan9 port..

And yet Plan9 was not there to save the day.  Such a pity.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 18:10                       ` Lyndon Nerenberg
@ 2018-10-11 20:55                         ` Digby R.S. Tarvin
  2018-10-11 21:03                           ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-11 20:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 917 bytes --]

Oh yes, I read Eldon Halls book on that quite a few years ago. Meetings
held to discuss competing potential uses for a word of memory that had
become free.

That one would be a challenging Plan9 port..

On Fri, 12 Oct 2018 at 05:13, Lyndon Nerenberg <lyndon@orthanc.ca> wrote:

> Digby R.S. Tarvin writes:
>
> > Agreed, but the PDP11/70 was not constrained to 64KB memory either.
>
> > I do recall the MS-DOS small/large/medium etc models that used the
> > segmentation in various ways to mitigate the limitations of being a 16
> bit
> > computer. Similar techniques were possible on the PDP11, for example
>
> Coincidental to this conversation, I'm currently reading "The Apollo
> Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
> (ISBN 978-1-4419-0876-6)  Very interesting to see what you can do with
> a 15 bit architecture when sufficiently motivated.
>
> --lyndon
>
>

[-- Attachment #2: Type: text/html, Size: 1232 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:57                                 ` hiro
@ 2018-10-11 20:23                                   ` Lyndon Nerenberg
  0 siblings, 0 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 20:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:

> don't you need sending ability, too for AIS?

No, a receive-only setup is very useful on a small boat.  Where I
would like to go with this is to take the decoded AIS data as input
for "ARPA" style collision plots.  I'm interested in the big boats
sailing through the strait.  They can't turn fast, and rarely
change course.  If I can derive their intentions, I can plot a path
between them that requires the least amount of tacking.

The big boats, in turn, have no interest in us little critters.
They actively filter out the "class B" (I think that's the term)
noise that are AIS transmissions from the small craft.  Even if we
hit them, we can't sink them, so they don't care about us.  Therefore
there is no incentive for small boats to transmit AIS.  Unless you're
trying to locate your buddies for a tie-up somewhere.  (That can
be a very valid reason for transmitting!)

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:47                               ` Lyndon Nerenberg
@ 2018-10-11 19:57                                 ` hiro
  2018-10-11 20:23                                   ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-11 19:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

>> I assumed you were using an RTL2832U (rtlsdr library).
>
> I'm pretty sure they all do, under the hood.
>
>

don't you need sending ability, too for AIS?



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:27         ` Lyndon Nerenberg
@ 2018-10-11 19:56           ` hiro
  0 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-11 19:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> need to prove it can be done with the usual
> suspects (GNU radio, on the Pi -- the native fft libraries seem fast
> enough to make this viable).

be assured i've demodulated 25khz signals in real-time and it's a walk
in the park, as long as your revision has the neon stuff i mentioned,
otherwise the fft becomes the bottleneck.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:44                             ` Skip Tavakkolian
@ 2018-10-11 19:47                               ` Lyndon Nerenberg
  2018-10-11 19:57                                 ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 19:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Skip Tavakkolian writes:

> I assumed you were using an RTL2832U (rtlsdr library).

I'm pretty sure they all do, under the hood.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:39                           ` Lyndon Nerenberg
@ 2018-10-11 19:44                             ` Skip Tavakkolian
  2018-10-11 19:47                               ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: Skip Tavakkolian @ 2018-10-11 19:44 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 454 bytes --]

I assumed you were using an RTL2832U (rtlsdr library).

On Thu, Oct 11, 2018, 12:40 PM Lyndon Nerenberg <lyndon@orthanc.ca> wrote:

> > I was able to use dump1090 (same author as redis) to get ADSB data
> reliably
> > on RPi/Linux a while back.
>
> I have a pair of Flightbox ADS-B receivers I am using as references.
> While mostly reliable, they can and do stutter along with the rest
> of the alternatives on occasion.
>
> --lyndon
>
>

[-- Attachment #2: Type: text/html, Size: 744 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:26                         ` Skip Tavakkolian
@ 2018-10-11 19:39                           ` Lyndon Nerenberg
  2018-10-11 19:44                             ` Skip Tavakkolian
  0 siblings, 1 reply; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 19:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

> I was able to use dump1090 (same author as redis) to get ADSB data reliably
> on RPi/Linux a while back.

I have a pair of Flightbox ADS-B receivers I am using as references.
While mostly reliable, they can and do stutter along with the rest
of the alternatives on occasion.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:11       ` hiro
@ 2018-10-11 19:27         ` Lyndon Nerenberg
  2018-10-11 19:56           ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 19:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:

> But given the alternatives available back then, even the armv5 in the
> kirkwood, which was cheaper even before the rpi became popular, did
> the same job more stably, which is why i would never actually
> recommend the pi. And there are even more alternatives now.

I get that. But the actual hardware driving this conversation isn't
particularly relevant, and devolving into a hardware bikeshed isn't
helpful.  (Not picking on you specifically.)

> Are you doing the AIS demodulation on plan9 on rpi? It would be a
> great showcase. Wish I had been given the opportunity to find an
> excuse to build something like that on plan9 instead :)

Not yet.  First I need to prove it can be done with the usual
suspects (GNU radio, on the Pi -- the native fft libraries seem fast
enough to make this viable).  If the pessimized case works, then
porting the code from the GNU radio python modules to C is a
mechanical process for the most part.  This week I am ENOTIME with
getting the boat tarped up in preparation for the winter monsoon
season :-P.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 17:54                       ` Lyndon Nerenberg
  2018-10-11 18:04                         ` Kurt H Maier
  2018-10-11 19:23                         ` hiro
@ 2018-10-11 19:26                         ` Skip Tavakkolian
  2018-10-11 19:39                           ` Lyndon Nerenberg
  2 siblings, 1 reply; 104+ messages in thread
From: Skip Tavakkolian @ 2018-10-11 19:26 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 1094 bytes --]

I was able to use dump1090 (same author as redis) to get ADSB data reliably
on RPi/Linux a while back.

On Thu, Oct 11, 2018, 10:54 AM Lyndon Nerenberg <lyndon@orthanc.ca> wrote:

> > I have been able to copy 1 GiB/s to userspace from an nvme device. I
> should
> > think a radio should be no problem.
>
> The problem is when you have multiple decoder blocks implemented
> as individual processes (i.e. the GNU radio model).  Once you have
> everything debugged, you can put it into a single threaded process
> and eliminate the copy overhead.  But it's completely impractical
> to prototype or debug real applications this way.  And it's the
> prototyping case I'm interested in here.
>
> So I'm *curious* to know if page flipping a 'protocol buffer' like
> object between processes provides an optimization over copying
> through the kernel.  Not so much for the speed aspect, but to free
> up CPU cycles that can be devoted to actual SDR work.
>
> Since when did curiosity become a capital crime?   Oh, wait, that
> was January 20, 2017.  My bad.
>
> --lyndon
>
>

[-- Attachment #2: Type: text/html, Size: 1405 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:24                           ` hiro
@ 2018-10-11 19:25                             ` hiro
  0 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-11 19:25 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

We also have CPU extensions that can help make fast FFT, because it's
such a generic problem, and in the worst case you can use fpgas,
asics, in any case dedicated hardware.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 19:23                         ` hiro
@ 2018-10-11 19:24                           ` hiro
  2018-10-11 19:25                             ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-11 19:24 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

i meant without having to resort to some soft fp.

On 10/11/18, hiro <23hiro@gmail.com> wrote:
>> through the kernel.  Not so much for the speed aspect, but to free
>> up CPU cycles that can be devoted to actual SDR work.
>
> those 2x25kHz channels would hardly need many cycles. rather it's just
> a matter of selecting the right CPU that can actually do the FFT with
> some software floating point implementation :)
>
> i don't see memory bandwidth or even random memory access latency
> affecting this scenario in the slightest.
>



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 17:54                       ` Lyndon Nerenberg
  2018-10-11 18:04                         ` Kurt H Maier
@ 2018-10-11 19:23                         ` hiro
  2018-10-11 19:24                           ` hiro
  2018-10-11 19:26                         ` Skip Tavakkolian
  2 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-11 19:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> through the kernel.  Not so much for the speed aspect, but to free
> up CPU cycles that can be devoted to actual SDR work.

those 2x25kHz channels would hardly need many cycles. rather it's just
a matter of selecting the right CPU that can actually do the FFT with
some software floating point implementation :)

i don't see memory bandwidth or even random memory access latency
affecting this scenario in the slightest.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 17:43     ` Lyndon Nerenberg
@ 2018-10-11 19:11       ` hiro
  2018-10-11 19:27         ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-11 19:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> One example is for an AIS transceiver on a boat.  By putting the
> radio and decoder at the top of the mast, the backhaul can be a
> cat-3 twisted pair cable, rather than a much heavier coax run from
> the antenna at the top of the mast to the receiver below decks.

Yeah, I've been sending 3Mbit I/Q samples over ethernet to a more
beefy computer. For non-technical crowds I described the rpi as a
passable USB->ethernet gateway for SDR tasks in that bandwidth.

But given the alternatives available back then, even the armv5 in the
kirkwood, which was cheaper even before the rpi became popular, did
the same job more stably, which is why i would never actually
recommend the pi. And there are even more alternatives now.

Even the rpi itself is proof that better alternatives exist (as they
did even back then when the first one came out), because the newer rpi
revision (i think) has finally gained neon cpu extensions, which
surprisingly have been supported by gnuradio long before this, and a
reason why my bachelor thesis back then was an easy success :)

In general all the limits I ran into on the rpi were due to
stability (usb power and compatibility issues), but more concretely
for our discussion: lack of cpu power, mainly for the FFT. There were
no throughput, delay or memory copy bottlenecks for me.

This was using linux, because my mouse didn't work on the old rpi
plan9 image and sadly there was a time-limit...

Are you doing the AIS demodulation on plan9 on rpi? It would be a
great showcase. Wish I had been given the opportunity to find an
excuse to build something like that on plan9 instead :)



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 23:15                     ` Digby R.S. Tarvin
@ 2018-10-11 18:10                       ` Lyndon Nerenberg
  2018-10-11 20:55                         ` Digby R.S. Tarvin
  0 siblings, 1 reply; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 18:10 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Digby R.S. Tarvin writes:

> Agreed, but the PDP11/70 was not constrained to 64KB memory either.

> I do recall the MS-DOS small/large/medium etc models that used the
> segmentation in various ways to mitigate the limitations of being a 16 bit
> computer. Similar techniques were possible on the PDP11, for example

Coincidental to this conversation, I'm currently reading "The Apollo
Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
(ISBN 978-1-4419-0876-6)  Very interesting to see what you can do with
a 15 bit architecture when sufficiently motivated.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-11 17:54                       ` Lyndon Nerenberg
@ 2018-10-11 18:04                         ` Kurt H Maier
  2018-10-11 19:23                         ` hiro
  2018-10-11 19:26                         ` Skip Tavakkolian
  2 siblings, 0 replies; 104+ messages in thread
From: Kurt H Maier @ 2018-10-11 18:04 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Oct 11, 2018 at 10:54:22AM -0700, Lyndon Nerenberg wrote:
>
> Since when did curiosity become a capital crime?   Oh, wait, that
> was January 20, 2017.  My bad.

Turns out it's not, so you can climb down off your cross.  It's just
that it helps to be a little clearer about your meaning, that's all.
Otherwise you might do something embarrassing, like posting SAS
controller code into an NVMe discussion.

khm



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 22:05                     ` erik quanstrom
@ 2018-10-11 17:54                       ` Lyndon Nerenberg
  2018-10-11 18:04                         ` Kurt H Maier
                                           ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 17:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

> I have been able to copy 1 GiB/s to userspace from an nvme device. I should
> think a radio should be no problem.

The problem is when you have multiple decoder blocks implemented
as individual processes (i.e. the GNU radio model).  Once you have
everything debugged, you can put it into a single threaded process
and eliminate the copy overhead.  But it's completely impractical
to prototype or debug real applications this way.  And it's the
prototyping case I'm interested in here.

So I'm *curious* to know if page flipping a 'protocol buffer' like
object between processes provides an optimization over copying
through the kernel.  Not so much for the speed aspect, but to free
up CPU cycles that can be devoted to actual SDR work.

Since when did curiosity become a capital crime?   Oh, wait, that
was January 20, 2017.  My bad.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  5:52   ` hiro
  2018-10-10  8:13     ` Digby R.S. Tarvin
@ 2018-10-11 17:43     ` Lyndon Nerenberg
  2018-10-11 19:11       ` hiro
  1 sibling, 1 reply; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-11 17:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:

> Does this include demodulation on the pi?

Yes.  At least to a certain extent.  The idea is to get from the
high-bitrate I/Q data to something more amenable to transmission
over an RS-422 (or -485) serial drop.

One example is for an AIS transceiver on a boat.  By putting the
radio and decoder at the top of the mast, the backhaul can be a
cat-3 twisted pair cable, rather than a much heavier coax run from
the antenna at the top of the mast to the receiver below decks.

Reducing the weight at the top of the mast reduces the moment arm
acting on the boat, significantly enhancing the stability of a
sailboat (which is how I got started down this road to begin with).

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10 10:38                   ` Ethan Gardener
@ 2018-10-10 23:15                     ` Digby R.S. Tarvin
  2018-10-11 18:10                       ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-10 23:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 10530 bytes --]

On Wed, 10 Oct 2018 at 21:40, Ethan Gardener <eekee57@fastmail.fm> wrote:

> >
> > Not sure I would agree with that. The 20 bit addressing of the 8086 and
> 8088 did not change their 16 bit nature. They were still 16 bit program
> counter, with segmentation to provide access to a larger memory - similar
> in principle to the PDP11 with MMU.
>
> That's not at all the same as being constrained to 64KB memory.  Are we
> communicating at cross purposes here?  If we're not, if I haven't
> misunderstood you, you might want to read up on creating .exe files for
> MS-DOS.


Agreed, but the PDP11/70 was not constrained to 64KB memory either.

I do recall the MS-DOS small/large/medium etc models that used the
segmentation in various ways to mitigate the limitations of being a 16 bit
computer. Similar techniques were possible on the PDP11, for example
Modula-2/VRS under RT-11 used the MMU to transparently support 4MB programs
back in 1984 (it used trap instructions to implement subroutine calls).

It wasn't possible under Unix, of course, because there were no system
calls for manipulating the mmu. Understandable, as it would have
complicated the security model in a multi-tasking system. Something neither
MS-DOS or RT-11 had to deal with.

Address space manipulation was more convenient with Intel segmentation
because the instruction set included procedure call/return instructions
that manipulated the segmentation registers, but the situation was not
fundamentally different.  They were both 16 bit machines with hacks to give
access to a larger than 64K physical memory.

The OS9 operating system allowed some control of application memory maps in
a unix like environement by supporting dynamic (but explicit) link and
unlink of subroutine and data modules - which would be added and removed
from your 64K address space as required. So it was more analogous to
memory-based overlays.


> > I went Commodore Amiga at about that time - because it at least
> supported some form of multi-tasking out out the box, and I spent many
> happy hours getting OS9 running on it.. An interesting architecture,
> capable of some impressive graphics, but subject to quite severe
> limitations which made general purpose graphics difficult. (Commodore later
> released SVR4 Unix for the A3000, but limited X11 to monochrome when using
> the inbuilt graphics).
>
> It does sound like fun. :)  I'm not surprised by the monochrome graphics
> limitation after my calculations.  Still, X11 or any other window system
> which lacks a backing store may do better in low-memory environments than
> Plan 9's present draw device.  It's a shame, a backing store is a great
> simplification for programmers.
>

X11 does, of course, support the concept of a backing store. It just
doesn't mandate it. It was an expensive thing to provide back when X11 was
young, so pretty rare. I remember finding the need to be able to re-create
windows on demand rather annoying when I first learned to program in Xlib,
but once you get used to it, I find it can lead to benefits: you have to
retain a knowledge of how an image is created, not just the end result.


> > But being 32 bit didn't give it a huge advantage over the 16 bit x86
> systems for tinkering with operating system, because the 68000 had no MMU.
> It was easier to get a Unix like system going with 16 bit segmentation than
> a 32 bit linear space and no hardware support for run time relocation.
> > (OS9 used position independent code throughout to work without an MMU,
> but didn't try to implement fork() semantics).
>
> I'm sometimes tempted to think that fork() is freakishly high-level crazy
> stuff. :)  Still, like backing store, it's very nice to have.
>

I agree. Very elegant when you compare it to the hoops you have to jump
through to initialize the child process environment in systems with the
more common combined 'forkexec' semantics, but a real sticking point for
low end hardware.


> > It wasn't till the 68030 based Amiga 3000 came out in 1990 that it
> really did everything I wanted. The 68020 with an optional MMU was
> equivalent, but not so common in consumer machines.
> >
> > Hardware progress seems to have been rather uninteresting since then.
> Sure, hardware is *much* faster and *much* bigger, but fundamentally the
> same architecture. Intel had a brief flirtation with a novel architecture
> with the iAPX 432 in 81, but obviously found it was more profitable
> making the familiar architecture bigger and faster.
>
> I rather agree.  Multi-core and hyperthreading don't bring in much from an
> operating system designer's perspective, and I think all the interesting
> things about caches are means of working around their problems.


I don't think anyone would bother with multiple cores or caches if that
same performance could be achieved without them.  They just buy a bit more
performance at the cost of additional software complexity.

I would very much like to get my hands on a ga144 to see what sort of
> operating system structure would work well on 144 processors with 64KW RAM
> each. :)  There's 64KW ROM per processor too, a lot of stock code could go
> in that.  Both the RAM and ROM operate at the full speed of the processor,
> no caches to worry about.
>

Interesting. I hadn't come across those before....


> A little rant about MMUs, sort-of saying "unix and C are not without
> complexifying nonsense":  I'm sure the MMU itself is uninteresting or even
> harmful to many who prefer other languages and system designs.  Just look
> at that other discussion about the penalties of copying versus the cache
> penalties of page flipping.  If that doesn't devolve into "heavy
> negativity," it'll only be because those who know don't write much, or
> those who write much don't want to provide actual figures or references to
> argue about.
>

I think if you are going to postulate about this, we need to refine our
terms a bit. The term MMU encompasses too many quite different concepts.
For example:
1. run time relocation
2. virtual address space expansion (use more memory than can be directly
addressed)
3. virtual memory expansion (appear to use more memory than you physically
have)
4. process interference protection
5. process privacy protection

I'm sure there are more. For example, on the 68K processors, OS-9/68K had
no support for the first three - virtual and physical addresses were always
the same. For Unix style timesharing it required all compilers to generate
position independent code. There was no swapping or virtual memory. It did
use an MMU via a module called SPU - the 'System Protection Unit', which
mapped all of your program code as read only, your data as read/write, and
made everything else inaccessible. That sort of functionality is invaluable
while developing, because you don't want faulty programs to change the
kernel or other programs, and trapping attempts to do so makes it easier to
identify faults.
However with suitable privileges, any process can request that arbitrary
memory addresses be made readable or writable, and if desired, the SPU
could be omitted from the system, either removing the MMU performance
penalty or allowing the application to run on cheaper, non-mmu equipped
hardware.

The executables were re-entrant and position independent, but once an
instance started executing it could not be moved (calculated pointers
stored in data etc). So this software solution would not have been enough
to support efficient swapping or paging. It was ok in this case, because it
was intended as a real-time system and you don't do swapping in that
situation.

What about all those languages which don't even give the programmer access
> to pointers in the first place.  Many have run directly on hardware in the
> past, some can now.  Do they need MMUs?
>


> Then there's Forth, which relies on pointers even more than C does.  I
> haven't read *anything* about MMUs in relation to Forth, and yet Forth is
> in practice as much an operating system as a language.  It runs directly on
> hardware.  I'm not sure of some details yet, but it looks like many
> operating system features either "fall out of" the language design (to use
> a phrase from Ken Thompson & co.), or are trivial to implement.
>
> There were multitasking Forth systems in the 70s.  No MMU.  The full power
> of pointers *at the prompt*.  Potential for stack under- and over-runs
> too.  And yet these were working systems, and the language hasn't been
> consigned to the graveyard of computing history.  My big project includes
> exploring how this is possible. :)  A likely possibility is the power to
> redefine words (functions) without affecting previous definitions.  Pointer
> store and fetch can trivially be redefined to check bounds.  Check your
> code doesn't go out of bounds, then "empty", and load it without the
> bounds-checking store and fetch.
>

MMUs are probably more important in multi-user than merely multi-tasking
systems. The Amigas, as I mentioned, were multi-tasking without the need
for an MMU. The result was a very fast system, enabling some of the
impressive graphics and games - but also the need for frequent reboots when
developing.

But you are right: if you try hard enough, you can replace hardware memory
management with software - a sandboxed environment with program development
tools with strong typing and no low-level access. Look, for example, at
the old Burroughs B6700, which had a security paradigm based on making it
impossible for unprivileged users to generate machine code. Compilers had
to be blessed with trusted status, and only code generated by trusted
compilers would be executed. I don't recall many details, other than it
had an interesting tagged architecture.

An extreme example would be an emulator - sandboxing users without any
actual hardware protection, albeit at significant performance cost.

Forth, Basic and all the other development environments common on
personal computers before MMUs became available were fine because there was
generally only one user, the language provided a lot of
protection/restriction (and was often interpreted), and if you managed to
crash it, it was generally quick to restart.

So I still tend to feel that MMUs were a valuable advance in computer
architecture that I would hate to have to live without.

[-- Attachment #2: Type: text/html, Size: 12162 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 22:19 cinap_lenrek
  0 siblings, 0 replies; 104+ messages in thread
From: cinap_lenrek @ 2018-10-10 22:19 UTC (permalink / raw)
  To: 9fans

hahahahahahahaha

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  9:14       ` hiro
  2018-10-10 13:59         ` Steve Simon
@ 2018-10-10 21:32         ` Digby R.S. Tarvin
  1 sibling, 0 replies; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-10 21:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 3664 bytes --]

Well, I think 'avoid at all costs'  is a bit strong.

The Raspberry Pi is a good little platform for the right applications, so
long as you are aware of its limitations. I use one as my 'always on' home
server to give me access to files when travelling (the networking is slow
by LAN standards, but OK for WAN), and another for my energy monitoring
system. It is good for experimenting with OS's, especially networking OS's
like Plan9 where price is important if you want to try a large number of
hosts. It's good for teaching/learning, or for running/trying different
operating systems without having to spend time and resources setting up VMs
(downloading and flashing an SD card image is quick and takes up no space
on my main systems).

Just don't plan on deploying RPi's for mission critical applications that
have demanding I/O or processing requirements. It was never intended to
compete in that market.

On Wed, 10 Oct 2018 at 20:54, hiro <23hiro@gmail.com> wrote:

> I agree, if you have a choice avoid rpi by all costs.
> Even if the software side of that other board was less pleasent at least
> it worked with my mouse and keyboard!! :)
>
> As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
> But my point is that even though this number is low, the rpi is too limited
> to do any meaningful processing anyway (ignoring the usb troubles and lack
> of ethernet). It's a mobile phone soc after all, where the modulation is
> done by dedicated chips, not on cpu! :)
>
> On Wednesday, October 10, 2018, Digby R.S. Tarvin <digbyt42@gmail.com>
> wrote:
> > I don't know which other ARM board you tried, but I have always found
> terrible I/O performance of the Pi to be a bigger problem that the ARM
> speed.  The USB2 interface is really slow, and there arn't really many
> other (documented) alternative options. The Ethernet goes through the same
> slow USB interface, and there is only so much that you can do bit bashing
> data with GPIO's.  The sdCard interface seems to be the only non-usb
> filesystem I/O available. And that in turn limits the viability of
> relieving the RAM contraints with virtual memory. So the ARM processor
> itself is not usually the problem for me.
> > In general I find the pi a nice little device for quite a few things -
> like low power, low bandwidth, low cost servers or displays with plenty of
> open source compatability.. Or hacking/prototyping where I don't want to
> have to worry too much about blowing things up. But it not good for high
> throughput I/O,  memory intensive applications, or anything requiring a lot
> of processing power.
> > The validity of your conclusion regarding low power ARM in general
> probably depends on what the other board you tried was..
> > DigbyT
> > On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:
> >>
> >> > Eliminating as much of the copy in/out WRT the kernel cannot but
> >> > help, especially when you're doing SDR decoding near the radios
> >> > using low-powered compute hardware (think Pies and the like).
> >>
> >> Does this include demodulation on the pi? cause even when i dumped the
> >> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> >> replaced it with some similar ARM platform that at least had neon cpu
> >> instruction extensions for faster floating point operations, I was
> >> barely able to run a small FFT.
> >>
> >> My conclusion was that these low-powered ARM systems are just good
> >> enough for gathering low-bandwidth, non-critical USB traffic, like
> >> those raw I/Q samples from a dongle, but unfit for anything else.
> >>
> >

[-- Attachment #2: Type: text/html, Size: 4174 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10 16:14 cinap_lenrek
  0 siblings, 0 replies; 104+ messages in thread
From: cinap_lenrek @ 2018-10-10 16:14 UTC (permalink / raw)
  To: 9fans

oh! you wrote a nvme driver TOO? where can i find it?

maybe we can share some knowledge. especially regarding
some quirks. i don't own hardware myself, so i wrote it
using an emulator over a weekend and tested it on a
work machine after work.

http://code.9front.org/hg/plan9front/log/9df9ef969856/sys/src/9/pc/sdnvme.c

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  9:14       ` hiro
@ 2018-10-10 13:59         ` Steve Simon
  2018-10-10 21:32         ` Digby R.S. Tarvin
  1 sibling, 0 replies; 104+ messages in thread
From: Steve Simon @ 2018-10-10 13:59 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


people come down very hard on the pi.

here are my times for building the pi kernel. i rebuilt it a few times to push data into any caches available.

pi3+ with a high-ish spec sd card: 23 secs
dual intel atom 1.8Ghz with an SSD: 9 secs

the pi is slower, but not 10 times slower.
However it does cost a 10th of the price and consumes a 10th of the electricity.

i use the order of magnitude test as that is (in my experience) what you need to make a really noticeable difference (to stuff in general).

i use one daily as a plan9 terminal, for which i feel it's ideal.

-Steve






^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  6:24                       ` Bakul Shah
@ 2018-10-10 13:58                         ` erik quanstrom
  0 siblings, 0 replies; 104+ messages in thread
From: erik quanstrom @ 2018-10-10 13:58 UTC (permalink / raw)
  To: 9fans

> > with meltdown/Spectre mitigations in place, I would like to see evidence that flip is faster than copy.
>
> If your system is well balanced, you should be able to
> stream data as fast as memory allows[1]. In such a system
> copying things N times will reduce throughput by similar
> factor. It may be that plan9 underperforms so much this
> doesn't matter normally.

sure.  but flipping page tables is also not free.  there is a huge cost in processor stalls, etc.
spectre and meltdown mitigations make this worse as each page flip has to be accompanied
by a complete pipeline flush or other costly mitigation.  (not that this was cheap to begin with)

it's also not an object to move data as fast as possible.  the object is to do work as fast as possible.

> [1] See: https://code.kx.com/q/cloud/aws/benchmarking/
> A single q process can ingest data at 1.9GB/s from a
> single drive. 16 can achieve 2.7GB/s, with theoretical
> max being 2.8GB/s.

with my same crappy un-optimized nvme driver, i was able to hit 2.5-2.6 GiB/s
with two very crappy nvme drives.  (are your numbers really GB rather than GiB?)
i am sure i could scale that linearly.  there's plenty of memory bandwidth left, but
i haven't got any more nvme.  :-)

similarly coraid built an appliance that did copying (due to cache) and hit 1 million
4k iops.  this was in 2011 or so.

but, so what.  all this proves is that with copying or without, we can ingest enough
data for even the most hungry programs.

unless you have data that shows otherwise.  :-)

- erik



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:14                   ` Lyndon Nerenberg
  2018-10-09 22:05                     ` erik quanstrom
@ 2018-10-10 10:42                     ` Ethan Gardener
  1 sibling, 0 replies; 104+ messages in thread
From: Ethan Gardener @ 2018-10-10 10:42 UTC (permalink / raw)
  To: 9fans

On Tue, Oct 9, 2018, at 8:14 PM, Lyndon Nerenberg wrote:
> hiro writes:
>
> > Huh? What exactly do you mean? Can you describe the scenario and the
> > measurements you made?
>
> The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> Four copies.
>
> I would like to start playing with software defined radio on Plan
> 9, but that amount of data copying is going to put a lot of pressure
> on the kernel to keep up.  UNIX/Linux suffers the same copy bloat,
> and it's having trouble keeping up, too.

References, please.  Programmers are notoriously bad at determining the cause of performance problems.  Examining the source will help to see if "copy bloat" is the actual problem.

>
> --lyndon
>


--
Progress might have been all right once, but it has gone on too long -- Ogden Nash



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 22:22                 ` Digby R.S. Tarvin
@ 2018-10-10 10:38                   ` Ethan Gardener
  2018-10-10 23:15                     ` Digby R.S. Tarvin
  0 siblings, 1 reply; 104+ messages in thread
From: Ethan Gardener @ 2018-10-10 10:38 UTC (permalink / raw)
  To: 9fans

On Tue, Oct 9, 2018, at 11:22 PM, Digby R.S. Tarvin wrote:
> 
> 
> On Tue, 9 Oct 2018 at 23:00, Ethan Gardener <eekee57@fastmail.fm> wrote:
>> 
>> Fascinating thread, but I think you're off by a decade with the 16-bit address bus comment, unless you're not actually talking about Plan 9.  The 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 respectively.  The IBM PC, launched in 1982, had its ROM at the top of that 1MByte space, so it couldn't have been constrained in that way.  By the end of the 80s, all my schoolmates had 68k-powered computers from Commodore and Atari, showing hardware with a 24-bit address space was very much affordable and ubiquitous at the time Plan 9 development started.  Almost all of them had 512KB at the time.  A few flashy gits had 1MB machines. :)
> 
> Not sure I would agree with that. The 20 bit addressing of the 8086 and 8088 did not change their 16 bit nature. They were still 16 bit program counter, with segmentation to provide access to a larger memory - similar in principle to the PDP11 with MMU. 

That's not at all the same as being constrained to 64KB memory.  Are we communicating at cross purposes here?  If we're not, if I haven't misunderstood you, you might want to read up on creating .exe files for MS-DOS.  

> The first 32 bit x86 processor was the 386, which I think came out in 1985, very close to when work on Plan9 was rumored to have  started. So it seemed not impossible that work might have started on an older 16 bit machine, but  at Bell Labs probably a long shot.

Mmh, rumors. I read they were starting to think about Plan 9 in 1985, but I haven't read anything about it being up and running until '89 or '90.  There's not much to go on.

>> I still wish I'd kept the better of the Atari STs which made their way down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier "STFM" models.  I remember wanting to try to run Plan 9 on it.  Let's estimate how tight it would be...
>>  
>>  I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86.  I kept running out of image memory!  The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :)  
>>  
>>  1 bit per pixel would obviously improve matters by a factor of 16 compared to my setup, and 640x400 (Atari ST high resolution) would be another 5 times smaller than my screen.  Putting these numbers together with my experience, you'd have to be careful to use images sparingly on a machine with 800KB free RAM after the kernel is loaded.  That's better than I thought, probably achievable on that Atari I had, but it couldn't be used as intensively as I used Plan 9 back then.  
>>  
>>  How could it be used?  I think it would be a good idea to push the draw device back to user space and make very sure to have it check for failing malloc!  I certainly wouldn't want a terminal with a filesystem and graphics all on a single 1MByte 64000-powered computer, because a filesystem on a terminal runs in user space, and thus requires some free memory to run the programs to shut it down.  Actually, Plan 9's separation of terminal from filesystem seems quite the obvious choice when I look at it like this. :)  
> 
> I went Commodore Amiga at about that time - because it at least supported some form of multi-tasking out out the box, and I spent many happy hours getting OS9 running on it.. An interesting architecture, capable of some impressive graphics, but subject to quite severe limitations which made general purpose graphics difficult. (Commodore later released SVR4 Unix for the A3000, but limited X11 to monochrome when using the inbuilt graphics).

It does sound like fun. :)  I'm not surprised by the monochrome graphics limitation after my calculations.  Still, X11 or any other window system which lacks a backing store may do better in low-memory environments than Plan 9's present draw device.  It's a shame, a backing store is a great simplification for programmers.

> But being 32 bit didn't give it a huge advantage over the 16 bit x86 systems for tinkering with operating system, because the 68000 had no MMU.  It was easier to get a Unix like system going with 16 bit segmentation than a 32 bit linear space and no hardware support for run time relocation.
> (OS9 used position independent code throughout to work without an MMU, but didn't try to implement fork() semantics).

I'm sometimes tempted to think that fork() is freakishly high-level crazy stuff. :)  Still, like backing store, it's very nice to have.

> It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really did everything I wanted. The 68020 with an optional MMU was equivalent, but not so common in consumer machines.
> 
> Hardware progress seems to have been rather uninteresting since then. Sure, hardware is *much* faster and *much* bigger, but fundamentally the same architecture. Intel had a brief flirtation with a novel architecture with the iAPX 432 in 81, but obviously found that was more profitable making the familiar architecture bigger and faster .

I rather agree.  Multi-core and hyperthreading don't bring in much from an operating system designer's perspective, and I think all the interesting things about caches are means of working around their problems.

I would very much like to get my hands on a ga144 to see what sort of operating system structure would work well on 144 processors with 64KW RAM each. :)  There's 64KW ROM per processor too, a lot of stock code could go in that.  Both the RAM and ROM operate at the full speed of the processor, no caches to worry about.

A little rant about MMUs, sort-of saying "unix and C are not without complexifying nonsense":  I'm sure the MMU itself is uninteresting or even harmful to many who prefer other languages and system designs.  Just look at that other discussion about the penalties of copying versus the cache penalties of page flipping.  If that doesn't devolve into "heavy negativity," it'll only be because those who know don't write much, or those who write much don't want to provide actual figures or references to argue about.

What about all those languages which don't even give the programmer access to pointers in the first place.  Many have run directly on hardware in the past, some can now.  Do they need MMUs?  

Then there's Forth, which relies on pointers even more than C does.  I haven't read *anything* about MMUs in relation to Forth, and yet Forth is in practice as much an operating system as a language.  It runs directly on hardware.  I'm not sure of some details yet, but it looks like many operating system features either "fall out of" the language design (to use a phrase from Ken Thompson & co.), or are trivial to implement.  

There were multitasking Forth systems in the 70s.  No MMU.  The full power of pointers *at the prompt*.  Potential for stack under- and over-runs too.  And yet these were working systems, and the language hasn't been consigned to the graveyard of computing history.  My big project includes exploring how this is possible. :)  A likely possibility is the power to redefine words (functions) without affecting previous definitions.  Pointer store and fetch can trivially be redefined to check bounds.  Check your code doesn't go out of bounds, then "empty", and load it without the bounds-checking store and fetch.  

--
Progress might have been all right once, but it has gone on too long -- Ogden Nash



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  8:13     ` Digby R.S. Tarvin
@ 2018-10-10  9:14       ` hiro
  2018-10-10 13:59         ` Steve Simon
  2018-10-10 21:32         ` Digby R.S. Tarvin
  0 siblings, 2 replies; 104+ messages in thread
From: hiro @ 2018-10-10  9:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

I agree, if you have a choice avoid rpi at all costs.
Even if the software side of that other board was less pleasant, at least
it worked with my mouse and keyboard!! :)

As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
But my point is that even though this number is low, the rpi is too limited
to do any meaningful processing anyway (ignoring the usb troubles and lack
of ethernet). It's a mobile phone soc after all, where the modulation is
done by dedicated chips, not on cpu! :)

On Wednesday, October 10, 2018, Digby R.S. Tarvin <digbyt42@gmail.com>
wrote:
> I don't know which other ARM board you tried, but I have always found
terrible I/O performance of the Pi to be a bigger problem that the ARM
speed.  The USB2 interface is really slow, and there arn't really many
other (documented) alternative options. The Ethernet goes through the same
slow USB interface, and there is only so much that you can do bit bashing
data with GPIO's.  The sdCard interface seems to be the only non-usb
filesystem I/O available. And that in turn limits the viability of
relieving the RAM contraints with virtual memory. So the ARM processor
itself is not usually the problem for me.
> In general I find the pi a nice little device for quite a few things -
like low power, low bandwidth, low cost servers or displays with plenty of
open source compatability.. Or hacking/prototyping where I don't want to
have to worry too much about blowing things up. But it not good for high
throughput I/O,  memory intensive applications, or anything requiring a lot
of processing power.
> The validity of your conclusion regarding low power ARM in general
probably depends on what the other board you tried was..
> DigbyT
> On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:
>>
>> > Eliminating as much of the copy in/out WRT the kernel cannot but
>> > help, especially when you're doing SDR decoding near the radios
>> > using low-powered compute hardware (think Pies and the like).
>>
>> Does this include demodulation on the pi? cause even when i dumped the
>> pi i was given for that purpose (with a <2Mbit I/Q stream) and
>> replaced it with some similar ARM platform that at least had neon cpu
>> instruction extensions for faster floating point operations, I was
>> barely able to run a small FFT.
>>
>> My conclusion was that these low-powered ARM systems are just good
>> enough for gathering low-bandwidth, non-critical USB traffic, like
>> those raw I/Q samples from a dongle, but unfit for anything else.
>>
>

[-- Attachment #2: Type: text/html, Size: 2853 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  5:52   ` hiro
@ 2018-10-10  8:13     ` Digby R.S. Tarvin
  2018-10-10  9:14       ` hiro
  2018-10-11 17:43     ` Lyndon Nerenberg
  1 sibling, 1 reply; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-10  8:13 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 1965 bytes --]

I don't know which other ARM board you tried, but I have always found the
terrible I/O performance of the Pi to be a bigger problem than the ARM
speed.  The USB2 interface is really slow, and there aren't really many
other (documented) alternative options. The Ethernet goes through the same
slow USB interface, and there is only so much that you can do bit-bashing
data with GPIOs.  The sdCard interface seems to be the only non-USB
filesystem I/O available, and that in turn limits the viability of
relieving the RAM constraints with virtual memory. So the ARM processor
itself is not usually the problem for me.

In general I find the pi a nice little device for quite a few things - like
low power, low bandwidth, low cost servers or displays with plenty of open
source compatibility, or hacking/prototyping where I don't want to have to
worry too much about blowing things up. But it's not good for high
throughput I/O, memory intensive applications, or anything requiring a lot
of processing power.

The validity of your conclusion regarding low-power ARM in general probably
depends on what the other board you tried was.

DigbyT

On Wed, 10 Oct 2018 at 17:51, hiro <23hiro@gmail.com> wrote:

> > Eliminating as much of the copy in/out WRT the kernel cannot but
> > help, especially when you're doing SDR decoding near the radios
> > using low-powered compute hardware (think Pies and the like).
>
> Does this include demodulation on the pi? cause even when i dumped the
> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> replaced it with some similar ARM platform that at least had neon cpu
> instruction extensions for faster floating point operations, I was
> barely able to run a small FFT.
>
> My conclusion was that these low-powered ARM systems are just good
> enough for gathering low-bandwidth, non-critical USB traffic, like
> those raw I/Q samples from a dongle, but unfit for anything else.
>
>

[-- Attachment #2: Type: text/html, Size: 2326 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  3:28               ` Lucio De Re
  2018-10-09  8:23                 ` hiro
  2018-10-09  9:45                 ` Ethan Gardener
@ 2018-10-10  7:32                 ` Giacomo Tesio
  2 siblings, 0 replies; 104+ messages in thread
From: Giacomo Tesio @ 2018-10-10  7:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 9 Oct 2018 at 05:33, Lucio De Re
<lucio.dere@gmail.com> wrote:
>
> On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
> >
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.  It is already half way there -- it is basically a
> > mux for 9p calls, low level device drivers,
> >
> There are religious reasons not to go there

Indeed, as a heretic, one of the first things I did with Jehanne was
to move the console filesystem out of the kernel.
Then I moved several syscalls into userspace, or turned them into files
or into operations on existing files.
More syscall/kernel services will move to user space as I'll have time
to hack on it again.

You know... heretics ruin everything!

I'm not going to turn Jehanne to a microkernel, but I'm looking for
the simplest possible set of kernel abstractions that can support a
distributed operating system able to replace the mainstream Web+OS
mess.
You know... heretics are crazy, too!


Giacomo



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 22:06                     ` erik quanstrom
@ 2018-10-10  6:24                       ` Bakul Shah
  2018-10-10 13:58                         ` erik quanstrom
  0 siblings, 1 reply; 104+ messages in thread
From: Bakul Shah @ 2018-10-10  6:24 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Oct 9, 2018, at 3:06 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> 
> with meltdown/Spectre mitigations in place, I would like to see evidence that flip is faster than copy.

If your system is well balanced, you should be able to
stream data as fast as memory allows[1]. In such a system
copying things N times will reduce throughput by similar
factor. It may be that plan9 underperforms so much this
doesn't matter normally.

But the reason I want this is to reduce latency to the first
access, especially for very large files. With read() I have
to wait until the read completes. With mmap() processing can
start much earlier and can be interleaved with background
data fetch or prefetch. With read() a lot more resources
are tied down. If I need random access and don't need to
read all of the data, the application has to do pread(),
pwrite() a lot thus complicating it. With mmap() I can just
map in the whole file and excess reading (beyond what the
app needs) will not be a large fraction.

The default assumption here seems to be that doing this
will be very complicated and be as bad as on Linux. But
Linux is not a good model of what to do and examples of what
not to do are not useful guides in system design. There are
other OSes such as the old Apollo Aegis (AKA Apollo/Domain),
KeyKOS & seL4 that avoid copying[2].

Though none of this matters right now as we don't even have
a paper design so please put down your clubs and swords :-)

[1] See: https://code.kx.com/q/cloud/aws/benchmarking/
A single q process can ingest data at 1.9GB/s from a
single drive. 16 can achieve 2.7GB/s, with theoretical
max being 2.8GB/s.

[2] Liedtke's original L4 evolved into the provably secure
seL4, and in the process it became very much like KeyKOS.
Capability systems do pass around pages as protected
objects and avoid copying. Sort of like how in a program
you'd pass a huge array by reference and not by value
to a function.





^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 23:43 ` Lyndon Nerenberg
  2018-10-10  5:52   ` hiro
@ 2018-10-10  5:57   ` hiro
  1 sibling, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-10  5:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> via USB and see how it stands up.  But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?

How is your proposal of zero-copy going to help latency? IIRC we have
some real-time thingy, might be able to reduce jitter...
But then I might also ask why you're not doing the most critical path
on an fpga anyway?
Start with identifying your worst bottleneck.

> Eliminating as much of the copy in/out WRT the kernel cannot but
> help

wrong, this design change requires resources, too, and might gain you
higher complexity. measure first.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 23:43 ` Lyndon Nerenberg
@ 2018-10-10  5:52   ` hiro
  2018-10-10  8:13     ` Digby R.S. Tarvin
  2018-10-11 17:43     ` Lyndon Nerenberg
  2018-10-10  5:57   ` hiro
  1 sibling, 2 replies; 104+ messages in thread
From: hiro @ 2018-10-10  5:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).

Does this include demodulation on the pi? cause even when i dumped the
pi i was given for that purpose (with a <2Mbit I/Q stream) and
replaced it with some similar ARM platform that at least had neon cpu
instruction extensions for faster floating point operations, I was
barely able to run a small FFT.

My conclusion was that these low-powered ARM systems are just good
enough for gathering low-bandwidth, non-critical USB traffic, like
those raw I/Q samples from a dongle, but unfit for anything else.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  0:18                       ` Dan Cross
@ 2018-10-10  5:45                         ` hiro
  0 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-10  5:45 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I was responding to lyndon's comment on certain "experiments" that
should have to be done here, 2 messages up.
But what he described sounded exactly like the zero-copying stuff that
linux is trying to shove into everything.
I have not made any statement about non-linux systems, and I'm not
even saying these experiments couldn't be done on plan9, it's just
that the linux people are way busier going down that path.

On 10/10/18, Dan Cross <crossd@gmail.com> wrote:
> On Tue, Oct 9, 2018 at 7:24 PM hiro <23hiro@gmail.com> wrote:
>
>> from what i see in linux people have been more than just exploring it,
>> they've gone absolutely nuts. it makes everything complex, not just
>> the fast path.
>>
>
> To whom are you responding? Your email is devoid of context, so it is not
> clear.
>
> However your statement appears to be based on an unstated assumption that
> there is a plan9 school of thought, and a Linux school of thought, and no
> other school of thought. If so, that is incorrect.
>
>         - Dan C.
>



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-10  0:15 cinap_lenrek
@ 2018-10-10  0:22 ` Lyndon Nerenberg
  0 siblings, 0 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-10  0:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

cinap_lenrek@felloff.net writes:

> why? the *HOST CONTROLLER* schedules the data transfers.

I *DON'T KNOW*.  It's just observed behaviour.

> ahhhh! we're talking about some crappy raspi here... probably with all
> caches disabled... never mind.

Hah.  An Rpi tips over with 1200 baud USB serial.  I was talking
about "real" (Intel :-P) hardware for the other tippy-over behaviour.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:34                     ` hiro
  2018-10-09 19:36                       ` hiro
  2018-10-09 19:40                       ` Lyndon Nerenberg
@ 2018-10-10  0:18                       ` Dan Cross
  2018-10-10  5:45                         ` hiro
  2 siblings, 1 reply; 104+ messages in thread
From: Dan Cross @ 2018-10-10  0:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 540 bytes --]

On Tue, Oct 9, 2018 at 7:24 PM hiro <23hiro@gmail.com> wrote:

> from what i see in linux people have been more than just exploring it,
> they've gone absolutely nuts. it makes everything complex, not just
> the fast path.
>

To whom are you responding? Your email is devoid of context, so it is not
clear.

However your statement appears to be based on an unstated assumption that
there is a plan9 school of thought, and a Linux school of thought, and no
other school of thought. If so, that is incorrect.

        - Dan C.

[-- Attachment #2: Type: text/html, Size: 865 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-10  0:15 cinap_lenrek
  2018-10-10  0:22 ` Lyndon Nerenberg
  0 siblings, 1 reply; 104+ messages in thread
From: cinap_lenrek @ 2018-10-10  0:15 UTC (permalink / raw)
  To: 9fans

> To address Hiro's comments, I have no benchmarks on Plan 9, because
> the SDR code I run does not exist there.  But I do have experience
> with running SDR on Linux and FreeBSD with hardware like the HackRF
> One.  That hardware can easily saturate a USB2 interface/driver on
> both of those operating systems.  Given my experience with USB on
> Plan 9 to date, it's a safe bet that all the variants would die
> when presented with that amount of traffic.

why? the *HOST CONTROLLER* schedules the data transfers. if the
program doesn't do a read() there's nothing to schedule... (unless
it's an isochronous endpoint, in which case the controller dma's for
you in the background at the specified sampling rate).

> (I can knock down a Plan9 system with 56 Kb/s USB serial traffic.)

that sounds seriously screwed up. i have no issues here reading a usb
stick on my x230 with xhci at 32MB/s, not using any fancy streaming
optimization. no load at all. and this is just some garbage from the
supermarket.

> I can see about
> twisting up some code that would read the raw I/Q data from the SDR
> via USB and see how it stands up.  But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?

is this an isochronous endpoint? in that case you would not have to
worry much as the controller does all the timing for you in hardware.

> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).

ahhhh! we're talking about some crappy raspi here... probably with all
caches disabled... never mind.

> --lyndon

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:47 cinap_lenrek
  2018-10-09 22:01 ` erik quanstrom
@ 2018-10-09 23:43 ` Lyndon Nerenberg
  2018-10-10  5:52   ` hiro
  2018-10-10  5:57   ` hiro
  1 sibling, 2 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 23:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

cinap_lenrek@felloff.net writes:
> > The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> > Four copies.
>
> that sounds wrong.
>
> usbd is not involved in the data transfer.

You're right, I was wrong about 'usbd'.  In the bits of testing
I've done with this, 'usbd' is replaced with a user space file
server that abstracts the hardware and presents a useful file system
interface.  (E.g. along the lines of the gps filesystem interface.)

To address Hiro's comments, I have no benchmarks on Plan 9, because
the SDR code I run does not exist there.  But I do have experience
with running SDR on Linux and FreeBSD with hardware like the HackRF
One.  That hardware can easily saturate a USB2 interface/driver on
both of those operating systems.  Given my experience with USB on
Plan 9 to date, it's a safe bet that all the variants would die
when presented with that amount of traffic. (I can knock down a
Plan9 system with 56 Kb/s USB serial traffic.)  I can see about
twisting up some code that would read the raw I/Q data from the SDR
via USB and see how it stands up.  But the real question is what
kind of delay, latency, and jitter will there be, getting that raw
I/Q data from the USB interface up to the consuming application?

Eliminating as much of the copy in/out WRT the kernel cannot but
help, especially when you're doing SDR decoding near the radios
using low-powered compute hardware (think Pies and the like).

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 18:49                 ` hiro
  2018-10-09 19:14                   ` Lyndon Nerenberg
  2018-10-09 19:23                   ` Lyndon Nerenberg
@ 2018-10-09 22:42                   ` Dan Cross
  2 siblings, 0 replies; 104+ messages in thread
From: Dan Cross @ 2018-10-09 22:42 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 4338 bytes --]

On Tue, Oct 9, 2018 at 5:28 PM hiro <23hiro@gmail.com> wrote:

> > E.g. right now Plan 9 suffers from a *lot* of data copying between
> > the kernel and processes, and between processes themselves.
>
> Huh? What exactly do you mean?


The current plan9 architecture relies heavily on copying data within a
process between userspace and the kernel for e.g. IO. This should be well
known to anyone who's rummaged around in the kernel, as it's pretty
evident. Because of the simple system call and VM interfaces, things like
scatter/gather IO or memory-mapped direct hardware access aren't really
options. `iowritev`, for example, coalesces its arguments into a single
buffer that it then pwrite()'s to its destination.
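The coalescing behavior just described can be sketched in a few lines. This is not the actual kernel code, only an illustration of the extra copy being discussed (struct and function names are made up):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What an iowritev-style coalesce does: gather N scattered buffers
 * into one contiguous allocation, so a single pwrite() can be issued
 * instead of N writes. The memcpy loop is the extra copy. */
struct iovec9 {
	void	*base;
	size_t	len;
};

/* Returns a malloc'd buffer holding the concatenated iovecs;
 * *total receives the combined length. Caller frees. */
char *
coalesce(struct iovec9 *iov, int n, size_t *total)
{
	size_t sum = 0, off = 0;
	for (int i = 0; i < n; i++)
		sum += iov[i].len;
	char *buf = malloc(sum ? sum : 1);
	if (buf == NULL)
		return NULL;
	for (int i = 0; i < n; i++) {
		memcpy(buf + off, iov[i].base, iov[i].len);
		off += iov[i].len;
	}
	*total = sum;
	return buf;
}
```

True scatter/gather IO would hand the vector itself down to the device and skip this intermediate buffer entirely; the simplicity of the single-buffer system call interface is what rules that out.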

Can you describe the scenario and the
> measurements you made?
>

This is a different issue. I don't know if copying is as significant an
overhead as Lyndon suggested, but there are plenty of slow code paths in
plan9. For example, when we ported the plan9 network stack to Akaros, we
made a number of enhancements that combined sped things up by 50% or
greater. Most of these were pretty simple: optimizing checksum
calculations, alignment of IP and TCP headers on natural word boundaries
meaning that we could read an IP address with a 32-bit load (I think that
one netted a gigabit increase in throughput), using optimized memcpy
instead of memmove in performance critical code paths, etc. We went from
about 7Gbps on a 10Gbps interface to saturating the NIC. Those measurements
were made between dedicated test machines on a dedicated network using
netperf. Drew Gallatin, now at Netflix working on FreeBSD's network stack,
did most of the optimization work.
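The alignment win described above can be sketched as follows. This is an illustration of the general technique, not the Akaros patch itself; the struct layout is the standard IPv4 header, and the function names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* With the IP header placed on a 4-byte boundary, the source address
 * sits at offset 12, itself 4-byte aligned, so a compare is one
 * 32-bit load instead of four byte loads and shifts. */
struct iphdr4 {
	uint8_t  vihl, tos;
	uint16_t len;
	uint16_t id, frag;
	uint8_t  ttl, proto;
	uint16_t cksum;
	uint32_t src;	/* naturally aligned at offset 12 */
	uint32_t dst;
};

int
ip_src_is(const struct iphdr4 *h, uint32_t addr)
{
	return h->src == addr;	/* single word compare */
}

/* The alignment-agnostic fallback has to assemble the word from
 * bytes (memcpy here, to stay defined behavior in C). */
int
ip_src_is_bytewise(const uint8_t *p, uint32_t addr)
{
	uint32_t v;
	memcpy(&v, p + 12, 4);
	return v == addr;
}
```

Both return the same answer; the point is only that the aligned version lets the compiler emit one load on every architecture instead of just the ones that tolerate unaligned access.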

If that experience in that one section of the kernel is any indicator,
plan9 undoubtedly has lots of room for optimization in other parts of the
system. Many aspects of the system were optimized for much smaller
machines than are common now, and many of those optimizations no longer
make sense; the allocator is slow, for example, though very good at not
wasting RAM. Compare a vmem-style allocator, which can allocate any
requested size in constant time, but with up to a factor of two waste
of memory.
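The trade-off in that last sentence can be made concrete with a toy size-class scheme: round each request up to a power of two and serve it from a per-class freelist. This is not vmem itself (Bonwick's allocator is considerably more general), just the arithmetic behind "constant time, up to 2x waste":

```c
#include <assert.h>

/* Round a request up to the next power of two: the size class it
 * would be served from. A lookup in a per-class freelist is O(1). */
static unsigned
roundpow2(unsigned n)
{
	unsigned p = 1;
	while (p < n)
		p <<= 1;
	return p;
}

/* Worst-case space waste: a request of 2^k + 1 bytes burns almost
 * half of its 2^(k+1)-byte block, so the factor approaches 2. */
static double
wastefactor(unsigned n)
{
	return (double)roundpow2(n) / n;
}
```

A 33-byte request lands in the 64-byte class, a waste factor of about 1.94, which is where the "up to a factor of two" bound comes from.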

Lots of plan9 code is also buggy, or at least racy: consider the seemingly
random-valued timeouts to "give other threads 5 seconds to get out" in
ipselffree() and iplinkfree() before "deallocating" an Iplink/Ipself.
Something like RCU, even a naive RCU, would be more robust here,
particularly under heavy load. Device drivers are atrophied and often
buggy, or at least susceptible to hardware bugs that are fixed by the
vendor-provided drivers. When I put in the plan9 networks to support Akaros
development, we ran into a bug in the i218 ethernet controller that caused
the NIC to wedge. We got Geoff Collyer to fix the i82563 driver and we sent
a patch to 9legacy, but it's symptomatic of an aging code base with a
shrinking developer population.

> If we could eliminate most of that copying, things would get a lot faster.
>
> Which things would get faster?
>

Presumably bulk data transfer between devices and the user portion of an
address space. If copying were eliminated (or just reduced) these would
certainly get fast*er*. Whether they would be sufficiently faster as to
make a perceptible difference to a real workload is another
matter.

> Dealing with the security issues isn't trivial
>
> what security issues?
>

Presumably the bread-and-butter security issues that arise whenever the
user portion of an address space is being concurrently accessed by
hardware. As a trivial example, imagine scheduling a DMA transfer from some
device into a buffer in the user portion of an address space and then
exit()'ing the process. What do you do with the pages the device was
writing into? They had better stay pinned until the IO operation
completes; only then can they be reallocated to something else that
isn't expecting to be clobbered.
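The pinning rule in the paragraph above amounts to keeping a separate in-flight count alongside the ordinary reference count. A minimal sketch, with hypothetical names (no real kernel's page structure is this small):

```c
#include <assert.h>

/* A page with DMA in flight must not be reallocated, even if the
 * owning process has already exited and dropped its VM reference. */
struct page {
	int refs;	/* VM references (mappings) */
	int dmapins;	/* in-flight device I/O against this page */
};

void dma_start(struct page *p) { p->dmapins++; }
void dma_done(struct page *p)  { p->dmapins--; }

/* exit() may drop refs to zero, but the frame is only free for
 * reuse once no device is still writing into it. */
int
page_reusable(struct page *p)
{
	return p->refs == 0 && p->dmapins == 0;
}
```

The real problem is of course the plumbing: every driver path that programs a device with a user address has to take the pin, and every completion path has to release it, or the race reappears.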

I wouldn't be surprised if the raft of currently popular speculative
execution bugs could be exacerbated by the kernel playing around with data
in the user address space in a naive way. It doesn't look like plan9 has
any serious mitigations for those.

        - Dan C.

[-- Attachment #2: Type: text/html, Size: 5200 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 11:58               ` Ethan Gardener
  2018-10-09 13:59                 ` erik quanstrom
@ 2018-10-09 22:22                 ` Digby R.S. Tarvin
  2018-10-10 10:38                   ` Ethan Gardener
  1 sibling, 1 reply; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-09 22:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 4379 bytes --]

On Tue, 9 Oct 2018 at 23:00, Ethan Gardener <eekee57@fastmail.fm> wrote:

>
> Fascinating thread, but I think you're off by a decade with the 16-bit
> address bus comment, unless you're not actually talking about Plan 9.  The
> 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979
> respectively.  The IBM PC, launched in 1982, had its ROM at the top of that
> 1MByte space, so it couldn't have been constrained in that way.  By the end
> of the 80s, all my schoolmates had 68k-powered computers from Commodore and
> Atari, showing hardware with a 24-bit address space was very much
> affordable and ubiquitous at the time Plan 9 development started.  Almost
> all of them had 512KB at the time.  A few flashy gits had 1MB machines. :)
>

Not sure I would agree with that. The 20 bit addressing of the 8086 and
8088 did not change their 16 bit nature. They were still 16 bit program
counter, with segmentation to provide access to a larger memory - similar
in principle to the PDP11 with MMU.

The first 32 bit x86 processor was the 386, which I think came out in 1985,
very close to when work on Plan9 was rumored to have started. So it seemed
not impossible that work might have started on an older 16 bit machine,
but at Bell Labs that was probably a long shot.


> I still wish I'd kept the better of the Atari STs which made their way
> down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the
> earlier "STFM" models.  I remember wanting to try to run Plan 9 on it.
> Let's estimate how tight it would be...
>
> I think it would be terrible, because I got frustrated enough trying to
> run a 4e CPU server with graphics on a 2GB x86.  I kept running out of
> image memory!  The trouble was the draw device in 4th edition stores images
> in the same "image memory" the kernel loads programs into, and the 386 CPU
> kernel 'only' allocates 64MB of that. :)
>
> 1 bit per pixel would obviously improve matters by a factor of 16 compared
> to my setup, and 640x400 (Atari ST high resolution) would be another 5
> times smaller than my screen.  Putting these numbers together with my
> experience, you'd have to be careful to use images sparingly on a machine
> with 800KB free RAM after the kernel is loaded.  That's better than I
> thought, probably achievable on that Atari I had, but it couldn't be used
> as intensively as I used Plan 9 back then.
>
> How could it be used?  I think it would be a good idea to push the draw
> device back to user space and make very sure to have it check for failing
> malloc!  I certainly wouldn't want a terminal with a filesystem and
> graphics all on a single 1MByte 68000-powered computer, because a
> filesystem on a terminal runs in user space, and thus requires some free
> memory to run the programs to shut it down.  Actually, Plan 9's separation
> of terminal from filesystem seems quite the obvious choice when I look at
> it like this. :)
>

I went Commodore Amiga at about that time - because it at least supported
some form of multi-tasking out of the box, and I spent many happy hours
getting OS9 running on it. An interesting architecture, capable of some
impressive graphics, but subject to quite severe limitations which made
general purpose graphics difficult. (Commodore later released SVR4 Unix for
the A3000, but limited X11 to monochrome when using the inbuilt graphics).

But being 32 bit didn't give it a huge advantage over the 16 bit x86
systems for tinkering with operating system, because the 68000 had no MMU.
It was easier to get a Unix like system going with 16 bit segmentation than
a 32 bit linear space and no hardware support for run time relocation.
(OS9 used position independent code throughout to work without an MMU, but
didn't try to implement fork() semantics).

It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really
did everything I wanted. The 68020 with an optional MMU was equivalent, but
not so common in consumer machines.

Hardware progress seems to have been rather uninteresting since then. Sure,
hardware is *much* faster and *much* bigger, but fundamentally the same
architecture. Intel had a brief flirtation with a novel architecture with
the iAPX 432 in '81, but evidently found it more profitable to make the
familiar architecture bigger and faster.

[-- Attachment #2: Type: text/html, Size: 4918 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:23                   ` Lyndon Nerenberg
  2018-10-09 19:34                     ` hiro
@ 2018-10-09 22:06                     ` erik quanstrom
  2018-10-10  6:24                       ` Bakul Shah
  1 sibling, 1 reply; 104+ messages in thread
From: erik quanstrom @ 2018-10-09 22:06 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/html, Size: 205 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:14                   ` Lyndon Nerenberg
@ 2018-10-09 22:05                     ` erik quanstrom
  2018-10-11 17:54                       ` Lyndon Nerenberg
  2018-10-10 10:42                     ` Ethan Gardener
  1 sibling, 1 reply; 104+ messages in thread
From: erik quanstrom @ 2018-10-09 22:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/html, Size: 219 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:47 cinap_lenrek
@ 2018-10-09 22:01 ` erik quanstrom
  2018-10-09 23:43 ` Lyndon Nerenberg
  1 sibling, 0 replies; 104+ messages in thread
From: erik quanstrom @ 2018-10-09 22:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/html, Size: 156 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:49 cinap_lenrek
@ 2018-10-09 19:56 ` hiro
  0 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-09 19:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

he has ignored my questions about measurement, so i'm sure he hasn't

On 10/9/18, cinap_lenrek@felloff.net <cinap_lenrek@felloff.net> wrote:
> also, i wonder how much is the actual copy overhead you claim is the issue.
> maybe the impact for copying is more dominated by the memory allocator used
> for allocb(). have you measured?
>
> --
> cinap
>
>



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-09 19:49 cinap_lenrek
  2018-10-09 19:56 ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: cinap_lenrek @ 2018-10-09 19:49 UTC (permalink / raw)
  To: 9fans

also, i wonder how much is the actual copy overhead you claim is the issue.
maybe the impact for copying is more dominated by the memory allocator used
for allocb(). have you measured?

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
@ 2018-10-09 19:47 cinap_lenrek
  2018-10-09 22:01 ` erik quanstrom
  2018-10-09 23:43 ` Lyndon Nerenberg
  0 siblings, 2 replies; 104+ messages in thread
From: cinap_lenrek @ 2018-10-09 19:47 UTC (permalink / raw)
  To: 9fans

> The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> Four copies.

that sounds wrong.

usbd is not involved in the data transfer. it is mainly responsible for
enumerating devices, instantiating drivers and registering the endpoints
in devusb. after that you access the endpoint files from devusb, which go
directly to the kernel. devusb also allows you to create an alias for an
endpoint file which then appears directly under /dev. usb audio uses this
mechanism. the usb driver just activates the device and provides the ctl/volume
files, while audio data is handled by the kernel's devusb.

on another remark regarding zero copy: the reason plan9 drivers are small comes
from NOT doing these "optimizations". identity mapping the low part of memory
in the kernel avoids a lot of trouble and allows you to get DMA capable memory
by just wrapping a pointer in PADDR(va). no page lists needed. no MMU tricks
needed in the drivers. you can use any kernel memory va for DMA... even your
kernel stack! it is never paged out. you can be sure it is not changed while the
device looks at it etc. do not underestimate the impact of this "simplification".
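The identity-mapping arithmetic being described reduces to pointer offsets. A sketch, with an illustrative KZERO constant (the real value differs per kernel and architecture):

```c
#include <assert.h>
#include <stdint.h>

/* With low physical memory mapped at a fixed kernel base, the
 * va<->pa conversion is pure arithmetic: no page-table walk, no
 * scatter/gather list, so any kernel va (even a stack buffer) can
 * be handed to a DMA engine directly. */
#define KZERO 0xF0000000u	/* illustrative, not a real kernel's value */

uint32_t paddr(uint32_t va) { return va - KZERO; }
uint32_t kaddr(uint32_t pa) { return pa + KZERO; }
```

The cost of this simplicity is that the kernel can only hand out DMA memory from the identity-mapped region, which is exactly the constraint the "optimized" zero-copy schemes try to escape, at the price of page lists and pinning machinery.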

linux block layer is broken in that regard btw. it just hands user pages into
the drivers without making sure they do not change while the i/o is in flight,
which results in all kinds of false-negatives when you actually start verifying
your raid arrays as different snapshots in time got written out to the raid
members. they know about this and ignore it because benchmarks are more important.

--
cinap



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:34                     ` hiro
  2018-10-09 19:36                       ` hiro
@ 2018-10-09 19:40                       ` Lyndon Nerenberg
  2018-10-10  0:18                       ` Dan Cross
  2 siblings, 0 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 19:40 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:
> from what i see in linux people have been more than just exploring it,
> they've gone absolutely nuts. it makes everything complex, not just
> the fast path.

And those are the Linux folks doing their thing.  The reading I'm
doing right now is related to the pessimizations page flipping inflicts
on the CPU caches.  It looks scary ...


--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:34                     ` hiro
@ 2018-10-09 19:36                       ` hiro
  2018-10-09 19:40                       ` Lyndon Nerenberg
  2018-10-10  0:18                       ` Dan Cross
  2 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-09 19:36 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

also, if all you care about is throughput, i don't see how those 4
copies you identified makes a difference. especially with something
slow like USB.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:23                   ` Lyndon Nerenberg
@ 2018-10-09 19:34                     ` hiro
  2018-10-09 19:36                       ` hiro
                                         ` (2 more replies)
  2018-10-09 22:06                     ` erik quanstrom
  1 sibling, 3 replies; 104+ messages in thread
From: hiro @ 2018-10-09 19:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

from what i see in linux people have been more than just exploring it,
they've gone absolutely nuts. it makes everything complex, not just
the fast path.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 19:09                 ` Bakul Shah
@ 2018-10-09 19:30                   ` Lyndon Nerenberg
  0 siblings, 0 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 19:30 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Bakul Shah writes:

And funny you should mention this!

> Some of this process/memory management can be delegated to
> user code as well.

At $DAYJOB we would really like to have application process control
over the kernel scheduler, as this seems to be the only realistic
way to avoid the (kernel) resource starvation issues we run into.

Our back end servers don't go down often.  But when they do, it's for
reasons entirely out of our control.  Because those resource allocation
policies have been pushed into the kernel, and beyond our control.


--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 18:49                 ` hiro
  2018-10-09 19:14                   ` Lyndon Nerenberg
@ 2018-10-09 19:23                   ` Lyndon Nerenberg
  2018-10-09 19:34                     ` hiro
  2018-10-09 22:06                     ` erik quanstrom
  2018-10-09 22:42                   ` Dan Cross
  2 siblings, 2 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 19:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:

> > Dealing with the security issues isn't trivial

> what security issues?

Passing protocol buffer like objects around user space, that might
affect how the kernel talks to hardware.  E.g. IPsec offload into
hardware.  You don't want user-space messing with that sort of
context, but you want to tag it with the data buffer as it gets
passed up and down through the user/kernel gate.  Practical page
flipping needs a kernel-read-only context attached to the non-kernel
user data part of the page.  A quick solution is to pair pages, one
half of which the kernel owns, the other being the data payload.  But
that's just a start.  And that's all I'm saying: this might be an
approach to a better/faster I/O paradigm, but it needs interested
people to explore it ...
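The paired-page idea above can be sketched as a layout; everything here is hypothetical (names, field choices), just making the "kernel-read-only context attached to the user data" concrete:

```c
#include <assert.h>
#include <stdint.h>

enum { PGSIZE = 4096 };

/* One flipped data page travels with a companion control region the
 * kernel can write but user space may only read, so offload state
 * (e.g. an IPsec context handle) cannot be forged by the application.
 * In a real system the two halves would be separate pages with
 * different MMU permissions; a struct stands in for that here. */
struct iopair {
	struct {			/* kernel-rw, user-ro */
		uint32_t flags;		/* e.g. "checksum done in hardware" */
		uint32_t proto_ctx;	/* opaque offload handle */
	} ctl;
	unsigned char data[PGSIZE];	/* user-rw payload */
};

/* Only a kernel entry point sets the control half. */
void
kernel_tag(struct iopair *p, uint32_t flags, uint32_t ctx)
{
	p->ctl.flags = flags;
	p->ctl.proto_ctx = ctx;
}
```

The enforcement, of course, is not in the struct but in the page protections the kernel applies when it flips the pair into the process's address space.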


--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 18:49                 ` hiro
@ 2018-10-09 19:14                   ` Lyndon Nerenberg
  2018-10-09 22:05                     ` erik quanstrom
  2018-10-10 10:42                     ` Ethan Gardener
  2018-10-09 19:23                   ` Lyndon Nerenberg
  2018-10-09 22:42                   ` Dan Cross
  2 siblings, 2 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 19:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

hiro writes:

> Huh? What exactly do you mean? Can you describe the scenario and the
> measurements you made?

The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
Four copies.

I would like to start playing with software defined radio on Plan
9, but that amount of data copying is going to put a lot of pressure
on the kernel to keep up.  UNIX/Linux suffers the same copy bloat,
and it's having trouble keeping up, too.

--lyndon



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 17:45               ` Lyndon Nerenberg
  2018-10-09 18:49                 ` hiro
@ 2018-10-09 19:09                 ` Bakul Shah
  2018-10-09 19:30                   ` Lyndon Nerenberg
  1 sibling, 1 reply; 104+ messages in thread
From: Bakul Shah @ 2018-10-09 19:09 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

On Tue, 09 Oct 2018 10:45:37 -0700 Lyndon Nerenberg <lyndon@orthanc.ca> wrote:
Lyndon Nerenberg writes:
> Bakul Shah writes:
>
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.  It is already half way there -- it is basically a
> > mux for 9p calls, low level device drivers, VM support & some
> > process related code.
>
> Somewhat related to this ... after reading some papers on
> TCP-in-user-space implementations, I've been thinking about how an
> interface that supported fast/secure page flipping between the
> kernel and process address space would change how we do things.
>
> E.g. right now Plan 9 suffers from a *lot* of data copying between
> the kernel and processes, and between processes themselves.  If we
> could eliminate most of that copying, things would get a lot faster.
> Dealing with the security issues isn't trivial, but the programmer
> time going into eking out the last bit of I/O throughput of the
> current scheme could be redirected.

Funny you say this. I wrote I wanted memory mapping to avoid
having to copy data multiple times but then deleted it,
thinking it would detract from the main point.

Actually I want this even without any major redesign!

> If it works, this would reduce the kernel back to handling
> process/memory management, and talking to the hardware.  Not a
> micro-kernel, but just as good from a practical standpoint.

Some of this process/memory management can be delegated to
user code as well.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 17:50                   ` Bakul Shah
@ 2018-10-09 18:57                     ` Ori Bernstein
  0 siblings, 0 replies; 104+ messages in thread
From: Ori Bernstein @ 2018-10-09 18:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, 9 Oct 2018 10:50:08 -0700
Bakul Shah <bakul@bitblocks.com> wrote:

> Exactly! No point in being scared by labels! I am really
> only talking about distilling plan9 further. At least as a
> thought experiment.
> 
> Isn’t it more fun to discuss this than all the “heavy
> negativity”? :-)

It's much better with patches.

-- 
Ori Bernstein <ori@eigenstate.org>



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 17:45               ` Lyndon Nerenberg
@ 2018-10-09 18:49                 ` hiro
  2018-10-09 19:14                   ` Lyndon Nerenberg
                                     ` (2 more replies)
  2018-10-09 19:09                 ` Bakul Shah
  1 sibling, 3 replies; 104+ messages in thread
From: hiro @ 2018-10-09 18:49 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> E.g. right now Plan 9 suffers from a *lot* of data copying between
> the kernel and processes, and between processes themselves.

Huh? What exactly do you mean? Can you describe the scenario and the
measurements you made?

> If we could eliminate most of that copying, things would get a lot faster.

Which things would get faster?

> Dealing with the security issues isn't trivial

what security issues?



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  9:45                 ` Ethan Gardener
@ 2018-10-09 17:50                   ` Bakul Shah
  2018-10-09 18:57                     ` Ori Bernstein
  0 siblings, 1 reply; 104+ messages in thread
From: Bakul Shah @ 2018-10-09 17:50 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 542 bytes --]

> On Oct 9, 2018, at 2:45 AM, Ethan Gardener <eekee57@fastmail.fm> wrote:
> 
> One day, Uriel met a man who explained very 
> convincingly that the Plan 9 kernel is a microkernel.
> On another day, Uriel met a man who explained very 
> convincingly that the Plan 9 kernel is a macrokernel.
> Uriel was enlightened.

Exactly! No point in being scared by labels! I am really
only talking about distilling plan9 further. At least as a
thought experiment.

Isn’t it more fun to discuss this than all the “heavy
negativity”? :-)

[-- Attachment #2: Type: text/html, Size: 1273 bytes --]

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  0:14             ` Bakul Shah
  2018-10-09  1:34               ` Christopher Nielsen
  2018-10-09  3:28               ` Lucio De Re
@ 2018-10-09 17:45               ` Lyndon Nerenberg
  2018-10-09 18:49                 ` hiro
  2018-10-09 19:09                 ` Bakul Shah
  2 siblings, 2 replies; 104+ messages in thread
From: Lyndon Nerenberg @ 2018-10-09 17:45 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Lyndon Nerenberg

Bakul Shah writes:

> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.

Somewhat related to this ... after reading some papers on
TCP-in-user-space implementations, I've been thinking about how an
interface that supported fast/secure page flipping between the
kernel and process address space would change how we do things.

E.g. right now Plan 9 suffers from a *lot* of data copying between
the kernel and processes, and between processes themselves.  If we
could eliminate most of that copying, things would get a lot faster.
Dealing with the security issues isn't trivial, but the programmer
time going into eking out the last bit of I/O throughput of the
current scheme could be redirected.

If it works, this would reduce the kernel back to handling
process/memory management, and talking to the hardware.  Not a
micro-kernel, but just as good from a practical standpoint.

And no, this wouldn't get us to running on the 11/70.  But by taking
advantage of modern large virtual memory spaces by using page
flipping, we could cut down on physical memory usage in the kernel.


--lyndon




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  3:08             ` Digby R.S. Tarvin
  2018-10-09 11:58               ` Ethan Gardener
@ 2018-10-09 14:02               ` erik quanstrom
  1 sibling, 0 replies; 104+ messages in thread
From: erik quanstrom @ 2018-10-09 14:02 UTC (permalink / raw)
  To: 9fans

>  From what I recall, PDP11 hardware memory management was based on
> segmentation rather than paging (64K divided into 16 variable sized
> segments), and Unix did swapping rather than paging (a process is either
> completely in memory or completely on disk). It does relocation and

completely in memory /and running/, or swapped out.

- erik




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09 11:58               ` Ethan Gardener
@ 2018-10-09 13:59                 ` erik quanstrom
  2018-10-09 22:22                 ` Digby R.S. Tarvin
  1 sibling, 0 replies; 104+ messages in thread
From: erik quanstrom @ 2018-10-09 13:59 UTC (permalink / raw)
  To: 9fans

> I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86.  I kept running out of image memory!  The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :)

this was changed long ago.  image memory can now be much bigger.  i never had a problem when a 4e terminal
was my daily driver.

- erik




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  3:08             ` Digby R.S. Tarvin
@ 2018-10-09 11:58               ` Ethan Gardener
  2018-10-09 13:59                 ` erik quanstrom
  2018-10-09 22:22                 ` Digby R.S. Tarvin
  2018-10-09 14:02               ` erik quanstrom
  1 sibling, 2 replies; 104+ messages in thread
From: Ethan Gardener @ 2018-10-09 11:58 UTC (permalink / raw)
  To: 9fans

On Tue, Oct 9, 2018, at 4:08 AM, Digby R.S. Tarvin wrote:
> I thought there might have been a chance of an early attempt to target the x86 because of its ubiquity and low cost - which could be useful for a networked operating system. And those were 16 bit address constrained in the early days. But it's probably not an architecture you would choose to work with if you had a choice.. 68K is what I would have gone for..

Fascinating thread, but I think you're off by a decade with the 16-bit address bus comment, unless you're not actually talking about Plan 9.  The 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 respectively.  The IBM PC, launched in 1982, had its ROM at the top of that 1MByte space, so it couldn't have been constrained in that way.  By the end of the 80s, all my schoolmates had 68k-powered computers from Commodore and Atari, showing hardware with a 24-bit address space was very much affordable and ubiquitous at the time Plan 9 development started.  Almost all of them had 512KB at the time.  A few flashy gits had 1MB machines. :)

I still wish I'd kept the better of the Atari STs which made their way down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier "STFM" models.  I remember wanting to try to run Plan 9 on it.  Let's estimate how tight it would be...

I think it would be terrible, because I got frustrated enough trying to run a 4e CPU server with graphics on a 2GB x86.  I kept running out of image memory!  The trouble was the draw device in 4th edition stores images in the same "image memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 64MB of that. :)

1 bit per pixel would obviously improve matters by a factor of 16 compared to my setup, and 640x400 (Atari ST high resolution) would be another 5 times smaller than my screen.  Putting these numbers together with my experience, you'd have to be careful to use images sparingly on a machine with 800KB free RAM after the kernel is loaded.  That's better than I thought, probably achievable on that Atari I had, but it couldn't be used as intensively as I used Plan 9 back then.

How could it be used?  I think it would be a good idea to push the draw device back to user space and make very sure to have it check for failing malloc!  I certainly wouldn't want a terminal with a filesystem and graphics all on a single 1MByte 68000-powered computer, because a filesystem on a terminal runs in user space, and thus requires some free memory to run the programs to shut it down.  Actually, Plan 9's separation of terminal from filesystem seems quite the obvious choice when I look at it like this. :)




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  3:28               ` Lucio De Re
  2018-10-09  8:23                 ` hiro
@ 2018-10-09  9:45                 ` Ethan Gardener
  2018-10-09 17:50                   ` Bakul Shah
  2018-10-10  7:32                 ` Giacomo Tesio
  2 siblings, 1 reply; 104+ messages in thread
From: Ethan Gardener @ 2018-10-09  9:45 UTC (permalink / raw)
  To: 9fans

On Tue, Oct 9, 2018, at 4:28 AM, Lucio De Re wrote:
> On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.
> >
> There are religious reasons not to go there

I'm trying to forget all the religious beliefs I once held with regard to computers, but I've had these lines in my head for a long time, and probably won't get a better opportunity to post them:

One day, Uriel met a man who explained very
convincingly that the Plan 9 kernel is a microkernel.
On another day, Uriel met a man who explained very
convincingly that the Plan 9 kernel is a macrokernel.
Uriel was enlightened.

Based on a true story. ;)


> You won't believe what kind of madnesses I need to deal with to
> consume my few and short remaining years - I'm with Dan in cursing the
> modern technological trends, but one of these days I'm going to lock
> myself in someone's attic or basement (or a prison cell, if that's
> what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
> Riff-box - is that really what this black object is called? - and
> build an OS from the accumulated wisdom of the last forty years. It
> will probably look more like MS-DOS, though! :-(

I've started already, but I keep getting sidetracked by my need for entertainment, which often comes down to spending my energies on things which don't require such deep design work.  I'm hoping it'll get easier as my health improves; I'm still too stressed too often.  The trouble with this stress is I forget my goals, which are things I've learned from Plan 9 and other conclusions I've come to.

--
Progress might have been all right once, but it has gone on too long -- Ogden Nash




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  3:28               ` Lucio De Re
@ 2018-10-09  8:23                 ` hiro
  2018-10-09  9:45                 ` Ethan Gardener
  2018-10-10  7:32                 ` Giacomo Tesio
  2 siblings, 0 replies; 104+ messages in thread
From: hiro @ 2018-10-09  8:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

we already have a lot of user filesystems. feel free to add other useful ones.




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  0:14             ` Bakul Shah
  2018-10-09  1:34               ` Christopher Nielsen
@ 2018-10-09  3:28               ` Lucio De Re
  2018-10-09  8:23                 ` hiro
                                   ` (2 more replies)
  2018-10-09 17:45               ` Lyndon Nerenberg
  2 siblings, 3 replies; 104+ messages in thread
From: Lucio De Re @ 2018-10-09  3:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On 10/9/18, Bakul Shah <bakul@bitblocks.com> wrote:
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.  Such a redesign can be made more secure
> and more resilient.  The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
There are religious reasons not to go there and, perhaps not very
widely advertised, Minix-3 already does that, although I confess that
all my best efforts have not yet created the space for my own
experimentation with it.

You won't believe what kind of madnesses I need to deal with to
consume my few and short remaining years - I'm with Dan in cursing the
modern technological trends, but one of these days I'm going to lock
myself in someone's attic or basement (or a prison cell, if that's
what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
Riff-box - is that really what this black object is called? - and
build an OS from the accumulated wisdom of the last forty years. It
will probably look more like MS-DOS, though! :-(

> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons.  This is unlikely to happen (without some serious
> funding) but still fun to think about!  If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>
Surely, the targets for experimentation should be the ubiquitous
smart-mobile and the insane arithmetic power of GPUs? All neatly
networked over SDLC (or HDLC: AoH, anyone, for persistent storage?).

Lucio.




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08 23:03           ` Dan Cross
  2018-10-09  0:14             ` Bakul Shah
@ 2018-10-09  3:08             ` Digby R.S. Tarvin
  2018-10-09 11:58               ` Ethan Gardener
  2018-10-09 14:02               ` erik quanstrom
  1 sibling, 2 replies; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-09  3:08 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Tue, 9 Oct 2018 at 10:07, Dan Cross <crossd@gmail.com> wrote:

>> My guess is that there is no reason in principle that it could not fit
>> comfortably into the constraints of a PDP11/70, but if the initial
>> implementation was done targeting a machine with significantly more
>> resources, it would be easy to make design decisions that would be entirely
>> incompatible.
>>
>
> I find this unlikely.
>
> The PDP-11, while a respectable machine for its day, required too many
> tradeoffs to make it attractive as a development platform for a
> next-generation research operating system in the late 1980s: be it
> electrical power consumption vs computational oomph or dollar cost vs
> available memory, the -11 had fallen from the attractive position it held a
> decade prior. Perhaps slimming a plan9 kernel down sufficiently so that it
> COULD run on a PDP-11 was possible in the early days, but I can't see any
> reason one would have WANTED to do so: particularly as part of the impetus
> behind plan9 was to exploit advances in contemporary hardware: lower-cost,
> higher-performance, RISC-based multiprocessors; ubiquitous networking;
> common high-resolution bitmapped graphical displays; even magneto-optical
> storage (one bet that didn't pan out); etc.
>

If you mean that you find it unlikely that that development would have
been done on a PDP11, then I agree, for the reasons you mentioned.

Not sure that I can see why it wouldn't  have been feasible, but I can see
why it wouldn't have been desirable.

I thought there might have been a chance of an early attempt to target the
x86 because of its ubiquity and low cost - which could be useful for a
networked operating system. And those were 16 bit address constrained in
the early days. But it's probably not an architecture you would choose to
work with if you had a choice.. 68K is what I would have gone for..


>> Certainly Richard Millar's comment suggests that might be the case. If it
>> is heavily dependent on VM, then the necessary rewrite is likely to be
>> substantial.
>>
>
> As a demonstration project, getting a slimmed-down plan9 kernel to boot on
> a PDP-11/70-class machine would be a nifty hack, but it would be quite a
> tour de force and most likely the result would not be generally useful. I
> think that, as has been suggested, the conceptual simplicity of plan9
> paradoxically means that resource utilization is higher than it might
> otherwise be on either a more elaborate OR more constrained system (such as
> one targeting e.g. the PDP-11). When you can afford not to care about a few
> bytes here or a couple of cycles there and you're not obsessed with
> scraping out the very last drop of performance, you can employ a simpler
> (some might say 'naive') algorithm or data structure.
>
>> I'm not sure how the kernel design has changed since the first release.
>> The earliest version I have is the release I bought through Harcourt Brace
>> back in 1995. But I won't be home till December so it will be a while
>> before I can look at it, and probably won't have time to experiment before
>> then in any case.
>>
>
> The kernel evolved substantially over its life; something like doubling in
> size. I remember vaguely having a discussion with Sape where he said he
> felt it had grown bloated. That was probably close to 20 years ago now.
>

I guess kernel size wasn't a priority. I did a bit of searching back
through the old papers, and whilst there is a lot of talk about lines of
code and numbers of system calls, I didn't find any reference to kernel
size or memory requirements.


>> For what it is worth, I don't think the embarrassment of riches presented
>> to programmers by current hardware has tended to produce more elegant
>> designs. If more resources resulted in elegance, Windows would be a thing
>> of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
>> and design simplicity, even if it does turn out to be more resource hungry
>> than I imagined. I will admit that the evils of excessively constrained
>> environments are generally worse in terms of coding elegance - especially
>> when it leads to overlays and self modifying code.
>>
>
> plan9 is breathtakingly elegant, but this is in no small part because as a
> research system it had the luxury of simply ignoring many thorny problems
> that would have marred that beauty but that the developers chose not to
> tackle. Some of these problems have non-trivial domain complexity and,
> while "modern" systems are far too complex by far, that doesn't mean that
> all solutions can be recast as elegantly simple pearls in the plan9 style.
> Whether we like those problems or not, they exist and real-world solutions
> have to at least attempt to deal with them (I'm looking at you, web x.0 for
> x >= 2...but curse you you aren't alone).
>
>> PDP11's don't support virtual memory, so there doesn't seem any elegant
>> way to overcome that fundamental limitation on size of a single executable.
>>
>
> No, they do: there is paging hardware on the PDP-11 that's used for
> address translation and memory protection (recall that PDP-11 kept the
> kernel at the top of the address space, the per-process "user" structure is
> at a fixed virtual address, and the system could trap a bus error and kill
> a misbehaving user-space process). What they may not support is the sort of
> trap handling that would let them recover from a page fault (though I
> haven't looked) and in any case, the address space is too small to make
> demand-paging with reclamation cost-effective.
>

From what I recall, PDP11 hardware memory management was based on
segmentation rather than paging (64K divided into 16 variable sized
segments), and Unix did swapping rather than paging (a process is either
completely in memory or completely on disk). It does relocation and
protection, but I think it was limited in its ability to restart trapped
instructions. I suppose there was a sort of embryonic virtual memory in the
way the process stack was able to expand dynamically. I believe that was
handled by generating a trap when an address 'near' the top of the stack
was accessed. The instruction could complete, then a trap would allow the
operating system to add a bit more memory to the stack before returning to
allow the user process to continue.

Unix kept the 'per process data area' at the top of the 'process image' in
physical memory, but it was not mapped into the process address space. It
was, however. mapped into a fixed address in kernel space whenever the
corresponding process was the 'current' process.

But you are right - even if it could do virtual memory/paging, it wouldn't
help much because of limited size of the virtual address space.


>> So I don't think it would be worth a substantial rewrite to get it
>> going. It is a shame that there don't seem to have been any more powerful
>> machines with a comparably elegant architecture and attractive front panel
>> :)
>>
>
> An attractive front panel for nearly any machine is just a soldering iron,
> LEDs and some logic chips away. As far as elegant architectures, some are
> very nice: MIPS is kind of retro but elegant, RISC-V is nice, 680x0
> machines can be had at reasonable prices, and POWER is kind of cool. I know
> I shouldn't, but I have a soft spot for ARM.
>

I have thought about it, but there are a couple of problems (in addition to
my lack of artistic talent when it comes to building physically attractive
enclosures)..  One is the sheer number of LEDs required to display all of
the address and data lines in a modern architecture.  Mainly an issue if I
want to use the old PDP11/70 front panel that I had saved for the purpose,
I suppose. The other problem is getting access to the all of the machine
state that was displayable on a mini computer console. Virtual addresses,
User/Kernel mode, register contents etc are all hard to get at. I have
toyed with using JTAG etc, but there always seems to be something that I
can't get to. So it is hard to do more than resort to a software controlled
front panel. I used to have a little box of LEDs and switches that I
plugged into the parallel port on PCs, and had my BSDi kernel modified to
update it as part of the clock interrupt. But now the parallel ports are
becoming rare and you can't update LEDs connected via USB in a single
instruction... :-/

Oh, and sure, you can find reasonable architectures. I spent a long time
using 680x0 after learning on PDP11s, and found it equally comfortable for
those occasions when I had to work in assembly language (I dread having to
disassemble Intel binaries). It's just that packaging is so uninteresting
now, it gives no indication of what is going on inside. Even lights on the
hard drives when they were externally visible gave some signs of what was
going on. Especially when there were different lights for read and write..

>> It is sounding like Inferno is going to be the more practical option. I
>> believe gcc can still generate PDP-11 code, so it shouldn't be too hard to
>> try.
>>
>
> Sounds like a nifty hack. Fitting Dis into a 64k/64k split I/D space is
> the challenge.
>

 Yes indeed. I guess that is something I can try to find the time to test
in the short term. It should provide a pretty good indication of viability.

Regards,
DigbyT



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-09  0:14             ` Bakul Shah
@ 2018-10-09  1:34               ` Christopher Nielsen
  2018-10-09  3:28               ` Lucio De Re
  2018-10-09 17:45               ` Lyndon Nerenberg
  2 siblings, 0 replies; 104+ messages in thread
From: Christopher Nielsen @ 2018-10-09  1:34 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Mon, Oct 8, 2018, 17:15 Bakul Shah <bakul@bitblocks.com> wrote:

> On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross <crossd@gmail.com> wrote:
> >
> > plan9 is breathtakingly elegant, but this is in no small part because as a
> > research system it had the luxury of simply ignoring many thorny problems
> > that would have marred that beauty but that the developers chose not to
> > tackle. Some of these problems have non-trivial domain complexity and,
> > while "modern" systems are far too complex by far, that doesn't mean that
> > all solutions can be recast as elegantly simple pearls in the plan9 style.
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.  Such a redesign can be made more secure
> and more resilient.  The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons.  This is unlikely to happen (without some serious
> funding) but still fun to think about!  If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>

I've mused about that also. My problem has been finding the time. I think
it would be a worthwhile project.

Not entirely unrelated, I've been tinkering with seL4.

>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08 23:03           ` Dan Cross
@ 2018-10-09  0:14             ` Bakul Shah
  2018-10-09  1:34               ` Christopher Nielsen
                                 ` (2 more replies)
  2018-10-09  3:08             ` Digby R.S. Tarvin
  1 sibling, 3 replies; 104+ messages in thread
From: Bakul Shah @ 2018-10-09  0:14 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross <crossd@gmail.com> wrote:
>
> plan9 is breathtakingly elegant, but this is in no small part because as a
> research system it had the luxury of simply ignoring many thorny problems
> that would have marred that beauty but that the developers chose not to
> tackle. Some of these problems have non-trivial domain complexity and,
> while "modern" systems are far too complex by far, that doesn't mean that
> all solutions can be recast as elegantly simple pearls in the plan9 style.

One thing I have mused about is recasting plan9 as a
microkernel and pushing out a lot of its kernel code into user
mode code.  It is already half way there -- it is basically a
mux for 9p calls, low level device drivers, VM support & some
process related code.  Such a redesign can be made more secure
and more resilient.  The kind of problems you mention are
easier to fix in user code. Different application domains may
have different needs which are better handled as optional user
mode components.

Said another way, keep the good parts of the plan9 design and
rearchitect/reimplement the kernel + essential drivers/usermode
daemons.  This is unlikely to happen (without some serious
funding) but still fun to think about!  If done, this would be
a more radical departure than Oberon-7 compared to Oberon but
in the same spirit.




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08 21:55         ` Digby R.S. Tarvin
@ 2018-10-08 23:03           ` Dan Cross
  2018-10-09  0:14             ` Bakul Shah
  2018-10-09  3:08             ` Digby R.S. Tarvin
  0 siblings, 2 replies; 104+ messages in thread
From: Dan Cross @ 2018-10-08 23:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


On Mon, Oct 8, 2018 at 6:25 PM Digby R.S. Tarvin <digbyt42@gmail.com> wrote:

> Does anyone know what platform Plan9 was initially implemented on?
>

My understanding is that the earliest experiments involved a VAX, but
development quickly shifted to MIPS and 68020-based machines (the "gnot"
was, IIRC, a 68020-based computer).

> My guess is that there is no reason in principle that it could not fit
> comfortably into the constraints of a PDP11/70, but if the initial
> implementation was done targeting a machine with significantly more
> resources, it would be easy to make design decisions that would be entirely
> incompatible.
>

I find this unlikely.

The PDP-11, while a respectable machine for its day, required too many
tradeoffs to make it attractive as a development platform for a
next-generation research operating system in the late 1980s: be it
electrical power consumption vs computational oomph or dollar cost vs
available memory, the -11 had fallen from the attractive position it held a
decade prior. Perhaps slimming a plan9 kernel down sufficiently so that it
COULD run on a PDP-11 was possible in the early days, but I can't see any
reason one would have WANTED to do so: particularly as part of the impetus
behind plan9 was to exploit advances in contemporary hardware: lower-cost,
higher-performance, RISC-based multiprocessors; ubiquitous networking;
common high-resolution bitmapped graphical displays; even magneto-optical
storage (one bet that didn't pan out); etc.

> Certainly Richard Millar's comment suggests that might be the case. If it
> is heavily dependent on VM, then the necessary rewrite is likely to be
> substantial.
>

As a demonstration project, getting a slimmed-down plan9 kernel to boot on
a PDP-11/70-class machine would be a nifty hack, but it would be quite a
tour de force and most likely the result would not be generally useful. I
think that, as has been suggested, the conceptual simplicity of plan9
paradoxically means that resource utilization is higher than it might
otherwise be on either a more elaborate OR more constrained system (such as
one targeting e.g. the PDP-11). When you can afford not to care about a few
bytes here or a couple of cycles there and you're not obsessed with
scraping out the very last drop of performance, you can employ a simpler
(some might say 'naive') algorithm or data structure.

> I'm not sure how the kernel design has changed since the first release. The
> earliest version I have is the release I bought through Harcourt Brace back
> in 1995. But I won't be home till December so it will be a while before I
> can look at it, and probably won't have time to experiment before then in
> any case.
>

The kernel evolved substantially over its life; something like doubling in
size. I remember vaguely having a discussion with Sape where he said he
felt it had grown bloated. That was probably close to 20 years ago now.

> For what it is worth, I don't think the embarrassment of riches presented
> to programmers by current hardware has tended to produce more elegant
> designs. If more resources resulted in elegance, Windows would be a thing
> of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
> and design simplicity, even if it does turn out to be more resource hungry
> than I imagined. I will admit that the evils of excessively constrained
> environments are generally worse in terms of coding elegance - especially
> when it leads to overlays and self modifying code.
>

plan9 is breathtakingly elegant, but this is in no small part because as a
research system it had the luxury of simply ignoring many thorny problems
that would have marred that beauty but that the developers chose not to
tackle. Some of these problems have non-trivial domain complexity and,
while "modern" systems are far too complex by far, that doesn't mean that
all solutions can be recast as elegantly simple pearls in the plan9 style.
Whether we like those problems or not, they exist and real-world solutions
have to at least attempt to deal with them (I'm looking at you, web x.0 for
x >= 2...but curse you you aren't alone).

> PDP11's don't support virtual memory, so there doesn't seem any elegant way
> to overcome that fundamental limitation on size of a single executable.
>

No, they do: there is paging hardware on the PDP-11 that's used for address
translation and memory protection (recall that PDP-11 kept the kernel at
the top of the address space, the per-process "user" structure is at a
fixed virtual address, and the system could trap a bus error and kill a
misbehaving user-space process). What they may not support is the sort of
trap handling that would let them recover from a page fault (though I
haven't looked) and in any case, the address space is too small to make
demand-paging with reclamation cost-effective.


> So I don't think it would be worth a substantial rewrite to get it
> going. It is a shame that there don't seem to have been any more powerful
> machines with a comparably elegant architecture and attractive front panel
> :)
>

An attractive front panel for nearly any machine is just a soldering iron,
LEDs and some logic chips away. As far as elegant architectures, some are
very nice: MIPS is kind of retro but elegant, RISC-V is nice, 680x0
machines can be had at reasonable prices, and POWER is kind of cool. I know
I shouldn't, but I have a soft spot for ARM.

> It is sounding like Inferno is going to be the more practical option. I
> believe gcc can still generate PDP-11 code, so it shouldn't be too hard to
> try.
>

Sounds like a nifty hack. Fitting Dis into a 64k/64k split I/D space is the
challenge.

        - Dan C.

On Tue, 9 Oct 2018 at 04:53, hiro <23hiro@gmail.com> wrote:
>
>> i should have said could, not can :)
>>
>>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08 17:20       ` hiro
@ 2018-10-08 21:55         ` Digby R.S. Tarvin
  2018-10-08 23:03           ` Dan Cross
  0 siblings, 1 reply; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08 21:55 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


Does anyone know what platform Plan9 was initially implemented on? My guess
is that there is no reason in principle that it could not fit comfortably
into the constraints of a PDP11/70, but if the initial implementation was
done targeting a machine with significantly more resources, it would be
easy to make design decisions that would be entirely incompatible.
Certainly Richard Millar's comment suggests that might be the case. If it
is heavily dependent on VM, then the necessary rewrite is likely to be
substantial.

I'm not sure how the kernel design has changed since the first release. The
earliest version I have is the release I bought through Harcourt Brace back
in 1995. But I won't be home till December so it will be a while before I
can look at it, and probably won't have time to experiment before then in
any case.

For what it is worth, I don't think the embarrassment of riches presented
to programmers by current hardware has tended to produce more elegant
designs. If more resources resulted in elegance, Windows would be a thing
of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
and design simplicity, even if it does turn out to be more resource hungry
than I imagined. I will admit that the evils of excessively constrained
environments are generally worse in terms of coding elegance - especially
when it leads to overlays and self modifying code.

PDP11s don't support demand-paged virtual memory, so there doesn't seem any
elegant way to overcome that fundamental limitation on the size of a single
executable. So I don't think it would be worth a substantial rewrite to get it
is a shame that there don't seem to have been any more powerful machines
with a comparably elegant architecture and attractive front panel :)

It is sounding like Inferno is going to be the more practical option. I
believe gcc can still generate PDP-11 code, so it shouldn't be too hard to
try.

DigbyT

On Tue, 9 Oct 2018 at 04:53, hiro <23hiro@gmail.com> wrote:

> i should have said could, not can :)
>
>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08 12:03     ` Charles Forsyth
@ 2018-10-08 17:20       ` hiro
  2018-10-08 21:55         ` Digby R.S. Tarvin
  0 siblings, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-08 17:20 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

i should have said could, not can :)




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08  7:20   ` hiro
@ 2018-10-08 12:03     ` Charles Forsyth
  2018-10-08 17:20       ` hiro
  0 siblings, 1 reply; 104+ messages in thread
From: Charles Forsyth @ 2018-10-08 12:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 246 bytes --]

Ideally, anyway.

On Mon, 8 Oct 2018 at 11:20, hiro <23hiro@gmail.com> wrote:

> saving every bit of memory has costs in coding, the pressure wasn't as
> strong any more.
> the earned flexibility can be used for more elegant design.
>
>



* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08  8:12   ` Nils M Holm
@ 2018-10-08  9:12     ` Digby R.S. Tarvin
  0 siblings, 0 replies; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08  9:12 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 3358 bytes --]

I quite agree - the PDP 11/70 was quite a high end 16 bit machine, but it
was the machine that I was talking about and the one I would most like to
revisit (although I wouldn't turn down an 11/40 if somebody offered me a
working one). I don't think I would contemplate putting Plan9 on a machine
with no MMU or a 64K physical memory limit.

My first reasonable multi-user, multi-tasking computer system (back in the
early 80s) was a home-made 6809 machine with a 6829 MMU and eventually 1MB of
RAM, running OS-9/6809. It initially ran with 64K for programs, and the
rest of memory was a big RAM disk - because what else could you do with
such a ridiculous amount of memory. It did pretty well at providing a
personal Unix-like environment, although it couldn't reproduce the fork()
semantics, there was no memory protection, and the memory constraints
meant always running the C compiler one pass at a time. But we eventually
ported 'Level 2' OS-9, which could use the mapping RAM/MMU, and with that I
had a quite robust multi-user system, with up to 64K available per process
and 64K available for the kernel. I was able to get most Unix programs
running on it (except for a few with big tables that compiled to larger
than 64K) and no longer had to worry about exiting the editor before doing
a compile. Most of the core system utilities were written in assembly
language - so the equivalent of 'ls', for example, required no more than a
256 byte memory allocation. And all executables were loaded read-only and
re-entrant (shared text), which helped. The only real Achilles heel was that
the 6809 had no illegal instruction trapping, so executing data could
occasionally result in an unrecoverable freeze.
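The fork() point above can be illustrated with a small sketch (hedged: this runs on a modern POSIX host, relying on exactly the MMU address translation and private per-process copies that the 6809 setup lacked):

```python
# Sketch: why fork() semantics need per-process address translation.
# After os.fork(), parent and child see the SAME virtual address, but
# each must get a private physical copy of the data behind it. With no
# address translation (only bank switching, as on the 6809), two
# processes cannot both keep the same address pointing at their own
# private data, so true fork() cannot be reproduced.
import os

data = bytearray(b"parent")
addr = id(data)                  # same numeric identity in both processes

pid = os.fork()
if pid == 0:                     # child: mutate its private copy
    data[:] = b"child!"
    assert id(data) == addr      # same address as seen by the parent
    os._exit(0)

os.waitpid(pid, 0)
assert bytes(data) == b"parent"  # child's write did not touch our copy
print("parent copy intact at", hex(addr))
```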

I never liked the 68K version of OS-9 quite as much. Because of the larger
address space it used the MMU for protection only, with no address
translation - so the kernel was mapped into the same address space as the
user programs, just not accessible in user mode. It just didn't seem as
elegant.

Anyway, that's why I don't see 64K per process as necessarily being
inadequate for a lean operating system, although it would be easy enough to
write extravagant code that would not run in 64K, or a design that relied
on a large virtual address space - especially if you were used to relying
on virtual memory. I just don't know how small Plan9 can go, and unless
someone has already explored those limits, I suppose rather than
speculating I'll just have to plan on a little experimentation when I get a
bit of spare time.

Regards,
Digby



On Mon, 8 Oct 2018 at 19:13, Nils M Holm <nmh@t3x.org> wrote:

> On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> > A native Inferno port would certainly be a lot easier, but I think you
> > might be a bit pessimistic about what can fit into a 64K address space
> > machine. The 11/70 certainly managed to run a very respectable V7 Unix
> > supporting 20-30 simultaneous active users in its day, [...]
>
> The 11/70 was a completely different beast than, say, an 11/03.
> The 70 had a backplane with 22 address lines, a MMU, and up to
> 4M bytes of memory. So while its processes were limited to
> 64K+64K bytes, I would not consider it to be a typical 16-bit
> machine.
>
> --
> Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org
>
>



* Re: [9fans] PDP11 (Was: Re:  what heavy negativity!)
  2018-10-08  4:29 ` Digby R.S. Tarvin
  2018-10-08  7:20   ` hiro
@ 2018-10-08  8:12   ` Nils M Holm
  2018-10-08  9:12     ` Digby R.S. Tarvin
  1 sibling, 1 reply; 104+ messages in thread
From: Nils M Holm @ 2018-10-08  8:12 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> A native Inferno port would certainly be a lot easier, but I think you
> might be a bit pessimistic about what can fit into a 64K address space
> machine. The 11/70 certainly managed to run a very respectable V7 Unix
> supporting 20-30 simultaneous active users in its day, [...]

The 11/70 was a completely different beast than, say, an 11/03.
The 70 had a backplane with 22 address lines, a MMU, and up to
4M bytes of memory. So while its processes were limited to
64K+64K bytes, I would not consider it to be a typical 16-bit
machine.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org




* Re: [9fans] PDP11 (Was: Re:  what heavy negativity!)
  2018-10-08  3:38 Lucio De Re
  2018-10-08  4:29 ` Digby R.S. Tarvin
@ 2018-10-08  8:09 ` Nils M Holm
  1 sibling, 0 replies; 104+ messages in thread
From: Nils M Holm @ 2018-10-08  8:09 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On 2018-10-08T05:38:07+0200, Lucio De Re wrote:
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
> Even MPM won't cut it, I don't think.

There were several UNIX 6th Edition-based "Mini Unix" variants
for the PDP-11/03 and other 16-bit systems. Then there is UZI,
the Unix Z80 Implementation, which can run multiple processes
(with swapping) in 64K bytes of RAM. CP/M ran in much less than
64KB.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org




* Re: [9fans] PDP11 (Was: Re: what heavy negativity!)
  2018-10-08  4:29 ` Digby R.S. Tarvin
@ 2018-10-08  7:20   ` hiro
  2018-10-08 12:03     ` Charles Forsyth
  2018-10-08  8:12   ` Nils M Holm
  1 sibling, 1 reply; 104+ messages in thread
From: hiro @ 2018-10-08  7:20 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

saving every bit of memory has costs in coding, the pressure wasn't as
strong any more.
the earned flexibility can be used for more elegant design.




* Re: [9fans] PDP11 (Was: Re:  what heavy negativity!)
  2018-10-08  3:38 Lucio De Re
@ 2018-10-08  4:29 ` Digby R.S. Tarvin
  2018-10-08  7:20   ` hiro
  2018-10-08  8:12   ` Nils M Holm
  2018-10-08  8:09 ` Nils M Holm
  1 sibling, 2 replies; 104+ messages in thread
From: Digby R.S. Tarvin @ 2018-10-08  4:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 2290 bytes --]

A native Inferno port would certainly be a lot easier, but I think you
might be a bit pessimistic about what can fit into a 64K address space
machine. The 11/70 certainly managed to run a very respectable V7 Unix
supporting 20-30 simultaneous active users in its day, and I wouldn't have
thought Plan 9, arriving about a decade later, would have been hugely
bigger than V7 Unix.
I recall a demo of Plan9 (I think it also included the source) being given
by Rob Pike at UNSW, which he carried on a 1.44MB floppy disc. By its open
source release in 2002 the distribution was 65MB.

The smallest Linux system I have used recently had 256K RAM and 512K flash.
A rather stripped down busybox based system, but it did include a full
TCP/IP stack and a web server. That's comparable to a PDP11 except for the
limitation on the largest individual process.

Bear in mind that 16 bit executables are smaller, and whilst the 11/70 had
a 64KB address space, physical memory could be somewhat larger, and an
individual process could have 128K of memory if using separate instruction
and data space.
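The figures in this paragraph reduce to a quick back-of-the-envelope calculation (a sketch, assuming the 16-bit virtual addresses and 22 physical address lines cited elsewhere in this thread):

```python
# Back-of-the-envelope numbers for the PDP-11/70 memory model
# discussed above: 64K per address space, doubled with split I/D,
# against a much larger physical memory behind the MMU.
KB = 1024
virtual = 2 ** 16          # 16-bit addresses -> 64 KB per space
split_id = 2 * virtual     # separate instruction + data spaces
physical = 2 ** 22         # 22 address lines on the 11/70 backplane

print(virtual // KB, "KB per address space")
print(split_id // KB, "KB per process with split I/D")
print(physical // (KB * KB), "MB physical memory maximum")
```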

I am used to thinking of Plan9 as very compact, but I haven't really looked
to see if it has grown much since the 80s, and perhaps it is only next to
the astronomical expansion of other systems that it still looks small. It
would be an interesting exercise to find out.

It would be an interesting thing to try, if only to get a better feel for
how compact Plan9 actually is ...

DigbyT

On Mon, 8 Oct 2018 at 14:38, Lucio De Re <lucio.dere@gmail.com> wrote:

> On 10/8/18, Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
> >
> > So the question is... is plan9 still lean and mean enough to fit onto a
> > machine with a 64K address space? Doing a port would certainly provide
> > plenty of opportunity to tinker with the lights and switches on front
> > panel, and if the port was initially limited to being a CPU server,
> > there would be no need to worry about displays and mass storage.... just
> > the compiler back end and low level kernel support.
> >
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CP/M.
> Even MP/M won't cut it, I don't think.
>
> Lucio.
>

[-- Attachment #2: Type: text/html, Size: 2769 bytes --]


* [9fans] PDP11 (Was: Re:  what heavy negativity!)
@ 2018-10-08  3:38 Lucio De Re
  2018-10-08  4:29 ` Digby R.S. Tarvin
  2018-10-08  8:09 ` Nils M Holm
  0 siblings, 2 replies; 104+ messages in thread
From: Lucio De Re @ 2018-10-08  3:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On 10/8/18, Digby R.S. Tarvin <digbyt42@gmail.com> wrote:
>
> So the question is... is plan9 still lean and mean enough to fit onto a
> machine with a 64K address space? Doing a port would certainly provide
> plenty of opportunity to tinker with the lights and switches on front
> panel, and if the port was initially limited to being a CPU server,
> there would be no need to worry about displays and mass storage.... just
> the compiler back end and low level kernel support.
>
You really must be thinking of Inferno, native, running in a host with
1MiB of memory. 64KiB isn't enough for anything other than maybe CP/M.
Even MP/M won't cut it, I don't think.

Lucio.




end of thread, other threads:[~2018-10-17 18:14 UTC | newest]

Thread overview: 104+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-10 17:34 [9fans] PDP11 (Was: Re: what heavy negativity!) cinap_lenrek
2018-10-10 21:54 ` Steven Stallion
2018-10-10 22:26   ` [9fans] zero copy & 9p (was " Bakul Shah
2018-10-10 22:52     ` Steven Stallion
2018-10-11 20:43     ` Lyndon Nerenberg
2018-10-11 22:28       ` hiro
2018-10-12  6:04       ` Ori Bernstein
2018-10-13 18:01         ` Charles Forsyth
2018-10-13 21:11           ` hiro
2018-10-14  5:25             ` FJ Ballesteros
2018-10-14  7:34               ` hiro
2018-10-14  7:38                 ` Francisco J Ballesteros
2018-10-14  8:00                   ` hiro
2018-10-15 16:48                     ` Charles Forsyth
2018-10-15 17:01                       ` hiro
2018-10-15 17:29                       ` hiro
2018-10-15 23:06                         ` Charles Forsyth
2018-10-16  0:09                       ` erik quanstrom
2018-10-17 18:14                       ` Charles Forsyth
2018-10-10 22:29   ` [9fans] " Kurt H Maier
2018-10-10 22:55     ` Steven Stallion
2018-10-11 11:19       ` Aram Hăvărneanu
2018-10-11  0:26   ` Skip Tavakkolian
2018-10-11  1:03     ` Steven Stallion
2018-10-14  9:46   ` Ole-Hjalmar Kristensen
2018-10-14 10:37     ` hiro
2018-10-14 17:34       ` Ole-Hjalmar Kristensen
2018-10-14 19:17         ` hiro
2018-10-15  9:29         ` Giacomo Tesio
  -- strict thread matches above, loose matches on Subject: below --
2018-10-10 22:19 cinap_lenrek
2018-10-10 16:14 cinap_lenrek
2018-10-10  0:15 cinap_lenrek
2018-10-10  0:22 ` Lyndon Nerenberg
2018-10-09 19:49 cinap_lenrek
2018-10-09 19:56 ` hiro
2018-10-09 19:47 cinap_lenrek
2018-10-09 22:01 ` erik quanstrom
2018-10-09 23:43 ` Lyndon Nerenberg
2018-10-10  5:52   ` hiro
2018-10-10  8:13     ` Digby R.S. Tarvin
2018-10-10  9:14       ` hiro
2018-10-10 13:59         ` Steve Simon
2018-10-10 21:32         ` Digby R.S. Tarvin
2018-10-11 17:43     ` Lyndon Nerenberg
2018-10-11 19:11       ` hiro
2018-10-11 19:27         ` Lyndon Nerenberg
2018-10-11 19:56           ` hiro
2018-10-10  5:57   ` hiro
2018-10-08  3:38 Lucio De Re
2018-10-08  4:29 ` Digby R.S. Tarvin
2018-10-08  7:20   ` hiro
2018-10-08 12:03     ` Charles Forsyth
2018-10-08 17:20       ` hiro
2018-10-08 21:55         ` Digby R.S. Tarvin
2018-10-08 23:03           ` Dan Cross
2018-10-09  0:14             ` Bakul Shah
2018-10-09  1:34               ` Christopher Nielsen
2018-10-09  3:28               ` Lucio De Re
2018-10-09  8:23                 ` hiro
2018-10-09  9:45                 ` Ethan Gardener
2018-10-09 17:50                   ` Bakul Shah
2018-10-09 18:57                     ` Ori Bernstein
2018-10-10  7:32                 ` Giacomo Tesio
2018-10-09 17:45               ` Lyndon Nerenberg
2018-10-09 18:49                 ` hiro
2018-10-09 19:14                   ` Lyndon Nerenberg
2018-10-09 22:05                     ` erik quanstrom
2018-10-11 17:54                       ` Lyndon Nerenberg
2018-10-11 18:04                         ` Kurt H Maier
2018-10-11 19:23                         ` hiro
2018-10-11 19:24                           ` hiro
2018-10-11 19:25                             ` hiro
2018-10-11 19:26                         ` Skip Tavakkolian
2018-10-11 19:39                           ` Lyndon Nerenberg
2018-10-11 19:44                             ` Skip Tavakkolian
2018-10-11 19:47                               ` Lyndon Nerenberg
2018-10-11 19:57                                 ` hiro
2018-10-11 20:23                                   ` Lyndon Nerenberg
2018-10-10 10:42                     ` Ethan Gardener
2018-10-09 19:23                   ` Lyndon Nerenberg
2018-10-09 19:34                     ` hiro
2018-10-09 19:36                       ` hiro
2018-10-09 19:40                       ` Lyndon Nerenberg
2018-10-10  0:18                       ` Dan Cross
2018-10-10  5:45                         ` hiro
2018-10-09 22:06                     ` erik quanstrom
2018-10-10  6:24                       ` Bakul Shah
2018-10-10 13:58                         ` erik quanstrom
2018-10-09 22:42                   ` Dan Cross
2018-10-09 19:09                 ` Bakul Shah
2018-10-09 19:30                   ` Lyndon Nerenberg
2018-10-09  3:08             ` Digby R.S. Tarvin
2018-10-09 11:58               ` Ethan Gardener
2018-10-09 13:59                 ` erik quanstrom
2018-10-09 22:22                 ` Digby R.S. Tarvin
2018-10-10 10:38                   ` Ethan Gardener
2018-10-10 23:15                     ` Digby R.S. Tarvin
2018-10-11 18:10                       ` Lyndon Nerenberg
2018-10-11 20:55                         ` Digby R.S. Tarvin
2018-10-11 21:03                           ` Lyndon Nerenberg
2018-10-09 14:02               ` erik quanstrom
2018-10-08  8:12   ` Nils M Holm
2018-10-08  9:12     ` Digby R.S. Tarvin
2018-10-08  8:09 ` Nils M Holm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).