9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed
* [9fans] venti /plan9port mmapped 
@ 2026-01-02 19:54 wb.kloke
  2026-01-02 20:39 ` ori
  0 siblings, 1 reply; 39+ messages in thread
From: wb.kloke @ 2026-01-02 19:54 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 935 bytes --]

Further trying to sanitize venti, I decided to change my mventi (no isects) to use mmap IO.

At a first stage I mmap the whole venti arena partition (current size 80GB, filled to 44GB). On a 64-bit processor this is no big deal, and I replace the file operations (rwpart in part.c and readarena in arena.c) with memmove.

As it works on amd64 FreeBSD, I can now eliminate the block cache and its complications (IO over block boundaries) from the source, making the remaining code much clearer.

The price is that I have to rely on the OS's demand paging, so it is probably not easily portable to Plan 9 or 9front.

Can anybody give me a hint how to map a partition to a segment in 9front?
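The replacement described above can be sketched in portable C. This is a minimal sketch under my assumptions: partmap and partread are illustrative names, not venti's actual rwpart/readarena signatures.

```c
/* Sketch: map a whole arena partition once, then serve reads with
 * memmove instead of explicit file I/O. Illustrative names only. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static unsigned char *base;   /* start of the mapped partition */
static size_t partsize;

int
partmap(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDWR);
	if(fd < 0)
		return -1;
	if(fstat(fd, &st) < 0){
		close(fd);
		return -1;
	}
	partsize = st.st_size;
	base = mmap(NULL, partsize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);               /* the mapping stays valid after close */
	return base == MAP_FAILED ? -1 : 0;
}

int
partread(void *dst, size_t off, size_t n)
{
	if(off + n > partsize)
		return -1;
	memmove(dst, base + off, n);   /* demand paging does the real I/O */
	return 0;
}
```

With this shape, the block cache and its cross-boundary I/O handling become unnecessary, as described above.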

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M6862409e4512b86c28587d83
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1570 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-02 19:54 [9fans] venti /plan9port mmapped wb.kloke
@ 2026-01-02 20:39 ` ori
  2026-01-02 20:58   ` Bakul Shah via 9fans
  2026-01-02 21:01   ` ori
  0 siblings, 2 replies; 39+ messages in thread
From: ori @ 2026-01-02 20:39 UTC (permalink / raw)
  To: 9fans

You may want to read this: https://db.cs.cmu.edu/mmap-cidr2022/

Quoth wb.kloke@gmail.com:
> Further trying to sanitize venti, I decided to change my mventi (no isects) to use mmap IO.
> 
> At a first stage I mmap the whole venti arena partition (current size 80GB, filled to 44GB). On a 64-bit processor this is no big deal, and I replace the file operations (rwpart in part.c and readarena in arena.c) with memmove.
> 
> As it works on amd64 FreeBSD, I can now eliminate the block cache and its complications (IO over block boundaries) from the source, making the remaining code much clearer.
> 
> The price is that I have to rely on the OS's demand paging, so it is probably not easily portable to Plan 9 or 9front.
> 
> Can anybody give me a hint how to map a partition to a segment in 9front?

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M1c4328c5437366d50eb39ffd
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-02 20:39 ` ori
@ 2026-01-02 20:58   ` Bakul Shah via 9fans
  2026-01-06 22:59     ` Ron Minnich
  2026-01-02 21:01   ` ori
  1 sibling, 1 reply; 39+ messages in thread
From: Bakul Shah via 9fans @ 2026-01-02 20:58 UTC (permalink / raw)
  To: 9fans

Might be an opportunity for optimizing mmap + virtual memory architecture for database-specific applications... (should be worth at least a few papers for the academic crowd).

> On Jan 2, 2026, at 12:39 PM, ori@eigenstate.org wrote:
> 
> You may want to read this: https://db.cs.cmu.edu/mmap-cidr2022/
> 
> Quoth wb.kloke@gmail.com:
>> Further trying to sanitize venti, I decided to change my mventi (no isects) to use mmap IO.
>> 
>> At a first stage I mmap the whole venti arena partition (current size 80GB, filled to 44GB). On a 64-bit processor this is no big deal, and I replace the file operations (rwpart in part.c and readarena in arena.c) with memmove.
>> 
>> As it works on amd64 FreeBSD, I can now eliminate the block cache and its complications (IO over block boundaries) from the source, making the remaining code much clearer.
>> 
>> The price is that I have to rely on the OS's demand paging, so it is probably not easily portable to Plan 9 or 9front.
>> 
>> Can anybody give me a hint how to map a partition to a segment in 9front?

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M227c416862f61d52b01bd98f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-02 20:39 ` ori
  2026-01-02 20:58   ` Bakul Shah via 9fans
@ 2026-01-02 21:01   ` ori
  2026-01-08 15:59     ` wb.kloke
  1 sibling, 1 reply; 39+ messages in thread
From: ori @ 2026-01-02 21:01 UTC (permalink / raw)
  To: 9fans

Quoth ori@eigenstate.org:
> > Can anybody give me a hint how to map a partition to a segment in 9front?

Using read and write; there is no memory mapping, and I don't think we want it.
It doesn't work very well when reading from a networked file system, it makes
error handling unnecessarily difficult, and if you're using it for writing, it's
very fiddly when it comes to committing data.

There may be an argument for read-only mapping of small files, demand paging
style, but things fall apart when the working set size gets larger than memory.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-Mc99c287a545807b040babaf5
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-02 20:58   ` Bakul Shah via 9fans
@ 2026-01-06 22:59     ` Ron Minnich
  2026-01-07  4:27       ` Noam Preil
                         ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Ron Minnich @ 2026-01-06 22:59 UTC (permalink / raw)
  To: 9fans

back in the NIX days, when we had 32GiB of memory mapped with 32 1-G
PTEs, I wrote a trivial venti that ONLY used dram. That was easy.
Because you can keep a single machine with 32G up for an
arbitrarily long time, I did not bother with a disk. This work was
based on the fact that lsub had used very little of their 32G coraid
server (something like 3G? I forget) over ten years: venti dedups by
its nature and that's your friend.

So maybe a pure-ram venti, with a TiB or so of memory, could work?

"disks are the work of the devil" -- jmk


ron

On Fri, Jan 2, 2026 at 1:02 PM Bakul Shah via 9fans <9fans@9fans.net> wrote:
>
> Might be an opportunity for optimizing mmap + virtual memory architecture for database specific applications.... (should be worth at least a few papers for the academic crowd).
>
> > On Jan 2, 2026, at 12:39 PM, ori@eigenstate.org wrote:
> >
> > You may want to read this: https://db.cs.cmu.edu/mmap-cidr2022/
> >
> > Quoth wb.kloke@gmail.com:
> >> Further trying to sanitize venti, I decided to change my mventi (no isects) to use mmap IO.
> >>
> >> At a first stage I mmap the whole venti arena partition (current size 80GB, filled to 44GB). On a 64-bit processor this is no big deal, and I replace the file operations (rwpart in part.c and readarena in arena.c) with memmove.
> >>
> >> As it works on amd64 FreeBSD, I can now eliminate the block cache and its complications (IO over block boundaries) from the source, making the remaining code much clearer.
> >>
> >> The price is that I have to rely on the OS's demand paging, so it is probably not easily portable to Plan 9 or 9front.
> >>
> >> Can anybody give me a hint how to map a partition to a segment in 9front?

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M44817a56f8c46bcdac466b1b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-06 22:59     ` Ron Minnich
@ 2026-01-07  4:27       ` Noam Preil
  2026-01-07  6:15       ` Shawn Rutledge
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Noam Preil @ 2026-01-07  4:27 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 757 bytes --]

If you want to buy 1TiB of RAM to play with, you can go right ahead :p

For a pooled-together project - a shared venti - that might still be practical, at least with slower memory, but 1TiB of HDD space is like $20. I'm not going to bother checking the spot price of 1TiB of RAM because... it has a spot price now :/

And it's not terribly much code to do a better venti using disks.

Could definitely build the index entirely into memory on startup and have less disk code, too, but I'm not convinced it's worth it.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M9f9a4833f888ca1ec34f3c0f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1662 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-06 22:59     ` Ron Minnich
  2026-01-07  4:27       ` Noam Preil
@ 2026-01-07  6:15       ` Shawn Rutledge
  2026-01-07 15:46         ` Persistent memory (was Re: [9fans] venti /plan9port mmapped) arnold
  2026-01-07  8:52       ` [9fans] venti /plan9port mmapped wb.kloke
  2026-01-07 14:57       ` Thaddeus Woskowiak
  3 siblings, 1 reply; 39+ messages in thread
From: Shawn Rutledge @ 2026-01-07  6:15 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1979 bytes --]

> On Jan 6, 2026, at 23:59, Ron Minnich <rminnich@p9f.org> wrote:
> 
> back in the NIX days, when we had 32GiB of memory mapped with 32 1-G
> PTEs, I wrote a trivial venti that ONLY used dram. That was easy.
> Because you can keep up a single machine with 32G up for an
> arbitrarily long time, I did not bother with a disk. This work was
> based on the fact that lsub had used very little of their 32G coraid
> server (something like 3G? I forget) over ten years: venti dedups by
> its nature and that's your friend.
> 
> So maybe a pure-ram venti, with a TiB or so of memory, could work?
> 
> "disks are the work of the devil" — jmk

I’ve been wondering why it’s still so rare to map persistent storage to memory addresses, in hardware.  It seemed like Intel Optane was going to go there, for a while, then they just gave up on the idea.  And core memory was already persistent, back in the day.

I think universal memory should happen eventually; and to prepare for that, software design should go towards organizing data the same in memory as on storage: better packing rather than lots of randomness in the heap, and memory-aligned structures. Local file I/O might become mostly unnecessary, but could continue as an abstraction to organize things in memory, at the cost of having to keep writing I/O code.  So if that’s where we are going, mmap is a good thing to have.  But yeah, maybe it’s more hassle as an abstraction for network-attached storage.

Wasn’t this sort of thing being done in the PDA era?  I never developed for the Newton, but I think a “soup” is such a persistent structure.  And maybe whatever smalltalk does with their sandboxes.  Is it just a persistent heap or is it organized better?


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-Mbf113fadcd84d606d155d602
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 10619 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-06 22:59     ` Ron Minnich
  2026-01-07  4:27       ` Noam Preil
  2026-01-07  6:15       ` Shawn Rutledge
@ 2026-01-07  8:52       ` wb.kloke
  2026-01-07 16:30         ` mmaping on plan9? (was " Bakul Shah via 9fans
  2026-01-07 14:57       ` Thaddeus Woskowiak
  3 siblings, 1 reply; 39+ messages in thread
From: wb.kloke @ 2026-01-07  8:52 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 633 bytes --]

On Tuesday, 6 January 2026, at 11:59 PM, Ron Minnich wrote:
> So maybe a pure-ram venti, with a TiB or so of memory, could work?
> "disks are the work of the devil" -- jmk

Of course it would work, but imho it would be pointless. Why would anybody use content-addressed storage for data that is not on persistent memory?

Most of the code of a venti is concerned with making backups easy.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M5b0ec55532fb3ef9b2466ae8
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1215 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-06 22:59     ` Ron Minnich
                         ` (2 preceding siblings ...)
  2026-01-07  8:52       ` [9fans] venti /plan9port mmapped wb.kloke
@ 2026-01-07 14:57       ` Thaddeus Woskowiak
  2026-01-07 16:07         ` Wes Kussmaul
  2026-01-07 16:13         ` Noam Preil
  3 siblings, 2 replies; 39+ messages in thread
From: Thaddeus Woskowiak @ 2026-01-07 14:57 UTC (permalink / raw)
  To: 9fans

On Tue, Jan 6, 2026 at 7:03 PM Ron Minnich <rminnich@p9f.org> wrote:
>
> So maybe a pure-ram venti, with a TiB or so of memory, could work?
>

If you can afford it: thanks to the AI surge, 1TB of DDR5 is around $25,000 USD.

I am quite thankful that my 8GB CPU server is "overkill" for my current needs.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M1bd713b8e014eb23e00b4928
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Persistent memory (was Re: [9fans] venti /plan9port mmapped)
  2026-01-07  6:15       ` Shawn Rutledge
@ 2026-01-07 15:46         ` arnold
  2026-01-07 16:11           ` Noam Preil
  0 siblings, 1 reply; 39+ messages in thread
From: arnold @ 2026-01-07 15:46 UTC (permalink / raw)
  To: 9fans

Shawn Rutledge <lists@ecloud.org> wrote:

> I’ve been wondering why it’s still so rare to map persistent storage
> to memory addresses, in hardware.  It seemed like Intel Optane was
> going to go there, for a while, then they just gave up on the idea.

Because Intel doesn't understand any kind of product except CPUs. :-(

> I think universal memory should happen eventually; and to prepare for
> that, software design should go towards organizing data the same in
> memory as on storage: better packing rather than lots of randomness in
> the heap, and memory-aligned structures. Local file I/O might become
> mostly unnecessary, but could continue as an abstraction to organize
> things in memory, at the cost of having to keep writing I/O code.  So if
> that’s where we are going, mmap is a good thing to have.  But yeah,
> maybe it’s more hassle as an abstraction for network-attached storage.

A web search shows that there are several options for persistent memory
allocators, many of which I didn't know about. However, gawk has
been using one for a few years. See https://dl.acm.org/doi/10.1145/3643886
and https://web.eecs.umich.edu/~tpkelly/pma/. It's built on top of
mmap() and only for 64-bit *nix systems.

For the short instructions on using it, see
https://www.gnu.org/software/gawk/manual/html_node/Persistent-Memory.html.
For more details, see
https://www.gnu.org/software/gawk/manual/pm-gawk/pm-gawk.html.

Arnold

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T84cf4042bdd1a74b-M153c99a3b6b8ac59c01977f5
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-07 14:57       ` Thaddeus Woskowiak
@ 2026-01-07 16:07         ` Wes Kussmaul
  2026-01-07 16:22           ` Noam Preil
  2026-01-07 16:13         ` Noam Preil
  1 sibling, 1 reply; 39+ messages in thread
From: Wes Kussmaul @ 2026-01-07 16:07 UTC (permalink / raw)
  To: 9fans


On 1/7/26 9:57 AM, Thaddeus Woskowiak wrote:
> On Tue, Jan 6, 2026 at 7:03 PM Ron Minnich <rminnich@p9f.org> wrote:
>> So maybe a pure-ram venti, with a TiB or so of memory, could work?
>>
> If you can afford it thanks to the AI surge. 1TB of DDR5 is around $25,000 USD.
>
> I am quite thankful that my 8GB CPU server is "overkill" for my current needs.


And I'm thankful that demand spikes like this end up generating 
overproduction, meaning this time next year DDR5 will be cheaper than ever.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M3bcd391a57c4cc6bd57edc97
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Persistent memory (was Re: [9fans] venti /plan9port mmapped)
  2026-01-07 15:46         ` Persistent memory (was Re: [9fans] venti /plan9port mmapped) arnold
@ 2026-01-07 16:11           ` Noam Preil
  2026-01-07 17:26             ` Wes Kussmaul
  0 siblings, 1 reply; 39+ messages in thread
From: Noam Preil @ 2026-01-07 16:11 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 331 bytes --]

> Because Intel doesn't understand any kind of product except CPUs. :-(

Intel understands CPUs? ;)

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T84cf4042bdd1a74b-M6098fa8b8affb66eaa74773e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1025 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-07 14:57       ` Thaddeus Woskowiak
  2026-01-07 16:07         ` Wes Kussmaul
@ 2026-01-07 16:13         ` Noam Preil
  1 sibling, 0 replies; 39+ messages in thread
From: Noam Preil @ 2026-01-07 16:13 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 321 bytes --]

> 1TB of DDR5 is around $25,000 USD.

...i knew i didn't want to look it up, holy _crap_.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-Md6144477702f7007cc456012
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1015 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:07         ` Wes Kussmaul
@ 2026-01-07 16:22           ` Noam Preil
  2026-01-07 17:31             ` Wes Kussmaul
  0 siblings, 1 reply; 39+ messages in thread
From: Noam Preil @ 2026-01-07 16:22 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 851 bytes --]

> And I'm thankful that demand spikes like this end up generating overproduction, meaning this time next year DDR5 will be cheaper than ever.

The memory makers are on the record saying they're not going to increase production because, in essence, they know the bubble is going to pop and want to make sure their prices stay high afterwards.

In other words, they know that demand spikes lead to overproduction and decided they'd rather avoid meeting the current demand than have oversupply later.

Of course, I'd say the same thing if I wanted people buying _now_ and assuming things won't be cheaper later...

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-Mfee471c8b0cadbeaac39a1bc
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1701 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07  8:52       ` [9fans] venti /plan9port mmapped wb.kloke
@ 2026-01-07 16:30         ` Bakul Shah via 9fans
  2026-01-07 16:40           ` Noam Preil
                             ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Bakul Shah via 9fans @ 2026-01-07 16:30 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 2335 bytes --]

I have this idea that will horrify most of you!

1. Create an mmap device driver. You ask it for a new file handle which you use to communicate about memory mapping.
2. If you want to mmap some file, you open it and write its file descriptor along with other parameters (file offset, base addr, size, mode, flags) to your mmap file handle.
3. The mmap driver sets up necessary page table entries but doesn't actually fetch any data before returning from the write.
4. It can asynchronously kick off io requests on your behalf and fixup page table entries as needed.
5. Page faults in the mmapped area are serviced by making appropriate read/write calls.
6. Flags can be used to indicate read-ahead or write-behind for typical serial access.
7. Similarly msync, munmap etc. can be implemented.

In a sneaky way this avoids the need for adding any mmap specific syscalls! But the underlying work would be mostly similar in either case.

The main benefits of mmap are reduced initial latency, a "pay as you go" cost structure, and ease of use. It is certainly more expensive than reading/writing the same amount of data directly from a program.

No idea how horrible a hack is needed to implement such a thing or even if it is possible at all but I had to share this ;-)
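As a rough sketch of step 2, the client side could be as small as formatting a text request. The message layout here ("map fd offset addr len mode flags") is invented for illustration; it is not an existing Plan 9 interface.

```c
/* Build the control message a client might write to the hypothetical
 * mmap device. The format is invented here for illustration. */
#include <stdio.h>

/* Returns the message length, or -1 if buf is too small. */
int
mapreq(char *buf, int nbuf, int fd, unsigned long long off,
	unsigned long long addr, unsigned long long len,
	const char *mode, const char *flags)
{
	int n = snprintf(buf, nbuf, "map %d %llu %llu %llu %s %s",
		fd, off, addr, len, mode, flags);
	return (n < 0 || n >= nbuf) ? -1 : n;
}
```

The driver would parse such a message, set up page table entries lazily (step 3), and service faults with ordinary reads and writes (step 5).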

> On Jan 7, 2026, at 12:52 AM, wb.kloke@gmail.com wrote:
> 
> On Tuesday, 6 January 2026, at 11:59 PM, Ron Minnich wrote:
>> So maybe a pure-ram venti, with a TiB or so of memory, could work?
>> "disks are the work of the devil" -- jmk
> Of course it would work, but imho it would be pointless. Why would anybody use content-addressed storage for data that is not on persistent memory?
> 
> Most of the code of a venti is concerned with making backups easy.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M79241f0580350c49710d5dde
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 3045 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:30         ` mmaping on plan9? (was " Bakul Shah via 9fans
@ 2026-01-07 16:40           ` Noam Preil
  2026-01-07 16:41           ` ori
  2026-01-07 16:52           ` ori
  2 siblings, 0 replies; 39+ messages in thread
From: Noam Preil @ 2026-01-07 16:40 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 316 bytes --]

Honestly i think this would be a great experiment regardless of its practical usage :)

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M586930ee0b23d0b0562f1eba
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 944 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:30         ` mmaping on plan9? (was " Bakul Shah via 9fans
  2026-01-07 16:40           ` Noam Preil
@ 2026-01-07 16:41           ` ori
  2026-01-07 20:35             ` Bakul Shah via 9fans
  2026-01-07 16:52           ` ori
  2 siblings, 1 reply; 39+ messages in thread
From: ori @ 2026-01-07 16:41 UTC (permalink / raw)
  To: 9fans

Quoth Bakul Shah via 9fans <9fans@9fans.net>:
> I have this idea that will horrify most of you!
> 
> 1. Create an mmap device driver. You ask it for a new file handle which you use to communicate about memory mapping.
> 2. If you want to mmap some file, you open it and write its file descriptor along with other parameters (file offset, base addr, size, mode, flags) to your mmap file handle.
> 3. The mmap driver sets up necessary page table entries but doesn't actually fetch any data before returning from the write.
> 4. It can asynchronously kick off io requests on your behalf and fixup page table entries as needed.
> 5. Page faults in the mmapped area are serviced by making appropriate read/write calls.
> 6. Flags can be used to indicate read-ahead or write-behind for typical serial access.
> 7. Similarly msync, munmap etc. can be implemented.
> 
> In a sneaky way this avoids the need for adding any mmap specific syscalls! But the underlying work would be mostly similar in either case.
> 
> The main benefits of mmap are reduced initial latency, a "pay as you go" cost structure, and ease of use. It is certainly more expensive than reading/writing the same amount of data directly from a program.
> 
> No idea how horrible a hack is needed to implement such a thing or even if it is possible at all but I had to share this ;-)

To what end? The problems with mmap have little to do with adding a syscall;
they're about how you communicate things like I/O errors, especially
when flushing the cache.

Imagine the following setup -- I've imported 9p.io:

        9fs 9pio

and then I map a file from it:

        mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);

Now, I want to write something into the file:

        *mapped = 1234;

The cached version of the page is dirty, so the OS will
eventually need to flush it back with a 9p Twrite; let's
assume that before this happens, the network goes down.

How do you communicate the error with userspace?


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M1ebb42ae226a10acb54e9127
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:30         ` mmaping on plan9? (was " Bakul Shah via 9fans
  2026-01-07 16:40           ` Noam Preil
  2026-01-07 16:41           ` ori
@ 2026-01-07 16:52           ` ori
  2026-01-07 17:37             ` wb.kloke
  2 siblings, 1 reply; 39+ messages in thread
From: ori @ 2026-01-07 16:52 UTC (permalink / raw)
  To: 9fans

Quoth Bakul Shah via 9fans <9fans@9fans.net>:
> 
> No idea how horrible a hack is needed to implement such a thing or even if it is possible at all but I had to share this ;-)
> 

pretty horrible; we hard-code up to 10 segments in the kernel,
which allows things to be really simple; when N is 10, O(n) is fine.

We've got a ton of loops doing things like:

        for(i = 0; i < NSEG; i++)

and it works great; the simplicity of our VM system also means that
our fork is something like 10 times faster than fork on Unix.

If you want to start having a lot of small maps, you need
complex data structures to track them and look them up quickly on
page fault; things start to suck.
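The flat lookup this describes can be sketched as follows; the types are illustrative, not the 9front kernel's:

```c
/* Linear scan over a fixed-size segment table, as in the NSEG loops
 * described above. With NSEG small, O(n) per page fault is fine. */
#include <stddef.h>

enum { NSEG = 10 };

typedef struct Seg Seg;
struct Seg {
	unsigned long base;
	unsigned long top;	/* 0 means unused slot */
};

/* Return the segment containing addr, or NULL if none does. */
Seg*
seglookup(Seg *segs, unsigned long addr)
{
	int i;
	for(i = 0; i < NSEG; i++)
		if(segs[i].top != 0 && addr >= segs[i].base && addr < segs[i].top)
			return &segs[i];
	return NULL;
}
```

Once maps number in the hundreds or thousands, this scan is what forces the move to balanced trees or similar structures on the fault path.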

In addition to mmap being superficially easy but hard to use correctly,
and very hard to use correctly when trying to write with it,
I think it's best to leave it behind.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M44cbcb5dccb0f6b946f78dfa
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Persistent memory (was Re: [9fans] venti /plan9port mmapped)
  2026-01-07 16:11           ` Noam Preil
@ 2026-01-07 17:26             ` Wes Kussmaul
  0 siblings, 0 replies; 39+ messages in thread
From: Wes Kussmaul @ 2026-01-07 17:26 UTC (permalink / raw)
  To: 9fans


On 1/7/26 11:11 AM, Noam Preil wrote:
> > Because Intel doesn't understand any kind of product except CPUs. :-(
>
> Intel understands CPUs? ;)

Intel understands what happens when you choose to divest your newer 
product line (Xscale) that's built on the technology (ARM) that 
challenges your cash cow product (x86) because "ARM is not how we do 
things around here."

What they now understand about what happens when you do things like 
that: smart people leave.



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T84cf4042bdd1a74b-Mfb6fd74ee170955a193b0ec4
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:22           ` Noam Preil
@ 2026-01-07 17:31             ` Wes Kussmaul
  0 siblings, 0 replies; 39+ messages in thread
From: Wes Kussmaul @ 2026-01-07 17:31 UTC (permalink / raw)
  To: 9fans


On 1/7/26 11:22 AM, Noam Preil wrote:
> > And I'm thankful that demand spikes like this end up generating 
> overproduction, meaning this time next year DDR5 will be cheaper than 
> ever.
>
> The memory makers are on the record saying they're not going to 
> increase production because, in essence, they know the bubble is going 
> to pop and want to make sure that their prices stay high afterwards.
>
> In other words, they know that demand spikes lead to overproduction 
> and decided they'd rather avoid meeting the current demand than have 
> oversupply later.
>
> Of coruse, I'd say the same thing if i wanted people buying _now_ and 
> assuming things won't be cheaper later...
>
Regardless of management intent, they'll get sued by securities 
ambulance chasers for taking the long view instead of optimizing 
quarterly earnings.



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-Mc30224b80c8601103548d87a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:52           ` ori
@ 2026-01-07 17:37             ` wb.kloke
  2026-01-07 17:46               ` Noam Preil
  0 siblings, 1 reply; 39+ messages in thread
From: wb.kloke @ 2026-01-07 17:37 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

On Wednesday, 7 January 2026, at 5:52 PM, ori wrote:
> If you want to start having a lot of small maps, you start to need
> complex data structures to track them and look them up quickly on
> page fault; things start to suck.

Just let me remark that in the venti use case, only one segment per arena partition would be needed.
Who uses more than one arena partition, anyway?
Remember, the isect partitions are already gone in mventi.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M34578a6b597ed109684fb875
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1233 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 17:37             ` wb.kloke
@ 2026-01-07 17:46               ` Noam Preil
  2026-01-07 17:56                 ` wb.kloke
  0 siblings, 1 reply; 39+ messages in thread
From: Noam Preil @ 2026-01-07 17:46 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 319 bytes --]

I do 😅

Multiple disks.

Many cheap ones instead of a single big one, in my server

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M9ded26e484f52e1cb9643d3b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1075 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 17:46               ` Noam Preil
@ 2026-01-07 17:56                 ` wb.kloke
  2026-01-07 18:07                   ` Noam Preil
  0 siblings, 1 reply; 39+ messages in thread
From: wb.kloke @ 2026-01-07 17:56 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On Wednesday, 7 January 2026, at 6:46 PM, Noam Preil wrote:
> I do 😅 
>  
> Multiple disks. 
>  
> Many cheap ones instead of a single big one, in my server

Ok. A simple solution is: use one mventi process per partition, and a super process to delegate the real work, perhaps using Bloom filters to avoid spurious requests.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M33fbbcb7e39e81e7b696db1b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1375 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 17:56                 ` wb.kloke
@ 2026-01-07 18:07                   ` Noam Preil
  2026-01-07 18:58                     ` wb.kloke
  0 siblings, 1 reply; 39+ messages in thread
From: Noam Preil @ 2026-01-07 18:07 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 323 bytes --]

Which, if the whole point is to remove the complexity of the disk layer, seems a bit silly imo

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M85ea45143bf2cc5ddd8575f3
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 951 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 18:07                   ` Noam Preil
@ 2026-01-07 18:58                     ` wb.kloke
  0 siblings, 0 replies; 39+ messages in thread
From: wb.kloke @ 2026-01-07 18:58 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 717 bytes --]

On Wednesday, 7 January 2026, at 7:08 PM, Noam Preil wrote:
> Which if the whole point is to remove the complexity of the disk layer, seems a bit silly imo
It depends on the number of partitions.

The original venti uses one thread per arena partition, so the complexity is already catered for. Bloom filters are there, too.

A sane installation could perhaps use one or two read-only arena partitions and one writable partition, so we come out with the need for at most three segments. YMMV
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mcffad8c0698cbd93526ad357
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 1402 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 16:41           ` ori
@ 2026-01-07 20:35             ` Bakul Shah via 9fans
  2026-01-07 21:31               ` ron minnich
  2026-01-07 21:40               ` ori
  0 siblings, 2 replies; 39+ messages in thread
From: Bakul Shah via 9fans @ 2026-01-07 20:35 UTC (permalink / raw)
  To: 9fans



> On Jan 7, 2026, at 8:41 AM, ori@eigenstate.org wrote:
> 
> Quoth Bakul Shah via 9fans <9fans@9fans.net>:
>> I have this idea that will horrify most of you!
>> 
>> 1. Create an mmap device driver. You ask it to a new file handle which you use to communicate about memory mapping.
>> 2. If you want to mmap some file, you open it and write its file descriptor along with other parameters (file offset, base addr, size, mode, flags) to your mmap file handle.
>> 3. The mmap driver sets up necessary page table entries but doesn't actually fetch any data before returning from the write.
>> 4. It can asynchronously kick off io requests on your behalf and fixup page table entries as needed.
>> 5. Page faults in the mmapped area are serviced by making appropriate read/write calls.
>> 6. Flags can be used to indicate read-ahead or write-behind for typical serial access.
>> 7. Similarly msync, munmap etc. can be implemented.
>> 
>> In a sneaky way this avoids the need for adding any mmap specific syscalls! But the underlying work would be mostly similar in either case.
>> 
>> The main benefits of mmap are reduced initial latency , "pay as you go" cost structure and ease of use. It is certainly more expensive than reading/writing the same amount of data directly from a program.
>> 
>> No idea how horrible a hack is needed to implement such a thing or even if it is possible at all but I had to share this ;-)
> 
> To what end? The problems with mmap have little to do with adding a syscall;
> they're about how you do things like communicating I/O errors. Especially
> when flushing the cache.
> 
> Imagine the following setup -- I've imported 9p.io:
> 
>        9fs 9pio
> 
> and then I map a file from it:
> 
>        mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
> 
> Now, I want to write something into the file:
> 
>        *mapped = 1234;
> 
> The cached version of the page is dirty, so the OS will
> eventually need to flush it back with a 9p Twrite; Let's
> assume that before this happens, the network goes down.
> 
> How do you communicate the error with userspace?

This was just a brainwave but...

You have a (control) connection with the mmap device to
set up mmap so might as well use it to convey errors!
This device would be strictly local to where a program
runs.

I'd even consider allowing a separate process to mmap,
by making an address space a first class object. That'd
move more stuff out of the kernel and allow for more
interesting/esoteric uses.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M6fb0ef830c8e525cb591fb48
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 20:35             ` Bakul Shah via 9fans
@ 2026-01-07 21:31               ` ron minnich
  2026-01-08  7:56                 ` arnold
                                   ` (2 more replies)
  2026-01-07 21:40               ` ori
  1 sibling, 3 replies; 39+ messages in thread
From: ron minnich @ 2026-01-07 21:31 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 3684 bytes --]

what we had planned for harvey was a good deal simpler: designate a part of
the address space as a "bounce fault to user" space area.

When a page fault in that area occurred, info about the fault was sent to
an fd (if it was opened) or a note handler.

user code could handle the fault or punt, as it saw fit. The fixup was
that user mode had to get the data to satisfy the fault, then tell the
kernel what to do.

This is much like the 35-years-ago work we did on AIX, called
external pagers at the time; or the more recent umap work,
https://computing.llnl.gov/projects/umap, used fairly widely in HPC.

If you go this route, it's a bit less complex than what you are proposing.

On Wed, Jan 7, 2026 at 1:09 PM Bakul Shah via 9fans <9fans@9fans.net> wrote:

>
>
> > On Jan 7, 2026, at 8:41 AM, ori@eigenstate.org wrote:
> >
> > Quoth Bakul Shah via 9fans <9fans@9fans.net>:
> >> I have this idea that will horrify most of you!
> >>
> >> 1. Create an mmap device driver. You ask it to a new file handle which
> you use to communicate about memory mapping.
> >> 2. If you want to mmap some file, you open it and write its file
> descriptor along with other parameters (file offset, base addr, size, mode,
> flags) to your mmap file handle.
> >> 3. The mmap driver sets up necessary page table entries but doesn't
> actually fetch any data before returning from the write.
> >> 4. It can asynchronously kick off io requests on your behalf and fixup
> page table entries as needed.
> >> 5. Page faults in the mmapped area are serviced by making appropriate
> read/write calls.
> >> 6. Flags can be used to indicate read-ahead or write-behind for typical
> serial access.
> >> 7. Similarly msync, munmap etc. can be implemented.
> >>
> >> In a sneaky way this avoids the need for adding any mmap specific
> syscalls! But the underlying work would be mostly similar in either case.
> >>
> >> The main benefits of mmap are reduced initial latency , "pay as you go"
> cost structure and ease of use. It is certainly more expensive than
> reading/writing the same amount of data directly from a program.
> >>
> >> No idea how horrible a hack is needed to implement such a thing or even
> if it is possible at all but I had to share this ;-)
> >
> > To what end? The problems with mmap have little to do with adding a
> syscall;
> > they're about how you do things like communicating I/O errors. Especially
> > when flushing the cache.
> >
> > Imagine the following setup -- I've imported 9p.io:
> >
> >        9fs 9pio
> >
> > and then I map a file from it:
> >
> >        mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
> >
> > Now, I want to write something into the file:
> >
> >        *mapped = 1234;
> >
> > The cached version of the page is dirty, so the OS will
> > eventually need to flush it back with a 9p Twrite; Let's
> > assume that before this happens, the network goes down.
> >
> > How do you communicate the error with userspace?
> 
> This was just a brainwave but...
> 
> You have a (control) connection with the mmap device to
> set up mmap so might as well use it to convey errors!
> This device would be strictly local to where a program
> runs.
> 
> I'd even consider allowing a separate process to mmap,
> by making an address space a first class object. That'd
> move more stuff out of the kernel and allow for more
> interesting/esoteric uses.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mae5eb9a90d72008533969f26
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 5772 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 20:35             ` Bakul Shah via 9fans
  2026-01-07 21:31               ` ron minnich
@ 2026-01-07 21:40               ` ori
  1 sibling, 0 replies; 39+ messages in thread
From: ori @ 2026-01-07 21:40 UTC (permalink / raw)
  To: 9fans

Quoth Bakul Shah via 9fans <9fans@9fans.net>:
> 
> You have a (control) connection with the mmap device to
> set up mmap so might as well use it to convey errors!
> This device would be strictly local to where a program
> runs.

I'm not sure how this is supposed to work; I assume
you'd need to fork a proc to handle errors, since
you would need to block the faulting process while
figuring out how to re-map the page so that the
memory access could be handled?

If the flush was done in the background due to cache
pressure, the code that had done the failing I/O
operation may very well have moved on, so you'd
need a complicated mechanism coded into your software
to know how long ago your writes had failed.

Memory isn't a good abstraction for fallible I/O.
It looks simple, but makes error handling impossibly
hard.

I think the only time it's acceptably behaved is when
it's read-only, you're willing to treat I/O errors as
irrecoverable, and you don't care about random pauses
in your program's performance profile. I don't think
that situation is all that common.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M3dc790fee41d21884de7bacd
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 21:31               ` ron minnich
@ 2026-01-08  7:56                 ` arnold
  2026-01-08 10:31                 ` wb.kloke
  2026-01-09  3:57                 ` Paul Lalonde
  2 siblings, 0 replies; 39+ messages in thread
From: arnold @ 2026-01-08  7:56 UTC (permalink / raw)
  To: 9fans

ron minnich <rminnich@gmail.com> wrote:

> This is much like the 35-years-ago work we did on AIX, called
> external pagers at the time;

External pagers were a thing in Mach in the mid-80s, IIRC.

To me it seemed like overengineering. How many plain old users
want to write their own pager?

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M8381942ed38c05822d84803f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 21:31               ` ron minnich
  2026-01-08  7:56                 ` arnold
@ 2026-01-08 10:31                 ` wb.kloke
  2026-01-09  0:02                   ` ron minnich
  2026-01-09  3:57                 ` Paul Lalonde
  2 siblings, 1 reply; 39+ messages in thread
From: wb.kloke @ 2026-01-08 10:31 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1238 bytes --]

On Wednesday, 7 January 2026, at 10:31 PM, ron minnich wrote:
> what we had planned for harvey was a good deal simpler: designate a part of the address space as a "bounce fault to user" space area. 
> 
> When a page fault in that area occurred, info about the fault was sent to an fd (if  it was opened) or a note handler.
> 
> user could could handle the fault or punt, as it saw fit. The fixup was that user mode had to get the data to satisfy the fault, then tell the kernel what to do.
> 
> This is much like the 35-years-ago work we did on AIX, called external pagers at the time; or the more recent umap work, https://computing.llnl.gov/projects/umap, used fairly widely in HPC. 
> 
> If you go this route, it's a bit less complex than what you are proposing.

Thank you, this seems the nearest possible answer to my original question. The bad thing about umap is, of course, that it depends on a Linuxism.

BTW, Harvey is gone. Is the work done to cross-compile plan9 with gcc accessible?
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M11a11204a789ee5dac250c3a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 2084 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [9fans] venti /plan9port mmapped
  2026-01-02 21:01   ` ori
@ 2026-01-08 15:59     ` wb.kloke
  0 siblings, 0 replies; 39+ messages in thread
From: wb.kloke @ 2026-01-08 15:59 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 1917 bytes --]

I have committed the changes to make the mmapped venti server fully functional to 
https://github.com/vestein463/plan9port/tree/master/src/cmd/venti/mappedsrv

This is not fully tested yet, especially for more than one arena partition.

I include data on read performance vs. the previous traditional-I/O version.
The mmapped version runs on amd64/32GB FreeBSD, the traditional one on a WD MyCloud EX2 (arm7/1GB).
The arena partition resides on the MyCloud and is served read-only via NFS to the amd64 machine.
I repeated the test to see what effect the OS caching of the two machines may have.
The 1st run is the mmapped version, the 2nd the traditional one, the 3rd mmapped again, and the 4th traditional.

The data is the image of an old ufs filesystem created by vbackup.
time venti=wbk1 vcat ffs:f10fb5797e5ea50d6bc567772f9794706548f568 > /dev/null
1540096 blocks total
582848 blocks in use, 2122946 file reads

real    6m53,762s
user    1m13,788s
sys     0m16,910s
[wb@wbk1 ~/plan9port/src/cmd/venti/mappedsrv]$ time venti=wdmc vcat ffs:f10fb5797e5ea50d6bc567772f9794706548f568 > /dev/null
1540096 blocks total
582848 blocks in use, 2122946 file reads

real    17m28,870s
user    1m26,011s
sys     0m42,485s
[wb@wbk1 ~/plan9port/src/cmd/venti/mappedsrv]$ time venti=wbk1 vcat ffs:f10fb5797e5ea50d6bc567772f9794706548f568 > /dev/null
1540096 blocks total
582848 blocks in use, 2122946 file reads

real    3m55,614s
user    1m9,911s
sys     0m14,804s
[wb@wbk1 ~/plan9port/src/cmd/venti/mappedsrv]$ time venti=wdmc vcat ffs:f10fb5797e5ea50d6bc567772f9794706548f568 > /dev/null
1540096 blocks total
582848 blocks in use, 2122946 file reads

real    16m41,033s
user    1m15,768s
sys     0m32,756s


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tf991997f4e7bb37e-M0b9a9d657a4ab010b4c9d5d2
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 2882 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-08 10:31                 ` wb.kloke
@ 2026-01-09  0:02                   ` ron minnich
  0 siblings, 0 replies; 39+ messages in thread
From: ron minnich @ 2026-01-09  0:02 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 7376 bytes --]

Harvey is not gone at all. It's all there on the repo.
git@github.com:Harvey-OS/harvey
git branch
* GPL-C11

The build tool is called build, written in go. It is deliberately very
dumb, Andy Tanenbaum advised me to go with "standard C, not GCC like Linus
did, even though I told him not to!" [in his growly voice -- I miss that
guy] and another friend told me "C compilers are fast, just brute force it"

So instead of calling compilers for each C file, I call the compiler for
every file in a directory:
Building Libc
/usr/bin/clang -std=c11 -c -I /home/rminnich/harvey/harvey/amd64/include -I
/home/rminnich/harvey/harvey/sys/include -I . -fasm -ffreestanding
-fno-builtin -fno-omit-frame-pointer -fno-stack-protector -g -gdwarf-2
-ggdb -mcmodel=small -O0 -static -Wall -Werror -fcommon -mstack-alignment=4
9sys/abort.c 9sys/access.c 9sys/announce.c 9sys/convD2M.c 9sys/convM2D.c
9sys/convM2S.c 9sys/convS2M.c 9sys/cputime.c 9sys/ctime.c 9sys/dial.c
9sys/dirfstat.c 9sys/dirfwstat.c 9sys/dirmodefmt.c 9sys/dirread.c
9sys/dirstat.c 9sys/dirwstat.c 9sys/fcallfmt.c 9sys/fork.c
9sys/getnetconninfo.c 9sys/getenv.c 9sys/getpid.c 9sys/getppid.c
9sys/getwd.c 9sys/iounit.c 9sys/nulldir.c 9sys/postnote.c 9sys/privalloc.c
9sys/pushssl.c 9sys/pushtls.c 9sys/putenv.c 9sys/qlock.c 9sys/read9pmsg.c
9sys/read.c 9sys/readv.c 9sys/rerrstr.c 9sys/sbrk.c 9sys/setnetmtpt.c
9sys/sysfatal.c 9sys/syslog.c 9sys/sysname.c 9sys/time.c 9sys/times.c
9sys/tm2sec.c 9sys/truerand.c 9sys/wait.c 9sys/waitpid.c 9sys/werrstr.c
9sys/write.c 9sys/writev.c 9syscall/alarm.s 9syscall/await.s
9syscall/bind.s 9syscall/brk_.s 9syscall/chdir.s 9syscall/close.s
9syscall/create.s 9syscall/dup.s 9syscall/errstr.s 9syscall/exec.s
9syscall/_exits.s 9syscall/fauth.s 9syscall/fd2path.s 9syscall/fstat.s
9syscall/fversion.s 9syscall/fwstat.s 9syscall/mount.s 9syscall/noted.s
9syscall/notify.s 9syscall/nsec.s 9syscall/open.s 9syscall/pipe.s
9syscall/pread.s 9syscall/pwrite.s 9syscall/r0.s 9syscall/remove.s
9syscall/rendezvous.s 9syscall/rfork.s 9syscall/seek.s 9syscall/segattach.s
9syscall/segbrk.s 9syscall/segdetach.s 9syscall/segflush.s
9syscall/segfree.s 9syscall/semacquire.s 9syscall/semrelease.s
9syscall/sleep.s 9syscall/stat.s 9syscall/tsemacquire.s 9syscall/unmount.s
9syscall/wstat.s fmt/dofmt.c fmt/dorfmt.c fmt/errfmt.c fmt/fltfmt.c
fmt/fmt.c fmt/fmtfd.c fmt/fmtlock.c fmt/fmtprint.c fmt/fmtquote.c
fmt/fmtrune.c fmt/fmtstr.c fmt/fmtvprint.c fmt/fprint.c fmt/print.c
fmt/runefmtstr.c fmt/runeseprint.c fmt/runesmprint.c fmt/runesnprint.c
fmt/runesprint.c fmt/runevseprint.c fmt/runevsmprint.c fmt/runevsnprint.c
fmt/seprint.c fmt/smprint.c fmt/snprint.c fmt/sprint.c fmt/vfprint.c
fmt/vseprint.c fmt/vsmprint.c fmt/vsnprint.c port/_assert.c port/abs.c
port/asin.c port/atan.c port/atan2.c port/atexit.c port/atnotify.c
port/atof.c port/atol.c port/atoll.c port/cistrcmp.c port/cistrncmp.c
port/cistrstr.c port/charstod.c port/cleanname.c port/configstring.c
port/ctype.c port/encodefmt.c port/errno2str.c port/execl.c port/exp.c
port/fabs.c port/floor.c port/fmod.c port/frand.c port/frexp.c
port/getfields.c port/getuser.c port/hangup.c port/hashmap.c port/hypot.c
port/lnrand.c port/lock.c port/log.c port/lrand.c port/malloc.c
port/memccpy.c port/memchr.c port/memcmp.c port/memmove.c port/memset.c
port/mktemp.c port/muldiv.c port/nan.c port/needsrcquote.c port/netcrypt.c
port/netmkaddr.c port/nrand.c port/ntruerand.c port/perror.c port/pool.c
port/pow.c port/pow10.c port/qsort.c port/quote.c port/rand.c port/readn.c
port/reallocarray.c port/rijndael.c port/rune.c port/runebase.c
port/runebsearch.c port/runestrcat.c port/runestrchr.c port/runestrcmp.c
port/runestrcpy.c port/runestrecpy.c port/runestrdup.c port/runestrncat.c
port/runestrncmp.c port/runestrncpy.c port/runestrrchr.c port/runestrlen.c
port/runestrstr.c port/runetype.c port/sha2.c port/sin.c port/sinh.c
port/slice.c port/strcat.c port/strchr.c port/strcmp.c port/strcpy.c
port/strecpy.c port/strcspn.c port/strdup.c port/strlcat.c port/strlcpy.c
port/strlen.c port/strncat.c port/strncmp.c port/strncpy.c port/strpbrk.c
port/strrchr.c port/strspn.c port/strstr.c port/strtod.c port/strtok.c
port/strtol.c port/strtoll.c port/strtoul.c port/strtoull.c port/tan.c
port/tanh.c port/tokenize.c port/toupper.c port/utfecpy.c port/utflen.c
port/utfnlen.c port/utfrune.c port/utfrrune.c port/utfutf.c port/u16.c
port/u32.c port/u64.c amd64/notejmp.c amd64/cycles.c amd64/argv0.c
port/getcallstack.c amd64/rdpmc.c amd64/setjmp.s amd64/sqrt.s amd64/tas.s
amd64/atom.S amd64/main9.S

because clang is smart enough to only parse a .h file once.

There are a few other go tools. We got rid of all the random awk, rc, etc.
scripts, because we always forgot how they worked.

I just tried this:
go install harvey-os.org/cmd/decompress@latest
go install harvey-os.org/cmd/mksys@latest
go install harvey-os.org/cmd/build@latest
git clone git@github.com:Harvey-OS/harvey
cd harvey
ARCH=amd64 CC=gcc build

And, ironically, the one thing that fails is one of the go build steps,
since in the 10 years since we set this up, the Go build commands have
changed in incompatible ways.
The changes are a HUGE improvement, but it just means old projects like
this have an issue with go.

The C stuff all builds fine.

note also:
rminnich@pop-os:~/harvey/harvey$ build
You need to set the CC environment variable (e.g. gcc, clang, clang-3.6,
...)
rminnich@pop-os:~/harvey/harvey$ CC=clang build
You need to set the ARCH environment variable from: [amd64 riscv aarch64]
rminnich@pop-os:~/harvey/harvey$

You have lots of choices as to toolchain.

On Thu, Jan 8, 2026 at 4:45 AM <wb.kloke@gmail.com> wrote:

> On Wednesday, 7 January 2026, at 10:31 PM, ron minnich wrote:
>
> what we had planned for harvey was a good deal simpler: designate a part
> of the address space as a "bounce fault to user" space area.
>
> When a page fault in that area occurred, info about the fault was sent to
> an fd (if  it was opened) or a note handler.
>
> user could could handle the fault or punt, as it saw fit. The fixup was
> that user mode had to get the data to satisfy the fault, then tell the
> kernel what to do.
>
> This is much like the 35-years-ago work we did on AIX, called
> external pagers at the time; or the more recent umap work,
> https://computing.llnl.gov/projects/umap, used fairly widely in HPC.
>
> If you go this route, it's a bit less complex than what you are proposing.
>
>
> Thank you, this seems the nearest possible answer to my original question.
> The bad about umap is, of course, that it depends on a linuxism.
>
> BTW. Harvey is gone. Is the work done to crosscompile for plan9 on gcc
> accessible?
> *9fans <https://9fans.topicbox.com/latest>* / 9fans / see discussions
> <https://9fans.topicbox.com/groups/9fans> + participants
> <https://9fans.topicbox.com/groups/9fans/members> + delivery options
> <https://9fans.topicbox.com/groups/9fans/subscription> Permalink
> <https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M11a11204a789ee5dac250c3a>
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M2966fd4ffcd98eb245b77956
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 8682 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-07 21:31               ` ron minnich
  2026-01-08  7:56                 ` arnold
  2026-01-08 10:31                 ` wb.kloke
@ 2026-01-09  3:57                 ` Paul Lalonde
  2026-01-09  5:10                   ` ron minnich
  2 siblings, 1 reply; 39+ messages in thread
From: Paul Lalonde @ 2026-01-09  3:57 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 4386 bytes --]

Did the same on GPUs/Xeon Phi, including in the texture units.  Very useful
mechanism for abstracting compute with random access characteristics.

Paul

On Wed, Jan 7, 2026, 1:35 p.m. ron minnich <rminnich@gmail.com> wrote:

> what we had planned for harvey was a good deal simpler: designate a part
> of the address space as a "bounce fault to user" space area.
>
> When a page fault in that area occurred, info about the fault was sent to
> an fd (if  it was opened) or a note handler.
>
> user could could handle the fault or punt, as it saw fit. The fixup was
> that user mode had to get the data to satisfy the fault, then tell the
> kernel what to do.
>
> This is much like the 35-years-ago work we did on AIX, called
> external pagers at the time; or the more recent umap work,
> https://computing.llnl.gov/projects/umap, used fairly widely in HPC.
>
> If you go this route, it's a bit less complex than what you are proposing.
>
> On Wed, Jan 7, 2026 at 1:09 PM Bakul Shah via 9fans <9fans@9fans.net>
> wrote:
>
>>
>>
>> > On Jan 7, 2026, at 8:41 AM, ori@eigenstate.org wrote:
>> >
>> > Quoth Bakul Shah via 9fans <9fans@9fans.net>:
>> >> I have this idea that will horrify most of you!
>> >>
>> >> 1. Create an mmap device driver. You ask it to a new file handle which
>> you use to communicate about memory mapping.
>> >> 2. If you want to mmap some file, you open it and write its file
>> descriptor along with other parameters (file offset, base addr, size, mode,
>> flags) to your mmap file handle.
>> >> 3. The mmap driver sets up necessary page table entries but doesn't
>> actually fetch any data before returning from the write.
>> >> 4. It can asynchronously kick off io requests on your behalf and fixup
>> page table entries as needed.
>> >> 5. Page faults in the mmapped area are serviced by making appropriate
>> read/write calls.
>> >> 6. Flags can be used to indicate read-ahead or write-behind for
>> typical serial access.
>> >> 7. Similarly msync, munmap etc. can be implemented.
>> >>
>> >> In a sneaky way this avoids the need for adding any mmap specific
>> syscalls! But the underlying work would be mostly similar in either case.
>> >>
>> >> The main benefits of mmap are reduced initial latency , "pay as you
>> go" cost structure and ease of use. It is certainly more expensive than
>> reading/writing the same amount of data directly from a program.
>> >>
>> >> No idea how horrible a hack is needed to implement such a thing or
>> even if it is possible at all but I had to share this ;-)
>> >
>> > To what end? The problems with mmap have little to do with adding a
>> syscall;
>> > they're about how you do things like communicating I/O errors.
>> Especially
>> > when flushing the cache.
>> >
>> > Imagine the following setup -- I've imported 9p.io:
>> >
>> >        9fs 9pio
>> >
>> > and then I map a file from it:
>> >
>> >        mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
>> >
>> > Now, I want to write something into the file:
>> >
>> >        *mapped = 1234;
>> >
>> > The cached version of the page is dirty, so the OS will
>> > eventually need to flush it back with a 9p Twrite; Let's
>> > assume that before this happens, the network goes down.
>> >
>> > How do you communicate the error with userspace?
>> 
>> This was just a brainwave but...
>> 
>> You have a (control) connection with the mmap device to
>> set up mmap so might as well use it to convey errors!
>> This device would be strictly local to where a program
>> runs.
>> 
>> I'd even consider allowing a separate process to mmap,
>> by making an address space a first class object. That'd
>> move more stuff out of the kernel and allow for more
>> interesting/esoteric uses.
> *9fans <https://9fans.topicbox.com/latest>* / 9fans / see discussions
> <https://9fans.topicbox.com/groups/9fans> + participants
> <https://9fans.topicbox.com/groups/9fans/members> + delivery options
> <https://9fans.topicbox.com/groups/9fans/subscription> Permalink
> <https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mae5eb9a90d72008533969f26>
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mf3cfeeb18fd00292d3f9063f
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 6452 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09  3:57                 ` Paul Lalonde
@ 2026-01-09  5:10                   ` ron minnich
  2026-01-09  5:18                     ` arnold
  0 siblings, 1 reply; 39+ messages in thread
From: ron minnich @ 2026-01-09  5:10 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 4770 bytes --]

I would not tar the idea of external pagers with the Mach tarbrush. Mach
was pretty much inefficient at everything, including external pagers.
External pagers can work well, when implemented well.

On Thu, Jan 8, 2026 at 8:41 PM Paul Lalonde <paul.a.lalonde@gmail.com>
wrote:

> Did the same on GPUs/Xeon Phi, including in the texture units.  Very
> useful mechanism for abstracting compute with random access characteristics.
>
> Paul
>
> On Wed, Jan 7, 2026, 1:35 p.m. ron minnich <rminnich@gmail.com> wrote:
>
>> what we had planned for harvey was a good deal simpler: designate a part
>> of the address space as a "bounce fault to user" space area.
>>
>> When a page fault in that area occurred, info about the fault was sent to
>> an fd (if  it was opened) or a note handler.
>>
>> user could could handle the fault or punt, as it saw fit. The fixup was
>> that user mode had to get the data to satisfy the fault, then tell the
>> kernel what to do.
>>
>> This is much like the 35-years-ago work we did on AIX, called
>> external pagers at the time; or the more recent umap work,
>> https://computing.llnl.gov/projects/umap, used fairly widely in HPC.
>>
>> If you go this route, it's a bit less complex than what you are proposing.
>>
>> On Wed, Jan 7, 2026 at 1:09 PM Bakul Shah via 9fans <9fans@9fans.net>
>> wrote:
>>
>>>
>>>
>>> > On Jan 7, 2026, at 8:41 AM, ori@eigenstate.org wrote:
>>> >
>>> > Quoth Bakul Shah via 9fans <9fans@9fans.net>:
>>> >> I have this idea that will horrify most of you!
>>> >>
>>> >> 1. Create an mmap device driver. You ask it to a new file handle
>>> which you use to communicate about memory mapping.
>>> >> 2. If you want to mmap some file, you open it and write its file
>>> descriptor along with other parameters (file offset, base addr, size, mode,
>>> flags) to your mmap file handle.
>>> >> 3. The mmap driver sets up necessary page table entries but doesn't
>>> actually fetch any data before returning from the write.
>>> >> 4. It can asynchronously kick off io requests on your behalf and
>>> fixup page table entries as needed.
>>> >> 5. Page faults in the mmapped area are serviced by making appropriate
>>> read/write calls.
>>> >> 6. Flags can be used to indicate read-ahead or write-behind for
>>> typical serial access.
>>> >> 7. Similarly msync, munmap etc. can be implemented.
>>> >>
>>> >> In a sneaky way this avoids the need for adding any mmap specific
>>> syscalls! But the underlying work would be mostly similar in either case.
>>> >>
>>> >> The main benefits of mmap are reduced initial latency, "pay as you
>>> go" cost structure and ease of use. It is certainly more expensive than
>>> reading/writing the same amount of data directly from a program.
>>> >>
>>> >> No idea how horrible a hack is needed to implement such a thing or
>>> even if it is possible at all but I had to share this ;-)
>>> >
>>> > To what end? The problems with mmap have little to do with adding a
>>> syscall;
>>> > they're about how you do things like communicating I/O errors.
>>> Especially
>>> > when flushing the cache.
>>> >
>>> > Imagine the following setup -- I've imported 9p.io:
>>> >
>>> >        9fs 9pio
>>> >
>>> > and then I map a file from it:
>>> >
>>> >        mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
>>> >
>>> > Now, I want to write something into the file:
>>> >
>>> >        *mapped = 1234;
>>> >
>>> > The cached version of the page is dirty, so the OS will
>>> > eventually need to flush it back with a 9p Twrite; Let's
>>> > assume that before this happens, the network goes down.
>>> >
>>> > How do you communicate the error with userspace?
>>> 
>>> This was just a brainwave but...
>>> 
>>> You have a (control) connection with the mmap device to
>>> set up mmap so might as well use it to convey errors!
>>> This device would be strictly local to where a program
>>> runs.
>>> 
>>> I'd even consider allowing a separate process to mmap,
>>> by making an address space a first class object. That'd
>>> move more stuff out of the kernel and allow for more
>>> interesting/esoteric uses.
>> *9fans <https://9fans.topicbox.com/latest>* / 9fans / see discussions
> <https://9fans.topicbox.com/groups/9fans> + participants
> <https://9fans.topicbox.com/groups/9fans/members> + delivery options
> <https://9fans.topicbox.com/groups/9fans/subscription> Permalink
> <https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mf3cfeeb18fd00292d3f9063f>
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M56e476fb601ad6cca1a4d4fc
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 7141 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09  5:10                   ` ron minnich
@ 2026-01-09  5:18                     ` arnold
  2026-01-09  6:06                       ` David Leimbach via 9fans
  0 siblings, 1 reply; 39+ messages in thread
From: arnold @ 2026-01-09  5:18 UTC (permalink / raw)
  To: 9fans

I vaguely remember someone being quoted as saying

        Microkernels don't have to be small. They just have to
        not do much.

:-)

ron minnich <rminnich@gmail.com> wrote:

> I would not tar the idea of external pagers with the Mach tarbrush. Mach
> was pretty much inefficient at everything, including external pagers.
> External pagers can work well, when implemented well.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mb4abae3026a42ab768f9f6db
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09  5:18                     ` arnold
@ 2026-01-09  6:06                       ` David Leimbach via 9fans
  2026-01-09 17:13                         ` ron minnich
  2026-01-09 17:39                         ` tlaronde
  0 siblings, 2 replies; 39+ messages in thread
From: David Leimbach via 9fans @ 2026-01-09  6:06 UTC (permalink / raw)
  To: 9fans; +Cc: 9fans

I’d been impressed by L4.  It’s certainly been deployed pretty broadly.

And it has recursive pagers but … not sure how that’s used in practice.

And there are a bunch of variants.
Sent from my iPhone

> On Jan 8, 2026, at 9:23 PM, arnold@skeeve.com wrote:
> 
> I vaguely remember someone being quoted as saying
> 
>        Microkernels don't have to be small. They just have to
>        not do much.
> 
> :-)

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M4dfecc367000953ec3cd500e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09  6:06                       ` David Leimbach via 9fans
@ 2026-01-09 17:13                         ` ron minnich
  2026-01-09 17:39                         ` tlaronde
  1 sibling, 0 replies; 39+ messages in thread
From: ron minnich @ 2026-01-09 17:13 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 6015 bytes --]

Right, so, getting back to the original discussion, Bakul, I think the
right path forward is to implement a device that supports an external
pager, rather than mmap.

But code wins, so, quick, somebody, implement something :-)

On Thu, Jan 8, 2026 at 11:03 PM David Leimbach via 9fans <9fans@9fans.net>
wrote:

> I’d been impressed by L4.  It’s certainly been deployed pretty broadly.
>
> And it has recursive pagers but … not sure how that’s used in practice.
>
> And there are a bunch of variants.
> Sent from my iPhone

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mfe8a1e4feddaad7bebb650ea
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Attachment #2: Type: text/html, Size: 10724 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09  6:06                       ` David Leimbach via 9fans
  2026-01-09 17:13                         ` ron minnich
@ 2026-01-09 17:39                         ` tlaronde
  2026-01-09 19:48                           ` David Leimbach via 9fans
  1 sibling, 1 reply; 39+ messages in thread
From: tlaronde @ 2026-01-09 17:39 UTC (permalink / raw)
  To: 9fans

On Thu, Jan 08, 2026 at 10:06:26PM -0800, David Leimbach via 9fans wrote:
> I'd been impressed by L4.  It's certainly been deployed pretty broadly.
> 

Was not L4 rewritten in assembly for performance purposes?

T. Laronde

> And it has recursive pagers but … not sure how that's used in practice.
> 
> And there are a bunch of variants.
> Sent from my iPhone

-- 
        Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M8bc645f9949b2eb1466815cb
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
  2026-01-09 17:39                         ` tlaronde
@ 2026-01-09 19:48                           ` David Leimbach via 9fans
  0 siblings, 0 replies; 39+ messages in thread
From: David Leimbach via 9fans @ 2026-01-09 19:48 UTC (permalink / raw)
  To: 9fans; +Cc: 9fans


Sent from my iPhone

> On Jan 9, 2026, at 9:57 AM, tlaronde@kergis.com wrote:
> 
> On Thu, Jan 08, 2026 at 10:06:26PM -0800, David Leimbach via 9fans wrote:
>> I'd been impressed by L4.  It's certainly been deployed pretty broadly.
>> 
> 
> Was not L4 rewritten in assembly for performance purposes?
> 
> T. Laronde

The opposite. L3 was assembly. L4 is basically a spec now.  Many implementations. C and C++ (and Haskell I believe)

My point was you can do recursive paging.  It has been shown.


>> And it has recursive pagers but ? not sure how that?s used in practice.
>> 
>> And there are a bunch of variants.
>> Sent from my iPhone
>> 
>>>> On Jan 8, 2026, at 9:23?PM, arnold@skeeve.com wrote:
>>> 
>>> ?I vaguely remember someone being quoted as saying
>>> 
>>>       Microkernels don't have to be small. They just have to
>>>       not do much.
>>> 
>>> :-)
>>> 
>>> ron minnich <rminnich@gmail.com> wrote:
>>> 
>>>> I would not tar the idea of external pagers with the Mach tarbrush. Mach
>>>> was pretty much inefficient at everything, including external pagers.
>>>> External pagers can work well, when implemented well.
>>>> 
>>>>> On Thu, Jan 8, 2026 at 8:41?PM Paul Lalonde <paul.a.lalonde@gmail.com>
>>>>> wrote:
>>>>> 
>>>>> Did the same on GPUs/Xeon Phi, including in the texture units.  Very
>>>>> useful mechanism for abstracting compute with random access characteristics.
>>>>> 
>>>>> Paul
>>>>> 
>>>>>> On Wed, Jan 7, 2026, 1:35?p.m. ron minnich <rminnich@gmail.com> wrote:
>>>>>> what we had planned for harvey was a good deal simpler: designate a part
>>>>>> of the address space as a "bounce fault to user" space area.
>>>>>> 
>>>>>> When a page fault in that area occurred, info about the fault was sent to
>>>>>> an fd (if  it was opened) or a note handler.
>>>>>> 
>>>>>> user could could handle the fault or punt, as it saw fit. The fixup was
>>>>>> that user mode had to get the data to satisfy the fault, then tell the
>>>>>> kernel what to do.
>>>>>> 
>>>>>> This is much like the 35-years-ago work we did on AIX, called
>>>>>> external pagers at the time; or the more recent umap work,
>>>>>> https://computing.llnl.gov/projects/umap, used fairly widely in HPC.
>>>>>> 
>>>>>> If you go this route, it's a bit less complex than what you are proposing.
>>>>>> 
>>>>>> On Wed, Jan 7, 2026 at 1:09?PM Bakul Shah via 9fans <9fans@9fans.net>
>>>>>> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jan 7, 2026, at 8:41?AM, ori@eigenstate.org wrote:
>>>>>>>> 
>>>>>>>> Quoth Bakul Shah via 9fans <9fans@9fans.net>:
>>>>>>>>> I have this idea that will horrify most of you!
>>>>>>>>> 
>>>>>>>>> 1. Create an mmap device driver. You ask it to a new file handle
>>>>>>> which you use to communicate about memory mapping.
>>>>>>>>> 2. If you want to mmap some file, you open it and write its file
>>>>>>> descriptor along with other parameters (file offset, base addr, size, mode,
>>>>>>> flags) to your mmap file handle.
>>>>>>>>> 3. The mmap driver sets up necessary page table entries but doesn't
>>>>>>> actually fetch any data before returning from the write.
>>>>>>>>> 4. It can asynchronously kick off io requests on your behalf and
>>>>>>> fixup page table entries as needed.
>>>>>>>>> 5. Page faults in the mmapped area are serviced by making appropriate
>>>>>>> read/write calls.
>>>>>>>>> 6. Flags can be used to indicate read-ahead or write-behind for
>>>>>>> typical serial access.
>>>>>>>>> 7. Similarly msync, munmap etc. can be implemented.
>>>>>>>>> 
>>>>>>>>> In a sneaky way this avoids the need for adding any mmap specific
>>>>>>> syscalls! But the underlying work would be mostly similar in either case.
>>>>>>>>> 
>>>>>>>>> The main benefits of mmap are reduced initial latency , "pay as you
>>>>>>> go" cost structure and ease of use. It is certainly more expensive than
>>>>>>> reading/writing the same amount of data directly from a program.
>>>>>>>>> 
>>>>>>>>> No idea how horrible a hack is needed to implement such a thing or
>>>>>>> even if it is possible at all but I had to share this ;-)
>>>>>>>> 
>>>>>>>> To what end? The problems with mmap have little to do with adding a
>>>>>>> syscall;
>>>>>>>> they're about how you do things like communicating I/O errors.
>>>>>>> Especially
>>>>>>>> when flushing the cache.
>>>>>>>> 
>>>>>>>> Imagine the following setup -- I've imported 9p.io:
>>>>>>>> 
>>>>>>>>      9fs 9pio
>>>>>>>> 
>>>>>>>> and then I map a file from it:
>>>>>>>> 
>>>>>>>>      mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
>>>>>>>> 
>>>>>>>> Now, I want to write something into the file:
>>>>>>>> 
>>>>>>>>      *mapped = 1234;
>>>>>>>> 
>>>>>>>> The cached version of the page is dirty, so the OS will
>>>>>>>> eventually need to flush it back with a 9p Twrite. Let's
>>>>>>>> assume that before this happens, the network goes down.
>>>>>>>> 
>>>>>>>> How do you communicate the error with userspace?
>>>>>>> 
>>>>>>> This was just a brainwave but...
>>>>>>> 
>>>>>>> You have a (control) connection with the mmap device to
>>>>>>> set up the mapping, so you might as well use it to convey errors!
>>>>>>> This device would be strictly local to where a program
>>>>>>> runs.
>>>>>>> 
>>>>>>> I'd even consider allowing a separate process to mmap,
>>>>>>> by making an address space a first class object. That'd
>>>>>>> move more stuff out of the kernel and allow for more
>>>>>>> interesting/esoteric uses.
>>>>> 
> 
> --
> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
>              http://www.kergis.com/
>             http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mae35de7f0d27935dac0dd51b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2026-01-09 20:07 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-02 19:54 [9fans] venti /plan9port mmapped wb.kloke
2026-01-02 20:39 ` ori
2026-01-02 20:58   ` Bakul Shah via 9fans
2026-01-06 22:59     ` Ron Minnich
2026-01-07  4:27       ` Noam Preil
2026-01-07  6:15       ` Shawn Rutledge
2026-01-07 15:46         ` Persistent memory (was Re: [9fans] venti /plan9port mmapped) arnold
2026-01-07 16:11           ` Noam Preil
2026-01-07 17:26             ` Wes Kussmaul
2026-01-07  8:52       ` [9fans] venti /plan9port mmapped wb.kloke
2026-01-07 16:30         ` mmaping on plan9? (was " Bakul Shah via 9fans
2026-01-07 16:40           ` Noam Preil
2026-01-07 16:41           ` ori
2026-01-07 20:35             ` Bakul Shah via 9fans
2026-01-07 21:31               ` ron minnich
2026-01-08  7:56                 ` arnold
2026-01-08 10:31                 ` wb.kloke
2026-01-09  0:02                   ` ron minnich
2026-01-09  3:57                 ` Paul Lalonde
2026-01-09  5:10                   ` ron minnich
2026-01-09  5:18                     ` arnold
2026-01-09  6:06                       ` David Leimbach via 9fans
2026-01-09 17:13                         ` ron minnich
2026-01-09 17:39                         ` tlaronde
2026-01-09 19:48                           ` David Leimbach via 9fans
2026-01-07 21:40               ` ori
2026-01-07 16:52           ` ori
2026-01-07 17:37             ` wb.kloke
2026-01-07 17:46               ` Noam Preil
2026-01-07 17:56                 ` wb.kloke
2026-01-07 18:07                   ` Noam Preil
2026-01-07 18:58                     ` wb.kloke
2026-01-07 14:57       ` Thaddeus Woskowiak
2026-01-07 16:07         ` Wes Kussmaul
2026-01-07 16:22           ` Noam Preil
2026-01-07 17:31             ` Wes Kussmaul
2026-01-07 16:13         ` Noam Preil
2026-01-02 21:01   ` ori
2026-01-08 15:59     ` wb.kloke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).