9fans - fans of the OS Plan 9 from Bell Labs
From: Dan Cross <crossd@gmail.com>
To: 9fans <9fans@9fans.net>
Subject: Re: mmaping on plan9? (was Re: [9fans] venti /plan9port mmapped
Date: Thu, 12 Feb 2026 20:36:10 -0500
Message-ID: <CAEoi9W525QzT5_K1m6fntPmXVknAuYq1=1CU19G2r5EiDU0Vew@mail.gmail.com>
In-Reply-To: <17708854590.f343ba.4455@composer.9fans.topicbox.com>

On Thu, Feb 12, 2026 at 6:44 AM Alyssa M via 9fans <9fans@9fans.net> wrote:
> > On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
> > > On Wed, Feb 11, 2026 at 11:08 PM Alyssa M via 9fans <9fans@9fans.net> wrote:
> > >
> > > > On Wednesday, February 11, 2026, at 10:01 AM, hiro wrote:
> > > > what concrete problem are you trying to solve?
> > >
> > > Making software simpler to write, I think.
> >
> > I don't understand that. If the interface doesn't change, how is it simpler?
>
> Think of a program that reads a file completely into memory, pokes at it a
> bit sparsely then writes the whole file out again. This is simple if the file is small.
> If the file gets big, you might start looking around for ways to not do all that I/O,
> and pretty soon you have a buffer cache implementation. So the program is now
> more complex. Not only is there a buffer cache implementation, but you have to
> use it everywhere, rather than just operating on memory.

I don't see how this really works; in particular, the semantics of
read/write are simply different from those of `mmap`.  In the former
case, to read a file into memory, I have to know how big it is (I can
just stat it) and then I have to allocate memory to hold its contents,
and then I expect `read` to copy the contents of the file into the
memory region I just allocated.  Note that there is a "happens before"
relationship between allocating memory and then reading the contents
of the file into that memory.  With mapping a file into virtual
memory, I'm simultaneously allocating address space _and_ arranging
things so that accesses to that region of address space correspond to
parts of the mapped file.
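
To make the contrast concrete, here is a minimal sketch of the two
models.  It assumes a POSIX system (Plan 9 itself has no `mmap`), and
the helper names `slurp` and `mapfile` are made up for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* read model: allocate first, then copy the file's bytes into the buffer. */
void *
slurp(const char *path, size_t *len)
{
	struct stat st;
	char *buf;
	size_t off;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	buf = malloc(st.st_size);	/* allocation happens before any I/O */
	if (buf == NULL) {
		close(fd);
		return NULL;
	}
	for (off = 0; off < (size_t)st.st_size; off += n) {
		n = read(fd, buf + off, st.st_size - off);
		if (n <= 0) {		/* error or unexpected EOF */
			free(buf);
			close(fd);
			return NULL;
		}
	}
	close(fd);
	*len = off;
	return buf;
}

/* mmap model: one call reserves address space and ties it to the file. */
void *
mapfile(const char *path, size_t *len)
{
	struct stat st;
	void *p;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping outlives the descriptor */
	if (p == MAP_FAILED)
		return NULL;
	*len = st.st_size;
	return p;
}
```

The caller of `slurp` owns a private copy and releases it with `free`;
the caller of `mapfile` owns a mapping and releases it with `munmap`.
The bytes compare equal, but how they got there differs.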

You seem to be proposing a model that somehow pushes enough smarts
into `read` to combine the two, as in the `mmap` case; but how does
that work from a day-to-day programming perspective?  Suppose I go and
allocate a bunch of memory, and then immediately stream a bunch of
data from /dev/random and write it into that memory; the contents of
each page are thus random, and now there's no good way for the VM
system to do anything clever like acknowledge success but not _really_
allocate until I demand fault it in by accessing it (I already did by
scribbling all over it), nor can it do something like say, "oh, these
bits are all zeros; I'll just map this to a single global zero page
and trap stores and CoW", since the contents are random, not uniform.

Now, with these preconditions set, I go to `read` a big file into that
memory: what should the system do?

_An_ argument is that it should just discard the prior contents, since
they are logically overwritten by the contents of the file, anyway.
But that's not general: you aren't guaranteed that the buffer
you're reading into is properly aligned to do a bunch of page mapping
shenanigans.  Read doesn't care: it just copies bytes, but pages of
memory are both sized and aligned: `mmap` returns a pointer aligned to
a page boundary, and requests for fixed mappings enforce this and will
fail if given a non-aligned offset.
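
A small sketch of the alignment point, again assuming POSIX `mmap`;
`pagealigned` and `mapat` are hypothetical helper names.  POSIX lists
`EINVAL` for a file offset that is not a multiple of the page size, and
that is the failure the second helper is meant to expose.

```c
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero if p sits on a page boundary. */
int
pagealigned(const void *p)
{
	uintptr_t pgsz = (uintptr_t)sysconf(_SC_PAGESIZE);

	return (uintptr_t)p % pgsz == 0;
}

/* Map len bytes of fd starting at file offset off; NULL on failure. */
void *
mapat(int fd, size_t len, off_t off)
{
	void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);

	return p == MAP_FAILED ? NULL : p;
}
```

On the systems I would expect this to run on, `mapat(fd, 1, 0)` returns
a page-aligned pointer and `mapat(fd, 1, 1)` fails.  A buffer handed to
`read`, by contrast, can start at any byte address.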

But also, suppose that instead of one big read, I do something like:
`loop ... { seek(fd, 1, 1); read(fd, p + 1, 4093); p += 4093; }` to
copy into this region of memory I've mangled. Now you've got to deal
with access patterns that mix pre-existing data with data newly copied
from the file.  "Well, copy part of the file contents into a newly
allocated page..." might be an answer there, but that's not
substantially different than what `read` does today, so what's the
differentiator?
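
Spelled out as a sketch in POSIX C (`lseek` standing in for Plan 9's
`seek`), the loop above might look like the following.  The name
`mixread` and the `chunk` parameter are assumptions for illustration;
the loop in the text uses a fixed 4093.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Skip one byte of the file, then read up to chunk bytes to an odd
 * offset in buf; repeat until EOF or the buffer is full.  Returns the
 * number of bytes copied from the file, or -1 on error.
 */
ssize_t
mixread(int fd, char *buf, size_t buflen, size_t chunk)
{
	char *p = buf;
	ssize_t n, total = 0;

	while ((size_t)(p - buf) + 1 + chunk <= buflen) {
		if (lseek(fd, 1, SEEK_CUR) < 0)	/* skip one byte of the file */
			return -1;
		n = read(fd, p + 1, chunk);	/* destination is p + 1 */
		if (n < 0)
			return -1;
		if (n == 0)			/* EOF */
			break;
		total += n;
		p += chunk;
	}
	return total;
}
```

With a chunk of 4093 and 4096-byte pages, neither the file offsets nor
the destination addresses line up with page boundaries, and `buf[0]`
keeps whatever was there before.  Any page-remapping scheme has to cope
with that mix.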

> This is when mmap starts to look appealing.

The key to making a good argument here is first acknowledging that the
whole model for working with data is just fundamentally different with
`mmap` than it is with `read`. You really can't treat them as the same.

Let me be blunt: the `mmap` interface, as specified in 4.2BSD and
implemented in a bunch of Unix and Unix-like systems, is atrocious.
Its roots come from a system that was radically different in design
than Unix, and its baroque design, with a bunch of operations
multiplexed onto a single call with 6 (!!) arguments, two of which are
bitmaps that interact in abstruse ways and one of which can radically
alter the semantics of the call, really shows. I believe that it _is_
possible to do better. But shoehorning the model of memory-mapped IO
into an overloaded `read` is not it.

        - Dan C.


> On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
>
> The other [use] is to map the contents of a file into an address space, so that you can treat them like memory, without first reading them from an actual file. This is useful for large but sparse read-only data files: I don't need to read the entire thing into physical memory; for that matter it may not even fit into physical memory. But if I mmap it, and just copy the bits I need, then those can be faulted into the address space on demand.
>
>
> So what I'm suggesting is that instead of the programmer making an mmap call, they should make a single read call to read the entire file into the address space - as they did before. The new read implementation would do this, but as a memory mapped snapshot. This looks no different to the programmer from how reads have always worked, it just happens very quickly, because no I/O actually happens.
> The snapshot data is brought in by demand paging as it is touched, and pages may get dirtied.
>
> When the programmer would otherwise call msync, they instead write out the entire file back where it came from - as they did before. The write implementation will recognise when it's overwriting the file where the snapshot came from and will only write the dirty pages - which is effectively what msync does.
>
> So from the programmer's point of view this is exactly what they've always done. The implementation uses c-o-w snapshots and demand paging which have the performance of mmap, but provide the conventional semantics of read and write.
>
> Programs can handle larger files faster without having to change.
> It's just an optimisation in the read/write implementation.
>
> So that's the idea. Is it practical? I don't know... It's certainly harder to do.
>
> One difference with mmap is that dirty pages don't get back to the file by themselves. You have to do the writes. But I think there may be ways to address this.
>
> On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
>
> The problem is, those aren't the right analogues for the file metaphor. `mmap` is closer to `open` than to `read`
>
> In the sense that mmap creates an association between pages and the file and munmap undoes that, yes. With the idea above the page association is with snapshots and is a bit more ephemeral, and I don't know yet how much it matters if it persists after it's no longer needed. Pages are disassociated from snapshots naturally by being dirtied, by being associated with something else or perhaps by memory being deallocated. It may be somewhat like file deletion. Sometimes when it's 'gone' it's not really gone until the last user lets go. I don't think it's a problem for the process, but it may be for the file system in some situations.
>
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M67e7be4c741cd85745124418
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

