* [TUHS] Re: SCCS, TeamWare, BitKeeper, and Git
@ 2024-12-15 20:48 Douglas McIlroy
2024-12-15 20:57 ` Larry McVoy
2024-12-15 23:05 ` [TUHS] Re: mmap, was " John Levine
0 siblings, 2 replies; 16+ messages in thread
From: Douglas McIlroy @ 2024-12-15 20:48 UTC (permalink / raw)
To: TUHS main list
> well after Unix had fledged, its developers at CSRC found it necessary
> and/or desirable to borrow back a Multics concept: they named it mmap().
As far as I know no Research version of Unix ever had mmap.
Multics had a segmented universal memory. A process incorporated
segments into its address space. The universal memory was normally
addressed via a hierarchical segment-name directory. With enhancement
to provide for multisegment "files", the directory could serve as a file
system and file I/O became data transfer between segments.
Unix originally imitated the Multics file system, but not the universal
memory. mmap(2) weakly imitates universal memory by allowing a process
to nominally incorporate a portion of a file into the process address space
at page-level granularity. However, an update is guaranteed to be visible
to the file and other processes only upon specific request.
Does anyone know whether there are implementations of mmap that
do transparent file sharing? It seems to me that should be possible by
making the buffer cache share pages with mmapping processes.
Doug
^ permalink raw reply [flat|nested] 16+ messages in thread
* [TUHS] Re: SCCS, TeamWare, BitKeeper, and Git
2024-12-15 20:48 [TUHS] Re: SCCS, TeamWare, BitKeeper, and Git Douglas McIlroy
@ 2024-12-15 20:57 ` Larry McVoy
2024-12-15 23:05 ` [TUHS] Re: mmap, was " John Levine
1 sibling, 0 replies; 16+ messages in thread
From: Larry McVoy @ 2024-12-15 20:57 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: TUHS main list
On Sun, Dec 15, 2024 at 03:48:40PM -0500, Douglas McIlroy wrote:
> Unix originally imitated the Multics file system, but not the universal
> memory. mmap(2) weakly imitates universal memory by allowing a process
> to nominally incorporate a portion of a file into the process address space
> at page-level granularity. However, an update is guaranteed to be visible
> to the file and other processes only upon specific request.
That's not true with Sun's mmap(). It's coherent across process boundaries
and it's in sync with read()/write(). This is because Sun got rid of the
buffer cache and _only_ did file IO to/from the page cache. You mapped
the actual page into your address space, if one process wrote it, the
other process will see it.
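
A minimal sketch of the read()/write() coherence described here, assuming a kernel with a unified page cache (current Linux behaves this way in practice; POSIX does not promise the result everywhere). The helper name `mapping_sees_pwrite` and the /tmp scratch file are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps a scratch file, then changes the file with
 * pwrite() and checks whether the already-established mapping shows the
 * new byte. Returns 1 if it does, 0 if it does not, -1 on setup error. */
int mapping_sees_pwrite(void)
{
    char path[] = "/tmp/mmap_rw_XXXXXX";   /* assumed writable scratch dir */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* storage is freed on close() */

    int result = -1;
    if (write(fd, "a", 1) == 1) {
        volatile char *p = mmap(NULL, 1, PROT_READ, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            assert(p[0] == 'a');           /* mapping shows initial contents */
            if (pwrite(fd, "b", 1, 0) == 1)
                result = (p[0] == 'b');    /* no msync(), no remapping */
            munmap((void *)p, 1);
        }
    }
    close(fd);
    return result;
}
```

On a system that keeps a separate buffer cache for read()/write(), the function could return 0 instead, which is the situation the original question was about.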
> Does anyone know whether there are implementations of mmap that
> do transparent file sharing? It seems to me that should be possible by
> making the buffer cache share pages with mmapping processes.
SunOS 4.0 did what I believe you are asking.
http://mcvoy.com/lm/papers/SunOS.vm_arch.pdf
http://mcvoy.com/lm/papers/SunOS.vm_impl.pdf
--
---
Larry McVoy Retired to fishing http://www.mcvoy.com/lm/boat
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-15 20:48 [TUHS] Re: SCCS, TeamWare, BitKeeper, and Git Douglas McIlroy
2024-12-15 20:57 ` Larry McVoy
@ 2024-12-15 23:05 ` John Levine
1 sibling, 0 replies; 16+ messages in thread
From: John Levine @ 2024-12-15 23:05 UTC (permalink / raw)
To: tuhs; +Cc: douglas.mcilroy
It appears that Douglas McIlroy <douglas.mcilroy@dartmouth.edu> said:
>Unix originally imitated the Multics file system, but not the universal
>memory. mmap(2) weakly imitates universal memory by allowing a process
>to nominally incorporate a portion of a file into the process address space
>at page-level granularity. However, an update is guaranteed to be visible
>to the file and other processes only upon specific request.
>
>Does anyone know whether there are implementations of mmap that
>do transparent file sharing? It seems to me that should be possible by
>making the buffer cache share pages with mmapping processes.
These days they all do. The POSIX rationale says:
A memory object can be concurrently mapped into the address space of one or more
processes. The mmap( ) and munmap( ) functions allow a process to manipulate their address
space by mapping portions of memory objects into it and removing them from it. When
multiple processes map the same memory object, they can share access to the underlying
data.
"Memory object" includes disk files.
There are MAP_SHARED and MAP_PRIVATE flags that say whether changes are written
through or copy-on-write to local private pages.
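
A short sketch of the difference between the two flags, assuming a POSIX system with a writable /tmp and a kernel where mapped stores and pread() share one cache (as on current Linux). The helper name `file_byte_after_mapped_store` is invented for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: stores one byte through a mapping created with
 * `flags` (MAP_SHARED or MAP_PRIVATE) and returns the byte that pread()
 * then finds in the file, or -1 on setup error. */
int file_byte_after_mapped_store(int flags)
{
    char path[] = "/tmp/mmap_flags_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* scratch file vanishes on close() */

    int result = -1;
    if (write(fd, "a", 1) == 1) {
        volatile char *p =
            mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
        if (p != MAP_FAILED) {
            assert(p[0] == 'a');        /* both kinds start from file data */
            p[0] = 'b';                 /* store through the mapping */
            char c;
            if (pread(fd, &c, 1, 0) == 1)
                result = c;             /* what the file itself now holds */
            munmap((void *)p, 1);
        }
    }
    close(fd);
    return result;
}
```

Called with MAP_PRIVATE the file should still hold 'a', because the store went to a private copy-on-write page; called with MAP_SHARED the file should hold 'b'.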
R's,
John
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 22:43 ` John Levine
@ 2024-12-16 23:19 ` Chet Ramey via TUHS
0 siblings, 0 replies; 16+ messages in thread
From: Chet Ramey via TUHS @ 2024-12-16 23:19 UTC (permalink / raw)
To: John Levine, tuhs
On 12/16/24 5:43 PM, John Levine wrote:
>> Here are the current (POSIX.1-2024) versions of the appropriate definitions
>> and the mmap() description, for those who want to read along.
>
> Well, that's confusing. That web site seems to have the same text as the
> IEEE's PDF but organized differently.
Sure, it's the Open Group. The advantage is it's public and on the web.
(Plus I always have a browser tab open on it.)
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 19:45 ` Chet Ramey via TUHS
@ 2024-12-16 22:43 ` John Levine
2024-12-16 23:19 ` Chet Ramey via TUHS
0 siblings, 1 reply; 16+ messages in thread
From: John Levine @ 2024-12-16 22:43 UTC (permalink / raw)
To: tuhs
It appears that Chet Ramey via TUHS <chet.ramey@case.edu> said:
>On 12/16/24 2:08 PM, John Levine wrote:
>> It appears that Douglas McIlroy <douglas.mcilroy@dartmouth.edu> said:
>>> [Weasel words of my own: I have not read the POSIX definition of mmap.]
>>
>> We're in luck, I have it right here. On pages 529-530:
>
>Here are the current (POSIX.1-2024) versions of the appropriate definitions
>and the mmap() description, for those who want to read along.
Well, that's confusing. That web site seems to have the same text as the
IEEE's PDF but organized differently.
>> 2.8.3.2 Memory Mapped Files
>
>Is that from the Rationale?
No, like I said it's on pages 529-530 of the PDF version of IEEE 1003.1-2024
which I downloaded from the IEEE.
Your web site has a copy here
https://pubs.opengroup.org/onlinepubs/9799919799/functions/V2_chap02.html#tag_16_08
R's,
John
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 15:40 Douglas McIlroy
` (2 preceding siblings ...)
2024-12-16 19:08 ` John Levine
@ 2024-12-16 22:19 ` Bakul Shah via TUHS
3 siblings, 0 replies; 16+ messages in thread
From: Bakul Shah via TUHS @ 2024-12-16 22:19 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: TUHS main list
On Dec 16, 2024, at 7:40 AM, Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:
>
>
>> ... When multiple processes map the same memory object, they can
>> share access to the underlying data.
>
> Notice the weasel word "can". It is not guaranteed that they will do so
> automatically without delay. Apparently each process may have a physically
> distinct copy of the data, not shared access to a single location.
Maybe this has to do with using POSIX on multiprocessor systems that
are *not* cache coherent? That may be less of an issue for now[1], but
weren't there such systems in the past?
In any case shared access by itself is not enough for proper operation.
Bakul
[1] Given very high clock rates and increasing # of cores, I wonder
how long before they start using CSMA/CD for coherency protocols!
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 20:58 ` Clem Cole
@ 2024-12-16 21:50 ` Steffen Nurpmeso
0 siblings, 0 replies; 16+ messages in thread
From: Steffen Nurpmeso @ 2024-12-16 21:50 UTC (permalink / raw)
To: Clem Cole; +Cc: Douglas McIlroy, TUHS main list
Clem Cole wrote in
<CAC20D2OzCzf=zpGaPzNW-_tF6AEsw-EWc8kGnL_usYLuvdCh0A@mail.gmail.com>:
|On Mon, Dec 16, 2024 at 2:40 PM Steffen Nurpmeso <steffen@sdaoden.eu> \
|wrote:
|> They are all lowercase nowadays.
|>
|They were always case-sensitive to use. I used upper case to make it stand
|out in my message.
Ah, ok. In recent years i have been reading lots of IETF output, and
for them this makes a world of difference.
|>|have try dig up the troff sources to an early draft of the original and
|> do
|>|a "grep -i should * | wc -l" through it. I bet the number is very small
|>|and "can" is zero.
|>
|> They offer lots of hints like "can be implemented efficiently" or
|> "conforming implementations cannot count on", which turns "can"
|> into a "dying butterfly". (In the rationale's.)
|>
|It always say. *"conforming implementations shall not count on."*
|The standard has to be as precise as it can be. If there are
|implementation grey areas, then the standard needs to state that too.
|
|Again - we were careful when we wrote original documents (and had lots of
|arguments as you can imagine).
|
| As for what you read, I'm not worried. The fact the over time, standards
|stop being what they were originally intended and take a life of their
|own. POSIX is less of an issue than say, C++ or even FORTRAN for that
|matter. But over time, a new group of people want to "make their mark."
..i would even put on top that if someone does *not* want to do
that, they react irritated. "Too many marks", at least in the
IETF world.
|Some of us grew tired of arguing. We got what we originally set out to do,
|and I left the POSIX work soon after the *.2 began. I worked through one
|draft of it, but was not there for approval, although I helped with *.4 at
|one point.
|
|Someone implementing something new, be it a language or a system, should be
|familiar with the standards. They is a lot be learned both good and bad.
|Reinventing things, particularly bad ideas, is hardly for the greater good
|(although the implementor might find it fun). e.g. C++, IMO, is a great
Wise words of an, please excuse that, older man. It is surely
nothing but true, but it will not quell the overflowing hormones of
the young warriors, nor false pride, nor .. etc etc etc. I do not
know. One must not prevent young people from making their own
experiences for sure, at best they can be guided a bit to not run
into fatality dead-ends, maybe.
|example of what >>not<< to do. The core POSIX.1 interface, think is
|excellent and gets to the point. After that, YMMV and the value you get
|from it also is a bit variable. But thinking the world should be stagnant
|and believing that C or UNIX (or modern Unix, a.k.a. Linux) is "the end" is
|hardly a good idea either.
The holy men of India even do nothing at all! Except walking to
the fountain of the holy river. And already in sight, the water
drag dies into the ocean. You know, i have no idea regarding
technology, it surely will iterate further. Surely the computer
of for example Space Odyssey was created after long talks with wise
men, in the spirit of Licklider etc. The big picture.
But regarding "the end", i for myself do not like the Rust
programming language, regardless what you say.
|So, reading and learning about them does have some merit. The question is
|how much and how far to go.
It is surely more than only merit. To the opposite, one should
not drown in deep respect aka be slain due to the high
intellectual penetration of a document like the POSIX standard,
which has been in the works for forty years and has seen hundreds of
highly educated people. (And at much later times, the one or the
other idiot.)
I, by the way, would expect the traditional answer being "learn
the abc/koran/calligraphy/.. studiously while you are young, then
gain the mastership and pleasure through living experience when
you are older". As well as, and that is very important in my
humble opinion, "if it doesn't come naturally, leave it".
Ie, books are one thing, living it is another.
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
|
|And in Fall, feel "The Dropbear Bard"s ball(s).
|
|The banded bear
|without a care,
|Banged on himself for e'er and e'er
|
|Farewell, dear collar bear
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
[not found] ` <Z2CAuymbU9HkF38q@kib.kiev.ua>
@ 2024-12-16 21:08 ` John R Levine
0 siblings, 0 replies; 16+ messages in thread
From: John R Levine @ 2024-12-16 21:08 UTC (permalink / raw)
To: Konstantin Belousov; +Cc: tuhs, douglas.mcilroy
On Mon, 16 Dec 2024, Konstantin Belousov wrote:
> On Mon, Dec 16, 2024 at 02:08:43PM -0500, John Levine wrote:
>> PS: I can believe there are some versions of linux that screwed up disk cache
>> coherency, but that just means they don't properly implement the spec, not for
>> the first time. I mean, it's not *that* hard to make all the maps point to the
>> same physical page frame, even on a machine like POWER with reverse page maps.
>
> This is not enough. There are (were ?) architectures, typically with the
> virtually addressed caches, which require all mappings of the same page
> to be suitably aligned, at least. ...
>
> If addresses of different mappings are not aligned, caches were not coherent.
I think we're in "so don't do that" territory. mmap() normally lets the
system pick the memory address to map so it can pick something suitably
aligned. You can pass the MAP_FIXED flag to tell it to map at a
particular address, but it can return EINVAL if the address doesn't work.
The POSIX description says "The use of MAP_FIXED is discouraged, as it may
prevent an implementation from making the most effective use of
resources."
It's not always trivial to make this work. On systems with reverse maps,
a physical page can only be mapped to one virtual address at a time, so
for shared pages it has to mark all of the aliases nonresident and on a
fault remap the page into the map of the process that is running. But
it's not rocket science, either.
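
A small sketch of the "let the system pick" case, assuming a POSIX system with a writable /tmp; `two_mappings_alias` is a made-up name. It maps one file twice without MAP_FIXED and checks that both addresses are page-aligned and that a store through one mapping is read back through the other. It cannot show what a virtually addressed cache would do with misaligned aliases.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if two independently placed mappings of one file are both
 * page-aligned and show each other's stores, 0 if not, -1 on setup error. */
int two_mappings_alias(void)
{
    char path[] = "/tmp/mmap_alias_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    int result = -1;
    if (ftruncate(fd, (off_t)page) == 0) {
        /* addr == NULL and no MAP_FIXED: the system chooses both addresses */
        volatile char *a =
            mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        volatile char *b =
            mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            a[0] = 'x';                         /* store via first alias */
            result = a != b
                  && (uintptr_t)a % (uintptr_t)page == 0
                  && (uintptr_t)b % (uintptr_t)page == 0
                  && b[0] == 'x';               /* load via second alias */
        }
        if (a != MAP_FAILED)
            munmap((void *)a, len);
        if (b != MAP_FAILED)
            munmap((void *)b, len);
    }
    close(fd);
    return result;
}
```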
R's,
John
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 19:40 ` Steffen Nurpmeso
2024-12-16 19:47 ` Steffen Nurpmeso
@ 2024-12-16 20:58 ` Clem Cole
2024-12-16 21:50 ` Steffen Nurpmeso
1 sibling, 1 reply; 16+ messages in thread
From: Clem Cole @ 2024-12-16 20:58 UTC (permalink / raw)
To: Clem Cole, Douglas McIlroy, TUHS main list
On Mon, Dec 16, 2024 at 2:40 PM Steffen Nurpmeso <steffen@sdaoden.eu> wrote:
>
>
> They are all lowercase nowadays.
>
They were always case-sensitive to use. I used upper case to make it stand
out in my message.
>
> |have try dig up the troff sources to an early draft of the original and
> do
> |a "grep -i should * | wc -l" through it. I bet the number is very small
> |and "can" is zero.
>
> They offer lots of hints like "can be implemented efficiently" or
> "conforming implementations cannot count on", which turns "can"
> into a "dying butterfly". (In the rationale's.)
>
It always says *"conforming implementations shall not count on."*
The standard has to be as precise as it can be. If there are
implementation grey areas, then the standard needs to state that too.
Again - we were careful when we wrote original documents (and had lots of
arguments as you can imagine).
As for what you read, I'm not worried. The fact is that, over time, standards
stop being what they were originally intended to be and take on a life of their
own. POSIX is less of an issue than say, C++ or even FORTRAN for that
matter. But over time, a new group of people want to "make their mark."
Some of us grew tired of arguing. We got what we originally set out to do,
and I left the POSIX work soon after the *.2 began. I worked through one
draft of it, but was not there for approval, although I helped with *.4 at
one point.
Someone implementing something new, be it a language or a system, should be
familiar with the standards. There is a lot to be learned, both good and bad.
Reinventing things, particularly bad ideas, is hardly for the greater good
(although the implementor might find it fun). e.g. C++, IMO, is a great
example of what >>not<< to do. The core POSIX.1 interface, I think, is
excellent and gets to the point. After that, YMMV and the value you get
from it also is a bit variable. But thinking the world should be stagnant
and believing that C or UNIX (or modern Unix, a.k.a. Linux) is "the end" is
hardly a good idea either.
So, reading and learning about them does have some merit. The question is
how much and how far to go.
Clem
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 19:40 ` Steffen Nurpmeso
@ 2024-12-16 19:47 ` Steffen Nurpmeso
2024-12-16 20:58 ` Clem Cole
1 sibling, 0 replies; 16+ messages in thread
From: Steffen Nurpmeso @ 2024-12-16 19:47 UTC (permalink / raw)
To: Clem Cole; +Cc: Douglas McIlroy, TUHS main list
Steffen Nurpmeso wrote in
<20241216194043.PqDb-or7@steffen%sdaoden.eu>:
|Clem Cole wrote in
| <CAC20D2PfMx=tox0NmLYCi_7345BSr8yoCQZTRVspS8YxAawYdg@mail.gmail.com>:
||On Mon, Dec 16, 2024 at 10:40 AM Douglas McIlroy <
||douglas.mcilroy@dartmouth.edu> wrote:
||
||>>> Does anyone know whether there are implementations of mmap that
||>>> do transparent file sharing? It seems to me that should be possible by
||>>> making the buffer cache share pages with mmapping processes.
||>
||>> These days they all do. The POSIX rationale says:
||>
||>> ... When multiple processes map the same memory object, they can
||>> share access to the underlying data.
||>
||> Notice the weasel word "can". It is not guaranteed that they will do so
||> automatically without delay. Apparently each process may have a \
||> physical\
||> ly
||> distinct copy of the data, not shared access to a single location.
I think it was in FreeBSD when sed(1) once got an optimization to
mmap(2) data, but it was reverted because of crashes caused by
concurrent modifications. (There definitely was something like
this, but FreeBSD / sed is nothing but what i remember.)
But there is no weasel word in the real standard text, but lots of
"shall". The only "can" there are
There may be implementation-defined limits on the number of
memory regions that can be mapped (per process or per system).
and
If such a limit is imposed, whether the number of memory regions
that can be mapped by a process is decreased by the use of
shmat( ) is implementation-defined.
which are no "weasel-can's".
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
|
|And in Fall, feel "The Dropbear Bard"s ball(s).
|
|The banded bear
|without a care,
|Banged on himself for e'er and e'er
|
|Farewell, dear collar bear
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 19:08 ` John Levine
@ 2024-12-16 19:45 ` Chet Ramey via TUHS
2024-12-16 22:43 ` John Levine
[not found] ` <Z2CAuymbU9HkF38q@kib.kiev.ua>
1 sibling, 1 reply; 16+ messages in thread
From: Chet Ramey via TUHS @ 2024-12-16 19:45 UTC (permalink / raw)
To: tuhs
On 12/16/24 2:08 PM, John Levine wrote:
> It appears that Douglas McIlroy <douglas.mcilroy@dartmouth.edu> said:
>> [Weasel words of my own: I have not read the POSIX definition of mmap.]
>
> We're in luck, I have it right here. On pages 529-530:
Here are the current (POSIX.1-2024) versions of the appropriate definitions
and the mmap() description, for those who want to read along.
https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_200
https://pubs.opengroup.org/onlinepubs/9799919799/functions/mmap.html#tag_17_345
>
> 2.8.3.2 Memory Mapped Files
Is that from the Rationale? Here's the current version.
https://pubs.opengroup.org/onlinepubs/9799919799/xrat/V4_xsh_chap01.html#tag_22_02_08_13
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 16:09 ` Clem Cole
@ 2024-12-16 19:40 ` Steffen Nurpmeso
2024-12-16 19:47 ` Steffen Nurpmeso
2024-12-16 20:58 ` Clem Cole
0 siblings, 2 replies; 16+ messages in thread
From: Steffen Nurpmeso @ 2024-12-16 19:40 UTC (permalink / raw)
To: Clem Cole; +Cc: Douglas McIlroy, TUHS main list
Clem Cole wrote in
<CAC20D2PfMx=tox0NmLYCi_7345BSr8yoCQZTRVspS8YxAawYdg@mail.gmail.com>:
|On Mon, Dec 16, 2024 at 10:40 AM Douglas McIlroy <
|douglas.mcilroy@dartmouth.edu> wrote:
|
|>>> Does anyone know whether there are implementations of mmap that
|>>> do transparent file sharing? It seems to me that should be possible by
|>>> making the buffer cache share pages with mmapping processes.
|>
|>> These days they all do. The POSIX rationale says:
|>
|>> ... When multiple processes map the same memory object, they can
|>> share access to the underlying data.
|>
|> Notice the weasel word "can". It is not guaranteed that they will do so
|> automatically without delay. Apparently each process may have a physical\
|> ly
|> distinct copy of the data, not shared access to a single location.
|>
|Hmmm — how did that make it through the IEEE editing process? However, that
|is not part of the standard if it's in the Rationale - just an explanation
|of the standard - which I want want look at more carefully to see what it
|says about what is required of mmap(2).
|
|That said, as Heinz and the rest of us who worked on the original version
|of what would come to POSIX can tell you, the word "can" was always a
|no-no. The word "SHALL" is the the official IEEE word, and if you use
|"SHOULD," even in the rationale and it was very much frowned upon. I'll
They are all lowercase nowadays.
|have try dig up the troff sources to an early draft of the original and do
|a "grep -i should * | wc -l" through it. I bet the number is very small
|and "can" is zero.
They offer lots of hints like "can be implemented efficiently" or
"conforming implementations cannot count on", which turns "can"
into a "dying butterfly". (In the rationale's.)
|That said, it has been a few years since I have read a current draft, but
|less the rationale section (which is not the standard itself).
The 2024 version is 4057 pages, all in all. It would -- in my
opinion -- make a nice discussion whether it would be better for
anybody, including engineers, to take the time to read that amount
of cultural foundation, real literature that is, instead of
a technical standard. (I admit i personally perform standard
hopping and have at times some spare time for other things,
because if i fail it currently endangers no one for one, and when
i fall i stand up anyway, unless i do no more, but then having
lived in mindsets++ of fantastic literats.)
|Clem
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
|
|And in Fall, feel "The Dropbear Bard"s ball(s).
|
|The banded bear
|without a care,
|Banged on himself for e'er and e'er
|
|Farewell, dear collar bear
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 15:40 Douglas McIlroy
2024-12-16 16:09 ` Clem Cole
2024-12-16 19:02 ` Theodore Ts'o
@ 2024-12-16 19:08 ` John Levine
2024-12-16 19:45 ` Chet Ramey via TUHS
[not found] ` <Z2CAuymbU9HkF38q@kib.kiev.ua>
2024-12-16 22:19 ` Bakul Shah via TUHS
3 siblings, 2 replies; 16+ messages in thread
From: John Levine @ 2024-12-16 19:08 UTC (permalink / raw)
To: tuhs; +Cc: douglas.mcilroy
It appears that Douglas McIlroy <douglas.mcilroy@dartmouth.edu> said:
>[Weasel words of my own: I have not read the POSIX definition of mmap.]
We're in luck, I have it right here. On pages 529-530:
2.8.3.2 Memory Mapped Files
Range memory mapping operations are defined in terms of pages. Implementations may
restrict the size and alignment of range mappings to be on page-size boundaries. The page size,
in bytes, is the value of the configurable system variable {PAGESIZE}. If an implementation has
no restrictions on size or alignment, it may specify a 1-byte page size.
Memory mapped files provide a mechanism that allows a process to access files by directly
incorporating file data into its address space. Once a file is mapped into a process address space,
the data can be manipulated as memory. If more than one process maps a file, its contents are
shared among them. If the mappings allow shared write access, then data written into the
memory object through the address space of one process appears in the address spaces of all
processes that similarly map the same portion of the memory object.
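
The page-granularity rule in that text can be sketched as follows, assuming an implementation that requires the file offset to be a multiple of the page size (the common case). `struct file_window`, `map_window` and `demo_map_window` are hypothetical names, and /tmp is assumed to be writable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: remembers what must be passed to munmap(). */
struct file_window {
    void  *base;     /* page-aligned address returned by mmap() */
    size_t map_len;  /* length actually mapped */
    char  *data;     /* first byte the caller asked for */
};

/* Maps `len` bytes of `fd` starting at an arbitrary byte `offset` by
 * rounding the offset down to a page boundary. Returns 0 on success. */
int map_window(int fd, off_t offset, size_t len, struct file_window *w)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (offset < 0 || len == 0)
        return -1;

    off_t  aligned = offset - offset % page;      /* round down */
    size_t slack   = (size_t)(offset - aligned);  /* bytes before the data */

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_SHARED, fd, aligned);
    if (base == MAP_FAILED)
        return -1;
    w->base    = base;
    w->map_len = len + slack;
    w->data    = (char *)base + slack;
    return 0;
}

/* Demo: returns 0 if a window that starts 3 bytes past a page boundary
 * shows the bytes written there with pwrite(), 1 if not, -1 on error. */
int demo_map_window(void)
{
    char path[] = "/tmp/mmap_win_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    off_t where = (off_t)sysconf(_SC_PAGESIZE) + 3;
    struct file_window w;
    int rc = -1;
    if (pwrite(fd, "hello", 5, where) == 5 &&
        map_window(fd, where, 5, &w) == 0) {
        rc = memcmp(w.data, "hello", 5) == 0 ? 0 : 1;
        munmap(w.base, w.map_len);
    }
    close(fd);
    return rc;
}
```

Keeping `base` and `map_len` separate from `data` matters because munmap() wants the address and length that mmap() actually used, not the caller's unaligned view.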
The mmap() man page on debian says this, which seems clear enough:
MAP_SHARED
Share this mapping. Updates to the mapping are visible to other processes mapping the same region, and (in the case of file-backed mappings) are carried through to
the underlying file. (To precisely control when updates are carried through to the underlying file requires the use of msync(2).)
The msync() call is a minor fudge that lets you control how briskly changes are
written back to disk. But it seems clear enough that all the processes that have
a page mapped see the same page, and msync is for databases that want to be sure
that changes are committed to permanent storage (a term in the POSIX spec)
before unlocking or unmapping.
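
A rough sketch of that database-style use, assuming a MAP_SHARED mapping of a regular file; `store_durably` and `demo_store_durably` are invented names and /tmp is assumed writable. Visibility to other mappers does not depend on the msync() call; only the write-back to storage does.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies `len` bytes into a MAP_SHARED region and
 * does not report success until msync() says they were written back. */
int store_durably(void *map_base, size_t map_len, const void *src, size_t len)
{
    assert(len <= map_len);
    memcpy(map_base, src, len);
    /* MS_SYNC blocks until the write-back completes; MS_ASYNC would only
     * schedule it. map_base must be the page-aligned address from mmap(). */
    return msync(map_base, map_len, MS_SYNC);
}

/* Demo: maps one page of a scratch file and commits six bytes to it.
 * Returns 0 on success, -1 on any failure. */
int demo_store_durably(void)
{
    char path[] = "/tmp/mmap_sync_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;
    if (ftruncate(fd, (off_t)len) == 0) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            rc = store_durably(p, len, "commit", 6);
            munmap(p, len);
        }
    }
    close(fd);
    return rc;
}
```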
R's,
John
PS: I can believe there are some versions of linux that screwed up disk cache
coherency, but that just means they don't properly implement the spec, not for
the first time. I mean, it's not *that* hard to make all the maps point to the
same physical page frame, even on a machine like POWER with reverse page maps.
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 15:40 Douglas McIlroy
2024-12-16 16:09 ` Clem Cole
@ 2024-12-16 19:02 ` Theodore Ts'o
2024-12-16 19:08 ` John Levine
2024-12-16 22:19 ` Bakul Shah via TUHS
3 siblings, 0 replies; 16+ messages in thread
From: Theodore Ts'o @ 2024-12-16 19:02 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: TUHS main list
On Mon, Dec 16, 2024 at 10:40:34AM -0500, Douglas McIlroy wrote:
> >> Does anyone know whether there are implementations of mmap that
> >> do transparent file sharing? It seems to me that should be possible by
> >> making the buffer cache share pages with mmapping processes.
>
> > These days they all do. The POSIX rationale says:
>
> > ... When multiple processes map the same memory object, they can
> > share access to the underlying data.
I don't see this language in the 2024 version of POSIX (IEEE Std 1003.1-2024).
> Notice the weasel word "can". It is not guaranteed that they will do so
> automatically without delay. Apparently each process may have a physically
> distinct copy of the data, not shared access to a single location.
>
> The Linux man page mmap(2), for example, makes it very clear that mmap
> has a cache-coherence problem, at least in that system. The existence
> of msync(2) is a visible symptom of the problem.
Huh? What language are you talking about? msync(2) is about when
dirty data is written to stable storage (e.g., written back out to
disk). It works much like fsync(2).
There is a cache-coherency problem between what's in the buffer/page
cache (Linux has a unified page cache, so file-backed blocks are not
cached in the buffer cache; only the page cache) and O_DIRECT. Some
OS's will tell you that there are no coherency guarantees at all.
Linux will give a best-effort attempt at coherency, which is to say
that before doing an O_DIRECT read, any dirty pages will be written
out to disk before the O_DIRECT read is allowed to proceed, and before
doing an O_DIRECT write, any underlying pages in the page cache will
be invalidated. It is not 100% guaranteed to be coherent, and systems
which try to mix simultaneous buffered and O_DIRECT I/O may not
necessarily get what they want.
However, if what you are talking about is multiple processes mmap'ing
the same file, they *will* get the same page in the page cache mapped
into the page tables, which means that writes by one process will be
immediately seen by another process (modulo CPU cache coherency
guarantees, of course; Intel is pretty good; with a weakly consistent
architecture like Alpha, your mileage may vary unless you know what
you are doing and write your assembly code accordingly).
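
A minimal sketch of the two-process case, assuming Linux-like behaviour where both processes get the same page-cache page; the function name `child_store_visible_to_parent` is made up and /tmp is assumed writable. The parent only looks after waitpid() returns, so the sketch sidesteps the CPU memory-ordering caveat rather than demonstrating it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's mapping shows a byte the child stored through
 * its own, separately created mapping; 0 if not; -1 on setup error. */
int child_store_visible_to_parent(void)
{
    char path[] = "/tmp/mmap_fork_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "p", 1) != 1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: create its own mapping and store through it. */
        volatile char *c =
            mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (c == MAP_FAILED)
            _exit(1);
        c[0] = 'c';
        _exit(0);                      /* no msync(), no write() */
    }
    assert(pid > 0);                   /* only the parent gets here */

    /* Parent: this mapping is made after fork(), so it is not inherited. */
    volatile char *m = mmap(NULL, 1, PROT_READ, MAP_SHARED, fd, 0);
    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && m != MAP_FAILED &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = (m[0] == 'c');
    if (m != MAP_FAILED)
        munmap((void *)m, 1);
    close(fd);
    return result;
}
```

Two processes that poll each other's stores while both are running would additionally need atomics or fences suited to the architecture, as the paragraph above warns.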
Cheers,
- Ted
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
2024-12-16 15:40 Douglas McIlroy
@ 2024-12-16 16:09 ` Clem Cole
2024-12-16 19:40 ` Steffen Nurpmeso
2024-12-16 19:02 ` Theodore Ts'o
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Clem Cole @ 2024-12-16 16:09 UTC (permalink / raw)
To: Douglas McIlroy; +Cc: TUHS main list
On Mon, Dec 16, 2024 at 10:40 AM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:
> >> Does anyone know whether there are implementations of mmap that
> >> do transparent file sharing? It seems to me that should be possible by
> >> making the buffer cache share pages with mmapping processes.
>
> > These days they all do. The POSIX rationale says:
>
> > ... When multiple processes map the same memory object, they can
> > share access to the underlying data.
>
> Notice the weasel word "can". It is not guaranteed that they will do so
> automatically without delay. Apparently each process may have a physically
> distinct copy of the data, not shared access to a single location.
>
Hmmm — how did that make it through the IEEE editing process? However, that
is not part of the standard if it's in the Rationale - just an explanation
of the standard - which I want to look at more carefully to see what it
says about what is required of mmap(2).
That said, as Heinz and the rest of us who worked on the original version
of what would become POSIX can tell you, the word "can" was always a
no-no. The word "SHALL" is the official IEEE word, and if you used
"SHOULD," even in the rationale, it was very much frowned upon. I'll
have to try to dig up the troff sources to an early draft of the original and do
a "grep -i should * | wc -l" through it. I bet the number is very small
and "can" is zero.
That said, it has been a few years since I have read a current draft, but
less the rationale section (which is not the standard itself).
Clem
* [TUHS] Re: mmap, was SCCS, TeamWare, BitKeeper, and Git
@ 2024-12-16 15:40 Douglas McIlroy
2024-12-16 16:09 ` Clem Cole
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Douglas McIlroy @ 2024-12-16 15:40 UTC (permalink / raw)
To: TUHS main list
>> Does anyone know whether there are implementations of mmap that
>> do transparent file sharing? It seems to me that should be possible by
>> making the buffer cache share pages with mmapping processes.
> These days they all do. The POSIX rationale says:
> ... When multiple processes map the same memory object, they can
> share access to the underlying data.
Notice the weasel word "can". It is not guaranteed that they will do so
automatically without delay. Apparently each process may have a physically
distinct copy of the data, not shared access to a single location.
The Linux man page mmap(2), for example, makes it very clear that mmap
has a cache-coherence problem, at least in that system. The existence
of msync(2) is a visible symptom of the problem.
[Weasel words of my own: I have not read the POSIX definition of mmap.]
Doug