* [9front] Questions/concerns on gefs
@ 2024-08-15 9:04 Timothy Robert Bednarzyk
2024-08-15 9:21 ` Steve Simon
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Timothy Robert Bednarzyk @ 2024-08-15 9:04 UTC (permalink / raw)
To: 9front
Before I get into the questions I have, let me briefly describe my
setup. I currently have a 9front (CPU+AUTH+File) server running on real
hardware that I plan to primarily use for keeping my (mostly media)
files in one place. I ended up performing the install with gefs, and I
haven't had any problems with that in the three weeks since. Anyways, I
also have four 12 TB SATA HDDs that are being used in a pseudo-RAID 10
setup having mostly followed the example in fs(3) except, notably, with
gefs instead of hjfs. In terms of stability, I have not had any problems
copying around 12 TB to those drives and with deleting about a TB from
them. I also have a 4 TB NVMe SSD that I use for storing some videos
with high bitrates; this drive also has a gefs filesystem. I use socat,
tlsclient, and 9pfs (which I had to slightly modify) for viewing files
on my Linux devices and primarily use drawterm for everything else
related to the server (although I have just connected it straight to a
monitor and keyboard). I doubt that anything else regarding my
particular setup is relevant to my questions, but if any more
information is desired, just ask.
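For concreteness, the pseudo-RAID 10 follows the shape of the fs(3)
example: two mirrored pairs interleaved into one logical device. A
rough sketch of that configuration (the sdE* device and partition names
below are placeholders rather than my actual ones, and fs(3) has the
authoritative syntax) looks something like:

    echo mirror m0 /dev/sdE0/fs /dev/sdE1/fs >/dev/fs/ctl
    echo mirror m1 /dev/sdE2/fs /dev/sdE3/fs >/dev/fs/ctl
    echo inter fs /dev/fs/m0 /dev/fs/m1 >/dev/fs/ctl

The combined device then shows up as /dev/fs/fs, and gefs reams and
serves it as though it were a single disk.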
Having read https://orib.dev/gefs.html, I know that ori's intent is to
one day implement RAID features as a part of gefs itself. I realize that
the following questions can't truly be answered until the time comes,
but I just want to gauge what my future may hold. When RAID-like
features are eventually added to gefs, is there any intent to provide
some way to migrate from an fs(3) setup to a gefs-native setup? If not,
is it possible that I might be able to set up two of my four drives with
gefs-native RAID 1, copy files from my existing gefs filesystem (after
altering my fs(3) setup to only interleave between my other two drives),
and then afterwards modify my new gefs filesystem to have a RAID 10
setup across all four drives (potentially after manually mirroring the
two drives on the new filesystem to the two drives on the old)? Again, I
fully realize that it's not _really_ possible to answer these
hypothetical questions before the work to add RAID features to gefs is
done, and I would be perfectly content with "I don't know" as an answer.
While I have not had any _issues_ using gefs so far, I have some
concerns regarding performance. When I was copying files to my pseudo-
RAID 10 fs, the write speeds seemed about reasonable (I was copying from
several USB HDDs plugged directly into the server and from the
aforementioned NVMe SSD so that I could replace its ext4 filesystem with
gefs, and while I didn't really profile the actual speeds, I estimate
them to have been around 100 MiB/s based on the total size and the time
it took to copy from each drive). However, when I was copying files from
the pseudo-RAID 10 fs to the NVMe SSD, I noticed that it took somewhere
between 3-4 times longer than the other way around (when the NVMe SSD
was still using ext4). Additionally, it took about the same amount of
time to delete those same files from the pseudo-RAID 10 fs later.
Lastly, I tried moving (rather, cp followed by rm) a ~1 GB file from one
directory on the pseudo-RAID 10 fs to another and that ended up taking a
while. Shouldn't cp to different locations on the same drive be near-
instant since gefs is COW? And shouldn't rm also be near-instant since
all it needs to do is update the filesystem itself?
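(For what it's worth, rough numbers like these are easy to pin down with
time(1); with placeholder paths standing in for the actual mount points,
something like:

    time cp /n/raid/media/big.mkv /n/raid/tmp/big.mkv
    time rm /n/raid/tmp/big.mkv

would make the cp and rm costs concrete.)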
--
trb
* Re: [9front] Questions/concerns on gefs
2024-08-15 9:04 [9front] Questions/concerns on gefs Timothy Robert Bednarzyk
@ 2024-08-15 9:21 ` Steve Simon
2024-08-15 9:42 ` Stuart Morrow
2024-08-15 14:22 ` ori
2 siblings, 0 replies; 10+ messages in thread
From: Steve Simon @ 2024-08-15 9:21 UTC (permalink / raw)
To: 9front
i apologise if this is out of date, however in my (old) plan9 system cp is single threaded, whilst fcp is multithreaded. if you want to do copy performance tests, fcp is a better tool to try.
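for example, if fcp is present, timing the two on the same file should show the difference (the paths here are only placeholders):

    time cp /n/raid/media/big.mkv /n/ssd/big.mkv
    time fcp /n/raid/media/big.mkv /n/ssd/big.mkv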
this does not explain why rm appears slow, though perhaps having rm block until the filesystem is stable is rather a good idea - rm being rare and filesystem integrity being important.
-Steve
* Re: [9front] Questions/concerns on gefs
2024-08-15 9:04 [9front] Questions/concerns on gefs Timothy Robert Bednarzyk
2024-08-15 9:21 ` Steve Simon
@ 2024-08-15 9:42 ` Stuart Morrow
2024-08-15 14:22 ` ori
2 siblings, 0 replies; 10+ messages in thread
From: Stuart Morrow @ 2024-08-15 9:42 UTC (permalink / raw)
To: 9front
On Thu, 15 Aug 2024 at 10:04, Timothy Robert Bednarzyk <mail@trbsw.dev> wrote:
> some way to migrate from an fs(3) setup to a gefs-native setup? If not,
If nothing else you could copy it all with delorean (cinap's contrib)
and replica/cphist (quanstro's contrib). But it's pretty impossible
for that to be an in-place upgrade.
* Re: [9front] Questions/concerns on gefs
2024-08-15 9:04 [9front] Questions/concerns on gefs Timothy Robert Bednarzyk
2024-08-15 9:21 ` Steve Simon
2024-08-15 9:42 ` Stuart Morrow
@ 2024-08-15 14:22 ` ori
2024-08-15 14:47 ` ori
2024-08-26 4:45 ` James Cook
2 siblings, 2 replies; 10+ messages in thread
From: ori @ 2024-08-15 14:22 UTC (permalink / raw)
To: 9front
Quoth Timothy Robert Bednarzyk <mail@trbsw.dev>:
>
> Again, I fully realize that it's not _really_ possible to answer these
> hypothetical questions before the work to add RAID features to gefs is
> done, and I would be perfectly content with "I don't know" as an answer.
I don't know.
> While I have not had any _issues_ using gefs so far, I have some
> concerns regarding performance. When I was copying files to my pseudo-
> RAID 10 fs, the write speeds seemed about reasonable (I was copying from
> several USB HDDs plugged directly into the server and from the
> aforementioned NVMe SSD so that I could replace its ext4 filesystem with
> gefs, and while I didn't really profile the actual speeds, I estimate
> them to have been around 100 MiB/s based on the total size and the time
> it took to copy from each drive). However, when I was copying files from
> the pseudo-RAID 10 fs to the NVMe SSD, I noticed that it took somewhere
> between 3-4 times longer than the other way around (when the NVMe SSD
> was still using ext4). Additionally, it took about the same amount of
> time to delete those same files from the pseudo-RAID 10 fs later.
> Lastly, I tried moving (rather, cp followed by rm) a ~1 GB file from one
> directory on the pseudo-RAID 10 fs to another and that ended up taking a
> while. Shouldn't cp to different locations on the same drive be near-
> instant since gefs is COW? And shouldn't rm also be near-instant since
> all it needs to do is update the filesystem itself?
>
There's a lot of low hanging performance fruit -- especially for
writing and deletion. Writing currently copies blocks a lot more
than it theoretically needs to, and deletion is O(n) in file size,
when it could theoretically be O(1) with some complicated code.
There's nothing in 9p that allows copy to be smart about COW,
so it isn't.
https://orib.dev/gefs-future.html
* Re: [9front] Questions/concerns on gefs
2024-08-15 14:22 ` ori
@ 2024-08-15 14:47 ` ori
2024-08-26 4:45 ` James Cook
1 sibling, 0 replies; 10+ messages in thread
From: ori @ 2024-08-15 14:47 UTC (permalink / raw)
To: 9front
also -- current priority is getting enough miles
on it to be sure there aren't any serious bugs, space
leaks, or other big issues; optimization comes once I'm
happy enough with it to show as an option in the installer
* Re: [9front] Questions/concerns on gefs
2024-08-15 14:22 ` ori
2024-08-15 14:47 ` ori
@ 2024-08-26 4:45 ` James Cook
2024-08-26 7:43 ` Steve Simon
1 sibling, 1 reply; 10+ messages in thread
From: James Cook @ 2024-08-26 4:45 UTC (permalink / raw)
To: 9front
>There's nothing in 9p that allows copy to be smart about COW,
>so it isn't.
It may be worth noting cp could nonetheless be made (mostly) COW
by being clever about noticing recently-read data being written
back. Hammer2 does this. At [0]: "... This means that typical
situations such as when copying files or whole directory hierarchies
will naturally de-duplicate. Simply reading filesystem data in makes
it available for deduplication later."
I have no idea whether that would be appropriate for gefs.
--
James
[0] http://apollo.backplane.com/DFlyMisc/hammer2.txt
* Re: [9front] Questions/concerns on gefs
2024-08-26 4:45 ` James Cook
@ 2024-08-26 7:43 ` Steve Simon
2024-08-26 8:10 ` Frank D. Engel, Jr.
0 siblings, 1 reply; 10+ messages in thread
From: Steve Simon @ 2024-08-26 7:43 UTC (permalink / raw)
To: 9front
sorry to be negative, but i think this is not a good idea.
copying directories does happen but it is rare.
cp is small, neat, and clean. adding special optimisations for some
backends/filesystems would be a mistake. don’t forget cp is often used
with filesystems other than the main backend file store, whatever that is.
an example of how badly this can go can be seen in the qnx source
where every utility knows how to handle every fileserver:
https://github.com/vocho/openqnx/blob/master/trunk/utils/c/cp/cp.c
i would suggest keeping it simple.
-Steve
* Re: [9front] Questions/concerns on gefs
2024-08-26 7:43 ` Steve Simon
@ 2024-08-26 8:10 ` Frank D. Engel, Jr.
2024-08-26 8:26 ` Alex Musolino
2024-08-26 14:36 ` ron minnich
0 siblings, 2 replies; 10+ messages in thread
From: Frank D. Engel, Jr. @ 2024-08-26 8:10 UTC (permalink / raw)
To: 9front
One way to handle that would be to extend 9p to include a duplicate
command of some sort. Then the utility would simply note that the file
is being copied to the same file server it is already located on and ask
that file server to duplicate it. If the file server does not support
the command it reports an error back and cp falls back on its existing
behavior.
That prevents the issue of cp needing to be familiar with the individual
file servers and minimizes the required changes, while still allowing
for the possibility of taking advantage of the more advanced
functionality being proposed.
* Re: [9front] Questions/concerns on gefs
2024-08-26 8:10 ` Frank D. Engel, Jr.
@ 2024-08-26 8:26 ` Alex Musolino
2024-08-26 14:36 ` ron minnich
1 sibling, 0 replies; 10+ messages in thread
From: Alex Musolino @ 2024-08-26 8:26 UTC (permalink / raw)
To: 9front
> One way to handle that would be to extend 9p to include a duplicate
> command of some sort. Then the utility would simply note that the file
> is being copied to the same file server it is already located on and ask
> that file server to duplicate it. If the file server does not support
> the command it reports an error back and cp falls back on its existing
> behavior.
And then you have to extend the system call table so that cp and other
programs can actually do this kind of thing. Not going to happen, I
don't think.
In any case, I don't think James was suggesting that cp would actually
have to be modified but rather that, like Hammer2, a filesystem could
deduplicate incoming writes (from cp(1) or wherever) against any data
it had already loaded off the disk(s).
--
Cheers,
Alex Musolino
* Re: [9front] Questions/concerns on gefs
2024-08-26 8:10 ` Frank D. Engel, Jr.
2024-08-26 8:26 ` Alex Musolino
@ 2024-08-26 14:36 ` ron minnich
1 sibling, 0 replies; 10+ messages in thread
From: ron minnich @ 2024-08-26 14:36 UTC (permalink / raw)
To: 9front
[-- Attachment #1.1: Type: text/plain, Size: 2856 bytes --]
suppose you are cp'ing a directory from a single place to a place that is
backed by several mount points, i.e. the destination is backed by several
servers. Suppose, further, that the source is coming from multiple servers.
Each 9p server knows only of itself; it may well think it can copy /bin to
/bin, when in fact your /bin is full of bind mounts, and your idea of /bin,
and the server's idea of /bin, are very different.
This is the kind of corner case that is rare, but difficult.
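A typical namespace makes the point concrete: /bin is usually a union
assembled from several sources, along the lines of (the $home lines are
just an example profile):

    bind /$cputype/bin /bin
    bind -a /rc/bin /bin
    bind -a $home/bin/rc /bin
    bind -a $home/bin/$cputype /bin

A single file server asked to "copy /bin" sees only its own slice of
that union, not what the client's namespace actually presents.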
Thread overview: 10+ messages
2024-08-15 9:04 [9front] Questions/concerns on gefs Timothy Robert Bednarzyk
2024-08-15 9:21 ` Steve Simon
2024-08-15 9:42 ` Stuart Morrow
2024-08-15 14:22 ` ori
2024-08-15 14:47 ` ori
2024-08-26 4:45 ` James Cook
2024-08-26 7:43 ` Steve Simon
2024-08-26 8:10 ` Frank D. Engel, Jr.
2024-08-26 8:26 ` Alex Musolino
2024-08-26 14:36 ` ron minnich