9front - general discussion about 9front
From: Shawn Rutledge <lists@ecloud.org>
To: 9front@9front.org
Subject: Disconnection-tolerant / distributed filesystems (was Re: [9front] Enabling a service)
Date: Wed, 8 May 2024 14:17:34 -0700
Message-ID: <863BD749-CEEC-401E-9762-367B1ABA1367@ecloud.org>
In-Reply-To: <CAO8UeJfOeZhrQwXgvemrZQ94yVBsfdviH2FkuVTuGef3KE23iw@mail.gmail.com>



> On May 8, 2024, at 9:37 AM, Brian Stuart <blstuart@gmail.com> wrote:
> 
>> Anyway, shout-out to inferno-bls's lapfs.
> 
> I had played with a few different variants on the idea, though all
> quite a while back.  To give an idea of just how far back, I presented
> it at the 4th IWP9 15 years ago:
> 
> https://www.cs.drexel.edu/~bls96/lapfs.pdf
> 
> Maybe it's about time to pull it out of mothballs and do fresh 9native
> and *nix versions.

Cool, I read it just now.

For Limbo, the lack of 64-bit support seems like an obstacle; am I missing something?  I haven’t gotten 9ferno working yet, but isn’t it supposed to work, at least in theory?

I’ve been following (and trying to use) ipfs for a while.  Using it as a mounted filesystem never seemed to be a priority (the developers appear to care about web applications first; calling it a filesystem was a stretch early on, and perhaps still is, though I need to test the latest version).  But the core idea of using a storage block’s hash as its ID and building Merkle trees out of those blocks seems like a good way to build a filesystem to me.  A file is then identified by its cumulative hash, i.e. probably the hash of the block at the head of the metadata that points to the data blocks.  Then maybe you no longer care about comparing paths and timestamps: if both the laptop and the file server keep hash tables of which files and blocks they’re storing, checking whether a particular file is cached becomes a lookup rather than a walk.
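
To make that concrete, here is a toy sketch in Go of what I mean.  The 8K block size, the store/put/have names, and the flat manifest layout are all made up for illustration; this is not how ipfs or venti actually lay things out.  A block is addressed by its SHA-256 hash, a file by the hash of the list of its block hashes, and the "is it cached" check is just map lookups.

package main

import (
	"crypto/sha256"
	"fmt"
)

const blockSize = 8192 // made-up block size

type hash = [sha256.Size]byte

// store maps block hash -> block contents (stands in for venti, a disk cache, etc.)
type store map[hash][]byte

// put splits data into blocks, stores each under its hash, and returns
// the file's identity: the hash of the concatenated block hashes.
func (s store) put(data []byte) hash {
	var manifest []byte
	for len(data) > 0 {
		n := blockSize
		if n > len(data) {
			n = len(data)
		}
		h := sha256.Sum256(data[:n])
		s[h] = append([]byte(nil), data[:n]...)
		manifest = append(manifest, h[:]...)
		data = data[n:]
	}
	fileID := sha256.Sum256(manifest)
	s[fileID] = manifest
	return fileID
}

// have reports whether every block of the file identified by id is cached,
// without walking any directory tree: just hash-table lookups.
func (s store) have(id hash) bool {
	manifest, ok := s[id]
	if !ok {
		return false
	}
	for off := 0; off < len(manifest); off += sha256.Size {
		var h hash
		copy(h[:], manifest[off:off+sha256.Size])
		if _, ok := s[h]; !ok {
			return false
		}
	}
	return true
}

func main() {
	cache := store{}
	id := cache.put(make([]byte, 3*blockSize+100))
	fmt.Printf("file %x cached: %v\n", id[:8], cache.have(id))
}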

When planning for a laptop to be away from the file server for a while (and not assuming that network access is always available), I’d want to pre-sync some directories.  On filesystems that have extended attributes, it would be nice to set an attribute ahead of time (sharing peers, a channel ID, or something similar) and be able to verify the sync state later, before disconnecting from the file server.  But 9P doesn’t support extended attributes, so any automated sync arrangement has to be a separately configured process.  Currently I use syncthing, but I suppose it does more work than it would need to if the filesystem were designed to make syncing efficient.  Fanotify is a big win on Linux for syncing with less walking, though.  (I used it myself in a syncing experiment.)
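
For the attribute idea, something like this minimal sketch is what I had in mind, assuming a Linux filesystem with xattr support.  The attribute name, the channel label, and the paths are invented; this is not something syncthing or any existing tool does.

package main

import (
	"fmt"
	"log"

	"golang.org/x/sys/unix"
)

const syncAttr = "user.sync.channel" // hypothetical attribute name

// markForSync tags a directory with the channel/peer it should be
// pre-synced to before disconnecting from the file server.
func markForSync(dir, channel string) error {
	return unix.Setxattr(dir, syncAttr, []byte(channel), 0)
}

// syncChannel reads the tag back, so a pre-sync pass can verify which
// directories still need syncing before the laptop leaves.
func syncChannel(dir string) (string, error) {
	buf := make([]byte, 256)
	n, err := unix.Getxattr(dir, syncAttr, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	dir := "/home/me/src" // placeholder path
	if err := markForSync(dir, "laptop-peers"); err != nil {
		log.Fatal(err)
	}
	ch, err := syncChannel(dir)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("pre-sync", dir, "to", ch)
}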

But then one downside of the ipfs approach is the difficulty of maintaining a Merkle tree, depending on how you do typical file modifications.  Blocks could form an intrusive linked list, using their neighbors’ hashes as pointers, but they had better not point at each other directly, because that would de-optimize updates everywhere except at one end.  (Imagine what rrdtool would do to it.  If your philosophy is that storage grows fast enough that nothing ever needs deleting, then a simple log is better; but rrdtool is very efficient when storage is limited and random writes are cheap, and it might even turn out that some blocks in the database file are stable over some time periods.)  So the index of blocks that make up a file should probably be kept completely separate.  Still, the index itself takes some blocks, and if the index is a Merkle tree, every write could potentially invalidate most of it (although a suitable design could make appending cheap, or allow some other selective optimization).  The filesystem should be venti-compatible, so minimizing garbage would be best.
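
Here is a toy comparison of the two layouts I’m describing; the names and the four-block "file" are made up.  With blocks chained by embedding the previous block’s hash, rewriting one block changes the identity of every later block, whereas with a separate flat index only that block’s hash and the root change.

package main

import (
	"crypto/sha256"
	"fmt"
)

// chainIDs hashes each block together with its predecessor's hash,
// so every block's identity depends on everything before it.
func chainIDs(blocks [][]byte) [][32]byte {
	ids := make([][32]byte, len(blocks))
	prev := [32]byte{}
	for i, b := range blocks {
		ids[i] = sha256.Sum256(append(prev[:], b...))
		prev = ids[i]
	}
	return ids
}

// indexRoot hashes each block independently and derives one root hash
// from the flat list of block hashes (the separate index).
func indexRoot(blocks [][]byte) ([][32]byte, [32]byte) {
	ids := make([][32]byte, len(blocks))
	var flat []byte
	for i, b := range blocks {
		ids[i] = sha256.Sum256(b)
		flat = append(flat, ids[i][:]...)
	}
	return ids, sha256.Sum256(flat)
}

// changed counts how many block identities differ between two versions.
func changed(a, b [][32]byte) int {
	n := 0
	for i := range a {
		if a[i] != b[i] {
			n++
		}
	}
	return n
}

func main() {
	before := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cccc"), []byte("dddd")}
	after := [][]byte{[]byte("aaaa"), []byte("BBBB"), []byte("cccc"), []byte("dddd")}

	fmt.Println("chained blocks rehashed:", changed(chainIDs(before), chainIDs(after)))
	idsBefore, rootBefore := indexRoot(before)
	idsAfter, rootAfter := indexRoot(after)
	fmt.Println("indexed blocks rehashed:", changed(idsBefore, idsAfter),
		"root changed:", rootBefore != rootAfter)
}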

Anyway, even if only the actual data blocks are hashed, you have easy venti compatibility and at least somewhat easier checking of what is synced and what is not.  Migrating any amount of data between any number and size of storage devices ought to be easier.
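
The "what is synced and what is not" check then reduces to a set difference over block hashes, roughly like this (the short hex strings and function names are placeholders, not any real block IDs or API):

package main

import "fmt"

// missing returns the hashes present locally but absent on the remote.
func missing(local, remote map[string]bool) []string {
	var need []string
	for h := range local {
		if !remote[h] {
			need = append(need, h)
		}
	}
	return need
}

func main() {
	laptop := map[string]bool{"a1": true, "b2": true, "c3": true}
	server := map[string]bool{"a1": true, "c3": true}
	fmt.Println("blocks to push before disconnecting:", missing(laptop, server))
}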

Aren’t filesystems like zfs built on hashing too?  But zfs integrates its own volume and block management rather than building on top of a generic block layer.  Explicit layering should yield some benefits, I imagine.

