9fans - fans of the OS Plan 9 from Bell Labs
From: Georg Lehner <jorge-plan9@magma.com.ni>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: [9fans] walk and find again
Date: Tue,  9 Feb 2010 07:32:28 +0100
Message-ID: <4B71017C.8080505@magma.com.ni>

walk, find, locate and friends try to explore filesystem metadata in
breadth and depth effectively, efficiently and with controlled
time/space consumption.

Proposal: walkfs(4) (or finds, or indexfs, or sphinx, or ...)

walkfs serves a filesystem tree similar to network devices.  A 'clone'
file is used to create a new connection to the metadata database.  Each
connection subdirectory contains the following files (and maybe more):
root, data, ctl, size, count, stat, metadata.

Writing a string "subdir" to the 'root' file (re)starts a walk through
the designated subdirectory.  Each read from the 'data' file returns
the next path found.  EOF indicates that the walk has terminated.
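
To make the clone/root/data sequence concrete, here is a minimal client
sketch in Python.  Nothing in it is an existing interface: walkfs is
only a proposal, and the mount point, the name walk_paths, the idea
that reading 'clone' yields the connection directory name (as with
network devices), and newline-separated paths are all assumptions.

```python
import os

def walk_paths(mount, subdir):
    """Yield each path reported by a hypothetical walkfs mounted at mount."""
    # Assumption: reading 'clone' returns the name of a fresh connection
    # directory, as the clone files of network devices do.
    with open(os.path.join(mount, "clone")) as clone:
        conn = os.path.join(mount, clone.read().strip())
        # Writing the subdirectory to 'root' (re)starts the walk.
        with open(os.path.join(conn, "root"), "w") as root:
            root.write(subdir)
        # Unbuffered reads, so each call is one read on 'data'.
        fd = os.open(os.path.join(conn, "data"), os.O_RDONLY)
        try:
            while True:
                chunk = os.read(fd, 8192)
                if not chunk:        # EOF: the walk has terminated
                    break
                # Assumption: paths are newline-separated if one read
                # happens to return more than one of them.
                for path in chunk.decode("utf-8").splitlines():
                    yield path
        finally:
            os.close(fd)
```

The clone file is kept open for the duration of the walk, on the
guess that closing it would release the connection.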

Query constraints and traversal directives can be written to the 'ctl'
file as attribute=value pairs.  Example:
"user=glenda traversal=depth depth=3 type=file mode=u=r".
The ctl message "mode=sync" indicates that the walker thread writing to
the data file and the reading process synchronize on each found path.
The message "mode=async" allows the walker thread to run ahead of the
reader.  With the message "mode=stat" no found path is written to the
data file; only size and count are updated (see below).
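
The attribute=value format can be sketched with a small parser.  This
is only an illustration: the names parse_ctl and format_ctl are made
up, and it assumes that pairs are separated by whitespace and that a
pair is split at its first '=', so that a value such as "u=r" survives
intact.

```python
def parse_ctl(message):
    """Split a ctl message into a dict of attribute -> value."""
    pairs = {}
    for word in message.split():
        # Split at the first '=' only: "mode=u=r" -> ("mode", "u=r").
        attr, sep, value = word.partition("=")
        if not sep or not attr:
            raise ValueError("not an attribute=value pair: %r" % word)
        pairs[attr] = value
    return pairs

def format_ctl(pairs):
    """Join a dict back into a single ctl message."""
    return " ".join("%s=%s" % (a, v) for a, v in pairs.items())
```

For the example message above, parse_ctl(...)["mode"] is "u=r" and
parse_ctl(...)["depth"] is the string "3".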

The stat file indicates the current status of the traversal.  If it is
"eof", the count file holds the number of files found and the size file
holds the total size in bytes of all files found.  During traversal
(stat is "walking") the size and count files hold the totals so far.
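
The interplay of stat, count and size might look like this on the
walker side.  This is a sketch under assumptions, not a design: Python's
os.walk stands in for the walker thread, WalkState and walk are
invented names, and only regular directory entries reported as files
are counted.

```python
import os

class WalkState:
    """Running totals, as the stat, count and size files would show them."""
    def __init__(self):
        self.stat = "walking"
        self.count = 0
        self.size = 0

def walk(root, state):
    """Yield file paths under root, updating the totals as it goes."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                length = os.lstat(path).st_size
            except OSError:
                continue             # file vanished during the walk
            state.count += 1
            state.size += length
            yield path
    state.stat = "eof"               # totals are now final
```

Because walk is a generator, the walker advances only when the reader
asks for the next path, which is roughly the "mode=sync" behaviour.
Draining it without looking at the paths corresponds to "mode=stat".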

The metadata file holds the metadata corresponding to the current path
in 'data' in "attribute=value" format.  If walkfs has been called with a
cache option, writing a path inside root to 'data' sets 'metadata' to
the values for that path.

walkfs options:
-c size     size of the memory buffer or cache to be used by walkfs
-p path     filename of a persistent cache/metadata database.  When
            walkfs is started with -p, its information may be
            outdated; ctl messages are used to update the metadata
            database.
-s channel  a fileserver can write update messages to channel when
            changing metadata of the filesystem it serves.  walkfs
            then updates its metadata database.

One usage scenario of walkfs is to implement find, du, walk, rdup and
the like.  Another usage scenario of walkfs, with the -s option, is to
add file indexing to a fileserver.

Regards,

    Jorge-León



