9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] walk and find again
@ 2010-02-09  6:32 Georg Lehner
  2010-02-09 15:06 ` erik quanstrom
  0 siblings, 1 reply; 4+ messages in thread
From: Georg Lehner @ 2010-02-09  6:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

walk, find, locate and friends try to cope with exploring filesystem
metadata in breadth and depth effectively, efficiently and with
controlled time/space consumption.

Proposal: walkfs(4) (or finds, or indexfs, or sphinx, or ...)

walkfs serves a filesystem tree similar to network devices.  A 'clone'
file is used to create a new connection to the metadata database.  Each
connection subdirectory contains the following files (and maybe more):
root, data, ctl, size, count, stat, metadata.

Writing a string "subdir" to the 'root' file (re)starts a walk through
the designated subdirectory.  Each read from the 'data' file returns the
next path found.  EOF indicates that the walk has terminated.
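
The root/data protocol above can be sketched in-process. This is only an
illustration, not walkfs itself: the Walker class, its method names, and
the use of "" to stand in for EOF are assumptions made for the example.

```python
import os
import tempfile


class Walker:
    """Hypothetical in-process stand-in for one walkfs connection."""

    def __init__(self):
        self._paths = iter(())

    def write_root(self, subdir):
        # Writing to 'root' (re)starts the walk through subdir.
        self._paths = self._walk(subdir)

    def read_data(self):
        # Each read returns the next path found; "" stands in for EOF.
        return next(self._paths, "")

    @staticmethod
    def _walk(subdir):
        for dirpath, dirnames, filenames in os.walk(subdir):
            dirnames.sort()  # fixed order so the example is repeatable
            for name in sorted(filenames):
                yield os.path.join(dirpath, name)


def demo():
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt")):
            with open(os.path.join(top, rel), "w") as f:
                f.write("x")
        w = Walker()
        w.write_root(top)
        found = []
        while True:
            path = w.read_data()
            if path == "":  # EOF: the walk has terminated
                break
            found.append(os.path.relpath(path, top))
        return found
```

A real walkfs would expose root and data as files in a per-connection
directory; the method calls here only mirror the writes and reads a
client would perform.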

Query constraints and walking directives can be written to the 'ctl'
file as attribute=value pairs.  Example:
"user=glenda traversal=depth depth=3 type=file mode=u=r".
The ctl message "mode=sync" indicates that the walker thread writing to
the data file and the reading process synchronize on each found path.
The message "mode=async" allows the walker thread to run ahead of the
reader.  With the message "mode=stat" no found path is written to the
data file; only size and count are updated (see below).
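
A minimal sketch of how such a ctl message might be split into
attribute=value pairs.  The function name parse_ctl and the error
handling are assumptions; the proposal does not say how values that
contain spaces would be written, so this sketch does not handle them.
Note that the example message uses "mode" for permissions while
"mode=sync" and friends use it for the walker mode; the sketch simply
keeps the last value written.

```python
def parse_ctl(message):
    """Split a ctl message into a dict of attribute=value pairs."""
    attrs = {}
    for field in message.split():
        # Split on the first '=' only, so "mode=u=r" keeps "u=r" as value.
        name, sep, value = field.partition("=")
        if not sep or not name:
            raise ValueError("malformed ctl field: %r" % field)
        attrs[name] = value
    return attrs
```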

The stat file indicates the current status of the traversal.  If it is
"eof", the count file holds the number of found files and the size file
holds the total size in bytes of all found files.  During traversal
(stat is "walking") the size and count files hold the totals so far.
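
The relation between stat, count and size can be illustrated with a
small stepwise walker.  The class StatWalk and its step method are
hypothetical; they only model the running totals described above.

```python
import os
import tempfile


class StatWalk:
    """Hypothetical model of the stat, count and size files."""

    def __init__(self, root):
        self.stat = "walking"
        self.count = 0
        self.size = 0
        self._files = self._walk(root)

    @staticmethod
    def _walk(root):
        for dirpath, _dirs, filenames in os.walk(root):
            for name in filenames:
                yield os.path.join(dirpath, name)

    def step(self):
        """Visit one more file; return False once the walk is finished."""
        path = next(self._files, None)
        if path is None:
            self.stat = "eof"  # totals are final from here on
            return False
        self.count += 1
        self.size += os.lstat(path).st_size
        return True


def demo():
    with tempfile.TemporaryDirectory() as top:
        for name, body in (("a", b"12345"), ("b", b"123")):
            with open(os.path.join(top, name), "wb") as f:
                f.write(body)
        w = StatWalk(top)
        partial = []
        while w.step():
            # While walking, count and size hold the totals so far.
            partial.append((w.stat, w.count))
        return partial, (w.stat, w.count, w.size)
```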

The metadata file holds the metadata corresponding to the current path
in 'data' in "attribute=value" format.  If walkfs has been started with
a cache option, writing a path inside root to data sets the metadata
file to the values for that path.
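
One possible shape for a metadata line, as a sketch.  The attribute
names (name, type, length, mode, mtime) are assumptions, since the
proposal does not list them, and names containing spaces would need some
quoting rule that is not defined here.

```python
import os
import stat
import tempfile


def metadata_line(path):
    """Format a file's metadata as space-separated attribute=value pairs."""
    st = os.lstat(path)
    kind = "dir" if stat.S_ISDIR(st.st_mode) else "file"
    fields = [
        ("name", os.path.basename(path)),
        ("type", kind),
        ("length", st.st_size),
        ("mode", "%o" % stat.S_IMODE(st.st_mode)),  # permission bits, octal
        ("mtime", int(st.st_mtime)),
    ]
    return " ".join("%s=%s" % (k, v) for k, v in fields)


def demo():
    with tempfile.TemporaryDirectory() as top:
        path = os.path.join(top, "notes.txt")
        with open(path, "wb") as f:
            f.write(b"abcd")
        return metadata_line(path)
```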

walkfs options:
-c size     size of the memory buffer or cache to be used by walkfs
-p path     filename of a persistent cache/metadata database.  When
            walkfs is started with -p, information may be outdated;
            ctl messages are used to update the metadata database
-s channel  a fileserver can write update messages to channel when
            changing metadata of the filesystem it serves.  walkfs
            then updates its metadata database.

One usage scenario of walkfs is to implement find, du, walk, rdup and
the like.  Another usage scenario of walkfs, with the -s option, is to
add file indexing to a fileserver.

Regards,

    Jorge-León





* Re: [9fans] walk and find again
  2010-02-09  6:32 [9fans] walk and find again Georg Lehner
@ 2010-02-09 15:06 ` erik quanstrom
  2010-02-09 21:27   ` Georg Lehner
  0 siblings, 1 reply; 4+ messages in thread
From: erik quanstrom @ 2010-02-09 15:06 UTC (permalink / raw)
  To: 9fans

> One usage scenario of walkfs is to implement find, du, walk, rdup and
> the like. Another usage [scenario] of walkfs, with the -s option, is to add file
> indexing to a
> fileserver.

this seems more complicated than a straightforward
non-fileserver based implementation.  why do you
need a fileserver for this?

- erik




* Re: [9fans] walk and find again
  2010-02-09 15:06 ` erik quanstrom
@ 2010-02-09 21:27   ` Georg Lehner
  2010-02-09 21:33     ` erik quanstrom
  0 siblings, 1 reply; 4+ messages in thread
From: Georg Lehner @ 2010-02-09 21:27 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

erik quanstrom wrote:
>> One usage scenario of walkfs is to implement find, du, walk, rdup and
>> the like. Another usage [scenario] of walkfs, with the -s option, is to add file
>> indexing to a
>> fileserver.
>>
>
> this seems more complicated than a straightforward
> non-fileserver based implementation.  why do you
> need a fileserver for this?
>
> - erik
>
>

- Every three months there needs to be a discussion about 'find', right ;)
- walkfs can cache/reuse results from previous runs
- arbitrary filesystem indexing and lookup schemes can be implemented
  without changing the frontend interface.  Consider mime-type or
  keyword lookup
- no more hassle with space or other special characters in filenames
- inaccessible parts of the filesystem are just masked out, instead of
  returning errors.
- ...

frontend tools are simple, straightforward rc scripts; consider:

find:
  get a new walker thread
  write filter to ctl file
  write path to root file
  while !eof {
    path = read data
    do with path whatever has been specified
  }

du -s:
  get a new walker thread
  write path to root file
  write "mode=stat" to the ctl file
  read data (and block until walk is done)
  print count and size
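
The two front ends above can be tried against a stand-in connection.
This is a sketch in Python rather than rc, and nothing in it is the real
walkfs: the Conn class and its methods are hypothetical, "" stands in
for EOF, and "depth=N" is assumed to mean "report files at most N levels
below root, with files directly in root at level 1".

```python
import os
import tempfile


class Conn:
    """Hypothetical stand-in for one walkfs connection directory."""

    def __init__(self):
        self.ctl = {}
        self.count = 0
        self.size = 0
        self._paths = iter(())

    def write_ctl(self, msg):
        for field in msg.split():
            name, _, value = field.partition("=")
            self.ctl[name] = value

    def write_root(self, root):
        self.count = self.size = 0
        self._paths = self._walk(root)

    def _walk(self, root):
        limit = int(self.ctl.get("depth", 0))  # 0: no limit (assumption)
        for dirpath, dirnames, filenames in os.walk(root):
            rel = os.path.relpath(dirpath, root)
            level = 1 if rel == "." else rel.count(os.sep) + 2
            if limit and level >= limit:
                dirnames[:] = []  # prune: do not descend any further
            dirnames.sort()
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                self.count += 1
                self.size += os.lstat(path).st_size
                yield path

    def read_data(self):
        if self.ctl.get("mode") == "stat":
            # mode=stat: no paths are returned; finish the walk, then EOF.
            for _ in self._paths:
                pass
            return ""
        return next(self._paths, "")


def find(root, filt=""):
    conn = Conn()            # get a new walker
    conn.write_ctl(filt)     # write filter to ctl file
    conn.write_root(root)    # write path to root file
    while True:
        path = conn.read_data()
        if path == "":       # EOF
            return
        yield path


def du_s(root):
    conn = Conn()            # get a new walker
    conn.write_root(root)    # write path to root file
    conn.write_ctl("mode=stat")
    conn.read_data()         # returns only when the walk is done
    return conn.count, conn.size


def demo():
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "sub", "deep"))
        files = (("a.txt", "abc"),
                 (os.path.join("sub", "b.txt"), "hello"),
                 (os.path.join("sub", "deep", "c.txt"), "hi"))
        for rel, body in files:
            with open(os.path.join(top, rel), "w") as f:
                f.write(body)
        found = [os.path.relpath(p, top) for p in find(top, "depth=2")]
        return found, du_s(top)
```

The point of the sketch is that both tools reduce to a few writes and
reads on the connection; the filtering and accounting live behind it.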


Best Regards,

    Jorge-León





* Re: [9fans] walk and find again
  2010-02-09 21:27   ` Georg Lehner
@ 2010-02-09 21:33     ` erik quanstrom
  0 siblings, 0 replies; 4+ messages in thread
From: erik quanstrom @ 2010-02-09 21:33 UTC (permalink / raw)
  To: 9fans

> - walkfs can cache/reuse results from previous runs

that is a bad idea.  caching is just going to cause trouble.

> - no more hassle with space or other special characters in filenames

what?  if the underlying fs doesn't want to do spaces, you
can't force it.

> - inaccessible parts of the filesystem are just masked out, instead of
> returning errors.

this also seems like a bad idea.

- erik



