From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4B71017C.8080505@magma.com.ni>
Date: Tue, 9 Feb 2010 07:32:28 +0100
From: Georg Lehner
User-Agent: Mozilla-Thunderbird 2.0.0.22 (X11/20090706)
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Subject: [9fans] walk and find again
Topicbox-Message-UUID: d0cfcaf2-ead5-11e9-9d60-3106f5b1d025

walk, find, locate and friends try to cope with exploring filesystem metadata at breadth and length effectively, efficiently and with controlled time/space consumption.

Proposal: walkfs(4) (or finds, or indexfs, or sphinx, or ...)

walkfs serves a filesystem tree similar to that of the network devices. A 'clone' file is used to create a new connection to the metadata database. Each connection subdirectory contains the following files (and maybe more): root, data, ctl, size, count, stat, metadata.

Writing a string "subdir" to the 'root' file (re)starts a walk through the designated subdirectory. Each read from the 'data' file returns the next path found. EOF indicates that the walk has terminated.

Query constraints and walking directives can be written to the 'ctl' file as attribute=value pairs. Example: "user=glenda traversal=depth depth=3 type=file mode=u=r".

The ctl message "mode=sync" indicates that the walker thread writing to the data file and the reading process synchronize on each found path. The message "mode=async" allows the walker thread to run ahead of the reader. The message "mode=stat" does not write any found path to the data file, but just updates size and count (see below).

The stat file indicates the current status of the traversal. If it is "eof", the count file holds the number of files found and the size file holds the total size in bytes of all files found. While the traversal is in progress (stat is "walking"), the size and count files hold the totals up to the moment.

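
To make the ctl syntax and the count/size totals a bit more concrete, here is a rough sketch in Python. It is not an implementation of walkfs: parse_ctl and walk are names made up for illustration, the tree is an in-memory stand-in for a served file tree, and only the depth constraint is honored. It assumes that the first '=' in each word separates attribute from value (so "mode=u=r" carries the value "u=r") and that depth=N means at most N path elements below the root.

```python
def parse_ctl(msg):
    """Split a ctl message into a dict of attribute=value pairs.

    Assumption: only the first '=' in a word separates attribute
    from value, so "mode=u=r" yields ("mode", "u=r").
    """
    pairs = {}
    for word in msg.split():
        attr, sep, value = word.partition("=")
        if not attr or not sep:
            raise ValueError("bad ctl word: " + repr(word))
        pairs[attr] = value
    return pairs


def walk(tree, prefix="", depth=1, maxdepth=None):
    """Yield (path, size) for every file in tree, depth first.

    tree is a hypothetical stand-in for a served file tree: a dict
    maps names to entries, a nested dict is a directory, an int is
    a file size in bytes. Assumption: depth=N limits the walk to
    paths of at most N elements below the root.
    """
    if maxdepth is not None and depth > maxdepth:
        return
    for name in sorted(tree):
        entry = tree[name]
        path = prefix + "/" + name
        if isinstance(entry, dict):
            yield from walk(entry, path, depth + 1, maxdepth)
        else:
            yield path, entry


ctl = parse_ctl("user=glenda traversal=depth depth=3 type=file mode=u=r")
tree = {
    "lib": {"profile": 120, "plumbing": 80},
    "notes": 30,
    "src": {"a": {"b": {"deep.c": 500}}},
}

found = []        # what successive reads of 'data' would return
count = size = 0  # what 'count' and 'size' would hold
for path, nbytes in walk(tree, maxdepth=int(ctl["depth"])):
    found.append(path)
    count += 1
    size += nbytes

print(found)        # ['/lib/plumbing', '/lib/profile', '/notes']
print(count, size)  # 3 230
```

Because walk is a generator, the consumer pulls one path at a time, which is roughly the "mode=sync" behaviour; an "mode=async" walker would instead fill a buffer ahead of the reader.
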
The metadata file holds the metadata corresponding to the current path in 'data' in "attribute=value" format. If walkfs has been called with a cache option, writing a path inside root to data sets the metadata to the respective values.

walkfs options:

-c size
    indicates the size of the memory buffer or cache to be used by walkfs

-p path
    filename of a persistent cache/metadata database. When walkfs is started with -p, information may be outdated; ctl messages are used to update the metadata database

-s channel
    a fileserver can write update messages to channel when changing metadata of the filesystem it serves. walkfs then updates its metadata database.

One usage scenario of walkfs is to implement find, du, walk, rdup and the like.

Another usage scenario of walkfs, with the -s option, is to add file indexing to a fileserver.

Regards,

    Jorge-León