[9fans] text database Kirara

9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed

* [9fans] text database Kirara
@ 2013-08-06  1:14 arisawa
  2013-08-06  7:26 ` Francisco J Ballesteros
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: arisawa @ 2013-08-06  1:14 UTC (permalink / raw)
  To: 9fans

Hello 9fans,

I have written a text database named Kirara.
The following is a brief introduction to Kirara.
If you are interested in, get Kirara from:
  http://plan9.aichi-u.ac.jp/netlib/kirara/

Kenji Arisawa

-------------

Kirara

-------------

Kirara is a text indexing/retrieval tool for Plan 9.

Personal use: index/retrieve local files.

Kirara is based on the idea similar to Glimpse.

(1) indexing + grep
(2) multi-level indexing

(a) small space for indexing
(b) small update time
(c) quick search

Note that:
small indexing   <->	quick search
Kirara makes more index -> quick search
Glimpse is single-level indexing.

-------------

Query

Kirara does not support phrase search.
The database is index of words,

supporting:
QE mode (query expression mode)
	'&', '|', '*'
The example:
	'snoopy&html'
	'snoop*&htm*'

RE mode (regular expression mode)
	'&', RE
where RE denotes regular expression.
The example:
	'sn.*y&h.+l'

RE mode is a bit slow. (a few second.)

-------------

Words

Two or more runes.
All words are converted to lower case.
In English, words is composed of alphabets.
The number of runes is configurable

Assumption:
Text is composed of space-separated words
popular in English and many European Languages,
but not in Japanese.

-------------

The user's interface

Best match with Rio
	term% kfind snoop
	G snoop /sys/src/9/ip/
	G snoop /sys/src/cmd/spell/
	G snoop /sys/src/9/kw/
	...
	term% G snoop /sys/src/9/ip
	devip.c:34: 	Qsnoop,
	devip.c:95: 	case Qsnoop:
	devip.c:98: 		devdir(c, q, "snoop", qlen(cv->sq), cv->owner, 0400, dp);
	...
Note that: two steps
1. find directories
2. find files and the contents
Step 2 is actually 'grep'. we can use RE.

Two-steps search is not a weekness, but a desirable feature.
Because we have so many files that are hit by the query.

-------------

The organization

My example

/n/other/kirara/sysdb
target: (/lib /sys/lib /sys/src /sys/man /sys/include /sys/doc /rc)

/n/other/kirara/usrdb
target: $home/^(bin/rc lib netlib doc adm issues srclib src sources)

Indexing target is fully configurable.

-------------

Multi-Level Indexing

(1) Indexing (top level)
word to directory mapping

sysdb/index		# main index					# used for RE mode
sysdb/mindex	# meta index (alphabetic index)	# used for QE mode
sysdb/dind/*	# rough index of each directory
sysdb/QTDir		# map table (QID, mtime, path-to-dir)

index		# word to dir QID
	aa	0000000000014f0a
	aa	000000000001a1e0
	aa	000000000001a26e

mindex	# word to range in index
	aa 0 126669
	ab 126669 491569
	ac 491569 1258566
	ad 1258566 1852467
	...
dind/*				# `*' is a directory QID
	0000000000014f05
	0000000000014f0a
	000000000001a1ce

usrdb is same.

(2) Indexing (directory level)	# optional
word to file mapping

sysdb/find/*/ind.gz	# fine index of the directory (gzipped)
sysdb/find/*/qtn	# map table (QID, mtime, name)
where `*' is a directory QID

usrdb is same as sysdb.

-------------

Experiment

(a) hardware
GA-H61M-USB3-B3
Intel Pentium G860 (3GHz)
DDR3 PC3 4GB

(b) software
9front
cwfs64x

-------------

The performance (compression ratio)

target	 target	 num_of_dirs	  indexing
sysdb:   556 MB	   1790 dirs		 49 MB
usrdb:	6620 MB	   8948 dirs		150 MB

compression ratio: 49/556 (sysdb)
note: usrdb includes many non-text file.

-------------

The performance (retrieval time)
system dependent

RQ search	# kfind foo

0.1 seconds.

It is not important to make this time smaller.
(sufficiently small)

RE search	# kfind -r foo

a few seconds


-------------

The performance (construction/update)

(a) Construction time
system dependent

Initial construction
need
	10 minutes for sysdb
	30 minutes for usrdb

(b) Updating time
two commands for update

mkdb
	20 seconds to a few minutes for usrdb
	depends largely on state of cache

mkdb1 (currently only for usrdb)
	5 to 15 seconds for usrdb
	mkdb1 needs event log

-------------

Scalability

Main factors

(a) retrieval time
QE search: proportional to number of dirs that include the query
RE search: proportional to size of index

(b) initial construction time
proportional to total data

(c) update time
mkdb: 	proportional to number of dirs and the changes
mkdb1:	proportional to changes and size of index

-------------

Used Tools

(1) rc

(2) grep, sed, awk, sort, diff, gzip, ...

(3) some new tools written in C

-------------

What Kirara means?

Kirara is name of a girl that appeared in a Japanese comic book.
(But I have never read the book.)
The name is seldom used in real world.
From the name we Japanese imagine something glittering.
I like the name.

-------------

References

[1] GLIMPSE: A Tool to Search Through Entire File Systems
Udi Manber and Sun Wu (1993)
http://webglimpse.net/pubs/glimpse.pdf
[2] Glimpse Documentation
http://webglimpse.net/gdocs/glimpsehelp.html




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-06  1:14 [9fans] text database Kirara arisawa
@ 2013-08-06  7:26 ` Francisco J Ballesteros
  2013-08-06  8:12 ` Peter A. Cejchan
  2013-08-06 18:14 ` brz-systemd-dev
  2 siblings, 0 replies; 8+ messages in thread
From: Francisco J Ballesteros @ 2013-08-06  7:26 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

nice. btw I think I have another tag tool at contrib. not sure, but it was called tags, and the search tool was called F. 

I say this because it used file and per file type tag listing (ms2txt, etc) to index other file types. It might be an idea to add to your nice tool. 

On Aug 6, 2013, at 3:14 AM, arisawa <arisawa@ar.aichi-u.ac.jp> wrote:

> Hello 9fans,
> 
> I have written a text database named Kirara.
> The following is a brief introduction to Kirara.
> If you are interested in, get Kirara from:
>  http://plan9.aichi-u.ac.jp/netlib/kirara/
> 
> Kenji Arisawa
> 
> -------------
> 
> Kirara
> 
> -------------
> 
> Kirara is a text indexing/retrieval tool for Plan 9.
> 
> Personal use: index/retrieve local files.
> 
> Kirara is based on the idea similar to Glimpse.
> 
> (1) indexing + grep
> (2) multi-level indexing
> 
> (a) small space for indexing
> (b) small update time
> (c) quick search
> 
> Note that:
> small indexing   <->    quick search
> Kirara makes more index -> quick search
> Glimpse is single-level indexing.
> 
> -------------
> 
> Query
> 
> Kirara does not support phrase search.
> The database is index of words,
> 
> supporting:
> QE mode (query expression mode)
>    '&', '|', '*'
> The example:
>    'snoopy&html'
>    'snoop*&htm*'
> 
> RE mode (regular expression mode)
>    '&', RE
> where RE denotes regular expression.
> The example:
>    'sn.*y&h.+l'
> 
> RE mode is a bit slow. (a few second.)
> 
> -------------
> 
> Words
> 
> Two or more runes.
> All words are converted to lower case.
> In English, words is composed of alphabets.
> The number of runes is configurable
> 
> Assumption:
> Text is composed of space-separated words
> popular in English and many European Languages,
> but not in Japanese.
> 
> -------------
> 
> The user's interface
> 
> Best match with Rio
>    term% kfind snoop
>    G snoop /sys/src/9/ip/
>    G snoop /sys/src/cmd/spell/
>    G snoop /sys/src/9/kw/
>    ...
>    term% G snoop /sys/src/9/ip
>    devip.c:34:    Qsnoop,
>    devip.c:95:    case Qsnoop:
>    devip.c:98:        devdir(c, q, "snoop", qlen(cv->sq), cv->owner, 0400, dp);
>    ...
> Note that: two steps
> 1. find directories
> 2. find files and the contents
> Step 2 is actually 'grep'. we can use RE.
> 
> Two-steps search is not a weekness, but a desirable feature.
> Because we have so many files that are hit by the query.
> 
> -------------
> 
> The organization
> 
> My example
> 
> /n/other/kirara/sysdb
> target: (/lib /sys/lib /sys/src /sys/man /sys/include /sys/doc /rc)
> 
> /n/other/kirara/usrdb
> target: $home/^(bin/rc lib netlib doc adm issues srclib src sources)
> 
> Indexing target is fully configurable.
> 
> -------------
> 
> Multi-Level Indexing
> 
> (1) Indexing (top level)
> word to directory mapping
> 
> sysdb/index        # main index                    # used for RE mode
> sysdb/mindex    # meta index (alphabetic index)    # used for QE mode
> sysdb/dind/*    # rough index of each directory
> sysdb/QTDir        # map table (QID, mtime, path-to-dir)
> 
> index        # word to dir QID
>    aa    0000000000014f0a
>    aa    000000000001a1e0
>    aa    000000000001a26e
> 
> mindex    # word to range in index
>    aa 0 126669
>    ab 126669 491569
>    ac 491569 1258566
>    ad 1258566 1852467
>    ...
> dind/*                # `*' is a directory QID
>    0000000000014f05
>    0000000000014f0a
>    000000000001a1ce
> 
> usrdb is same.
> 
> (2) Indexing (directory level)    # optional
> word to file mapping
> 
> sysdb/find/*/ind.gz    # fine index of the directory (gzipped)
> sysdb/find/*/qtn    # map table (QID, mtime, name)
> where `*' is a directory QID
> 
> usrdb is same as sysdb.
> 
> -------------
> 
> Experiment
> 
> (a) hardware
> GA-H61M-USB3-B3
> Intel Pentium G860 (3GHz)
> DDR3 PC3 4GB
> 
> (b) software
> 9front
> cwfs64x
> 
> -------------
> 
> The performance (compression ratio)
> 
> target     target     num_of_dirs      indexing
> sysdb:   556 MB       1790 dirs         49 MB
> usrdb:    6620 MB       8948 dirs        150 MB
> 
> compression ratio: 49/556 (sysdb)
> note: usrdb includes many non-text file.
> 
> -------------
> 
> The performance (retrieval time)
> system dependent
> 
> RQ search    # kfind foo
> 
> 0.1 seconds.
> 
> It is not important to make this time smaller.
> (sufficiently small)
> 
> RE search    # kfind -r foo
> 
> a few seconds
> 
> 
> -------------
> 
> The performance (construction/update)
> 
> (a) Construction time
> system dependent
> 
> Initial construction
> need
>    10 minutes for sysdb
>    30 minutes for usrdb
> 
> (b) Updating time
> two commands for update
> 
> mkdb
>    20 seconds to a few minutes for usrdb
>    depends largely on state of cache
> 
> mkdb1 (currently only for usrdb)
>    5 to 15 seconds for usrdb
>    mkdb1 needs event log
> 
> -------------
> 
> Scalability
> 
> Main factors
> 
> (a) retrieval time
> QE search: proportional to number of dirs that include the query
> RE search: proportional to size of index
> 
> (b) initial construction time
> proportional to total data
> 
> (c) update time
> mkdb:    proportional to number of dirs and the changes
> mkdb1:    proportional to changes and size of index
> 
> -------------
> 
> Used Tools
> 
> (1) rc
> 
> (2) grep, sed, awk, sort, diff, gzip, ...
> 
> (3) some new tools written in C
> 
> -------------
> 
> What Kirara means?
> 
> Kirara is name of a girl that appeared in a Japanese comic book.
> (But I have never read the book.)
> The name is seldom used in real world.
> From the name we Japanese imagine something glittering.
> I like the name.
> 
> -------------
> 
> References
> 
> [1] GLIMPSE: A Tool to Search Through Entire File Systems
> Udi Manber and Sun Wu (1993)
> http://webglimpse.net/pubs/glimpse.pdf
> [2] Glimpse Documentation
> http://webglimpse.net/gdocs/glimpsehelp.html
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-06  1:14 [9fans] text database Kirara arisawa
  2013-08-06  7:26 ` Francisco J Ballesteros
@ 2013-08-06  8:12 ` Peter A. Cejchan
  2013-08-06 18:14 ` brz-systemd-dev
  2 siblings, 0 replies; 8+ messages in thread
From: Peter A. Cejchan @ 2013-08-06  8:12 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 6007 bytes --]

Sounds great!
Thank you, Kenji !!
++pac


On Tue, Aug 6, 2013 at 3:14 AM, arisawa <arisawa@ar.aichi-u.ac.jp> wrote:

> Hello 9fans,
>
> I have written a text database named Kirara.
> The following is a brief introduction to Kirara.
> If you are interested in, get Kirara from:
>   http://plan9.aichi-u.ac.jp/netlib/kirara/
>
> Kenji Arisawa
>
> -------------
>
> Kirara
>
> -------------
>
> Kirara is a text indexing/retrieval tool for Plan 9.
>
> Personal use: index/retrieve local files.
>
> Kirara is based on the idea similar to Glimpse.
>
> (1) indexing + grep
> (2) multi-level indexing
>
> (a) small space for indexing
> (b) small update time
> (c) quick search
>
> Note that:
> small indexing   <->    quick search
> Kirara makes more index -> quick search
> Glimpse is single-level indexing.
>
> -------------
>
> Query
>
> Kirara does not support phrase search.
> The database is index of words,
>
> supporting:
> QE mode (query expression mode)
>         '&', '|', '*'
> The example:
>         'snoopy&html'
>         'snoop*&htm*'
>
> RE mode (regular expression mode)
>         '&', RE
> where RE denotes regular expression.
> The example:
>         'sn.*y&h.+l'
>
> RE mode is a bit slow. (a few second.)
>
> -------------
>
> Words
>
> Two or more runes.
> All words are converted to lower case.
> In English, words is composed of alphabets.
> The number of runes is configurable
>
> Assumption:
> Text is composed of space-separated words
> popular in English and many European Languages,
> but not in Japanese.
>
> -------------
>
> The user's interface
>
> Best match with Rio
>         term% kfind snoop
>         G snoop /sys/src/9/ip/
>         G snoop /sys/src/cmd/spell/
>         G snoop /sys/src/9/kw/
>         ...
>         term% G snoop /sys/src/9/ip
>         devip.c:34:     Qsnoop,
>         devip.c:95:     case Qsnoop:
>         devip.c:98:             devdir(c, q, "snoop", qlen(cv->sq),
> cv->owner, 0400, dp);
>         ...
> Note that: two steps
> 1. find directories
> 2. find files and the contents
> Step 2 is actually 'grep'. we can use RE.
>
> Two-steps search is not a weekness, but a desirable feature.
> Because we have so many files that are hit by the query.
>
> -------------
>
> The organization
>
> My example
>
> /n/other/kirara/sysdb
> target: (/lib /sys/lib /sys/src /sys/man /sys/include /sys/doc /rc)
>
> /n/other/kirara/usrdb
> target: $home/^(bin/rc lib netlib doc adm issues srclib src sources)
>
> Indexing target is fully configurable.
>
> -------------
>
> Multi-Level Indexing
>
> (1) Indexing (top level)
> word to directory mapping
>
> sysdb/index             # main index                                    #
> used for RE mode
> sysdb/mindex    # meta index (alphabetic index) # used for QE mode
> sysdb/dind/*    # rough index of each directory
> sysdb/QTDir             # map table (QID, mtime, path-to-dir)
>
> index           # word to dir QID
>         aa      0000000000014f0a
>         aa      000000000001a1e0
>         aa      000000000001a26e
>
> mindex  # word to range in index
>         aa 0 126669
>         ab 126669 491569
>         ac 491569 1258566
>         ad 1258566 1852467
>         ...
> dind/*                          # `*' is a directory QID
>         0000000000014f05
>         0000000000014f0a
>         000000000001a1ce
>
> usrdb is same.
>
> (2) Indexing (directory level)  # optional
> word to file mapping
>
> sysdb/find/*/ind.gz     # fine index of the directory (gzipped)
> sysdb/find/*/qtn        # map table (QID, mtime, name)
> where `*' is a directory QID
>
> usrdb is same as sysdb.
>
> -------------
>
> Experiment
>
> (a) hardware
> GA-H61M-USB3-B3
> Intel Pentium G860 (3GHz)
> DDR3 PC3 4GB
>
> (b) software
> 9front
> cwfs64x
>
> -------------
>
> The performance (compression ratio)
>
> target   target  num_of_dirs      indexing
> sysdb:   556 MB    1790 dirs             49 MB
> usrdb:  6620 MB    8948 dirs            150 MB
>
> compression ratio: 49/556 (sysdb)
> note: usrdb includes many non-text file.
>
> -------------
>
> The performance (retrieval time)
> system dependent
>
> RQ search       # kfind foo
>
> 0.1 seconds.
>
> It is not important to make this time smaller.
> (sufficiently small)
>
> RE search       # kfind -r foo
>
> a few seconds
>
>
> -------------
>
> The performance (construction/update)
>
> (a) Construction time
> system dependent
>
> Initial construction
> need
>         10 minutes for sysdb
>         30 minutes for usrdb
>
> (b) Updating time
> two commands for update
>
> mkdb
>         20 seconds to a few minutes for usrdb
>         depends largely on state of cache
>
> mkdb1 (currently only for usrdb)
>         5 to 15 seconds for usrdb
>         mkdb1 needs event log
>
> -------------
>
> Scalability
>
> Main factors
>
> (a) retrieval time
> QE search: proportional to number of dirs that include the query
> RE search: proportional to size of index
>
> (b) initial construction time
> proportional to total data
>
> (c) update time
> mkdb:   proportional to number of dirs and the changes
> mkdb1:  proportional to changes and size of index
>
> -------------
>
> Used Tools
>
> (1) rc
>
> (2) grep, sed, awk, sort, diff, gzip, ...
>
> (3) some new tools written in C
>
> -------------
>
> What Kirara means?
>
> Kirara is name of a girl that appeared in a Japanese comic book.
> (But I have never read the book.)
> The name is seldom used in real world.
> From the name we Japanese imagine something glittering.
> I like the name.
>
> -------------
>
> References
>
> [1] GLIMPSE: A Tool to Search Through Entire File Systems
> Udi Manber and Sun Wu (1993)
> http://webglimpse.net/pubs/glimpse.pdf
> [2] Glimpse Documentation
> http://webglimpse.net/gdocs/glimpsehelp.html
>
>
>

[-- Attachment #2: Type: text/html, Size: 7504 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-06  1:14 [9fans] text database Kirara arisawa
  2013-08-06  7:26 ` Francisco J Ballesteros
  2013-08-06  8:12 ` Peter A. Cejchan
@ 2013-08-06 18:14 ` brz-systemd-dev
  2013-08-07  6:31   ` arisawa
  2 siblings, 1 reply; 8+ messages in thread
From: brz-systemd-dev @ 2013-08-06 18:14 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 622 bytes --]

I've played around with Kirara for a couple hours, now, and am pretty
surprised at how simple it is.  It's already become integrated into my
workflow.  Being able to quickly (and easily) search for relevant
snippets of code throughout the system is quite useful.

I feel compelled to mention that the code is abnormally high in
quality.  (This is seen, even in the rc scripts)
Now I'm going to have to look through your other projects.

Thanks for releasing this.

- BurnZeZ

Bug:
	kirara-1.1/INSTALL:9: mkdir -p $kirarar/bin/^(rc $objtype)
	Here (and on line 11), '$kirarar' is used instead of '$kirara'.

[-- Attachment #2: Type: message/rfc822, Size: 7706 bytes --]

From: arisawa <arisawa@ar.aichi-u.ac.jp>
To: 9fans@9fans.net
Subject: [9fans] text database Kirara
Date: Tue, 6 Aug 2013 10:14:36 +0900
Message-ID: <5E38D9FC-75B6-40C9-AB64-E210DACE0B4E@ar.aichi-u.ac.jp>

Hello 9fans,

I have written a text database named Kirara.
The following is a brief introduction to Kirara.
If you are interested in, get Kirara from:
  http://plan9.aichi-u.ac.jp/netlib/kirara/

Kenji Arisawa

-------------

Kirara

-------------

Kirara is a text indexing/retrieval tool for Plan 9.

Personal use: index/retrieve local files.

Kirara is based on the idea similar to Glimpse.

(1) indexing + grep
(2) multi-level indexing

(a) small space for indexing
(b) small update time
(c) quick search

Note that:
small indexing   <->	quick search
Kirara makes more index -> quick search
Glimpse is single-level indexing.

-------------

Query

Kirara does not support phrase search.
The database is index of words,

supporting:
QE mode (query expression mode)
	'&', '|', '*'
The example:
	'snoopy&html'
	'snoop*&htm*'

RE mode (regular expression mode)
	'&', RE
where RE denotes regular expression.
The example:
	'sn.*y&h.+l'

RE mode is a bit slow. (a few second.)

-------------

Words

Two or more runes.
All words are converted to lower case.
In English, words is composed of alphabets.
The number of runes is configurable

Assumption:
Text is composed of space-separated words
popular in English and many European Languages,
but not in Japanese.

-------------

The user's interface

Best match with Rio
	term% kfind snoop
	G snoop /sys/src/9/ip/
	G snoop /sys/src/cmd/spell/
	G snoop /sys/src/9/kw/
	...
	term% G snoop /sys/src/9/ip
	devip.c:34: 	Qsnoop,
	devip.c:95: 	case Qsnoop:
	devip.c:98: 		devdir(c, q, "snoop", qlen(cv->sq), cv->owner, 0400, dp);
	...
Note that: two steps
1. find directories
2. find files and the contents
Step 2 is actually 'grep'. we can use RE.

Two-steps search is not a weekness, but a desirable feature.
Because we have so many files that are hit by the query.

-------------

The organization

My example

/n/other/kirara/sysdb
target: (/lib /sys/lib /sys/src /sys/man /sys/include /sys/doc /rc)

/n/other/kirara/usrdb
target: $home/^(bin/rc lib netlib doc adm issues srclib src sources)

Indexing target is fully configurable.

-------------

Multi-Level Indexing

(1) Indexing (top level)
word to directory mapping

sysdb/index		# main index					# used for RE mode
sysdb/mindex	# meta index (alphabetic index)	# used for QE mode
sysdb/dind/*	# rough index of each directory
sysdb/QTDir		# map table (QID, mtime, path-to-dir)

index		# word to dir QID
	aa	0000000000014f0a
	aa	000000000001a1e0
	aa	000000000001a26e

mindex	# word to range in index
	aa 0 126669
	ab 126669 491569
	ac 491569 1258566
	ad 1258566 1852467
	...
dind/*				# `*' is a directory QID
	0000000000014f05
	0000000000014f0a
	000000000001a1ce

usrdb is same.

(2) Indexing (directory level)	# optional
word to file mapping

sysdb/find/*/ind.gz	# fine index of the directory (gzipped)
sysdb/find/*/qtn	# map table (QID, mtime, name)
where `*' is a directory QID

usrdb is same as sysdb.

-------------

Experiment

(a) hardware
GA-H61M-USB3-B3
Intel Pentium G860 (3GHz)
DDR3 PC3 4GB

(b) software
9front
cwfs64x

-------------

The performance (compression ratio)

target	 target	 num_of_dirs	  indexing
sysdb:   556 MB	   1790 dirs		 49 MB
usrdb:	6620 MB	   8948 dirs		150 MB

compression ratio: 49/556 (sysdb)
note: usrdb includes many non-text file.

-------------

The performance (retrieval time)
system dependent

RQ search	# kfind foo

0.1 seconds.

It is not important to make this time smaller.
(sufficiently small)

RE search	# kfind -r foo

a few seconds

-------------

The performance (construction/update)

(a) Construction time
system dependent

Initial construction
need
	10 minutes for sysdb
	30 minutes for usrdb

(b) Updating time
two commands for update

mkdb
	20 seconds to a few minutes for usrdb
	depends largely on state of cache

mkdb1 (currently only for usrdb)
	5 to 15 seconds for usrdb
	mkdb1 needs event log

-------------

Scalability

Main factors

(a) retrieval time
QE search: proportional to number of dirs that include the query
RE search: proportional to size of index

(b) initial construction time
proportional to total data

(c) update time
mkdb: 	proportional to number of dirs and the changes
mkdb1:	proportional to changes and size of index

-------------

Used Tools

(1) rc

(2) grep, sed, awk, sort, diff, gzip, ...

(3) some new tools written in C

-------------

What Kirara means?

Kirara is name of a girl that appeared in a Japanese comic book.
(But I have never read the book.)
The name is seldom used in real world.
From the name we Japanese imagine something glittering.
I like the name.

-------------

References

[1] GLIMPSE: A Tool to Search Through Entire File Systems
Udi Manber and Sun Wu (1993)
http://webglimpse.net/pubs/glimpse.pdf
[2] Glimpse Documentation
http://webglimpse.net/gdocs/glimpsehelp.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-06 18:14 ` brz-systemd-dev
@ 2013-08-07  6:31   ` arisawa
  2013-08-07  7:22     ` Skip Tavakkolian
  0 siblings, 1 reply; 8+ messages in thread
From: arisawa @ 2013-08-07  6:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello 9fans,

Thanks for the attention to Kirara.
There was bugs in document, and probably in programs.
These bug reports are helpful for me.
There is room for improvement, any comment is welcome for future
development.

Kenji Arisawa

On 2013/08/07, at 3:14, brz-systemd-dev@intma.in wrote:

> I've played around with Kirara for a couple hours, now, and am pretty
> surprised at how simple it is.  It's already become integrated into my
> workflow.  Being able to quickly (and easily) search for relevant
> snippets of code throughout the system is quite useful.
>
> I feel compelled to mention that the code is abnormally high in
> quality.  (This is seen, even in the rc scripts)
> Now I'm going to have to look through your other projects.
>
> Thanks for releasing this.
>
> - BurnZeZ
>
> Bug:
> 	kirara-1.1/INSTALL:9: mkdir -p $kirarar/bin/^(rc $objtype)
> 	Here (and on line 11), '$kirarar' is used instead of '$kirara'.
>





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-07  6:31   ` arisawa
@ 2013-08-07  7:22     ` Skip Tavakkolian
  2013-08-07  8:17       ` arisawa
  0 siblings, 1 reply; 8+ messages in thread
From: Skip Tavakkolian @ 2013-08-07  7:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

[-- Attachment #1: Type: text/plain, Size: 1676 bytes --]

i've just started playing with it.  why are the binaries installed in
/n/other/kirara?

on fossil i had to fake /n/other/kirara with:

aux/stub -d /n/other/kirara
mkdir kirara && chmod +t kirara
bind -c /n/other/kirara kirara
mk install

but it doesn't install correctly:

% kirara=`{pwd}^/kirara
% kfind snoop
Can't cd /n/other/kirara/sysdb: '/n/other/kirara' does not exist
grep: can't open Dir: 'Dir' file does not exist
Can't cd /n/other/kirara/usrdb: '/n/other/kirara' does not exist
grep: can't open Dir: 'Dir' file does not exist
%



On Tue, Aug 6, 2013 at 11:31 PM, arisawa <arisawa@ar.aichi-u.ac.jp> wrote:

> Hello 9fans,
>
> Thanks for the attention to Kirara.
> There was bugs in document, and probably in programs.
> These bug reports are helpful for me.
> There is room for improvement, any comment is welcome for future
> development.
>
> Kenji Arisawa
>
> On 2013/08/07, at 3:14, brz-systemd-dev@intma.in wrote:
>
> > I've played around with Kirara for a couple hours, now, and am pretty
> > surprised at how simple it is.  It's already become integrated into my
> > workflow.  Being able to quickly (and easily) search for relevant
> > snippets of code throughout the system is quite useful.
> >
> > I feel compelled to mention that the code is abnormally high in
> > quality.  (This is seen, even in the rc scripts)
> > Now I'm going to have to look through your other projects.
> >
> > Thanks for releasing this.
> >
> > - BurnZeZ
> >
> > Bug:
> >       kirara-1.1/INSTALL:9: mkdir -p $kirarar/bin/^(rc $objtype)
> >       Here (and on line 11), '$kirarar' is used instead of '$kirara'.
> >
>
>
>
>

[-- Attachment #2: Type: text/html, Size: 2509 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-07  7:22     ` Skip Tavakkolian
@ 2013-08-07  8:17       ` arisawa
  2013-08-07 13:32         ` erik quanstrom
  0 siblings, 1 reply; 8+ messages in thread
From: arisawa @ 2013-08-07  8:17 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello Skip,

On fossil, Kirara can be installed anywhere.
Let the directory be $kirara.
It is desirable that the directory is out of Kirara's target,
and out of fs dump.
On fossil, execute
	chmod +t $kirara
You must set the environment variable, otherwise you must
change all /n/other/kirara in the scripts to the place.

To install Kirara, change the line in mkfile
	kirara=/n/other/kirara
to the directory. ( sorry I forgot to instruct this one)

Kirara's binary is installed to $kirara/bin. Because Kirara introduced many new tools
that might conflict existing binaries. Therefore it is safe not to mix with your binaries.

Kenji Arisawa

On 2013/08/07, at 16:22, Skip Tavakkolian <skip.tavakkolian@gmail.com> wrote:

> i've just started playing with it.  why are the binaries installed in /n/other/kirara?
> 
> on fossil i had to fake /n/other/kirara with:
> 
> aux/stub -d /n/other/kirara
> mkdir kirara && chmod +t kirara
> bind -c /n/other/kirara kirara
> mk install
> 
> but it doesn't install correctly:
> 
> % kirara=`{pwd}^/kirara
> % kfind snoop
> Can't cd /n/other/kirara/sysdb: '/n/other/kirara' does not exist
> grep: can't open Dir: 'Dir' file does not exist
> Can't cd /n/other/kirara/usrdb: '/n/other/kirara' does not exist
> grep: can't open Dir: 'Dir' file does not exist
> % 
> 
> 
> 
> On Tue, Aug 6, 2013 at 11:31 PM, arisawa <arisawa@ar.aichi-u.ac.jp> wrote:
> Hello 9fans,
> 
> Thanks for the attention to Kirara.
> There was bugs in document, and probably in programs.
> These bug reports are helpful for me.
> There is room for improvement, any comment is welcome for future
> development.
> 
> Kenji Arisawa
> 
> On 2013/08/07, at 3:14, brz-systemd-dev@intma.in wrote:
> 
> > I've played around with Kirara for a couple hours, now, and am pretty
> > surprised at how simple it is.  It's already become integrated into my
> > workflow.  Being able to quickly (and easily) search for relevant
> > snippets of code throughout the system is quite useful.
> >
> > I feel compelled to mention that the code is abnormally high in
> > quality.  (This is seen, even in the rc scripts)
> > Now I'm going to have to look through your other projects.
> >
> > Thanks for releasing this.
> >
> > - BurnZeZ
> >
> > Bug:
> >       kirara-1.1/INSTALL:9: mkdir -p $kirarar/bin/^(rc $objtype)
> >       Here (and on line 11), '$kirarar' is used instead of '$kirara'.
> >
> 
> 
> 
> 




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9fans] text database Kirara
  2013-08-07  8:17       ` arisawa
@ 2013-08-07 13:32         ` erik quanstrom
  0 siblings, 0 replies; 8+ messages in thread
From: erik quanstrom @ 2013-08-07 13:32 UTC (permalink / raw)
  To: 9fans

> You must set the environment variable, otherwise you must
> change all /n/other/kirara in the scripts to the place.
> 
> To install Kirara, change the line in mkfile
> 	kirara=/n/other/kirara
> to the directory. ( sorry I forgot to instruct this one)
> 
> Kirara's binary is installed to $kirara/bin. Because Kirara introduced many new tools
> that might conflict existing binaries. Therefore it is safe not to mix with your binaries.

if you let $kirara ≡ kirara then you will have a standard
place to install that fits with the system.

- erik



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-08-07 13:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-06  1:14 [9fans] text database Kirara arisawa
2013-08-06  7:26 ` Francisco J Ballesteros
2013-08-06  8:12 ` Peter A. Cejchan
2013-08-06 18:14 ` brz-systemd-dev
2013-08-07  6:31   ` arisawa
2013-08-07  7:22     ` Skip Tavakkolian
2013-08-07  8:17       ` arisawa
2013-08-07 13:32         ` erik quanstrom

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).