Post-processing ConTeXt's output for text search

From: Piotr Kopszak @ 2007-07-04  8:28 UTC
  To: ntg-context

Hello list, 

I would like to implement online text search for a book I am publishing
with ConTeXt, without making the book itself available online, something
like the snippet view in text search on the Google Books site. I don't
even need the snippet view; page numbers alone would be sufficient to
begin with. I am not asking for a complete solution, of course, but
rather for advice on which direction to take. My first vague idea is
the following: to make queries fast, I think it would be useful to
obtain a list of words with the numbers of the pages on which they
appear, something very similar to a plain index. So perhaps it would be
possible to force the indexing engine to treat every word in the text
as if it were an argument of the \index command and write out a list in
text format that would be easy to feed into a database. This is just a
blind guess, far from perfect by design, only the approach that seems
easiest to implement. Please tell me what other approaches would be
more promising.
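
As a rough illustration of the word-to-page list described above, here
is a minimal sketch in Python. It assumes the plain text of each page is
already available as a string keyed by page number; how to obtain that
from ConTeXt is exactly the open question and is not addressed here. The
names build_page_index and format_index are hypothetical, chosen only
for this example.

```python
import re
from collections import defaultdict

# Runs of letters only (Unicode-aware), so digits and punctuation are skipped.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_page_index(pages):
    """Map each lowercased word to the sorted list of pages it appears on.

    `pages` is assumed to map a page number to the plain text of that page.
    """
    index = defaultdict(set)
    for page_no, text in pages.items():
        for word in WORD_RE.findall(text):
            index[word.lower()].add(page_no)
    return {word: sorted(nums) for word, nums in index.items()}


def format_index(index):
    """Return one 'word<TAB>page,page,...' line per word, sorted by word."""
    return "\n".join(
        f"{word}\t{','.join(map(str, nums))}"
        for word, nums in sorted(index.items())
    )


if __name__ == "__main__":
    # Made-up sample pages, for illustration only.
    sample = {1: "The gallery opened in Warsaw.", 2: "Warsaw gallery catalogue."}
    print(format_index(build_page_index(sample)))
```

The tab-separated output is meant to be easy to load into a database
table of (word, page) rows; a query for a word then returns only page
numbers, never the text itself.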

Thanks in advance

Piotr

--

  Piotr Kopszak, Ph.D.
  Polish Art Gallery, National Museum in Warsaw
  ----------------------------->    http://kopszak.mnw.art.pl/
  http://www.magnatune.com/artists/altri_stromenti

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________
