From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 14 May 2004 10:47:52 -0500
From: splite@purdue.edu
To: 9fans@cse.psu.edu
Subject: Re: [9fans] text search in PDF?
Message-ID: <20040514154752.GB16819@sigint.cs.purdue.edu>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Topicbox-Message-UUID: 7b759cd8-eacd-11e9-9e20-41e7f4b1d025

On Fri, May 14, 2004 at 07:22:06PM +0500, dvd@davidashen.net wrote:
>
> The problem is high quality formatting -- most documents which are
> kerned are kerned explicitly -- words are broken into parts, and
> displacements are set for word parts, which makes searching for whole
> words impossible.

Funny how things come full circle.  We used to need preview images
attached to Encapsulated PostScript files because on-screen rendering
was too slow.  Now it seems we need "plain-text thumbnails" embedded in
PDF files to facilitate searches, braille or spoken output, etc.  Maybe
PDF already has that capability but nobody uses it.  (Almost nobody used
the EPSI format either.)