From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
Date: Wed, 3 Nov 2004 07:53:42 -0500
From: Russ Cox
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] pdf2txt
In-Reply-To:
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References:
Topicbox-Message-UUID: fa920970-eacd-11e9-9e20-41e7f4b1d025

> looking into src of 'page', it seems that
> pdf docs don't go to GhostScript, but are treated internally.

they go to ghostscript.

> Would it then be a problem to have something like 'pdf2txt'?
>
> cat doc.pdf | pdf2ps | ps2a (or the like)
>
> doesn't work apparently due to pdf2ps.

you could look at the xpdf tools,
which include a pdftotext.

> Or, how do YOU index your pdf docs? Manually??

i don't.

russ