ntg-context - mailing list for ConTeXt users
* ConTeXt and the blind
@ 2004-04-14 20:50 Alan Bowen
  2004-04-14 22:20 ` Bill McClain
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alan Bowen @ 2004-04-14 20:50 UTC (permalink / raw)


I have very recently launched a new journal which has been designed on 
the assumption that it will exist in both electronic form and in 
print—hence, it is produced using ConTeXt and exists natively in PDF 
files. This morning I was asked by a colleague who is totally blind 
whether it would be possible for him to have ASCII or .txt files that 
he could use easily with his screen reading software. (My sense is that 
he may be able to use PDF files with this software, but that it is not 
easy.)

So, does anyone on the list have ideas about how to produce such files 
from the files I currently have in hand or any experience with this 
sort of problem? Is there, for instance, a way to strip away all the 
formatting commands from a ConTeXt source file automatically so as to 
leave an unencoded .txt file that I could send him? I gather that he 
can use .htm files, but so far as I can tell there is no path from a 
ConTeXt source file to an HTML file—at least, a specific query about 
this made recently on this list by someone else seems to have gone 
unanswered.

Cheers, Alan


* Re: ConTeXt and the blind
  2004-04-14 20:50 ConTeXt and the blind Alan Bowen
@ 2004-04-14 22:20 ` Bill McClain
  2004-04-15 11:06   ` Erik Hetzner
  2004-04-14 23:44 ` Matthew Huggett
  2004-04-15 15:32 ` Jan Hlavacek
  2 siblings, 1 reply; 8+ messages in thread
From: Bill McClain @ 2004-04-14 22:20 UTC (permalink / raw)


On Wed, 14 Apr 2004 16:50:04 -0400
Alan Bowen <acbowen@princeton.edu> wrote:

> So, does anyone on the list have ideas about how to produce such files
> from the files I currently have in hand or any experience with this
> sort of problem?

I have used the pdftotext utility, part of the xpdf package, for similar
tasks. In the case of hyphenated line endings, the word will be
hyphenated and broken across lines just as in the pdf, and that might be
a problem for the reader program.
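
A minimal sketch of one way to repair this after the fact, assuming the
pdftotext output has already been read into a string. The function name is
hypothetical, and the rule is a guess rather than a tested recipe: a hyphen
between two lowercase letters at a line end is treated as a line-break hyphen.

```python
import re

def join_hyphenated_lines(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line."""
    # lowercase letter + hyphen + newline + lowercase letter:
    # assume a typographic break and drop both the hyphen and the newline.
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)

sample = "The journal is pro-\nduced using ConTeXt.\n"
print(join_hyphenated_lines(sample))
```

Genuine compounds that happen to break at the hyphen ("well-" / "known")
would be joined wrongly, so the result still needs a quick check.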

-Bill
-- 
Sattre Press                                The King in Yellow
http://sattre-press.com/                 by Robert W. Chambers
info@sattre-press.com         http://sattre-press.com/kiy.html


* Re: ConTeXt and the blind
  2004-04-14 20:50 ConTeXt and the blind Alan Bowen
  2004-04-14 22:20 ` Bill McClain
@ 2004-04-14 23:44 ` Matthew Huggett
  2004-04-15 14:00   ` Alan Bowen
  2004-04-15 15:32 ` Jan Hlavacek
  2 siblings, 1 reply; 8+ messages in thread
From: Matthew Huggett @ 2004-04-14 23:44 UTC (permalink / raw)


You'd have to do it a file at a time, but does the Acrobat Reader's 
"save as text" function do what you need?

A much bigger solution would be to keep your source as XML and then go 
from there to ConTeXt and PDF, or straight to plain text via XSLT.

Matt




* Re: ConTeXt and the blind
  2004-04-14 22:20 ` Bill McClain
@ 2004-04-15 11:06   ` Erik Hetzner
  0 siblings, 0 replies; 8+ messages in thread
From: Erik Hetzner @ 2004-04-15 11:06 UTC (permalink / raw)


Bill McClain wrote:

> On Wed, 14 Apr 2004 16:50:04 -0400
> Alan Bowen <acbowen@princeton.edu> wrote:
>
>> So, does anyone on the list have ideas about how to produce such files
>> from the files I currently have in hand or any experience with this
>> sort of problem?
>
> I have used the pdftotext utility, part of the xpdf package, for similar
> tasks. In the case of hyphenated line endings, the word will be
> hyphenated and broken across lines just as in the pdf, and that might be
> a problem for the reader program.
>
> -Bill

From my own experience, pdftotext also has trouble handling multicolumn 
documents. Adobe has an online utility for transforming PDF to HTML, 
which can rather easily be turned into text. It worked pretty well 
for me, breaking columns into something useful instead of mashing all 
the text together.

Regards,
Erik Hetzner


* Re: ConTeXt and the blind
  2004-04-14 23:44 ` Matthew Huggett
@ 2004-04-15 14:00   ` Alan Bowen
  2004-04-15 14:24     ` Bill McClain
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Bowen @ 2004-04-15 14:00 UTC (permalink / raw)


Bill, Erik, and Matthew—

Thank you very much for the suggestions. I will explore pdftotext and 
the Acrobat “Save As” options. One of the problems for all—and perhaps 
it is insuperable—is the ability of such reading software to present 
phrases in foreign languages and mathematical expressions. I will 
report to you at least on what I discover.

Best, Alan



* Re: ConTeXt and the blind
  2004-04-15 14:00   ` Alan Bowen
@ 2004-04-15 14:24     ` Bill McClain
  0 siblings, 0 replies; 8+ messages in thread
From: Bill McClain @ 2004-04-15 14:24 UTC (permalink / raw)


On Thu, 15 Apr 2004 10:00:12 -0400
Alan Bowen <acbowen@princeton.edu> wrote:

> Thank you very much for the suggestions. I will explore pdftotext and 
> the Acrobat "Save As" options. 

Another issue with these methods is that the header and footer
information on each page will be included, which could be irritating or
helpful, depending on the application.
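
Where they are irritating, a rough filter along these lines might help. It
is only a sketch under stated assumptions: that the extracted text separates
pages with a form feed character (as pdftotext does by default), and that
the running header and footer are the first and last non-blank lines of each
page. The function name is hypothetical.

```python
def strip_running_lines(text: str) -> str:
    """Drop the first and last non-blank line of every form-feed-separated page."""
    cleaned = []
    for page in text.split("\f"):
        lines = page.splitlines()
        # Positions of the lines that actually carry text.
        filled = [i for i, line in enumerate(lines) if line.strip()]
        # Only strip when something would be left over as body text.
        if len(filled) >= 3:
            drop = {filled[0], filled[-1]}
            lines = [line for i, line in enumerate(lines) if i not in drop]
        cleaned.append("\n".join(lines).strip("\n"))
    # Join the surviving page bodies with one blank line between them.
    return "\n\n".join(page for page in cleaned if page)

sample = "Journal Title\n\nBody one.\nBody two.\n\n12\n\fJournal Title\n\nBody three.\n\n13\n"
print(strip_running_lines(sample))
```

Pages that open with a chapter title instead of a running header would lose
that title, so the assumption needs checking against the actual layout.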

> One of the problems for all -- and perhaps
> it is insuperable -- is the ability of such reading software to present 
> phrases in foreign languages and mathematical expressions.

I haven't done any XML writing, but I think that would be the superior
approach. If special elements of the text are tagged, then they could be
translated appropriately for the blind reader. 

I use a text-to-speech program for proofing some of my documents and
have found it helpful to filter the original text and emit a coded
version which makes it easy for the speech program to read, and easier
for me to understand. I'll have it say "quote", "endquote", "italics",
etc. I'm working from the Context source directly, but XML sources could
be used similarly, and there are lots of XML tools in the world. 
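
As a rough illustration of such a filter (not the actual script described
above), here is a minimal sketch. It assumes the source marks quotations
with \quotation{...} and emphasis with {\em ...}, does not handle other
markup, and invents the spoken marker "end italics"; the function name is
hypothetical.

```python
import re

def speech_code(source: str) -> str:
    """Replace a little ConTeXt markup with words a speech program can read."""
    # Innermost groups first: {\em ...} becomes "italics ... end italics".
    text = re.sub(r"\{\\em\s+([^{}]*)\}", r" italics \1 end italics ", source)
    # \quotation{...} becomes "quote ... endquote" (brace-free content only).
    text = re.sub(r"\\quotation\{([^{}]*)\}", r" quote \1 endquote ", text)
    # Collapse the extra spaces introduced by the replacements.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove a space left in front of punctuation.
    text = re.sub(r" ([.,;:!?])", r"\1", text)
    return text.strip()

line = r"He called it \quotation{a {\em very} good book}."
print(speech_code(line))
```

The emphasis rule runs first so that a quotation containing emphasized
words no longer has nested braces by the time the second rule is applied.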

-Bill
-- 
Sattre Press                              History of Astronomy 
http://sattre-press.com/               During the 19th Century
info@sattre-press.com                       by Agnes M. Clerke
                              http://sattre-press.com/han.html


* Re: ConTeXt and the blind
  2004-04-14 20:50 ConTeXt and the blind Alan Bowen
  2004-04-14 22:20 ` Bill McClain
  2004-04-14 23:44 ` Matthew Huggett
@ 2004-04-15 15:32 ` Jan Hlavacek
  2004-04-15 20:47   ` Hans Hagen
  2 siblings, 1 reply; 8+ messages in thread
From: Jan Hlavacek @ 2004-04-15 15:32 UTC (permalink / raw)


On Wed, Apr 14, 2004 at 04:50:04PM -0400, Alan Bowen wrote:

> So, does anyone on the list have ideas about how to produce such files 
> from the files I currently have in hand or any experience with this 
> sort of problem? Is there, for instance, a way to strip away all the 
> formatting commands from a ConTeXt source file automatically so as to 
> leave an unencoded .txt file that I could send him? I gather that he 
> can use .htm files, but so far as I can tell there is no path from a 
> ConTeXt source file to an HTML file -- at least, a specific query about 
> this made recently on this list by someone else seems to have gone 
> unanswered.

There is a utility called untex that strips LaTeX formatting from a TeX
file.  I didn't test it with ConTeXt, but it may work too.  If you can
produce a DVI file, there are a couple of programs, dvi2tty and catdvi,
that can extract text from it.  There is also pdftotext, which I
believe is a part of the xpdf package, and which can extract text from
many PDF files.
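
To give an idea of what stripping formatting from the source involves, here
is a crude sketch. It is not how untex works, just a few regular expressions
written for illustration, and the function name is hypothetical. It assumes
commands look like \name or \name[options] and keeps whatever is inside
braces.

```python
import re

def strip_tex_commands(source: str) -> str:
    """Crudely reduce TeX/ConTeXt source to plain text."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop control words together with any [..] option blocks.
    text = re.sub(r"\\[A-Za-z]+(?:\[[^\]]*\])*", "", text)
    # Turn escaped specials such as \% or \& back into plain characters.
    text = re.sub(r"\\([%&#$_])", r"\1", text)
    # Remove grouping braces but keep what they enclose.
    text = re.sub(r"[{}]", "", text)
    # Tidy whitespace line by line, then trim leading and trailing blank lines.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines).strip()

sample = r"""\setuplayout[header=0pt]
\chapter{Introduction}
This is {\em very} simple. % a note to self
It is 50\% done.
"""
print(strip_tex_commands(sample))
```

Commands that themselves produce text (logos, cross-references, titles given
inside [..] options) simply vanish, and math is left as raw source, so this
is at best a starting point.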

Finally, there is a program called tex2page that converts TeX to HTML.
Unlike latex2html, it can handle at least some plain TeX, so it may be
possible to use it on ConTeXt files.  Again, I didn't try it.  If you
want to experiment with it, it is at
http://www.ccs.neu.edu/home/dorai/tex2page/tex2page-doc.html

-- 
Jan Hlavacek                                            (260) 434-7566
Department of Mathematics                             Jhlavacek@sf.edu
University of Saint Francis               http://www.sf.edu/jhlavacek/


* Re: ConTeXt and the blind
  2004-04-15 15:32 ` Jan Hlavacek
@ 2004-04-15 20:47   ` Hans Hagen
  0 siblings, 0 replies; 8+ messages in thread
From: Hans Hagen @ 2004-04-15 20:47 UTC (permalink / raw)


At 17:32 15/04/2004, you wrote:

>There is a utility called untex that strips LaTeX formatting from a TeX
>file.  I didn't test it with ConTeXt, but it may work too.  If you can
>produce a DVI file, there are a couple of programs, dvi2tty and catdvi,
>that can extract text from it.  There is also pdftotext, which I
>believe is a part of the xpdf package, and which can extract text from
>many PDF files.
>
>Finally, there is a program called tex2page that converts TeX to HTML.
>Unlike latex2html, it can handle at least some plain TeX, so it may be
>possible to use it on ConTeXt files.  Again, I didn't try it.  If you
>want to experiment with it, it is at
>http://www.ccs.neu.edu/home/dorai/tex2page/tex2page-doc.html

since most ConTeXt commands are instances of more generic ones, you can 
define another style that processes the file into something suited for 
blind readers, say:

\setuphead[chapter][style=normal]

but that could be a lot of work. It is simpler to use pdftotext, which 
works ok in most cases,

\setuplayout[header=0pt,footer=0pt]
\setupcolumns[n=1]

is then probably enough

btw, there are ways to get auditive info into the pdf file, for instance 
to let the voice engine speak, and so on

Hans  

