* Ligature handling for PDF searching.
@ 2005-07-27 3:52 Brooks Moses
2005-07-27 7:37 ` Hans Hagen
2005-07-27 8:04 ` Vit Zyka
0 siblings, 2 replies; 7+ messages in thread
From: Brooks Moses @ 2005-07-27 3:52 UTC
(This came up on comp.text.tex in a question about LaTeX, but it also
applies to ConTeXt, and the proposed solution for LaTeX doesn't apply.)
Consider the following document:
\starttext
Some ligature tests: ff, fi, ffi, fl, ffl.
\stoptext
If I process that with texexec -pdf, load it into Acrobat 5, and then
copy-and-paste the text from the PDF into a text editor, the fi and fl
ligatures are correctly treated as two letters, but the ff, ffi, and ffl
ligatures are treated as single (unknown) characters. Similarly, searching
for "f" within the document only finds the fi and fl ligatures; it doesn't
find the others. Searching for "ff" finds nothing.
This is a fairly significant problem in the on-screen usability of
ConTeXt-created documents.
In LaTeX, there is apparently a solution in the cmap.sty package (though it
currently only works for T1 encoding):
http://www.ctan.org/tex-archive/macros/latex/contrib/cmap/
Is there a similar solution for ConTeXt? (Has this perhaps been solved
with a later version of ConTeXt than I have on my computer?)
Thanks,
- Brooks
* Re: Ligature handling for PDF searching.
From: Hans Hagen @ 2005-07-27 7:37 UTC
Brooks Moses wrote:
> Is there a similar solution for ConTeXt? (Has this perhaps been
> solved with a later version of ConTeXt than I have on my computer?)
that kind of stuff was introduced in context ages ago :-)
take a look at:
pdfr-il2
enco-pfr
it's rather integrated and automatic although i didn't test it recently (probably last in fall 2000)
the only thing needed is a pdfr-ec and pdfr-texnansi
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------
* Re: Ligature handling for PDF searching.
From: Vit Zyka @ 2005-07-27 8:04 UTC
Brooks Moses wrote:
> (This came up on comp.text.tex in a question about LaTeX, but it also
> applies to ConTeXt, and the proposed solution for LaTeX doesn't apply.)
>
> Consider the following document:
>
> \starttext
> Some ligature tests: ff, fi, ffi, fl, ffl.
> \stoptext
>
> If I process that with texexec -pdf, load it into Acrobat 5, and then
> copy-and-paste the text from the PDF into a text editor, the fi and fl
> ligatures are correctly treated as two letters, but the ff, ffi, and ffl
> ligatures are treated as single (unknown) characters. Similarly,
> searching for "f" within the document only finds the fi and fl
> ligatures; it doesn't find the others. Searching for "ff" finds nothing.
>
> This is a fairly significant problem in the on-screen usability of
> ConTeXt-created documents.
>
> In LaTeX, there is apparently a solution in the cmap.sty package (though
> it currently only works for T1 encoding):
> http://www.ctan.org/tex-archive/macros/latex/contrib/cmap/
>
> Is there a similar solution for ConTeXt? (Has this perhaps been solved
> with a later version of ConTeXt than I have on my computer?)
Yes, but AFAIK only for one or two encodings (CMAP files). I have to
remember ... the keyword is \usepdffontresource. See source enco-pfr.tex
for more info.
vit
* Re: Ligature handling for PDF searching.
From: Taco Hoekwater @ 2005-07-27 8:25 UTC
Hi,
Attached is pdfr-ec.tex. I don't really understand what is going on,
so the texnansi version is out of my reach. Also, I cannot/will not
test because AR7 has no problem with ffi anyway.
Taco
Hans Hagen wrote:
> Brooks Moses wrote:
>
>> Is there a similar solution for ConTeXt? (Has this perhaps been
>> solved with a later version of ConTeXt than I have on my computer?)
>
>
> that kind of stuff was introduced in context ages ago :-)
>
> take a look at:
>
> pdfr-il2 enco-pfr
>
> it's rather integrated and automatic although i didn't test it recently
> (probably last in fall 2000)
> the only thing needed is a pdfr-ec and pdfr-texnansi
> Hans
>
> _______________________________________________
> ntg-context mailing list
> ntg-context@ntg.nl
> http://www.ntg.nl/mailman/listinfo/ntg-context
[-- Attachment #2: pdfr-ec.tex --]
[-- Type: application/x-tex, Size: 2956 bytes --]
* Re: Ligature handling for PDF searching.
From: Brooks Moses @ 2005-07-27 8:50 UTC
At 01:25 AM 7/27/2005, you wrote:
>Attached is pdfr-ec.tex. I don't really understand what is going on,
>so the texnansi version is out of my reach. Also, I cannot/will not
>test because AR7 has no problem with ffi anyway.
I'm perfectly glad to test this, but I'm not at all sure how to use
it. What do I need to do to use it?
Thanks!
- Brooks
* Re: Ligature handling for PDF searching.
From: Taco Hoekwater @ 2005-07-27 9:13 UTC
I'm guessing:
\input enco-pfr
\startencoding [ec]
\usepdffontresource ec
\stopencoding
\starttext
fi ff ffi
\stoptext
(at least this loads pdfr-ec.tex)
Taco
Brooks Moses wrote:
> At 01:25 AM 7/27/2005, you wrote:
>
>> Attached is pdfr-ec.tex. I don't really understand what is going on,
>> so the texnansi version is out of my reach. Also, I cannot/will not
>> test because AR7 has no problem with ffi anyway.
>
>
> I'm perfectly glad to test this, but I'm not at all sure how to use it.
> What do I need to do to use it?
>
> Thanks!
> - Brooks
* Re: Ligature handling for PDF searching.
From: Hans Hagen @ 2005-07-27 10:35 UTC
Taco Hoekwater wrote:
>
> I'm guessing:
>
> \input enco-pfr
> \startencoding [ec]
> \usepdffontresource ec
> \stopencoding
> \starttext
> fi ff ffi
> \stoptext
>
>
> (at least this loads pdfr-ec.tex)
>
> Taco
it's hard to check with compressed files, but:
\pdfcompresslevel=0
\useencoding[pfr]
\startencoding [ec]
\usepdffontresource ec
\stopencoding
\usetypescript[palatino][ec] \setupbodyfont[palatino]
\starttext
fi ff ffi
\stoptext
seems to work here; i'll add the file and definition to the distribution
Hans