ntg-context - mailing list for ConTeXt users
* Ligature handling for PDF searching.
@ 2005-07-27  3:52 Brooks Moses
  2005-07-27  7:37 ` Hans Hagen
  2005-07-27  8:04 ` Vit Zyka
  0 siblings, 2 replies; 7+ messages in thread
From: Brooks Moses @ 2005-07-27  3:52 UTC


(This came up on comp.text.tex in a question about LaTeX, but it also 
applies to ConTeXt, and the proposed solution for LaTeX doesn't apply.)

Consider the following document:

   \starttext
   Some ligature tests: ff, fi, ffi, fl, ffl.
   \stoptext

If I process that with texexec -pdf, load it into Acrobat 5, and then 
copy-and-paste the text from the PDF into a text editor, the fi and fl 
ligatures are correctly treated as two letters, but the ff, ffi, and ffl 
ligatures are treated as single (unknown) characters.  Similarly, searching 
for "f" within the document only finds the fi and fl ligatures; it doesn't 
find the others.  Searching for "ff" finds nothing.

This is a fairly significant problem in the on-screen usability of 
ConTeXt-created documents.

In LaTeX, there is apparently a solution in the cmap.sty package (though it 
currently only works for T1 encoding):
   http://www.ctan.org/tex-archive/macros/latex/contrib/cmap/

Is there a similar solution for ConTeXt?  (Has this perhaps been solved 
with a later version of ConTeXt than I have on my computer?)

Thanks,
- Brooks


* Re: Ligature handling for PDF searching.
  2005-07-27  3:52 Ligature handling for PDF searching Brooks Moses
@ 2005-07-27  7:37 ` Hans Hagen
  2005-07-27  8:25   ` Taco Hoekwater
  2005-07-27  8:04 ` Vit Zyka
  1 sibling, 1 reply; 7+ messages in thread
From: Hans Hagen @ 2005-07-27  7:37 UTC


Brooks Moses wrote:

> Is there a similar solution for ConTeXt?  (Has this perhaps been 
> solved with a later version of ConTeXt than I have on my computer?)

that kind of stuff was introduced in context ages ago :-)

take a look at:

  pdfr-il2 
  enco-pfr

it's rather integrated and automatic although i didn't test it recently (probably last in fall 2000) 

the only thing needed is a pdfr-ec and pdfr-texnansi 

Hans 

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


* Re: Ligature handling for PDF searching.
  2005-07-27  3:52 Ligature handling for PDF searching Brooks Moses
  2005-07-27  7:37 ` Hans Hagen
@ 2005-07-27  8:04 ` Vit Zyka
  1 sibling, 0 replies; 7+ messages in thread
From: Vit Zyka @ 2005-07-27  8:04 UTC


Brooks Moses wrote:
> (This came up on comp.text.tex in a question about LaTeX, but it also 
> applies to ConTeXt, and the proposed solution for LaTeX doesn't apply.)
> 
> Consider the following document:
> 
>   \starttext
>   Some ligature tests: ff, fi, ffi, fl, ffl.
>   \stoptext
> 
> If I process that with texexec -pdf, load it into Acrobat 5, and then 
> copy-and-paste the text from the PDF into a text editor, the fi and fl 
> ligatures are correctly treated as two letters, but the ff, ffi, and ffl 
> ligatures are treated as single (unknown) characters.  Similarly, 
> searching for "f" within the document only finds the fi and fl 
> ligatures; it doesn't find the others.  Searching for "ff" finds nothing.
> 
> This is a fairly significant problem in the on-screen usability of 
> ConTeXt-created documents.
> 
> In LaTeX, there is apparently a solution in the cmap.sty package (though 
> it currently only works for T1 encoding):
>   http://www.ctan.org/tex-archive/macros/latex/contrib/cmap/
> 
> Is there a similar solution for ConTeXt?  (Has this perhaps been solved 
> with a later version of ConTeXt than I have on my computer?)

Yes, but AFAIK only for one or two encodings (CMAP files). I have to 
remember ... the keyword is \usepdffontresource. See source enco-pfr.tex 
for more info.
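
For context, a CMap resource of this kind tells the PDF viewer which Unicode text each font slot stands for. The following Python sketch is only an illustration of that idea, not the contents of pdfr-ec.tex or of any ConTeXt file: it prints the bfchar entries that a ToUnicode CMap would need for the five f-ligature slots of the EC (Cork) encoding. The names EC_LIGATURES, bfchar_entry and bfchar_section are made up for this example.

```python
# Illustrative sketch (assumption: not taken from enco-pfr or pdfr-ec.tex).
# Builds the bfchar section of a PDF ToUnicode CMap for the f-ligatures
# of the EC (Cork/T1) font encoding.

# Slots of the f-ligatures in the EC encoding, mapped to the plain
# letters a viewer should report when copying or searching.
EC_LIGATURES = {
    0x1B: "ff",
    0x1C: "fi",
    0x1D: "fl",
    0x1E: "ffi",
    0x1F: "ffl",
}


def bfchar_entry(slot, text):
    """Return one bfchar line: one-byte font slot -> UTF-16BE text in hex."""
    target = text.encode("utf-16-be").hex().upper()
    return "<{:02X}> <{}>".format(slot, target)


def bfchar_section(mapping):
    """Return a complete beginbfchar/endbfchar section for the mapping."""
    lines = ["{} beginbfchar".format(len(mapping))]
    for slot in sorted(mapping):
        lines.append(bfchar_entry(slot, mapping[slot]))
    lines.append("endbfchar")
    return "\n".join(lines)


if __name__ == "__main__":
    print(bfchar_section(EC_LIGATURES))
```

With such an entry in place, the ff slot is reported as the two letters "ff" (the line `<1B> <00660066>`), which is what makes searching for "f" or "ff" find the ligature.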

vit


* Re: Ligature handling for PDF searching.
  2005-07-27  7:37 ` Hans Hagen
@ 2005-07-27  8:25   ` Taco Hoekwater
  2005-07-27  8:50     ` Brooks Moses
  0 siblings, 1 reply; 7+ messages in thread
From: Taco Hoekwater @ 2005-07-27  8:25 UTC


[-- Attachment #1: Type: text/plain, Size: 1217 bytes --]


Hi,

Attached is pdfr-ec.tex. I don't really understand what is going on,
so the texnansi version is out of my reach. Also, I cannot/will not
test because AR7 has no problem with ffi anyway.

Taco

Hans Hagen wrote:
> Brooks Moses wrote:
> 
>> Is there a similar solution for ConTeXt?  (Has this perhaps been 
>> solved with a later version of ConTeXt than I have on my computer?)
> 
> 
> that kind of stuff was introduced in context ages ago :-)
> 
> take a look at:
> 
>  pdfr-il2
>  enco-pfr
> 
> it's rather integrated and automatic although i didn't test it recently 
> (probably last in fall 2000)
> the only thing needed is a pdfr-ec and pdfr-texnansi
> Hans

[-- Attachment #2: pdfr-ec.tex --]
[-- Type: application/x-tex, Size: 2956 bytes --]


* Re: Ligature handling for PDF searching.
  2005-07-27  8:25   ` Taco Hoekwater
@ 2005-07-27  8:50     ` Brooks Moses
  2005-07-27  9:13       ` Taco Hoekwater
  0 siblings, 1 reply; 7+ messages in thread
From: Brooks Moses @ 2005-07-27  8:50 UTC


At 01:25 AM 7/27/2005, you wrote:
>Attached is pdfr-ec.tex. I don't really understand what is going on,
>so the texnansi version is out of my reach. Also, I cannot/will not
>test because AR7 has no problem with ffi anyway.

I'm perfectly glad to test this, but I'm not at all sure how to use 
it.  What do I need to do to use it?

Thanks!
- Brooks


* Re: Ligature handling for PDF searching.
  2005-07-27  8:50     ` Brooks Moses
@ 2005-07-27  9:13       ` Taco Hoekwater
  2005-07-27 10:35         ` Hans Hagen
  0 siblings, 1 reply; 7+ messages in thread
From: Taco Hoekwater @ 2005-07-27  9:13 UTC



I'm guessing:

   \input enco-pfr
   \startencoding [ec]
     \usepdffontresource ec
   \stopencoding
   \starttext
     fi ff ffi
   \stoptext


(at least this loads pdfr-ec.tex)

Taco

Brooks Moses wrote:
> At 01:25 AM 7/27/2005, you wrote:
> 
>> Attached is pdfr-ec.tex. I don't really understand what is going on,
>> so the texnansi version is out of my reach. Also, I cannot/will not
>> test because AR7 has no problem with ffi anyway.
> 
> 
> I'm perfectly glad to test this, but I'm not at all sure how to use it.  
> What do I need to do to use it?
> 
> Thanks!
> - Brooks
> 

* Re: Ligature handling for PDF searching.
  2005-07-27  9:13       ` Taco Hoekwater
@ 2005-07-27 10:35         ` Hans Hagen
  0 siblings, 0 replies; 7+ messages in thread
From: Hans Hagen @ 2005-07-27 10:35 UTC


Taco Hoekwater wrote:

>
> I'm guessing:
>
>   \input enco-pfr
>   \startencoding [ec]
>     \usepdffontresource ec
>   \stopencoding
>   \starttext
>     fi ff ffi
>   \stoptext
>
>
> (at least this loads pdfr-ec.tex)
>
> Taco

it's hard to check with compressed files, but:

\pdfcompresslevel=0

\useencoding[pfr]
    
\startencoding [ec]
  \usepdffontresource ec
\stopencoding

\usetypescript[palatino][ec] \setupbodyfont[palatino]

\starttext
fi ff ffi
\stoptext

seems to work here; i'll add the file and definition to the distribution 

Hans 


