ntg-context - mailing list for ConTeXt users
* searchable PDF with MinionPro under mkiv
@ 2011-01-16 23:30 Oliver Heins
  2011-01-17  0:37 ` Li Yanrui (李延瑞)
  2011-01-17 11:53 ` Florian Wobbe
  0 siblings, 2 replies; 6+ messages in thread
From: Oliver Heins @ 2011-01-16 23:30 UTC (permalink / raw)
  To: mailing list for ConTeXt users

How can I generate a searchable PDF with MkIV, using a non-standard font
like MinionPro?

\definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
\usemodule[simplefonts]
\setmainfont[minionpro]

\starttext
fi ff ffi ffl 1234567890
\stoptext

Using pdftotext, I get this:

fi ff ffi ffl 

However, in Adobe Reader these things can't be found.  It should read:

fi ff ffi ffl 1234567890

With LaTeX, one would use \input glyphtounicode.tex \pdfgentounicode=1,
but this doesn't seem to work with ConTeXt.  ConTeXt used pdfr-def, but
that seems to be MkII-only.
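
A minimal sketch of how the comparison between extracted and expected text could be automated. It is not part of the original message: it assumes the pdftotext output is already available as a Python string, and the helper name `missing_tokens` is hypothetical. Compatibility normalization (NFKC) folds ligature code points such as U+FB01 into plain letters, so a ligature does not count as missing, while absent digits do.

```python
import unicodedata

def missing_tokens(extracted, expected):
    """Return the whitespace-separated tokens of `expected` that do not
    occur in `extracted` after NFKC normalization (hypothetical helper)."""
    haystack = unicodedata.normalize("NFKC", extracted)
    return [tok for tok in expected.split()
            if unicodedata.normalize("NFKC", tok) not in haystack]

expected = "fi ff ffi ffl 1234567890"
# Assumed sample: ligatures present as single code points, digits absent.
extracted = "\ufb01 \ufb00 \ufb03 \ufb04 "
print(missing_tokens(extracted, expected))
```

An empty list would mean every expected token can be found in the extracted text.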


TIA,
 olli


-- 
Oliver Heins heins@sopos.org  http://www.sopos.org/olli
GPG: F27A BA8C 1CFB B905 65A8  2544 0F07 B675 9A00 D827
1024D/9A00D827 2004-09-24 -- gpg --recv-keys 0x9A00D827
Please avoid sending me Word or PowerPoint attachments:
http://www.gnu.org/philosophy/no-word-attachments.html
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: searchable PDF with MinionPro under mkiv
  2011-01-16 23:30 searchable PDF with MinionPro under mkiv Oliver Heins
@ 2011-01-17  0:37 ` Li Yanrui (李延瑞)
  2011-01-17  1:25   ` Oliver Heins
  2011-01-17 11:53 ` Florian Wobbe
  1 sibling, 1 reply; 6+ messages in thread
From: Li Yanrui (李延瑞) @ 2011-01-17  0:37 UTC (permalink / raw)
  To: mailing list for ConTeXt users

2011/1/17 Oliver Heins <olli@sopos.org>:
> How can I generate a searchable PDF with MkIV, using a non-standard font
> like MinionPro?
>
> \definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
> \usemodule[simplefonts]
> \setmainfont[minionpro]
>
> \starttext
> fi ff ffi ffl 1234567890
> \stoptext
>
> Using pdftotext, I get this:
>
> fi ff ffi ffl 
>
> However, in Adobe Reader these things can't be found.  It should read:
>
> fi ff ffi ffl 1234567890
>
> With LaTeX, one would use \input glyphtounicode.tex \pdfgentounicode=1,
> but this doesn't seem to work with ConTeXt.  ConTeXt used pdfr-def, but
> that seems to be MkII-only.
>

Hi Oliver,

Your example works for me with the beta 2011.01.14 and pdftotext-0.16.0.

Which version of ConTeXt MkIV are you using? Your problem looks like
"http://www.ntg.nl/pipermail/ntg-context/2010/052259.html" but that
one has been solved by Taco.

-- 
Best regards,

Li Yanrui (李延瑞)

* Re: searchable PDF with MinionPro under mkiv
  2011-01-17  0:37 ` Li Yanrui (李延瑞)
@ 2011-01-17  1:25   ` Oliver Heins
  0 siblings, 0 replies; 6+ messages in thread
From: Oliver Heins @ 2011-01-17  1:25 UTC (permalink / raw)
  To: mailing list for ConTeXt users

"Li Yanrui (李延瑞)" <liyanrui.m2@gmail.com> writes:

> Your example works for me with the beta 2011.01.14 and
> pdftotext-0.16.0.
>
> Which version of ConTeXt MkIV are you using? Your problem looks like
> "http://www.ntg.nl/pipermail/ntg-context/2010/052259.html" but that
> one has been solved by Taco.

Hi Li,

ConTeXt  ver: 2011.01.12 10:20 MKIV  fmt: 2011.1.12

This is quite a recent version; however, I updated my minimals, so now I
have:

ConTeXt  ver: 2011.01.14 14:44 MKIV  fmt: 2011.1.17

The result stays the same.

My pdftotext is an older version (0.12.4), but that shouldn't be a
problem.  Adobe Reader, evince and xpdf are able to find the ligatures,
but not the numbers.  okular even fails to find the ligatures, but I
would consider this a bug in okular.

Best regards,
 olli


* Re: searchable PDF with MinionPro under mkiv
  2011-01-16 23:30 searchable PDF with MinionPro under mkiv Oliver Heins
  2011-01-17  0:37 ` Li Yanrui (李延瑞)
@ 2011-01-17 11:53 ` Florian Wobbe
  2011-01-17 13:29   ` Oliver Heins
  1 sibling, 1 reply; 6+ messages in thread
From: Florian Wobbe @ 2011-01-17 11:53 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> How can I generate a searchable PDF with MkIV, using a non-standard font
> like MinionPro?
> 
> \definefontfeature [default] [default] [mode=node,script=latn,onum=yes]
> \usemodule[simplefonts]
> \setmainfont[minionpro]
> 
> \starttext
> fi ff ffi ffl 1234567890
> \stoptext
> 
> Using pdftotext, I get this:
> 
> fi ff ffi ffl 

Hi Oliver,

It works for me with the betas 2011.01.12 and 2011.01.14 and with poppler-0.14.5 and poppler-0.16.0.

However, it turns out that pdftotext converts to

fi ﬀ ﬃ ﬄ 1234567890,

splitting the fi ligature while leaving ff, ffi and ffl intact, which is strange.
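
To see which ligatures survive extraction as single code points, one can scan the text for the Unicode Alphabetic Presentation Forms U+FB00 to U+FB06. The following is an illustrative sketch, not something used in this thread. `ligatures_in` is a hypothetical name, and the sample string is an assumption that mirrors the behaviour described above: fi split into two letters, ff, ffi and ffl kept as ligatures.

```python
import unicodedata

def ligatures_in(text):
    """List (code point, Unicode name, NFKC expansion) for each Latin
    ligature in U+FB00..U+FB06 found in `text` (hypothetical helper)."""
    found = []
    for ch in text:
        if 0xFB00 <= ord(ch) <= 0xFB06:
            found.append((f"U+{ord(ch):04X}",
                          unicodedata.name(ch),
                          unicodedata.normalize("NFKC", ch)))
    return found

# Assumed sample: "fi" as two letters, the other three as ligature code points.
sample = "fi \ufb00 \ufb03 \ufb04 1234567890"
for entry in ligatures_in(sample):
    print(entry)
```

Whether a viewer finds such text when searching for plain letters depends on whether it applies a comparable folding itself.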

I did not try with Adobe Reader, but the PDF is searchable with Apple Preview and the pasted copy is still intact:

fi ff ffi ffl 1234567890

Florian


* Re: searchable PDF with MinionPro under mkiv
  2011-01-17 11:53 ` Florian Wobbe
@ 2011-01-17 13:29   ` Oliver Heins
  2011-01-17 13:53     ` Florian Wobbe
  0 siblings, 1 reply; 6+ messages in thread
From: Oliver Heins @ 2011-01-17 13:29 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi Florian,

Florian Wobbe <Florian.Wobbe@awi.de> writes:

> It works for me with the betas 2011.01.12 and 2011.01.14 and with
> poppler-0.14.5 and poppler-0.16.0.
>
> However, it turns out that pdftotext converts to
>
> fi ﬀ ﬃ ﬄ 1234567890,
>
> splitting the fi ligature while leaving ff, ffi and ffl intact, which is
> strange.
>
> I did not try with Adobe Reader, but the PDF is searchable with Apple
> Preview and the pasted copy is still intact:
>
> fi ff ffi ffl 1234567890

For me, it still doesn't work.  I get oldstyle numbers in the text, and
the numbers are not searchable in Adobe Reader, okular, evince, or
xpdf.  However, I figured out that it is my version of the font that
causes the wrong result.

$ otfinfo -i /usr/local/share/fonts/MinionPro_Regular.otf                                                 
Family:              Minion Pro
Subfamily:           Regular
Full name:           Minion Pro
PostScript name:     MinionPro-Regular
Version:             OTF 1.011;PS 001.000;Core 1.0.27;makeotf.lib1.3.1
Unique ID:           1.011;ADBE;MinionPro-Regular
Designer:            Robert Slimbach
Vendor URL:          http://www.adobe.com/type/
Trademark:           Minion is either a registered trademark or a
                     trademark of Adobe Systems Incorporated in the
                     United States and/or other countries. 
Copyright:           © 2000 Adobe Systems Incorporated. All Rights
                     Reserved. U.S. Patent Des. 337,604. Other patents
                     pending.
License URL:         http://www.adobe.com/type/legal.html

When using the MinionPro fonts shipped with Adobe Reader, I get the same
results as you:

$ otfinfo -i /usr/local/share/fonts/MinionPro-Regular.otf 

Family:              Minion Pro
Subfamily:           Regular
Full name:           Minion Pro
PostScript name:     MinionPro-Regular
Version:             Version 2.068;PS 2.000;hotconv 1.0.57;
                     makeotf.lib2.0.21895
Unique ID:           2.068;ADBE;MinionPro-Regular
Designer:            Robert Slimbach
Manufacturer:        Adobe Systems Incorporated
Vendor URL:          http://www.adobe.com/type/
Trademark:           Minion is either a registered trademark or a
                     trademark of Adobe Systems Incorporated in the
                     United States and/or other countries. 
Copyright:           © 1990, 1991, 1992, 1994, 1997, 1998, 2000, 2002,
                     2004 Adobe Systems Incorporated. All rights reserved. 
License URL:         http://www.adobe.com/type/legal.html
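
When several copies of a font are installed, comparing the version strings reported by otfinfo is a quick way to tell them apart. A rough sketch, using the two `Version:` values shown above as input. The function name `font_version` and the regular expression are illustrative assumptions and may not cover every version-string layout.

```python
import re

def font_version(version_field):
    """Extract the first 'major.minor' number from an otfinfo
    'Version:' value as a float; return None if none is found."""
    match = re.search(r"\d+\.\d+", version_field)
    if match is None:
        return None
    return float(match.group(0))

old = font_version("OTF 1.011;PS 001.000;Core 1.0.27;makeotf.lib1.3.1")
new = font_version("Version 2.068;PS 2.000;hotconv 1.0.57;makeotf.lib2.0.21895")
print(old, new, old < new)
```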

Should this be considered a bug in the font?

Best regards,
 olli


* Re: searchable PDF with MinionPro under mkiv
  2011-01-17 13:29   ` Oliver Heins
@ 2011-01-17 13:53     ` Florian Wobbe
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Wobbe @ 2011-01-17 13:53 UTC (permalink / raw)
  To: mailing list for ConTeXt users

>> However, it turns out that pdftotext converts to
>> 
>> fi ﬀ ﬃ ﬄ 1234567890,
>> 
>> splitting the fi ligature while leaving ff, ffi and ffl intact, which is
>> strange.
>> 
>> I did not try with Adobe Reader, but the PDF is searchable with Apple
>> Preview and the pasted copy is still intact:
>> 
>> fi ff ffi ffl 1234567890
> 
> For me, it still doesn't work.  I get oldstyle numbers in the text, and
> the numbers are not searchable in Adobe Reader, okular, evince, or
> xpdf.  However, I figured out that it is my version of the font that
> causes the wrong result.

You are right! I had not considered that. Depending on the font used, pdftotext may or may not expand (some of) the ligatures. With TeX Gyre Pagella, for instance, there is no ligature expansion at all:

ﬁ ﬀ ﬃ ﬄ 1234567890

and with Cambria I get a PDF which is not searchable with Preview:

􀅩i ff f􀅩i f􀅩l 1234567890
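
The unreadable characters in the Cambria output look like private-use code points, which a viewer cannot match against ordinary search text. As a hedged sketch (not from the thread), such characters can be flagged via the Unicode general category `Co`. `private_use_chars` is a hypothetical helper, and the code point U+100169 in the sample is an assumption used only for illustration.

```python
import unicodedata

def private_use_chars(text):
    """Return sorted 'U+XXXX' labels of the private-use characters
    (Unicode general category Co) that occur in `text`."""
    return sorted({f"U+{ord(ch):04X}" for ch in text
                   if unicodedata.category(ch) == "Co"})

# Assumed sample: a private-use placeholder where the fi ligature should be.
sample = "\U00100169i ff f\U00100169i f\U00100169l 1234567890"
print(private_use_chars(sample))
```

An empty result would suggest that the extracted text contains only regular Unicode characters.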

Florian
