ntg-context - mailing list for ConTeXt users
* copy&paste from pdf bug (smallcaps, text figures)
@ 2013-08-01 17:33 Philipp Gesang
  2013-08-01 20:23 ` Otared Kavian
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Philipp Gesang @ 2013-08-01 17:33 UTC (permalink / raw)
  To: ConTeXt ML


Hi,

copy&paste from PDF is broken:

  \setupbodyfont [iwona]
  \starttext
    \feature[+][just-os,smallcaps] 0123456789 abcdefghijklmnopqrstuvwxyz
  \stoptext

Result:   <characters from the private use area>
Expected: 0123456789abcdefghijklmnopqrstuvwxyz

(Tried in Okular, but reported for other readers as well [1])

Thanks to Marius’ git mirror I could bisect the changes since
TL 2012. It looks like the issue has been introduced with release
"stable 2013.05.27 09:10" [2].

Best regards,
Philipp


[1] http://tex.stackexchange.com/q/126333/14066
[2] http://repo.or.cz/w/context.git/commitdiff/6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 17:33 copy&paste from pdf bug (smallcaps, text figures) Philipp Gesang
@ 2013-08-01 20:23 ` Otared Kavian
  2013-08-01 21:46   ` Philipp Gesang
  2013-08-01 21:38 ` Marco Patzer
  2013-08-01 22:08 ` Jannik Voges
  2 siblings, 1 reply; 15+ messages in thread
From: Otared Kavian @ 2013-08-01 20:23 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi,

I tested your example: no problem here on Mac OS X 10.8.4, with either TeXShop, Adobe Reader or Preview, with the latest beta (ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int: english/english).

Best regards: OK

On 1 August 2013, at 19:33, Philipp Gesang <Philipp.Gesang@alumni.uni-heidelberg.de> wrote:

> Hi,
> 
> copy&paste from PDF is broken:
> 
>  \setupbodyfont [iwona]
>  \starttext
>    \feature[+][just-os,smallcaps] 0123456789 abcdefghijklmnopqrstuvwxyz
>  \stoptext
> 
> Result:   
> Expected: 0123456789abcdefghijklmnopqrstuvwxyz
> 
> (Tried in Okular, but reported for other readers as well [1])
> 
> Thanks to Marius’ git mirror I could bisect the changes since
> TL 2012. It looks like the issue has been introduced with release
> "stable 2013.05.27 09:10" [2].
> 
> Best regards,
> Philipp
> 
> 
> [1] http://tex.stackexchange.com/q/126333/14066
> [2] http://repo.or.cz/w/context.git/commitdiff/6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8

* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 17:33 copy&paste from pdf bug (smallcaps, text figures) Philipp Gesang
  2013-08-01 20:23 ` Otared Kavian
@ 2013-08-01 21:38 ` Marco Patzer
  2013-08-01 22:08 ` Jannik Voges
  2 siblings, 0 replies; 15+ messages in thread
From: Marco Patzer @ 2013-08-01 21:38 UTC (permalink / raw)
  To: ntg-context


On 2013–08–01 Philipp Gesang wrote:

> copy&paste from PDF is broken:
> 
>   \setupbodyfont [iwona]
>   \starttext
>     \feature[+][just-os,smallcaps] 0123456789 abcdefghijklmnopqrstuvwxyz
>   \stoptext
> 
> Result:   
> Expected: 0123456789abcdefghijklmnopqrstuvwxyz

Works here with 2013.08.01 01:31 (linux)

Marco


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 20:23 ` Otared Kavian
@ 2013-08-01 21:46   ` Philipp Gesang
  2013-08-01 22:01     ` Marco Patzer
  0 siblings, 1 reply; 15+ messages in thread
From: Philipp Gesang @ 2013-08-01 21:46 UTC (permalink / raw)
  To: mailing list for ConTeXt users


···<date: 2013-08-01, Thursday>···<from: Otared Kavian>···

> I tested your example: no problem here on Mac OS X 10.8.4, with
> either TeXShop, Adobe Reader or Preview, with the latest beta
> (ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int:
> english/english).

x64 linux here, but it’s the same with the windows version in
wine32. I get the bad output with okular (poppler), acroread, and
mupdf, but strangely not with zathura (mupdf-based). It is
present both in texlive and the most recent beta. Going back
exactly one release to "2013.05.22 19:28" fixes it consistently.

Philipp


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 21:46   ` Philipp Gesang
@ 2013-08-01 22:01     ` Marco Patzer
  2013-08-01 22:12       ` Philipp Gesang
  0 siblings, 1 reply; 15+ messages in thread
From: Marco Patzer @ 2013-08-01 22:01 UTC (permalink / raw)
  To: ntg-context


On 2013–08–01 Philipp Gesang wrote:

> ···<date: 2013-08-01, Thursday>···<from: Otared Kavian>···
> 
> > I tested your example: no problem here on Mac OS X 10.8.4, with
> > either TeXShop, Adobe Reader or Preview, with the latest beta
> > (ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int:
> > english/english).
> 
> x64 linux here, but it’s the same with the windows version in
> wine32. I get the bad output with okular (poppler), acroread, and
> mupdf, but strangely not with zathura (mupdf-based).

Just to add to the list:

x64 linux here, and it works with the following poppler based
viewers (zathura-poppler, xpdf, evince)

Marco


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 17:33 copy&paste from pdf bug (smallcaps, text figures) Philipp Gesang
  2013-08-01 20:23 ` Otared Kavian
  2013-08-01 21:38 ` Marco Patzer
@ 2013-08-01 22:08 ` Jannik Voges
  2 siblings, 0 replies; 15+ messages in thread
From: Jannik Voges @ 2013-08-01 22:08 UTC (permalink / raw)
  To: mailing list for ConTeXt users


Works with OS X Preview (10.8.4).



On 01.08.2013 at 19:33, Philipp Gesang <Philipp.Gesang@alumni.uni-heidelberg.de> wrote:


>  \setupbodyfont [iwona]
>  \starttext
>    \feature[+][just-os,smallcaps] 0123456789 abcdefghijklmnopqrstuvwxyz
>  \stoptext



* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 22:01     ` Marco Patzer
@ 2013-08-01 22:12       ` Philipp Gesang
  2013-08-02 11:09         ` Hans Hagen
  0 siblings, 1 reply; 15+ messages in thread
From: Philipp Gesang @ 2013-08-01 22:12 UTC (permalink / raw)
  To: ntg-context


···<date: 2013-08-02, Friday>···<from: Marco Patzer>···

> On 2013–08–01 Philipp Gesang wrote:
> 
> > ···<date: 2013-08-01, Thursday>···<from: Otared Kavian>···
> > 
> > > I tested your example: no problem here on Mac OS X 10.8.4, with
> > > either TeXShop, Adobe Reader or Preview, with the latest beta
> > > (ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int:
> > > english/english).
> > 
> > x64 linux here, but it’s the same with the windows version in
> > wine32. I get the bad output with okular (poppler), acroread, and
> > mupdf, but strangely not with zathura (mupdf-based).
> 
> Just to add to the list:
> 
> x64 linux here, and it works with the following poppler based
> viewers (zathura-poppler, xpdf, evince)

For those who want to test the git version, the commits are:

    last good: a61813ccdd4b7bcc81932317e1360fda6c79962d
    first bad: 6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8

Don’t forget to delete the cache.

I suspect I found the troublesome changes. The problem vanishes
if I revert this modification to font-map.lua:

    -local separator   = S("_.")
    -local other       = C((1 - separator)^1)
    -local ligsplitter = Ct(other * (separator * other)^0)
    +local ligseparator = P("_")
    +local varseparator = P(".")
    +local namesplitter = Ct(C((1 - ligseparator - varseparator)^1) * (ligseparator * C((1 - ligseparator - varseparator)^1))^0)

and then further down:

    -                local split = lpegmatch(ligsplitter,name)
    <...>
    +                local split = lpegmatch(namesplitter,name)
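
To see what the two definitions do differently with a glyph name, here is a
minimal Python sketch that emulates both LPeg patterns with regular
expressions. It is only an illustration: the function names old_split and
new_split and the sample glyph names are assumptions, not taken from
font-map.lua, and it says nothing about how ConTeXt uses the result afterwards.

```python
import re

# Emulates the removed pattern:
#   separator = S("_."); other = C((1 - separator)^1)
#   ligsplitter = Ct(other * (separator * other)^0)
OLD = re.compile(r"[^_.]+(?:[_.][^_.]+)*")

# Emulates the added pattern: only "_" continues the list; "." ends the match.
NEW = re.compile(r"[^_.]+(?:_[^_.]+)*")


def old_split(name):
    """Split on both "_" and ".", as the removed ligsplitter does."""
    m = OLD.match(name)
    return re.split(r"[_.]", m.group(0)) if m else None


def new_split(name):
    """Split on "_" only and stop at the first ".", as namesplitter does."""
    m = NEW.match(name)
    return m.group(0).split("_") if m else None


# Hypothetical glyph names, chosen only to exercise both separators.
for glyph in ("f_f_i", "a.sc", "f_i.alt"):
    print(glyph, old_split(glyph), new_split(glyph))
```

With these inputs the old pattern keeps the part after the dot as a separate
component ("a.sc" gives ["a", "sc"]), while the new one drops it ("a.sc" gives
["a"]). Whether that difference explains the broken copy&paste is not
established by this sketch.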

For convenience I repeat the link to the changeset:

    http://repo.or.cz/w/context.git/commitdiff/6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8

Best,
Philipp



* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-01 22:12       ` Philipp Gesang
@ 2013-08-02 11:09         ` Hans Hagen
  2013-08-02 11:37           ` Philipp Gesang
  0 siblings, 1 reply; 15+ messages in thread
From: Hans Hagen @ 2013-08-02 11:09 UTC (permalink / raw)
  To: ntg-context

On 8/2/2013 12:12 AM, Philipp Gesang wrote:
> ···<date: 2013-08-02, Friday>···<from: Marco Patzer>···
>
>> On 2013–08–01 Philipp Gesang wrote:
>>
>>> ···<date: 2013-08-01, Thursday>···<from: Otared Kavian>···
>>>
>>>> I tested your example: no problem here on Mac OS X 10.8.4, with
>>>> either TeXShop, Adobe Reader or Preview, with the latest beta
>>>> (ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int:
>>>> english/english).
>>>
>>> x64 linux here, but it’s the same with the windows version in
>>> wine32. I get the bad output with okular (poppler), acroread, and
>>> mupdf, but strangely not with zathura (mupdf-based).
>>
>> Just to add to the list:
>>
>> x64 linux here, and it works with the following poppler based
>> viewers (zathura-poppler, xpdf, evince)

i'm a bit puzzled

> For those who want to test the git version, the commits are:
>
>      last good: a61813ccdd4b7bcc81932317e1360fda6c79962d
>      first bad: 6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8
>
> Don’t forget to delete the cache.
>
> I suspect I found the troublesome changes. The problem vanishes
> if I revert this modification to font-map.lua:
>
>      -local separator   = S("_.")
>      -local other       = C((1 - separator)^1)
>      -local ligsplitter = Ct(other * (separator * other)^0)
>      +local ligseparator = P("_")
>      +local varseparator = P(".")
>      +local namesplitter = Ct(C((1 - ligseparator - varseparator)^1) * (ligseparator * C((1 - ligseparator - varseparator)^1))^0)
>
> and then further down:
>
>      -                local split = lpegmatch(ligsplitter,name)
>      <...>
>      +                local split = lpegmatch(namesplitter,name)
>
> For convenience I repeat the link to the changeset:

what do you revert from ... the + things are already in the file

>      http://repo.or.cz/w/context.git/commitdiff/6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8

btw, this bit of code is evolving (was recently adapted to some border
case fonts that use their own rules)

anyhow, on my win8 system the beta works with sumatra, okular and 
acrobat (indeed one might need to wipe the cache, but i can increment 
the version number)

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 11:09         ` Hans Hagen
@ 2013-08-02 11:37           ` Philipp Gesang
  2013-08-02 12:02             ` Marco Patzer
  0 siblings, 1 reply; 15+ messages in thread
From: Philipp Gesang @ 2013-08-02 11:37 UTC (permalink / raw)
  To: mailing list for ConTeXt users


···<date: 2013-08-02, Friday>···<from: Hans Hagen>···

> On 8/2/2013 12:12 AM, Philipp Gesang wrote:
> >···<date: 2013-08-02, Friday>···<from: Marco Patzer>···
> >
> >>On 2013–08–01 Philipp Gesang wrote:
> >>
> >>>···<date: 2013-08-01, Thursday>···<from: Otared Kavian>···
> >>>
> >>>>I tested your example: no problem here on Mac OS X 10.8.4, with
> >>>>either TeXShop, Adobe Reader or Preview, with the latest beta
> >>>>(ConTeXt  ver: 2013.08.01 01:31 MKIV beta  fmt: 2013.8.1  int:
> >>>>english/english).
> >>>
> >>>x64 linux here, but it’s the same with the windows version in
> >>>wine32. I get the bad output with okular (poppler), acroread, and
> >>>mupdf, but strangely not with zathura (mupdf-based).
> >>
> >>Just to add to the list:
> >>
> >>x64 linux here, and it works with the following poppler based
> >>viewers (zathura-poppler, xpdf, evince)

I’m on a different machine now: the problem affects linux x86 and
pdftotext as well. Also, in xpdf I get smallcaps copied as
uppercase instead of lowercase.

> i'm a bit puzzled
> 
> >For those who want to test the git version, the commits are:
> >
> >     last good: a61813ccdd4b7bcc81932317e1360fda6c79962d
> >     first bad: 6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8
> >
> >Don’t forget to delete the cache.
> >
> >I suspect I found the troublesome changes. The problem vanishes
> >if I revert this modification to font-map.lua:
> >
> >     -local separator   = S("_.")
> >     -local other       = C((1 - separator)^1)
> >     -local ligsplitter = Ct(other * (separator * other)^0)
> >     +local ligseparator = P("_")
> >     +local varseparator = P(".")
> >     +local namesplitter = Ct(C((1 - ligseparator - varseparator)^1) * (ligseparator * C((1 - ligseparator - varseparator)^1))^0)
> >
> >and then further down:
> >
> >     -                local split = lpegmatch(ligsplitter,name)
> >     <...>
> >     +                local split = lpegmatch(namesplitter,name)
> >
> >For convenience I repeat the link to the changeset:
> 
> what do you revert from ... the + things are already in the file

I’m quoting from the changeset, so the “-” lines indicate the
good version.

> >     http://repo.or.cz/w/context.git/commitdiff/6b2f7c5fd7a3e465f4e2662b1e5bd2c9d5cce8f8
> 
> btw, this bit of code is evolving (was recently adapted to some border
> case fonts that use their own rules)
> 
> anyhow, on my win8 system the beta works with sumatra, okular and
> acrobat (indeed one might need to wipe the cache, but i can
> increment the version number)

Weird. Here’s a PDF of the code I posted compiled with version
“2013.08.01 01:31” and how pdftotext renders it:

    https://phi-gamma.net/pdf/copypasta.pdf
    https://phi-gamma.net/files/copypasta.txt
  
I definitely get the wrong characters from this
one. The characters are mapped from the private use area:


    <...>
    30 beginbfchar
    <0409> <F761>
    <0416> <F762>
    <0418> <F763>
    <0423> <F764>
    <042A> <F765>
    <0435> <F738>
    <...>
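
A quick way to check a mapping like this is to parse the bfchar pairs and test
whether each target lies in the Basic Multilingual Plane private use area
(U+E000 to U+F8FF). The Python sketch below does that for the six entries
quoted above; the function name pua_targets is made up for illustration, and
the parser only handles simple "<src> <dst>" pairs with a single four-digit
hexadecimal destination.

```python
import re

# The six bfchar entries quoted in this message.
BFCHAR = """\
<0409> <F761>
<0416> <F762>
<0418> <F763>
<0423> <F764>
<042A> <F765>
<0435> <F738>
"""

PAIR = re.compile(r"<([0-9A-Fa-f]{4})>\s*<([0-9A-Fa-f]{4})>")


def pua_targets(cmap_text):
    """Return (source code, target code point, is_private_use) per pair."""
    rows = []
    for src, dst in PAIR.findall(cmap_text):
        code = int(dst, 16)
        # BMP private use area: U+E000 through U+F8FF inclusive.
        rows.append((src.upper(), code, 0xE000 <= code <= 0xF8FF))
    return rows


for src, code, private in pua_targets(BFCHAR):
    print(f"<{src}> -> U+{code:04X} private use: {private}")
```

All six targets fall into the private use area, which is consistent with
viewers pasting unprintable characters instead of digits and letters.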

Can someone reproduce it at all?

Philipp



* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 11:37           ` Philipp Gesang
@ 2013-08-02 12:02             ` Marco Patzer
  2013-08-02 12:28               ` Philipp Gesang
  0 siblings, 1 reply; 15+ messages in thread
From: Marco Patzer @ 2013-08-02 12:02 UTC (permalink / raw)
  To: ntg-context


On 2013–08–02 Philipp Gesang wrote:

>     https://phi-gamma.net/pdf/copypasta.pdf
>     https://phi-gamma.net/files/copypasta.txt
>   
> I definitely get   from this
> one.

Indeed. When I copy from your file I get those private Unicode
slots. When I run the example code from your OP, I get the correct
characters. I don't know what the difference between those two
files is. The LuaTeX and ConTeXt versions are the same.

Creator:        ConTeXt - 2013.08.01 01:31
Producer:       LuaTeX-0.76.0

Marco


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 12:02             ` Marco Patzer
@ 2013-08-02 12:28               ` Philipp Gesang
  2013-08-02 14:59                 ` Hans Hagen
  0 siblings, 1 reply; 15+ messages in thread
From: Philipp Gesang @ 2013-08-02 12:28 UTC (permalink / raw)
  To: ntg-context


···<date: 2013-08-02, Friday>···<from: Marco Patzer>···

> On 2013–08–02 Philipp Gesang wrote:
> 
> >     https://phi-gamma.net/pdf/copypasta.pdf
> >     https://phi-gamma.net/files/copypasta.txt
> >   
> > I definitely get   from this
> > one.
> 
> Indeed. When I copy from your file I get those private Unicode
> slots. When I run the example code from your OP, I get the correct
> characters. I don't know what's the difference between those two
> files.

The PDF is what Context produces here with that code.

>        The LuaTeX version and ConTeXt version is the same.
> 
> Creator:        ConTeXt - 2013.08.01 01:31
> Producer:       LuaTeX-0.76.0

There appears to be a difference between node and base mode
depending on how the font is defined:

  \pdfcompresslevel0
  
  \setupbodyfont [iwona]
  
  \definefontfeature [proto]    [onum=yes,smcp=yes,script=dflt,lang=dflt]
  \definefontfeature [withbase] [proto] [mode=base]
  \definefontfeature [withnode] [proto] [mode=node]
  
  \definefont [iwonab] [file:Iwona-Regular.otf*withbase]
  \definefont [iwonan] [file:Iwona-Regular.otf*withnode]
  
  \starttext
    \feature[<]
    base mode\par
    {\feature[!][withbase]0123456789abcdefghijklmnopqrstuvwxyz\par}
    {\iwonab 0123456789abcdefghijklmnopqrstuvwxyz}
  
    node mode\par
    {\feature[!][withnode]0123456789abcdefghijklmnopqrstuvwxyz\par}
    {\iwonan 0123456789abcdefghijklmnopqrstuvwxyz}
  \stoptext \endinput

This gets me (through pdftotext):

  base mode
  
  0123456789abcdefghijklmnopqrstuvwxyz
  node mode
  
  

So base mode with \definefont works while node mode or the font
from the typescript doesn’t.

Philipp


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 12:28               ` Philipp Gesang
@ 2013-08-02 14:59                 ` Hans Hagen
  2013-08-02 15:56                   ` Philipp Gesang
  2013-08-02 18:21                   ` Philipp Gesang
  0 siblings, 2 replies; 15+ messages in thread
From: Hans Hagen @ 2013-08-02 14:59 UTC (permalink / raw)
  To: ntg-context

On 8/2/2013 2:28 PM, Philipp Gesang wrote:
> ···<date: 2013-08-02, Friday>···<from: Marco Patzer>···
>
>> On 2013–08–02 Philipp Gesang wrote:
>>
>>>      https://phi-gamma.net/pdf/copypasta.pdf
>>>      https://phi-gamma.net/files/copypasta.txt
>>>
>>> I definitely get   from this
>>> one.
>>
>> Indeed. When I copy from your file I get those private Unicode
>> slots. When I run the example code from your OP, I get the correct
>> characters. I don't know what's the difference between those two
>> files.
>
> The PDF is what Context produces here with that code.
>
>>         The LuaTeX version and ConTeXt version is the same.
>>
>> Creator:        ConTeXt - 2013.08.01 01:31
>> Producer:       LuaTeX-0.76.0
>
> There appears to be a difference between node and base mode
> depending on how the font is defined:
>
>    \pdfcompresslevel0
>
>    \setupbodyfont [iwona]
>
>    \definefontfeature [proto]    [onum=yes,smcp=yes,script=dflt,lang=dflt]
>    \definefontfeature [withbase] [proto] [mode=base]
>    \definefontfeature [withnode] [proto] [mode=node]
>
>    \definefont [iwonab] [file:Iwona-Regular.otf*withbase]
>    \definefont [iwonan] [file:Iwona-Regular.otf*withnode]
>
>    \starttext
>      \feature[<]
>      base mode\par
>      {\feature[!][withbase]0123456789abcdefghijklmnopqrstuvwxyz\par}
>      {\iwonab 0123456789abcdefghijklmnopqrstuvwxyz}
>
>      node mode\par
>      {\feature[!][withnode]0123456789abcdefghijklmnopqrstuvwxyz\par}
>      {\iwonan 0123456789abcdefghijklmnopqrstuvwxyz}
>    \stoptext \endinput
>
> This gets me (through pdftotext):
>
>    base mode
>    
>    0123456789abcdefghijklmnopqrstuvwxyz
>    node mode
>    
>    
>
> So base mode with \definefont works while node mode or the font
> from the typescript doesn’t.

For such tests you need to compare all cases:

\nopdfcompression

\setupbodyfont [iwona]

\definefontfeature [withbaseone] [proto] [mode=base]
\definefontfeature [withnodeone] [proto] [mode=node]
\definefontfeature [withbasetwo] [proto] [mode=base,onum=yes,smcp=yes,script=dflt,lang=dflt]
\definefontfeature [withnodetwo] [proto] [mode=node,onum=yes,smcp=yes,script=dflt,lang=dflt]

\definefont [iwonabone] [file:Iwona-Regular.otf*withbaseone]
\definefont [iwonanone] [file:Iwona-Regular.otf*withnodeone]

\definefont [iwonabtwo] [file:Iwona-Regular.otf*withbasetwo]
\definefont [iwonantwo] [file:Iwona-Regular.otf*withnodetwo]

\starttext

test 1, both modes:

{\iwonanone 0123456789abcdefghijklmnopqrstuvwxyz}\par
{\iwonabone 0123456789abcdefghijklmnopqrstuvwxyz}\par
{\iwonantwo 0123456789abcdefghijklmnopqrstuvwxyz}\par
{\iwonabtwo 0123456789abcdefghijklmnopqrstuvwxyz}\par

% test 2, base only:

% {\iwonabone 0123456789abcdefghijklmnopqrstuvwxyz}\par
% {\iwonabtwo 0123456789abcdefghijklmnopqrstuvwxyz}\par

% test 3, node only:

% {\iwonanone 0123456789abcdefghijklmnopqrstuvwxyz}\par
% {\iwonantwo 0123456789abcdefghijklmnopqrstuvwxyz}\par

\stoptext

i.e. get rid of potential interferences

attached is what i get


-- 

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

[-- Attachment #2: oeps.pdf --]
[-- Type: application/pdf, Size: 27862 bytes --]


* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 14:59                 ` Hans Hagen
@ 2013-08-02 15:56                   ` Philipp Gesang
  2013-08-02 16:04                     ` Arthur Reutenauer
  2013-08-02 18:21                   ` Philipp Gesang
  1 sibling, 1 reply; 15+ messages in thread
From: Philipp Gesang @ 2013-08-02 15:56 UTC (permalink / raw)
  To: mailing list for ConTeXt users


···<date: 2013-08-02, Friday>···<from: Hans Hagen>···

> On 8/2/2013 2:28 PM, Philipp Gesang wrote:
> >···<date: 2013-08-02, Friday>···<from: Marco Patzer>···
> >
> >>On 2013–08–02 Philipp Gesang wrote:
> >>
> >>>     https://phi-gamma.net/pdf/copypasta.pdf
> >>>     https://phi-gamma.net/files/copypasta.txt
> >>>
> >>>I definitely get   from this
> >>>one.
> >>
> >>Indeed. When I copy from your file I get those private Unicode
> >>slots. When I run the example code from your OP, I get the correct
> >>characters. I don't know what's the difference between those two
> >>files.
> >
> >The PDF is what Context produces here with that code.
> >
> >>        The LuaTeX version and ConTeXt version is the same.
> >>
> >>Creator:        ConTeXt - 2013.08.01 01:31
> >>Producer:       LuaTeX-0.76.0
> >
> >There appears to be a difference between node and base mode
> >depending on how the font is defined:
> >
> >   \pdfcompresslevel0
> >
> >   \setupbodyfont [iwona]
> >
> >   \definefontfeature [proto]    [onum=yes,smcp=yes,script=dflt,lang=dflt]
> >   \definefontfeature [withbase] [proto] [mode=base]
> >   \definefontfeature [withnode] [proto] [mode=node]
> >
> >   \definefont [iwonab] [file:Iwona-Regular.otf*withbase]
> >   \definefont [iwonan] [file:Iwona-Regular.otf*withnode]
> >
> >   \starttext
> >     \feature[<]
> >     base mode\par
> >     {\feature[!][withbase]0123456789abcdefghijklmnopqrstuvwxyz\par}
> >     {\iwonab 0123456789abcdefghijklmnopqrstuvwxyz}
> >
> >     node mode\par
> >     {\feature[!][withnode]0123456789abcdefghijklmnopqrstuvwxyz\par}
> >     {\iwonan 0123456789abcdefghijklmnopqrstuvwxyz}
> >   \stoptext \endinput
> >
> >This gets me (through pdftotext):
> >
> >   base mode
> >   
> >   0123456789abcdefghijklmnopqrstuvwxyz
> >   node mode
> >   
> >   
> >
> >So base mode with \definefont works while node mode or the font
> >from the typescript doesn’t.
> 
> For such tests you need to compare all cases:
> 
> \nopdfcompression
> 
> \setupbodyfont [iwona]
> 
> \definefontfeature [withbaseone] [proto] [mode=base]
> \definefontfeature [withnodeone] [proto] [mode=node]
> \definefontfeature [withbasetwo] [proto] [mode=base,onum=yes,smcp=yes,script=dflt,lang=dflt]
> \definefontfeature [withnodetwo] [proto] [mode=node,onum=yes,smcp=yes,script=dflt,lang=dflt]
> 
> \definefont [iwonabone] [file:Iwona-Regular.otf*withbaseone]
> \definefont [iwonanone] [file:Iwona-Regular.otf*withnodeone]
> 
> \definefont [iwonabtwo] [file:Iwona-Regular.otf*withbasetwo]
> \definefont [iwonantwo] [file:Iwona-Regular.otf*withnodetwo]
> 
> \starttext
> 
> test 1, both modes:
> 
> {\iwonanone 0123456789abcdefghijklmnopqrstuvwxyz}\par
> {\iwonabone 0123456789abcdefghijklmnopqrstuvwxyz}\par
> {\iwonantwo 0123456789abcdefghijklmnopqrstuvwxyz}\par
> {\iwonabtwo 0123456789abcdefghijklmnopqrstuvwxyz}\par
> 
> % test 2, base only:
> 
> % {\iwonabone 0123456789abcdefghijklmnopqrstuvwxyz}\par
> % {\iwonabtwo 0123456789abcdefghijklmnopqrstuvwxyz}\par
> 
> % test 3, node only:
> 
> % {\iwonanone 0123456789abcdefghijklmnopqrstuvwxyz}\par
> % {\iwonantwo 0123456789abcdefghijklmnopqrstuvwxyz}\par
> 
> \stoptext
> 
> i.e. get rid of potential interferences
> 
> attached is what i get

Your PDF is flawless, but on my machine node mode produces the
wrong output. (I also run the LuaTeX beta-0.76.0-2013040516 that
comes with the minimals.) PDF:

  https://phi-gamma.net/pdf/copypasta-hh.pdf

Here’s a link to the diff between your PDF and my output:

  http://pastie.org/private/zwnesrug7wpy4ket6ppl1g

This bug is quite elusive; can you think of anything that this
behavior might be a side-effect of?
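
A minimal sketch of how the symptom could be checked mechanically,
assuming the text has already been extracted from the PDF (for
instance with pdftotext). It reports every character in the Unicode
private-use category, which is what turns up here in place of the
small caps and text figures. The function names are made up for
illustration only.

```python
import unicodedata

# The string that copy&paste is expected to yield.
EXPECTED = "0123456789abcdefghijklmnopqrstuvwxyz"


def private_use_chars(text):
    """Return (index, code point) pairs for each private-use character."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # General category "Co" covers the private-use areas.
        if unicodedata.category(ch) == "Co"
    ]


def looks_broken(extracted_line):
    """True if the extracted line contains any private-use character."""
    return bool(private_use_chars(extracted_line))
```

On the expected string this finds nothing; on a line that came out
wrong it would list a hit for each private-use character.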

Philipp



* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 15:56                   ` Philipp Gesang
@ 2013-08-02 16:04                     ` Arthur Reutenauer
  0 siblings, 0 replies; 15+ messages in thread
From: Arthur Reutenauer @ 2013-08-02 16:04 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> Here’s a link to the diff between your PDF and my output:
> 
>   http://pastie.org/private/zwnesrug7wpy4ket6ppl1g

  I don't know where in ConTeXt's code the difference comes from, but in
the PDF it is quite clear: Hans' file has a number of CMap entries that
map glyph IDs to lowercase Latin letters, which yours doesn't.  See
lines 153, 154, and similar ones.
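
  A rough sketch of how such a comparison could be automated, assuming
the PDF was written without compression (as in the test files above)
and that the ToUnicode entries are in the plain bfchar form
`<glyph code> <UTF-16BE hex>`. Range entries (bfrange) are ignored
here, and the function names are hypothetical.

```python
import re

# One bfchar section of a ToUnicode CMap, and one entry inside it.
BFCHAR_BLOCK = re.compile(rb"beginbfchar(.*?)endbfchar", re.DOTALL)
BFCHAR_ENTRY = re.compile(rb"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def bfchar_mappings(pdf_bytes):
    """Collect glyph code -> Unicode text from all bfchar sections."""
    mappings = {}
    for block in BFCHAR_BLOCK.finditer(pdf_bytes):
        for src, dst in BFCHAR_ENTRY.findall(block.group(1)):
            if len(dst) % 2:
                continue  # skip malformed hex strings
            raw = bytes.fromhex(dst.decode("ascii"))
            mappings[int(src, 16)] = raw.decode("utf-16-be", errors="replace")
    return mappings


def lowercase_latin_glyphs(mappings):
    """Glyph codes whose target is a single letter a-z."""
    return sorted(
        glyph for glyph, text in mappings.items()
        if len(text) == 1 and "a" <= text <= "z"
    )
```

Running `lowercase_latin_glyphs(bfchar_mappings(data))` on the bytes of
both files and comparing the two lists would show which glyph IDs lack
a mapping to a lowercase letter in one of them.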

	Arthur

* Re: copy&paste from pdf bug (smallcaps, text figures)
  2013-08-02 14:59                 ` Hans Hagen
  2013-08-02 15:56                   ` Philipp Gesang
@ 2013-08-02 18:21                   ` Philipp Gesang
  1 sibling, 0 replies; 15+ messages in thread
From: Philipp Gesang @ 2013-08-02 18:21 UTC (permalink / raw)
  To: mailing list for ConTeXt users


Hi all,

I removed the entire tree (Hans suggested this in a private
message) and started over with a clean install of the minimals.
Now it works as expected, both with Context and Luatex-Fonts.
It might have been caused by a leftover file that didn’t update.

Thanks for your assistance and sorry for the noise!

Best,
Philipp



Thread overview: 15+ messages
2013-08-01 17:33 copy&paste from pdf bug (smallcaps, text figures) Philipp Gesang
2013-08-01 20:23 ` Otared Kavian
2013-08-01 21:46   ` Philipp Gesang
2013-08-01 22:01     ` Marco Patzer
2013-08-01 22:12       ` Philipp Gesang
2013-08-02 11:09         ` Hans Hagen
2013-08-02 11:37           ` Philipp Gesang
2013-08-02 12:02             ` Marco Patzer
2013-08-02 12:28               ` Philipp Gesang
2013-08-02 14:59                 ` Hans Hagen
2013-08-02 15:56                   ` Philipp Gesang
2013-08-02 16:04                     ` Arthur Reutenauer
2013-08-02 18:21                   ` Philipp Gesang
2013-08-01 21:38 ` Marco Patzer
2013-08-01 22:08 ` Jannik Voges
