ntg-context - mailing list for ConTeXt users
* Modify ToUnicode with Goodies
From: Christoph Reller @ 2021-06-03  9:25 UTC
  To: ntg-context


Hi,

On Windows, we have the consola font. Consider the MWE:

\starttext
\definedfont[name:consola*default at 12 pt]
-
\stoptext

The output PDF is correctly generated with recent versions of ConTeXt LMTX.
The hyphen is, however, mapped to a soft hyphen
<https://unicode-table.com/en/00AD/> by means of the ToUnicode table which
contains:
    beginbfchar
        <015E> <00AD>
    endbfchar

Consequently, when copying the text from the PDF and pasting in an editor
or a console, the soft hyphen is pasted.

I would like to change the ToUnicode information to an ordinary hyphen-minus
<https://unicode-table.com/en/002D/>:
    beginbfchar
        <015E> <002D>
    endbfchar
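
To make the difference between the two tables concrete, here is a minimal
sketch in Python (not part of ConTeXt) that reads bfchar pairs and names the
characters they map to. It assumes the ToUnicode stream has already been
decompressed to plain text; the helper names parse_bfchar and describe are
made up for this illustration.

```python
import re
import unicodedata

# Matches one "<glyph index> <unicode>" pair inside a bfchar section.
BFCHAR_PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def parse_bfchar(cmap_text):
    """Return {glyph_index: text} for every bfchar pair in cmap_text."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in BFCHAR_PAIR.findall(block):
            # The destination string is hex-encoded UTF-16BE.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def describe(mapping):
    """Replace each mapped character by its Unicode name."""
    return {
        glyph: [unicodedata.name(ch, "UNKNOWN") for ch in text]
        for glyph, text in mapping.items()
    }


current = "beginbfchar\n    <015E> <00AD>\nendbfchar"
wanted = "beginbfchar\n    <015E> <002D>\nendbfchar"

print(describe(parse_bfchar(current)))  # {350: ['SOFT HYPHEN']}
print(describe(parse_bfchar(wanted)))   # {350: ['HYPHEN-MINUS']}
```

Glyph index 0x015E (350) is the same in both cases; only the character that a
PDF viewer hands to the clipboard differs.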

I have tried with a goodies file, and an updated MWE:

--- 8< ------------------------------------------
return {
   name = "consola",
   version = "1.00",
   comment = "",
   author = "",
   copyright = "",
   remapping = {
      tounicode = true,
      unicodes = {
         hyphen = 0x002D,
      },
   },
}
--- 8< ------------------------------------------
\definefontfeature[consola][mode=base, goodies=consola, unicoding=yes]
\starttypescript[mono][consolas]
  \definefontsynonym[ConsolasRegular][file:consola][features=consola]
\stoptypescript
\starttypescript[mono][consolas]
  \definefontsynonym[Mono][ConsolasRegular]
\stoptypescript
\definetypeface[Body][tt][mono][consolas][default]
\setupbodyfont[Body, ss, 10pt]

\starttext
\tt -
\stoptext
--- 8< ------------------------------------------

Unfortunately, this has no effect.

Please tell me how to correctly update ToUnicode information with a goodies
file.

Cheers,
Christoph

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: Modify ToUnicode with Goodies
From: Hans Hagen @ 2021-06-07 17:05 UTC
  To: mailing list for ConTeXt users, Christoph Reller

On 6/3/2021 11:25 AM, Christoph Reller wrote:
> Please tell me how to correctly update ToUnicode information with a 
> goodies file.
It is (as always with fonts) more complex than that: (1) different
Unicode slots share the same shape, and (2) we have some (already) old
hyphen-patching code for messy fonts (which is kind of bad anyway).

We actually want all these hyphens to have the right tounicode even if
they share shapes (I already had a comment about looking into that,
but never ran into a font that needed it).
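
The shared-shape problem can be illustrated with a small Python sketch. This
is not how ConTeXt implements anything; the character-to-glyph table below is
hypothetical (only glyph index 0x015E is taken from the report above), and
the function names are made up for the illustration.

```python
from collections import defaultdict

# Hypothetical character-to-glyph table: hyphen-minus and soft hyphen
# both point at glyph 0x015E, while "A" has a glyph of its own.
cmap = {0x0041: 36, 0x002D: 0x015E, 0x00AD: 0x015E}


def glyph_users(cmap):
    """Group code points by the glyph they are rendered with."""
    users = defaultdict(list)
    for codepoint, glyph in cmap.items():
        users[glyph].append(codepoint)
    return {glyph: sorted(cps) for glyph, cps in users.items()}


def naive_tounicode(cmap):
    """Keep one code point per glyph: the highest one simply wins."""
    table = {}
    for codepoint, glyph in sorted(cmap.items()):
        table[glyph] = codepoint
    return table


shared = {g: cps for g, cps in glyph_users(cmap).items() if len(cps) > 1}
print({hex(g): [hex(c) for c in cps] for g, cps in shared.items()})
print(hex(naive_tounicode(cmap)[0x015E]))
```

With one tounicode entry per glyph, whichever code point happens to be
registered last is what gets copied from the PDF, regardless of which
character was actually typeset.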

So, after some experimenting I decided to solve that in a different way
(LMTX only, because there I have more control). I need to run some
checks and then do an upload so that you can test (also with other files,
if possible).

Hans



-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Modify ToUnicode with Goodies
From: Christoph Reller @ 2021-06-20  4:03 UTC
  To: mailing list for ConTeXt users


On Mon, Jun 7, 2021 at 7:05 PM Hans Hagen <j.hagen@xs4all.nl> wrote:

> It is (as always with fonts) more complex than that: (1) different
> Unicode slots share the same shape, and (2) we have some (already) old
> hyphen-patching code for messy fonts (which is kind of bad anyway).
>
> We actually want all these hyphens to have the right tounicode even if
> they share shapes (I already had a comment about looking into that,
> but never ran into a font that needed it).
>
> So, after some experimenting I decided to solve that in a different way
> (LMTX only, because there I have more control). I need to run some
> checks and then do an upload so that you can test (also with other files,
> if possible).
>

Finally I found the time to do some extended testing on this and it seems
that for my use-case the LMTX version 2021-06-09 behaves as I would expect:
Hyphens are now extracted as hyphens.
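
For anyone who wants to repeat such a check, here is a minimal Python sketch.
It assumes the text has already been copied or extracted from the PDF into a
string; the function name find_soft_hyphens is made up for this example.

```python
SOFT_HYPHEN = "\u00ad"


def find_soft_hyphens(text):
    """Return the positions of all soft hyphens (U+00AD) in text."""
    return [i for i, ch in enumerate(text) if ch == SOFT_HYPHEN]


# Text as it might be pasted from an affected and from a fixed PDF.
before = "a\u00adb"
after = "a-b"

print(find_soft_hyphens(before))  # [1]
print(find_soft_hyphens(after))   # []
```

An empty list means that every hyphen in the sample was extracted as an
ordinary character rather than as U+00AD.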

Thanks a lot for your implementation, Hans!

Cheers,
Christoph
