ntg-context - mailing list for ConTeXt users
* accessing glyphs in the private area
@ 2018-09-30 20:08 Ulrike Fischer
  2018-10-01  8:20 ` Hans Hagen
  0 siblings, 1 reply; 14+ messages in thread
From: Ulrike Fischer @ 2018-09-30 20:08 UTC (permalink / raw)
  To: ntg-context


The font Coelacanth (on CTAN) has glyphs in the private area. 

Between 2/2017 (luaotfload in texlive 2017) and now the storing and
accessing of these glyphs has changed.

In the lua of the font of 2017 I find e.g.

  [62860]={
   ["boundingbox"]=165,
   ["index"]=2622,
   ["unicode"]=62860,
   ["width"]=523,

and the glyph can be accessed with \Uchar62860

In the current lua I now find

 [983910]={
   ["boundingbox"]=195,
   ["index"]=2622,
   ["unicode"]=62860,
   ["width"]=523,
  },

and \Uchar62860 no longer works; one has to use \Uchar983910.

Is this change intentional? How is one supposed to access such
chars? The manual says about \Uchar that it "expands to the
associated Unicode character." but this seems no longer to be true.
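
To make the two numbers concrete: a minimal Python sketch (not part of any fontloader; the helper name `pua_block` is made up for illustration) that prints each slot in hex and names the Unicode Private Use Area it falls in.

```python
# Illustrative only: classify code points into Unicode's three Private Use Areas.
PUA_BLOCKS = [
    ("BMP Private Use Area", 0xE000, 0xF8FF),
    ("Supplementary Private Use Area-A", 0xF0000, 0xFFFFD),
    ("Supplementary Private Use Area-B", 0x100000, 0x10FFFD),
]

def pua_block(codepoint):
    """Return the name of the PUA block containing codepoint, or None."""
    for name, first, last in PUA_BLOCKS:
        if first <= codepoint <= last:
            return name
    return None

# The two slots from the font dumps: the old one and the new one.
for slot in (62860, 983910):
    print(f"{slot} = U+{slot:04X}: {pua_block(slot)}")
```

So the glyph's own code point (U+F58C) sits in the BMP private area, while the new slot (U+F0366) sits in plane 15.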

A context example to test is

\starttext
\font\test={name:Coelacanth:mode=node;script=latn;language=DFLT;+tlig;}
\test
1.: \Uchar62860

2.: \Uchar983910

\stoptext

The question was triggered by this tex.sx question
https://tex.stackexchange.com/questions/453224/using-glyphs-in-the-corporate-use-or-microsoft-symbol-areas
https://github.com/u-fischer/luaotfload/issues/7

-- 
Ulrike Fischer 
https://www.troubleshooting-tex.de/

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: accessing glyphs in the private area
  2018-09-30 20:08 accessing glyphs in the private area Ulrike Fischer
@ 2018-10-01  8:20 ` Hans Hagen
  2018-10-01  9:42   ` Ulrike Fischer
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Hagen @ 2018-10-01  8:20 UTC (permalink / raw)
  To: Ulrike Fischer, mailing list for ConTeXt users

On 9/30/2018 10:08 PM, Ulrike Fischer wrote:
> 
> The font Coelacanth (on CTAN) has glyphs in the private area.
> 
> Between 2/2017 (luaotfload in texlive 2017) and now the storing and
> accessing of these glyphs has changed.
> 
> In the lua of the font of 2017 I find e.g.
> 
>    [62860]={
>     ["boundingbox"]=165,
>     ["index"]=2622,
>     ["unicode"]=62860,
>     ["width"]=523,
> 
> and the glyph can be accessed with \Uchar62860
> 
> In the current lua I now find
> 
>   [983910]={
>     ["boundingbox"]=195,
>     ["index"]=2622,
>     ["unicode"]=62860,
>     ["width"]=523,
>    },
> 
> and \Uchar62860 no longer works; one has to use \Uchar983910.
> 
> Is this change intentional? How is one supposed to access such
> chars? The manual says about \Uchar that it "expands to the
> associated Unicode character." but this seems no longer to be true.
> 
> A context example to test is
> 
> \starttext
> \font\test={name:Coelacanth:mode=node;script=latn;language=DFLT;+tlig;}
> \test
> 1.: \Uchar62860
> 
> 2.: \Uchar983910
> 
> \stoptext
\Uchar expands to the character in the font, so to whatever sits in that 
slot ... in fact, fonts in luatex are not that different from 
traditional tex: slot 123 can be anything but it happens that we use 
unicode in the fontloader ..

anyway, the problem, with these private areas is that they are also used 
by the loader (and context) so in order to avoid clashes we move all 
private chars in the font to a dedicated private range

in your case the glyphs have no real useful names so basically i wonder 
what their use is (are they meant for direct access?)

you can define

\def\byindex#1{\ctxlua{
     for k, v in pairs(fonts.hashes.identifiers[true].characters) do
         if v.index == #1 then
             tex.print(utf.char(k))
             break
         end
     end
}}

{\definedfont[Coelacanth] test \byindex{\number"00A33}}
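
The same lookup can be sketched outside TeX. A minimal Python sketch, assuming a plain dict stands in for the loader's characters table (the dict layout and the function names are illustrative, not the fontloader's API); the sample entry reuses the values from the font dump quoted above.

```python
# Hypothetical stand-in for the loader's characters table; the entry
# reuses the values from the font dump quoted above.
characters = {
    983910: {"index": 2622, "unicode": 62860, "width": 523},
}

def slot_by_index(chars, index):
    """Linear search: return the slot of the glyph with this index, else None."""
    for slot, data in chars.items():
        if data["index"] == index:
            return slot
    return None

def slot_by_unicode(chars, codepoint):
    """Linear search on the stored 'unicode' field instead of the index."""
    for slot, data in chars.items():
        if data.get("unicode") == codepoint:
            return slot
    return None

print(slot_by_index(characters, 2622))     # slot found via glyph index
print(slot_by_unicode(characters, 62860))  # same slot via original code point
```

Both calls return the relocated slot, because the entry still records the original code point in its unicode field.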

I can remap those privates to a normalized private name, like P0F581 but 
it depends on how bloated fonts become that have lots of privates.

In that case you can have:

\def\byname#1{\ctxlua{
     for k, v in pairs(fonts.hashes.identifiers[true].shared.rawdata.descriptions) do
         if v.name == "#1" then
             tex.print(utf.char(k))
             break
         end
     end
}}

{\definedfont[Coelacanth] test \byname {P0F581}}

(btw, this code is not for context users! They have other means; this 
is typically stuff that differs per macro package. One might for 
instance make a list per font with meaningful names or so that can be 
accessed in a more friendly way.)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: accessing glyphs in the private area
  2018-10-01  8:20 ` Hans Hagen
@ 2018-10-01  9:42   ` Ulrike Fischer
  2018-10-01  9:53     ` luigi scarso
  2018-10-01 17:29     ` Hans Hagen
  0 siblings, 2 replies; 14+ messages in thread
From: Ulrike Fischer @ 2018-10-01  9:42 UTC (permalink / raw)
  To: ntg-context

On Mon, 1 Oct 2018 10:20:07 +0200, Hans Hagen wrote:


> anyway, the problem, with these private areas is that they are also used 
> by the loader (and context) so in order to avoid clashes we move all 
> private chars in the font to a dedicated private range

This basically means that for every document and package which uses
the generic fontloader the access to chars in the private area with
\char is now broken in luatex (in xetex it still works fine).

I just got from Claudio Beccari (who seems to have complained to
Luigi) a bug report that the libertine fonts no longer show some of
the keyboard key glyphs due to the same problem. 

Can you tell me when this change happened? Perhaps I can build an
older fontloader as a fallback. 


> in your case the glyphs have no real useful names so basically i wonder 
> what their use is (are they meant for direct access?)

The question on tex.sx claimed that it has the name uniF58C. 
I never used the font and don't know how Therese accessed the glyphs
before, but the libertine package has long lists of mappings like
this:

\DeclareTextGlyphY{LinBiolinum_K}{uniE18C}{57740}

How do context users access such glyphs? Why is there no problem?


> you can define
> 
> \def\byindex#1{\ctxlua{
>      for k, v in pairs(fonts.hashes.identifiers[true].characters) do
>          if v.index == #1 then
>              tex.print(utf.char(k))
>              break
>          end
>      end
> }}
> 
> {\definedfont[Coelacanth] test \byindex{\number"00A33}}

I don't see a use of accessing these glyphs by index - index
positions can change if the font is updated. This can only be a last
resort for glyphs without unicode position.

The only sensible access is by unicode number (which works).
 

> I can remap those privates to a normalized private name, like P0F581 but 
> it depends on how bloated fonts become that have lots of privates.

> 
> In that case you can have:
> 
> \def\byname#1{\ctxlua{
>      for k, v in pairs(fonts.hashes.identifiers[true].shared.rawdata.descriptions) do
>          if v.name == "#1" then
>              tex.print(utf.char(k))
>              break
>          end
>      end
> }}
> 
> {\definedfont[Coelacanth] test \byname {P0F581}}

It would at least mean that not the whole characters list must be
searched. And we could create a documented and stable access
command. 
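
Such a lookup need not scan the whole table on every access. A rough Python sketch, assuming a dict of glyph descriptions keyed by slot (the layout, the helper name `build_name_map` and the sample data are illustrative assumptions; the name uniF58C is the one claimed in the tex.sx question): build the name-to-slot map once, after which each access is a single dictionary lookup.

```python
# Hypothetical glyph descriptions keyed by slot; sample values are assumptions.
descriptions = {
    983910: {"name": "uniF58C", "index": 2622},
    122: {"name": "z", "index": 93},
    983911: {"index": 2623},  # a glyph without a name is simply skipped
}

def build_name_map(descs):
    """Build a name -> slot map once; the first slot wins on duplicate names."""
    name_map = {}
    for slot, data in descs.items():
        name = data.get("name")
        if name is not None:
            name_map.setdefault(name, slot)
    return name_map

name_to_slot = build_name_map(descriptions)
print(name_to_slot.get("uniF58C"))  # constant-time lookup after the one-off build
```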


-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/


* Re: accessing glyphs in the private area
  2018-10-01  9:42   ` Ulrike Fischer
@ 2018-10-01  9:53     ` luigi scarso
  2018-10-01 17:29     ` Hans Hagen
  1 sibling, 0 replies; 14+ messages in thread
From: luigi scarso @ 2018-10-01  9:53 UTC (permalink / raw)
  To: Ulrike Fischer, mailing list for ConTeXt users


On Mon, Oct 1, 2018 at 11:43 AM Ulrike Fischer <news3@nililand.de> wrote:

>
> I just got from Claudio Beccari (who seems to have complained to
> Luigi)
>
hm, not a complaint, a simple "bug report". I always try to answer directly
to Claudio when/if I can,
but in this case, if I have understood correctly, you are now
the/a maintainer of luaotfload, and for sure you can give much better
answers than me.

-- 
luigi


* Re: accessing glyphs in the private area
  2018-10-01  9:42   ` Ulrike Fischer
  2018-10-01  9:53     ` luigi scarso
@ 2018-10-01 17:29     ` Hans Hagen
  2018-10-01 17:55       ` Ulrike Fischer
  1 sibling, 1 reply; 14+ messages in thread
From: Hans Hagen @ 2018-10-01 17:29 UTC (permalink / raw)
  To: news3, mailing list for ConTeXt users

On 10/1/2018 11:42 AM, Ulrike Fischer wrote:

> Can you tell me when this change happened? Perhaps I can build an
> older fontloader as a fallback.

no, probably a while ago when some other clash in private area use was 
solved .. i'm not going to mess with the code now as 0xE000-0xEFFF is 
used in context for various things

>> in your case the glyphs have no real useful names so basically i wonder
>> what their use is (are they meant for direct access?)
> 
> The question on tex.sx claimed that it has the name uniF58C.
> I never used the font and don't know how Therese accessed the glyphs
> before, but the libertine package has long lists of mappings like
> this:
> 
> \DeclareTextGlyphY{LinBiolinum_K}{uniE18C}{57740}

A funny definition ... is that access by name or number?

> I don't see a use of accessing these glyphs by index - index
> positions can change if the font is updated. This can only be a last
> resort for glyphs without unicode position.

So can private unicodes, as they are just as undefined.

> The only sensible access is by unicode number (which works).

Anyway, for generic (so not for context) I can keep these glyphs in the 
0xE000-0xEFFF range for now (also the names so larger files ... actually 
a mess as that font has Uni and uni so who's to know). I have no clue if 
it clashes with some features at some point so that you can use numbers 
but probably those features are context specific anyway.

Hans
    -----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: accessing glyphs in the private area
  2018-10-01 17:29     ` Hans Hagen
@ 2018-10-01 17:55       ` Ulrike Fischer
  2018-10-01 20:42         ` Hans Hagen
  2018-10-02  4:55         ` luigi scarso
  0 siblings, 2 replies; 14+ messages in thread
From: Ulrike Fischer @ 2018-10-01 17:55 UTC (permalink / raw)
  To: ntg-context

On Mon, 1 Oct 2018 19:29:40 +0200, Hans Hagen wrote:

>> \DeclareTextGlyphY{LinBiolinum_K}{uniE18C}{57740}

>A funny definition ... is that access by name or number?

By number, it uses \char at the end to get the glyph. 

> Anyway, for generic (so not for context) I can keep these glyphs in the 
> 0xE000-0xEFFF range for now 

That would be great. Thanks.

What about the rest and the other PUA's? 
U+F000–U+F8FF, U+F0000–U+FFFFD, U+100000–U+10FFFD

And when will the change be available? I will have to update
luaotfload. 

> (also the names so larger files ... actually 
> a mess as that font has Uni and uni so who's to know).

It is not only that font. Actually the libertine package broke,
fontawesome broke, and Coelacanth was only used by Thérèse in the
example as it is free, her real problem was with using Goudy
fleurons. 

The private use area is there for every font to use. And quite often
they document these code points and tools like fontforge show them.
I don't think that it is a good idea if an application comes along
and pushes the glyphs from the seats because it wants the space for
itself.

> I have no clue if it clashes with some features at some point so
> that you can use numbers but probably those features are context
> specific anyway.

For what do you reserve the space in the PUA?

-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/


* Re: accessing glyphs in the private area
  2018-10-01 17:55       ` Ulrike Fischer
@ 2018-10-01 20:42         ` Hans Hagen
  2018-10-02  4:55         ` luigi scarso
  1 sibling, 0 replies; 14+ messages in thread
From: Hans Hagen @ 2018-10-01 20:42 UTC (permalink / raw)
  To: Ulrike Fischer, mailing list for ConTeXt users

On 10/1/2018 7:55 PM, Ulrike Fischer wrote:
> On Mon, 1 Oct 2018 19:29:40 +0200, Hans Hagen wrote:
> 
>>> \DeclareTextGlyphY{LinBiolinum_K}{uniE18C}{57740}
> 
>> A funny definition ... is that access by name or number?
> 
> By number, it uses \char at the end to get the glyph.
> 
>> Anyway, for generic (so not for context) I can keep these glyphs in the
>> 0xE000-0xEFFF range for now
> 
> That would be great. Thanks.
> 
> What about the rest and the other PUA's?
> U+F000–U+F8FF, U+F0000–U+FFFFD, U+100000–U+10FFFD

only U+F000–U+F8FF as i'm not in the mood writing code that skips over 
the other ones (we need code points for alternates and such and i also 
need consistent room for virtual chars) ... so, if someone really needs 
those slots he/she should remap them somehow

if you really want i can keep their names if there are names (UniXXXXXX) 
but that would mean way more mem for cjk fonts (then it's all or nothing 
for latex, as for plain generic i won't do that anyway)

but i don't expect those areas to be used for useable glyphs

(in fact, i would probably never rely on numbers in such private areas 
myself or write some plug into the loader that would remap them to areas 
i want them in .. i prefer glyph names)

> And when will the change be available? I will have to update
> luaotfload.

dunno, when i have some more to upload (sometime this week)

>> (also the names so larger files ... actually
>> a mess as that font has Uni and uni so who's to know).
> 
> It is not only that font. Actually the libertine package broke,
> fontawesome broke, and Coelacanth was only used by Thérèse in the
> example as it is free, her real problem was with using Goudy
> fleurons.

in context i strongly advise against using numbers instead of names

> The private use area is there for every font to use. And quite often
> they document these code points and tools like fontforge show them.
> I don't think that it is a good idea if an application comes along
> and pushes the glyphs from the seats because it wants the space for
> itself.

well, we do need space in a valid area

>> I have no clue if it clashes with some features at some point so
>> that you can use numbers but probably those features are context
>> specific anyway.
> 
> For what do you reserve the space in the PUA?
all kinds of stuff (for instance we have to put substitutes, alternates 
etc somewhere; i also have been using it for virtual math fonts for over 
a decade now) and i'm definitely not going to move all kinds of 
already used slots around now (i might do that some day as i do have 
some abstract model but then i also do need a lot of testing)

i also need the same slots in all fonts for some purposes so i need some 
shared private space

(in fact, if i need the higher space glyphs i can always decide to use 
names but first i need to run into a real font using these slots in 
order to see what is the impact)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: accessing glyphs in the private area
  2018-10-01 17:55       ` Ulrike Fischer
  2018-10-01 20:42         ` Hans Hagen
@ 2018-10-02  4:55         ` luigi scarso
  2018-10-02  7:29           ` Ulrike Fischer
  1 sibling, 1 reply; 14+ messages in thread
From: luigi scarso @ 2018-10-02  4:55 UTC (permalink / raw)
  To: Ulrike Fischer, mailing list for ConTeXt users


On Mon, Oct 1, 2018 at 7:56 PM Ulrike Fischer <news3@nililand.de> wrote:

>
> For what do you reserve the space in the PUA?
>

 http://www.pragma-ade.nl/general/manuals/fonts-mkiv.pdf
page 32 of the document :

As we already mentioned in a previous chapter, in ConTeXt we use Unicode
internally. This also means that fonts are organized this way. By default
the glyph representation of a Unicode character sits in the same slot in
the glyph table. All additional glyphs, like ligatures or alternates are
pushed in the private unicode space. This is why in the lists shown in
the figures the ligatures have a private Unicode number.

-- 
luigi


* Re: accessing glyphs in the private area
  2018-10-02  4:55         ` luigi scarso
@ 2018-10-02  7:29           ` Ulrike Fischer
  2018-10-02  9:29             ` Hans Hagen
  0 siblings, 1 reply; 14+ messages in thread
From: Ulrike Fischer @ 2018-10-02  7:29 UTC (permalink / raw)
  To: ntg-context

On Tue, 2 Oct 2018 06:55:02 +0200, luigi scarso wrote:

>> For what do you reserve the space in the PUA?

>  http://www.pragma-ade.nl/general/manuals/fonts-mkiv.pdf
> page 32 of the document :
 
> As we already mentioned in a previous chapter, in ConTeXt we use
> Unicode internally. This also means that fonts are organized this
> way. By default the glyph representation of a Unicode character
> sits in the same slot in the glyph table. All additional glyphs,
> like ligatures or alternates are pushed in the private unicode
> space. This is why in the lists shown in the figures the
> ligatures have a private Unicode number.

Hm. To clarify. In xetex there is a clear distinction between the slot
and unicode. \XeTeXglyph (slot) and \char (unicode) give different
output and \char actively uses the tounicode mapping of the font.  

\font\test="[lmroman10-regular.otf]"
\test
\XeTeXglyph"7A  
\char"7A
\bye


In luatex \char and \Uchar don't really care about unicode, even if
the font has tounicode=1 and tounicode entries, they access the char
by the hashed integer number. 

So to get "unicode" the font loader has to sort the glyphs, index
unicode glyphs by their unicode code point, and assign "non-unicode"
glyphs numbers that don't interfere. 

Did I get that right?
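
As a rough Python sketch of that reading (an assumption about the general idea, not the fontloader's actual code; `assign_slots` and the start value 0xF0000 are made up for illustration): glyphs that carry a code point keep it as their slot, and the rest get the next unused number in a private range.

```python
def assign_slots(glyphs, private_start=0xF0000):
    """Map slot number -> glyph.

    glyphs: list of dicts with an 'index' and an optional 'unicode' field.
    private_start is a hypothetical base for glyphs without a code point.
    """
    slots = {}
    pending = []
    # First pass: glyphs with a code point are indexed by that code point.
    for glyph in glyphs:
        codepoint = glyph.get("unicode")
        if codepoint is not None and codepoint not in slots:
            slots[codepoint] = glyph
        else:
            pending.append(glyph)
    # Second pass: everything else gets the next number not already in use.
    next_private = private_start
    for glyph in pending:
        while next_private in slots:
            next_private += 1
        slots[next_private] = glyph
        next_private += 1
    return slots

glyphs = [
    {"index": 93, "unicode": 0x7A},     # ordinary character
    {"index": 2622, "unicode": 0xF58C}, # font-defined PUA glyph stays put
    {"index": 3000},                    # e.g. a ligature without a code point
]
for slot, glyph in sorted(assign_slots(glyphs).items()):
    print(f"U+{slot:04X} -> glyph index {glyph['index']}")
```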

Then I do understand that you need some free numbers to push
glyphs. But I do not understand why to achieve this you remove
glyphs from their unicode points. The PUA is not some non-unicode
wilderness. The code points there are as valid as in the other code
blocks. You wouldn't move away the greek block to get the place, so
why do you think it is okay to throw out of the PUA block what SIL
and other font designers encoded there?  Can't you check for a free
range instead?
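
Checking for a free range is cheap in principle. A minimal Python sketch, assuming the set of code points used by the font is known (the function name and its defaults are hypothetical; the default bounds are the BMP Private Use Area, U+E000 to U+F8FF):

```python
def find_free_range(used, size, first=0xE000, last=0xF8FF):
    """Return the start of the first run of `size` consecutive code points
    in [first, last] that are not in `used`, or None if there is no such run."""
    run_start = first
    run_length = 0
    for codepoint in range(first, last + 1):
        if codepoint in used:
            # The run is broken; restart just after the occupied code point.
            run_start = codepoint + 1
            run_length = 0
        else:
            run_length += 1
            if run_length == size:
                return run_start
    return None

# Hypothetical font that occupies three code points in the BMP private area.
used_by_font = {0xE000, 0xE001, 0xE005}
start = find_free_range(used_by_font, 3)
print(f"U+{start:04X}")
```

The catch raised elsewhere in the thread still applies: a range found per font differs from font to font, so it cannot serve as a shared private space that is the same in all fonts.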

-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/


* Re: accessing glyphs in the private area
  2018-10-02  7:29           ` Ulrike Fischer
@ 2018-10-02  9:29             ` Hans Hagen
  2018-10-02 11:39               ` Ulrike Fischer
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Hagen @ 2018-10-02  9:29 UTC (permalink / raw)
  To: news3, mailing list for ConTeXt users

On 10/2/2018 9:29 AM, Ulrike Fischer wrote:
> On Tue, 2 Oct 2018 06:55:02 +0200, luigi scarso wrote:
> 
>>> For what do you reserve the space in the PUA?
> 
>>   http://www.pragma-ade.nl/general/manuals/fonts-mkiv.pdf
>> page 32 of the document :
>   
>> As we already mentioned in a previous chapter, in ConTeXt we use
>> Unicode internally. This also means that fonts are organized this
>> way. By default the glyph representation of a Unicode character
>> sits in the same slot in the glyph table. All additional glyphs,
>> like ligatures or alternates are pushed in the private unicode
>> space. This is why in the lists shown in the figures the
>> ligatures have a private Unicode number.
> 
> Hm. To clarify. In xetex there is a clear distinction between the slot
> and unicode. \XeTeXglyph (slot) and \char (unicode) give different
> output and \char actively uses the tounicode mapping of the font.
> 
> \font\test="[lmroman10-regular.otf]"
> \test
> \XeTeXglyph"7A
> \char"7A
> \bye
>  
> In luatex \char and \Uchar don't really care about unicode, even if
> the font has tounicode=1 and tounicode entries, they access the char
> by the hashed integer number.

they access the char in the characters table (where each character has 
an index field so one can write a simple function that accesses it by 
index; also, i assume that in xetex \char gives the character as known 
to tex so if one inputs non-unicode one gets that)

> So to get "unicode" the font loader has to sort the glyphs, index
> unicode glyphs by their unicode code point, and assign "non-unicode"
> glyphs numbers that don't interfere.
> 
> Did I get that right?

indeed, and we use the private space for those with no unicode (which 
can be a lot, also think for instance of the snippets that make up math 
extensibles)

> Then I do understand that you need some free numbers to push
> glyphs. But I do not understand why to achieve this you remove
> glyphs from their unicode points. The PUA is not some non-unicode
> wilderness. The code points there are as valid as in the other code
> blocks. You wouldn't move away the greek block to get the place, so
> why do you think it is okay to throw out of the PUA block what SIL
> and other font designers encoded there?  Can't you check for a free
> range instead?

sure, but then i also lose some functionality in context (unless i go 
for ugly solutions) ... as all glyphs are supposed to have a name, access 
by name is a pretty good alternative

the main issue is that there are fonts that use private > 0xFFFF space 
which then would mean a lot of extra mem for names ... so the question 
is are there fonts that use that range

Hans


-- 

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: accessing glyphs in the private area
  2018-10-02  9:29             ` Hans Hagen
@ 2018-10-02 11:39               ` Ulrike Fischer
  2018-10-02 12:42                 ` Hans Hagen
  0 siblings, 1 reply; 14+ messages in thread
From: Ulrike Fischer @ 2018-10-02 11:39 UTC (permalink / raw)
  To: ntg-context

On Tue, 2 Oct 2018 11:29:46 +0200, Hans Hagen wrote:
>>  Can't you check for a free range instead?
 
> sure, but then i also lose some functionality in context (unless i go 
> for ugly solutions) ... as all glyphs are supposed to have a name, access 
> by name is a pretty good alternative

Well in my view name and code point are both valid and useful
accesses (and I wouldn't trust names too much). 

Beside this:
xetex has (for non-legacy fonts) primitives for all accesses: by
char (unicode), slot and name. 


luatex hasn't, here the only (primitive) access are commands like
\char which expect a number; the name field of a character is marked
as "unused" in the manual. 

Neither has the generic fontloader imho some suitable primitive
command for name access. All the examples in the generic folder use
numbers or direct input: e.g. \Uchar"1D49D or \Uradical "0 "221A

So it is imho quite natural that people who write code and packages
expect the access by \char + code point to work. Why should I bother
with a (perhaps font specific) glyph name if I can simply look up a
clear code point number in a table?  

And if I got it right you are reserving a specific space to have
stable numbers internally, so you are caring about numbers too ;-)  

> the main issue is that there are fonts that use private > 0xFFFF space 

I don't know. Wikipedia says that code2000 uses plane 15 but I
didn't check. 


-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/


* Re: accessing glyphs in the private area
  2018-10-02 11:39               ` Ulrike Fischer
@ 2018-10-02 12:42                 ` Hans Hagen
  0 siblings, 0 replies; 14+ messages in thread
From: Hans Hagen @ 2018-10-02 12:42 UTC (permalink / raw)
  To: news3, mailing list for ConTeXt users

On 10/2/2018 1:39 PM, Ulrike Fischer wrote:
> On Tue, 2 Oct 2018 11:29:46 +0200, Hans Hagen wrote:
>>>   Can't you check for a free range instead?
>   
>> sure, but then i also lose some functionality in context (unless i go
>> for ugly solutions) ... as all glyphs are supposed to have a name, access
>> by name is a pretty good alternative
> 
> Well in my view name and code point are both valid and useful
> accesses (and I wouldn't trust names too much).
> 
> Beside this:
> xetex has (for non-legacy fonts) primitives for all accesses: by
> char (unicode), slot and name.

whatever ...

> luatex hasn't, here the only (primitive) access are commands like
> \char which expect a number; the name field of a character is marked
> as "unused" in the manual.

sure, as one can write lua code to provide that feature .. there is no 
benefit in having that code in the engine (in fact, even more could go)

> Neither has the generic fontloader imho some suitable primitive
> command for name access. All the examples in the generic folder use
> numbers or direct input: e.g. \Uchar"1D49D or \Uradical "0 "221A

one can write these helpers ... i consider those things macro package 
dependent as there's often some higher level interface

> So it is imho quite natural that people who write code and packages
> expect the access by \char + code point to work. Why should I bother
> with a (perhaps font specific) glyph name if I can simply look up a
> clear code point number in a table?

ok, so it depends on the users and viewpoints of macro package writers 
.. if some extra glyph cannot be given a meaningful name it's probably 
not worth using anyway

> And if I got it right you are reserving a specific space to have
> stable numbers internally, so you are caring about numbers too ;-)

symbolic mapping and for text not hard coded (and shared therefore
efficient) but i shifted that space up and hope for the best (for
context users that is, as i cannot test a lot now)

>> the main issue is that there are fonts that use private > 0xFFFF space
> 
> I don't know. Wikipedia says that code2000 uses plane 15 but I
> didn't check.

anyway ... i adapted the code to keep the pua intact and also added an 
option for outside context to keep bogus names ... (context users have 
several ways to access shapes anyway)
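
A minimal sketch of what "keeping the pua intact" could look like, under the assumption that encoded glyphs keep their own code point and only unencoded glyphs get loader-internal slots in a high private range. This is not the fontloader's code: `PRIVATE_BASE`, `assign_slots` and the input format are hypothetical, and Python stands in for Lua only to keep the example self-contained.

```python
PRIVATE_BASE = 0xF0000  # assumed start of the range used for internal slots


def assign_slots(glyphs):
    """Map slots to glyph indexes.

    glyphs is a list of (glyph_index, unicode_or_None) pairs.
    """
    slots = {}
    unencoded = []
    for index, unicode in glyphs:
        if unicode is None:
            unencoded.append(index)
        else:
            # Encoded glyphs, PUA code points included, stay where the font put them.
            slots[unicode] = index
    next_free = PRIVATE_BASE
    for index in unencoded:
        # Skip slots the font itself already occupies in the private range.
        while next_free in slots:
            next_free += 1
        slots[next_free] = index
        next_free += 1
    return slots
```

Because occupied slots are skipped, a font that itself uses private code points above 0xFFFF does not collide with the internally assigned ones in this scheme.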

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: accessing glyphs in the private area
  2018-10-01 21:11 David Carlisle
@ 2018-10-02  9:13 ` Hans Hagen
  0 siblings, 0 replies; 14+ messages in thread
From: Hans Hagen @ 2018-10-02  9:13 UTC (permalink / raw)
  To: mailing list for ConTeXt users, David Carlisle

On 10/1/2018 11:11 PM, David Carlisle wrote:

> If you need to allocate a block for internal use wouldn't it be possible
> to use one of the high areas Supplementary Private Use Area-A or B
> (U+F0000 - U+FFFFF) (U+100000 - U+10FFFF) ?
> 
> The BMP PUA block (U+E000 - U+F8FF) just has so many documented uses in
> existing fonts.

that's what i now do when the font loader is used outside context (no
beta uploaded yet)

(fyi: we use U+E000 - U+F8FF in context for other purposes ... and when we
need something special from the font we use glyph names)

Hans


^ permalink raw reply	[flat|nested] 14+ messages in thread

* accessing glyphs in the private area
@ 2018-10-01 21:11 David Carlisle
  2018-10-02  9:13 ` Hans Hagen
  0 siblings, 1 reply; 14+ messages in thread
From: David Carlisle @ 2018-10-01 21:11 UTC (permalink / raw)
  To: ntg-context


Ulrike and Hans wrote

 > > It is not only that font. Actually the libertine package broke,
 > > fontawesome broke, and Coelacanth was only used by Thérèse in the
 > > example as it is free, her real problem was with using Goudy
 > > fleurons.

 > in context i strongly advise against using numbers instead of names

It's not just explicit numbers via \char, it's also character data in
documents using specific fonts with documented PUA characters.

Many fonts use this for assorted reasons
https://en.wikipedia.org/wiki/Private_Use_Areas#Vendor_use
notably SIL fonts using it for minority languages not in Unicode and 
Microsoft for all kinds of CJK stuff.

Given how many reports are appearing in the few days that this has been
exposed to a larger number of users via LaTeX, use of PUA characters
really isn't a rare occurrence at all.

If you need to allocate a block for internal use wouldn't it be possible
to use one of the high areas Supplementary Private Use Area-A or B
(U+F0000 - U+FFFFF) (U+100000 - U+10FFFF) ?

The BMP PUA block (U+E000 - U+F8FF) just has so many documented uses in
existing fonts.
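
To make the ranges concrete, here is a small sketch that classifies a code point by private-use range. It is only an illustration in Python; the function and table names are made up. Note that the sketch ends the two supplementary areas at ...FFFD, since the last two code points of each plane are noncharacters, which is slightly narrower than the plane ranges given above.

```python
# Private-use ranges as defined by Unicode. The last two code points of
# planes 15 and 16 are noncharacters, so the areas end at ...FFFD.
PRIVATE_USE_RANGES = {
    "BMP PUA": (0xE000, 0xF8FF),
    "Supplementary PUA-A": (0xF0000, 0xFFFFD),
    "Supplementary PUA-B": (0x100000, 0x10FFFD),
}


def private_use_area(codepoint):
    """Return the name of the private-use range holding codepoint, or None."""
    for name, (first, last) in PRIVATE_USE_RANGES.items():
        if first <= codepoint <= last:
            return name
    return None


# The two slots from the start of the thread, plus an ordinary letter.
for cp in (62860, 983910, 0x41):
    print(f"U+{cp:04X}", private_use_area(cp))
```

The two numbers from the original report fall on either side of the divide: 62860 is U+F58C in the BMP PUA, while 983910 is U+F0366 in Supplementary Private Use Area-A.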

David

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-10-02 12:42 UTC | newest]

Thread overview: 14+ messages
2018-09-30 20:08 accessing glyphs in the private area Ulrike Fischer
2018-10-01  8:20 ` Hans Hagen
2018-10-01  9:42   ` Ulrike Fischer
2018-10-01  9:53     ` luigi scarso
2018-10-01 17:29     ` Hans Hagen
2018-10-01 17:55       ` Ulrike Fischer
2018-10-01 20:42         ` Hans Hagen
2018-10-02  4:55         ` luigi scarso
2018-10-02  7:29           ` Ulrike Fischer
2018-10-02  9:29             ` Hans Hagen
2018-10-02 11:39               ` Ulrike Fischer
2018-10-02 12:42                 ` Hans Hagen
2018-10-01 21:11 David Carlisle
2018-10-02  9:13 ` Hans Hagen
