From: Christoph Reller
Newsgroups: gmane.comp.tex.context
Subject: Re: follow up
Date: Wed, 3 Apr 2019 07:58:04 +0200
Reply-To: mailing list for ConTeXt users
To: mailing list for ConTeXt users

My two cents:

I don't believe that it is the CIDSet. Both fun.pdf and fun1.pdf have no CIDSet (which is good).
The (relevant) differences between the two PDFs are:
- Different ToUnicode
- Different embedded font stream
- Minor differences in the font descriptor

It could be the ToUnicode: if Preview is not able to parse the last entry in the ToUnicode table, then it may also drop this glyph in its display, although ToUnicode is only relevant for text extraction.
It could be the font stream: in the CFF font file there is a CharSet table that maps character IDs to glyph IDs. If Preview cannot read the last entry in this table (or the last glyph, glyph no. 10), then it might drop it.
My bet is on the ToUnicode because, usually, if viewers fail to read a font file, they drop the entire font file and not single glyphs.
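
To make the ToUnicode hypothesis easier to test, here is a minimal sketch in Python. It assumes a hand-written CMap fragment (SAMPLE_CMAP is hypothetical and was not extracted from fun.pdf or fun1.pdf) and reads a beginbfchar/endbfchar block without caring how the entries are split across lines. The helper name parse_bfchar is only illustrative.

```python
import re

# Hypothetical ToUnicode fragment, NOT taken from fun.pdf or fun1.pdf.
# The last entry is deliberately split over two lines.
SAMPLE_CMAP = """\
3 beginbfchar
<0001> <0066>
<0002> <0075>
<0003>
<006E>
endbfchar
"""

# "<count> beginbfchar ... endbfchar", with the body allowed to span lines.
BFCHAR_BLOCK = re.compile(r"(\d+)\s+beginbfchar(.*?)endbfchar", re.S)
# A PDF hex string; whitespace inside the angle brackets is tolerated.
HEX_STRING = re.compile(r"<([0-9A-Fa-f\s]*)>")


def parse_bfchar(cmap_text):
    """Return {character code: Unicode string} for all bfchar entries."""
    mapping = {}
    for declared, body in BFCHAR_BLOCK.findall(cmap_text):
        tokens = ["".join(t.split()) for t in HEX_STRING.findall(body)]
        # Each entry is a (source code, destination) pair of hex strings.
        if len(tokens) != 2 * int(declared):
            raise ValueError(
                f"declared {declared} entries, found {len(tokens) / 2}"
            )
        for src, dst in zip(tokens[0::2], tokens[1::2]):
            # Destinations are UTF-16BE encoded.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


if __name__ == "__main__":
    print(parse_bfchar(SAMPLE_CMAP))
```

A tolerant reader like this one returns the same mapping with or without the extra line breaks. If Preview really does lose the last entry, its parser would have to be stricter than this sketch, which is a guess and not something checked against Preview itself.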

Anyway, both PDFs seem to be valid. But I wonder if the differences in the font descriptor are legitimate (especially StemV):
Object 9 <-> 7: Different entry Ascent integer value: 1127 <-> 806 in font descriptor dictionary.
Object 9 <-> 7: Different entry Descent integer value: -280 <-> -194 in font descriptor dictionary.
Object 9 <-> 7: Different entry StemV integer value: 91 <-> 0 in font descriptor dictionary.
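
The same kind of comparison can be sketched in a few lines of Python. The dictionaries hold only the three values listed above. The names first and second are placeholders, since the listing does not say which column belongs to which file, and diff_descriptors is an illustrative helper, not part of any PDF tool.

```python
# Values copied from the comparison above. Which side belongs to which PDF
# is not stated there, so "first" and "second" are placeholder names.
first = {"Ascent": 1127, "Descent": -280, "StemV": 91}
second = {"Ascent": 806, "Descent": -194, "StemV": 0}


def diff_descriptors(a, b):
    """Return (key, a_value, b_value) for every key whose values differ."""
    missing = object()  # sentinel so that an absent key differs from any value
    keys = sorted(set(a) | set(b))
    return [
        (key, a.get(key), b.get(key))
        for key in keys
        if a.get(key, missing) != b.get(key, missing)
    ]


if __name__ == "__main__":
    for key, left, right in diff_descriptors(first, second):
        print(f"{key}: {left} <-> {right}")
```

With real files the two dictionaries would have to be filled from the font descriptor objects (9 and 7 here) by whatever PDF reader is at hand; that step is left out on purpose.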

Cheers,
Christoph

On Tue, Apr 2, 2019 at 10:16 PM Hans Hagen <j.hagen@xs4all.nl> wrote:

> On 4/2/2019 8:38 PM, Taco Hoekwater wrote:
> >
> >
> >> On 2 Apr 2019, at 17:11, Hans Hagen <j.hagen@xs4all.nl> wrote:
> >>
> >> On 4/2/2019 4:18 PM, Ulrike Fischer wrote:
> >>> On Tue, 2 Apr 2019 15:58:18 +0200, Floris van Manen wrote:
> >>>>> indeed on preview no x shows up but it does in other viewers
> >>>>>
> >>>>
> >>>> Not just the x.
> >>>> In the second example the s will disappear, be shows up if you add some extra digits, and then dropping the 2
> >>> I don't have a mac and can't reproduce the problem. But the missing
> >>> char seems to be always the last one in the beginbfchar/endbfchar
> >>> list.
> >>>> The OSX preview is flaky but i’d assume the output of both context version would be similar (enough)
> >>> The new context adds new lines inside the beginbfchar/endbfchar
> >>> block. Perhaps this confuses preview and it drops the last entry.
> >> it is indeed the last one that is the issue but changing spacing or adding dummies doesn't help
> >
> > More likely the problem it has is due to the omitted /CIDSet in the font descriptor.
> >
> > The error is in the display engine, not the text extractor (since cut&paste work ok).
> > And that means the problem is almost certainly not the cmap. The only other non-trivial
> > difference I saw in the old vs. new pdf was that no longer present /CIDSet.
> >
> > Unf., generating one in the text editor is bit beyond me-on-the-could mode, so I can
> > not be certain of that although it seems likely (I checked with FF that the two glyphs
> > are indeed in the embedded font subset and in the exact slots the pdf says they have, so
> > that is also unlikely to be the problem.)
> ok, i'll check that tomorrow ... (cidsets are actually obsolete)
>
> Hans
>
> -----------------------------------------------------------------
>                                           Hans Hagen | PRAGMA ADE
>               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
>        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
> -----------------------------------------------------------------
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the Wiki!
>
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
> archive  : https://bitbucket.org/phg/context-mirror/commits/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________