From: Adam Reviczky via ntg-context
To: mailing list for ConTeXt users
Newsgroups: gmane.comp.tex.context
Subject: PDF text stream with delimitedtext
Date: Sat, 30 Oct 2021 21:12:19 +0100

Hi,

I am trying to use quotation or blockquote in a document, but when I select and copy the text in poppler or pdf.js, the quotation marks come out doubled.

Consider the minimal example below:

\nopdfcompression
\setuppagenumbering[location=]
\starttext
\startquotation Hello world! \stopquotation
\setupquotation[left=«,right=»]
\startquotation Hello world! \stopquotation
\setupquotation[left=‘,right=’]
\startquotation Hello world! \stopquotation
\setupdelimitedtext[blockquote][left=‘,right=’]
\startblockquote Hello world! \stopblockquote
“Hello world!”\\
«Hello world!»\\
‘Hello world!’\\
\startquote Hello world! \stopquote\\
\quote{Hello world!}\\
\stoptext

The text stream seems to have an additional object for the quotation and blockquote lines.

The PDF itself renders fine, but copying the text gives mixed results. poppler/evince and pdf.js give doubled closing marks, while mupdf does not:

“Hello world!””,«Hello world!»»,‘Hello world!’’,‘Hello world!’’

Is this just an issue with how poppler and pdf.js extract the text, given that the PDF renders fine?

Adam
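The doubled marks reported above can also be checked mechanically rather than by eye. The following is only a minimal sketch in Python and is not part of the original report: the helper name `find_doubled_marks` and the `copied_clean` comparison string are assumptions made for illustration, and the code inspects text that has already been copied out of a viewer. It does not look inside the PDF.

```python
# Hypothetical helper (not from the original message): flags closing
# quotation marks that occur twice in a row in text copied from a viewer.
CLOSING_MARKS = "\u201d\u00bb\u2019"  # right double quote, right guillemet, right single quote


def find_doubled_marks(text):
    """Return (index, mark) pairs where a closing mark is immediately repeated."""
    hits = []
    for i in range(len(text) - 1):
        if text[i] in CLOSING_MARKS and text[i + 1] == text[i]:
            hits.append((i, text[i]))
    return hits


# Text as reported for poppler/evince and pdf.js in the message.
copied_doubled = "“Hello world!””,«Hello world!»»,‘Hello world!’’,‘Hello world!’’"
# Assumed output of a viewer that does not double the marks (illustrative only).
copied_clean = "“Hello world!”,«Hello world!»,‘Hello world!’,‘Hello world!’"

print(len(find_doubled_marks(copied_doubled)))  # 4
print(len(find_doubled_marks(copied_clean)))    # 0
```

Running the same check on text copied from each viewer gives a quick comparison, but it only confirms the symptom and says nothing about which object in the text stream causes it.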