public inbox archive for pandoc-discuss@googlegroups.com
From: Robert Fekete <fekete77.robert-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Copy-pasting code from the PDF loses formatting
Date: Thu, 6 Jan 2022 06:50:02 -0800 (PST)
Message-ID: <d82e995e-040c-44ae-9658-211660d69887n@googlegroups.com>
In-Reply-To: <CALu=v3LO_f8GBNxwre9mTrMT+Mttf6-b4eA45iKS1SUb8vSs=Q@mail.gmail.com>


Hi Leonard,

Thanks a lot for the tip. Unfortunately, it doesn't seem to solve the 
problem, but I'll play with it some more. Is there any way to force this, 
maybe from the HTML side, for example by replacing spaces with tabs? (Sorry 
if this doesn't make sense; I don't know much about the inner workings of 
the PDF format.)

On Thursday, January 6, 2022 at 15:00:08 UTC+1, Leonard Rosenthol wrote:

> Robert - the reason why none of the viewers are copying out indentation 
> is that there isn't actually any indentation there (i.e. no space or tab 
> characters); the text is simply "moved". Normally PDF viewers are able 
> to apply heuristics to "guess" when the amount of "movement" is supposed to 
> mean indentation - but this particular amount of "movement" is too small 
> for consideration. If you make the indent, say, 4 spaces' worth instead of 2, 
> I suspect you will get the result you wish.
>
> On Thu, Jan 6, 2022 at 4:10 AM Robert Fekete <fekete7...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Hi Everyone, 
>>
>> I'm trying to create PDF output from HTML input, and ran into a weird 
>> error:
>>
>> Code samples (for example, YAML or Python) are properly formatted in the 
>> PDF, but most of the formatting is lost when copy-pasting the code from the 
>> PDF into a text editor or terminal. Depending on the PDF viewer, either:
>>
>>    - line breaks are retained, but indentation is lost (Evince, Preview, 
>>    Adobe Reader), or
>>    - line breaks are lost and everything becomes a single line, but 
>>    whitespace is retained (built-in PDF viewers of Firefox and VS Code)
>>
>> I'm currently using pandoc 2.14.2 on macOS Big Sur.
>>
>> I have attached two test files (input and output). I created the PDF with 
>> the wkhtmltopdf engine, but I've tested other engines as well (xelatex, 
>> weasyprint) and the results were similar.
>>
>> Has anyone seen a similar problem? Any pointers are appreciated.
>>
>> Kind Regards,
>> Robert
>>
>
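To make Leonard's explanation concrete, here is a rough Python sketch of the kind of heuristic he describes. It is not how any particular viewer works: the `Run` type, the `extract_lines` function, and the threshold values are all made up for illustration. It only shows why an offset worth 2 spaces could be dropped while one worth 4 spaces survives.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """A piece of text drawn at an absolute position (hypothetical model)."""
    x: float   # left edge of the text, in points
    y: float   # baseline of the text, in points
    text: str


def extract_lines(runs, left_margin, space_width, min_indent_spaces):
    """Rebuild plain text from positioned runs, guessing indentation.

    The page stores no space or tab characters for the indent, only an x
    offset. The offset is converted to a number of space-widths, and it
    counts as indentation only if it reaches `min_indent_spaces`;
    smaller offsets are ignored.
    """
    lines = []
    # Assumes one run per line; PDF y coordinates grow upward, so sort
    # by descending y to read top to bottom.
    for run in sorted(runs, key=lambda r: -r.y):
        spaces = round((run.x - left_margin) / space_width)
        if spaces < min_indent_spaces:
            spaces = 0
        lines.append(" " * spaces + run.text)
    return "\n".join(lines)


# Assumed numbers: margin at 72pt, a space is 6pt wide, and offsets under
# 3 space-widths are ignored.
narrow = [Run(72, 700, "key:"), Run(84, 688, "child: 1")]   # 2-space indent
wide = [Run(72, 700, "key:"), Run(96, 688, "child: 1")]     # 4-space indent

narrow_text = extract_lines(narrow, 72, 6.0, 3)
wide_text = extract_lines(wide, 72, 6.0, 3)
```

With these assumed values, `narrow_text` comes back without the indent and `wide_text` keeps four leading spaces, which matches the suggestion to try a wider indent.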

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/d82e995e-040c-44ae-9658-211660d69887n%40googlegroups.com.


Thread overview: 3+ messages
2022-01-06  9:10 Robert Fekete
2022-01-06 13:59   ` Leonard Rosenthol
2022-01-06 14:50     ` Robert Fekete [this message]
