public inbox archive for pandoc-discuss@googlegroups.com
From: Axel Kielhorn <a.kielhorn-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: [Markdown=>PDF] Performance and other differences by switching between pdflatex/xelatex/lualatex
Date: Sat, 11 Jan 2014 19:31:06 +0100
Message-ID: <B219D441-229E-43F4-AC71-DA65D6902CFB@gmail.com>
In-Reply-To: <66137e52-b12d-476a-b79c-9afb3ca612bb-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>



On 11.01.2014 at 16:12, kurt.pfeifle-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org wrote:

> A few weeks ago I was playing with different settings to create PDFs from my own Markdown files, using --latex-engine=pdflatex|xelatex|lualatex
> 
> At the time I noticed there were significant performance differences:
> 	• pdflatex was the fastest (but sometimes had problems with special characters, like German umlauts)
> 	• xelatex was significantly slower (but handled my umlauts out of the box)
> 	• lualatex was extremely slow, and in many cases didn't finish the job at all but threw an error

This is well known.
If pdflatex does the job, use it; it is the fastest.
If you need XeLaTeX or LuaLaTeX features, the run will take longer.
One problem I often run into with pdflatex is shift-space (non-breaking space), which isn't supported by inputenc.
Another is non-ASCII characters in section headings.
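
For the non-breaking space, one possible workaround under pdflatex is to teach inputenc the character yourself. This is a minimal sketch, assuming the source is UTF-8 encoded and that your inputenc version does not already map U+00A0; it would go in the preamble of a custom template:

```latex
\usepackage[utf8]{inputenc}
% Map U+00A0 (no-break space) to LaTeX's ordinary tie.
\DeclareUnicodeCharacter{00A0}{~}
```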

> But I didn't have much time to investigate more deeply -- I decided to write most of my content in Markdown first, before turning to fine-tuning the style details of the different output formats.
> 
> Today I found some time to start taking a deeper look at the performance. In order to have a common (and stable) base for these measurements, I'm using the main README file from pandoc's Git repository as my Markdown source.
> 
> I'm using the freshly released version 1.12.3, installed via cabal on a MacBook (running Mavericks); the different LaTeX engines were installed via MacPorts:

Please use MacTeX and not TeX from MacPorts.
That way we will be talking about the same binaries (and, if you update MacTeX via TeX Live Utility, the same versions of the style files).

With MacTeX

time pandoc -f markdown --latex-engine=lualatex -o myreadme_lualatex.pdf README

succeeds.

first run:
real	1m4.123s
user	0m38.678s
sys	0m10.288s

second run:

real	0m12.798s
user	0m11.822s
sys	0m1.004s


With pdflatex:

real	0m3.609s
user	0m2.880s
sys	0m0.332s

> Another significant difference in the output of the two successful PDF conversions:
> 	• xelatex used A4 media format for the PDF pages
> 	• pdflatex used Letter format
> (but I guess these defaults are built into the respective engines and do not have anything to do with pandoc. Or?!) This page size difference does not allow for an easy visual side-by-side inspection of the two PDFs for any more subtle differences in their pages' appearance.

You should always set the paper size in the document.
Otherwise the result depends on each installation's defaults.
It is best to use a custom template.
(I have set the page size to A4 via tlmgr, but others may not have.)
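
One way to pin the paper size in a custom template, sketched here under the assumption that the template uses the article class and that the geometry package is installed:

```latex
% Request A4 as a global class option ...
\documentclass[a4paper]{article}
% ... and load geometry, which also writes the page size into the PDF
% for pdflatex, xelatex and lualatex.
\usepackage[a4paper]{geometry}
```

With pandoc's default template the same effect may be reachable from the command line via -V geometry:a4paper or -V classoption=a4paper, provided the template shipped with your pandoc version supports those variables.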

> PDF Metadata
> 
> As you can see, there are a few significant differences:
> 	• File size: pdflatex outputs ~444 kB, xelatex outputs ~185 kB (difference of ~259 kB).

Perhaps font expansion (via microtype) in pdflatex, which does not work in XeTeX?

> 	• Page numbers: pdflatex generates 36 pages, xelatex generates 37 pages.

microtype changes the line/paragraph/page breaks.
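
To check whether microtype explains the difference, one could switch its font expansion off for a pdflatex run and compare file size and page count again. A sketch, assuming microtype is loaded in your template's preamble (this line would replace the existing \usepackage{microtype} there, not be added alongside it):

```latex
% Keep character protrusion but disable font expansion for the comparison.
\usepackage[expansion=false]{microtype}
```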

> 	• Producer: pdflatex states "pdfTeX-1.40.14", xelatex states "xdvipdfmx (0.7.9)". This means xelatex goes a detour via DVI to produce its PDF.

This is the way XeTeX works: it produces an extended DVI (XDV) stream, which xdvipdfmx then converts to PDF.

> 	• Subject and Keywords: pdflatex doesn't put these metadata fields into the PDF (into object with /Type /Catalog), xelatex does so, but leaves them empty.

\hypersetup{breaklinks=true,
            bookmarks=true,
            pdfauthor={John MacFarlane},
            pdftitle={Pandoc User's Guide},
            colorlinks=true,
            citecolor=blue,
            urlcolor=blue,
            linkcolor=magenta,
            pdfborder={0 0 0}}

The \hypersetup block that pandoc generates (above) doesn't define a subject or keywords.
You may define pdfsubject and pdfkeywords and fill them with YAML data in a custom template.
(Please submit the changes.)
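
Such a template addition might look roughly like the following. This is only a sketch: the variable names subject and keywords are assumptions, not variables the default template is known to define, and they would have to be supplied in the document's YAML metadata block (keywords as a list).

```latex
% Hypothetical template variables "subject" and "keywords",
% filled from the YAML metadata block if present.
$if(subject)$
\hypersetup{pdfsubject={$subject$}}
$endif$
$if(keywords)$
\hypersetup{pdfkeywords={$for(keywords)$$keywords$$sep$, $endfor$}}
$endif$
```

The extra \hypersetup calls would go after the existing one in the template; when the metadata is missing, the conditionals emit nothing.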

> 	• Page size: despite identical command-line parameters, there are slight differences in the page size. I assume this is because of xelatex's DVI detour, which may introduce some rounding errors in the calculations.

> (It's nice when pandoc works flawlessly -- however, it is quite difficult to narrow down the cause of a problem when something goes wrong, as in this case with LuaLaTeX. I think I'll run a Markdown=>LaTeX conversion next, and then run a LaTeX=>PDF conversion manually on the command line, to see if I can enable some debugging switches there. Currently I do not have any experience with running lualatex, xelatex or pdflatex directly on the command line...)

My first step is always to generate LaTeX and examine that.
You don't have to run the engines on the command line; MacTeX installs TeXShop, which can run them for you.

Axel


