public inbox archive for pandoc-discuss@googlegroups.com
From: Chris Wright <cawright.99-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: ligatures in html
Date: Mon, 21 Sep 2015 18:03:06 -0700 (PDT)
Message-ID: <ab142d34-ed35-4034-be33-744e955f0329@googlegroups.com>
In-Reply-To: <20150921205458.GA92420-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>



Thanks for the help, folks.
I don't understand why this would be an issue with the LaTeX reader (that's 
a comment about me, not pandoc!), so I apologise for the laborious pace of 
this question...

If I change the test document to:

$ cat test.md
... \ae robe

that's three periods then the ligature

and look at the native format:

$ pandoc -S -f markdown -t native test.md > test.native
[Para [Str "\8230",Space,RawInline (Format "tex") "\\ae ",Str "robe"]]

so the three periods are converted to the correct ellipsis character, 
and the ligature is parsed to a RawInline

then outputting latex from this native representation:

$ pandoc -S -f native -t latex test.native 
\ldots{} \ae robe

so Str "\8230" is converted to \ldots{}, and the ligature is passed through 
unchanged from the RawInline

converting the same native to html:

$ pandoc -S -f native -t html test.native 
<p>… robe </p>

the Str "\8230" is printed as the single ellipsis character, and the 
RawInline is dropped.

It seems as if the md reader can parse three periods to an ellipsis 
character, but doesn't have a representation of the ligature that would 
work in both HTML and LaTeX, though it would work in LuaLaTeX/XeTeX if it 
output the ligature character itself (e.g. Str "\230", by analogy with 
Str "\8230" for the ellipsis).
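
For reference, the numbers in the native output are decimal Unicode code points written as escapes. A quick check in Python (used here only for illustration; `native_escape` is a made-up helper, not part of pandoc) shows which escape the ligature character would get:

```python
def native_escape(ch):
    """Return the decimal escape that the native format shows for a non-ASCII character."""
    return "\\" + str(ord(ch))

print(native_escape("\u2026"))  # the ellipsis character, U+2026
print(native_escape("\u00e6"))  # the ae ligature character, U+00E6
```

This prints \8230 for the ellipsis and \230 for the ligature.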

Might it work if the ligature were recognised by the reader, and something 
like

Ligature(ae)

were generated in the native format, which could then be converted by 
whichever writer produces the output format?
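
In the meantime, a filter could do the conversion on the AST. What follows is only a rough sketch, assuming the JSON form of the AST represents each element as an object with a "t" tag and a "c" payload, and that a raw TeX inline looks like {"t": "RawInline", "c": ["tex", "\\ae "]}. The names `LIGATURES` and `replace_ligatures` and the choice of mappings are my own, not anything pandoc provides:

```python
# Assumed mapping from LaTeX ligature commands to the Unicode characters
# they stand for; extend as needed.
LIGATURES = {
    "\\ae": "\u00e6",
    "\\AE": "\u00c6",
    "\\oe": "\u0153",
    "\\OE": "\u0152",
}

def replace_ligatures(node):
    """Return a copy of a JSON AST in which raw TeX ligature commands become Str."""
    if isinstance(node, list):
        return [replace_ligatures(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "RawInline":
            fmt, text = node["c"]
            # The raw text carries the trailing space that TeX would swallow.
            char = LIGATURES.get(text.strip())
            if fmt in ("tex", "latex") and char is not None:
                return {"t": "Str", "c": char}
        return {key: replace_ligatures(value) for key, value in node.items()}
    return node

# The paragraph from test.md, written out by hand in the assumed JSON shape.
para = {"t": "Para", "c": [
    {"t": "Str", "c": "\u2026"},
    {"t": "Space"},
    {"t": "RawInline", "c": ["tex", "\\ae "]},
    {"t": "Str", "c": "robe"},
]}
converted = replace_ligatures(para)
print(converted["c"][2])
```

To use it as a real filter, the same function would be applied to the JSON read from stdin and the result written to stdout. Because the RawInline becomes an ordinary Str, every writer should then be able to output it, although plain pdflatex may still need the usual input-encoding setup for the Unicode character.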

Again, thanks for your patience and help.

Chris






On Tuesday, 22 September 2015 06:55:14 UTC+10, John MacFarlane wrote:
Ideally pandoc's latex reader would recognize \ae and 
convert it to the proper character, so feel free to put 
an issue on the bug tracker about this. 

+++ 'Jason Seeley' via pandoc-discuss [Sep 21 15 07:50 ]: 
> Hello, 
> Ligatures like \ae are specific to the LaTeX (and thus PDF) writer, so 
> they don't work in any other formats. Pandoc just passes it through 
> unchanged. For HTML output, you can use an entity: `&AElig;` or 
> `&aelig;`, for upper case or lower case. Another option is to use the 
> unicode character directly (how you do this depends on your system and 
> text editor; in Windows hold Alt and type 0230 on the number pad; in 
> vim type CTRL-K a e; use a character-map app, etc.) This should work 
> for most output formats. It'll work with LaTeX if you use XeLaTeX or 
> LuaLaTeX, as those allow unicode input. 
> Jason 
> On Monday, September 21, 2015 at 5:57:37 AM UTC-5, Chris Wright wrote: 
> 
> I want to publish a document with an \ae ligature to html and to pdf. 
> The latex form "\ae robic" converts to the appropriate form and 
> displays properly in pdf, but the html just drops the ligature. 
> 
> Simple test case: 
> 
> chriswri$ cat > test.txt 
> 
> \ae robic 
> 
> chriswri$ more test.txt 
> 
> \ae robic 
> 
> chriswri$ pandoc -t native test.txt 
> 
> [Para [RawInline (Format "tex") "\\ae ",Str "robic"]] 
> 
> chriswri$ pandoc -t html test.txt 
> 
> <p>robic</p> 
> 
> What's the best way around this - write a filter? finding some docs 
> that will help? (I've found that ... is automatically converted to an 
> ellipsis - so \dots isn't necessary). 
> 
> with thanks 
> 
> Chris 
> 

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ab142d34-ed35-4034-be33-744e955f0329%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


