public inbox archive for pandoc-discuss@googlegroups.com
* Strange bug in 1.15.1
@ 2015-10-23 19:44 BP Jonsson
       [not found] ` <562A8E21.7020009-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: BP Jonsson @ 2015-10-23 19:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

With 1.15.1:

     pandoc -f markdown -t latex

     \textsf<span class="sans">Life</span>

     \textgreek<span lang=grc">Βίος</span>

     \texrussian<span lang="ru">жизнь</span>

     ^D

     \textsf{Life}

     \textgreek<span lang=grc``\textgreater{}Βίος

     \texrussian{жизнь}

The strangeness is that it only happens with `\textgreek`;
anything else I've tried works as expected, with any argument/span
contents.

/bpj

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/562A8E21.7020009%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found] ` <562A8E21.7020009-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-23 21:48   ` Joost Kremers
       [not found]     ` <87r3klfe97.fsf-97jfqw80gc6171pxa8y+qA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Joost Kremers @ 2015-10-23 21:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


On Fri, Oct 23 2015, BP Jonsson <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org> wrote:
> With 1.15.1:
>
>      pandoc -f markdown -t latex
>
>      \textsf<span class="sans">Life</span>
>
>      \textgreek<span lang=grc">Βίος</span>
............................^

You seem to be missing a double quote here.

>      \texrussian<span lang="ru">жизнь</span>
>
>      ^D
>
>      \textsf{Life}
>
>      \textgreek<span lang=grc``\textgreater{}Βίος
>
>      \texrussian{жизнь}
>
> The strangeness is that it only happens with `\textgreek`, 
> anything else I've tried works as expected, with any argument/span 
> contents.

If the problem is not the double quote, then I haven't got a clue...


-- 
Joost Kremers
Life has its moments



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]     ` <87r3klfe97.fsf-97jfqw80gc6171pxa8y+qA@public.gmane.org>
@ 2015-10-24  9:53       ` BP Jonsson
       [not found]         ` <562B5504.7010107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: BP Jonsson @ 2015-10-24  9:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2015-10-23 at 23:48, Joost Kremers wrote:
>
> On Fri, Oct 23 2015, BP Jonsson <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org> wrote:
>> With 1.15.1:
>>
>>       pandoc -f markdown -t latex
>>
>>       \textsf<span class="sans">Life</span>
>>
>>       \textgreek<span lang=grc">Βίος</span>
> ............................^
>
> You seem to be missing a double quote here.

Alas, that's not it; I changed/removed the attributes in my MWE
several times, and it's the same even without any attributes, or
with the correct double quotes:

     pandoc -f markdown -t latex

     \textsf<span>Life</span>

     \textgreek<span>Βίος</span>

     \texrussian<span>жизнь</span>

     ^D

     \textsf{Life}

     \textgreek<span\textgreater{}Βίος

     \texrussian{жизнь}


     pandoc -f markdown -t latex

     \textsf<span class="sans">Life</span>

     \textgreek<span lang="grc">Βίος</span>

     \texrussian<span lang="ru">жизнь</span>

     ^D

     \textsf{Life}

     \textgreek<span lang=``grc''\textgreater{}Βίος

     \texrussian{жизнь}


> If the problem is not the double quote, then I haven't got a clue...

I hope somebody has a clue. It seems totally weird that a 
particular command gets misparsed. Anyway it seems the problem is 
with the reader:

     [Para [RawInline (Format "tex") "\\textsf",Span 
("",["sans"],[]) [Str "Life"]]
     ,Para [RawInline (Format "tex") "\\textgreek<",Str 
"span",Space,Str "lang=\"grc\">\914\8055\959\962",RawInline 
(Format "html") "</span>"]
     ,Para [RawInline (Format "tex") "\\texrussian",Span 
("",[],[("lang","ru")]) [Str "\1078\1080\1079\1085\1100"]]]

I have also checked with <http://pandoc.org/try> and it's the same.

For now I can use `\let\textgrk\textgreek` as a workaround, but it 
is clearly not satisfactory in the long run.

/bpj



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]         ` <562B5504.7010107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-24 12:59           ` mb21
       [not found]             ` <d8b41263-76b4-417d-a3be-1d81d92dfa7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: mb21 @ 2015-10-24 12:59 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: bpj-J3H7GcXPSITLoDKTGw+V6w



I suppose it's due to 
https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/LaTeX.hs#L450 
introduced 
in https://github.com/jgm/pandoc/commit/9bf76fa5a256a20d03d251ec15f1785af9a7bb41

@bpj You can also try the new language handling in master, where you can do:

    echo '<span lang="el">my greek text</span>' | pandoc -t latex
    \textgreek{my greek text}


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Modern (el) and Ancient (grc) Greek (was: Re: Strange bug in 1.15.1)
       [not found]             ` <d8b41263-76b4-417d-a3be-1d81d92dfa7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-10-24 21:29               ` BP Jonsson
       [not found]                 ` <562BF82E.80402-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-10-24 22:39               ` Strange bug in 1.15.1 John MacFarlane
  1 sibling, 1 reply; 16+ messages in thread
From: BP Jonsson @ 2015-10-24 21:29 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mb21

On 2015-10-24 at 14:59, mb21 wrote:
> I suppose it's due to
> https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/LaTeX.hs#L450
> introduced
> in https://github.com/jgm/pandoc/commit/9bf76fa5a256a20d03d251ec15f1785af9a7bb41
>
> @bpj You can also try the new language handling in master, where you can do:
>
>      echo '<span lang="el">my greek text</span>' | pandoc -t latex
>      \textgreek{my greek text}
>

Just to be clear: Modern and ancient Greek have different subtags. 
  One may think whatever one wants about that, and about putting 
the cutoff-point at 1453 and not, say, 381[^1] but one shouldn't 
use el for Ancient or Medieval Greek, which has its own subtag grc.
I just tried and `<span lang="grc">my ancient greek text</span>` 
doesn't currently work.  The reason polyglossia calls Ancient 
Greek `\textgreek[variant=ancient]` is because it uses a different 
classification scheme (IMNSHO with good reason), but if we are to 
use BCP 47 tags to select language one should be able to select 
`\textgreek[variant=ancient]` with lang="grc", because that's what 
is valid.  Even if I were the only pandoc user ever using Ancient 
Greek I think all languages and variants supported by polyglossia 
and babel should be handled correctly.

I'm willing to help chase down the needed tag--language(variant) 
pairs if you can provide me with something like a readable list of 
the current pairings.

In the meantime I *must* use something like

     \setotherlanguage[variant=ancient]{greek}
     \let\grc\textgreek
     \grc<span lang="grc">Ὁ βίος βραχύς, ἡ δὲ τέχνη μακρή.</span>

Not a huge burden but anyway...

<http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry> 
says:

     %%
     Type: language
     Subtag: el
     Description: Modern Greek (1453-)
     Added: 2005-10-16
     Suppress-Script: Grek
     %%
     Type: language
     Subtag: grc
     Description: Ancient Greek (to 1453)
     Added: 2005-10-16


[^1]: To be sure the last text in Ancient Greek was not written 
before 1453, nor the first text in post-Koiné vernacular written 
after 1453. 381 on the other hand marks a real cultural shift as 
near as can be, but neither warrants a bipartition of the Greek 
language.  The same can be said of Old Norse (non) vs. Icelandic 
(is).  At least in that case they don't demand that we draw the 
line at 1262 or any other specific date!



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek (was: Re: Strange bug in 1.15.1)
       [not found]                 ` <562BF82E.80402-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-24 21:35                   ` mb
       [not found]                     ` <62FE420F-92AF-4DB4-9B99-EF8C57F50304-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: mb @ 2015-10-24 21:35 UTC (permalink / raw)
  To: pandoc-discuss

Huh, isn't the following exactly what you want? (You need to check out the newest version on master from GitHub and compile it yourself, or wait for the next release.)

pandoc -t latex
<span lang="grc">my ancient greek</span>
^D
\textgreek[variant=ancient]{my ancient greek}


You can find the polyglossia mappings at https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/LaTeX.hs#L1075 (corrections always welcome).





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]             ` <d8b41263-76b4-417d-a3be-1d81d92dfa7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2015-10-24 21:29               ` Modern (el) and Ancient (grc) Greek (was: Re: Strange bug in 1.15.1) BP Jonsson
@ 2015-10-24 22:39               ` John MacFarlane
       [not found]                 ` <20151024223902.GB3531-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: John MacFarlane @ 2015-10-24 22:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

mb21 is right about the cause.  We could change the LaTeX
reader to only accept bracketed text (rather than any token)
after \textgreek.  That would make conversion from LaTeX
slightly less correct, however.
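
The difference between the two behaviours can be shown with a toy
Python model. This is only a sketch, not pandoc's actual Haskell
reader: `read_command` and its `arg_mode` parameter are hypothetical
names, and the model ignores optional arguments, whitespace skipping
and everything else real TeX tokenization does.

```python
def read_command(text, arg_mode):
    """Toy model: split `text` into (raw_tex, rest) for one leading command.

    arg_mode is "any_token" (a braced group or, failing that, the next
    single character counts as the argument) or "braced_only" (only a
    braced group counts as the argument).
    """
    if not text.startswith("\\"):
        raise ValueError("text must start with a control sequence")
    i = 1
    while i < len(text) and text[i].isalpha():
        i += 1
    # i now points just past the command name.
    if i < len(text) and text[i] == "{":
        depth = 0
        for j in range(i, len(text)):
            if text[j] == "{":
                depth += 1
            elif text[j] == "}":
                depth -= 1
                if depth == 0:
                    return text[: j + 1], text[j + 1 :]
        # Unbalanced braces: treat the command as having no argument.
        return text[:i], text[i:]
    if arg_mode == "any_token" and i < len(text):
        # The next character is swallowed as the argument.
        return text[: i + 1], text[i + 1 :]
    return text[:i], text[i:]


src = "\\textgreek<span>Βίος</span>"
print(read_command(src, "any_token"))
print(read_command(src, "braced_only"))
```

In the "any_token" mode the `<` ends up inside the raw TeX, which
matches the `RawInline (Format "tex") "\\textgreek<"` in the native
output posted earlier, and the rest is no longer a well-formed span.
In the "braced_only" mode the `<span>` survives, but valid LaTeX such
as `\textgreek a` would then lose its unbraced argument.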



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]                 ` <20151024223902.GB3531-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
@ 2015-10-25  9:22                   ` mb21
       [not found]                     ` <2919c1dd-2778-49a6-bc40-18dec1fa9315-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: mb21 @ 2015-10-25  9:22 UTC (permalink / raw)
  To: pandoc-discuss



@jgm I'm not sure why \textgreek was added in the LaTeX reader and no other 
languages, but maybe we can remove that line now that we have proper 
language attributes in spans/divs?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek
       [not found]                     ` <62FE420F-92AF-4DB4-9B99-EF8C57F50304-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-25 11:09                       ` BP Jonsson
       [not found]                         ` <562CB869.6010202-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: BP Jonsson @ 2015-10-25 11:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2015-10-24 at 23:35, mb wrote:
> huh, isn’t the following exactly what you want?

Yes.

> (you need to checkout the newest version on master from github

Sorry, I didn't get that part!

> and compile yourself

Can I find instructions somewhere? Preferably how to have both the 
released and the dev version installed. I'm on Ubuntu 14.04 if 
that makes any difference.  I have vague memories of trying and 
failing once.

>or wait for the next release)

That will be a while I guess, since there just was a release.

>
> pandoc -t latex
> <span lang="grc">my ancient greek</span>
> ^D
> \textgreek[variant=ancient]{my ancient greek}
>
>
> You can find the polyglossia mappings at https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/LaTeX.hs#L1075 corrections always welcome.

I think an option for Sanskrit in Latin script would be in order. 
Us comparative philologists Romanize everything *except* Greek...

I'm guessing monotonic Modern Greek (the current official 
orthography) is covered by

<https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/LaTeX.hs#L1102>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek
       [not found]                         ` <562CB869.6010202-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-25 18:17                           ` mb21
       [not found]                             ` <562D3436.6040707@gmail.com>
       [not found]                             ` <14ec196f-2a1f-4c8c-99a3-46e869f0823a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 2 replies; 16+ messages in thread
From: mb21 @ 2015-10-25 18:17 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: bpj-J3H7GcXPSITLoDKTGw+V6w



> Can I find instructions somewhere?
See https://github.com/jgm/pandoc/blob/master/INSTALL
Does anybody know whether stack or cabal is currently the preferred build 
method? (I'd guess stack, but I'm still on cabal so I cannot vouch for it).


> I think an option for Sanskrit in Latin script would be in order. 

Could you tell us what Polyglossia command Pandoc would need to output for
that? Probably something like \textsanskrit[script=latin]{...}, but on my
system it still complains about Devanagari fonts... are you sure this is
supported in Polyglossia?


> I'm guessing monotonic Modern Greek (the current official
> orthography) is covered by
> https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/LaTeX.hs#L1102

Yes, the language names shared by Babel and Polyglossia are 
at https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/LaTeX.hs#L1131



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek
       [not found]                               ` <562D3436.6040707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-25 20:37                                 ` mb
       [not found]                                   ` <88EFDCE6-7216-4C7B-B2F1-22F1C457A8C7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: mb @ 2015-10-25 20:37 UTC (permalink / raw)
  To: pandoc-discuss

Thanks for your explanations. Yes, inserting `\newfontfamily\sanskritfont{Latin Modern Roman}` (or similar) at least lets me compile the PDF. It still emits a bunch of warnings, though, probably because I haven’t installed all the font/language packs.

sa-deva and sa-latn already compile to \textsanskrit, since the code splits the BCP47 language string at dashes before feeding it to the functions I showed you, which fall back to “sa” if they cannot find “sa-deva” etc. But what you’re saying is that both Pali and Prakrit are also variants of Sanskrit (or that it at least makes sense to treat them as such for polyglossia’s purposes of hyphenation etc.)? Shouldn’t pi-thai be compiled to \textthai or something instead, and only pi, pi-deva and pi-latn to \textsanskrit?
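
To make the fallback concrete, here is a minimal sketch in Python (not pandoc's actual Haskell code). The names `POLYGLOSSIA`, `to_polyglossia` and `text_command` are hypothetical, and the table only holds the three primary subtags discussed in this thread.

```python
# Hypothetical lookup table: BCP 47 prefix -> polyglossia language name.
# Only the primary subtags from this thread are listed (an assumption,
# not pandoc's real mapping).
POLYGLOSSIA = {
    "sa": "sanskrit",
    "pi": "sanskrit",
    "pra": "sanskrit",
}


def to_polyglossia(tag):
    """Return the polyglossia language name for a BCP 47 tag, or None."""
    subtags = tag.lower().split("-")
    # Try the full tag first, then drop one trailing subtag at a time,
    # so "sa-latn" falls back to "sa" when "sa-latn" has no own entry.
    for end in range(len(subtags), 0, -1):
        name = POLYGLOSSIA.get("-".join(subtags[:end]))
        if name is not None:
            return name
    return None


def text_command(tag):
    """Build the inline command name, e.g. \\textsanskrit, or None."""
    name = to_polyglossia(tag)
    return None if name is None else "\\text" + name
```

With a plain prefix fallback like this, `pi-thai` also ends up as `\textsanskrit`, which is exactly the case I am asking about; an explicit `"pi-thai"` entry in the table would be one way to override it.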


> On 25.Oct, 2015, at 20:57 , BP Jonsson <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org> wrote:
> 
> On 2015-10-25 at 19:17, mb21 wrote:
>>> I think an option for Sanskrit in Latin script would be in order.
>> Could you tell us what Polyglossia command Pandoc would need to output for
>> that? Probably something like \textsanskrit[script=latin]{...}, but on my
>> system it still complains about Devanagari fonts... are you sure this is
>> supported in Polyglossia?
>> 
>> 
> 
> Sorry, the Script option with polyglossia/sanskrit is for other *Indic* scripts.  The default hyphenation patterns for sanskrit cover both Devanagari and Latin!  So you just define a Latin font for Sanskrit and `\setotherlanguage{sanskrit}` and you are good to go, but HTML wants lang="sa-latn" for Sanskrit in Latin script!
> 
> Truth to tell, I have never used Sanskrit as the main language of a document, but I have done something like
> 
> 
>    \usepackage{polyglossia}
>    \setmainfont{Charis SIL} % \defaultfontfeatures does the rest
>    \setmainlanguage{english}
>    \newfontfamily\sanskritfont[Script=Latin]{Charis SIL}
>    % loads hyphenation patterns for Devanagari *and* Latin, or so I've been told.
>    \setotherlanguage{sanskrit}
>    \newcommand{\skt}[1]{\emph{\textsanskrit{#1}}}
> 
> 
>    \skt<span lang="sa-latn">uvāca</span> ‘he said’.
> 
> (I load those definitions as header-includes.)
> 
> Essentially the same thing as described here, with the difference that I'm not concerned about Devanagari with my comparatist hat on.
> 
> <http://cikitsa.blogspot.se/2012/07/sanskrit-hyphenation-list.html>
> 
> I can't tell for sure if it's the `\sanskritfont` declaration or the fact that I happen to have Devanagari fonts installed which makes the difference, but I guess it's the latter.  The thing is probably that when you load Sanskrit with polyglossia it *both* checks for Devanagari fonts *and* loads the combined Devanagari/Latin hyphenation patterns, not being able to tell that you are in fact going to use only Romanization.
> 
> <https://github.com/reutenauer/polyglossia/blob/master/tex/gloss-sanskrit.ldf>
> 
> Which reminds me that 'sanskrit' in polyglossia loads hyphenation patterns valid for Pali and Prakrit as well, so you'd need to have mappings like
> 
> subtag      polyglossia
> -------     -------------
> sa          \textsanskrit
> sa-deva     \textsanskrit
> sa-latn     \textsanskrit
> pi          \textsanskrit
> pi-deva     \textsanskrit
> pi-latn     \textsanskrit
> pra         \textsanskrit
> pra-deva    \textsanskrit
> pra-latn    \textsanskrit
> 
> Reason: HTML cares semantically which language/script combo you actually use, while polyglossia's/LaTeX's main concern is hyphenation patterns.  Western scholars normally print Pali in Latin script; for Prakrit, editions normally use Devanagari, while articles etc. use Latin script.
> 
> In theory Pali can be written in Sinhala/Thai/Burmese script and a bunch of other SE Asian scripts as well, but polyglossia doesn't provide for that at all. I guess that in practice you'd use hyph patterns for the languages normally written in those scripts and live with the results.
> 
> /bpj



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek
       [not found]                             ` <14ec196f-2a1f-4c8c-99a3-46e869f0823a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-10-26  4:18                               ` John MacFarlane
  0 siblings, 0 replies; 16+ messages in thread
From: John MacFarlane @ 2015-10-26  4:18 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ mb21 [Oct 25 15 11:17 ]:
>   > Can I find instructions somewhere?
>   see https://github.com/jgm/pandoc/blob/master/INSTALL
>   Does anybody know whether stack or cabal is currently the preferred
>   build method? (I'd guess stack, but I'm still on cabal so I cannot
>   vouch for it).

They're both tested regularly, and both work well.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]                     ` <2919c1dd-2778-49a6-bc40-18dec1fa9315-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-10-26 11:54                       ` Andrew Dunning
  2015-10-26 13:56                       ` John MacFarlane
  1 sibling, 0 replies; 16+ messages in thread
From: Andrew Dunning @ 2015-10-26 11:54 UTC (permalink / raw)
  To: pandoc-discuss





On Sunday, October 25, 2015 at 5:22:18 AM UTC-4, mb21 wrote:
>
> @jgm I'm not sure why \textgreek was added in the LaTeX reader and no 
> other languages, but maybe we can remove that line now that we have proper 
> language attributes in spans/divs?
>

That was my fault; without that, any text contained in \textgreek{} was 
disappearing (https://github.com/jgm/pandoc/issues/1783). If it is handled 
properly now, there is no point in leaving it in.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]                     ` <2919c1dd-2778-49a6-bc40-18dec1fa9315-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2015-10-26 11:54                       ` Andrew Dunning
@ 2015-10-26 13:56                       ` John MacFarlane
       [not found]                         ` <20151026135613.GA8076-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: John MacFarlane @ 2015-10-26 13:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ mb21 [Oct 25 15 02:22 ]:
>   @jgm I'm not sure why \textgreek was added in the LaTeX reader and no
>   other languages, but maybe we can remove that line now that we have
>   proper language attributes in spans/divs?

It shouldn't be removed.  After all, this is the LaTeX
reader; it's designed to be used on LaTeX documents, not
just LaTeX in Markdown.

Presumably we should add similar commands for other
languages.  And perhaps instead of just returning its
contents, this should return a span with an appropriate
lang attribute.  (Implementing this might require
splitting out some of the mappings currently in the
LaTeX writer into an independent module, where they
could be used here too.)

Another possible change would be to change `tok` to `grouped`.
This would help in the present case, though it would be
less correct for LaTeX: technically, you can do
`\textgreek α` without `{}`.
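
A toy illustration of the trade-off, in Python and not pandoc's actual
Haskell parser: the hypothetical `tok` below accepts either a braced
group or a single character as the argument, while the hypothetical
`grouped` insists on braces.  Real LaTeX tokens are richer than single
characters, so treat this only as a sketch.

```python
def grouped(s):
    """Parse a brace-delimited argument.

    Returns (argument, rest) or None if s does not start with a group.
    Leading spaces are skipped, as after a LaTeX control word.
    """
    s = s.lstrip()
    if not s.startswith("{"):
        return None
    depth = 0
    for i, ch in enumerate(s):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return s[1:i], s[i + 1:]
    return None  # unbalanced braces


def tok(s):
    """Parse a braced group or, failing that, a single character.

    Simplification: a real token could also be a control sequence.
    """
    group = grouped(s)
    if group is not None:
        return group
    s = s.lstrip()
    if not s:
        return None
    return s[0], s[1:]
```

In this sketch `tok` takes the `<` of `<span ...>` as the argument and
so splits the tag, whereas `grouped` fails and leaves the text alone;
the price is that `grouped` also rejects the brace-less `\textgreek α`.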



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Strange bug in 1.15.1
       [not found]                         ` <20151026135613.GA8076-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
@ 2015-10-27 10:32                           ` Melroch
  0 siblings, 0 replies; 16+ messages in thread
From: Melroch @ 2015-10-27 10:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


On 26 Oct 2015 14:57, "John MacFarlane" <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:
>
> +++ mb21 [Oct 25 15 02:22 ]:
>
>>   @jgm I'm not sure why \textgreek was added in the LaTeX reader and no
>>   other languages, but maybe we can remove that line now that we have
>>   proper language attributes in spans/divs?
>
>
> It shouldn't be removed.  After all, this is the LaTeX
> reader; it's designed to be used on LaTeX documents, not
> just LaTeX in Markdown.
>
> Presumably we should add similar commands for other
> languages.  And perhaps instead of just returning its
> contents, this should return a span with an appropriate
> lang attribute.  (Implementing this might require
> splitting out some of the mappings currently in the
> LaTeX writer into an independent module, where they
> could be used here too.)
>
> Another possible change would be to change `tok` to `grouped`.
> This would help in the present case, though it would be
> less correct for LaTeX: technically, you can do
> `\textgreek α` without `{}`.

Would anyone really do that? It seems unusual, not to say careless,
even if possible, to omit the braces just because you cite a one-letter
word. It would only be useful for citing letters of the alphabet and some
'grammatical' words (articles, adverbs, pronouns, interjections) in
isolation. (With the caveat that modern Greek may have some one-vowel
content word of which I am not aware.)

Granted I use editor snippet templates for this sort of thing but I don't
think I would be tempted to leave out the braces anyway, even if it's
theoretically valid LaTeX.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Modern (el) and Ancient (grc) Greek
       [not found]                                   ` <88EFDCE6-7216-4C7B-B2F1-22F1C457A8C7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-29  9:31                                     ` BP Jonsson
  0 siblings, 0 replies; 16+ messages in thread
From: BP Jonsson @ 2015-10-29  9:31 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Sorry for the late reply. I thought I sent one the other night, but apparently
not.

While Pali is occasionally printed in the local scripts in Sri Lanka and
Southeast Asia I don't think we need to worry about that, as the number of
people who do so using any flavor of TeX is probably very small, and
correspondingly even smaller using pandoc. I think you can safely postpone
that issue until any such users make themselves known. Scholarly works
always print Pali in Latin script by longstanding tradition. If it would be
easy to detect any Pali+script tag combination and translate it into
`\text<somelanguageusingthatscript>` then by all means do it (the scripts
concerned would be Sinhala, Bengali, Burmese, Thai, Lao and Khmer of those
supported by polyglossia), but I think it might be best to postpone it
until a real solution emerges in LaTeX. You wouldn't want automatic strings
and the like to be Thai or whatever anyway if the main language of the
document really is Pali!
What's more, Thai script needs a preprocessor even with XeTeX (because you
don't put spaces between words but can break lines only at word
boundaries), and I doubt that preprocessor can handle Pali anyway.

/bpj


^ permalink raw reply	[flat|nested] 16+ messages in thread

Thread overview: 16+ messages
2015-10-23 19:44 Strange bug in 1.15.1 BP Jonsson
     [not found] ` <562A8E21.7020009-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-23 21:48   ` Joost Kremers
     [not found]     ` <87r3klfe97.fsf-97jfqw80gc6171pxa8y+qA@public.gmane.org>
2015-10-24  9:53       ` BP Jonsson
     [not found]         ` <562B5504.7010107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-24 12:59           ` mb21
     [not found]             ` <d8b41263-76b4-417d-a3be-1d81d92dfa7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-10-24 21:29               ` Modern (el) and Ancient (grc) Greek (was: Re: Strange bug in 1.15.1) BP Jonsson
     [not found]                 ` <562BF82E.80402-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-24 21:35                   ` mb
     [not found]                     ` <62FE420F-92AF-4DB4-9B99-EF8C57F50304-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-25 11:09                       ` Modern (el) and Ancient (grc) Greek BP Jonsson
     [not found]                         ` <562CB869.6010202-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-25 18:17                           ` mb21
     [not found]                             ` <562D3436.6040707@gmail.com>
     [not found]                               ` <562D3436.6040707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-25 20:37                                 ` mb
     [not found]                                   ` <88EFDCE6-7216-4C7B-B2F1-22F1C457A8C7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-29  9:31                                     ` BP Jonsson
     [not found]                             ` <14ec196f-2a1f-4c8c-99a3-46e869f0823a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-10-26  4:18                               ` John MacFarlane
2015-10-24 22:39               ` Strange bug in 1.15.1 John MacFarlane
     [not found]                 ` <20151024223902.GB3531-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
2015-10-25  9:22                   ` mb21
     [not found]                     ` <2919c1dd-2778-49a6-bc40-18dec1fa9315-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-10-26 11:54                       ` Andrew Dunning
2015-10-26 13:56                       ` John MacFarlane
     [not found]                         ` <20151026135613.GA8076-jF64zX8BO08aTFSqC7bH4WZHpeb/A1Y/@public.gmane.org>
2015-10-27 10:32                           ` Melroch
