dealing with languages in pandoc

public inbox archive for pandoc-discuss@googlegroups.com

* dealing with languages in pandoc
@ 2015-06-03 21:39 Pablo Rodríguez
       [not found] ` <556F7423.1060600-S0/GAf8tV78@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Rodríguez @ 2015-06-03 21:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Dear John,

I think there are some issues in pandoc that affect language
handling.

Right now, of the formats that allow language markup, only HTML, ePub
and LaTeX include language information. ConTeXt, docx and
OpenDocument, at least, could include it as well.

Language identifiers are different in XML and in LaTeX (there are some
differences in ConTeXt also). For proper language handling, we need
pandoc to convert values when writing to and reading from LaTeX (or
ConTeXt).

Not only the language identifiers but also the language markup differs
between HTML and LaTeX. To generate language-tagged documents from a
single source, we need a common syntax for language markup in pandoc.

This common syntax would involve three elements:

-   Language identifiers as special attributes, such as in {:en}.
    :language would be the third special attribute, besides .classes and
    #identifiers.

    lang="en" is HTML syntax. It won’t work with LaTeX or ConTeXt.

-   Special syntax for divisions, because <div> and <span> are XML
    elements that don’t work in LaTeX or ConTeXt.

-   As already discussed for other purposes, it would be great if all
    elements in Markdown could enjoy attributes. As far as I can
    remember from the thread "Release road map - 1.14 and beyond", this
    was an accepted feature.

My questions are:

1.  Would it be possible to implement the language identifier
    conversion lists for LaTeX and ConTeXt provided in
    https://github.com/jgm/pandoc/issues/1614#issuecomment-60476904 and
    https://github.com/jgm/pandoc/issues/1667#issuecomment-69243554?

    Just to be sure, YAML language metadata and *Markdown* should use
    ISO-639 language codes. The conversion is only needed to read from
    and write to LaTeX or ConTeXt.

2.  How about enabling the lang attribute in ConTeXt, docx and odf (as
    discussed in https://github.com/jgm/pandoc/issues/1667)?

3.  Would it be possible to implement the :lang special attribute syntax
    as in https://github.com/jgm/pandoc/issues/895?

4.  Could we have in version 1.15 special syntax for divisions and spans
    (https://github.com/jgm/pandoc/issues/168) so that they could be
    used beyond HTML output formats?

5.  Would it be possible that version 1.15 grants attributes to all
    elements (https://github.com/jgm/pandoc/issues/1966)?

These questions are an attempt to help identify the issues we have
when using pandoc to generate multilanguage documents in different
formats (that support language information) from a single source.

Many thanks for your help,


Pablo
-- 
http://www.ousia.tk

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/556F7423.1060600%40web.de.
For more options, visit https://groups.google.com/d/optout.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found] ` <556F7423.1060600-S0/GAf8tV78@public.gmane.org>
@ 2015-06-04 16:37   ` BP Jonsson
       [not found]     ` <55707EC3.3030909-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-04 22:33   ` John MacFarlane
  2015-08-18 16:06   ` John MacFarlane
  2 siblings, 1 reply; 25+ messages in thread
From: BP Jonsson @ 2015-06-04 16:37 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2015-06-03 23:39, Pablo Rodríguez wrote:
> Language identifiers are different in XML and in LaTeX (there are some
> differences in ConTeXt also). For proper language handling, we need that
> pandoc converts values when writing to and reading from LaTeX (or
> ConTeXt).
>
>      Just to be sure, YAML language metadata and *Markdown* should use
>      ISO-639 language codes. The conversion is only needed to read from
>      and write to LaTeX or ConTeXt.

Personally I have to look up those codes every time, apart from
the handful I use constantly. Language *names* don't suffer from
the same problem. Pandoc could use any kind of identifier as long
as the mapping from user identifiers to the format-specific ones
is documented somewhere, like in a file ~/.pandoc/langids.yaml.
Since ISO-639 codes and language names seldom if ever look the
same that file could map from either to the format specific
identifiers. Chacun à son goût.

I'm actually working on a filter which would implement exactly
that. It looks at and below certain metadata keys, listed in a
YAML registry file, for language names, which are listed in the
same file, and inserts their associated data into the metadata
which the writer sees.

In your metadata (I don't think it makes sense to set these
attributes on the command line now that we have metadata!) you
say something like:

```
---
polyglossia: true
# EITHER THIS:
mainlang:
   name:     english # or 'en' or 'eng' if you prefer
   variant:  american
# OR THIS (commented out so the block stays valid YAML):
# mainlang:   american # or 'usenglish' or...
---
```

The YAML registry file will have entries something like below.
Note how the YAML anchor/reference system is used to avoid
duplication errors! Such duplication as occurs is because the
filter doesn't walk the tree but merely picks a top node or a
`NAME.variants.VARIANT` node and uses its info, so the
information that the language name is 'english' must be in the
node it picks, since it doesn't look upwards. It will then pick a
language name/code and possibly a polyglossia variant name based
on the output format and any `polyglossia` or `babel` metadata
entry which happens to be true -- of course throwing an error if
the format is `latex` and both are true!

```
swedish: &swedish
   html:         sv
   babel:        swedish
   polyglossia:  swedish
sv:      *swedish
swe:     *swedish
greek:
   variants:
     modern: &greek_modern
       html:         el
       babel:        greek
       polyglossia:
         name:       greek
         variant:    monotonic
     ancient: &greek_ancient
       html:         grc
       babel:        polutonikogreek # sic!
       polyglossia:
         name:       greek
         variant:    ancient
ell: *greek_modern
el:  *greek_modern
gre: *greek_modern
grc: *greek_ancient
english: &english
   html:         en
   babel:        english
   polyglossia:  english
   variants:
     american: &english_american
       html:         en-US
       babel:        american
       polyglossia:
         name:       english
         variant:    american
     us: *english_american
     #### IMAGINE MORE VARIANTS HERE! ####
en:          *english
eng:         *english
USenglish:   *english_american
american:    *english_american
UKenglish:   *english_british
british:     *english_british
canadian:    *english_canadian
australian:  *english_australian
newzealand:  *english_newzealand
```
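
The lookup described above could be sketched in Python roughly as
follows. This is only an illustration, not the filter announced in
this message: the registry is a hand-written dict standing in for the
YAML file, and `resolve_language`, `REGISTRY`, the fallback to babel
when neither package flag is set, and the use of the `html` value for
every non-LaTeX format are all assumptions.

```python
# Hypothetical registry mirroring a few entries of the YAML above, written as
# plain dicts so the sketch needs no YAML parser.
ENGLISH_AMERICAN = {
    "html": "en-US",
    "babel": "american",
    "polyglossia": {"name": "english", "variant": "american"},
}
ENGLISH = {
    "html": "en",
    "babel": "english",
    "polyglossia": "english",
    "variants": {"american": ENGLISH_AMERICAN, "us": ENGLISH_AMERICAN},
}
SWEDISH = {"html": "sv", "babel": "swedish", "polyglossia": "swedish"}

# Top-level names and codes all point at shared nodes, like the YAML aliases.
REGISTRY = {
    "english": ENGLISH, "en": ENGLISH, "eng": ENGLISH,
    "american": ENGLISH_AMERICAN, "usenglish": ENGLISH_AMERICAN,
    "swedish": SWEDISH, "sv": SWEDISH, "swe": SWEDISH,
}


def resolve_language(meta, output_format, registry=REGISTRY):
    """Return the format-specific language value for the `mainlang` metadata."""
    mainlang = meta["mainlang"]
    if isinstance(mainlang, dict):
        name, variant = mainlang["name"], mainlang.get("variant")
    else:
        name, variant = mainlang, None

    # Pick a top node, or a NAME.variants.VARIANT node; never look upwards.
    node = registry[name.lower()]
    if variant is not None:
        node = node["variants"][variant.lower()]

    if output_format != "latex":
        # Assumption: every non-LaTeX format takes the HTML-style code.
        return node["html"]

    use_babel = bool(meta.get("babel"))
    use_polyglossia = bool(meta.get("polyglossia"))
    if use_babel and use_polyglossia:
        raise ValueError("both `babel` and `polyglossia` are true")
    if use_polyglossia:
        return node["polyglossia"]
    return node["babel"]  # assumption: babel is the default for LaTeX
```

With the first metadata example and LaTeX output this returns the
polyglossia name/variant pair; with `mainlang: sv` and HTML output it
returns `sv`.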

The plan is to cover all languages covered by babel and
polyglossia explicitly to begin with, then scrape pages like
<http://en.wikipedia.org/wiki/ISO_639:e> to map from
(lowercased) English names to codes. I guess ConTeXt support
could come in in the middle there, if anyone can point me to a
list of supported languages! I'll put it all on GitHub once I
have some decent code.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]     ` <55707EC3.3030909-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-04 18:52       ` Paulo Ney de Souza
       [not found]         ` <CAFVhNZOk1-Sm-fAD2_W9q9nefpAwzkK5+4KOFB1cqiZD-HoO6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-06-04 19:30       ` Pablo Rodríguez
  1 sibling, 1 reply; 25+ messages in thread
From: Paulo Ney de Souza @ 2015-06-04 18:52 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

The current situation in LaTeX is a mess because there were never any
"package writing guidelines" and each package author did what they
wanted. On top of that, some of the packages used for writing "other"
languages in LaTeX pre-date the ISO standards.

There is a broad effort to standardize the way LaTeX deals with LANG,
and also with the SCRIPT these languages are written in and their
LOCALE. This is one of the reasons the ISO codes were created -- to
make it easier to move data between applications... and we should stick
to them to the letter. The problem is complex and there are commonly
used languages in the world that are written in 4 (or more) different
scripts, so it is nice to use an ISO code instead of referring to a
language by

  kazakh_used_in-china_written_in-arabic

Paulo Ney

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]     ` <55707EC3.3030909-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-04 18:52       ` Paulo Ney de Souza
@ 2015-06-04 19:30       ` Pablo Rodríguez
  1 sibling, 0 replies; 25+ messages in thread
From: Pablo Rodríguez @ 2015-06-04 19:30 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 06/04/2015 06:37 PM, BP Jonsson wrote:
> On 2015-06-03 23:39, Pablo Rodríguez wrote:
>> Language identifiers are different in XML and in LaTeX (there are some
>> differences in ConTeXt also). For proper language handling, we need that
>> pandoc converts values when writing to and reading from LaTeX (or
>> ConTeXt).
>>
>>      Just to be sure, YAML language metadata and *Markdown* should use
>>      ISO-639 language codes. The conversion is only needed to read from
>>      and write to LaTeX or ConTeXt.
> 
> Personally I have to look up those codes every time, apart from
> the handful I use constantly. Language *names* don't suffer from
> the same problem.

Remembering English names for languages not normally used may be hard
as well. Or the right spelling may be forgotten, even if the language
name is remembered.

> Pandoc could use any kind of identifier as long
> as the mapping from user identifiers to the format-specific ones
> is documented somewhere, like in a file ~/.pandoc/langids.yaml.
> Since ISO-639 codes and language names seldom if ever look the
> same that file could map from either to the format specific
> identifiers. Chacun à son goût.

I asked for ISO-639 codes not because of the document’s metadata, but
to have a common markup for a new special language attribute, such as
in:

```
The German word for concept is _Begriff_{:de}
```

Also, avoiding unnecessary conversions (only LaTeX and ConTeXt need
special names) speeds up document reading and writing.

> I'm actually working on a filter which would implement exactly
> that by modifying the document metadata looking at and below
> certain keys, listed in a YAML registry file, for language
> names, also listed in the same file, inserting their associated
> data into the metadata which the writer sees.

Language markup is a basic document feature, so I think it should be
performed by pandoc itself, not by filters.

And filters tend to slow down document parsing. If not strictly
required, I think they should be avoided.

> In your metadata (I don't think it makes sense to set these
> attributes on the command line now that we have metadata!) you
> say something like:
> 
> ```
> ---
> polyglossia: true
> # EITHER THIS:
> mainlang:
>    name:     english # or 'en' or 'eng' if you prefer
>    variant:  american
> # OR THIS
> mainlang:   american # or 'usenglish' or...
> ```

Well, do you really think that this is better to write than the
following, letting document templates do the rest?

```
---
lang: en-US
...
```

I’m afraid that with your sample above the user has to type a lot more.

> greek:
>    variants:
>      modern: &greek_modern
>        html:         el
>        babel:        greek
>        polyglossia:
>          name:       greek
>          variant:    monotonic
>      ancient: &greek_ancient
>        html:         grc
>        babel:        polutonikogreek # sic!

No, polutonikogreek was for modern Greek polytonic orthography (up to
1982). Right now, it is greek.polutoniko.

The value for ancient Greek is greek.ancient. (Sorry, but grc seems to
me far easier to remember.)

> The plan is to cover all languages covered by babel and
> polyglossia explicitly to begin with, then scrape pages like
> <http://en.wikipedia.org/wiki/ISO_639:e> to map from
> (lowercased) English names to codes. I guess ConTeXt support
> could come in in the middle there, if anyone can point me to a
> list of supported languages! I'll put it all on GitHub once I
> have some decent code.

I provided a list at
https://github.com/jgm/pandoc/issues/1667#issuecomment-69243554.


Pablo
-- 
http://www.ousia.tk



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]         ` <CAFVhNZOk1-Sm-fAD2_W9q9nefpAwzkK5+4KOFB1cqiZD-HoO6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-06-04 19:43           ` BP Jonsson
       [not found]             ` <5570AA57.8000107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-05  5:19           ` juh
  1 sibling, 1 reply; 25+ messages in thread
From: BP Jonsson @ 2015-06-04 19:43 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2015-06-04 20:52, Paulo Ney de Souza wrote:
> so it is nice to use an ISO code instead of refereeing to a language by
>
>    kazakh_used_in-china_written_in-arabic

I, and I'm sure I'm not the only one, would have to look up the
ISO code in some kind of list, because I couldn't remember it
unless I used it weekly or even daily. And I guess that somewhere
down the line that lookup would have to look pretty much like this
anyway:

```
kazakh:
   variant:
     china:
       script:
         arabic: kaz-CN-Arab
```

Then why on earth not `kazakh-china-arabic`? To save eight bytes?
Since the universe to describe isn't finite we anyway end up with
things like `qaa-Qaaa-QM-x-southern` (look for it at
<http://tools.ietf.org/html/rfc5646#appendix-A>!) I can't for the
life of me understand why some people are enamoured of letter and
digit combinations which most humans can't easily remember.
Computers can be made to deal with something human-parsable just
as easily, and there is nothing that mandates that the info must
be compressed into one string with as little punctuation as
possible either. Why not be human friendly? Computers don't care
anyway, they just compute the data the way we tell them to.
Surely nobody hopes that the languages of the world shall go
extinct at an even quicker rate than they already do, so that they
all can fit in some compressed scheme? It's all just a holdover
from a time when computer resources were scarce.
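
To make the comparison concrete, here is a small Python sketch that
derives readable names such as `kazakh-china-arabic` from a nested
lookup like the one above. `flatten`, `LOOKUP` and `ALIASES` are
made-up names, and treating `variant` and `script` as mere grouping
keys is an assumption about how such a table would be read.

```python
def flatten(tree, prefix=()):
    """Yield (readable-name, tag) pairs from a nested lookup table.

    Grouping keys such as `variant` and `script` only label a level, so
    they are left out of the readable name.
    """
    for key, value in tree.items():
        path = prefix if key in ("variant", "script") else prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield "-".join(path), value


# The nested table from the message, as a Python dict.
LOOKUP = {"kazakh": {"variant": {"china": {"script": {"arabic": "kaz-CN-Arab"}}}}}

# Readable alias -> standard tag, so a user could type either form.
ALIASES = dict(flatten(LOOKUP))
```

Here `ALIASES` maps `kazakh-china-arabic` to `kaz-CN-Arab`, so the
readable form and the code can coexist in one table.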

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]             ` <5570AA57.8000107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-04 20:01               ` Pablo Rodríguez
       [not found]                 ` <5570AE8C.3070004-S0/GAf8tV78@public.gmane.org>
  2015-06-04 21:10               ` Paulo Ney de Souza
  1 sibling, 1 reply; 25+ messages in thread
From: Pablo Rodríguez @ 2015-06-04 20:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 06/04/2015 09:43 PM, BP Jonsson wrote:
> On 2015-06-04 20:52, Paulo Ney de Souza wrote:
>> so it is nice to use an ISO code instead of refereeing to a language by
>>
>>    kazakh_used_in-china_written_in-arabic
> 
> 
> I, and I'm sure I'm not the only one, would have to look up the 
> ISO code in some kind of list anyway,
> because I couldn't remember it anyway unless I used it weekly or 
> even daily, and I guess that somewhere, and somewhere down the 
> line that lookup would have to look pretty much like this anyway:
> 
> ```
> kazakh:
>    variant:
>      china:
>        script:
>          arabic: kaz-CN-Arab
> ```
> 
> Then why on earth not `kazakh-china-arabic`? To save eight bytes?
> Since the universe to describe isn't finite we anyway end up with
> things like `qaa-Qaaa-QM-x-southern` (look for it at
> <http://tools.ietf.org/html/rfc5646#appendix-A>!) I can't for the
> life of me understand why some people are enamoured of letter and
> digit combinations which most humans can't easily remember.

Not all humans are fluent in English. Or in English spellings (I would
surely have written kazakh wrong. And china is misleading, since it is
also a common noun.).

Of course, we can find complex samples.

But first, people will learn the codes for the languages they actually use.

And second, do you know where to find language resources for
kaz-CN-arab? I wonder whether there are freely available hyphenation
patterns for this language.

Sorry, BPJ, I write and typeset documents with languages I understand.
In fact, I know a lot more ISO-639 codes than languages I can write (or
even languages I include in documents I digitally edit).


Pablo
-- 
http://www.ousia.tk


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]                 ` <5570AE8C.3070004-S0/GAf8tV78@public.gmane.org>
@ 2015-06-04 21:01                   ` BP Jonsson
       [not found]                     ` <5570BCA8.1010102-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: BP Jonsson @ 2015-06-04 21:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2015-06-04 22:01, Pablo Rodríguez wrote:
> Sorry, BPJ, I write and typeset documents with languages I understand.
> In fact, I know a lot more ISO-639 codes than languages I can write (or
> even languages I include in documents I digitally edit).

Just to be clear: I'm not opposed to standardized codes -- that
would be stupid -- but to the urge to truncate them beyond what
is a reasonable linguistic token for a human. There is no way to
guess how a word has been truncated in these codes, so better not
truncate at all. If I ever need to write about Kazakh I need to
learn to spell it anyway, so why burden human memories with these
truncations? One example where they have mostly done it right is
Unicode character names, where they have resisted the temptation
to truncate. They could conceivably have truncated parts of the
names like the names of scripts, but thankfully they
didn't.[^1] The names are a mouthful to type, but for example
the Perl core module charnames provides for a mechanism to alias
those you need frequently. I have even written code for
generating aliases from full names algorithmically. Not least as
a shorthand writer I know that abbreviation has its uses, but it
should be left to private use, letting everyone choose their
abbreviations themselves, since what is a good abbreviation is 
highly subjective.

[^1]: Which isn't to say that some things couldn't be
better; I would have avoided significant punctuation and
whitespace altogether, calling the twenty-third Tibetan letter
'tibetan letter ha' and the twenty-ninth 'tibetan letter hha',
with good paleographic reason, to begin with. And why insist on
UPPER CASE?


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]             ` <5570AA57.8000107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2015-06-04 20:01               ` Pablo Rodríguez
@ 2015-06-04 21:10               ` Paulo Ney de Souza
  1 sibling, 0 replies; 25+ messages in thread
From: Paulo Ney de Souza @ 2015-06-04 21:10 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Thu, Jun 4, 2015 at 12:43 PM, BP Jonsson <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org> wrote:

> Then why on earth not `kazakh-china-arabic`? To save eight bytes?
>

No! It is not to save eight bytes -- it is to be precise about which China
you are referring to -- which could be three: CN (for Mainland China), HK
(for Hong Kong) and TW (for Taiwan). In the case of the Kazakh language, it
is mostly spoken in Mainland China (of the three above) and there is little
doubt, but there are many other languages that span all 3 Chinas and it
is really necessary to specify it by the ISO country code (ISO 3166-1).

It is also to keep people from writing stupidity that will give headaches
to millions of others for many years down the line -- check this:

In 1985, the first LaTeX class files for the language "Português" were
written by a good friend of mine! Because of the limitations enshrined into
DOS by Mr. Bill Gates the identifier (which was a file-name at the time)
had to be 8 letters only and the chosen one was "PORTUGES" much to my
chagrin at the time -- observe the missing letter U and the missing
hat accent in a word that mimics Portuguese.

30 years later this beast has propagated into Babel, Polyglossia, Biblatex,
... you name it... not to mention the gazillions of documents that
have been written using this string. While the authors of these packages
barely notice the missing "U" in the language identifier, it is
a headache for almost anyone writing LaTeX in Portuguese... and in the eyes
of many it was meant to make things easier...

The correct way to refer to Portuguese is: pt-PT, pt-BR, pt-AO, pt-MZ,
pt-IN, pt-CN, which is understood by most, especially the ones living,
speaking and writing in these languages.
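
As a rough illustration of how tags like these can be taken apart
and mapped back to the legacy LaTeX names, here is a Python sketch.
It is simplified on purpose: `split_tag`, `babel_name` and the `BABEL`
table are hypothetical names, only language, script and region subtags
are recognised, and the fallback from an unknown region to the plain
language entry is an assumption. RFC 5646 puts the script before the
region, but the sketch accepts either order, since tags in this thread
(`kaz-CN-Arab`) put the region first.

```python
def split_tag(tag):
    """Split a language tag into language, script and region subtags.

    The first subtag is the language, a four-letter subtag is a script,
    and a two-letter or three-digit subtag is a region. Anything else is
    collected under "other".
    """
    first, *rest = tag.split("-")
    parts = {"language": first.lower(), "script": None, "region": None, "other": []}
    for sub in rest:
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        if is_script and parts["script"] is None:
            parts["script"] = sub.title()
        elif is_region and parts["region"] is None:
            parts["region"] = sub.upper()
        else:
            parts["other"].append(sub)
    return parts


# Illustrative excerpt: (language, region) -> babel option name.
BABEL = {
    ("pt", None): "portuges",
    ("pt", "BR"): "brazilian",
}


def babel_name(tag):
    """Look up the babel name for a tag, or None if the language is unknown."""
    parts = split_tag(tag)
    exact = (parts["language"], parts["region"])
    return BABEL.get(exact, BABEL.get((parts["language"], None)))
```

With this table `pt-BR` resolves to `brazilian`, while `pt-AO` falls
back to the generic `portuges` entry.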

Paulo Ney


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found] ` <556F7423.1060600-S0/GAf8tV78@public.gmane.org>
  2015-06-04 16:37   ` BP Jonsson
@ 2015-06-04 22:33   ` John MacFarlane
       [not found]     ` <20150604223339.GH14696-bi+AKbBUZKZDXvCVYMAFSHgRjnEvrCe7@public.gmane.org>
  2015-08-18 16:06   ` John MacFarlane
  2 siblings, 1 reply; 25+ messages in thread
From: John MacFarlane @ 2015-06-04 22:33 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Pablo Rodríguez [Jun 03 15 23:39 ]:
>
>Language identifiers are different in XML and in LaTeX (there are some
>differences in ConTeXt also). For proper language handling, we need that
>pandoc converts values when writing to and reading from LaTeX (or
>ConTeXt).

Yes, I agree, this should be done soon.  We should have a
single standard `lang` or `language` metadata field that
takes ISO-639 values, which get converted where necessary
(e.g. LaTeX polyglossia and babel).
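
A minimal Python sketch of such a conversion, for illustration only
(this is not pandoc's implementation): `LANG_TABLE` and
`latex_preamble` are hypothetical names, the table holds just a few
values taken from this thread, and falling back from a regional tag
to its primary language subtag is an assumption.

```python
# Illustrative excerpt; fuller lists are in the GitHub issues cited in this
# thread. Each polyglossia entry is a (name, variant) pair.
LANG_TABLE = {
    "en": {"babel": "english", "polyglossia": ("english", None)},
    "en-US": {"babel": "american", "polyglossia": ("english", "american")},
    "sv": {"babel": "swedish", "polyglossia": ("swedish", None)},
    "el": {"babel": "greek", "polyglossia": ("greek", "monotonic")},
}


def latex_preamble(lang, package="babel"):
    """Return LaTeX preamble lines selecting the main language `lang`."""
    # Try the full tag first, then only its primary language subtag.
    entry = LANG_TABLE.get(lang) or LANG_TABLE.get(lang.split("-")[0])
    if entry is None:
        raise KeyError(f"no LaTeX name known for {lang!r}")
    if package == "babel":
        return f"\\usepackage[{entry['babel']}]{{babel}}"
    name, variant = entry["polyglossia"]
    options = f"[variant={variant}]" if variant else ""
    return f"\\usepackage{{polyglossia}}\n\\setmainlanguage{options}{{{name}}}"
```

The document itself would then only ever carry `lang: en-US`; the
package-specific names appear only on the LaTeX side.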

I'm less sure about the issues for marking up expressions by
language.  You can already do `<span lang="en">...</span>`.
This is not HTML specific.  It creates a native Pandoc Span
element in the AST.  This can be intercepted by filters and
changed into anything appropriate for the output format
you're using.
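
What such a filter might do to a single Span can be sketched in
Python on pandoc's JSON representation. This is an assumption-laden
illustration rather than a complete filter: `convert_span`,
`raw_latex` and `BABEL_NAMES` are made-up names, the node layout
follows the JSON form of the AST as I understand it, nested Spans are
not visited, and babel's `\foreignlanguage` is only one possible
target.

```python
# Hypothetical mapping from `lang` attribute values to babel names.
BABEL_NAMES = {"en": "english", "sv": "swedish"}


def raw_latex(code):
    """Build a RawInline node carrying LaTeX code."""
    return {"t": "RawInline", "c": ["latex", code]}


def convert_span(node):
    """Return the inlines that replace `node`; other nodes pass through."""
    if node.get("t") != "Span":
        return [node]
    # A Span holds [[identifier, classes, key-value pairs], inlines].
    (_ident, _classes, keyvals), inlines = node["c"]
    lang = dict(keyvals).get("lang")
    name = BABEL_NAMES.get(lang)
    if name is None:
        return [node]  # no or unknown language: leave the Span alone
    return [raw_latex(f"\\foreignlanguage{{{name}}}{{"), *inlines, raw_latex("}")]
```

A real filter would walk the whole document and apply this only when
the output format is LaTeX, leaving the Span untouched for writers
that understand it natively.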

>-   Special syntax for divisions, because <div> and <span> are XML
>    elements that don’t work in LaTeX or ConTeXt.

The *syntax* `<div>` and `<span>` is not XML-specific -- it
creates special elements in the Pandoc AST.  It's just that
most formats don't yet define special meanings for these.
It's not clear that they should, because different people
may have different needs. Filters are your friend.

>-   As already discussed for other purposes, it would be great if all
>    elements in Markdown could enjoy attributes. As far as I can
>    remember from the thread "Release road map - 1.14 and beyond", this
>    was an accepted feature.

No, we'd only discussed attributes on images.

>    https://github.com/jgm/pandoc/issues/1614#issuecomment-60476904 and
>    https://github.com/jgm/pandoc/issues/1667#issuecomment-69243554.
>
>    Just to be sure, YAML language metadata and *Markdown* should use
>    ISO-639 language codes. The conversion is only needed to read from
>    and write to LaTeX or ConTeXt.

Yes.  I think this should definitely be done.

>2.  How about enabling the lang attribute in ConTeXt, docx and odf (as
>    discussed in https://github.com/jgm/pandoc/issues/1667).

Yes.

>3.  Would it be possible to implement the :lang special attribute syntax
>    as in https://github.com/jgm/pandoc/issues/895?

Not sure about this.  It would require very extensive
changes.

>4.  Could we have in version 1.15 special syntax for divisions and spans
>    (https://github.com/jgm/pandoc/issues/168) so that they could be
>    used beyond HTML output formats?

They already can be.  `<div>` and `<span>` in Markdown
create special Div and Span elements, which are not HTML
specific, though in most writers special meanings have not
been defined for them.

>5.  Would it be possible that version 1.15 grants attributes to all
>    elements (https://github.com/jgm/pandoc/issues/1966)?

This is a possibility, but it's a huge change and requires
many issues to be resolved.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/20150604223339.GH14696%40localhost.t-mobile.de.
For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]         ` <CAFVhNZOk1-Sm-fAD2_W9q9nefpAwzkK5+4KOFB1cqiZD-HoO6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-06-04 19:43           ` BP Jonsson
@ 2015-06-05  5:19           ` juh
  1 sibling, 0 replies; 25+ messages in thread
From: juh @ 2015-06-05  5:19 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I would second this. Let's stick to the most common standard.

On 04.06.15 at 20:52, Paulo Ney de Souza wrote:
> There is a broad effort to uniformize the way LaTeX deals with LANG, and
> not only that, also the SCRIPT that these languages are written and
> their LOCALE. This is one of the reasons the ISO codes have been created
> -- to make it easier to move data in between applications... and we
> should stick to it to the letter. The problem is complex and there are
> commonly used languages in the world that are written in 4 (or more)
> different scripts, so it is nice to use an ISO code instead of
> referring to a language by


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]                     ` <5570BCA8.1010102-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-06-05 16:45                       ` Pablo Rodríguez
       [not found]                         ` <5571D20C.1000606-S0/GAf8tV78@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Pablo Rodríguez @ 2015-06-05 16:45 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 06/04/2015 11:01 PM, BP Jonsson wrote:
> On 2015-06-04 22:01, Pablo Rodríguez wrote:
>> Sorry, BPJ, I write and typeset documents with languages I understand.
>> In fact, I know a lot more ISO-639 codes than languages I can write (or
>> even languages I include in documents I digitally edit).
> 
> Just to be clear: I'm not opposed to standardized codes -- that
> would be stupid -- but to the urge to truncate them beyond what
> is a reasonable linguistic token for a human.

I’m fine with the language codes from ISO-639 (although I don’t use them
all). At least, I haven’t thought much about them.

I guess the same objections could be raised against country codes in
internet top-level domains.

I’m afraid that the criteria for truncation “beyond what is a reasonable
linguistic token for a human” may differ for each individual.

> If I ever need to write about Kazakh I need to learn to spell it
> anyway, so why burden human memories with these truncations?

Sorry, but even if English is the most spoken language in the world, not
everyone speaks it.

The trouble with full names is that you have to learn them in a foreign
language.

This may be totally unproblematic for you, but don’t forget it may
impose a burden on many people who speak and write in their mother
tongue only.

> An example where they mostly have done it right are
> Unicode character names, where they have resisted the temptation
> to truncate. They could conceivably have truncated parts of the
> names like the names of scripts, but thankfully they
> didn't.

I don’t think the example is fair: these names are basically
descriptions, since you may invoke them by number (such as in &#119334;).

Language codes aren’t descriptions, but codes to invoke them.

I think language codes are worth using because they are standards.
And they are handy because they are short.

Just in case it helps,

Pablo
-- 
http://www.ousia.tk


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]                         ` <5571D20C.1000606-S0/GAf8tV78@public.gmane.org>
@ 2015-06-05 17:35                           ` Paulo Ney de Souza
  0 siblings, 0 replies; 25+ messages in thread
From: Paulo Ney de Souza @ 2015-06-05 17:35 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


I am sorry Pablo, but I can't let this one squeak by without correcting the
record. English is NOT the most spoken language in the world -- and it is
not even close, even considering native + second language speakers.

Chinese: 1.3 B
English: 508 M

If you consider native speakers only Hindi surpasses English and leaves it
third. The only way to get it to second is to consider English a dialect of
German, which many do, but that is how far up it gets ...

More details here:

    http://www.nationsonline.org/oneworld/most_spoken_languages.htm

Paulo Ney



On Fri, Jun 5, 2015 at 9:45 AM, Pablo Rodríguez <oinos-S0/GAf8tV78@public.gmane.org> wrote:


> Sorry, but even if English is the most spoken language in the world, not
> everyone speaks it.
>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]     ` <20150604223339.GH14696-bi+AKbBUZKZDXvCVYMAFSHgRjnEvrCe7@public.gmane.org>
@ 2015-06-05 18:10       ` Pablo Rodríguez
  0 siblings, 0 replies; 25+ messages in thread
From: Pablo Rodríguez @ 2015-06-05 18:10 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 06/05/2015 12:33 AM, John MacFarlane wrote:
> +++ Pablo Rodríguez [Jun 03 15 23:39 ]:
>>
>> Language identifiers are different in XML and in LaTeX (there are some
>> differences in ConTeXt also). For proper language handling, we need that
>> pandoc converts values when writing to and reading from LaTeX (or
>> ConTeXt).
> 
> Yes, I agree, this should be done soon.  We should have a
> single standard `lang` or `language` metadata field that
> takes ISO-639 values, which get converted where necessary
> (e.g. for LaTeX polyglossia and babel).

John,

since this could be done also in templates, I think I have a way to do
it in ConTeXt.

I have done a merge request in pandoc-templates to fix this in ConTeXt
(https://github.com/jgm/pandoc-templates/pull/101).

>> -   Special syntax for divisions, because <div> and <span> are XML
>>    elements that don’t work in LaTeX or ConTeXt.
> 
> The *syntax* `<div>` and `<span>` is not XML-specific -- it
> creates special elements in the Pandoc AST.  It's just that
> most formats don't yet define special meanings for these.
> It's not clear that they should, because different people
> may have different needs. Filters are your friend.

Well, pandoc may use them for other purposes, but both <div> and <span>
are XML elements.

And these are the only two XML elements that Markdown uses without
treating them as raw HTML code.

I’m not asking for special meanings in divs (or at least not with the
element), I’m only asking for the issue discussed at
https://github.com/jgm/pandoc/issues/168.

The point is only that users shouldn’t have to write the HTML tags (as
discussed in the issue cited above).

>> -   As already discussed for other purposes, it would be great if all
>>    elements in Markdown could enjoy attributes. As far as I can
>>    remember from the thread "Release road map - 1.14 and beyond", this
>>    was an accepted feature.
> 
> No, we'd only discussed attributes on images.

I thought attributes for all elements was an accepted feature, but it
was only mentioned (sorry, no good anchors in Google Groups):

https://groups.google.com/forum/#!msg/pandoc-discuss/650Exs1sCrE/bl1MTuUpH1wJ
https://groups.google.com/forum/#!msg/pandoc-discuss/650Exs1sCrE/OhwMaKYK1kkJ

>> 3.  Would it be possible to implement the :lang special attribute syntax
>>    as in https://github.com/jgm/pandoc/issues/895?
> 
> Not sure about this.  It would require very extensive
> changes.
> 
>> 4.  Could we have in version 1.15 special syntax for divisions and spans
>>    (https://github.com/jgm/pandoc/issues/168) so that they could be
>>    used beyond HTML output formats?
> 
> They already can be.  `<div>` and `<span>` in Markdown
> create special Div and Span elements, which are not HTML
> specific, though in most writers special meanings have not
> been defined for them.
> 
>> 5.  Would it be possible that version 1.15 grants attributes to all
>>    elements (https://github.com/jgm/pandoc/issues/1966)?
> 
> This is a possibility, but it's a huge change and requires
> many issues to be resolved.

My questions aimed to clarify the language situation in pandoc.

Right now, I think we even have issues with the YAML metadata fields for
languages. (But this is a topic for another thread.)

Many thanks for your help,


Pablo
-- 
http://www.ousia.tk



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found] ` <556F7423.1060600-S0/GAf8tV78@public.gmane.org>
  2015-06-04 16:37   ` BP Jonsson
  2015-06-04 22:33   ` John MacFarlane
@ 2015-08-18 16:06   ` John MacFarlane
       [not found]     ` <68e1c2de-81fc-47f7-ba9e-572741381556-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2 siblings, 1 reply; 25+ messages in thread
From: John MacFarlane @ 2015-08-18 16:06 UTC (permalink / raw)
  To: pandoc-discuss



See https://github.com/jgm/pandoc/issues/2366 - an issue proposing a 
unified treatment of `lang`.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]     ` <68e1c2de-81fc-47f7-ba9e-572741381556-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-12-05 12:30       ` correio-/+iEWPDoN2yHYLhqrb4AQA
       [not found]         ` <57d603b7-ceba-4c5b-bccd-d71dcb63c79b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2015-12-05 12:30 UTC (permalink / raw)
  To: pandoc-discuss


Dear fellows,

I'm using Pandoc Markdown to PDF via LaTeX.
I hadn't updated Pandoc in a while. Yesterday I did so.
I was using the -V lang=brazil option before. It has been working fine for the
past two years, and it doesn't anymore: the language was not set.
As far as I understood from this issue, I had to change it
to pt-BR.

I did so, and if I export to a TeX file, I can see that the
“brazilian” language is now set everywhere (document class, babel and so on).
That should work fine, and it does if I use pdflatex to convert TeX to PDF.

However, when I try to convert straight from Pandoc Markdown to PDF, or
even from TeX to PDF using Pandoc, all the elements (captions, chapter
prefixes and so on) come out in English.

Is there a new variable to set now anywhere?

All the best,
Danilo 

On Tuesday, August 18, 2015 at 1:06:34 PM UTC-3, John MacFarlane wrote:
>
> See https://github.com/jgm/pandoc/issues/2366 - an issue proposing a 
> unified treatment of `lang`.
>
>
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]         ` <57d603b7-ceba-4c5b-bccd-d71dcb63c79b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-12-05 22:04           ` mb21
       [not found]             ` <f3d7ce8e-f18c-4c50-b3be-bf5b9bdc802e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2015-12-06 16:28           ` John MacFarlane
  1 sibling, 1 reply; 25+ messages in thread
From: mb21 @ 2015-12-05 22:04 UTC (permalink / raw)
  To: pandoc-discuss



Can you post the exact command to generate the .tex that works, and the 
exact command to generate the PDF that doesn't work?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]             ` <f3d7ce8e-f18c-4c50-b3be-bf5b9bdc802e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-12-06 14:15               ` correio-/+iEWPDoN2yHYLhqrb4AQA
       [not found]                 ` <b7a3240e-cf38-4068-bfc0-17df08f5d373-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2015-12-06 14:15 UTC (permalink / raw)
  To: pandoc-discuss



Thank you for your attention,
I have made some further testing, and that's what I found out:

1. Pandoc is converting -V lang=pt to “portuguese” and -V lang=pt-BR to 
"brazilian".

2. The latex engine pdflatex understands "portuguese" and "brazilian" in 
document class options, babel options, and set main language.

3. **The latex engine xelatex does NOT understand "portuguese" and 
"brazilian" as languages**, it **does** understand however "portuges" and 
"brazil" – also understood by pdflatex.

Therefore, the problem is that the language names pandoc now emits
for both variants of Portuguese (pt and pt-BR) are not understood by all
LaTeX engines. Was that choice based on a standard list? Where is it? If
pandoc simply went back to "portuges" and "brazil", it would probably be
simpler…
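
The findings above can be put into a small lookup, sketched here in Python. The two tables contain only the names reported in this message; the function name and the idea of choosing by engine are assumptions for illustration and do not describe how pandoc itself picks a name.

```python
# Names pandoc emitted for these tags, as reported above.
PANDOC_NAMES = {"pt": "portuguese", "pt-BR": "brazilian"}

# Older names that, per the same report, both pdflatex and xelatex accepted.
LEGACY_NAMES = {"pt": "portuges", "pt-BR": "brazil"}


def latex_language_name(tag, engine):
    """Pick a LaTeX language name for a tag, given the engine in use.

    Returns None for tags that are not in the tables.
    """
    if engine == "xelatex":
        # Prefer the older name, which xelatex was reported to understand.
        return LEGACY_NAMES.get(tag, PANDOC_NAMES.get(tag))
    return PANDOC_NAMES.get(tag)


print(latex_language_name("pt-BR", "pdflatex"))
print(latex_language_name("pt-BR", "xelatex"))
```

A table like this only papers over the mismatch in a custom template or wrapper script; the lasting fix is for the LaTeX packages and pandoc to agree on one set of names.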

Should I post it as a bug in github?

All the best,

Danilo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]                 ` <b7a3240e-cf38-4068-bfc0-17df08f5d373-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-12-06 14:30                   ` correio-/+iEWPDoN2yHYLhqrb4AQA
  0 siblings, 0 replies; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2015-12-06 14:30 UTC (permalink / raw)
  To: pandoc-discuss



As for mb21's question

1. With pandoc I tried both:

pandoc -s -S --normalize -V documentclass=extarticle -V lang=pt-BR -V classoption=A4paper -o filename.pdf filename.txt

[LaTeX engine pdflatex]

and

pandoc -s -S --normalize --latex-engine=xelatex -V documentclass=extarticle -V lang=pt-BR -V classoption=A4paper -o filename.pdf filename.txt

[only this works for me, because I have many special UTF-8 characters
that pdflatex does not read].

2. Then, for testing, I tried:

pandoc -s -S --normalize --latex-engine=xelatex -V documentclass=extarticle -V lang=pt -V classoption=A4paper -o filename.tex filename.txt

and

pandoc -s -S --normalize --latex-engine=xelatex -V documentclass=extarticle -V lang=pt-BR -V classoption=A4paper -o filename.tex filename.txt

3. And finally both:

pdflatex filename.tex

and

xelatex filename.tex

Thank you for your attention.

All the best,
Danilo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]         ` <57d603b7-ceba-4c5b-bccd-d71dcb63c79b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2015-12-05 22:04           ` mb21
@ 2015-12-06 16:28           ` John MacFarlane
       [not found]             ` <20151206162857.GC45405-jF64zX8BO091tJRe0FUodcM6rOWSkUom@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: John MacFarlane @ 2015-12-06 16:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ correio-/+iEWPDoN2yHYLhqrb4AQA@public.gmane.org [Dec 05 15 04:30 ]:
>   Dear fellows,
>   I'm using Pandoc-markdown to PDF via Latex.
>   I hadn't updated Pandoc in a while. Yesterday I did so.
>   I was using -V lang=brazil option before. It has been working fine for
>   the past two years. And it doesn't anymore. Language was not set.
>   As far as I understood, accordingly to this issue here, I had to change
>   it to pt-BR.
>   I did so, and if I export it to a TEX file, I can see that now the
>   “brazilian” language is set everywhere (document class, babel and so
>   on). That should work fine, and it does if I use pdflatex to convert
>   TEX to PDF.
>   However, when I try to convert it straight from pandoc-markdown to PDF,
>   or even from TEX to PDF using Pandoc, all the elements (caption,
>   chapter prefixes and so on) come in English.
>   Is there a new variable to set now anywhere?

There have been quite a few changes to the templates
(related to the language changes), so if you're using a
custom template, you'll need to merge these changes or
start anew with the default.

I realize there's some pain in this change, but it's going
to be much better going forward, since you can now use the
same language codes in every format.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]             ` <20151206162857.GC45405-jF64zX8BO091tJRe0FUodcM6rOWSkUom@public.gmane.org>
@ 2015-12-06 17:38               ` Paulo Ney de Souza
       [not found]                 ` <CAFVhNZM8fnPUyieYjLHZUe_d2XRx_y4kP3t4z5EFk=sngKuOVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Paulo Ney de Souza @ 2015-12-06 17:38 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Both Babel and Polyglossia are being changed and the only tags to be used
moving forward should be:

pt --> for generic Portuguese
pt-PT --> for specific Portuguese from Portugal
pt-BR --> for specific Portuguese from Brazil
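
As a rough illustration of how tags of this shape break down, the Python sketch below splits a tag into its language part and an optional region part. It is deliberately simplified: it assumes the region is a two-letter or three-digit subtag and ignores everything else a full language tag may carry. The function name is made up for this example.

```python
def split_tag(tag):
    """Split a tag such as "pt-BR" into (language, region).

    The region is None when the tag names the generic language only.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    region = None
    for part in parts[1:]:
        # Simplification: a region is two letters or three digits.
        if (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
            break
    return language, region


for tag in ("pt", "pt-PT", "pt-BR"):
    print(tag, split_tag(tag))
```

With the language and region separated, a converter can fall back from a regional variant to the generic language when the output format knows only the latter.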

Paulo Ney




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: dealing with languages in pandoc
       [not found]                 ` <CAFVhNZM8fnPUyieYjLHZUe_d2XRx_y4kP3t4z5EFk=sngKuOVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-12-07  0:40                   ` correio-/+iEWPDoN2yHYLhqrb4AQA
       [not found]                     ` <7359d875-67d2-479f-86d4-b79c10ad3c9d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-02-14 23:05                   ` Italo VEGA
  1 sibling, 1 reply; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2015-12-07  0:40 UTC (permalink / raw)
  To: pandoc-discuss



Dear John and Paulo,

Thank you for your explanations.
Let's wait for polyglossia (since I use XeTeX) to adopt the current
language code standards, then.
Until then, as John suggested, I can handle it with a custom LaTeX template.

All the best,
Danilo



* Re: dealing with languages in pandoc
       [not found]                     ` <7359d875-67d2-479f-86d4-b79c10ad3c9d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-02-03 19:38                       ` correio-/+iEWPDoN2yHYLhqrb4AQA
       [not found]                         ` <a2cba45f-ad78-4afd-be64-fcbcd3a269e8-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2016-02-03 19:38 UTC (permalink / raw)
  To: pandoc-discuss



Dear fellows,

Has something changed in the last update in the way pandoc-citeproc
handles languages?
The "complementary" text in the bibliography is no longer translated into the
document language (e.g. author1 *and* author2; *edited by*, etc.).
Documents that previously exported fine now have this issue.

Is there something else that needs to be set up now?

All the best,
Danilo




* Re: dealing with languages in pandoc
       [not found]                         ` <a2cba45f-ad78-4afd-be64-fcbcd3a269e8-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-02-03 23:47                           ` correio-/+iEWPDoN2yHYLhqrb4AQA
  0 siblings, 0 replies; 25+ messages in thread
From: correio-/+iEWPDoN2yHYLhqrb4AQA @ 2016-02-03 23:47 UTC (permalink / raw)
  To: pandoc-discuss



OK, it was a locale issue.
Passing --metadata locale=xx-XX solved it.
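
For anyone who prefers to keep this in the document rather than on the
command line, the same field can presumably go in the YAML metadata block.
This is only a sketch: pt-BR is an assumed value standing in for the xx-XX
placeholder, and it assumes pandoc-citeproc reads `locale` from document
metadata the same way it reads it from --metadata.

```yaml
---
# Assumption: pt-BR replaces the xx-XX placeholder; use your own locale.
# Assumption: this is equivalent to passing --metadata locale=pt-BR.
locale: pt-BR
---
```

A value given with --metadata on the command line takes precedence over the
same field in the document.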

Sorry for the question. 

All the best



* Re: dealing with languages in pandoc
       [not found]                 ` <CAFVhNZM8fnPUyieYjLHZUe_d2XRx_y4kP3t4z5EFk=sngKuOVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-12-07  0:40                   ` correio-/+iEWPDoN2yHYLhqrb4AQA
@ 2016-02-14 23:05                   ` Italo VEGA
       [not found]                     ` <b1981fc3-c61c-45f3-a0e9-bf6350eb525b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Italo VEGA @ 2016-02-14 23:05 UTC (permalink / raw)
  To: pandoc-discuss



Dear Paulo,

This is what I did for use with pandoc and XeLaTeX:

---
lang: brazil
polyglossia-lang:
    name: brazil
---



* Re: dealing with languages in pandoc
       [not found]                     ` <b1981fc3-c61c-45f3-a0e9-bf6350eb525b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-02-16 12:47                       ` mb21
  0 siblings, 0 replies; 25+ messages in thread
From: mb21 @ 2016-02-16 12:47 UTC (permalink / raw)
  To: pandoc-discuss



You shouldn't have to set `polyglossia-lang` etc. manually, only `lang`.
See http://pandoc.org/README.html#language-variables
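
As a minimal sketch of what that means for the Brazilian Portuguese case
discussed earlier in this thread, the metadata block would reduce to a single
field. The pt-BR tag is the one listed earlier in the thread; that the default
LaTeX template derives the babel and polyglossia settings from it is an
assumption based on the README section linked above, and a custom template
may behave differently.

```yaml
---
# Assumption: the default template maps this tag to the babel/polyglossia
# options, so no polyglossia-lang (or similar) entry is set by hand.
lang: pt-BR
---
```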



end of thread, other threads:[~2016-02-16 12:47 UTC | newest]

Thread overview: 25+ messages
2015-06-03 21:39 dealing with languages in pandoc Pablo Rodríguez
     [not found] ` <556F7423.1060600-S0/GAf8tV78@public.gmane.org>
2015-06-04 16:37   ` BP Jonsson
     [not found]     ` <55707EC3.3030909-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-06-04 18:52       ` Paulo Ney de Souza
     [not found]         ` <CAFVhNZOk1-Sm-fAD2_W9q9nefpAwzkK5+4KOFB1cqiZD-HoO6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-06-04 19:43           ` BP Jonsson
     [not found]             ` <5570AA57.8000107-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-06-04 20:01               ` Pablo Rodríguez
     [not found]                 ` <5570AE8C.3070004-S0/GAf8tV78@public.gmane.org>
2015-06-04 21:01                   ` BP Jonsson
     [not found]                     ` <5570BCA8.1010102-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-06-05 16:45                       ` Pablo Rodríguez
     [not found]                         ` <5571D20C.1000606-S0/GAf8tV78@public.gmane.org>
2015-06-05 17:35                           ` Paulo Ney de Souza
2015-06-04 21:10               ` Paulo Ney de Souza
2015-06-05  5:19           ` juh
2015-06-04 19:30       ` Pablo Rodríguez
2015-06-04 22:33   ` John MacFarlane
     [not found]     ` <20150604223339.GH14696-bi+AKbBUZKZDXvCVYMAFSHgRjnEvrCe7@public.gmane.org>
2015-06-05 18:10       ` Pablo Rodríguez
2015-08-18 16:06   ` John MacFarlane
     [not found]     ` <68e1c2de-81fc-47f7-ba9e-572741381556-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-12-05 12:30       ` correio-/+iEWPDoN2yHYLhqrb4AQA
     [not found]         ` <57d603b7-ceba-4c5b-bccd-d71dcb63c79b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-12-05 22:04           ` mb21
     [not found]             ` <f3d7ce8e-f18c-4c50-b3be-bf5b9bdc802e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-12-06 14:15               ` correio-/+iEWPDoN2yHYLhqrb4AQA
     [not found]                 ` <b7a3240e-cf38-4068-bfc0-17df08f5d373-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2015-12-06 14:30                   ` correio-/+iEWPDoN2yHYLhqrb4AQA
2015-12-06 16:28           ` John MacFarlane
     [not found]             ` <20151206162857.GC45405-jF64zX8BO091tJRe0FUodcM6rOWSkUom@public.gmane.org>
2015-12-06 17:38               ` Paulo Ney de Souza
     [not found]                 ` <CAFVhNZM8fnPUyieYjLHZUe_d2XRx_y4kP3t4z5EFk=sngKuOVA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-12-07  0:40                   ` correio-/+iEWPDoN2yHYLhqrb4AQA
     [not found]                     ` <7359d875-67d2-479f-86d4-b79c10ad3c9d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-02-03 19:38                       ` correio-/+iEWPDoN2yHYLhqrb4AQA
     [not found]                         ` <a2cba45f-ad78-4afd-be64-fcbcd3a269e8-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-02-03 23:47                           ` correio-/+iEWPDoN2yHYLhqrb4AQA
2016-02-14 23:05                   ` Italo VEGA
     [not found]                     ` <b1981fc3-c61c-45f3-a0e9-bf6350eb525b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-02-16 12:47                       ` mb21
