ntg-context - mailing list for ConTeXt users
From: "Ivan Pešić via ntg-context" <ntg-context@ntg.nl>
To: ntg-context@ntg.nl
Cc: "Ivan Pešić" <ivan.pesic@gmail.com>
Subject: Transliteration
Date: Thu, 3 Feb 2022 23:15:28 +0400
Message-ID: <c0e09d5f-0198-8755-e730-922f5a0b3898@gmail.com>


Hello!
I have been working on a Serbian book that I had to transliterate from
Cyrillic to Latin.
Transliteration support has improved nicely, and I would like to
propose a small change.
Neither of the current transliteration mechanisms (the built-in one and
the third-party module by Philipp Gesang) handles one peculiarity:
Љ, Њ and Џ are transliterated as Lj, Nj and Dž in ordinary words at the
start of a sentence and in names that normally start with a capital
letter, but in titles written in all capitals they should be
transliterated as LJ, NJ and DŽ.
My quick solution was to update the current mapping vector and add a
second one (attached) that maps the Cyrillic capitals to LJ, NJ and DŽ.
Both vectors now contain exactly the 30 letters used in Serbian.
Selecting the right mapping for all-capitals text takes a bit more
manual work, but it works.
I have also merged the Serbian hyphenation patterns, so there is no need
to switch the language to get hyphenation in transliterated text.
This is possible because the Cyrillic and Latin scripts use different
code points, so the patterns do not conflict.
I therefore suggest merging the Serbian Cyrillic and Latin patterns.
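
The capitalisation rule described above could in principle be applied
per word rather than by switching vectors by hand. The following is a
minimal sketch of that idea in plain Lua (5.3 or later, for
utf8.charpattern), not a description of how ConTeXt's transliteration
works. The function names and the heuristic (a word of two or more
letters with no lowercase letter counts as all capitals) are assumptions
made for illustration.

```lua
-- Sketch only: pick Lj/LJ per word. Names and heuristic are hypothetical.
local function split(s)
    local t = {}
    for token in s:gmatch("%S+") do t[#t + 1] = token end
    return t
end

-- The 30 Serbian letters, in parallel Cyrillic/Latin lists.
local upper_cyr = split("А Б В Г Д Ђ Е Ж З И Ј К Л Љ М Н Њ О П Р С Т Ћ У Ф Х Ц Ч Џ Ш")
local upper_lat = split("A B V G D Đ E Ž Z I J K L Lj M N Nj O P R S T Ć U F H C Č Dž Š")
local lower_cyr = split("а б в г д ђ е ж з и ј к л љ м н њ о п р с т ћ у ф х ц ч џ ш")
local lower_lat = split("a b v g d đ e ž z i j k l lj m n nj o p r s t ć u f h c č dž š")

local mapping, is_lower = {}, {}
for i = 1, #upper_cyr do
    mapping[upper_cyr[i]] = upper_lat[i]
    mapping[lower_cyr[i]] = lower_lat[i]
    is_lower[lower_cyr[i]] = true
end

-- Replacements used only inside all-capitals words.
local caps_digraph = { ["Љ"] = "LJ", ["Њ"] = "NJ", ["Џ"] = "DŽ" }

local function transliterate(text)
    local out, word = {}, {}

    -- Emit the collected word, choosing the digraph form for the whole word.
    local function flush()
        if #word == 0 then return end
        local all_caps = #word > 1
        for _, c in ipairs(word) do
            if is_lower[c] then all_caps = false; break end
        end
        for _, c in ipairs(word) do
            out[#out + 1] = (all_caps and caps_digraph[c]) or mapping[c]
        end
        word = {}
    end

    for c in text:gmatch(utf8.charpattern) do
        if mapping[c] then
            word[#word + 1] = c      -- Serbian Cyrillic letter: extend the word
        else
            flush()                  -- anything else ends the word
            out[#out + 1] = c
        end
    end
    flush()
    return table.concat(out)
end

print(transliterate("Љубав"))   -- Ljubav
print(transliterate("ЉУБАВ"))   -- LJUBAV
```

A one-letter word such as a lone Џ cannot be classified this way and
falls back to Dž, so some manual control would still be needed.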

There is another issue when one wants a drop cap, with the rest of the
first word and several following words typeset in small caps.
If the first letter is Љ (or one of the other two letters that
transliterate to digraphs), the second letter of the digraph is not
typeset in small caps, because it is injected before the group that
turns on small caps.
For example:

    \placeinitial
    Љ{\sc уди нису знали}

but this is quite a special case...
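
The effect can be reproduced at the string level. The sketch below is an
illustration only: ConTeXt transliterates inside the typesetting engine,
not on source strings, and both function names are hypothetical. It
shows that a per-character replacement leaves the "j" in front of the
small-caps group, and how the tail of the digraph could be moved into
the group if the text were transliterated before typesetting. It needs
Lua 5.3 or later for utf8.charpattern.

```lua
-- Only the letters needed for the example; the full vector has 30 pairs.
local mapping = {
    ["Љ"] = "Lj", ["у"] = "u", ["д"] = "d", ["и"] = "i", ["н"] = "n",
    ["с"] = "s",  ["з"] = "z", ["а"] = "a", ["л"] = "l",
}

-- Replace each UTF-8 character that has an entry; keep everything else.
local function transliterate(text)
    return (text:gsub(utf8.charpattern, mapping))
end

-- If the text starts with a capital digraph followed by "{\sc ",
-- move the second letter of the digraph inside that group.
local function pull_into_group(text)
    for _, digraph in ipairs({ "Lj", "Nj", "Dž" }) do
        local head = digraph:sub(1, 1)   -- "L", "N" or "D" (one byte each)
        local tail = digraph:sub(2)      -- "j" or "ž"
        local prefix = digraph .. "{\\sc "
        if text:sub(1, #prefix) == prefix then
            return head .. "{\\sc " .. tail .. text:sub(#prefix + 1)
        end
    end
    return text
end

local naive = transliterate("Љ{\\sc уди нису знали}")
print(naive)                    -- Lj{\sc udi nisu znali}
print(pull_into_group(naive))   -- L{\sc judi nisu znali}
```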

Regards,
Ivan

[-- Attachment #2: lang-imp-serbian.lua --]
[-- Type: text/plain, Size: 2464 bytes --]

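-- Serbian Cyrillic-to-Latin transliteration vectors.
return {
  transliterations = {
    -- "c2l": running text; capital Љ, Њ, Џ become Lj, Nj, Dž.
    ["c2l"] = {
        mapping = {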
        ["А"] = "A",  ["а"] = "a",
        ["Б"] = "B",  ["б"] = "b",
        ["В"] = "V",  ["в"] = "v",
        ["Г"] = "G",  ["г"] = "g",
        ["Д"] = "D",  ["д"] = "d",
        ["Ђ"] = "Đ",  ["ђ"] = "đ",
        ["Е"] = "E",  ["е"] = "e",
        ["Ж"] = "Ž",  ["ж"] = "ž",
        ["З"] = "Z",  ["з"] = "z",
        ["И"] = "I",  ["и"] = "i",
        ["Ј"] = "J",  ["ј"] = "j",
        ["К"] = "K",  ["к"] = "k",
        ["Л"] = "L",  ["л"] = "l",
        ["Љ"] = "Lj",  ["љ"] = "lj",
        ["М"] = "M",  ["м"] = "m",
        ["Н"] = "N",  ["н"] = "n",
        ["Њ"] = "Nj",  ["њ"] = "nj",
        ["О"] = "O",  ["о"] = "o",
        ["П"] = "P",  ["п"] = "p",
        ["Р"] = "R",  ["р"] = "r",
        ["С"] = "S",  ["с"] = "s",
        ["Т"] = "T", ["т"] = "t",
        ["Ћ"] = "Ć",  ["ћ"] = "ć",
        ["У"] = "U",  ["у"] = "u",
        ["Ф"] = "F",  ["ф"] = "f",
        ["Х"] = "H", ["х"] = "h",
        ["Ц"] = "C",  ["ц"] = "c",
        ["Ч"] = "Č",  ["ч"] = "č",
        ["Џ"] = "Dž", ["џ"] = "dž",
        ["Ш"] = "Š", ["ш"] = "š",
        }
    },
    -- "C2L": all-capitals text; capital Љ, Њ, Џ become LJ, NJ, DŽ.
    ["C2L"] = {
        mapping = {
        ["А"] = "A",  ["а"] = "a",
        ["Б"] = "B",  ["б"] = "b",
        ["В"] = "V",  ["в"] = "v",
        ["Г"] = "G",  ["г"] = "g",
        ["Д"] = "D",  ["д"] = "d",
        ["Ђ"] = "Đ",  ["ђ"] = "đ",
        ["Е"] = "E",  ["е"] = "e",
        ["Ж"] = "Ž",  ["ж"] = "ž",
        ["З"] = "Z",  ["з"] = "z",
        ["И"] = "I",  ["и"] = "i",
        ["Ј"] = "J",  ["ј"] = "j",
        ["К"] = "K",  ["к"] = "k",
        ["Л"] = "L",  ["л"] = "l",
        ["Љ"] = "LJ",  ["љ"] = "lj",
        ["М"] = "M",  ["м"] = "m",
        ["Н"] = "N",  ["н"] = "n",
        ["Њ"] = "NJ",  ["њ"] = "nj",
        ["О"] = "O",  ["о"] = "o",
        ["П"] = "P",  ["п"] = "p",
        ["Р"] = "R",  ["р"] = "r",
        ["С"] = "S",  ["с"] = "s",
        ["Т"] = "T", ["т"] = "t",
        ["Ћ"] = "Ć",  ["ћ"] = "ć",
        ["У"] = "U",  ["у"] = "u",
        ["Ф"] = "F",  ["ф"] = "f",
        ["Х"] = "H", ["х"] = "h",
        ["Ц"] = "C",  ["ц"] = "c",
        ["Ч"] = "Č",  ["ч"] = "č",
        ["Џ"] = "DŽ", ["џ"] = "dž",
        ["Ш"] = "Š", ["ш"] = "š",
        }
    }
  }
}
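
The two vectors in the attachment differ only in the three capital
digraph letters. As a possible alternative to listing all entries twice,
the second vector could be derived from the first. This is a sketch with
an abbreviated base table and a hypothetical helper named derive; a real
file would keep all 30 letter pairs and end with "return serbian".

```lua
-- Abbreviated base vector; the attachment lists all 30 letter pairs.
local c2l = {
    ["Л"] = "L",  ["л"] = "l",
    ["Љ"] = "Lj", ["љ"] = "lj",
    ["Н"] = "N",  ["н"] = "n",
    ["Њ"] = "Nj", ["њ"] = "nj",
    ["Џ"] = "Dž", ["џ"] = "dž",
}

-- Shallow-copy the base vector, then apply the overrides on top.
local function derive(base, overrides)
    local t = {}
    for k, v in pairs(base) do t[k] = v end
    for k, v in pairs(overrides) do t[k] = v end
    return t
end

-- Only the capital digraph letters change for all-capitals text.
local C2L = derive(c2l, { ["Љ"] = "LJ", ["Њ"] = "NJ", ["Џ"] = "DŽ" })

-- Same shape as the attached file.
local serbian = {
    transliterations = {
        ["c2l"] = { mapping = c2l },
        ["C2L"] = { mapping = C2L },
    },
}
```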

Thread overview: 4+ messages
2022-02-03 19:15 Ivan Pešić via ntg-context [this message]
2022-02-03 20:41 ` Transliteration Hans Hagen via ntg-context
2022-02-03 21:01   ` Transliteration Mojca Miklavec via ntg-context
2022-02-03 21:11     ` Transliteration Hans Hagen via ntg-context
