On sorting/collation, from https://pandoc.org/installing.html (note that you’ll have to compile pandoc-citeproc yourself; it’s not contained in the precompiled packages):

> By default pandoc-citeproc uses the "i;unicode-casemap" method to sort
> bibliography entries (RFC 5051). If you would like to use the
> locale-sensitive unicode collation algorithm instead, specify the
> `unicode_collation` flag:
>
>     cabal install pandoc-citeproc -funicode_collation
>
> Note that this requires the text-icu library, which in turn depends on
> the C library icu4c. Installation directions vary by platform. Here is
> how it might work on macOS with Homebrew:
>
>     brew install icu4c
>     cabal install --extra-lib-dirs=/usr/local/Cellar/icu4c/51.1/lib \
>       --extra-include-dirs=/usr/local/Cellar/icu4c/51.1/include \
>       -funicode_collation text-icu pandoc-citeproc

Some of these paths no longer seem to be accurate, and I’m using stack rather than cabal, so my current incantation on macOS is:

```
stack install pandoc-citeproc \
  --flag "pandoc-citeproc:unicode_collation" \
  --extra-lib-dirs=/usr/local/opt/icu4c/lib \
  --extra-include-dirs=/usr/local/opt/icu4c/include
```

I’ve been using this for quite a while now, and the resulting collation in, e.g., Danish, German, and Spanish seems to be absolutely correct.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, September 7, 2019 10:10 AM, BPJ wrote:

> Oops, forgot the link:
>
> https://metacpan.org/pod/Unicode::Collate
>
> On Sat, 7 Sep 2019 at 12:05, BPJ wrote:
>
>> I just realized two things which make matters much worse:
>>
>> 1. Not all publications accept the same transliteration schemes. Just by surveying one author's references to his own works in one bibliography I find that his surname, Яблонский, can be transliterated in five different ways (although two predominate)! So I'll need both a `transliterated title` field and a `transliterated authors` field with (in each item) a mapping of alternative transliterations.
>> Even Icelandic needs to be transliterated sometimes, e.g. Þórður becoming Thórdur (with data loss!).
>>
>> 2. Sorting. Latin letters like _č, š, ž_ need to sort as _c, s, z_, and _Þ_ probably must sometimes sort like _Th_ and sometimes after _z_! This sometimes needs tailored, locale-dependent sorting! Accented letters can ideally be handled by entering things in NFD and hoping that sort algorithms ignore combining marks, but then e.g. in Scandinavian languages _ö_ sorts not as _o_ but at the end of the alphabet (ideally _þ, æ, ø, å, ä, ö_ go at the end of the alphabet in that order, but often _æ/ä, ø/ö_ are conflated either before or after _å_!). Anyway, it seems CSL has no customizable sort key field. I know how to handle these things myself with [Unicode::Collate][], but that at least means some postprocessing of as yet unknown complexity.
>>
>> On Wed, 4 Sep 2019 at 09:33, BPJ wrote:
>>
>>> Does anyone know how to handle transliterated titles and names in
>>> citations, when you want to include both the transliteration and the
>>> original? Does CSL have any fields for that?
>>>
>>> TIA,
>>>
>>> /bpj
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To view this discussion on the web visit [https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhDm8bibSjJfrM6W69qM_j1N9tPHEgRwaic4bZmrsB1CVw%40mail.gmail.com](https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhDm8bibSjJfrM6W69qM_j1N9tPHEgRwaic4bZmrsB1CVw%40mail.gmail.com?utm_medium=email&utm_source=footer).
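
For anyone who wants to see what the tailored sorting in point 2 of the quoted message amounts to, here is a minimal sketch in Python. It is not what pandoc-citeproc, ICU, or Unicode::Collate do internally; it only builds a primary-strength sort key under two assumptions taken from that message: _č, š, ž_ and other accented letters sort as their base letters, while _þ, æ, ø, å, ä, ö_ go after _z_ in exactly that order. The names `TAIL_LETTERS` and `sort_key` are made up for illustration.

```python
import unicodedata
from typing import Tuple

# Letters that sort after "z", in this order. The order is an assumption
# taken from the quoted message, not from CLDR or any national standard.
TAIL_LETTERS = "þæøåäö"


def sort_key(text: str) -> Tuple[Tuple[int, int], ...]:
    """Return a primary-strength key: case and most diacritics are ignored,
    but the tail letters keep their identity and sort after 'z'."""
    key = []
    # Compose first, so that "a" + COMBINING RING ABOVE is seen as "å".
    for ch in unicodedata.normalize("NFC", text).lower():
        if ch in TAIL_LETTERS:
            # Rank 1 places these letters after every rank-0 letter.
            key.append((1, TAIL_LETTERS.index(ch)))
            continue
        # Decompose the single character and keep only its base letter,
        # so "č" is keyed as "c" and "ó" as "o".
        base = unicodedata.normalize("NFD", ch)[0]
        if unicodedata.combining(base):
            continue  # a stray combining mark with no base letter
        key.append((0, ord(base)))
    return tuple(key)


names = ["Öberg", "Zola", "Šarić", "Ågren", "Þórður", "Olsen", "Sand"]
print(sorted(names, key=sort_key))
```

With these assumptions the list comes out as Olsen, Sand, Šarić, Zola, Þórður, Ågren, Öberg. Strings that differ only in case or accents get identical keys and keep their input order, and the "sometimes _Þ_ sorts like _Th_" case is not handled at all; a real collator (ICU via the `unicode_collation` flag, or Unicode::Collate with a locale tailoring) covers those levels properly.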