I updated my script to be configurable so that you can try various locales, normalization forms and lists of words with perl/Unicode::Collate::Locale/Unicode::Normalize. Info on the required CPAN modules and perl version is in a comment at the top of the file. After installing the requirements, use the --help option for usage instructions.

On Wed, 7 Apr 2021 at 09:52, BPJ wrote:

> I tried this out with the latest Unicode::Collate::Locale
>
> <https://metacpan.org/pod/release/SADAHIRO/Unicode-Collate-1.29/Collate/Locale.pm>
>
> with all of fr_FR, fr_CA, fr_BE and fr_CH, and with both Normalization Form C and Normalization Form D, and it turns out that fr_CA actually is different!
>
> Locale: fr_FR; getlocale: default
>   Normalization: NFC
>     Sorted: cote coté côte côté
>   Normalization: NFD
>     Sorted: cote coté côte côté
> Locale: fr_CA; getlocale: fr_CA
>   Normalization: NFC
>     Sorted: cote côte coté côté
>   Normalization: NFD
>     Sorted: cote côte coté côté
> Locale: fr_BE; getlocale: default
>   Normalization: NFC
>     Sorted: cote coté côte côté
>   Normalization: NFD
>     Sorted: cote coté côte côté
> Locale: fr_CH; getlocale: default
>   Normalization: NFC
>     Sorted: cote coté côte côté
>   Normalization: NFD
>     Sorted: cote coté côte côté
>
> If you want to try the script you will need to install the Unicode::Collate CPAN distribution first, and perl if you are not on a Unixy system. See:
>
>
> I recommend Strawberry Perl on Windows.
>
> On Wed, 7 Apr 2021 at 01:39, John MacFarlane wrote:
>
>> I just checked my 2006 Le Robert Micro: it has
>>
>> cote < côte < côté
>>
>> coté appears as a subheading of cote, so I'm not sure it's clear from this how it is to be ordered. Not inconsistent with the French Academy anyway.
>>
>> Bastien DUMONT writes:
>>
>> > Hi,
>> >
>> > Honestly, these are such subtleties that, as a native French speaker, I have no precise ideas about it. I would say that accents are only a secondary criterion for sorting (cote < côte < coteau). Actually the Wikipedia page about the French alphabet agrees with that: "diacritics and ligatures are taken into account only at a third level, after the second level (case). [...] In Quebec French diacritics are considered more important than case." (I hope my translation is not too bad.) Unfortunately they give no reference. As for the "last syllable" rule, I have never heard of it, but the French Academy's dictionary online has cote < côte < coté < côté (https://www.dictionnaire-academie.fr/article/A9C4445?history=2). Anyway I guess that it rarely applies. I will check a recent Robert whenever possible (maybe tomorrow): they introduced a lot of changes in 2010.
>> >
>> > The French Association for Normalization produced a standard in 1969 about the sorting of proper names, but it is behind a paywall and I am not sure that it is really in use.
>> >
>> > Cheers,
>> >
>> > Bastien
>> >
>> > On Tuesday 06 April 2021 at 04:42:40PM, 'Nick Bart' via pandoc-discuss wrote:
>> >> Concerning French, I checked a few more sources, and some of them seem to hold different views on French collation: https://fr.wikipedia.org/wiki/Alphabet_fran%C3%A7ais states that diacritics should be disregarded when sorting, except in Quebec French, where accented characters are to appear after their unaccented counterparts. No "last syllable" rule is mentioned at all. In addition, in a printed French dictionary, Le Nouveau Petit Robert (1994), I couldn't find any explicit rules on sorting, but entries are ordered "cote < coté < côte < côté". Hopefully some native speakers of French will chime in here.
>> >>
>> >> As to supporting multiple collations, I tend to think that the default collation (which usually seems to follow the most recent rules for a given language) would usually be sufficient.
>> >>
>> >> --
>> >> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
>> >> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> >> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/lIJvVkf_iXceir6oyQVnvHDTXlTIgech_5Trj2TRBY6uBZ_AnU8ghvMV6not9E_QSwG0BhZJUnHprUcIN8UlAKrUw7DzQF5-ZpIki3TC74Q%3D%40protonmail.com.
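
For anyone who wants to see why the two orderings in this thread differ without installing anything, here is a minimal sketch in Python. It is not the Perl script mentioned above and it does not implement the Unicode Collation Algorithm or the CLDR tailorings. It assumes a simplified two-level model: base letters are compared first, and accents only break ties, read either left to right or right to left. The names `split_accents` and `sort_key` are hypothetical helpers made up for this illustration.

```python
import unicodedata


def split_accents(word):
    """Split a word into base letters and the accents attached to each letter.

    The word is decomposed with NFD first, so NFC and NFD input give the
    same result.
    """
    bases, accents = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and bases:
            # A combining mark belongs to the preceding base letter.
            accents[-1] += (ch,)
        else:
            bases.append(ch.casefold())
            accents.append(())
    return tuple(bases), tuple(accents)


def sort_key(word, backward_accents=False):
    """Simplified key: base letters first, accents only as a tie-breaker.

    With backward_accents=True the accent level is compared from the end
    of the word, so the last accent difference decides (an assumption that
    models the fr_CA result reported in this thread).
    """
    bases, accents = split_accents(word)
    if backward_accents:
        accents = accents[::-1]
    return bases, accents


words = ["côté", "coté", "côte", "cote"]
forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward_accents=True))
print("forward: ", " ".join(forward))
print("backward:", " ".join(backward))
```

Under these assumptions the forward comparison prints "cote coté côte côté" (the order reported above for fr_FR, fr_BE and fr_CH), and the backward comparison prints "cote côte coté côté" (the order reported for fr_CA and used by the French Academy's dictionary). Because accents only break ties, "coteau" still sorts after both "cote" and "côte" in either mode.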