public inbox archive for pandoc-discuss@googlegroups.com
From: BPJ <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Cc: Bastien DUMONT <bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org>
Subject: Re: Error compiling with icu support / possible workaround?
Date: Wed, 7 Apr 2021 09:52:59 +0200
Message-ID: <CADAJKhBpFS7Mq7NriLc8wexqwwLsEy+9OmBiNWbPaMgYKy8jbw@mail.gmail.com>
In-Reply-To: <m2h7kjoueo.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 5481 bytes --]

I tried this out with the latest Unicode::Collate::Locale

<https://metacpan.org/pod/release/SADAHIRO/Unicode-Collate-1.29/Collate/Locale.pm>

using all of fr_FR, fr_CA, fr_BE and fr_CH, each in both Normalization
Form C and Normalization Form D, and it turns out that fr_CA actually is
different!

Locale: fr_FR; getlocale: default
Normalization: NFC
Sorted: cote coté côte côté
Normalization: NFD
Sorted: cote coté côte côté
Locale: fr_CA; getlocale: fr_CA
Normalization: NFC
Sorted: cote côte coté côté
Normalization: NFD
Sorted: cote côte coté côté
Locale: fr_BE; getlocale: default
Normalization: NFC
Sorted: cote coté côte côté
Normalization: NFD
Sorted: cote coté côte côté
Locale: fr_CH; getlocale: default
Normalization: NFC
Sorted: cote coté côte côté
Normalization: NFD
Sorted: cote coté côte côté
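
A possible explanation for the fr_CA result, as a minimal sketch rather
than a statement of what the tailoring does internally: Unicode::Collate
has a backwards option that compares the weights of a given level from the
end of the string, and level 2 is the accent level. Assuming fr_CA switches
on something equivalent, plain Unicode::Collate (no locale at all) with
backwards => 2 should reproduce its order. The variable names are mine.

```perl
#!/usr/bin/env perl
use 5.014;
use strict;
use warnings;
use utf8;                     # the word list below is UTF-8 source text
use open qw[ :std :utf8 ];    # UTF-8 on STDIN/STDOUT/STDERR
use Unicode::Collate;         # core module; no locale tailoring is loaded

my @words = qw[ côté cote côte coté ];

# Default collation: accent differences are compared from left to right.
my $forward = Unicode::Collate->new;

# Assumption: fr_CA behaves like "compare level 2 (accents) backwards",
# so the accent nearest the end of the word decides first.
my $backward = Unicode::Collate->new(backwards => 2);

my @fwd = $forward->sort(@words);
my @bwd = $backward->sort(@words);

say "forward:  @fwd";
say "backward: @bwd";
```

If the assumption holds, the forward line matches the fr_FR order above and
the backward line matches the fr_CA order.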

If you want to try the attached script (french-sorting.pl) you will need to
install the Unicode::Collate CPAN distribution first (the script also loads
utf8::all, likewise from CPAN), and perl itself if you are not on a Unixy
system. See:

<http://www.cpan.org/modules/INSTALL.html>

<https://www.perl.org/get.html>

I recommend Strawberry Perl on Windows.

On Wed, 7 Apr 2021 at 01:39, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:

>
> I just checked my 2006 Le Robert Micro: it has
>
> cote < côte < côté
>
> coté appears as a subheading of cote, so I'm not sure it's
> clear from this how it is to be ordered.  Not inconsistent
> with the French Academy anyway.
>
> Bastien DUMONT <bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org> writes:
>
> > Hi,
> >
> > Honestly, these are such subtleties that, as a native French speaker, I
> > have no precise idea about it. I would say that accents are only a
> > secondary criterion for sorting (cote < côte < coteau). Actually the
> > Wikipedia page about the French alphabet agrees with that: "diacritics and
> > ligatures are taken into account only at a third level, after the second
> > level (case). [...] In Quebec French diacritics are considered more
> > important than case." (I hope my translation is not too bad.) Unfortunately
> > they give no reference. As for the "last syllable" rule, I have never heard
> > of it, but the French Academy's dictionary online has cote < côte < coté <
> > côté (https://www.dictionnaire-academie.fr/article/A9C4445?history=2).
> > Anyway I guess that it rarely applies. I will check a recent Robert
> > whenever possible (maybe tomorrow): they introduced a lot of changes in
> > 2010.
> >
> > The French Association for Normalization produced a standard in 1969 about
> > the sorting of proper names, but it is behind a paywall and I am not sure
> > that it is really in use.
> >
> > Cheers,
> >
> > Bastien
> >
> > On Tuesday 06 April 2021 at 04:42:40PM, 'Nick Bart' via pandoc-discuss wrote:
> >> Concerning French, I checked a few more sources, and some of them seem
> >> to hold different views on French collation:
> >> https://fr.wikipedia.org/wiki/Alphabet_fran%C3%A7ais states that
> >> diacritics should be disregarded when sorting, except in Quebec French,
> >> where accented characters are to appear after their unaccented
> >> counterparts. No "last syllable" rule is mentioned at all. In addition, in
> >> a printed French dictionary, Le Nouveau Petit Robert (1994), I couldn’t
> >> find any explicit rules on sorting, but entries are ordered "cote < coté <
> >> côte < côté". Hopefully some native speakers of French will chime in here.
> >>
> >> As to supporting multiple collations, I tend to think that the default
> >> collation (which usually seems to follow the most recent rules for a given
> >> language) would usually be sufficient.
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Groups "pandoc-discuss" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> >> an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> >> To view this discussion on the web visit
> >> https://groups.google.com/d/msgid/pandoc-discuss/lIJvVkf_iXceir6oyQVnvHDTXlTIgech_5Trj2TRBY6uBZ_AnU8ghvMV6not9E_QSwG0BhZJUnHprUcIN8UlAKrUw7DzQF5-ZpIki3TC74Q%3D%40protonmail.com
> >> .
>

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhBpFS7Mq7NriLc8wexqwwLsEy%2B9OmBiNWbPaMgYKy8jbw%40mail.gmail.com.


[-- Attachment #2: french-sorting.pl --]
[-- Type: text/x-perl, Size: 705 bytes --]

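#!/usr/bin/env perl

use 5.014;
use utf8::all;    # CPAN module: UTF-8 source code and UTF-8 standard streams
use strict;
use warnings;
use warnings FATAL => 'utf8';
use autodie;

# Core-only alternative to utf8::all:
# use utf8;
# use open qw[ :utf8 :std ];

use Unicode::Collate::Locale;
use Unicode::Normalize qw[ NFC NFD ];

# Derive both forms in code so that they cannot be silently re-normalized
# by an editor or mail client on the way.
my @words = qw[ côté cote côte coté ];
my @nfc   = map { NFC($_) } @words;
my @nfd   = map { NFD($_) } @words;

my @locales = qw[ fr_FR fr_CA fr_BE fr_CH ];

my @norms = (
  [ NFC => \@nfc ],
  [ NFD => \@nfd ],
);

for my $locale ( @locales ) {
  my $coll = Unicode::Collate::Locale->new(locale => $locale);
  # getlocale reports the tailoring actually loaded; "default" means none.
  printf "Locale: %s; getlocale: %s\n", $locale, $coll->getlocale;
  for my $norm ( @norms ) {
    my($name, $words) = @$norm;
    say "Normalization: $name";
    my @sorted = $coll->sort(@$words);
    say "Sorted: @sorted";
  }
}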
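
Separately from the attached script, the quoted remark that accents are only
a secondary criterion (cote < côte < coteau) can be checked with the level
option of Unicode::Collate. This is only a sketch using the default
collation without any locale tailoring; the variable names are mine.

```perl
#!/usr/bin/env perl
use 5.014;
use strict;
use warnings;
use utf8;                     # the words below are UTF-8 source text
use open qw[ :std :utf8 ];    # UTF-8 on STDIN/STDOUT/STDERR
use Unicode::Collate;         # core module

# level => 1 compares base letters only, ignoring accents and case.
my $primary = Unicode::Collate->new(level => 1);

# Default: all levels, so accents break ties between equal base letters.
my $full = Unicode::Collate->new;

say 'level 1: cote eq côte? ', $primary->eq('cote', 'côte') ? 'yes' : 'no';
say 'full:    cote eq côte? ', $full->eq('cote', 'côte')    ? 'yes' : 'no';

# Base letters decide first, so both "cote" words come before "coteau";
# the accent only orders "cote" and "côte" relative to each other.
my @sorted = $full->sort(qw[ coteau côte cote ]);
say "sorted: @sorted";
```

At level 1 the two words compare as equal, which is why the accent can only
act as a tie-breaker after the base letters have been compared.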


