From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Error compiling with icu support / possible workaround?
Date: Wed, 07 Apr 2021 18:41:57 -0700
To: 'Nick Bart' via pandoc-discuss

I note that data/collation/fr_CA.xml has [backwards 2] and
data/collation/fr.xml does not. 'backwards 2' says to sort the
second-level collation elements backwards; that's what the
"French accents" option does. So that explains the perl script's
behavior; it is faithfully following the locales, which specify this
for Canadian French but not European French.

My parser for collation files currently does nothing with the
`[backwards 2]`, but maybe it's something I should implement.

"'Nick Bart' via pandoc-discuss" writes:

> Bastien, BJP - many thanks, that’s helpful. Still, the main practical question,
> I guess, is whether the default sort order the "new" pandoc generates for French
> - either with or without the "optFrenchAccents" modification - is acceptable
> from the point of view of a native speaker of French or not, and if not, what
> you would suggest instead.
>
> As to multiple collations, I commented earlier:
>
>> ... I tend to think that the default collation (which usually seems to follow
>> the most recent rules for a given language) would usually be sufficient.
>
> That being said, it seems that most of the information (in
> https://github.com/jgm/unicode-collation/tree/main/data) and, I assume,
> infrastructure for supporting different collation systems for a given language is
> in place already, so the following might be worth a try:
>
> pandoc is relying on IETF BCP 47 language tags anyway
> [https://tools.ietf.org/rfc/bcp/bcp47.txt].
>
> A number of locale attributes contained in the Common Locale Data Repository
> (CLDR), including those pertaining to collation, can be expressed as extensions
> to "simple" language tags of the form "en-US".
>
> IETF BCP 47 Extension U (Unicode Locale) is described in RFC 6067
> [https://tools.ietf.org/html/rfc6067]. Relevant quote:
>
>> For example, the language tag "de-DE-u-attr-co-phonebk" consists of:
>>
>> o The base language tag "de-DE" (German as used in Germany), exactly as
>> defined by [BCP47] using subtags from the IANA Language Subtag Registry.
>>
>> o The singleton 'u', identifying this extension.
>>
>> o The attribute 'attr', which is an example for illustration (no
>> attributes were defined at the time this document was published).
>>
>> o The keyword 'co-phonebk', consisting of the key 'co' (Collation) and the
>> type 'phonebk' (Phonebook collation order).
>
> On IETF BCP 47 extensions, see also
> https://www.w3.org/International/articles/language-tags/#extension.
>
> So if this does not appear too difficult, it might provide a lot of additional
> flexibility if pandoc were to support the particular subset of "Extension U"
> strings pertaining to collation, i.e., those starting with "-u-co-" in pandoc's
> "lang" metadata field, or command line argument.
> (In the absence of such a string,
> pandoc should of course use the default collation order.)
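
To make the effect of `[backwards 2]` concrete, here is a minimal Python
sketch. It is a toy model, not the unicode-collation parser or the real
Unicode Collation Algorithm: it does not use DUCET weights, and the function
name `sort_key` and its `backwards_secondary` flag are made up for
illustration. The primary level compares base letters only; the secondary
level compares accents, either left to right or, with the flag set, from the
end of the word.

```python
import unicodedata


def sort_key(word, backwards_secondary=False):
    """Toy two-level collation key: base letters first, then accents."""
    primary = []    # base letters, accents ignored
    secondary = []  # one tuple of combining marks per base letter
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch):
            if secondary:
                # Attach the accent to the base letter it follows.
                secondary[-1] += (ord(ch),)
        else:
            primary.append(ch)
            secondary.append(())
    if backwards_secondary:
        # Hypothetical stand-in for "[backwards 2]": compare accents
        # starting from the end of the word.
        secondary.reverse()
    return (tuple(primary), tuple(secondary))


words = ["côté", "cote", "coté", "côte"]
forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backwards_secondary=True))
print(forward)
print(backward)
```

With forward comparison the first accent difference decides, so the order is
cote, coté, côte, côté. With the secondary level reversed, the last accent
difference decides, giving cote, côte, coté, côté.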
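
The "-u-co-" proposal quoted above could be prototyped roughly as follows.
This is only a sketch under stated assumptions: `collation_from_lang` is a
hypothetical helper, not anything pandoc provides, and it handles just enough
of the RFC 6067 syntax to pull the collation type out of a tag (singletons are
one character, keys two characters, type subtags three or more).

```python
def collation_from_lang(tag):
    """Return the 'co' type from a tag's -u- extension, or None if absent."""
    subtags = tag.lower().split("-")
    in_u = False
    for i, sub in enumerate(subtags):
        if i == 0:
            continue  # skip the primary language subtag
        if len(sub) == 1:
            if sub == "x":
                break  # everything after 'x' is private use
            # Any other singleton starts a new extension.
            in_u = sub == "u"
        elif in_u and sub == "co":
            type_parts = []
            for nxt in subtags[i + 1:]:
                if len(nxt) < 3:
                    break  # next key or singleton ends the type
                type_parts.append(nxt)
            return "-".join(type_parts) or None
    return None


print(collation_from_lang("de-DE-u-attr-co-phonebk"))
print(collation_from_lang("de-DE"))
```

The first call yields "phonebk"; the second yields None, which is the case
where the default collation order would apply.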