From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Error compiling with icu support / possible workaround?
Date: Tue, 06 Apr 2021 09:18:34 -0700
References: <5035db2e-16b9-4923-8e38-d95b81d27840n@googlegroups.com>
To: 'Nick Bart' via pandoc-discuss

I've added optFrenchAccents to unicode-collation, and the latest pandoc in the unicode-collation branch fixes the issue you identified with French accents. Currently I enable this whenever lang is "fr" -- but I don't know if that's right; maybe some French-speaking countries don't do this?

More testing welcome, esp. with non-Latin alphabets -- I don't have a good stock of samples for those.

One further point: unicode-collation supports multiple collations for a given language (e.g. es vs. es/traditional, which sorts some letter combinations differently). I don't know whether it's worth providing a way for pandoc and citeproc to use these, or if we can always use the default collation for a language.

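To make the es vs. es/traditional difference concrete, here is a toy Python sketch. It is only an illustration under simplifying assumptions, not how unicode-collation implements tailorings: `spanish_key` is a hypothetical helper, and it models just the traditional rule that "ch" and "ll" count as single letters sorting after "c" and "l" (it ignores ñ, accents and case levels).

```python
def spanish_key(word, traditional=False):
    """Toy sort key: a list of (letter, rank) pairs, one per collation element.

    Hypothetical helper for illustration only. With traditional=True the
    digraphs "ch" and "ll" become single elements ranked just after "c"
    and "l"; otherwise every character is its own element.
    """
    w = word.lower()
    key = []
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if traditional and pair in ("ch", "ll"):
            # ("c", 1) sorts after every ("c", 0) but before ("d", 0).
            key.append((pair[0], 1))
            i += 2
        else:
            key.append((w[i], 0))
            i += 1
    return key


words = ["cuna", "chico", "cosa", "luz", "llama"]

# Default-style ordering: plain letter-by-letter comparison.
print(sorted(words, key=spanish_key))
# -> ['chico', 'cosa', 'cuna', 'llama', 'luz']

# Traditional-style ordering: "ch" after all other "c...", "ll" after "l...".
print(sorted(words, key=lambda w: spanish_key(w, traditional=True)))
# -> ['cosa', 'cuna', 'chico', 'luz', 'llama']
```

A bibliography containing names such as "Chávez" or "Llorente" is where the choice between the two collations would become visible.
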
"'Nick Bart' via pandoc-discuss" writes: > The two examples from you latest post look ok to me - and, if further con= firmation for "es" should be needed, the ICU4C Demo at https://icu4c-demos.= unicode.org/icu-bin/collation.html generates the same sort order with any o= f the three "es" variants they offer. > > What doesn=E2=80=99t look right so far is sorting according to French rul= es: https://en.wikipedia.org/wiki/Alphabetical_order#Language-specific_conv= entions claims "For French, the last accent in a given word determines the = order.[13] For example, in French, the following four words would be sorted= this way: cote < c=C3=B4te < cot=C3=A9 < c=C3=B4t=C3=A9." > > https://icu4c-demos.unicode.org/icu-bin/collation.html (which for some re= ason offers "fr-CA" only) generates the same sort order ("cote < c=C3=B4te = < cot=C3=A9 < c=C3=B4t=C3=A9"). > > However, using the "new" pandoc branch with the following example: > > ``` > pandoc -C -t plain << EOT > > Expected: > cote > c=C3=B4te > cot=C3=A9 > c=C3=B4t=C3=A9 > > --- > nocite: '@*' > lang: fr > references: > - id: cote > author: cote > - id: c=C3=B4te > author: c=C3=B4te > - id: cot=C3=A9 > author: cot=C3=A9 > - id: c=C3=B4t=C3=A9 > author: c=C3=B4t=C3=A9 > ... > EOT > ``` > > I get this sort order instead: > > ``` > Expected: cote c=C3=B4te cot=C3=A9 c=C3=B4t=C3=A9 > > cote. s.=C2=A0d. > > cot=C3=A9. s.=C2=A0d. > > c=C3=B4te. s.=C2=A0d. > > c=C3=B4t=C3=A9. s.=C2=A0d. > ``` > > --=20 > You received this message because you are subscribed to the Google Groups= "pandoc-discuss" group. > To unsubscribe from this group and stop receiving emails from it, send an= email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org > To view this discussion on the web visit https://groups.google.com/d/msgi= d/pandoc-discuss/ztOzz7OZvq0y49K9g2Rbuj3fXNL05TinB60Ntkc0jVom24XTwQenCasydv= kGxZPka8jEUD-3b-U2dM-fi-tnxxGIr2NDErxSfMFBEVekK7I%3D%40protonmail.com. 