From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Error compiling with icu support / possible workaround?
Date: Wed, 07 Apr 2021 19:23:15 -0700
To: 'Nick Bart' via pandoc-discuss

On second thought, leaving it as an option makes a lot of sense.
We wouldn't want to force fr-FR to be sorted contrary to the French
Academy's official dictionary...

The question is really how to pass this kind of option through
pandoc/citeproc, if it's going to be user-selectable for fr.
It looks like there's a BCP 47 key "kb" corresponding to
"backwards 2",

https://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options

so maybe one just says

fr-FR-u-kb-true

or (canonical equivalent according to 3.2.1)

fr-FR-u-kb

For alternative collations for a language we could do the same,
e.g. es-ES-u-co-traditional.

Parsing and representing these complex language tags is starting
to get pretty complicated!

John MacFarlane writes:

> I note that data/collation/fr_CA.xml has
>
> [backwards 2]
>
> and data/collation/fr.xml does not.
>
> 'backwards 2' says to sort the second-level collation elements
> backwards; that's what the "French accents" option does. So that
> explains the perl script's behavior; it is faithfully following
> the locales, which specify this for Canadian French but not
> European French.
>
> My parser for collation files currently does nothing with the
> `[backwards 2]`, but maybe it's something I should implement.
>
> "'Nick Bart' via pandoc-discuss" writes:
>
>> Bastien, BJP - many thanks, that’s helpful. Still, the main practical
>> question, I guess, is whether the default sort order the "new" pandoc
>> generates for French - either with or without the "optFrenchAccents"
>> modification - is acceptable from the point of view of a native
>> speaker of French or not, and if not, what you would suggest instead.
>>
>>
>> As to multiple collations, I commented earlier:
>>
>>> ... I tend to think that the default collation (which usually seems
>>> to follow the most recent rules for a given language) would usually
>>> be sufficient.
>>
>> That being said, it seems that most of the information (in
>> https://github.com/jgm/unicode-collation/tree/main/data) and, I assume,
>> infrastructure for supporting different collation systems for a given
>> language is in place already, so the following might be worth a try:
>>
>> pandoc is relying on IETF BCP 47 language tags anyway
>> [https://tools.ietf.org/rfc/bcp/bcp47.txt].
>>
>> A number of locale attributes contained in the Common Locale Data
>> Repository (CLDR), including those pertaining to collation, can be
>> expressed as extensions to "simple" language tags of the form "en-US".
>>
>> IETF BCP 47 Extension U (Unicode Locale) is described in RFC 6067
>> [https://tools.ietf.org/html/rfc6067]. Relevant quote:
>>
>>> For example, the language tag "de-DE-u-attr-co-phonebk" consists of:
>>>
>>> o The base language tag "de-DE" (German as used in Germany), exactly
>>> as defined by [BCP47] using subtags from the IANA Language Subtag
>>> Registry.
>>>
>>> o The singleton 'u', identifying this extension.
>>>
>>> o The attribute 'attr', which is an example for illustration (no
>>> attributes were defined at the time this document was published).
>>>
>>> o The keyword 'co-phonebk', consisting of the key 'co' (Collation)
>>> and the type 'phonebk' (Phonebook collation order).
>>
>> On IETF BCP 47 extensions, see also
>> https://www.w3.org/International/articles/language-tags/#extension.
>>
>> So if this does not appear too difficult, it might provide a lot of
>> additional flexibility if pandoc were to support the particular subset
>> of "Extension U" strings pertaining to collation, i.e., those starting
>> with "-u-co-", in pandoc's "lang" metadata field or command line
>> argument. (In the absence of such a string, pandoc should of course
>> use the default collation order.)

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/m2o8epo6p8.fsf%40MacBook-Pro.hsd1.ca.comcast.net.
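
For readers who want to see the idea from this thread in concrete terms, here is a minimal Python sketch. It is only an illustration, not pandoc, citeproc, or unicode-collation code. It splits a tag such as fr-FR-u-kb into a base tag, attributes, and keywords (treating a key with no type as "true", per the canonical equivalence mentioned above), and then uses the "kb" keyword to decide whether the secondary (accent) level is compared backwards. All function names (parse_unicode_extension, toy_sort_key, sort_for_tag) are hypothetical. The accent weights are just code points of combining marks, not real collation weights; the parser ignores private-use tags and other extensions; and locale defaults (such as fr_CA specifying [backwards 2] on its own) are not modelled.

```python
import unicodedata
from typing import Dict, List, NamedTuple


class ParsedTag(NamedTuple):
    base: str                 # e.g. "fr-FR"
    attributes: List[str]     # e.g. ["attr"]
    keywords: Dict[str, str]  # e.g. {"co": "phonebk"}


def parse_unicode_extension(tag: str) -> ParsedTag:
    """Simplified split of a BCP 47 tag at its 'u' singleton (assumption:
    no private-use 'x' section precedes it)."""
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "u" not in lowered[1:]:
        return ParsedTag(tag, [], {})
    u_index = lowered.index("u", 1)
    attributes: List[str] = []
    types: Dict[str, List[str]] = {}
    key = None
    for sub in lowered[u_index + 1:]:
        if len(sub) == 1:        # another singleton ends the 'u' extension
            break
        if len(sub) == 2:        # two-character subtags are keys
            key = sub
            types.setdefault(key, [])
        elif key is None:        # longer subtags before any key: attributes
            attributes.append(sub)
        else:                    # longer subtags after a key: its type
            types[key].append(sub)
    # A key with no type is treated as "true" (so "kb" == "kb-true").
    keywords = {k: "-".join(v) or "true" for k, v in types.items()}
    return ParsedTag("-".join(subtags[:u_index]), attributes, keywords)


def toy_sort_key(word: str, backwards_secondary: bool = False):
    """Two-level toy key: base letters first, then accents per letter.
    Accent weights are combining-mark code points (an assumption)."""
    primary: List[str] = []
    secondary: List[tuple] = []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and secondary:
            secondary[-1] += (ord(ch),)   # attach accent to previous letter
        else:
            primary.append(ch)
            secondary.append(())
    if backwards_secondary:
        secondary.reverse()               # compare accents from the end
    return (primary, secondary)


def sort_for_tag(words: List[str], tag: str) -> List[str]:
    """Sort words, honouring only an explicit 'kb' keyword in the tag."""
    backwards = parse_unicode_extension(tag).keywords.get("kb") == "true"
    return sorted(words, key=lambda w: toy_sort_key(w, backwards))
```

With the words cote, coté, côte, côté, sort_for_tag(words, "fr-FR") gives cote, coté, côte, côté, while sort_for_tag(words, "fr-FR-u-kb") gives cote, côte, coté, côté, because the last accent difference now decides the order.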