From: BPJ
Newsgroups: gmane.text.pandoc
Subject: Re: Transliterated and original titles/names in citations
Date: Sat, 7 Sep 2019 12:10:24 +0200
References: <0c05fcec-fbb7-aed6-c1ec-e84610bcdd96@gmail.com>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

Oops, forgot the link: https://metacpan.org/pod/Unicode::Collate

On Sat, 7 Sep 2019 at 12:05, BPJ <melroch-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I just realized two things which make matters much worse:
>
> 1. Not all publications accept the same transliteration schemes. Just
>    by surveying one author's references to his own works in one
>    bibliography I find that his surname, Яблонский, can be
>    transliterated in five different ways (although two predominate)!
>    So I'll need both a `transliterated title` field and a
>    `transliterated authors` field with (in each item) a mapping of
>    alternative transliterations. Even Icelandic needs to be
>    transliterated sometimes, e.g. Þórður becoming Thórdur (with data
>    loss!).
>
> 2. Sorting. Latin letters like _č, š, ž_ need to sort as _c, s, z_,
>    and _Þ_ must probably sort sometimes like _Th_ and sometimes after
>    _z_! This sometimes needs tailored, locale-dependent sorting!
>    Accented letters can ideally be handled by entering things in NFC
>    and hoping that sort algorithms ignore combining marks, but then
>    e.g. in Scandinavian languages _ö_ sorts not as _o_ but at the end
>    of the alphabet (ideally _þ, æ, ø, å, ä, ö_ go at the end of the
>    alphabet in that order, but often _æ/ä, ø/ö_ are conflated either
>    before or after _å_!). Anyway, it seems CSL has no customizable
>    sort key field. I know how to handle these things myself with
>    [Unicode::Collate][], but that at least means some postprocessing
>    of as yet unknown complexity.
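
The two sorting behaviours described in point 2 can be sketched in Python. This is only an illustration under stated assumptions: it is not Unicode::Collate, not a full Unicode Collation Algorithm implementation, and not something CSL or pandoc provides. The names `folded_key`, `tailored_key` and `NORDIC_ALPHABET` are made up for this example. Note that combining marks exist as separate code points only in decomposed (NFD) text, so the sketch decomposes before dropping them.

```python
from __future__ import annotations

import unicodedata

# Hypothetical tailoring: a-z followed by the "ideal" order from the message.
NORDIC_ALPHABET = "abcdefghijklmnopqrstuvwxyzþæøåäö"
RANK = {letter: index for index, letter in enumerate(NORDIC_ALPHABET)}


def folded_key(text: str, expansions: dict[str, str] | None = None) -> str:
    """Sort key that ignores accents: casefold, decompose, drop combining marks.

    `expansions` optionally rewrites letters that have no decomposition,
    e.g. {"þ": "th"} to file Þ under Th.
    """
    expansions = expansions or {}
    out = []
    for ch in unicodedata.normalize("NFD", text.casefold()):
        if unicodedata.combining(ch):
            continue  # skip the accent, keep the base letter
        out.append(expansions.get(ch, ch))
    return "".join(out)


def tailored_key(text: str) -> tuple:
    """Sort key where þ, æ, ø, å, ä, ö are letters of their own after z.

    Other accented letters fall back to their base letter; characters that
    are not letters of the alphabet above get rank -1 and sort first.
    """
    ranks = []
    for ch in unicodedata.normalize("NFC", text.casefold()):
        if ch in RANK:
            ranks.append(RANK[ch])
        else:
            for base in folded_key(ch):
                ranks.append(RANK.get(base, -1))
    return tuple(ranks)


names = ["Öberg", "Olsson", "Åkesson", "Zetterberg", "Čapek", "Ärlig", "Andersson"]

print(sorted(names, key=folded_key))
# ['Åkesson', 'Andersson', 'Ärlig', 'Čapek', 'Öberg', 'Olsson', 'Zetterberg']

print(sorted(names, key=tailored_key))
# ['Andersson', 'Čapek', 'Olsson', 'Zetterberg', 'Åkesson', 'Ärlig', 'Öberg']

print(folded_key("Þórður", {"þ": "th", "ð": "d"}))
# thordur
```

The first key treats every accented letter as its base letter; the second keeps the Scandinavian letters at the end of the alphabet while still filing _č_ under _c_. A real bibliography would want a proper collation library with locale tailoring rather than a hand-written rank table.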

> On Wed, 4 Sep 2019 at 09:33, BPJ <melroch-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Does anyone know how to handle transliterated titles and names in
>> citations, when you want to include both the transliteration and the
>> original? Does CSL have any fields for that?
>>
>> TIA,
>>
>> /bpj
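
To make the question above concrete, here is a minimal Python sketch of a name record that keeps the original-script form next to a mapping of alternative transliterations, and renders both together. Everything in it is hypothetical: `Name`, `render` and the scheme keys `scientific` and `bgn-pcgn` are illustrative names, not CSL variables or pandoc options, and the two spellings are examples of what such schemes produce, not the spellings found in the bibliography mentioned in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Name:
    """A name in its original script plus alternative transliterations."""

    original: str
    # Maps a scheme label (hypothetical keys) to the spelling it yields.
    transliterations: dict[str, str] = field(default_factory=dict)

    def render(self, scheme: str) -> str:
        """Return 'Transliteration [Original]' for the requested scheme.

        Falls back to the original form alone when the scheme is unknown.
        """
        spelling = self.transliterations.get(scheme)
        if spelling is None:
            return self.original
        return f"{spelling} [{self.original}]"


author = Name(
    original="Яблонский",
    transliterations={
        "scientific": "Jablonskij",
        "bgn-pcgn": "Yablonskiy",
    },
)

print(author.render("scientific"))  # Jablonskij [Яблонский]
print(author.render("bgn-pcgn"))    # Yablonskiy [Яблонский]
print(author.render("other"))       # Яблонский
```

Keeping the scheme as a lookup key means the same entry can be rendered differently depending on which transliteration a given publication accepts.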

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhDm8bibSjJfrM6W69qM_j1N9tPHEgRwaic4bZmrsB1CVw%40mail.gmail.com.