From: BPJ
Newsgroups: gmane.text.pandoc
Subject: Re: Glossary Filter for MD2Tex
Date: Tue, 18 Oct 2022 20:42:39 +0200
To: pandoc-discuss


On Tue, 18 Oct 2022 at 17:36, Bernardo C.D.A. Vasconcelos <bernardovasconcelos@gmail.com> wrote:

> > As for translating the filter, note that Lua can't really handle UTF-8.
> > There is some rudimentary support for converting codepoint number ↔ UTF-8
> > byte sequences and for iterating through a string of bytes representing
> > UTF-8 encoded characters, but no concept of chars as opposed to bytes. This
> > may become a show stopper if you need to manipulate strings containing
> > UTF-8 text.
>
> Thanks, @BPJ, for the explanation. Apparently, Lua 5.3 onwards includes
> UTF-8 support. Have you seen it? E.g.
>
> https://q-syshelp.qsc.com/Content/Control_Scripting/Lua_5.3_Reference_Manual/Standard_Libraries/4_-_Basic_UTF-8_Support.htm

Yes, that is what I meant. It's very, very basic. Notably, pattern matching is still entirely byte-oriented, except for the pattern `utf8.charpattern`, which matches the byte sequence of a single UTF-8 character. Pandoc adds some UTF-8-oriented functions, notably case-changing functions, in the `pandoc.text` library, but that is all.
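
A minimal sketch of what that means in practice, assuming Lua 5.3 or later. The variable names are only illustrative, and the string is the Greek word from this thread, written with escapes so the byte values are unambiguous:

```lua
-- Requires Lua 5.3+ (for the utf8 library and \u{...} escapes).
local s = "\u{1F00}\u{3B3}\u{3B1}\u{3B8}\u{3CC}\u{3C2}"  -- ἀγαθός

-- The length operator counts bytes; utf8.len counts characters.
local byte_len = #s
local char_len = utf8.len(s)
print(byte_len, char_len)  -- 13   6

-- "." matches a single byte; utf8.charpattern matches a whole character.
local first_byte = s:match("^.")
local first_char = s:match("^" .. utf8.charpattern)
print(#first_byte, #first_char)  -- 1   3

-- A character class is a set of bytes, so looking for alpha in a string
-- holding only gamma gives a false hit on their shared lead byte 0xCE.
local false_hit = ("\u{3B3}"):find("[\u{3B1}]")
print(false_hit)  -- 1

-- Character-wise work means iterating code points yourself.
local chars = {}
for _, cp in utf8.codes(s) do
  chars[#chars + 1] = utf8.char(cp)
end
print(#chars)  -- 6

-- Inside a pandoc Lua filter, pandoc.text offers character-aware helpers.
-- The guard keeps this script runnable in a plain Lua interpreter too.
if pandoc and pandoc.text then
  print(pandoc.text.upper(s), pandoc.text.len(s))
end
```

The false hit in the character-class example is the kind of thing that bites when a filter does search and replace on non-ASCII text with ordinary Lua patterns.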

> > For Ancient Greek you want grc as the language tag.
>
> Indeed it is (and that is generally what I use), but ἀγαθός is
> just Polytonic Greek, which is not the same as Ancient Greek.

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhBVNnb9LTK5jvnDZbhqbP--BFzgc3fQgw2Lw4VBZ-fH7A%40mail.gmail.com.