On Tue, 18 Oct 2022 at 17:36, Bernardo C.D.A. Vasconcelos <bernardovasconcelos@gmail.com> wrote:
> As for translating the filter note that Lua can't really handle UTF-8.
> There is some rudimentary support for converting codepoint number ↔ UTF-8
> byte sequences and for iterating through a string of bytes representing
> UTF-8 encoded characters but no concept of chars as opposed to bytes. This
> may become a show stopper if you need to manipulate strings containing
> UTF-8 text.


Thanks, @BPJ, for the explanation. Apparently, Lua 5.3 onwards includes
UTF-8 support. Have you seen it? E.g.
https://q-syshelp.qsc.com/Content/Control_Scripting/Lua_5.3_Reference_Manual/Standard_Libraries/4_-_Basic_UTF-8_Support.htm

Yes, that is what I meant. It is very basic. Notably, pattern matching is still entirely byte-oriented, except for the pattern `utf8.charpattern`, which matches the byte sequence of exactly one UTF-8 character (assuming the string is valid UTF-8). Pandoc adds some UTF-8-aware functions, notably case-changing functions, in the `pandoc.text` library, but that is all.
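
To make the byte/character distinction concrete, here is a minimal sketch in plain Lua 5.3 or later (no pandoc needed). The variable names are only for illustration, and the code points I use to spell ἀγαθός are an assumption; the last vowel in particular could be encoded differently in a real document.

```lua
-- Requires Lua 5.3+ for the standard utf8 library.
-- Build the word from explicit code points so the example does not depend
-- on how this message was encoded. The code points are an assumption:
-- U+1F00 (alpha with psili), gamma, alpha, theta, U+03CC (omicron with
-- tonos), final sigma.
local s = utf8.char(0x1F00, 0x03B3, 0x03B1, 0x03B8, 0x03CC, 0x03C2)

-- The length operator counts bytes; utf8.len counts code points.
local byte_len = #s
local char_len = utf8.len(s)

-- "." in a pattern matches one byte, so it cuts the first letter apart.
local first_byte = s:match("^.")
-- utf8.charpattern matches the bytes of one whole UTF-8 character.
local first_char = s:match("^" .. utf8.charpattern)

-- utf8.offset gives the byte position where the n-th character starts,
-- which lets string.sub cut on character boundaries.
local second_start = utf8.offset(s, 2)
local first_via_offset = s:sub(1, second_start - 1)

-- Iterating character by character needs charpattern (or utf8.codes).
local count = 0
for _ in s:gmatch(utf8.charpattern) do
  count = count + 1
end

print(byte_len, char_len, #first_byte, #first_char, count)
```

With these code points the bytes outnumber the characters by more than two to one, and anything built on `.`, `%a`, `string.sub` or `string.upper` sees the bytes. Inside a filter, the case-changing functions in `pandoc.text` are the ones to reach for instead of `string.upper` and `string.lower`.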






> For Ancient Greek you want grc as the language tag.
>

Indeed it is (and that is generally what I use), but ἀγαθός is
just Polytonic Greek, which is not the same as Ancient Greek.

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/3307993F-F813-405F-BFEC-F17FAF27BEA5%40gmail.com.
