From: Max Chernoff via ntg-context <ntg-context@ntg.nl>
To: ntg-context@ntg.nl
Cc: Max Chernoff <mseven@telus.net>, oinos@gmx.es
Subject: Re: issue with scite module
Date: Wed, 1 Jun 2022 15:58:13 -0600
Message-ID: <cac73198-f067-80c6-633c-b2172f65ddac@telus.net>
In-Reply-To: <0642f009-3737-f783-6b25-a346f7981667@fiee.net>
> Now, I still don’t understand LPEG and don’t know if there’s a general
> “character” class that doesn’t need a list...
Well, looking through the XML spec
https://www.w3.org/TR/REC-xml/#NT-NameChar
you'd think that we'd want a pattern like this:
local name = (R("az","AZ","09", "\u{C0}\u{D6}", "\u{D8}\u{F6}", "\u{F8}\u{2FF}", "\u{370}\u{37D}", "\u{37F}\u{1FFF}", "\u{200C}\u{200D}", "\u{2070}\u{218F}", "\u{2C00}\u{2FEF}", "\u{3001}\u{D7FF}", "\u{F900}\u{FDCF}", "\u{FDF0}\u{FFFD}", "\u{10000}\u{EFFFF}", "\u{0300}\u{036F}", "\u{203F}\u{2040}") + S("_-.\u{B7}"))^1
But that doesn't work, since lpeg operates on bytes rather than on Unicode
characters. From the LuaTeX manual:
> The same is true for lpeg.R, although the latter will display an error message if used
> with multibyte characters. Therefore lpeg.R('aä') results in the message bad argument #1
> to 'R' (range must have two characters), since to lpeg, ä is two ’characters’ (bytes), so
> aä totals three. (https://texdoc.org/serve/luatex/0##680)
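To make the byte counting concrete, here is a small sketch in plain Lua 5.3+ (what LuaTeX ships). The variable names are only for illustration; nothing here comes from the lexer itself.

```lua
-- "\u{E4}" is the UTF-8 encoding of ä (Lua 5.3+ escape syntax).
local a_umlaut = "\u{E4}"
local pair     = "a" .. a_umlaut   -- the string "aä"

print(#a_umlaut)             -- 2: the # operator counts bytes, not characters
print(a_umlaut:byte(1, -1))  -- 195 164, i.e. 0xC3 0xA4
print(#pair)                 -- 3: this is the length lpeg.R sees for "aä"
print(utf8.len(pair))        -- 2: the number of code points
```

So every "\u{C0}\u{D6}"-style argument in the pattern above is four bytes or more, not the two that lpeg.R expects.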
The easiest way that I found was to just cheat and use everything with
a TeX catcode 11 ("letters"):
local name = (R("az","AZ","09") + S("_-.") + lpeg.utfchartabletopattern(characters.csletters))^1
This isn't, strictly speaking, correct, but I think that it's close
enough. It seems to work correctly for Pablo's initial example,
but it may break something else.
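For comparison, a stricter approach would be to spell out the spec's ranges at the byte level. The following is only a sketch under my own assumptions: it covers just U+00C0-U+00D6, U+00D8-U+00F6 and U+00F8-U+02FF, the names "cont", "latin" and "strictname" are made up for illustration, and it is not what the patch below does.

```lua
-- Uses the global lpeg inside LuaTeX/ConTeXt, or the rock in standalone Lua.
local lpeg = lpeg or require("lpeg")
local P, R, S = lpeg.P, lpeg.R, lpeg.S

-- Any UTF-8 continuation byte.
local cont = R("\x80\xBF")

-- U+00C0..U+00D6 = C3 80..C3 96, U+00D8..U+00F6 = C3 98..C3 B6,
-- U+00F8..U+00FF = C3 B8..C3 BF, U+0100..U+02FF = C4..CB followed by one
-- continuation byte.
local latin = P("\xC3") * (R("\x80\x96") + R("\x98\xB6") + R("\xB8\xBF"))
            + R("\xC4\xCB") * cont

local strictname = (R("az","AZ","09") + S("_-.") + latin)^1

print(strictname:match("caf\u{E9}"))  -- 6: all five bytes of "café" consumed
print(strictname:match("a\u{D7}b"))   -- 2: stops before U+00D7, which is excluded
```

Doing this for every range in the spec means three- and four-byte sequences as well, which is why the csletters shortcut is so much less work.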
-- Max
diff --git a/texmf-context/context/data/scite/context/lexers/scite-context-lexer-xml.original b/texmf-context/context/data/scite/context/lexers/scite-context-lexer-xml.lua
index e635d40..97de3fd 100644
--- a/texmf-context/context/data/scite/context/lexers/scite-context-lexer-xml.original
+++ b/texmf-context/context/data/scite/context/lexers/scite-context-lexer-xml.lua
@@ -41,7 +41,7 @@ local semicolon = P(";")
local equal = P("=")
local ampersand = P("&")
-local name = (R("az","AZ","09") + S("_-."))^1
+local name = (R("az","AZ","09") + S("_-.") + lpeg.utfchartabletopattern(characters.csletters))^1
local openbegin = P("<")
local openend = P("</")
local closebegin = P("/>") + P(">")