Thanks John - copying & pasting the unicode from the HTML output into the Lua filter did the trick. Should've thought of that!

On Thursday, 29 August 2019 13:51:49 UTC-4, John MacFarlane wrote:
>
> Or better yet just use the unicode character (make sure your
> lua filter is UTF-8 encoded):
>
>     s.text == '–'
>
> Ken Dow writes:
>
> > Thanks for the help (Sorry for the long delay - I didn't get notified of
> > your post).
> >
> > I tried your suggestion and it works perfectly when searching for normal
> > text (e.g., s.text == "Widget") but with s.text == "\8211", Pandoc throws
> > the following error:
> >
> >     decimal escape too large near '"\5881'
> >
> > Single quotes (e.g., s.text == '\8211') give the same error. I tried
> > "\\8211" in case the backslash needs to be escaped; no error but no
> > replacement occurs.
> >
> > Finally, I tried the utf8.codes approach, referring to the Material Icon
> > codepoints doc for the value that should match, like so:
> >
> >     function Str (s)
> >       if utf8.codes(s.text) == 'e5c3' then
> >         return pandoc.RawInline(
> >           'html',
> >           'apps'
> >         )
> >       end
> >     end
> >
> > No error but no replacement.
> >
> > On Saturday, 10 August 2019 12:02:40 UTC-4, Albert Krewinkel wrote:
> >>
> >> Ken Dow writes:
> >>
> >> > My DOCX source document, which is being converted to HTML, uses some
> >> > Google Material fonts. What shows up in the AST are values like
> >> >
> >> >     Str "\8211"
> >> >
> >> > I'd like to find and replace those to produce something like the
> >> > following HTML:
> >> >
> >> >     face
> >> >
> >> > Is that possible and if so, how?
> >>
> >> The way to go here is via `RawInline` elements, e.g.:
> >>
> >>     function Str (s)
> >>       if s.text == '–' then
> >>         return pandoc.RawInline(
> >>           'html',
> >>           'face'
> >>         )
> >>       end
> >>     end
> >>
> >> Note that matching on an exact string would fail if the character was
> >> somewhere within a word (a typical example would be em-dashes). One
> >> would have to use the [utf8.codes] function to manually find and
> >> replace those characters in that case.
> >>
> >> [utf8.codes](https://www.lua.org/manual/5.3/manual.html#pdf-utf8.codes)
> >>
> >> --
> >> Albert Krewinkel
> >> GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/f48093d8-f00b-4287-9b31-abd24912d17d%40googlegroups.com.
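
A note on the two dead ends in the thread, with a small sketch. The AST prints the character as "\8211" because pandoc's native output writes non-ASCII characters as decimal code points. Lua reads a decimal escape as a single byte (0 to 255), so "\8211" is rejected at parse time. Likewise, utf8.codes returns an iterator rather than a string, so comparing its result to 'e5c3' is always false. The plain Lua 5.3 snippet below runs without pandoc; it shows three ways to spell U+2013 and one possible way to pick the character out of the middle of a word. The helper name split_on_codepoint and the sample string '2018–2019' are made up for illustration and are not from the thread.

```lua
-- Three equivalent spellings of U+2013 (decimal 8211) in Lua 5.3.
local literal = '–'              -- literal character; file must be saved as UTF-8
local escaped = '\u{2013}'       -- Unicode escape, takes the code point in hex
local built   = utf8.char(8211)  -- built from the decimal value shown in the AST

-- Hypothetical helper (not part of pandoc or Lua): split `text` at every
-- occurrence of the code point `target`. Each occurrence becomes its own
-- item in the returned list, so a caller can treat it differently.
local function split_on_codepoint(text, target)
  local parts, buf = {}, {}
  -- utf8.codes yields (byte position, code point) pairs.
  for _, cp in utf8.codes(text) do
    if cp == target then
      if #buf > 0 then
        parts[#parts + 1] = table.concat(buf)
        buf = {}
      end
      parts[#parts + 1] = utf8.char(cp)
    else
      buf[#buf + 1] = utf8.char(cp)
    end
  end
  if #buf > 0 then
    parts[#parts + 1] = table.concat(buf)
  end
  return parts
end

-- Sample input is an assumption chosen for illustration.
local parts = split_on_codepoint('2018–2019', 0x2013)
```

Inside a filter, the Str function could then map each piece to either pandoc.Str(piece) or a pandoc.RawInline('html', ...) and return the resulting list. The HTML markup itself does not survive in the thread, so it is left out here.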