Thanks John - copying and pasting the Unicode character from the HTML output into the Lua filter did the trick. Should've thought of that!

On Thursday, 29 August 2019 13:51:49 UTC-4, John MacFarlane wrote:

Or better yet, just use the Unicode character (make sure your
Lua filter is UTF-8 encoded):

s.text == '–'
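
If the filter file has to stay ASCII-only, the same character can also be spelled with an escape. This is a minimal sketch, assuming the Lua bundled with pandoc is 5.3 or later (which provides `\u{...}` escapes and the `utf8` library); `EN_DASH` is only an illustrative name. The `\8211` in pandoc's native output is a decimal code point in Haskell notation, whereas Lua's `\ddd` escape denotes a single byte and cannot exceed 255, which is where "decimal escape too large" comes from.

```lua
-- U+2013 EN DASH is code point 8211 in decimal, 0x2013 in hex.
local EN_DASH = utf8.char(8211)    -- build the UTF-8 string from the code point

-- Equivalent spellings of the same string (Lua 5.3+):
assert(EN_DASH == "\u{2013}")      -- hex code point escape
assert(EN_DASH == "\xE2\x80\x93")  -- the three UTF-8 bytes written out
assert(#EN_DASH == 3)              -- # counts bytes, not characters
```

In the filter, the comparison would then read `s.text == EN_DASH`.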

Ken Dow <theke...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Thanks for the help (Sorry for the long delay - I didn't get notified of
> your post).
>
> I tried your suggestion and it works perfectly when searching for normal
> text (e.g., s.text == "Widget") but with s.text == "\8211", Pandoc throws
> the following error:
>
> decimal escape too large near '"\5881'
>
> Single quotes (e.g., s.text == '\8211') give the same error. I tried
> "\\8211" in case the backslash needs to be escaped; no error but no
> replacement occurs.
>
> Finally, I tried the utf8.codes approach, referring to the Material Icons
> codepoints doc for the value that should match, like so:
>
> function Str (s)
>   if utf8.codes(s.text) == 'e5c3' then
>     return pandoc.RawInline(
>       'html',
>       '<i class="material-icons">apps</i>'
>     )
>   end
> end
>
> No error but no replacement.
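
A likely reason the attempt above silently does nothing: `utf8.codes` returns an iterator function, so comparing its result with the string 'e5c3' is always false. What follows is a sketch only, assuming the icon arrives as a single-character Str and that 0xe5c3 is the code point wanted; `is_char` is a hypothetical helper, not part of pandoc or Lua.

```lua
-- Hypothetical helper: true when `text` is exactly one character whose
-- code point equals `cp`.
local function is_char(text, cp)
  -- utf8.len returns a false value for invalid UTF-8, so the test fails safely
  return utf8.len(text) == 1 and utf8.codepoint(text) == cp
end

assert(is_char(utf8.char(0xe5c3), 0xe5c3))  -- the icon character itself
assert(not is_char("Widget", 0xe5c3))       -- ordinary text does not match

-- utf8.codes hands back an iterator function, never a string:
assert(type((utf8.codes("x"))) == "function")
```

Inside the filter, the condition would become `if is_char(s.text, 0xe5c3) then`.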
>
> On Saturday, 10 August 2019 12:02:40 UTC-4, Albert Krewinkel wrote:
>>
>> Ken Dow writes:
>>
>> > My DOCX source document, which is being converted to HTML, uses some
>> > Google Material fonts. What shows up in the AST are values like
>> >
>> > Str "\8211"
>> >
>> > I'd like to find and replace those to produce something like the
>> > following HTML:
>> >
>> > <i class="material-icons">face</i>
>> >
>> > Is that possible and if so, how?
>>
>> The way to go here is via `RawInline` elements, e.g.:
>>
>>     function Str (s)
>>       if s.text == '–' then
>>         return pandoc.RawInline(
>>           'html',
>>           '<i class="material-icons">face</i>'
>>         )
>>       end
>>     end
>>
>> Note that matching on an exact string would fail if the character was
>> somewhere within a word (a typical example would be em-dashes). One would
>> have to use the [utf8.codes] function to manually find and replace those
>> characters in that case.
>>
>> [utf8.codes](https://www.lua.org/manual/5.3/manual.html#pdf-utf8.codes)
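
For the in-word case, one possible approach is to walk the string with `utf8.codes` and cut it at every match. The sketch below is plain Lua and makes no pandoc calls; `split_on_char` is a hypothetical helper, and it assumes the text is valid UTF-8 (`utf8.codes` raises an error otherwise).

```lua
-- Hypothetical helper: split `text` at every occurrence of code point `cp`.
-- Plain runs are returned as strings; each match is marked with `true`.
local function split_on_char(text, cp)
  local parts, start = {}, 1
  local width = #utf8.char(cp)          -- byte length of the matched character
  for pos, code in utf8.codes(text) do  -- pos is a byte offset
    if code == cp then
      if pos > start then
        parts[#parts + 1] = text:sub(start, pos - 1)
      end
      parts[#parts + 1] = true
      start = pos + width
    end
  end
  if start <= #text then
    parts[#parts + 1] = text:sub(start)  -- trailing run after the last match
  end
  return parts
end

local parts = split_on_char("a\u{2013}b", 0x2013)
assert(#parts == 3 and parts[1] == "a" and parts[2] == true and parts[3] == "b")
```

In a `Str` filter, each string piece could then be wrapped in `pandoc.Str` and each `true` replaced by the `pandoc.RawInline`, with the resulting list returned in place of the original element.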
>>
>> --
>> Albert Krewinkel
>> GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124
>>
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/87a12669-ed81-4ce4-aa8e-eb5d3d64bf3d%40googlegroups.com.