* Replace Str with HTML in Lua Filter
From: Ken Dow @ 2019-08-09 19:28 UTC
To: pandoc-discuss

My DOCX source document, which is being converted to HTML, uses some Google
Material fonts. What shows up in the AST are values like

    Str "\8211"

I'd like to find and replace those to produce something like the following
HTML:

    <i class="material-icons">face</i>

Is that possible and if so, how?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/abe5ae45-2ad8-419b-a282-5b5e1b4fcda1%40googlegroups.com.
* Re: Replace Str with HTML in Lua Filter
From: Albert Krewinkel @ 2019-08-10 16:02 UTC
To: pandoc-discuss

The way to go here is via `RawInline` elements, e.g.:

    function Str (s)
      if s.text == '–' then
        return pandoc.RawInline(
          'html',
          '<i class="material-icons">face</i>'
        )
      end
    end

Note that matching on an exact string would fail if the character were
somewhere within a word (a typical example would be em-dashes). One would
have to use the [utf8.codes] function to manually find and replace those
characters in that case.

[utf8.codes](https://www.lua.org/manual/5.3/manual.html#pdf-utf8.codes)

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
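A minimal sketch of the in-word case Albert describes, assuming the character to replace is U+2013 (decimal 8211) and the icon is `face`, both taken from the thread as examples. The helper name `split_on_codepoint` is hypothetical. It relies on the pandoc behaviour that a filter function may return a list of inlines, which then replaces the original element.

```lua
-- Assumptions: TARGET and ICON_HTML are illustrative values from the thread.
local TARGET = 0x2013
local ICON_HTML = '<i class="material-icons">face</i>'

-- Hypothetical helper: split `text` at every occurrence of the code point
-- `target`. Returns a list whose entries are either plain strings or the
-- marker `true` (one marker per occurrence), plus a flag telling whether
-- the target was found at all.
local function split_on_codepoint(text, target)
  local pieces, buffer, found = {}, {}, false
  local function flush()
    if #buffer > 0 then
      pieces[#pieces + 1] = table.concat(buffer)
      buffer = {}
    end
  end
  for _, cp in utf8.codes(text) do
    if cp == target then
      flush()
      pieces[#pieces + 1] = true
      found = true
    else
      buffer[#buffer + 1] = utf8.char(cp)
    end
  end
  flush()
  return pieces, found
end

function Str(s)
  local pieces, found = split_on_codepoint(s.text, TARGET)
  -- Returning nil leaves the element unchanged.
  if not found then return nil end
  local inlines = {}
  for i, piece in ipairs(pieces) do
    if piece == true then
      inlines[i] = pandoc.RawInline('html', ICON_HTML)
    else
      inlines[i] = pandoc.Str(piece)
    end
  end
  return inlines
end
```

With this sketch, a `Str` such as "2010–2019" would become three inlines: a `Str`, the raw HTML icon, and another `Str`.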
* Re: Replace Str with HTML in Lua Filter
From: Ken Dow @ 2019-08-29 15:06 UTC
To: pandoc-discuss

Thanks for the help (sorry for the long delay; I didn't get notified of
your post).

I tried your suggestion and it works perfectly when searching for normal
text (e.g., s.text == "Widget"), but with s.text == "\8211", Pandoc throws
the following error:

    decimal escape too large near '"\5881'

Single quotes (e.g., s.text == '\8211') give the same error. I tried
"\\8211" in case the backslash needs to be escaped; no error, but no
replacement occurs.

Finally, I tried the utf8.codes approach, referring to the Material Icons
codepoints doc for the value that should match, like so:

    function Str (s)
      if utf8.codes(s.text) == 'e5c3' then
        return pandoc.RawInline(
          'html',
          '<i class="material-icons">apps</i>'
        )
      end
    end

No error, but no replacement.
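For reference, the `utf8.codes` attempt above can never match: `utf8.codes` returns an iterator function, and a function is never equal to the string 'e5c3'. A hedged sketch of a numeric comparison follows, assuming the `Str` holds exactly one character and that U+E5C3 is the code point for `apps`, as quoted from the codepoints doc in the message above. The helper name `is_single_codepoint` is hypothetical.

```lua
-- Hypothetical helper: true when `text` is exactly one character whose
-- code point equals `codepoint`. utf8.len guards against empty strings
-- and multi-character strings before utf8.codepoint is called.
local function is_single_codepoint(text, codepoint)
  return utf8.len(text) == 1 and utf8.codepoint(text) == codepoint
end

function Str(s)
  -- Assumption: 0xE5C3 is the private-use code point for the "apps" icon,
  -- taken from the thread; compare numbers, not the string 'e5c3'.
  if is_single_codepoint(s.text, 0xE5C3) then
    return pandoc.RawInline('html', '<i class="material-icons">apps</i>')
  end
end
```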
* Re: Replace Str with HTML in Lua Filter
From: John MacFarlane @ 2019-08-29 17:50 UTC
To: Ken Dow, pandoc-discuss

In Haskell you can use "\5881", but in Lua this won't work.
Try "\u{16F9}".
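A short sketch of why the decimal form fails, assuming Lua 5.3 (the version linked earlier in the thread). In Lua, a decimal escape `\ddd` takes at most three digits and must not exceed 255, so "\8211" is read as `\821` followed by `1` and rejected with "decimal escape too large". The `\u{XXX}` escape takes the code point in hexadecimal and produces its UTF-8 encoding. The variable names are illustrative only.

```lua
-- U+2013 is decimal 8211; U+16F9 is decimal 5881 (the value in the error).
local en_dash = "\u{2013}"
local other = "\u{16F9}"

-- The same characters built from their decimal code points.
local en_dash_from_decimal = utf8.char(8211)
local other_from_decimal = utf8.char(5881)

print(en_dash == en_dash_from_decimal)
print(other == other_from_decimal)
print(#en_dash)  -- length in bytes of the UTF-8 encoding, not characters
```

If the decimal value is all you have, `utf8.char(8211)` avoids converting to hexadecimal by hand.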
* Re: Replace Str with HTML in Lua Filter
From: John MacFarlane @ 2019-08-29 17:51 UTC
To: Ken Dow, pandoc-discuss

Or better yet, just use the Unicode character (make sure your
Lua filter is UTF-8 encoded):

    s.text == '–'
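When several icons need replacing, the literal-character approach can be kept in one lookup table. This is only a sketch: the table `ICONS` and the helper `icon_html` are hypothetical names, and the two mappings are the examples mentioned in the thread (the literal '–' for `face`, U+E5C3 for `apps`). The file must be saved as UTF-8 for the pasted character to match.

```lua
-- Assumption: keys are whole Str contents, values are Material icon names.
local ICONS = {
  ['–'] = 'face',         -- pasted literally; requires a UTF-8 encoded file
  ['\u{E5C3}'] = 'apps',  -- private-use code point written as an escape
}

-- Hypothetical helper: returns the replacement HTML, or nil when the text
-- has no entry in ICONS.
local function icon_html(text)
  local name = ICONS[text]
  if name then
    return '<i class="material-icons">' .. name .. '</i>'
  end
  return nil
end

function Str(s)
  local html = icon_html(s.text)
  if html then
    return pandoc.RawInline('html', html)
  end
end
```

Private-use characters are often invisible in an editor, so writing them as `\u{...}` escapes may be easier to review than pasting them.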
* Re: Replace Str with HTML in Lua Filter
From: Ken Dow @ 2019-08-29 20:24 UTC
To: pandoc-discuss

Thanks John - copying & pasting the Unicode character from the HTML output
into the Lua filter did the trick. Should've thought of that!