* Pandoc Citeproc doesn't work on HTML format
From: Mladen Babic @ 2022-11-07 13:26 UTC
To: pandoc-discuss

Hi all,

I'm trying to reference cites from the .bib file in the HTML but
without success. The function perfectly works for Markdown, so my
question is does the citeproc work on other formats except for MD?

Here are some examples which I use:

    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="en">
    <body>
    Test [@test1]
    </body>
    </html>

Command:

    pandoc --bibliography=test.bib --citeproc test.html -o test.html -s --metadata-file=test.yaml

The .bib file contains the following:

    @article{test1,
      author = {Rathod, N and Kulawik, P and Ozogul, Y and Ozogul, F and Bekhit, A},
      title = {Recent developments in non-thermal processing for seafood and seafood products: cold plasma, pulsed electric field and high hydrostatic pressure},
      journal = {International Journal of Food Science & Technology},
      date = {2022},
      year = {2022},
      pages = {774--790},
      volume = {57},
      number = {2},
      doi = {10.1111/ijfs.15392},
      raw = {Rathod, N. B., Kulawik, P., Ozogul, Y., Ozogul, F., & Bekhit, A. E. D. A. (2022). Recent developments in non-thermal processing for seafood and seafood products: cold plasma, pulsed electric field and high hydrostatic pressure. International Journal of Food Science & Technology, 57(2), 774-790. https://doi.org/10.1111/ijfs.15392}
    }

I have created the Lua filter which covers only partial cases. I'm a
newbie in Lua and can not currently make the complex filter like we have
it for MD.

Thank you.
* Re: Pandoc Citeproc doesn't work on HTML format
From: Albert Krewinkel @ 2022-11-07 14:21 UTC
To: pandoc-discuss

Hi Mladen,

Mladen Babic <mladen.babic-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> I'm trying to reference cites from the .bib file in the HTML but
> without success. The function perfectly works for Markdown, so my
> question is does the citeproc work on other formats except for MD?
>
> Here are some examples which I use:
>
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="en">
> <body>
> Test [@test1]
> </body>
> </html>

The file is parsed as HTML, but the body above uses Markdown syntax; it's
not possible to use that syntax in HTML. However, Markdown can contain
HTML, so you could try with `pandoc --from=markdown ...`.

Note, however, that pandoc conversions are lossy in general. Going from
HTML to HTML might not do what you expect.

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: Pandoc Citeproc doesn't work on HTML format
From: Mladen Babic @ 2022-11-07 15:12 UTC
To: pandoc-discuss

Hi Albert,

Thanks a lot for the quick reply.

OK, I probably missed it in the pandoc citeproc docs, or they don't
mention that only Markdown is supported, so I thought the @test pattern
would work for all formats.

What I actually want to do is this: when a user uploads a DOCX file,
pandoc converts it to HTML, the HTML is shown in an HTML editor for
additional editing by the user, and the result is converted back to DOCX.
After the conversion to HTML, the system (my app) will replace the
existing citations in the HTML, e.g. [1], with the key from the .bib file
(in my case [@test1]) so that citeproc knows how to process them.

I guess I need to convert DOCX to Markdown and then Markdown to HTML, but
I'm afraid the file will lose some of its styles during the conversion
process.

Any tips/hints will be appreciated.

Thank you.
* Re: Pandoc Citeproc doesn't work on HTML format
From: Albert Krewinkel @ 2022-11-08 8:07 UTC
To: pandoc-discuss
Cc: Frederik Eichler

Mladen Babic <mladen.babic-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> What I actually want to do is when the user uploads the DOCX file,
> Pandoc converts the file to HTML and shows it to the HTML editor for
> additional editing by the user and converts it back to DOCX.
> After converting to Html, the system (my app) will replace current
> cites in HTML cite i.e. [1] with the key from the .bib file (like in
> my case [@test1] so the citeproc will know how to process it.

That's an interesting use case. I don't have any immediate ideas; going
via Markdown might be the best option.

But please make sure to also check out [OS-APS], an open-source
project that uses pandoc for some of the document conversions. Going
from your description it sounds like it could be exactly what you need.
I've added Frederik from that org to CC; he may be able to give more info.

[OS-APS]: https://os-aps.de

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: Pandoc Citeproc doesn't work on HTML format
From: 'William Lupton' via pandoc-discuss @ 2022-11-08 9:21 UTC
To: pandoc-discuss
Cc: Frederik Eichler

Re this:

> Ok, I probably missed in the Pandoc citeproc doc that doesn't mention
> that supports only MD, so I thought it would work for all formats with
> pattern @test.

The @test citation syntax is defined under the citations extension
<https://pandoc.org/MANUAL.html#extension-citations> (with target
'extension-citations'). This is within the 'Pandoc's Markdown' section and
so perhaps applies only to markdown.

However, there's another citations extension
<https://pandoc.org/MANUAL.html#org-citations> (with target
'org-citations') in the 'Extensions -> Other extensions' section, and this
describes its usage within org and docx documents.

This little shell script illustrates that the 'citations' extension is
supported for docx, ipynb, jats, markdown (+variants), opml and org, and is
enabled by default for markdown, opml and org.

    % for i in $(pandoc --list-input-formats); do echo -n $i:; pandoc --list-extensions=$i | grep citations || echo; done | grep ':.citations'
    docx:-citations
    ipynb:-citations
    markdown:+citations
    markdown_github:-citations
    markdown_mmd:-citations
    markdown_phpextra:-citations
    markdown_strict:-citations
    opml:+citations
    org:+citations

So I think that (not surprisingly?) the 'citations' syntax supported by a
given input format (if supported) is a function of that input format. The
supported format is clear for markdown (+variants?), org and docx but
perhaps not for ipynb and opml.

I think that it might be useful to clarify some of this in the man page?
Please let me know if I should create an issue.
* Re: Pandoc Citeproc doesn't work on HTML format
From: Mladen Babic @ 2022-11-10 14:09 UTC
To: pandoc-discuss

Thanks all for the feedback.

It would be nice to have citeproc for HTML too. I guess it wouldn't take
too much effort.

In the meantime, I would like to create some Lua filters that cover
several cases, but I'm a newbie to Lua. I handled the simple case
[@test1], but I'm not able to handle e.g. [@test1; @test2].

How can I return a list of cites?

This is my Lua filter:

    function Str(el)
      -- Match a whole Str of the form "[@key]" and capture the key.
      local citekey = el.text:match("[[]@(%w+)[]]")
      if citekey then
        local citation = pandoc.Citation(citekey, 'NormalCitation')
        return pandoc.Cite({pandoc.Str(citekey)}, {citation})
      end
    end

Any help will be appreciated.

Thanks
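The question above, how to return several citations for a group such as [@test1; @test2], could be approached along these lines. This is only a minimal sketch, not a solution given in the thread. It assumes that the HTML reader splits the text into separate Str and Space elements (so a `Str` filter never sees the whole group), and it therefore uses an `Inlines` filter instead. The file name `cite-groups.lua`, the helper names `extract_keys` and `read_group`, and the set of characters allowed in a key are all assumptions made for illustration.

```lua
-- cite-groups.lua (hypothetical file name)
-- Sketch: turn literal "[@key1; @key2]" text into one Cite element so that
-- a later --citeproc pass can resolve it.

-- Collect every "@key" inside a bracketed group. The character set is an
-- assumption; widen it if your keys use other characters.
local function extract_keys(raw)
  local keys = {}
  for key in raw:gmatch('@([%w_%-]+)') do
    keys[#keys + 1] = key
  end
  return keys
end

-- Starting at index i, join Str/Space elements up to the first Str ending
-- in "]". Returns the joined text and the index of the last element used,
-- or nil if the group is not closed within plain text.
local function read_group(inlines, i)
  local parts = {}
  for j = i, #inlines do
    local el = inlines[j]
    if el.t == 'Str' then
      parts[#parts + 1] = el.text
      if el.text:sub(-1) == ']' then
        return table.concat(parts), j
      end
    elseif el.t == 'Space' or el.t == 'SoftBreak' then
      parts[#parts + 1] = ' '
    else
      return nil
    end
  end
  return nil
end

function Inlines(inlines)
  local out = pandoc.List()
  local i = 1
  while i <= #inlines do
    local el = inlines[i]
    local raw, last
    if el.t == 'Str' and el.text:sub(1, 2) == '[@' then
      raw, last = read_group(inlines, i)
    end
    local keys = raw and extract_keys(raw) or {}
    if #keys > 0 then
      -- One Citation per key, all attached to a single Cite element.
      local citations = pandoc.List()
      for _, key in ipairs(keys) do
        citations:insert(pandoc.Citation(key, 'NormalCitation'))
      end
      out:insert(pandoc.Cite({pandoc.Str(raw)}, citations))
      i = last + 1
    else
      out:insert(el)
      i = i + 1
    end
  end
  return out
end
```

Since pandoc applies filters and citeproc in the order given on the command line, the filter would have to come first, for example `pandoc --lua-filter=cite-groups.lua --citeproc --bibliography=test.bib test.html -o out.html`. The sketch ignores prefixes, suffixes, locators and author-suppressed citations, and it does not recognise a group that is directly followed by punctuation such as "[@test1],".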