* Sometimes markdown output tables are HTML
@ 2022-03-21 16:49 Paul Close
From: Paul Close @ 2022-03-21 16:49 UTC
To: pandoc-discuss

Hi all,

I am using pandoc to convert MS Word to gfm (GitHub-Flavored Markdown) and have a problem: some tables are output as HTML instead of the expected markdown. If I use grid_tables there is no problem, but because I need gfm output I have to stick with pipe_tables.

It appears to be related to table width, though I tried some options including --wrap=none and --columns=1000 and neither changed the output.

Is there some way I can force output of pipe_tables, even if they are ugly or overly long? If not, might it be possible to write a lua script to force table input into a pipe_table output? Ideally I'd like some way to control (or even know) when pandoc would decide to output HTML.

Thanks for any thoughts!

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/fc81988f-18f9-45eb-81e9-526a5507140fn%40googlegroups.com.

* Re: Sometimes markdown output tables are HTML
@ 2022-03-22 14:55 Paul Close
From: Paul Close @ 2022-03-22 14:55 UTC
To: pandoc-discuss

An update... While working on small examples to reproduce the problem, I noticed the HTML tables appear to be caused by multiple lines in a cell, which explains why grid tables worked but pipe tables did not.

I am working around this with a filter that joins the multiple lines with RawInline('html', '<br/>').

* Re: Sometimes markdown output tables are HTML
@ 2022-03-23 16:42 Paul Close
From: Paul Close @ 2022-03-23 16:42 UTC
To: pandoc-discuss

Perhaps the final piece of the puzzle: I found my MS Word document had some strange formatting that was not visible but resulted in BlockQuotes in pandoc, which in turn are not valid in pipe_tables.

In case it helps someone else, here is the lua filter I wrote to get markdown pipe_tables output.

    -- Merge multiple paragraphs by appending subsequent paragraphs to the first
    -- with an HTML <br/> separator, so markdown sees them as a single (long!) line.
    function merge_cells(row)
      for cell = 1, #row.cells do
        local cell_contents = row.cells[cell].contents
        if #cell_contents > 1 then
          -- Combine the content of all blocks into the content
          -- of the first block, then clear the remaining blocks
          cell_contents[1].content = pandoc.utils.blocks_to_inlines(
            cell_contents,
            { pandoc.Space(), pandoc.RawInline('html', "<br/>"), pandoc.Space() })
          for block = 2, #cell_contents do
            cell_contents[block] = nil
          end
        end
      end
    end

    function Table(elem)
      -- Fix cases where TableHead is empty, if so move first table row up
      if #elem.head.rows == 0 then
        local row = table.remove(elem.bodies[1].body, 1)
        table.insert(elem.head.rows, 1, row)
      end

      -- Fix cases where multiple lines appear in a table cell since gfm/pipe
      -- tables can only handle single lines. Instead use <br/> to separate.
      for row = 1, #elem.head.rows do
        merge_cells(elem.head.rows[row])
      end
      for row = 1, #elem.bodies[1].body do
        merge_cells(elem.bodies[1].body[row])
      end
      if #elem.bodies > 1 then
        -- Only the first body is processed above. Warn on stderr so the
        -- message does not get mixed into output pandoc writes to stdout.
        io.stderr:write("Warning: table with " .. #elem.bodies .. " bodies.\n")
      end

      -- Block quotes don't work in tables, replace with italics
      return elem:walk {
        BlockQuote = function(el)
          return pandoc.Emph(pandoc.utils.stringify(el.content))
        end
      }
    end
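
On the question in the first message of how to know when pandoc will fall back to HTML: the following is a minimal sketch of a diagnostic Lua filter, not a statement of pandoc's exact rule. It assumes that a cell fits a pipe table only when it is empty or holds a single Plain or Para block with no hard line break, which matches the behaviour described in this thread (multiple blocks and BlockQuotes both triggered HTML). The helper names has_line_break, is_simple_cell and count_complex_cells are made up for illustration, and other causes such as row or column spans are not checked.

```lua
-- Sketch only: report tables whose cells probably prevent pipe-table output.
-- Assumption: a cell is "simple" when it is empty or holds exactly one
-- Plain/Para block without a hard LineBreak.

-- True if the inline list contains a LineBreak at top level.
-- Breaks nested inside other inlines (e.g. Emph) are not checked here.
function has_line_break(inlines)
  for _, inline in ipairs(inlines) do
    if inline.t == "LineBreak" then
      return true
    end
  end
  return false
end

-- True if a cell's block list fits on a single pipe-table line.
function is_simple_cell(blocks)
  if #blocks == 0 then
    return true
  end
  if #blocks > 1 then
    return false
  end
  local block = blocks[1]
  if block.t ~= "Plain" and block.t ~= "Para" then
    return false
  end
  return not has_line_break(block.content)
end

-- Count the cells in a list of rows that are not simple.
function count_complex_cells(rows)
  local count = 0
  for _, row in ipairs(rows) do
    for _, cell in ipairs(row.cells) do
      if not is_simple_cell(cell.contents) then
        count = count + 1
      end
    end
  end
  return count
end

-- Filter entry point: warn on stderr and leave the table unchanged.
function Table(elem)
  local complex = count_complex_cells(elem.head.rows)
  for _, body in ipairs(elem.bodies) do
    complex = complex + count_complex_cells(body.body)
  end
  if complex > 0 then
    io.stderr:write("Warning: table has " .. complex ..
      " cell(s) that do not fit a pipe table\n")
  end
  return nil -- returning nil keeps the element as it is
end
```

Run with --lua-filter, this only prints warnings, so it can be used before the merging filter above to see which tables need fixing. The predicates read the element tag from the t field, so they can also be tried on plain Lua tables outside pandoc.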