* Sometimes markdown output tables are HTML
@ 2022-03-21 16:49 Paul Close
From: Paul Close @ 2022-03-21 16:49 UTC (permalink / raw)
To: pandoc-discuss
Hi all,
I am using pandoc to convert MS Word documents to gfm (GitHub-Flavored
Markdown), and some tables are output as HTML instead of the expected
Markdown pipe tables. If I use grid_tables there is no problem, but because
I need gfm output I have to stick with pipe_tables.
It appears to be related to table width, but the options I tried, including
--wrap=none and --columns=1000, did not change the output.
Is there some way I can force output of pipe_tables, even if they are ugly
or overly long? If not, might it be possible to write a Lua filter that forces
table input into pipe_table output? Ideally I'd like some way to control
(or even know) when pandoc decides to output HTML.
Thanks for any thoughts!
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/fc81988f-18f9-45eb-81e9-526a5507140fn%40googlegroups.com.
* Re: Sometimes markdown output tables are HTML
@ 2022-03-22 14:55 ` Paul Close
From: Paul Close @ 2022-03-22 14:55 UTC (permalink / raw)
To: pandoc-discuss
An update: while building small examples to reproduce the problem, I noticed
that the HTML tables appear to be caused by multiple lines in a cell, which
would explain why grid tables work but pipe tables do not.
I am working around this with a Lua filter that joins the lines with
RawInline('html', '<br/>').
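The observation above suggests a way to find out in advance which tables are affected. The following is only a minimal diagnostic sketch, not something from the thread: it assumes a pandoc version whose Lua API exposes tables as head.rows, bodies[i].body, row.cells and cell.contents (the same fields the filter later in this thread uses), and the helper names count_multiblock_cells and multiblock_cells_in_table are made up for illustration. It ignores the table foot and any intermediate body heads.

```lua
-- Diagnostic sketch: report tables that contain cells with more than one
-- block, the condition observed above to trigger HTML output for gfm.
-- Helper names are hypothetical; only the field names come from pandoc.

-- Count the cells in a list of rows whose contents hold more than one block.
function count_multiblock_cells(rows)
  local count = 0
  for _, row in ipairs(rows) do
    for _, cell in ipairs(row.cells) do
      if #cell.contents > 1 then
        count = count + 1
      end
    end
  end
  return count
end

-- Total over the table head and every table body (foot is not inspected).
function multiblock_cells_in_table(tbl)
  local total = count_multiblock_cells(tbl.head.rows)
  for _, body in ipairs(tbl.bodies) do
    total = total + count_multiblock_cells(body.body)
  end
  return total
end

-- Filter entry point: warn on stderr and leave the table unchanged.
function Table(elem)
  local n = multiblock_cells_in_table(elem)
  if n > 0 then
    io.stderr:write("Table has " .. n .. " cell(s) with multiple blocks\n")
  end
  -- Returning nothing keeps the element as it is.
end
```

Run as an ordinary Lua filter, this only prints warnings; it does not change the document, so it can be used alongside a filter that rewrites the cells.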
* Re: Sometimes markdown output tables are HTML
@ 2022-03-23 16:42 ` Paul Close
From: Paul Close @ 2022-03-23 16:42 UTC (permalink / raw)
To: pandoc-discuss
Perhaps the final piece of the puzzle: I found that my MS Word document had
some strange formatting that was not visible but resulted in BlockQuote
elements in pandoc, which in turn are not valid in pipe_tables.
In case it helps someone else, here is the Lua filter I wrote to get
Markdown pipe_tables output.
-- Merge multiple paragraphs by appending subsequent paragraphs to the first
-- with an HTML <br/> separator, so markdown sees them as a single (long!)
-- line.
function merge_cells(row)
  for cell = 1, #row.cells do
    local cell_contents = row.cells[cell].contents
    if #cell_contents > 1 then
      -- Combine the content of all blocks into the content
      -- of the first block, then clear the remaining blocks
      cell_contents[1].content = pandoc.utils.blocks_to_inlines(
        cell_contents,
        { pandoc.Space(), pandoc.RawInline('html', "<br/>"), pandoc.Space() })
      for block = 2, #cell_contents do
        cell_contents[block] = nil
      end
    end
  end
end

function Table(elem)
  -- Fix cases where TableHead is empty, if so move first table row up
  if #elem.head.rows == 0 then
    local row = table.remove(elem.bodies[1].body, 1)
    table.insert(elem.head.rows, 1, row)
  end

  -- Fix cases where multiple lines appear in a table cell since gfm/pipe
  -- tables can only handle single lines. Instead use <br/> to separate.
  for row = 1, #elem.head.rows do
    merge_cells(elem.head.rows[row])
  end
  for row = 1, #elem.bodies[1].body do
    merge_cells(elem.bodies[1].body[row])
  end
  if #elem.bodies > 1 then
    print("Warning: table with " .. #elem.bodies .. " bodies.")
  end

  -- Block quotes don't work in tables, replace with italics
  return elem:walk {
    BlockQuote = function(el)
      return pandoc.Emph(pandoc.utils.stringify(el.content))
    end
  }
end
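The filter above only merges cells in the head and in the first table body, and warns when there are more bodies. One possible way to cover every body is sketched below. This is an untested assumption rather than part of the original filter: for_each_row is a hypothetical helper name, and the sketch assumes that rows reached through ipairs can be modified in place in the same way as rows reached by index in the code above.

```lua
-- Sketch: apply a row-level function to the head rows and to the rows of
-- every body of a table, instead of only the first body.
-- `for_each_row` is a hypothetical helper, not a pandoc API.
function for_each_row(tbl, fn)
  for _, row in ipairs(tbl.head.rows) do
    fn(row)
  end
  for _, body in ipairs(tbl.bodies) do
    for _, row in ipairs(body.body) do
      fn(row)
    end
  end
end

-- Inside Table(elem), the two row loops could then be written as:
--   for_each_row(elem, merge_cells)
```

Whether tables with several bodies actually occur in docx input is not something the thread establishes, so the warning in the original filter may be all that is needed in practice.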