public inbox archive for pandoc-discuss@googlegroups.com
* pandoc convert from HTML to markdown with pipe-tables
@ 2019-05-24 20:45 SB Chapman
       [not found] ` <4f31ceca-cfec-4af8-a7f1-a360d498fb24-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: SB Chapman @ 2019-05-24 20:45 UTC (permalink / raw)
  To: pandoc-discuss


I'm having trouble converting my HTML to markdown with pipe tables.
The table comes out either as raw HTML or as a grid table, and I
specifically don't want the "+" characters that grid tables use. I also
tried converting to gfm, but then I lose some of the markdown features
that I need elsewhere in my document. Any help would be greatly appreciated.


Input

<Sect> <H1 id="LinkTarget_4081">PLANNING AND BIDDING </H1>

<P>blahc blah blah.</P>

<P>blahc blah blah. </P>

<P>blahc blah blah. </P>

<P>blahc blah blah. </P>


<H5 id="LinkTarget_3932">Table 2-1 Index  Report: Appendices B and C </H5>

<Table>
<TR>
<TD>
<P>Section </P>
</TD>

<TD>
<P>Appendix B Table of Contents </P>
</TD>
</TR>

<TR>
<TD>
<P>I </P>
</TD>

<TD>
<P>Introduction </P>
</TD>
</TR>

</table>

Output (table only):

+-----------------------+-----------------------+-----------------------+
| Section               | Appendix B Table of   |                       |
|                       | Contents              |                       |
+-----------------------+-----------------------+-----------------------+

Command used: pandoc -f HTML -t markdown -o output.md -s

I've also tried -t markdown+pipe_tables-grid_tables and every other
combination of table extensions I can think of.


-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/4f31ceca-cfec-4af8-a7f1-a360d498fb24%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: pandoc convert from HTML to markdown with pipe-tables
       [not found] ` <4f31ceca-cfec-4af8-a7f1-a360d498fb24-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-05-25 14:06   ` mb21
       [not found]     ` <8873e8ed-2ef1-4c43-bde5-3daff55516e8-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2019-05-25 21:00   ` Kolen Cheung
  1 sibling, 1 reply; 4+ messages in thread
From: mb21 @ 2019-05-25 14:06 UTC (permalink / raw)
  To: pandoc-discuss


The problem is that your table cells contain paragraphs (`<p>` tags) in the 
HTML input. Pandoc detects those as block elements, which generally cannot 
be represented in pipe-tables.
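
If editing the input is acceptable, the paragraph wrappers can be removed before pandoc reads the file. The following is only a rough sketch using Python's standard library; the function name `unwrap_cell_paragraphs` is made up for illustration, and the regex approach assumes simple, non-nested tables like the one in the question.

```python
import re

# One table cell: opening tag, body, closing tag (case-insensitive, spans lines).
_CELL = re.compile(r"(<t[dh]\b[^>]*>)(.*?)(</t[dh]\s*>)", re.IGNORECASE | re.DOTALL)
# An opening or closing <p> tag, with optional attributes.
_P_TAG = re.compile(r"</?p\b[^>]*>", re.IGNORECASE)


def unwrap_cell_paragraphs(html: str) -> str:
    """Drop <p> wrappers inside <td>/<th> cells and keep their text.

    Whitespace inside each cell is collapsed to single spaces, so several
    paragraphs in one cell end up as one line of text.
    """
    def fix(match):
        opening, body, closing = match.groups()
        text = " ".join(_P_TAG.sub(" ", body).split())
        return opening + text + closing

    return _CELL.sub(fix, html)
```

Paragraphs outside of table cells are left untouched; only the content between a cell's opening and closing tags is rewritten.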

So you can either change your input HTML or use a pandoc filter to strip
the Para wrappers from the cells. Actually, in this case the filter would
only have to set the column widths to zero, which the HTML reader should
have done anyway. I've submitted a pull request to fix this
case: https://github.com/jgm/pandoc/pull/5524/
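
A JSON filter along these lines might look roughly like the sketch below. It is not the code from the pull request. It assumes the older JSON layout in which a Table carries five fields (caption, alignments, widths, headers, rows); tables in any other layout are passed through unchanged. The helper names are made up for illustration.

```python
import json
import sys


def to_plain(node):
    """Recursively turn every Para block under `node` into a Plain block."""
    if isinstance(node, list):
        return [to_plain(item) for item in node]
    if isinstance(node, dict):
        node = {key: to_plain(value) for key, value in node.items()}
        if node.get("t") == "Para":
            node["t"] = "Plain"
    return node


def fix_table(table):
    """Zero the column widths and flatten cell paragraphs.

    Assumption: table["c"] is [caption, aligns, widths, headers, rows].
    """
    caption, aligns, widths, headers, rows = table["c"]
    widths = [0 for _ in widths]  # zero means "no relative width"
    return {"t": "Table",
            "c": [caption, aligns, widths, to_plain(headers), to_plain(rows)]}


def walk(node):
    """Walk the whole document and rewrite every five-field Table found."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        content = node.get("c")
        if node.get("t") == "Table" and isinstance(content, list) and len(content) == 5:
            return fix_table(node)
        return {key: walk(value) for key, value in node.items()}
    return node


# pandoc passes the output format as the first argument to a JSON filter,
# so the stdin/stdout round trip only runs when that argument is present.
if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(walk(json.load(sys.stdin)), sys.stdout)
```

Saved as an executable script, this could be passed to pandoc with the --filter option. Paragraphs outside of tables are not changed.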



* Re: pandoc convert from HTML to markdown with pipe-tables
       [not found]     ` <8873e8ed-2ef1-4c43-bde5-3daff55516e8-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-05-25 20:45       ` Kolen Cheung
  0 siblings, 0 replies; 4+ messages in thread
From: Kolen Cheung @ 2019-05-25 20:45 UTC (permalink / raw)
  To: pandoc-discuss

This is a problem I’ve run into from time to time. Sometimes the table really isn’t compatible with pipe tables, sometimes it is.

In the latter case, you could try github.com/ickc/pantable. Two new options, pipe_tables and raw_markdown, force pipe-table output when used together. (Note that this was written a few days ago, and I hit a bug yesterday that will be fixed soon.)


* pandoc convert from HTML to markdown with pipe-tables
       [not found] ` <4f31ceca-cfec-4af8-a7f1-a360d498fb24-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2019-05-25 14:06   ` mb21
@ 2019-05-25 21:00   ` Kolen Cheung
  1 sibling, 0 replies; 4+ messages in thread
From: Kolen Cheung @ 2019-05-25 21:00 UTC (permalink / raw)
  To: pandoc-discuss

I might not have been very clear: if you want to use pantable to do this, you first need to run pantable2csv, which gives you a markdown file containing a code block with that table in CSV format. Then you need to manually edit the YAML options to add the two I mentioned. (Currently this cannot be done without a manual edit.)

