Markdown, tables and CSV

public inbox archive for pandoc-discuss@googlegroups.com

* Markdown, tables and CSV
       [not found] ` <047d7b86ebe83c062b05332eab9b-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
@ 2016-05-20  9:38   ` Martin Fenner
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Martin Fenner @ 2016-05-20  9:38 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3071 bytes --]

Dear group,

The topic of CSV support in Pandoc has come up several times on this list, including this thread from 2014:
https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI

Since last year I have worked for an organisation that frequently deals with tabular data (and helped organize CSVconf earlier this month), and I have done some thinking on how CSV could fit into Pandoc. I see two important use cases:

* A CSV reader that converts to tables in HTML, docx, LaTeX, etc.
* CSV as a format to describe tables in markdown

For the first use case I wrote a hack for the Jekyll blogging platform this week that turns CSV files into markdown grid tables, which are then processed by Pandoc (https://github.com/datacite/jekyll-csvy). I would rather use Pandoc with a CSV reader, but my Haskell isn't good enough to write one. For now, though, I can generate blog posts directly from CSV files. Other people have done similar things with Pandoc and CSV.
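
The preprocessing idea (CSV in, grid table out, Pandoc afterwards) can be sketched in a few lines. This is only an illustration using Python's standard `csv` module, not the code from jekyll-csvy; the function name `csv_to_grid_table` is made up here, and the sketch assumes the first row is the header and that no cell contains a line break.

```python
import csv
import io


def csv_to_grid_table(text):
    """Render CSV text as a Markdown grid table.

    Assumptions of this sketch: the first row is the header and
    every cell fits on a single line.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of cells.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]

    def rule(char):
        # Horizontal border: '-' between rows, '=' under the header.
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def line(cells):
        padded = (" " + cell.ljust(w) + " " for cell, w in zip(cells, widths))
        return "|" + "|".join(padded) + "|"

    out = [rule("-"), line(rows[0]), rule("=")]
    for row in rows[1:]:
        out.append(line(row))
        out.append(rule("-"))
    return "\n".join(out)
```

The resulting text can be placed in a markdown document and handed to Pandoc with grid_tables enabled.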

For the second use case I see a clear advantage of CSV over the various attempts to format tables in markdown (simple_tables, multiline_tables, grid_tables, pipe_tables). Everyone (and many tools) understands the CSV format, and you can do most of the things with CSV that the other table formats allow (multi-column formats and column alignment are a bit trickier). This has been done before using Pandoc filters, but I think a "csv_tables" Pandoc extension would make this easier for the casual user. Using the grid_tables example from the Pandoc documentation, it could look like this:

: Sample csv table.

,,,
Fruit,Price,Advantages
Bananas,$1.34,- built-in wrapper\n- bright color
Oranges,$2.10, - cures scurvy\n- tasty
,,,

I like three commas on a new line to indicate the start and end of a table, but that is of course open for discussion. The format is much easier for humans to read and edit than grid tables; the only tricky bit is maybe the \n for multiline columns. I would think we could add metadata to the fenced table block, similar to code blocks, e.g.

,,,{ #mytable .numberRows }
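
To make the proposal concrete, here is a rough sketch of how such a block might be read. The `,,,` fence and the attribute syntax are only the suggestion made in this message, not anything Pandoc supports, and the function name `parse_fenced_csv` is invented for the illustration. A literal `\n` inside a cell is turned into a real line break.

```python
import csv
import re

# A fence line: three commas, optionally followed by "{ ... }" attributes.
FENCE = re.compile(r"^,,,\s*(?:\{(.*)\})?\s*$")


def parse_fenced_csv(block):
    """Parse one ',,,' fenced block into (attributes, rows).

    Hypothetical syntax: the first and last lines are fences, the
    lines in between are comma-separated data.
    """
    lines = block.strip("\n").split("\n")
    opening = FENCE.match(lines[0])
    closing = FENCE.match(lines[-1])
    if len(lines) < 2 or not opening or not closing:
        raise ValueError("block must start and end with a ',,,' line")

    # Only '#id' and '.class' tokens are understood in this sketch.
    attrs = {"id": None, "classes": []}
    for token in (opening.group(1) or "").split():
        if token.startswith("#"):
            attrs["id"] = token[1:]
        elif token.startswith("."):
            attrs["classes"].append(token[1:])

    # Expand the two-character sequence backslash + n into a newline.
    rows = [
        [cell.replace("\\n", "\n") for cell in row]
        for row in csv.reader(lines[1:-1])
    ]
    return attrs, rows
```

One open point the sketch does not settle: a data row consisting only of empty cells would look the same as a fence line.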

One challenge with CSV is that it is an ill-defined format, somewhat similar to markdown before CommonMark. It may make things easier to support only a specific CSV variant (e.g. comma as separator, header required, comment lines not allowed).
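
As an illustration of what "only one variant" could mean in practice, the following sketch reads CSV under a deliberately narrow set of rules with Python's standard `csv` module. The exact rules and the name `read_strict_csv` are assumptions chosen for the example: comma separator, double-quote quoting, a mandatory header row, and the same number of fields in every row. There is no comment syntax, so a line such as `# note` is treated as data and normally fails the field-count check.

```python
import csv
import io


def read_strict_csv(text):
    """Parse CSV text under a narrow, illustrative set of rules."""
    reader = csv.reader(
        io.StringIO(text), delimiter=",", quotechar='"', strict=True
    )
    rows = [row for row in reader if row]  # blank lines are skipped
    if not rows:
        raise ValueError("a header row is required")
    header, body = rows[0], rows[1:]
    if any(not name.strip() for name in header):
        raise ValueError("header cells must not be empty")
    # Row numbers count non-blank rows, with the header as row 1.
    for number, row in enumerate(body, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"row {number} has {len(row)} fields, expected {len(header)}"
            )
    return header, body
```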

Thoughts?

Best,

Martin

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/20BF19CB-A2B0-4B19-A749-D750CDD89736%40martinfenner.org.
For more options, visit https://groups.google.com/d/optout.

[-- Attachment #2: Type: text/html, Size: 4487 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
@ 2016-05-20 17:55       ` John Gabriele
       [not found]         ` <1463766905.1918988.613990665.6CD67781-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
  2016-05-20 18:36       ` John MACFARLANE
                         ` (6 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: John Gabriele @ 2016-05-20 17:55 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 4684 bytes --]

Hi Martin,
 
There's also [issue 553](https://github.com/jgm/pandoc/issues/553).
 
Personally, I think I like (from that issue thread) anton-k's
original idea:
 
![an image](foo.jpg)
 
![an include](foo.txt)
 
![a csv to be rendered as a table](foo.csv)
 
(that is, based on filename extension)
 
Those seem sensible, symmetrical, easy to remember, and I think fit well
with what pandoc already does (`![]()` is already like an include).
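
A tiny sketch of the extension-based decision, purely to illustrate the idea. Pandoc does not do this; the function name `classify_target` and the set of extensions treated as text includes are assumptions made for the example.

```python
import os

# Hypothetical choice of extensions that would count as text includes.
TEXT_EXTENSIONS = {".txt", ".md", ".markdown"}


def classify_target(path):
    """Decide how an image-style target would be handled."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".csv":
        return "table"
    if extension in TEXT_EXTENSIONS:
        return "include"
    # Anything else stays an ordinary image.
    return "image"
```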
 
As for a syntax to allow writing your csv data right into your md file,
... Pandoc already supports a generous number of table formats that are
pretty easy to type. And for larger tables that you might be tempted to
copy/paste in, it might be easier to bang-include them (as in,
`![]()`), rather than muck up your pretty markdown file with a giant
bunch of csv data. :)
 
-- John
 
 
 
 

[-- Attachment #2: Type: text/html, Size: 7420 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
  2016-05-20 17:55       ` John Gabriele
@ 2016-05-20 18:36       ` John MACFARLANE
       [not found]         ` <20160520183616.GB95956-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
  2016-05-21 17:03       ` kurt.pfeifle via pandoc-discuss
                         ` (5 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: John MACFARLANE @ 2016-05-20 18:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Martin Fenner [May 20 16 12:38 ]:
>   I would rather use Pandoc with a CSV reader, but my Haskell isn't good
>   enough to write one.

This would be a pretty easy project for someone trying to
learn Haskell; maybe someone on the list wants to try it?
The cassava library works well for csv parsing.

>   For the second use case I see a clear advantage of CSV over the various
>   attempts to format tables in markdown (simple_tables, multiline_tables,
>   grid_tables, pipe_tables). Everyone (and many tools) understands the
>   CSV format, and you can do most of the things with CSV that the other
>   table formats allow (multi-column formats and column alignment are a
>   bit trickier). This has been done before using Pandoc filters, but I
>   think a Pandoc "csv_tables" Pandoc extension would make this easier for
>   the casual user. Using the grid_tables example from the Pandoc
>   documentation, this could look like this:
>
>   : Sample csv table.
>   ,,,
>   Fruit,Price,Advantages
>   Bananas,$1.34,- built-in wrapper\n- bright color
>   Oranges,$2.10, - cures scurvy\n- tasty
>   ,,,

I think that using a filter that processes specially marked
code blocks is a better way to go than introducing yet
another delimited block type.

For one thing, this will degrade much more gracefully when
you render it with a standard markdown renderer.
(The CSV will show up as code rather than garbage.)

One could think about integrating the filter into pandoc
itself, as an option, but the code and syntax would not
have to be different, I think.
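
A minimal sketch of such a filter, written against pandoc's JSON representation with only the Python standard library. Several things here are assumptions made for the illustration: the marker is taken to be a class named `csv`, the first row is treated as the header, and the table is emitted as a raw HTML block, which only helps for HTML-like output. A real filter would more likely build a native table element, whose JSON shape differs between pandoc versions.

```python
import csv
import html
import io


def table_html(text):
    """Turn CSV text into an HTML table string (first row = header)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return "<table></table>"

    def tr(cells, tag):
        inner = "".join(f"<{tag}>{html.escape(c)}</{tag}>" for c in cells)
        return "<tr>" + inner + "</tr>"

    head = tr(rows[0], "th")
    body = "".join(tr(row, "td") for row in rows[1:])
    return f"<table><thead>{head}</thead><tbody>{body}</tbody></table>"


def convert(node):
    """Walk the JSON AST; replace CodeBlocks of class 'csv' by raw HTML."""
    if isinstance(node, list):
        return [convert(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # CodeBlock content is [[identifier, classes, key-values], text].
            (_ident, classes, _key_values), text = node["c"]
            if "csv" in classes:
                return {"t": "RawBlock", "c": ["html", table_html(text)]}
        return {key: convert(value) for key, value in node.items()}
    return node
```

To use it as a filter, one would read the document with `json.load(sys.stdin)`, pass it through `convert`, and write the result with `json.dump` to standard output. Code blocks without the class are left alone, which is what gives the graceful fallback.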



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <20160520183616.GB95956-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
@ 2016-05-20 19:05           ` John Muccigrosso
  2016-05-20 19:30           ` John Gabriele
  2016-05-20 19:32           ` BP Jonsson
  2 siblings, 0 replies; 42+ messages in thread
From: John Muccigrosso @ 2016-05-20 19:05 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 883 bytes --]

This could really be quite nice, especially for those of us who try (and 
have our students try) to keep stuff in plain text wherever possible.

(While someone else is writing this, a nice add-on to the filter would be 
the ability to select rows in the csv by numbers (mth through nth) or by a 
column value.)
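
Row selection of that kind could look roughly like the sketch below. It is only an illustration: the function name `select_rows` and its parameters are invented here, row positions are taken to be 1-based and inclusive among the data rows, and the input is assumed to have at least a header row.

```python
import csv
import io


def select_rows(text, first=None, last=None, where=None):
    """Return the header plus the data rows that pass the selection.

    first/last: 1-based, inclusive positions among the data rows.
    where: optional (column_name, value) pair to match exactly.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]  # assumes a header row exists
    start = (first or 1) - 1
    stop = last if last is not None else len(body)
    selected = body[start:stop]
    if where is not None:
        column, value = where
        index = header.index(column)
        selected = [row for row in selected if row[index] == value]
    return [header] + selected
```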


[-- Attachment #1.2: Type: text/html, Size: 1316 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <1463766905.1918988.613990665.6CD67781-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
@ 2016-05-20 19:15           ` BP Jonsson
  2016-05-25 14:18           ` Frank Colcord
  1 sibling, 0 replies; 42+ messages in thread
From: BP Jonsson @ 2016-05-20 19:15 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 7595 bytes --]

I like the idea of overloading image syntax for includes, but I'm a little
concerned about how format recognition by extension would work out in
practice. Most extensions that would likely come up map to only one file
format, but some file formats map to multiple extensions, notably markdown,
and for backwards compatibility's sake one would have to assume that an
'unknown' extension is an image. However, now that image elements take
attributes, it would be possible to use an "include" pseudo-attribute with
the same formats as used for the --from option as values, including
extensions if any. I would guess that a filter implementing that would not
be too hard to write in Haskell, which can call pandoc as a library.
Shelling out to pandoc from Python or Perl would also be possible. Also, the
include would probably have to be an image element that is the only content
of its paragraph, since the included document would consist of block elements.
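
A small sketch of how the "include" pseudo-attribute might be read. Everything here is hypothetical: the attribute name comes from the suggestion above, the function names are invented, and the key-value pairs are assumed to arrive as a list of `[key, value]` pairs. The value is split the same way a --from argument is written, a base format followed by `+extension` or `-extension` parts.

```python
import re

SPEC = re.compile(r"^([A-Za-z0-9_]+)((?:[+-][A-Za-z0-9_]+)*)$")


def parse_format_spec(spec):
    """Split e.g. 'markdown+pipe_tables' into (base, [extension toggles])."""
    match = SPEC.match(spec)
    if not match:
        raise ValueError(f"not a format specification: {spec!r}")
    base = match.group(1)
    extensions = re.findall(r"[+-][A-Za-z0-9_]+", match.group(2))
    return base, extensions


def include_request(key_values):
    """Return (format, extensions) if an 'include' attribute is set.

    Returns None when the attribute is absent, in which case the
    element would be treated as an ordinary image.
    """
    for key, value in key_values:
        if key == "include":
            return parse_format_spec(value)
    return None
```

With an explicit attribute the file extension no longer has to carry the format, so an unknown extension can keep meaning "image".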

/bpj





[-- Attachment #2: Type: text/html, Size: 15504 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <20160520183616.GB95956-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
  2016-05-20 19:05           ` John Muccigrosso
@ 2016-05-20 19:30           ` John Gabriele
       [not found]             ` <1463772643.1938448.614055033.793EA897-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
  2016-05-20 19:32           ` BP Jonsson
  2 siblings, 1 reply; 42+ messages in thread
From: John Gabriele @ 2016-05-20 19:30 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

John, this may be something that could stand to be in the faq:

----------------------
Q. Can I write csv content in my input file and have Pandoc render it as
a table?

A. Pandoc doesn't support this directly, but you *can* put your csv data
into a delimited code block, mark the block $thusly, then use a
filter to process these specially-marked blocks. For example, {snip}.

A nice benefit of this is that it will degrade gracefully if the
document is then processed without using the filter.

There may be other options available on the [Pandoc Extras page],
including ... {snip preprocessors, scripts, ...?}
----------------------

Not sure about what "$thusly" would be though. Also have not used Pandoc
filters before.

-- John





^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <20160520183616.GB95956-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
  2016-05-20 19:05           ` John Muccigrosso
  2016-05-20 19:30           ` John Gabriele
@ 2016-05-20 19:32           ` BP Jonsson
  2 siblings, 0 replies; 42+ messages in thread
From: BP Jonsson @ 2016-05-20 19:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 4577 bytes --]

I agree that inline CSV is probably better done by overloading code blocks.
Another reason for that is that CSV doesn't look like a table at all, and
thus is a very un-markdownish syntax. This is a purely aesthetic concern,
though.

If someone were to write the code to implement this, it would be good if one
could customize the delimiter, quote and escape characters through
attributes. I once wrote a simple filter which just passed such options
on to the Text::CSV Perl module. It so happens that I deal with tab
separated values more often than actual CSV. While TSV is much easier to
convert into a pandoc table, it would be good if a CSV reader could handle
it.
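
The same pass-through idea can be sketched with Python's standard `csv` module instead of Text::CSV. The attribute names below simply reuse the `csv.reader` parameter names, and the spelled-out names such as `tab` are an invented convenience; the key-value pairs are assumed to arrive as a list of `[key, value]` pairs from the code block's attributes.

```python
import csv
import io

# Attributes that are forwarded to csv.reader in this sketch.
ALLOWED = {"delimiter", "quotechar", "escapechar"}
# Hypothetical spelled-out names for characters that are awkward to type.
NAMED = {"tab": "\t", "comma": ",", "semicolon": ";"}


def reader_options(key_values):
    """Build csv.reader keyword arguments from attribute pairs."""
    options = {}
    for key, value in key_values:
        if key in ALLOWED:
            value = NAMED.get(value, value)
            if len(value) != 1:
                raise ValueError(f"{key} must be a single character")
            options[key] = value
    return options


def read_table(text, key_values):
    """Parse the code block text with the requested dialect options."""
    reader = csv.reader(io.StringIO(text), **reader_options(key_values))
    return list(reader)
```

With `delimiter=tab` the same code path handles TSV, so no separate reader is needed for it.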

/bpj




[-- Attachment #2: Type: text/html, Size: 5994 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <1463772643.1938448.614055033.793EA897-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
@ 2016-05-20 19:37               ` BP Jonsson
  0 siblings, 0 replies; 42+ messages in thread
From: BP Jonsson @ 2016-05-20 19:37 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 5674 bytes --]

$thusly would probably be a class .csv, plus recognition of a class .code for
when you want the code block to be left as an actual code block.

/bpj

fredag 20 maj 2016 skrev John Gabriele <jgabriele-97jfqw80gc6171pxa8y+qA@public.gmane.org>:

> John, this may be something that could stand to be in the faq:
>
> ----------------------
> Q. Can I write csv content in my input file and have Pandoc render it as
> a table?
>
> A. Pandoc doesn't support this directly, but you *can* put your csv data
> into a delimited code block, mark the block $thusly, then use then use a
> filter to process these specially-marked blocks. For example, {snip}.
>
> A nice benefit of this is that it will degrade gracefully if the
> document is then processed without using the filter.
>
> There may be other options available on the [Pandoc Extras page],
> including ... {snip preprocessors, scripts, ...?}
> ----------------------
>
> Not sure about what "$thusly" would be though. Also have not used Pandoc
> filters before.
>
> -- John
>
>
>
> On Fri, May 20, 2016, at 02:36 PM, John MACFARLANE wrote:
> > +++ Martin Fenner [May 20 16 12:38 ]:
> > >   I would rather use Pandoc with a CSV reader, but my Haskell isn't
> good
> > >   enough to write one.
> >
> > This would be a pretty easy project for someone trying to
> > learn Haskell; maybe someone on the list wants to try it?
> > The cassava library works well for csv parsing.
> >
> > >   For the second use case I see a clear advantage of CSV over the
> various
> > >   attempts to format tables in markdown (simple_tables,
> multiline_tables,
> > >   grid_tables, pipe_tables). Everyone (and many tools) understands the
> > >   CSV format, and you can do most of the things with CSV that the other
> > >   table formats allow (multi-column formats and column alignment are a
> > >   bit trickier). This has been done before using Pandoc filters, but I
> > >   think a Pandoc "csv_tables" Pandoc extension would make this easier
> for
> > >   the casual user. Using the grid_tables example from the Pandoc
> > >   documentation, this could look like this:
> > >
> > >   : Sample csv table.
> > >   ,,,
> > >   Fruit,Price,Advantages
> > >   Bananas,$1.34,- built-in wrapper\n- bright color
> > >   Oranges,$2.10, - cures scurvy\n- tasty
> > >   ,,,
> >
> > I think that using a filter that processes specially marked
> > code blocks is a better way to go than introducing yet
> > another delimited block type.
> >
> > For one thing, this will degrade much more gracefully when
> > you render it with a standard markdown renderer.
> > (The CSV will show up as code rather than garbage.)
> >
> > One could think about integrating the filter into pandoc
> > itself, as an option, but the code and syntax would not
> > have to be different, I think.
> >
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org <javascript:;>.
> > To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> <javascript:;>.
> > To view this discussion on the web visit
> >
> https://groups.google.com/d/msgid/pandoc-discuss/20160520183616.GB95956%40protagoras.berkeley.edu
> .
> > For more options, visit https://groups.google.com/d/optout.
>



-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAFC_yuSYnaun%2Bdg7-6asBvgXbmbh6BAw2HQEbg-nY3Qe-EwBzA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

[-- Attachment #2: Type: text/html, Size: 7868 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
  2016-05-20 17:55       ` John Gabriele
  2016-05-20 18:36       ` John MACFARLANE
@ 2016-05-21 17:03       ` kurt.pfeifle via pandoc-discuss
       [not found]         ` <fbcb1ece-48c7-4451-be2f-1b6cd70b2969-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-05-23  4:42       ` Martin Fenner
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: kurt.pfeifle via pandoc-discuss @ 2016-05-21 17:03 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1891 bytes --]



On Friday, 20 May 2016 at 11:39:02 UTC+2, Martin Fenner wrote:

> Dear group,
>
> The topic of CSV support in Pandoc has come up several times on this list, 
> including this thread from 2014:
> https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI
>
> Since last year I work for an organisation that frequently deals with 
> tabular data (and helped organize CSVconf earlier this month), and I have 
> done some thinking on how CSV could fit into Pandoc.
>
Are you aware of these two Pandoc filters?

   - *pandoc-csv2table <https://github.com/baig/pandoc-csv2table> * 
   (https://github.com/baig/pandoc-csv2table) 
   - pandoc-placetable <https://github.com/mb21/pandoc-placetable>
    (https://github.com/mb21/pandoc-placetable) 

Both “abuse” the fenced code block syntax, assign the class .table to the 
block and allow inline CSV data as well as referencing an external CSV file.

Personally, I still prefer to use *csv2table* (over the newer *placetable*) 
because I can also use it to convert CSV to Markdown tables (grid, simple 
and multiline), which *placetable* currently doesn’t do because it works 
differently (AFAIU):

   - placetable converts CSV to Pandoc’s native format directly 
   - csv2table converts CSV to Markdown first. 
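
To illustrate the "CSV to Markdown first" route, here is a rough Python sketch that turns CSV text into a pipe table. It is not the code of either filter; the function name `csv_to_pipe_table` and the choice of pipe tables are assumptions made for the example:

```python
import csv
import io


def csv_to_pipe_table(text):
    """Convert CSV text into a Markdown pipe table (first row is the header)."""
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]

    def line(cells):
        # Pad short rows to the header width and escape literal pipes.
        cells = list(cells) + [""] * (len(header) - len(cells))
        return "| " + " | ".join(c.strip().replace("|", "\\|") for c in cells) + " |"

    out = [line(header), "|" + "|".join("---" for _ in header) + "|"]
    out.extend(line(row) for row in body)
    return "\n".join(out)
```

The resulting text can then be handed to Pandoc like any other Markdown table; multiline cells would need grid tables instead, which this sketch does not attempt.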




[-- Attachment #1.2: Type: text/html, Size: 6312 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
                         ` (2 preceding siblings ...)
  2016-05-21 17:03       ` kurt.pfeifle via pandoc-discuss
@ 2016-05-23  4:42       ` Martin Fenner
       [not found]         ` <a1503704-4f58-47f7-a9e8-1c60dad8e935-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-05-27 19:50       ` D L
                         ` (3 subsequent siblings)
  7 siblings, 1 reply; 42+ messages in thread
From: Martin Fenner @ 2016-05-23  4:42 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1463 bytes --]

Dear all,

Thank you for your responses. If nobody else beats me to it, I will try to 
write a Pandoc CSV reader in July, when I have a little bit more time.

My argument that CSV is actually a perfect fit for Markdown and a natural 
way to describe tables apparently didn't convince this list. In my view the 
current table implementations supported by Pandoc do their job, but don't 
look like a good fit for the Markdown philosophy of keeping things simple 
yet powerful. That may be one reason why there is no standard Markdown 
syntax for tables. I know that there are various ways to accomplish CSV 
tables with Pandoc filters, etc., so it is not a practical limitation.

As for using the ![]() syntax to load other content besides images, that 
is a very interesting proposition. Given that the GitHub issue is from 
2012, I am not sure how much consensus there is on that topic.

Best,

Martin


[-- Attachment #1.2: Type: text/html, Size: 1994 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <a1503704-4f58-47f7-a9e8-1c60dad8e935-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-23 20:22           ` John MACFARLANE
  0 siblings, 0 replies; 42+ messages in thread
From: John MACFARLANE @ 2016-05-23 20:22 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Martin Fenner [May 22 16 21:42 ]:
>   In my view the current table implementations supported
>   by Pandoc do their job, but don't look like a good fit
>   for the Markdown philosophy of keeping things simple yet
>   powerful.

If "simple yet powerful" is what you want, HTML syntax for
specifying tables is great.

But the Markdown philosophy is not about being "simple but
powerful."  It's about having source files that are
human-readable text files.

Quoting from John Gruber's original Markdown syntax page:

> The overriding design goal for Markdown’s formatting syntax
> is to make it as readable as possible. The idea is that a
> Markdown-formatted document should be publishable as-is, as
> plain text, without looking like it’s been marked up with
> tags or formatting instructions.

CSV is simple (though not as flexible as HTML formatting).
But it doesn't pass this test.

This design goal is really important to keep in mind when
comparing Markdown to alternatives, which are often easier
to write (because, for example, sublists are indicated by
`**` instead of indentation), but don't look as natural
to read.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <1463766905.1918988.613990665.6CD67781-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
  2016-05-20 19:15           ` BP Jonsson
@ 2016-05-25 14:18           ` Frank Colcord
       [not found]             ` <471daa3c-e2ec-4445-b4fd-44e5c8a3fd6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 42+ messages in thread
From: Frank Colcord @ 2016-05-25 14:18 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 5245 bytes --]

I'd like to second this proposal. As a simple Pandoc user, these would be 
very helpful:

 
    ![an image](foo.jpg)
 
    ![an include](foo.txt)
 
    ![a csv to be rendered as a table](foo.csv)
 
thanks for all the development.

Frank

On Friday, May 20, 2016 at 6:55:10 PM UTC+1, jgabriele wrote:
>
> Hi Martin,
>  
> There's also [issue 553](https://github.com/jgm/pandoc/issues/553).
>  
> Personally, I think I like (from that issue thread) anton-k's original 
> idea:
>  
>     ![an image](foo.jpg)
>  
>     ![an include](foo.txt)
>  
>     ![a csv to be rendered as a table](foo.csv)
>  
> (that is, based on filename extension)
>  
> Those seem sensible, symmetrical, easy to remember, and I think fit well 
> with what pandoc already does (`![]()` is already like an include).
>  
> As for a syntax to allow writing your csv data right into your md file, 
> ... Pandoc already supports a generous number of table formats that are 
> pretty easy to type. And for larger tables that you might be tempted to 
> copy/paste in, it might be easier to bang-include them (as in, 
> `![]()`), rather than muck up your pretty markdown file with a giant bunch 
> of csv data. :)
>  
> -- John
>  
>  
>  
>  
> On Fri, May 20, 2016, at 05:38 AM, Martin Fenner wrote:
>
> Dear group,
>  
> The topic of CSV support in Pandoc has come up several times on this list, 
> includes this thread from 2014:
> https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI
>  
> Since last year I work for an organisation that frequently deals with 
> tabular data (and helped organize CSVconf earlier this month), and I have 
> done some thinking on how CSV could fit into Pandoc. I see two important 
> use cases:
>  
> * CSV reader that converts to tables in HTML, docx, latex, etc.
> * CSV as a format to describe tables in markdown
>  
> For the first use case I wrote a hack for the Jekyll blogging platform 
> this week that turns CSV files into markdown grid tables format that is 
> then processed by Pandoc (https://github.com/datacite/jekyll-csvy). I 
> would rather use Pandoc with a CSV reader, but my Haskell isn't good enough 
> to write one. But for now I can generate blog posts directly from CSV 
> files. Other people have done similar things with Pandoc and CSV.
>  
> For the second use case I see a clear advantage of CSV over the various 
> attempts to format tables in markdown (simple_tables, multiline_tables, 
> grid_tables, pipe_tables). Everyone (and many tools) understands the CSV 
> format, and you can do most of the things with CSV that the other table 
> formats allow (multi-column formats and column alignment are a bit 
> trickier). This has been done before using Pandoc filters, but I think a 
> Pandoc "csv_tables" Pandoc extension would make this easier for the casual 
> user. Using the grid_tables example from the Pandoc documentation, this 
> could look like this:
>  
> : Sample csv table.
>  
> ,,,
> Fruit,Price,Advantages
> Bananas,$1.34,- built-in wrapper\n- bright color
> Oranges,$2.10, - cures scurvy\n- tasty
> ,,,
>  
> I like three commas on a new line to indicate the start and end of a 
> table, but that is of course open for discussion. The format is much easier 
> to read and edit for humans compared to grid tables, the only tricky bit is 
> maybe the \n for multiline columns. I would think we could add metadata to 
> the fenced table block similar to code blocks, e.g.
>  
> ,,,{ #mytable .numberRows }
>  
> One challenge with CSV is that it is an ill-defined format somewhat 
> similar to markdown before CommonMark. It may make things easier to only 
> support a specific CSV variant (e.g. comma as separator, header required, 
> comment lines not allowed).
>  
> Thoughts?
>  
> Best,
>  
> Martin
>  
>  
>  
>
>
>
>  
>


[-- Attachment #1.2: Type: text/html, Size: 10171 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <471daa3c-e2ec-4445-b4fd-44e5c8a3fd6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-26  5:42               ` Sergio Correia
       [not found]                 ` <b9147aed-bf8e-4136-8fd2-949dea1034ea-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Sergio Correia @ 2016-05-26  5:42 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 6462 bytes --]

A bit late to the party, but:

1) An alternative to using extensions to indicate the required action would 
be to use protocols.
For instance, this example pandoc filter:

http://scorreia.com/software/panflute/guide.html#calling-external-programs

Thus, you can do

[Caption of the Table](csv://some/path.csv)

Or maybe

[some/path.md](include://)
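
As a rough illustration of dispatching on such pseudo-protocols, here is a Python sketch using the standard `urllib.parse.urlsplit`. The function name `classify_target` and the action labels are hypothetical, and the `csv` and `include` schemes are simply the ones proposed above, not anything Pandoc itself recognises:

```python
from urllib.parse import urlsplit


def classify_target(url):
    """Return (action, path) for a link target, dispatching on its scheme."""
    parts = urlsplit(url)
    # In "csv://some/path.csv" the first path segment is parsed as the
    # network location, so it is glued back onto the path here.
    path = parts.netloc + parts.path
    if parts.scheme == "csv":
        return ("table", path)
    if parts.scheme == "include":
        return ("include", path)
    # Anything else is left alone as an ordinary link.
    return ("link", url)
```

For the second form shown above, where the target is just `include://`, the returned path is empty and a filter would have to take the file name from the link text instead.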



2) For tables, I would also suggest taking a look at this filter:

http://scorreia.com/software/panflute/guide.html#yaml-code-blocks

It allows markdown like this:

Some text

~~~ csv
title: Some Title
has-header: True
---
Col1, Col2, Col3
1, 2, 3
10, 20, 30
~~~

More text


This combines pure CSV with options set up in YAML, so you can add captions 
and customize the table.

Since the CSV is handled by python's CSV library, it is quite powerful. 
Also, complex things like selecting a subset of rows/cols, or adding 
formatting, could be done in 1-2 lines of code. If you are interested, shoot me 
an email and I can add a more complex working example.
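
For readers who want a feel for the idea without installing anything, here is a small Python sketch of splitting such a block into options and rows. It is not the panflute implementation: the function name `parse_csv_block` is invented, and the header is read as plain `key: value` lines rather than real YAML, so every value stays a string:

```python
import csv
import io


def parse_csv_block(text):
    """Split 'options, ---, CSV' block text into (options dict, list of rows)."""
    header, sep, body = text.partition("\n---\n")
    if not sep:
        # No separator line: treat the whole block as CSV.
        header, body = "", text
    options = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            # Simplification: values are kept as strings, not parsed as YAML.
            options[key.strip()] = value.strip()
    rows = list(csv.reader(io.StringIO(body), skipinitialspace=True))
    return options, rows
```

Applied to the block above, this yields the two options (`title` and `has-header`) and three rows of three cells each; a real filter would parse the header with a YAML library so that `True` becomes a boolean.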

Cheers,
S



[-- Attachment #1.2: Type: text/html, Size: 11720 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                 ` <b9147aed-bf8e-4136-8fd2-949dea1034ea-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-26  8:04                   ` Martin Fenner
       [not found]                     ` <B4779237-F368-454A-8E43-93EBCDFDF8AB-i39mICoz+qVg9hUCZPvPmw@public.gmane.org>
  2016-05-26 12:50                   ` BPJ
  1 sibling, 1 reply; 42+ messages in thread
From: Martin Fenner @ 2016-05-26  8:04 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3090 bytes --]


> On 23.05.2016 at 23:22, John MACFARLANE <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:
> 
> But the Markdown philosophy is not about being "simple but
> powerful."  It's about having source files that are
> human-readable text files.
> 
> Quoting from John Gruber's original Markdown syntax page:
> 
>> The overriding design goal for Markdown’s formatting syntax
>> is to make it as readable as possible. The idea is that a
>> Markdown-formatted document should be publishable as-is, as
>> plain text, without looking like it’s been marked up with
>> tags or formatting instructions.
> 
> CSV is simple (though not as flexible as HTML formatting).
> But it doesn't pass this test.
> 
> This design goal is really important to keep in mind when
> comparing Markdown to alternatives, which are often easier
> to write (because, for example, sublists are indicated by
> `**` instead of indentation), but don't look as natural
> to read.

John, you are right, and I will rephrase. I think that seamless integration of markdown and CSV would make a lot of sense for many use cases. A Pandoc CSV reader is a good first step. The second step is then combining multiple Pandoc documents into one. This can of course be done already, but maybe extending the ![]() syntax would make it easier to import other documents (e.g. a number of tables in CSV format) into markdown documents instead of appending multiple documents into one. The only requirement would be that the external file referenced in ![]() is in a format that Pandoc understands; otherwise it is loaded as a binary blob, as for images.


> On 26.05.2016 at 08:42, Sergio Correia <sergio.correia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
> 2) For tables, I would also suggest to take a look at this filter:
> 
> http://scorreia.com/software/panflute/guide.html#yaml-code-blocks
> 
> It allows markdown like this:
> Some text
> 
> ~~~ csv
> title: Some Title
> has-header: True
> ---
> Col1, Col2, Col3
> 1, 2, 3
> 10, 20, 30
> ~~~
> 
> More text
> 
> This combines pure CSV with options set up in YAML, so you can add captions and customize the table.

I like the use of YAML in code blocks, as you frequently need additional metadata to properly understand tabular data (see for example http://data.okfn.org/doc/tabular-data-package). Very cool.

Best,

Martin


[-- Attachment #2: Type: text/html, Size: 5393 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                     ` <B4779237-F368-454A-8E43-93EBCDFDF8AB-i39mICoz+qVg9hUCZPvPmw@public.gmane.org>
@ 2016-05-26 11:48                       ` Frank Colcord
       [not found]                         ` <CADZiF+X6AuYJEnnNCs1M=spfbp9Fn4X2GBVkxXKp9g9SSNH16A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Frank Colcord @ 2016-05-26 11:48 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 5424 bytes --]

@sergio thanks for the pandoc filter. It looks very helpful. I don't know
python, but I can see how helpful that would be.
I've handled the csv in previous projects by changing the delimiters when
saving a spreadsheet so the delimiters are pandoc friendly.

@martin you say "The second step is then combining multiple Pandoc
documents into one. This can of course be done already". I've done this by
constructing long command line strings. I've also seen the preprocessors.
Is there another way to include multiple markdown files? I've been writing
in a wiki format where one txt file has links to others. I'd love to have
an automatic recursive way to include all the links as separate chapters.
If I have to, I don't mind setting up the includes manually. Please let me
know if you have seen documentation on how to do that. I've seen some
documentation of recent pandoc versions which make it seem as if there are
some ways to do this.

I've seen this discussion:
https://groups.google.com/forum/#!searchin/pandoc-discuss/include/pandoc-discuss/eRuhx2Md0BI/mfzp_ZhuDAAJ

This discussion mentions preprocessors: FilePP and gpp.

Have I missed some easier way to include files?

thanks, Frank




-- 
frank-XCqNk0BTMejR7s880joybQ@public.gmane.org, cell: +447763116793, skype: colcord,
if you ever get no response, please send to: fcolcord-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADZiF%2BX6AuYJEnnNCs1M%3Dspfbp9Fn4X2GBVkxXKp9g9SSNH16A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

[-- Attachment #2: Type: text/html, Size: 8803 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                 ` <b9147aed-bf8e-4136-8fd2-949dea1034ea-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-05-26  8:04                   ` Martin Fenner
@ 2016-05-26 12:50                   ` BPJ
  1 sibling, 0 replies; 42+ messages in thread
From: BPJ @ 2016-05-26 12:50 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 8199 bytes --]

While I appreciate the intent behind the suggestion to use protocols like
this, I still think that a pseudo-attribute 'format', which can/will be
discarded, is preferable to pseudo-protocols because (1) if you are going to
use a URL rather than a path, most include files will properly have a
`file://` URL, and (2) in principle it should be possible to include a
markdown/csv/whatever file from a remote location, and in that case there
will be a different protocol like http or ftp. Thus something like this is
safer, more robust, and clearer:

    ![file description or caption](path/or/url){format=markdown}
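
To make the argument concrete, here is a minimal sketch of how a filter might decide what to do with such a target. It is only an illustration under stated assumptions: the function name `resolve_format` and the extension table are hypothetical, and nothing here is part of Pandoc. An explicit `format` attribute wins; otherwise the decision falls back to the file extension of the path component, regardless of the URL scheme.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to a format name.
EXTENSION_FORMATS = {".md": "markdown", ".csv": "csv"}


def resolve_format(target, attributes=None):
    """Pick the format for an include target.

    An explicit `format` attribute wins; otherwise fall back to the
    extension of the path component, whatever the URL scheme is.
    Returns None when nothing matches (e.g. an ordinary image).
    """
    attributes = attributes or {}
    if "format" in attributes:
        return attributes["format"]
    path = urlparse(target).path  # drops scheme, host and query string
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_FORMATS.get(extension)
```

Because the scheme is never inspected, `file://`, `http://` and `ftp://` targets are all treated the same way, which is the point of preferring an attribute over a pseudo-protocol.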

On Thursday, 26 May 2016, Sergio Correia <sergio.correia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> A bit late to the party, but:
>
> 1) An alternative to using extensions to indicate the required action
> would be to use protocols.
> For instance, this example pandoc filter:
>
> http://scorreia.com/software/panflute/guide.html#calling-external-programs
>
> Thus, you can do
>
> [Caption of the Table](csv://some/path.csv)
>
> Or maybe
>
> [some/path.md](include://)
>
>
>
> 2) For tables, I would also suggest to take a look at this filter:
>
> http://scorreia.com/software/panflute/guide.html#yaml-code-blocks
>
> It allows markdown like this:
>
> Some text
>
> ~~~ csv
> title: Some Title
> has-header: True
> ---
> Col1, Col2, Col3
> 1, 2, 3
> 10, 20, 30
> ~~~
>
> More text
>
>
> This combines pure CSV with options set up in YAML, so you can add
> captions and customize the table.
>
> Since the CSV is handled by python's CSV library, it is quite powerful.
> Also, complex things like selecting a subset of rows/cols, or adding
> format, could be done in 1-2 lines of code. If you are interested, shoot me
> an email and I can add a more complex working example.
>
> Cheers,
> S
>
>
> On Wednesday, May 25, 2016 at 7:18:10 AM UTC-7, Frank Colcord wrote:
>>
>> I'd like to second this proposal. As a simple Pandoc user, these would be
>> very helpful:
>>
>>
>>     ![an image](foo.jpg)
>>
>>     ![an include](foo.txt)
>>
>>     ![a csv to be rendered as a table](foo.csv)
>>
>> thanks for all the development.
>>
>> Frank
>>
>> On Friday, May 20, 2016 at 6:55:10 PM UTC+1, jgabriele wrote:
>>>
>>> Hi Martin,
>>>
>>> There's also [issue 553](https://github.com/jgm/pandoc/issues/553).
>>>
>>> Personally, I think I like (from that issue thread) anton-k's original
>>> idea:
>>>
>>>     ![an image](foo.jpg)
>>>
>>>     ![an include](foo.txt)
>>>
>>>     ![a csv to be rendered as a table](foo.csv)
>>>
>>> (that is, based on filename extension)
>>>
>>> Those seem sensible, symmetrical, easy to remember, and I think fit well
>>> with what pandoc already does (`![]()` is already like an include).
>>>
>>> As for a syntax to allow writing your csv data right into your md file,
>>> ... Pandoc already supports a generous number of table formats that are
>>> pretty easy to type. And for larger tables that you might be tempted to
>>> copy/paste in, it might be easier to bang-include them (as in,
>>> `![]()`), rather than muck up your pretty markdown file with a giant bunch
>>> of csv data. :)
>>>
>>> -- John
>>>
>>>
>>>
>>>
>>> On Fri, May 20, 2016, at 05:38 AM, Martin Fenner wrote:
>>>
>>> Dear group,
>>>
>>> The topic of CSV support in Pandoc has come up several times on this
>>> list, includes this thread from 2014:
>>> https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI
>>>
>>> Since last year I work for an organisation that frequently deals with
>>> tabular data (and helped organize CSVconf earlier this month), and I have
>>> done some thinking on how CSV could fit into Pandoc. I see two important
>>> use cases:
>>>
>>> * CSV reader that converts to tables in HTML, docx, latex, etc.
>>> * CSV as a format to describe tables in markdown
>>>
>>> For the first use case I wrote a hack for the Jekyll blogging platform
>>> this week that turns CSV files into markdown grid tables format that is
>>> then processed by Pandoc (https://github.com/datacite/jekyll-csvy). I
>>> would rather use Pandoc with a CSV reader, but my Haskell isn't good enough
>>> to write one. But for now I can generate blog posts directly from CSV
>>> files. Other people have done similar things with Pandoc and CSV.
>>>
>>> For the second use case I see a clear advantage of CSV over the various
>>> attempts to format tables in markdown (simple_tables, multiline_tables,
>>> grid_tables, pipe_tables). Everyone (and many tools) understands the CSV
>>> format, and you can do most of the things with CSV that the other table
>>> formats allow (multi-column formats and column alignment are a bit
>>> trickier). This has been done before using Pandoc filters, but I think a
>>> Pandoc "csv_tables" Pandoc extension would make this easier for the casual
>>> user. Using the grid_tables example from the Pandoc documentation, this
>>> could look like this:
>>>
>>> : Sample csv table.
>>>
>>> ,,,
>>> Fruit,Price,Advantages
>>> Bananas,$1.34,- built-in wrapper\n- bright color
>>> Oranges,$2.10, - cures scurvy\n- tasty
>>> ,,,
>>>
>>> I like three commas on a new line to indicate the start and end of a
>>> table, but that is of course open for discussion. The format is much easier
>>> to read and edit for humans compared to grid tables, the only tricky bit is
>>> maybe the \n for multiline columns. I would think we could add metadata to
>>> the fenced table block similar to code blocks, e.g.
>>>
>>> ,,,{ #mytable .numberRows }
>>>
>>> One challenge with CSV is that it is an ill-defined format somewhat
>>> similar to markdown before CommonMark. It may make things easier to only
>>> support a specific CSV variant (e.g. comma as separator, header required,
>>> comment lines not allowed).
>>>
>>> Thoughts?
>>>
>>> Best,
>>>
>>> Martin
>>>
>>>
>>>
>>>
>>>
>
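
The YAML-plus-CSV code block in Sergio's message quoted above can be taken apart with the Python standard library alone. The following is a rough sketch, not the panflute implementation: the function name `split_options_and_csv` is hypothetical, and the options are read as flat `key: value` pairs rather than with a real YAML parser.

```python
import csv
import io


def split_options_and_csv(block_text):
    """Split a code block into (options, rows) at the first `---` line.

    Options are read as flat `key: value` pairs only; this is a
    simplification, not a YAML parser.
    """
    head, separator, tail = block_text.partition("\n---\n")
    if not separator:  # no options section: everything is CSV
        head, tail = "", block_text
    options = {}
    for line in head.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            options[key.strip()] = value.strip()
    # skipinitialspace drops the blank after each comma ("1, 2, 3").
    reader = csv.reader(io.StringIO(tail), skipinitialspace=True)
    rows = [row for row in reader if row]
    return options, rows
```

Once the rows are plain lists, selecting a subset of columns is a one-liner, for example `[[row[0], row[2]] for row in rows]`.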


[-- Attachment #2: Type: text/html, Size: 12441 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                         ` <CADZiF+X6AuYJEnnNCs1M=spfbp9Fn4X2GBVkxXKp9g9SSNH16A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-27  6:33                           ` John Gabriele
       [not found]                             ` <1464330807.2727387.620260561.2CC32090-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: John Gabriele @ 2016-05-27  6:33 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 1674 bytes --]

On Thu, May 26, 2016, at 07:48 AM, Frank Colcord wrote:
> {snip}
> I've handled the csv in previous projects by changing the delimiters
> when saving a spreadsheet so the delimiters are pandoc friendly.

I too have done the same thing in the past. LibreOffice can easily
change the column delimiter in a csv file (from commas to pipes, and
back again). But, in the end, even if you only want to copypaste the
table into your .md file and not edit it, it still looks crummy when
reading it in plain text. If I'm going to go through the (albeit
small) hassle of converting csv to use pipes so I can paste it into my
.md file, it's just as easy to use a script to convert the csv into a
pandoc-markdown-formatted table, and paste that in instead. That way,
my doc still is readable as plain text --- you just need to remember
that it's generated content, and so if you want to make changes, do
them in the csv.

Because of how easy it is to do that, I'd be inclined to say that Pandoc
could do without introducing extra syntax for automatic rendering of raw
pasted-in csv data.
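
For illustration, such a conversion script can be quite small. The sketch below is an assumption-laden example rather than any particular existing tool: the function name `csv_to_pipe_table` is made up, the first CSV row is assumed to be the header, and the output uses Pandoc's pipe table syntax.

```python
import csv
import io


def csv_to_pipe_table(csv_text):
    """Convert CSV text (first row = header) into a pandoc pipe table."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Pad short rows and escape pipes so every line has the same columns.
    rows = [
        [cell.strip().replace("|", "\\|") for cell in row]
        + [""] * (width - len(row))
        for row in rows
    ]
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join(["---"] * width) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

Cells that contain line breaks are not handled here, since pipe tables cannot hold multi-line cells; a grid table would be needed for those.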

-- John


[-- Attachment #2: Type: text/html, Size: 2620 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                             ` <1464330807.2727387.620260561.2CC32090-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
@ 2016-05-27 10:24                               ` Frank Colcord
       [not found]                                 ` <e1ffce2d-9cc0-4367-a652-a46fa5c141a6-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-05-27 15:48                               ` 'Jason White' via pandoc-discuss
  1 sibling, 1 reply; 42+ messages in thread
From: Frank Colcord @ 2016-05-27 10:24 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 2225 bytes --]

What's the script you use to turn a csv into a readable table in markdown?

The reason I liked this format:
![Table name](./tables/table.csv)

was that I could edit the csv files in my spreadsheet processor and keep 
them as tables in the pdf I generated.
This lets me sort items, add columns, move columns around very easily, 
using a tool built for that job.

If someone else has a better format, that's great.

Using a script is less great.

On Friday, May 27, 2016 at 7:33:32 AM UTC+1, jgabriele wrote:
>
> On Thu, May 26, 2016, at 07:48 AM, Frank Colcord wrote:
>
> {snip}
> I've handled the csv in previous projects by changing the delimiters when 
> saving a spreadsheet so the delimiters are pandoc friendly.
>
>  
> I too have done the same thing in the past. LibreOffice can easily change 
> the column delimiter in a csv file (from commas to pipes, and back again). 
> But, in the end, even if you only want to copypaste the table into your .md 
> file and not edit it, it still looks crummy when reading it in plain text. 
> If I'm going to go through the (albeit small) hassle of converting csv to 
> use pipes so I can paste it into my .md file, it's just as easy to use a 
> script to convert the csv into a pandoc-markdown -formatted table, and 
> paste that in instead. That way, my doc still is readable as plain text --- 
> you just need to remember that it's generated content, and so if you want 
> to make changes, do them in the csv.
>  
> Because of how easy it is to do that, I'd be inclined to say that Pandoc 
> could do without introducing extra syntax for automatic rendering of raw 
> pasted-in csv data.
>  
> -- John
>  
>


[-- Attachment #1.2: Type: text/html, Size: 3317 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                                 ` <e1ffce2d-9cc0-4367-a652-a46fa5c141a6-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-27 14:02                                   ` John Gabriele
  0 siblings, 0 replies; 42+ messages in thread
From: John Gabriele @ 2016-05-27 14:02 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 2509 bytes --]

On Fri, May 27, 2016, at 06:24 AM, Frank Colcord wrote:
> What's the script you use to turn a csv into readable table in
> markdown?
 
Here's one I wrote: <https://github.com/uvtc/csv2md>.
 
> the reason I liked this format:
> ![Table name](./tables/table.csv)
>
> was that I could edit the csv files in my spreadsheet processor and
> keep them as tables in the pdf I generated.
> This lets me sort items, add columns, move columns around very easily,
> using a tool built for that job.
 
Oh, absolutely. I like that too. My most recent comment was specifically
about copypasting blocks of csv content directly into your .md file, and
having Pandoc treat that specially and render it as a table.
 
-- John
 
 


[-- Attachment #2: Type: text/html, Size: 4573 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                             ` <1464330807.2727387.620260561.2CC32090-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
  2016-05-27 10:24                               ` Frank Colcord
@ 2016-05-27 15:48                               ` 'Jason White' via pandoc-discuss
  1 sibling, 0 replies; 42+ messages in thread
From: 'Jason White' via pandoc-discuss @ 2016-05-27 15:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

John Gabriele <jgabriele-97jfqw80gc6171pxa8y+qA@public.gmane.org> wrote:
  
> Because of how easy it is to do that, I'd be inclined to say that Pandoc
> could do without introducing extra syntax for automatic rendering of raw
> pasted-in csv data.


If you can reliably identify the blocks of CSV text, the script could become a
Pandoc filter - optional, but available to those who want it.
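
As a rough sketch of the identification step, assuming the CSV lives in fenced code blocks that carry an explicit `csv` class (as in the `~~~ csv` example elsewhere in this thread) and assuming the JSON layout of recent Pandoc versions with a top-level `blocks` key: the helper name `find_csv_blocks` is hypothetical. Building the replacement Table node is deliberately left out, because its JSON shape depends on the Pandoc version.

```python
import csv
import io


def find_csv_blocks(node):
    """Yield (attributes, rows) for every CodeBlock carrying the class `csv`.

    `node` is a pandoc JSON AST (or any fragment of it) as loaded by
    json.load. A CodeBlock is encoded as
    {"t": "CodeBlock", "c": [[id, classes, key_values], text]}.
    """
    if isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            (identifier, classes, key_values), text = node["c"]
            if "csv" in classes:
                reader = csv.reader(io.StringIO(text), skipinitialspace=True)
                yield dict(key_values), [row for row in reader if row]
            return  # code blocks have no nested blocks to visit
        for value in node.values():
            yield from find_csv_blocks(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_csv_blocks(item)
```

A real filter would read the document with `json.load(sys.stdin)`, swap each matching block for a table, and write the result back with `json.dump`.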


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
                         ` (3 preceding siblings ...)
  2016-05-23  4:42       ` Martin Fenner
@ 2016-05-27 19:50       ` D L
  2016-11-14  7:40       ` Kolen Cheung
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 42+ messages in thread
From: D L @ 2016-05-27 19:50 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 2156 bytes --]


I'm a novice in applying pandoc, but I'm interested in batch processing of
personalised documents (web content and slide presentations).

This requires both pre-processing and post-processing stages in the
document production pipeline.
My inclination is to leverage PHP (although I see that other scripting
languages such as Python are used).

After playing around with the different options I read about in this thread,
I can now meet the requirements with the following approach, which fits my
PHP development workflow.

(a) PHP5 is a requirement on Ubuntu.
(b) add a php extension to my input markdown file (e.g. test.md becomes
test.md.php)
(c) now add php pre-processing functions to my test.md.php input file
(d) run in a command terminal .. php test.md.php > test.md .. to pre-process
the markdown mixed with php
(e) pre-process functions included in test.md.php might be (as an example) ..
     <?php embedObject("path/to/test.csv", $csv_range, $object_type, $object_style) ?>
(f) the HTML code returned from the above function is embedded inline
between <section></section> tags
(g) another php function includes files (sections) recursively from nested
folders.
(h) I'm also researching how harp might fit into the workflow:
https://harpjs.com/
(i) finally, the personalisation variables for each run might be driven from
mongodb json content.
(j) I use the Atom markdown editor with markdown preview and PHP packages
installed

This hybrid md.php approach might run against the grain, but I throw it in
here as another suggested workflow.
 


[-- Attachment #1.2: Type: text/html, Size: 3150 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <fbcb1ece-48c7-4451-be2f-1b6cd70b2969-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-29 12:45           ` mb21
       [not found]             ` <f0058def-bd69-40c1-82b4-e7bdd151c46c-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: mb21 @ 2016-05-29 12:45 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 2256 bytes --]

@Kurt, pandoc-placetable works perfectly to convert csv to markdown table 
syntax (and any other output format where pandoc supports generating 
tables): 
pandoc --filter pandoc-placetable -t markdown


On Saturday, May 21, 2016 at 7:03:57 PM UTC+2, kurt.p...-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org 
wrote:
>
> On Friday, 20 May 2016 at 11:39:02 UTC+2, Martin Fenner wrote:
>
> Dear group,
>>
>> The topic of CSV support in Pandoc has come up several times on this 
>> list, includes this thread from 2014:
>> https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI
>>
>> Since last year I work for an organisation that frequently deals with 
>> tabular data (and helped organize CSVconf earlier this month), and I have 
>> done some thinking on how CSV could fit into Pandoc.
>>
> Are you aware of these two Pandoc filters?
>
>    - pandoc-csv2table (https://github.com/baig/pandoc-csv2table)
>    - pandoc-placetable (https://github.com/mb21/pandoc-placetable)
>
> Both “abuse” the fenced code block syntax, assign the class .table to the 
> block and allow inline CSV data as well as referencing an external CSV file.
>
> Personally, I still prefer to use *csv2table* (over the newer *placetable*)
> because I can also use it to convert CSV to Markdown tables (grid, simple
> and multiline), which *placetable* currently doesn't do because it works
> differently (AFAIU):
>
>    - placetable converts CSV to Pandoc’s native format directly 
>    - csv2table converts CSV to Markdown first. 
>
> 
>


[-- Attachment #1.2: Type: text/html, Size: 9250 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <f0058def-bd69-40c1-82b4-e7bdd151c46c-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-29 15:59               ` kurt.pfeifle via pandoc-discuss
       [not found]                 ` <001833c9-e40d-4079-ba79-c88c852780a5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: kurt.pfeifle via pandoc-discuss @ 2016-05-29 15:59 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3000 bytes --]



On Sunday, 29 May 2016 at 14:45:35 UTC+2, mb21 wrote:

> @Kurt, pandoc-placetable works perfectly to convert csv to markdown table
> syntax (and any other output format where pandoc supports generating
> tables):
> pandoc --filter pandoc-placetable -t markdown

Sorry, I did not intend to misrepresent what pandoc-placetable currently
can do and what it cannot.

I know it can convert to Markdown tables. But (AFAIU) it can generate only 
one type of table: simple_table.
However, with pandoc-csv2table I can generate simple_table, multiline_table, 
pipe_table and grid_table types — simply by adding it into the code block 
metadata: {.table header="yes" type="grid" ....}.

I tried to get the same thing with pandoc --filter pandoc-placetable -t 
markdown+multiline_tables, but it didn’t work.

(Maybe I’m missing something — then please tell me.)

Cheers, Kurt



[-- Attachment #1.2: Type: text/html, Size: 23709 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                 ` <001833c9-e40d-4079-ba79-c88c852780a5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-30  7:47                   ` mb21
       [not found]                     ` <27f2fe62-8115-4513-b13a-c995f625f60d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: mb21 @ 2016-05-30  7:47 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3797 bytes --]

Well, in the end it is the pandoc markdown writer that generates the
tables, so you're on the right track with -t markdown+multiline_tables.
Unfortunately, the logic is a bit more complicated; see
https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/Markdown.hs#L444.
Since all table extensions are enabled by default, you have to turn OFF the
kinds of tables you don't want, e.g. to get pipe tables: pandoc --filter
pandoc-placetable -t markdown-simple_tables
I agree that this is not optimal, but this behaviour should be changed in
the markdown writer, not the filter...

On Sunday, May 29, 2016 at 5:59:49 PM UTC+2, kurt.p...-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org wrote:
>
> On Sunday, 29 May 2016 at 14:45:35 UTC+2, mb21 wrote:
>
>> @Kurt, pandoc-placetable works perfectly to convert csv to markdown table
>> syntax (and any other output format where pandoc supports generating
>> tables):
>> pandoc --filter pandoc-placetable -t markdown
>
> Sorry, I did not intend to misrepresent what pandoc-placetable
> currently can do and what it cannot.
>
> I know it can convert to Markdown tables. But (AFAIU) it can generate only 
> one type of table: simple_table.
> However, with pandoc-csv2table I can generate simple_table, 
> multiline_table, pipe_table and grid_table types — simply by adding it 
> into the code block metadata: {.table header="yes" type="grid" ....}.
>
> I tried to get the same thing with pandoc --filter pandoc-placetable -t 
> markdown+multiline_tables, but it didn’t work.
>
> (Maybe I’m missing something — then please tell me.)
>
> Cheers, Kurt
>
> On Saturday, May 21, 2016 at 7:03:57 PM UTC+2, kurt.p...-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org 
>> wrote:
>>>
>>> Am Freitag, 20. Mai 2016 11:39:02 UTC+2 schrieb Martin Fenner:
>>>
>>> Dear group,
>>>>
>>>> The topic of CSV support in Pandoc has come up several times on this 
>>>> list, includes this thread from 2014:
>>>> https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI
>>>>
>>>> Since last year I work for an organisation that frequently deals with 
>>>> tabular data (and helped organize CSVconf earlier this month), and I have 
>>>> done some thinking on how CSV could fit into Pandoc.
>>>>
>>> Are you aware of these two Pandoc filters?
>>>
>>>    - *pandoc-csv2table <https://github.com/baig/pandoc-csv2table> * (
>>>    https://github.com/baig/pandoc-csv2table) 
>>>    - pandoc-placetable <https://github.com/mb21/pandoc-placetable> (
>>>    https://github.com/mb21/pandoc-placetable) 
>>>
>>> Both “abuse” the fenced code block syntax, assign the class .table to 
>>> the block and allow inline CSV data as well as referencing an external CSV 
>>> file.
>>>
>>> Personally, I still prefer to use *csv2table* (over the newer 
>>> *placetable*) because I can also use it to convert CSV to Markdown 
>>> tables (grid, simple and multiline) — which *placetable” currently doesn’t 
>>> do because it works differently (AFAIU):
>>>
>>>    - placetable converts CSV to Pandoc’s native format directly 
>>>    - csv2table converts CSV to Markdown first. 
>>>

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/27f2fe62-8115-4513-b13a-c995f625f60d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

* Re: Markdown, tables and CSV
       [not found]                     ` <27f2fe62-8115-4513-b13a-c995f625f60d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-05-31 14:28                       ` kurt.pfeifle via pandoc-discuss
  0 siblings, 0 replies; 42+ messages in thread
From: kurt.pfeifle via pandoc-discuss @ 2016-05-31 14:28 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 2635 bytes --]



On Monday, May 30, 2016 at 09:47:50 UTC+2, mb21 wrote:
>
> Well, in the end it is the pandoc markdown writer that is used to generate 
> the tables, so you're on the right track with -t markdown+multiline_tables. 
> However, unfortunately the logic is a bit more complicated, see 
> https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/Markdown.hs#L444. 
> Since all table-options are enabled by default, you have to turn OFF the 
> kind of tables you don't want
>

Aaaaaahhhh... (and now I even vaguely remember that I had stumbled across 
this item before, but my leaky brain + memory had forgotten it again for 
good).

Thanks for the reminder. I'll have to revisit this topic in the near 
future.
 

> , e.g. to get pipe-tables: pandoc --filter pandoc-placetable -t 
> markdown-simple_tables
> I agree that this is not optimal, but this behaviour should be changed in 
> the markdown writer, not the filter...
>
> On Sunday, May 29, 2016 at 5:59:49 PM UTC+2, kurt.p...-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org 
> wrote:
>>
>> On Sunday, May 29, 2016 at 14:45:35 UTC+2, mb21 wrote:
>>
>> @Kurt, pandoc-placetable works perfectly to convert csv to markdown table 
>>> syntax (and any other output format where pandoc supports generating 
>>> tables): 
>>> pandoc --filter pandoc-placetable -t markdown
>>>
>>> Sorry, I did not intend to mis-represent what pandoc-placetable 
>> currently can do and what it cannot.
>>
>> I know it can convert to Markdown tables. But (AFAIU) it can generate 
>> only one type of table: simple_table.
>> However, with pandoc-csv2table I can generate simple_table, 
>> multiline_table, pipe_table and grid_table types — simply by adding it 
>> into the code block metadata: {.table header="yes" type="grid" ....}.
>>
>> I tried to get the same thing with pandoc --filter pandoc-placetable -t 
>> markdown+multiline_tables, but it didn’t work.
>>
>> (Maybe I’m missing something — then please tell me.)
>>
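
To make the quoted explanation concrete, here is a rough model of why adding `+multiline_tables` changes nothing while removing an extension does. It assumes the writer simply tries table syntaxes in a fixed priority order. This is not pandoc's actual code: the real writer also looks at the table content (see the source linked above), and the names `PRIORITY` and `pick_table_syntax`, the order after the first two entries, and the `raw_html` fallback are assumptions for illustration.

```python
# Simplified model (an assumption, not pandoc's source) of a writer that
# tries table syntaxes in a fixed priority order.
PRIORITY = ["simple_tables", "pipe_tables", "multiline_tables", "grid_tables"]


def pick_table_syntax(enabled):
    """Return the first table extension in PRIORITY that is enabled."""
    for ext in PRIORITY:
        if ext in enabled:
            return ext
    return "raw_html"  # hypothetical fallback when every table extension is off


all_on = set(PRIORITY)
print(pick_table_syntax(all_on))                      # simple_tables
print(pick_table_syntax(all_on - {"simple_tables"}))  # pipe_tables
```

Under this model, enabling an extension that is already on cannot move it ahead of a higher-priority one, so the only way to reach a later syntax is to switch off everything in front of it.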

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
                         ` (4 preceding siblings ...)
  2016-05-27 19:50       ` D L
@ 2016-11-14  7:40       ` Kolen Cheung
       [not found]         ` <14b8fa54-dc04-4874-bf47-fb268fc9f298-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-11-18  9:22       ` mb21
  2016-11-29 22:13       ` Kolen Cheung
  7 siblings, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-11-14  7:40 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig


[-- Attachment #1.1: Type: text/plain, Size: 5694 bytes --]

Yet Another Pandoc Filter on CSV Tables

For those who are interested in using CSV tables in pandoc markdown, I’m 
writing a filter that builds upon one of panflute’s examples. I’m still 
thinking about which exact syntax to use. Feel free to give suggestions.

Features

What separates this from the existing filters is the automatic calculation 
of the column widths (in contrast with pandoc-placetable), writing to the 
pandoc AST directly (in contrast with pandoc-csv2table), and the ability to 
specify the table width (as a ratio of the line width). Also, unlike those 
two filters, panflute’s example uses YAML to store the metadata rather than 
the attributes of the code block (which I think is more natural for data).
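
One way the automatic column-width calculation could work is sketched below. This is only an illustration under the assumption that each column gets a share of the table width proportional to its longest cell; the function name `auto_column_widths` is made up here and the actual filter may use a different rule.

```python
import csv
import io


def auto_column_widths(csv_text, table_width=1.0):
    """Split table_width among columns in proportion to the longest cell in each."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    n_cols = max(len(row) for row in rows)
    longest = [0] * n_cols
    for row in rows:
        for i, cell in enumerate(row):
            longest[i] = max(longest[i], len(cell))
    total = sum(longest) or 1  # avoid dividing by zero when every cell is empty
    return [table_width * n / total for n in longest]


# Two columns whose longest cells are 2 and 6 characters share 0.8 of the line.
widths = auto_column_widths("a,bbb\ncc,dddddd\n", table_width=0.8)
print(widths)
```

The returned ratios sum to `table_width`, which matches the idea of giving the table width as a ratio of the line width.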

There’s a notebook in ickc/pandoc-table-csv-test/panflute-csv2table.ipynb 
<https://github.com/ickc/pandoc-table-csv-test/blob/master/ipynb/panflute-csv2table.ipynb> 
and the filter is at ickc/pandoc-table-csv-test/csv-tables.py 
<https://github.com/ickc/pandoc-table-csv-test/blob/master/bin/csv-tables.py>.

The current syntax is this (borrowing much from panflute’s example; the 
CSV is borrowed from pandoc-csv2table):

~~~csv
title: "*Great* Title"
has-header: False
column-width:
  - 0.1
  - 0.2
  - 0.3
  - 0.4
table-width: 0.8
alignment: LRC
markdown: True
---
1,2,3,4
~~~
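
For readers wondering how such a block body might be taken apart, here is a minimal sketch using only the standard library. It assumes the first line consisting of exactly `---` separates the YAML options from the CSV data; the function name `split_block` is hypothetical, and the YAML text is returned unparsed because parsing it would need a YAML library.

```python
import csv
import io


def split_block(text):
    """Split a code block body into (yaml_text, csv_rows) at the first '---' line."""
    lines = text.splitlines()
    if "---" in lines:
        cut = lines.index("---")
        yaml_text = "\n".join(lines[:cut])
        csv_text = "\n".join(lines[cut + 1:])
    else:
        yaml_text, csv_text = "", text  # no separator: treat everything as CSV
    rows = list(csv.reader(io.StringIO(csv_text)))
    return yaml_text, rows


body = 'title: "*Great* Title"\nhas-header: False\n---\n1,2,3,4\n'
yaml_text, rows = split_block(body)
print(rows)  # [['1', '2', '3', '4']]
```

Note that a block without YAML options whose CSV data happens to contain a bare `---` line would be split in the wrong place, which is one cost of mixing the two in the same body.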

A Comparison of Metadata Keys Between Pandoc Filters on CSV Tables 

My biggest question is which metadata keys to use. In terms of backward 
compatibility, pandoc-csv2table and pandoc-placetable store metadata in 
attributes, while panflute’s example and my filter store it in YAML, so 
panflute’s is the only one I need to stay compatible with. But I actually 
think the keys used by pandoc-csv2table and pandoc-placetable make more 
sense, e.g. header vs has-header, caption vs title.

For alignment, pandoc-csv2table and pandoc-placetable use aligns. For 
widths, pandoc-placetable uses widths. I’m not sure if I should follow them.

A comparison of the keys: (The output is generated by my filter)

+---------+---------------------------------+----------------------+------------------------+---------------------------------+
|         | pandoc-csv2table                | pandoc-placetable    | panflute example       | my proposal                     |
+=========+=================================+======================+========================+=================================+
| type    | type=simple|multiline|grid|pipe |                      |                        |                                 |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
| header  | header=yes|no                   | header=yes|no        | has-header: True|False | header: True|False              |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
| caption | caption                         | caption              | title                  | caption                         |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
| source  | source                          | file                 | source                 | source                          |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
| aligns  | aligns=LRCD                     | aligns=LRCD          |                        | alignment: LRCD                 |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
| width   |                                 | widths="0.5 0.2 0.3" |                        | column-width: \[0.5, 0.2, 0.3\] |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
|         |                                 | inlinemarkdown       |                        | markdown: True|False            |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
|         |                                 | delimiter            |                        |                                 |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
|         |                                 | quotechar            |                        |                                 |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+
|         |                                 | id (wrapped by div)  |                        |                                 |
+---------+---------------------------------+----------------------+------------------------+---------------------------------+




* Re: Markdown, tables and CSV
       [not found]         ` <14b8fa54-dc04-4874-bf47-fb268fc9f298-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-14 14:38           ` Melroch
       [not found]             ` <CADAJKhBcAxdQxytFdiug2iqxL+VxwECtWD-nMH4qPcfUUZUzUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Melroch @ 2016-11-14 14:38 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 7429 bytes --]

One possible problem with including metadata as YAML is that it may become
harder for filters earlier in the chain to query the metadata or inject a
CSV block using the normal attribute interface, if any, of the filter
engine, not to mention parsing the CSV, querying or altering it and writing
it back. For that reason I think it would be better if the content of the
code block were pure CSV data. FWIW I tried both strategies with my
unpublished filters, so I'm not just speculating.

Also, since there are many filters doing the same thing, the identifying
class had better not be just `csv` but should also identify the filter
expected to handle the data.
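
As an illustration of the "parse, alter, write back" step when the block holds nothing but CSV, here is a small standard-library sketch. The function name `add_row_numbers` and the particular change it makes are invented for the example; the point is only that an earlier filter can round-trip the data without knowing about any embedded header.

```python
import csv
import io


def add_row_numbers(csv_text):
    """Parse CSV, prepend a running number to every row, and serialise it again."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for i, row in enumerate(rows, start=1):
        writer.writerow([str(i)] + row)
    return out.getvalue()


print(add_row_numbers('x,"a, b"\ny,c\n'), end="")
# 1,x,"a, b"
# 2,y,c
```

Because the whole body is CSV, quoting is handled by the `csv` module on both sides and nothing else has to be skipped or preserved.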



* Re: Markdown, tables and CSV
       [not found]             ` <CADAJKhBcAxdQxytFdiug2iqxL+VxwECtWD-nMH4qPcfUUZUzUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-11-14 23:32               ` Kolen Cheung
  2016-11-15  1:33               ` Sergio Correia
  1 sibling, 0 replies; 42+ messages in thread
From: Kolen Cheung @ 2016-11-14 23:32 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 3847 bytes --]

Great insight. This is the downside of how easy it is to write filters, together with the fact that they are decentralized. As a result, the design of a filter usually goes through only a brief period and gets little input from others. Syntaxes and features in pandoc, by contrast, take a long time to discuss and settle, and that discussion focuses on the big picture (in terms of both syntax and generality).

I find it difficult to find a solution to the problem; here are my thoughts:

First, considering only CSV tables, I think it is actually more natural to write a CSV reader than a filter, i.e. something that reads a CSV file and converts it to AST on the command line. I suggested this to pandoc-placetable and I think I should do the same for mine. Whether or not it makes it into pandoc is irrelevant (but I very much hope so!); the important point is that it becomes a reader that turns CSV into AST. Then the second part would be the most useful: define a standard syntax that *everyone* will be happy with for including arbitrary raw formats in the markdown source. (There have already been efforts on, say, raw TeX, because pandoc sometimes parses the TeX accidentally.) The point here is to provide a syntax that makes it clear that the following code block will be included (HTML, LaTeX, rst, even docx via a file name, etc.) and optionally parsed by a reader that pandoc or the filter knows (for binary formats, parsing will be mandatory).

Second, that's why I have suggested here and there (#2 misc idea on panflute) that there should be a centralized filter gallery. The point is not that all filters should go there (you can't force them anyway), but that people who care enough to share filters for reuse will go through the design *together*. People can have their own repo for testing or even distribute over standard channels like cabal/pip, but when a filter is submitted to the gallery, it will be sanitized (for security/quality). It doesn't have to be very rigorous either (who has the time to do that?), but as long as it is centralized, awareness of the other filters in the repo will remind authors to think about the big picture (currently it is very difficult to get a big picture of the existing filters, which is why I started to catalog them and hope to analyze them). The issue tracker there may also serve as a place to discuss creating filters (or maybe just here).

Lastly, this is the most difficult but also the most exciting. To borrow an analogy from LaTeX: something like the memoir class would be awesome, i.e. someone who knows the existing filters and use cases well enough, and has good taste, the skills, and enough time and will, writes a big filter that is general enough to do most things and has a syntax clear and general enough that compatible filters can be built to use with it. (By the way, the compatibility between filters is currently not known, or at least not well documented.)

To summarize, the ability to use filters (on the internal AST) is what sets pandoc apart. But it has a lot more potential than is currently realized. To borrow the analogy from TeX and LaTeX again, filter+pandoc can be like La+TeX.

P.S. I realize I digressed; only the first point belongs to this thread, and the others probably belong to the "ecosystem" thread.


* Re: Markdown, tables and CSV
       [not found]             ` <CADAJKhBcAxdQxytFdiug2iqxL+VxwECtWD-nMH4qPcfUUZUzUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-11-14 23:32               ` Kolen Cheung
@ 2016-11-15  1:33               ` Sergio Correia
       [not found]                 ` <12c01cfd-f9de-4dd9-bb80-fcac75c808be-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 42+ messages in thread
From: Sergio Correia @ 2016-11-15  1:33 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1.1: Type: text/plain, Size: 2205 bytes --]

On Monday, November 14, 2016 at 9:38:48 AM UTC-5, BP wrote:
>
> One possible problem with including metadata as YAML is that it may become 
> harder for filters earlier in the chain to query the metadata or inject a 
> CSV block using the normal attribute interface, if any, of the filter 
> engine, not to mention parsing the CSV, query or alter it and write it 
> back. For that reason I think it be better if the content of the code block 
> is the pure CSV data. FWIW I tried both strategies with my unpublished 
> filters, so I'm not just speculating. 
>
> Also since there are many filters doing the same thing the identifying 
> class should better not be just `csv` but also identify the filter expected 
> to handle the data.
>

Sorry, I don't understand what the problem is. CSV blocks are not a 
standard feature of Pandoc, and each filter has its own conventions, so I 
don't think it is reasonable to expect a new filter to allow its data to be 
queried/exposed to other, unknown filters.

About having additional information besides the raw CSV, I think it's 
actually the most important thing, because it allows you to have a title, 
to load CSV from external sources, add footnotes, and specify output 
options, all of which wouldn't be possible if we restricted the content to 
be some CSV-delimited info.

Finally, I do agree that having a filter named "csv" or "pandoc-csv" might 
collide with existing filters, but I don't think there is a problem with 
having a csv class. I think the chance that a user ends up requiring two 
different filters that use CSV code blocks is low enough for this to be a 
non-issue.


* Re: Markdown, tables and CSV
       [not found]                 ` <12c01cfd-f9de-4dd9-bb80-fcac75c808be-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-15  6:03                   ` Kolen Cheung
       [not found]                     ` <38bfec67-90f0-4d71-b054-1eedfd853d96-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-11-15  6:03 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

I think he means that the content of the code block should be CSV only, with the attributes encoding all the necessary metadata.

I also agree that YAML metadata is more natural for data (the values don't always have to be strings that require additional parsing, and they can contain markdown, which is important for captions). I don't entirely understand the workflow he describes; probably he means another filter that processes the attributes and injects a CSV there. (That's the primary reason for the feature request in panflute to allow truly empty YAML. And thanks for considering that.)

I'm thinking that a more natural way to specify that the code block is a CSV is not by class but by a special key-value pair, say, `filter=csv2table`. This seems to fit better with the picture that "panflute auto-installs the filter for the user". The "filter" YAML key in the main document specifies which filters panflute will execute, and the attribute `filter=...` on any element determines where the filter acts (at least for those elements that support attributes).

For my filter, I am thinking maybe I should turn it into a CLI that's like pandoc but for tables. Unimaginatively, let's call it panxls for the time being. Suppose it can do `panxls -f csv -t json`, and also HTML, LaTeX, xlsx, etc. Then I would provide a thin wrapper that's a pandoc filter and define a general syntax for applying the CLI in a pandoc markdown document.
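
To show what selecting by a `filter=...` key-value pair could look like, here is a small sketch in plain Python. Everything in it is hypothetical: the `HANDLERS` registry, the `dispatch` function and the handler signature are invented for illustration and are not part of panflute or pandoc.

```python
import csv
import io


def csv2table(text, attrs):
    """Hypothetical handler: turn the CSV text of a code block into rows."""
    return list(csv.reader(io.StringIO(text)))


# Hypothetical registry mapping the value of `filter=...` to a handler.
HANDLERS = {"csv2table": csv2table}


def dispatch(text, attrs):
    """Run the handler named by attrs['filter'], or return the text unchanged."""
    handler = HANDLERS.get(attrs.get("filter"))
    return handler(text, attrs) if handler else text


print(dispatch("1,2\n3,4\n", {"filter": "csv2table"}))  # [['1', '2'], ['3', '4']]
print(dispatch("plain code", {}))                        # plain code
```

With this shape, blocks that name no filter or an unknown one pass through untouched, so several filters could share the same convention without stepping on each other.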


* Re: Markdown, tables and CSV
       [not found]                     ` <38bfec67-90f0-4d71-b054-1eedfd853d96-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-15  9:07                       ` BP Jonsson
       [not found]                         ` <CAFC_yuQU3BRFaJW7QQof_bvU7muAUZGKg7DRc4gEp=4ZibAjHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: BP Jonsson @ 2016-11-15  9:07 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3393 bytes --]

Yes, my main point was that the 'code' should be CSV only, since I
actually have had to modify the CSV programmatically before converting it
to AST. I don't really see the argument about including a caption
containing Markdown in the YAML. A caption in the attributes may also
contain Markdown. You will have to shell out to pandoc in both cases. Even
the CSV might contain markup. I actually edited the data for a whole
dictionary as spreadsheet/CSV once! It was before I discovered pandoc so
the embedded markup was LaTeX and didn't get parsed as part of the CSV
processing, but I would probably do it more or less the same way now.

/bpj



* Re: Markdown, tables and CSV
       [not found]                         ` <CAFC_yuQU3BRFaJW7QQof_bvU7muAUZGKg7DRc4gEp=4ZibAjHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-11-15  9:29                           ` Kolen Cheung
       [not found]                             ` <d4c5aaa1-4bb7-4b6c-82bc-e0763555651d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-11-15  9:29 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

I understand more about your point now.

Let's say the general syntax to "hijack" a code block is:

1. an attribute `filter="..."`
2. a code block with an optional YAML block
3. the YAML block contains options for different filters
4. different filters can interact with the same code block (a filter can have another filter as its pre-processor)

Now, since all filters are aware that there can be an optional YAML block, any filter following the same standard can strip away the YAML if it doesn't need it.

So there seems to be no fundamental flaw in having a YAML block in the code block to store data, rather than using attributes.

So I think the more important thing is to have a standard, general syntax that everyone agrees on (like the above); then we can minimize compatibility issues between filters.
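
To make the "strip away the YAML" step concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the optional YAML block opens with a `---` line at the very top of the code block and closes with a `---` or `...` line; the thread has not agreed on any delimiter, and the function name `split_yaml_block` is hypothetical.

```python
def split_yaml_block(text):
    """Split a code block's text into (yaml_text, body).

    Assumption for illustration only: the optional YAML block starts with
    a '---' line on the first line and ends with a '---' or '...' line.
    """
    lines = text.split("\n")
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() in ("---", "..."):
                yaml_text = "\n".join(lines[1:i])
                body = "\n".join(lines[i + 1:])
                return yaml_text, body
    # No YAML block (or an unterminated one): treat everything as body.
    return "", text


yaml_text, body = split_yaml_block("---\ncaption: Fruit\n---\na,b\n1,2")
```

A filter that does not care about the options would simply keep `body` and drop `yaml_text`.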

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                             ` <d4c5aaa1-4bb7-4b6c-82bc-e0763555651d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-16  8:46                               ` Kolen Cheung
  0 siblings, 0 replies; 42+ messages in thread
From: Kolen Cheung @ 2016-11-16  8:46 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 449 bytes --]

Another idea: use CSV comments to embed any necessary metadata. According to http://stackoverflow.com/questions/1961006/can-a-csv-file-have-a-comment , CSV comments are not standardized, however.

Why it might be useful: I am thinking about writing some sort of CSV reader and writer that can go back and forth between CSV and the pandoc AST "losslessly".

And since CSV comments are not standardized, feel free to comment on what syntax might be best.
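
As one possible syntax to discuss, here is a small sketch that treats lines starting with `#` as metadata and hands the rest to Python's standard `csv` module. The `#` prefix and the function name `read_csv_with_comments` are assumptions made for the example, not an existing convention.

```python
import csv
import io


def read_csv_with_comments(text, prefix="#"):
    """Return (metadata_lines, rows) from CSV text with comment lines.

    Assumption: any physical line starting with `prefix` is metadata.
    """
    meta, data = [], []
    for line in text.splitlines():
        if line.startswith(prefix):
            meta.append(line[len(prefix):].strip())
        else:
            data.append(line)
    rows = list(csv.reader(io.StringIO("\n".join(data))))
    return meta, rows


meta, rows = read_csv_with_comments("# caption: Fruit\nname,price\napple,1\n")
```

One weakness of this approach: a quoted multi-line cell whose continuation line happens to start with `#` would be mistaken for a comment.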

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
                         ` (5 preceding siblings ...)
  2016-11-14  7:40       ` Kolen Cheung
@ 2016-11-18  9:22       ` mb21
       [not found]         ` <78b88082-90cb-4ec8-ab45-9e2be24d6dc4-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-11-29 22:13       ` Kolen Cheung
  7 siblings, 1 reply; 42+ messages in thread
From: mb21 @ 2016-11-18  9:22 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig


[-- Attachment #1.1: Type: text/plain, Size: 949 bytes --]

I just wanted to mention that I'm happy to accept pull requests for
pandoc-placetable. In particular, automatic calculation of column widths (if
done well) and a table width attribute would be welcome.

Btw, pandoc-placetable supports a filter-less mode now (not pushed to 
hackage yet):

    
    pandoc-placetable --file=foo.csv --widths="0.2 0.8" | pandoc -f json -o output.html


[-- Attachment #1.2: Type: text/html, Size: 2982 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <78b88082-90cb-4ec8-ab45-9e2be24d6dc4-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-18 10:39           ` Kolen Cheung
       [not found]             ` <d847d3af-73fd-41d1-96e8-2c3a0dc9d70a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-12-04 12:59           ` Kolen Cheung
  1 sibling, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-11-18 10:39 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 5355 bytes --]

The panflute filter I wrote for CSV tables is almost finished. I wrote extensive tests today and will probably submit it to PyPI tomorrow. Some of its unique features are automatic column widths and a table-width option, and it is designed not to fail on crappy source (following pandoc's practice). I decided to call it pantable, as a wordplay on pandoc that also suggests a subset of pandoc; it will probably support xlsx or html_tables with slight modification.

I understand that "yet another pandoc CSV table filter" does not seem like a good idea. And believe me, I strongly believe in streamlining the pandoc experience, and I hope to get filters to work together in harmony. So maybe let me explain why I wrote mine:

I've been considering writing (and re-writing) filters in Haskell or in Python with panflute. Haskell almost won me over with its performance and tighter interaction with pandoc. But (besides the learning curve of picking up Haskell) the main difficulty I have with Haskell filters is distribution: basically only those who have a programming and command-line background can handle it. This might not be a big deal in some situations, but for the project I'm working on it is extremely important. My department hired me this semester to rewrite one of the introductory physics course workbooks. What they asked for is the "traditional" way of writing workbooks. The old sources were partly Word, partly TeX. The last update was about a decade ago. They hire one person for a semester to finish the job, and that's it. So the quality of the workbook is, as you can imagine, sub-par. But my idea is to:

1. turn it into a GitHub repo, opening it up for all GSIs teaching those courses to download and edit. (Open-sourcing is under consideration, but I guess it is not their priority even if they don't object to the idea.)
2. use pandoc, rather than pure LaTeX, so that multiple outputs can be targeted simultaneously. It also lowers the barrier for any GSI to contribute (many new graduate students do not know LaTeX).

All this is supposed to accelerate development, partly through the platform and partly by lowering the barriers (technological, mental, etc.). The last thing I want is to make the build process daunting or fragile (cabal dependency hell?).

I suggested using pandoc at the first meeting, and they almost immediately rejected it. But since I needed pandoc to convert the doc files (not docx! i.e. indirectly), I demonstrated the capability of pandoc and what it is. I planned for it to be an intermediate process only, and in my design of the makefile I was prepared to nuke any trace of pandoc and leave a way to export the project in pure LaTeX only. However, "they" were convinced to use pandoc markdown as the source after seeing what it is capable of.

Even after we settled on using pandoc, there are still a lot of uncertainties. The "they" I referred to, who were convinced to adopt pandoc, were fired due to budget cuts. Who knows who will take over and what he will think about pandoc (probably an old senior staff member, used to old tools). And also because of the budget cuts, they are probably not hiring me to continue developing the workbook (I got a teaching offer instead). (A hint about the poor (in both senses) university that has had so many budget cuts since 2009: it is the same as pandoc's author's university.)

That's the primary reason I've been thinking about the pandoc ecosystem and how to lower the barrier to using filters. The last thing I want to see is whoever is hired in the future to work on the project nuking the pandoc source (and yes, I will provide the "nuke button"). (I am even afraid they will nuke my makefile and the whole build process that I have kept perfecting for weeks to handle developing multiple series of workbooks from a single source. You know, I can imagine someone picking it up and saying, "What's that makefile? Let's double-click the tex and build in TeXShop" (or, even worse, convert it to LyX).)

P.S. However, the situation of pandoc filters might get brighter soon. There's some sort of pandoc package manager in development (so far panflute only). And I have a vague idea of a pre-built big binary that includes some of the useful filters (including those in Haskell). In TeX terms, they are something like tlmgr and the TeX Live distribution. I personally believe that these are very important for the longevity of the tool we all love: they are all about a lower barrier, easier use, easier filter writing (which I believe panflute addresses), zero maintenance, easier configuration, even a GUI (say, Atom packages). (Right now pandoc seems to be for "hackers" only.) And let's not even mention ARM and app store compatibility (podoc seems to have limited capability to parse pandoc in Python and has an app-store-friendly license).


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <d847d3af-73fd-41d1-96e8-2c3a0dc9d70a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-23 10:52               ` Kolen Cheung
  0 siblings, 0 replies; 42+ messages in thread
From: Kolen Cheung @ 2016-11-23 10:52 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1.1: Type: text/plain, Size: 1933 bytes --]

I set up my filter at ickc/pantable: CSV Tables in Markdown: Pandoc Filter
for CSV Tables <https://github.com/ickc/pantable> and called it pantable. I
kind of want to break the pattern of pandoc-... since it is too cumbersome
and Python doesn’t like -. In a sense it also emphasizes that pantable is a
subset of pandoc, all about tables. For the time being it only supports CSV
input. (But I’m thinking about adding .xlsx (seems trivial) and .html (less
trivial) input. The last one might seem redundant, but basically the idea
is to use HTML to type tables when they are too complicated, but rather
than leaving it as raw HTML I want it to be a real pandoc Table element for
other outputs.)

I wrote extensive tests to make sure it works in corner cases. *One thing
that could be controversial is error handling*. Basically, for the metadata
part, whenever a value is invalid (e.g. header: first row while I expect a
boolean or yes/no), it will be overridden with a sensible default. I think
this is in pandoc’s spirit that “any random string is valid markdown”, so
I try to suppress and handle all the errors. But I’m not sure whether it
will make errors too difficult to spot.
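
To illustrate the kind of lenient handling described above, here is a minimal sketch; it is not pantable's actual code, and the helper name `get_bool_option` and the accepted spellings are assumptions made for the example.

```python
def get_bool_option(options, key, default):
    """Read a boolean-like option, falling back to `default` when invalid.

    Hypothetical helper: accepts real booleans and yes/no/true/false
    strings, and silently overrides anything else with the default.
    """
    value = options.get(key, default)
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("yes", "true"):
            return True
        if lowered in ("no", "false"):
            return False
    # Invalid value such as "first row": do not raise, use the default.
    return default


header = get_bool_option({"header": "first row"}, "header", True)
```

Here `header` ends up as the default even though the input was invalid, which is exactly the behaviour that could make mistakes hard to spot.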

Please do give feedback if you have any. I will put it to heavy real-world
use in my project in the coming month.

Thanks.


[-- Attachment #1.2: Type: text/html, Size: 8070 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
                         ` (6 preceding siblings ...)
  2016-11-18  9:22       ` mb21
@ 2016-11-29 22:13       ` Kolen Cheung
       [not found]         ` <a668593c-b4f2-4f57-909b-3f16dfb40990-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  7 siblings, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-11-29 22:13 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig

[-- Attachment #1.1: Type: text/plain, Size: 1520 bytes --]

Hi,

I wrote another “interesting” filter based on the same idea, but in reverse: pantable/pantable2csv.py
at master · ickc/pantable
<https://github.com/ickc/pantable/blob/master/pantable/pantable2csv.py>. It
converts all tables in pandoc to a YAML-CodeBlock-styled CSV table as defined
in pantable.

Effectively, it adds a “CSV Writer”, where pantable is kind of “CSV Reader”.

I can kind of achieve idempotence here, but only at P^3 = P^2, not
P^2 = P (the gap comes from pandoc itself, though). Basically it captures
all info from pandoc’s AST, so the conversion pantable2csv does should be
“lossless”.

These are at least important to me because I can safely jump between the 2
formats (native pandoc table and CSV table in a code block) without worrying
too much. I could jump to CSV for editing (more low-level, e.g. width
control) and jump back to pandoc tables for better readability.


[-- Attachment #1.2: Type: text/html, Size: 4024 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <a668593c-b4f2-4f57-909b-3f16dfb40990-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-29 22:30           ` Sergio Correia
       [not found]             ` <7e398825-a285-4e73-ad3d-908f1f141589-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Sergio Correia @ 2016-11-29 22:30 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig


[-- Attachment #1.1: Type: text/plain, Size: 1876 bytes --]

Cool trick!

I can see now why you were testing the convert_text() code.

One question though, if you first do

pandoc example.md --output=example2.md

then would applying the filters to example2.md be an idempotent operation?


On Tuesday, November 29, 2016 at 5:13:18 PM UTC-5, Kolen Cheung wrote:
>
> Hi,
>
> I wrote another “interesting” filter base on the same idea, but in 
> reverse: pantable/pantable2csv.py at master · ickc/pantable 
> <https://github.com/ickc/pantable/blob/master/pantable/pantable2csv.py>. 
> It converts all tables in pandoc to a YAML-CodeBlock-styled CSV table 
> defined in pantable.
>
> Effectively, it adds a “CSV Writer”, where pantable is kind of “CSV 
> Reader”.
>
> I can kind of achieve idempotence here, but only at P^3 = P^2, not
> P^2 = P (it’s from pandoc though). Basically it captures all info from
> pandoc’s AST, so the conversion pantable2csv did should be “lossless”.
>
> These are at least important to me because I can safely jump between the 2 
> formats (native pandoc table and csv table in code-block) without worrying 
> too much. I could jump to csv for edit (more low-level, e.g. width control) 
> and jump back to pandoc tables for better readability.
> 
>


[-- Attachment #1.2: Type: text/html, Size: 5187 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <7e398825-a285-4e73-ad3d-908f1f141589-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-11-30  2:06               ` Kolen Cheung
  0 siblings, 0 replies; 42+ messages in thread
From: Kolen Cheung @ 2016-11-30  2:06 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig

[-- Attachment #1.1: Type: text/plain, Size: 3056 bytes --]

Idempotent in this case means pantable and pantable2csv are “inverse” to
each other, e.g. pandoc -t markdown -F pantable -F pantable2csv test.md
should be identical to test.md. This can be done because pantable2csv
losslessly represents all info in pandoc’s AST as YAML+CSV in a
code block. The only cases where it isn’t lossless are:

   1. when pandoc parses the markdown in each cell to AST, and pantable2csv
      passes that AST back into markdown by using pandoc, i.e. whether this
      is idempotent or not depends on pandoc’s markdown -> AST -> markdown
      conversion.
   2. potentially the widths might have some truncation error, especially
      when the to-format and from-format are not the same.

When I say I achieve P^3 = P^2, it means that

    pandoc -t native -F pantable -F pantable2csv -F pantable -F pantable2csv -F pantable -F pantable2csv csv_table.md

gives the same output as

    pandoc -t native -F pantable -F pantable2csv -F pantable -F pantable2csv csv_table.md

(which is part of the unit test). The diff between P^2 and P is exactly
from (1).
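
The difference between P^2 = P and P^3 = P^2 can be seen with a toy stand-in for the round trip. The function `round_trip` below is purely hypothetical (it just collapses double spaces once per pass) and has nothing to do with how pantable works internally; it only shows a transformation that needs two passes before it stabilizes.

```python
def apply_n(func, value, n):
    """Apply `func` to `value` n times in a row."""
    for _ in range(n):
        value = func(value)
    return value


def round_trip(text):
    # Toy stand-in for one "pantable then pantable2csv" pass:
    # each pass halves runs of spaces, so it may take several passes
    # to reach a fixed point.
    return text.replace("  ", " ")


source = "a    b"  # four spaces between a and b
p1 = apply_n(round_trip, source, 1)
p2 = apply_n(round_trip, source, 2)
p3 = apply_n(round_trip, source, 3)
```

For this input `p1` differs from `p2`, while `p2` and `p3` agree, which is the same shape of result as the unit test described above.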

A corollary to (1) is that it is kind of slow, since each table cell calls
pandoc once for the conversion. Probably nothing can be improved short of
reinventing the parsing of tables (there’s probably no way to tell pandoc
to ignore markdown in cells while not escaping character sequences). (And
a table has m × n cells, so inherently it will be slow.) But I don’t worry
much about the performance aspect if it is going to solve a workflow
problem.

Eventually I think I’m going to make a thin wrapper of both to provide a
CLI version. Then Automator scripts can be created, and basically I can
select the table in a text editor and call system services to convert it
in place, i.e. highlight table, convert to CSV, edit, highlight and convert
to table. This system-service part won’t be cross-platform though. (By the
way, sad news: Apple just fired the one responsible for Automator and
AppleScript, and killed the whole team! This used to be the forte of OS X!)

In a sense, pantable2csv gives one the power to edit the table “in AST
directly” easily, while pantable provides a way to pretty-print it back in
native markdown. Note that after commit 298e6f3, all of a pandoc table’s
info has a markdown representation (grid_tables only missed alignment, which
is added in that commit).


[-- Attachment #1.2: Type: text/html, Size: 14462 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]         ` <78b88082-90cb-4ec8-ab45-9e2be24d6dc4-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-11-18 10:39           ` Kolen Cheung
@ 2016-12-04 12:59           ` Kolen Cheung
       [not found]             ` <40e755f4-b03d-453e-90d6-13d1ba596f60-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2016-12-04 12:59 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig


[-- Attachment #1.1: Type: text/plain, Size: 1024 bytes --]



Just in case anyone is interested in auto-calculating column widths: probably
due to jgm/pandoc@0dfceda
<https://github.com/jgm/pandoc/commit/0dfcedad7ef98dfcfdb2378b7c974bf96b93fbcc>,
the column width should be calculated from the “(actual length of line) +
3”. Also see default column used in tables? - Google Groups
<https://groups.google.com/forum/#!topic/pandoc-discuss/OiLX2LF3dC0> for how
it was reverse-engineered.
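
A rough sketch of what such a calculation could look like, assuming a non-empty table given as a list of rows of cell strings. Only the “+ 3” comes from the observation above; taking the longest line per column and normalizing by the sum of all columns are my own assumptions for the example, and `auto_widths` is a hypothetical name.

```python
def auto_widths(rows, padding=3):
    """Return relative column widths for a non-empty table.

    Each column's raw width is (length of its longest line) + padding;
    the raw widths are then normalized so that they sum to 1.
    """
    n_cols = max(len(row) for row in rows)
    widths = [0] * n_cols
    for row in rows:
        for j, cell in enumerate(row):
            # A cell may span several lines; use its longest one.
            longest = max(len(line) for line in cell.split("\n"))
            widths[j] = max(widths[j], longest + padding)
    total = sum(widths)
    return [width / total for width in widths]


relative = auto_widths([["ab", "abcdefg"], ["a", "b"]])
```

In this example the raw widths are 5 and 10, so the second column gets twice the share of the first.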



[-- Attachment #1.2: Type: text/html, Size: 2478 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]             ` <40e755f4-b03d-453e-90d6-13d1ba596f60-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-08-01  9:10               ` Kolen Cheung
       [not found]                 ` <6ad9a315-1887-4e88-af53-99eaa87d39fa-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 42+ messages in thread
From: Kolen Cheung @ 2017-08-01  9:10 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: mf-+Z+QprJ1jbpwFuiNLMe2Ig

[-- Attachment #1.1: Type: text/plain, Size: 1749 bytes --]

Hi, I'd like to hear your opinion on pantable's dependencies.

For now, pantable only depends on panflute (everything else is in Python's
standard library). But there are a few things I want to do that would be
impossible without further dependencies. Notable ones are matplotlib,
numpy, scipy, matplotlib2tikz, pandas. And you can see these are related to
plots and/or CSV readers/writers. Other potential dependencies would be
xlsx reader/writer, etc.

Some of the dependencies are quite big and I'm not certain if they'll build
successfully on alternative architectures. So I think I either list all
of these as dependencies and make all pantable users install them too,
or I make them optional dependencies, where some functions will only
work if those are installed (with an error hinting at which to install).
While the latter approach seems best for minimal impact on others,
the former approach is more "all-inclusive" and will be easier to maintain,
for example regarding which CSV reader/writer to use (I'd want to use
pandas', but if pandas is optional then I need to deal with 2 different
implementations, leading to potentially different behaviors).
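
For the optional-dependency route, a minimal sketch of the "error hinting which to install" idea could look like this. The helper name `require` and the wording of the hint are assumptions; only `importlib.import_module` from the standard library is used.

```python
import importlib


def require(module_name, feature):
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: `feature` is just a label used in the message.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"{feature} needs the optional dependency '{module_name}'; "
            f"try: pip install {module_name}"
        ) from error


# Works for anything that is installed, e.g. a standard-library module.
json_module = require("json", "JSON export")
```

A function that wants pandas would call `require("pandas", "CSV reading")` at the point of use, so users who never touch that function never need pandas installed.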

Thanks.


[-- Attachment #1.2: Type: text/html, Size: 2261 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: Markdown, tables and CSV
       [not found]                 ` <6ad9a315-1887-4e88-af53-99eaa87d39fa-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-08-01 13:43                   ` Sergio Correia
  0 siblings, 0 replies; 42+ messages in thread
From: Sergio Correia @ 2017-08-01 13:43 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3147 bytes --]

Another option is to have another package (e.g. pantable_extended) that
does it.

That way, people who just want to use the basic elements can use
pantable, and you can use the other one to push capabilities to the limit.

Also, I agree that requiring numpy instead of having it optional is not
always the best, as on Windows you might want to install it from the
unofficial precompiled binaries, which AFAIK are faster.

Let me know how it goes!
Cheers,
Sergio

On Aug 1, 2017 5:10 AM, "Kolen Cheung" <christian.kolen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

Hi, I'd like to hear your opinion on pantable's dependencies.

For now, pantable only depends on panflute (and else all in Python's
standard library). But there's a few things I want to do that would be
impossible without further dependencies. Notable ones are matplotlib,
numpy, scipy, matplotlib2tikz, pandas. And you can see these are related to
plots and/or CSV readers/writers. Other potential dependencies would be
xlsx reader/writer, etc.

Some of the dependencies are quite big and I'm not certain if they'll build
successfully on alternative architecture. So I think I either list all
these into the dependencies and make all pantable users also install them,
or I make them optional dependencies, where some functions will only
function if those are installed (with error hinting which to install).
While the later approach seems best to have minimal impact on the others,
the former approach is more "all-inclusive", and will be easier to maintain
for example for which CSV reader/writer to use (I'd want to use pandas' but
if pandas is optional then I need to deal with 2 different implementations
leading to potentially different behaviors.

Thanks.



[-- Attachment #2: Type: text/html, Size: 4673 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2017-08-01 13:43 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <047d7b86ebe83c062b05332eab9b@google.com>
     [not found] ` <047d7b86ebe83c062b05332eab9b-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2016-05-20  9:38   ` Markdown, tables and CSV Martin Fenner
     [not found]     ` <20BF19CB-A2B0-4B19-A749-D750CDD89736-+Z+QprJ1jbpwFuiNLMe2Ig@public.gmane.org>
2016-05-20 17:55       ` John Gabriele
     [not found]         ` <1463766905.1918988.613990665.6CD67781-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
2016-05-20 19:15           ` BP Jonsson
2016-05-25 14:18           ` Frank Colcord
     [not found]             ` <471daa3c-e2ec-4445-b4fd-44e5c8a3fd6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-26  5:42               ` Sergio Correia
     [not found]                 ` <b9147aed-bf8e-4136-8fd2-949dea1034ea-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-26  8:04                   ` Martin Fenner
     [not found]                     ` <B4779237-F368-454A-8E43-93EBCDFDF8AB-i39mICoz+qVg9hUCZPvPmw@public.gmane.org>
2016-05-26 11:48                       ` Frank Colcord
     [not found]                         ` <CADZiF+X6AuYJEnnNCs1M=spfbp9Fn4X2GBVkxXKp9g9SSNH16A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-05-27  6:33                           ` John Gabriele
     [not found]                             ` <1464330807.2727387.620260561.2CC32090-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
2016-05-27 10:24                               ` Frank Colcord
     [not found]                                 ` <e1ffce2d-9cc0-4367-a652-a46fa5c141a6-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-27 14:02                                   ` John Gabriele
2016-05-27 15:48                               ` 'Jason White' via pandoc-discuss
2016-05-26 12:50                   ` BPJ
2016-05-20 18:36       ` John MACFARLANE
     [not found]         ` <20160520183616.GB95956-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
2016-05-20 19:05           ` John Muccigrosso
2016-05-20 19:30           ` John Gabriele
     [not found]             ` <1463772643.1938448.614055033.793EA897-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
2016-05-20 19:37               ` BP Jonsson
2016-05-20 19:32           ` BP Jonsson
2016-05-21 17:03       ` kurt.pfeifle via pandoc-discuss
     [not found]         ` <fbcb1ece-48c7-4451-be2f-1b6cd70b2969-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-29 12:45           ` mb21
     [not found]             ` <f0058def-bd69-40c1-82b4-e7bdd151c46c-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-29 15:59               ` kurt.pfeifle via pandoc-discuss
     [not found]                 ` <001833c9-e40d-4079-ba79-c88c852780a5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-30  7:47                   ` mb21
     [not found]                     ` <27f2fe62-8115-4513-b13a-c995f625f60d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-31 14:28                       ` kurt.pfeifle via pandoc-discuss
2016-05-23  4:42       ` Martin Fenner
     [not found]         ` <a1503704-4f58-47f7-a9e8-1c60dad8e935-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-05-23 20:22           ` John MACFARLANE
2016-05-27 19:50       ` D L
2016-11-14  7:40       ` Kolen Cheung
     [not found]         ` <14b8fa54-dc04-4874-bf47-fb268fc9f298-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-14 14:38           ` Melroch
     [not found]             ` <CADAJKhBcAxdQxytFdiug2iqxL+VxwECtWD-nMH4qPcfUUZUzUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-11-14 23:32               ` Kolen Cheung
2016-11-15  1:33               ` Sergio Correia
     [not found]                 ` <12c01cfd-f9de-4dd9-bb80-fcac75c808be-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-15  6:03                   ` Kolen Cheung
     [not found]                     ` <38bfec67-90f0-4d71-b054-1eedfd853d96-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-15  9:07                       ` BP Jonsson
     [not found]                         ` <CAFC_yuQU3BRFaJW7QQof_bvU7muAUZGKg7DRc4gEp=4ZibAjHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-11-15  9:29                           ` Kolen Cheung
     [not found]                             ` <d4c5aaa1-4bb7-4b6c-82bc-e0763555651d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-16  8:46                               ` Kolen Cheung
2016-11-18  9:22       ` mb21
     [not found]         ` <78b88082-90cb-4ec8-ab45-9e2be24d6dc4-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-18 10:39           ` Kolen Cheung
     [not found]             ` <d847d3af-73fd-41d1-96e8-2c3a0dc9d70a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-23 10:52               ` Kolen Cheung
2016-12-04 12:59           ` Kolen Cheung
     [not found]             ` <40e755f4-b03d-453e-90d6-13d1ba596f60-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2017-08-01  9:10               ` Kolen Cheung
     [not found]                 ` <6ad9a315-1887-4e88-af53-99eaa87d39fa-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2017-08-01 13:43                   ` Sergio Correia
2016-11-29 22:13       ` Kolen Cheung
     [not found]         ` <a668593c-b4f2-4f57-909b-3f16dfb40990-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-29 22:30           ` Sergio Correia
     [not found]             ` <7e398825-a285-4e73-ad3d-908f1f141589-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-11-30  2:06               ` Kolen Cheung
