From: Martin Fenner
Newsgroups: gmane.text.pandoc
Subject: Markdown, tables and CSV
Date: Fri, 20 May 2016 12:38:55 +0300
Message-ID: <20BF19CB-A2B0-4B19-A749-D750CDD89736@martinfenner.org>
References: <047d7b86ebe83c062b05332eab9b@google.com>

Dear group,

The topic of CSV support in Pandoc has come up several times on this list, including this thread from 2014:
https://groups.google.com/forum/#!topic/pandoc-discuss/kBdJU_JktzI

Since last year I have worked for an organisation that frequently deals with tabular data (and helped organize CSVconf earlier this month), and I have done some thinking on how CSV could fit into Pandoc. I see two important use cases:

* A CSV reader that converts to tables in HTML, docx, LaTeX, etc.
* CSV as a format to describe tables in markdown

For the first use case, I wrote a hack for the Jekyll blogging platform this week that turns CSV files into markdown grid tables, which are then processed by Pandoc (https://github.com/datacite/jekyll-csvy). I would rather use Pandoc with a CSV reader, but my Haskell isn't good enough to write one. For now, though, I can generate blog posts directly from CSV files. Other people have done similar things with Pandoc and CSV.
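
To illustrate the CSV-to-grid-table step, here is a minimal Python sketch. It is not the jekyll-csvy code; the function name `csv_to_grid_table` is hypothetical, and it assumes single-line cells and that the first CSV row is the header.

```python
import csv
import io


def csv_to_grid_table(csv_text):
    """Render CSV text as a markdown grid table (first row is the header)."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]

    def border(fill):
        # One border line; "=" separates the header from the body.
        return "+" + "+".join(fill * (w + 2) for w in widths) + "+"

    def line(row):
        cells = (" " + cell.ljust(w) + " " for cell, w in zip(row, widths))
        return "|" + "|".join(cells) + "|"

    out = [border("-"), line(rows[0]), border("=")]
    for row in rows[1:]:
        out.append(line(row))
        out.append(border("-"))
    return "\n".join(out)


print(csv_to_grid_table("Fruit,Price\nBananas,$1.34\nOranges,$2.10\n"))
```

The result can be pasted into a markdown document and processed by Pandoc with grid_tables enabled. Multiline cells would need extra handling that this sketch leaves out.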

For the second use case, I see a clear advantage of CSV over the various attempts to format tables in markdown (simple_tables, multiline_tables, grid_tables, pipe_tables). Everyone (and many tools) understands the CSV format, and you can do most of the things with CSV that the other table formats allow (multi-column formats and column alignment are a bit trickier). This has been done before using Pandoc filters, but I think a "csv_tables" Pandoc extension would make this easier for the casual user. Using the grid_tables example from the Pandoc documentation, it could look like this:

: Sample csv table.

,,,
Fruit,Price,Advantages
Bananas,$1.34,- built-in wrapper\n- bright color
Oranges,$2.10, - cures scurvy\n- tasty
,,,

I like three commas on a new line to indicate the start and end of a table, but that is of course open for discussion. The format is much easier for humans to read and edit than grid tables; the only tricky bit is maybe the \n for multiline cells. I would think we could add metadata to the fenced table block, similar to code blocks, e.g.

,,,{ #mytable .numberRows }
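
To make the proposed syntax concrete, here is a rough Python sketch of how such a block might be parsed. Nothing like this exists in Pandoc; the function name `parse_csv_table`, the returned dictionary, and the choice to ignore a space after a comma are all assumptions made for illustration.

```python
import csv
import re

# Opening or closing fence: three commas, optionally followed by "{ ... }".
FENCE = re.compile(r"^,,,\s*(?:\{(?P<attrs>[^}]*)\})?\s*$")


def parse_csv_table(text):
    """Parse one ',,,' fenced table into caption, attributes, header and rows."""
    caption, attrs, body, inside = None, [], [], False
    for line in text.splitlines():
        fence = FENCE.match(line)
        if fence and not inside:
            inside = True  # opening fence, possibly carrying attributes
            attrs = (fence.group("attrs") or "").split()
        elif fence:
            break  # closing fence ends the table
        elif inside:
            body.append(line)
        elif line.startswith(":"):
            caption = line[1:].strip()  # ": caption" line before the fence
    rows = [
        # A literal backslash-n inside a cell becomes a real line break.
        [cell.replace("\\n", "\n") for cell in row]
        for row in csv.reader(body, skipinitialspace=True)
        if row
    ]
    if not rows:
        raise ValueError("no ',,,' fenced table with a header row found")
    return {"caption": caption, "attrs": attrs, "header": rows[0], "rows": rows[1:]}


example = r"""
: Sample csv table.

,,,{ #mytable .numberRows }
Fruit,Price,Advantages
Bananas,$1.34,- built-in wrapper\n- bright color
,,,
"""
table = parse_csv_table(example)
print(table["caption"], table["attrs"], table["header"])
```

One consequence of this fence choice is that a data row consisting of exactly three commas (four empty cells) cannot be told apart from a fence, so the sketch treats it as the end of the table.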

One challenge with CSV is that it is an ill-defined format, somewhat similar to markdown before CommonMark. It may make things easier to only support a specific CSV variant (e.g. comma as separator, header required, comment lines not allowed).
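
Such a restricted variant could be enforced with a small check like the Python sketch below. The function name `read_strict_csv` is hypothetical, and two points are assumptions rather than part of the proposal: "#" is taken as the comment marker, and "header required" is read as "the first row must name every column".

```python
import csv


def read_strict_csv(text):
    """Read CSV under one narrow variant: commas only, header required, no comments."""
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        # Assumption: a line starting with "#" is a comment line.
        if line.lstrip().startswith("#"):
            raise ValueError(f"line {number}: comment lines are not allowed")
    try:
        rows = [row for row in csv.reader(lines, delimiter=",", strict=True) if row]
    except csv.Error as err:
        raise ValueError(f"malformed CSV: {err}") from err
    if not rows:
        raise ValueError("a header row is required")
    header, data = rows[0], rows[1:]
    if any(not name.strip() for name in header):
        raise ValueError("every column needs a header name")
    for row in data:
        # Every data row must have exactly as many fields as the header.
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(row)}: {row}")
    return header, data


header, data = read_strict_csv("Fruit,Price\nBananas,$1.34\nOranges,$2.10\n")
print(header, data)
```

Because the input is split into lines first, quoted fields that contain real line breaks are out of scope here, which matches the idea of using \n inside cells instead.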

Thoughts?

Best,

Martin


