From: Sam Liddicott
Newsgroups: gmane.text.pandoc
Subject: Pandoc support for implicit fenced code blocks in source files
Date: Fri, 4 Aug 2017 07:43:21 -0700 (PDT)
Message-ID: <36c4d1d4-2d9d-4919-97ab-0eaf588b22a3@googlegroups.com>
To: pandoc-discuss

I use pandoc in my C source files.

C comments contain pandoc markup, or perhaps special C comments like this /*:pandoc
and the pandoc ends when the C comment ends.

Combined with goat (to convert ASCII diagrams to SVG) this is a great way to document code.

The difficulty in doing this now is convincing pandoc that code outside of comments is a fenced code block, and that the start and end comment markers shouldn't be rendered.

For instance:

---8<------8<------8<------8<------8<------8<------8<---
/**
It's a shame that pandoc renders the leading /*
and now here is some code, but the comment markers sadly render:

~~~C
*/
int x() {
  blag();

}; /*
~~~

etc.
*/
---8<------8<------8<------8<------8<------8<------8<---

This feature can't be implemented using filters, because if pandoc treats the C source as pandoc markup, the start-comment marker might land halfway through an AST node that should never have been there.

I think that it needs parser support; but it isn't a new input format either, as other variants of markdown might be used internally.

Ideally, this requires a new parsing mode to assume a fenced-code-block interspersed with other pandoc markup.

I think the method is:

Based on the file extension or a runtime argument, set the default fenced code block type, and the comment start and end sequences.


1. If (after skipping initial white space) the first text is not a comment-start-sequence, then a fenced code block is assumed to begin before the white space, and all the input is inserted into that fenced code block in the AST until end-of-file or a comment-start-sequence.

2. At a comment-start-sequence, the sequence is thrown away and the parser acts as if a fenced-code-block-end was read.

3. Parsing continues as normal until a comment-end-sequence is read. This sequence is thrown away and the parser repeats from 1.
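The three steps above can be sketched outside of pandoc as a small preprocessor. This is only an illustration in Python, not pandoc code: the function name source_to_markdown and its parameters are made up for the example, the comment sequences default to C's /* and */, and it makes no attempt to cope with comment markers inside string literals or with // comments.

```python
def source_to_markdown(text, lang="C", start="/*", end="*/"):
    """Comments become markup; everything else becomes fenced code blocks.

    Hypothetical helper: `lang`, `start` and `end` stand in for the values
    that would be chosen from the file extension or a runtime argument.
    """
    out = []
    pos = 0
    in_comment = False
    while pos < len(text):
        marker = end if in_comment else start
        nxt = text.find(marker, pos)
        chunk = text[pos:] if nxt == -1 else text[pos:nxt]
        if in_comment:
            # Step 3: text inside a comment passes through as ordinary markup.
            out.append(chunk.strip())
        elif chunk.strip():
            # Step 1: text outside a comment goes into a fenced code block.
            out.append("~~~" + lang + "\n" + chunk.strip("\n") + "\n~~~")
        if nxt == -1:
            break  # end of file
        # Steps 2 and 3: the comment marker itself is thrown away.
        pos = nxt + len(marker)
        in_comment = not in_comment
    return "\n\n".join(part for part in out if part) + "\n"
```

The result could then be piped into pandoc as ordinary markdown. A comment opened with /** would leave a stray * at the start of the markup, since only the two-character start sequence is removed.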

Now maybe the start-comment-sequence is always followed by a magic header like :pandoc, and maybe by further attributes which ought to be applied to the previous fenced code block; and maybe the end-comment-sequence could carry attributes to apply to the upcoming fenced code block. That might be a way to say: skip this code block until you next see a pandoc comment, don't even bother to emit it. Not all code wants to be part of the documentation, after all.
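To make the magic-header idea concrete, here is a rough Python sketch that treats only comments opening with /*:pandoc as documentation and leaves every other comment inside the code. The name split_source and the exact header syntax are assumptions for illustration; attributes after the header, and any "skip" marker on the end sequence, are not handled at all.

```python
import re

# Assumed syntax: a documentation comment opens with "/*:pandoc"
# and runs to the next "*/". Ordinary comments do not match.
DOC_COMMENT = re.compile(r"/\*:pandoc\b(.*?)\*/", re.DOTALL)


def split_source(text):
    """Return ("markup", text) and ("code", text) pieces in file order."""
    pieces = []
    pos = 0
    for match in DOC_COMMENT.finditer(text):
        code = text[pos:match.start()].strip("\n")
        if code.strip():
            pieces.append(("code", code))
        pieces.append(("markup", match.group(1).strip()))
        pos = match.end()
    # Whatever follows the last documentation comment is code.
    tail = text[pos:].strip("\n")
    if tail.strip():
        pieces.append(("code", tail))
    return pieces
```

Each "code" piece would be wrapped in a fenced block before the whole lot is handed to pandoc; a skip attribute could be implemented by dropping the following "code" piece instead of emitting it.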

Maybe this would be better suited to an awk script to run manually and not be part of pandoc at all. I'm using this sed, which does the job somewhat.

  sed -e '1!s/^\/\* */~~~\n\n/;/\/\*/!s/\*\/$/\n\n~~~C/' | pandoc --toc -s -S -o doc.html
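For anyone who finds the sed hard to read, the following Python sketch mirrors its two commands line by line. It assumes GNU sed behaviour (a \n in the replacement producing a newline), and the function name sed_like is just for the example.

```python
import re


def sed_like(lines):
    """Apply the two sed substitutions to a list of lines (no trailing newlines)."""
    out = []
    for number, line in enumerate(lines, start=1):
        if number != 1:
            # 1!s/^\/\* */~~~\n\n/
            # Except on line 1, a leading "/*" (and any spaces) closes the fence.
            line = re.sub(r"^/\* *", "~~~\n\n", line, count=1)
        if "/*" not in line:
            # /\/\*/!s/\*\/$/\n\n~~~C/
            # On lines without "/*", a trailing "*/" opens a new C fence.
            line = re.sub(r"\*/\Z", "\n\n~~~C", line, count=1)
        out.append(line)
    return "\n".join(out)
```

This shows why it only does the job somewhat: the opening /* on line 1 is left alone and so still renders, a comment start that is not at the beginning of a line is not recognised, and a comment at the very end of the file leaves a ~~~C fence open.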

What are others' thoughts on this?

Sam




