public inbox archive for pandoc-discuss@googlegroups.com
* Pandoc support for implicit fenced code blocks in source files
@ 2017-08-04 14:43 Sam Liddicott
From: Sam Liddicott @ 2017-08-04 14:43 UTC
  To: pandoc-discuss


I use pandoc markup in my C source files.

The markup lives in C comments, or perhaps in special C comments that
start like this:
/*:pandoc
and the markup ends when the C comment ends.

Combined with goat (to convert ASCII diagrams to SVG) this is a great way 
to document code.

The difficulty at present is convincing pandoc that code outside of 
comments is a fenced code block, and that the start and end comment markers 
shouldn't be rendered.

For instance:

---8<------8<------8<------8<------8<------8<------8<---
/**
It's a shame that pandoc renders the leading /*
and now here is some code, but the comment markers sadly render: 

~~~C 
*/
int x() {
  blag();

}; /*
~~~

etc.
*/
---8<------8<------8<------8<------8<------8<------8<---

This feature can't be implemented with filters, because if pandoc 
parses the C source as pandoc markup, a start-comment marker might end up 
halfway through an AST node that should never have been there.

I think it needs parser support, but it isn't a new input format 
either, as any variant of markdown might be used inside the comments.

Ideally, this calls for a new parsing mode that assumes a fenced code block 
interspersed with other pandoc markup.

I think the method is:

Based on the file extension or a runtime argument, set the default fenced 
code block type and the comment start and end sequences.


1. If (after skipping initial white space) the first text is not a 
comment-start-sequence, then a fenced code block is assumed to begin before 
the white space, and all the input is inserted into that fenced code block 
in the AST until end-of-file or a comment-start-sequence.

2. At a comment-start-sequence, the sequence is thrown away and the parser 
acts as if a fenced-code-block-end was read.

3. Parsing continues as normal until a comment-end-sequence is read. This 
sequence is thrown away and the parser repeats from 1.
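
The three steps above can be sketched outside pandoc as a small text
preprocessor. This is only an illustration in Python, not pandoc code: the
function name c_to_markdown and its parameters are made up for the example.
It assumes that comment markers never appear inside string literals, it
ignores // comments, and it trims blank lines around each chunk instead of
keeping the white space exactly as step 1 describes.

```python
def c_to_markdown(source, lang="C", start="/*", end="*/", fence="~~~"):
    """Turn source text into Markdown: comments become prose, the rest code."""
    out = []
    pos = 0
    in_comment = False
    while pos < len(source):
        # Look for the marker that ends the current state.
        marker = end if in_comment else start
        idx = source.find(marker, pos)
        chunk = source[pos:] if idx == -1 else source[pos:idx]
        if in_comment:
            # Step 3: comment text passes through as ordinary markup.
            prose = chunk.strip()
            if prose:
                out.append(prose)
        else:
            # Step 1: everything outside a comment goes into a fenced block.
            code = chunk.strip("\n")
            if code.strip():
                out.append(f"{fence}{lang}\n{code}\n{fence}")
        if idx == -1:
            break
        # Steps 2 and 3: throw the marker away and switch state.
        pos = idx + len(marker)
        in_comment = not in_comment
    return "\n\n".join(out) + "\n"


if __name__ == "__main__":
    example = "/* Intro text. */\nint x(void) { return 1; }\n/* More *text*. */\n"
    print(c_to_markdown(example), end="")
```

The output of such a script could then be piped into pandoc as ordinary
Markdown input.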

Now maybe the comment-start-sequence is always followed by a magic header 
like :pandoc, and maybe by further attributes which ought to be applied to 
the previous fenced code block, and maybe the comment-end-sequence could 
have attributes to apply to the upcoming fenced code block. That could be a 
way to say: skip this code block until you next see a pandoc comment, don't 
even bother to emit it. Not all code wants to be part of the documentation, 
after all.
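
As a rough illustration of the magic-header idea, the Python sketch below
treats only comments that open with /*:pandoc as documentation and leaves
ordinary comments inside the code. The ":skip" marker before the closing */
is a purely hypothetical syntax invented for this example to stand in for
an "attribute on the comment-end-sequence"; the function name is also made
up, and nothing here is an existing pandoc feature.

```python
import re

# Hypothetical syntax: "/*:pandoc" opens a documentation comment, and an
# optional ":skip" just before "*/" hides the code that follows it.
DOC_COMMENT = re.compile(r"/\*:pandoc\b(.*?)(:skip)?\*/", re.DOTALL)


def literate_c_to_markdown(source, lang="C"):
    """Emit prose from /*:pandoc comments and fence the code between them."""
    out = []
    pos = 0
    skip = False
    for match in DOC_COMMENT.finditer(source):
        # Code between the previous doc comment and this one.
        code = source[pos:match.start()].strip("\n")
        if code.strip() and not skip:
            out.append(f"~~~{lang}\n{code}\n~~~")
        prose = match.group(1).strip()
        if prose:
            out.append(prose)
        # ":skip" applies to the code that follows this comment.
        skip = match.group(2) is not None
        pos = match.end()
    tail = source[pos:].strip("\n")
    if tail.strip() and not skip:
        out.append(f"~~~{lang}\n{tail}\n~~~")
    return "\n\n".join(out) + "\n"
```

With this approach a plain /* ... */ comment stays inside its code block,
so only comments that opt in with the header are rendered as prose.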

Maybe this would be better suited to an awk script run manually, and not 
be part of pandoc at all. I'm using this sed command, which does the job 
somewhat.

  sed -e '1!s/^\/\* */~~~\n\n/;/\/\*/!s/\*\/$/\n\n~~~C/' | pandoc --toc -s -S -o doc.html
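
For readers who find the sed expression hard to follow, here is one reading
of its two substitutions as a line-by-line Python function. The function
name is made up, and the translation assumes GNU sed behaviour, where \n in
the replacement inserts a newline. It also shows why the result is only
approximate: a /* on the first line is left alone, and the last */ opens a
fence that nothing closes.

```python
import re


def sed_equivalent(lines):
    """Apply the two substitutions of the sed one-liner to a list of lines."""
    out = []
    for number, line in enumerate(lines, start=1):
        if number != 1:
            # 1!s/^\/\* */~~~\n\n/
            # On every line but the first, a leading "/*" closes the fence.
            line = re.sub(r"^/\* *", "~~~\n\n", line, count=1)
        if "/*" not in line:
            # /\/\*/!s/\*\/$/\n\n~~~C/
            # On lines without "/*", a trailing "*/" opens a C fence.
            line = re.sub(r"\*/$", "\n\n~~~C", line, count=1)
        out.append(line)
    return out


if __name__ == "__main__":
    sample = ["/*", "Intro", "*/", "int x;", "/* More", "*/"]
    print("\n".join(sed_equivalent(sample)))
```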

What are others' thoughts on this?

Sam

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
To post to this group, send email to pandoc-discuss@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/36c4d1d4-2d9d-4919-97ab-0eaf588b22a3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Pandoc support for implicit fenced code blocks in source files
@ 2017-08-04 15:03   ` John MacFarlane
From: John MacFarlane @ 2017-08-04 15:03 UTC
  To: pandoc-discuss

I'd say the obvious thing to do is to create a program
that preprocesses the C source file, converting it into a
Markdown document that you can pipe through pandoc.

Or write a standalone tool in Haskell, using pandoc
as a library.  (And language-c to parse the source
code.)


* Re: Pandoc support for implicit fenced code blocks in source files
@ 2017-08-07  8:42       ` Sam Liddicott
From: Sam Liddicott @ 2017-08-07  8:42 UTC
  To: pandoc-discuss


Thanks, John, for both your replies.

Sam


