public inbox archive for pandoc-discuss@googlegroups.com
* filter to remove inline HTML comments
@ 2014-12-12 21:53 Pablo Rodríguez
       [not found] ` <548B63DF.5050902-S0/GAf8tV78@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Rodríguez @ 2014-12-12 21:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Dear Matthew,

Although I read the filters you sent me, I have just realized that the
Pandoc filter (https://gist.github.com/mpickering/b9fceb4cf9f11e116b49)
only removes block comments.

Would it be possible for it to remove inline comments as well?

(I have tried to learn from the filter that removes footnotes,
but I’m afraid this is all Greek to me.)

Many thanks for your help,


Pablo
-- 
http://www.ousia.tk

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/548B63DF.5050902%40web.de.
For more options, visit https://groups.google.com/d/optout.



* Re: filter to remove inline HTML comments
       [not found] ` <548B63DF.5050902-S0/GAf8tV78@public.gmane.org>
@ 2014-12-12 23:39   ` BPJ
       [not found]     ` <CADAJKhA2zy5KJ-FC44BKTeuTkw_CCwZa3c+NJAgScBFEOjLnog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: BPJ @ 2014-12-12 23:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


I thought HTML had only one type of comment, namely the delimited kind, or
am I missing something? Does pandoc sometimes parse HTML comments as
RawInline and sometimes as RawBlock?

If the goal is to have comments in your Markdown that don't show up in the
generated HTML, you can use YAML blocks that contain only YAML comments:

---
# This is a YAML comment
...

/bpj

On Friday, 12 December 2014, Pablo Rodríguez <oinos-S0/GAf8tV78@public.gmane.org> wrote:

> Dear Matthew,
>
> although I read the filters you sent me, I have just realized that the
> Pandoc filter (https://gist.github.com/mpickering/b9fceb4cf9f11e116b49)
> only removes block comments.
>
> Would it be possible that it also removes inline comments?
>
> (I have tried to learn from the filter that removes footnotes,
> but I’m afraid this is all Greek to me.)
>
> Many thanks for your help,
>
>
> Pablo
> --
> http://www.ousia.tk


* Re: filter to remove inline HTML comments
       [not found]     ` <CADAJKhA2zy5KJ-FC44BKTeuTkw_CCwZa3c+NJAgScBFEOjLnog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-12-13 10:14       ` Pablo Rodríguez
       [not found]         ` <548C1181.6010806-S0/GAf8tV78@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Rodríguez @ 2014-12-13 10:14 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 12/13/2014 12:39 AM, BPJ wrote:
> I thought HTML had only one type of comments, namely delimited, or am I
> missing something? Does pandoc parse HTML comments sometimes as RawInline
> and sometimes as RawBlock?

HTML comments themselves may be blocks or inline elements.

    <!--

    This is a block comment.

    Isn’t it?

    -->

    This is a<!--n inline--> comment.


pandoc removes them when converting to a format other than HTML. When
converting to HTML, it simply passes them through to the output.

> If the issue is to have comments in your markdown which don't show up in
> generated HTML you can use YAML blocks which only contain YAML comments:

John told me about this, but YAML comments are more cumbersome to write
than HTML comments, for both block and inline use.


Pablo
-- 
http://www.ousia.tk


* Re: filter to remove inline HTML comments
       [not found]         ` <548C1181.6010806-S0/GAf8tV78@public.gmane.org>
@ 2014-12-13 14:09           ` Mark Szepieniec
       [not found]             ` <CAE4-1rUH+guOdJvpmMNLS8_xKKbj=oBXcdqrF8eT7uEAZWPkrQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-12-14 21:29           ` BP Jonsson
  1 sibling, 1 reply; 7+ messages in thread
From: Mark Szepieniec @ 2014-12-13 14:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


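module Main where

import Text.Pandoc.JSON
import Data.List (isPrefixOf)

main :: IO ()
main = toJSONFilter removeComments

-- Drop raw HTML inlines that are comments; keep every other inline as it is.
removeComments :: Inline -> [Inline]
removeComments e@(RawInline (Format "html") s) = if isComment s then [] else [e]
removeComments e = [e]

-- Raw HTML counts as a comment if it starts with the opening delimiter.
isComment :: String -> Bool
isComment = isPrefixOf "<!--"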



This should work, although I've not tested it myself.
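
The prefix test in the filter accepts any raw HTML that merely starts with `<!--`. A slightly stricter check, sketched here on the assumption that a comment node always carries its complete `<!-- ... -->` text, also requires the closing delimiter. `isFullComment` is a made-up name for illustration and is not part of the filter above.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Hypothetical helper: True only when the raw HTML opens with "<!--",
-- closes with "-->", and is long enough for the two not to overlap.
isFullComment :: String -> Bool
isFullComment s =
  length s >= 7 && "<!--" `isPrefixOf` s && "-->" `isSuffixOf` s

main :: IO ()
main = mapM_ (print . isFullComment)
  [ "<!--n inline-->"    -- complete comment
  , "<!-- unterminated"  -- opening delimiter only
  , "<em>"               -- ordinary raw HTML
  ]
```

Whether the extra strictness is worth it depends on the input; for Markdown parsed by pandoc the simple prefix test is probably enough.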

Mark


On Sat, Dec 13, 2014 at 11:14 AM, Pablo Rodríguez <oinos-S0/GAf8tV78@public.gmane.org> wrote:
>
> On 12/13/2014 12:39 AM, BPJ wrote:
> > I thought HTML had only one type of comments, namely delimited, or am I
> > missing something? Does pandoc parse HTML comments sometimes as RawInline
> > and sometimes as RawBlock?
>
> HTML comments themselves may be blocks or inline elements.
>
>     <!--
>
>     This is a block comment.
>
>     Isn’t it?
>
>     -->
>
>     This is a<!--n inline--> comment.
>
>
> pandoc removes them when parsing to other format than HTML. When parsing
> to HTML, it simply passes them to the output text.
>
> > If the issue is to have comments in your markdown which don't show up in
> > generated HTML you can use YAML blocks which only contain YAML comments:
>
> John told me about this, but YAML comments are harder to write for both
> block and inline comments.
>
>
> Pablo
> --
> http://www.ousia.tk


* Re: filter to remove inline HTML comments
       [not found]             ` <CAE4-1rUH+guOdJvpmMNLS8_xKKbj=oBXcdqrF8eT7uEAZWPkrQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-12-13 19:09               ` Pablo Rodríguez
       [not found]                 ` <548C8EF8.8050007-S0/GAf8tV78@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Rodríguez @ 2014-12-13 19:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Many thanks for your reply and the code, Mark.

How can I merge both filters (for block and inline comments) into a
single one?

I tried something like:

#!/usr/bin/runhaskell

module Main where

import Text.Pandoc.JSON
import Data.List

main :: IO ()
main = do
    toJSONFilter removeBlockComments removeInlineComments
    toJSONFilter removeBlockComments removeInlineComments

removeBlockComments :: Block -> [Block]
removeBlockComments e@(RawBlock (Format "html") _) = if isComment s then [] else [e]
removeBlockComments e = [e]

removeInlineComments :: Inline -> [Inline]
removeInlineComments e@(RawInline (Format "html") s) = if isComment s then [] else [e]
removeInlineComments e = [e]

isComment :: String -> Bool
isComment = isPrefixOf "<!--"

But I must confess that this is the first time I have tried to write (or
even read) anything in Haskell.

How could both filters be merged?

Many thanks for your help,


Pablo



On 12/13/2014 03:09 PM, Mark Szepieniec wrote:
> module Main where
>  
> import Text.Pandoc.JSON
> import Data.List
>  
> main = toJSONFilter removeComments
>  
> removeComments :: Inline -> [Inline]
> removeComments e@(RawInline (Format "html") s) = if isComment s then [] else [e]
> removeComments e = [e]
>  
> isComment :: String -> Bool
> isComment = isPrefixOf "<!--"
> 
> This should work, although I've not tested it myself.
> 
> Mark


-- 
http://www.ousia.tk



* Re: filter to remove inline HTML comments
       [not found]                 ` <548C8EF8.8050007-S0/GAf8tV78@public.gmane.org>
@ 2014-12-13 19:29                   ` Mark Szepieniec
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Szepieniec @ 2014-12-13 19:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Merging the two filters is not straightforward, as one operates on Blocks
and the other on Inlines. It would be doable by merging them into a
function that operates on Pandocs (the datatype pandoc uses to represent
complete documents). However, I don't see why you would want to go to the
trouble of doing that; I think it's more elegant to keep them separate and
if you need to use both just do `pandoc ... --filter removeInlineComments
--filter removeBlockComments` which would apply the filters in that order.
That way, things are more modular and therefore easier to repurpose for
future use cases.
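
For anyone who does want a single executable, the approach mentioned above (one function over the whole Pandoc document) might look roughly like the sketch below. It assumes a recent pandoc-types in which raw content is `Text` rather than the `String` used elsewhere in this thread, and it relies on `walk` from Text.Pandoc.Walk. `removeAllComments` is a name invented for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text as T
import Text.Pandoc.JSON
import Text.Pandoc.Walk (walk)

main :: IO ()
main = toJSONFilter removeAllComments

-- Hypothetical combined filter: first a pass over every list of inlines,
-- then a pass over every list of blocks.
removeAllComments :: Pandoc -> Pandoc
removeAllComments =
  walk (concatMap removeBlockComments) . walk (concatMap removeInlineComments)

-- Drop raw HTML blocks that look like comments.
removeBlockComments :: Block -> [Block]
removeBlockComments e@(RawBlock (Format "html") s)
  | isComment s = []
  | otherwise   = [e]
removeBlockComments e = [e]

-- Drop raw HTML inlines that look like comments.
removeInlineComments :: Inline -> [Inline]
removeInlineComments e@(RawInline (Format "html") s)
  | isComment s = []
  | otherwise   = [e]
removeInlineComments e = [e]

-- Assumption: any raw HTML starting with "<!--" is a comment.
isComment :: T.Text -> Bool
isComment = T.isPrefixOf "<!--"
```

Compiled to an executable, this would be used with a single `--filter` option instead of two. Keeping the two per-element functions separate inside the file preserves most of the modularity argued for above.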

On Sat, Dec 13, 2014 at 8:09 PM, Pablo Rodríguez <oinos-S0/GAf8tV78@public.gmane.org> wrote:
>
> Many thanks for your reply and the code, Mark.
>
> How can I merge both filters (for block and inline comments) into a
> single one.
>
> I tried something like:
>
> #!/usr/bin/runhaskell
>
> module Main where
>
> import Text.Pandoc.JSON
> import Data.List
>
> main :: IO ()
> main = do
>     toJSONFilter removeBlockComments removeInlineComments
>     toJSONFilter removeBlockComments removeInlineComments
>
> removeBlockComments :: Block -> [Block]
> removeBlockComments e@(RawBlock (Format "html") _) = if isComment s then
> [] else [e]
> removeBlockComments e = [e]
>
> removeInlineComments :: Inline -> [Inline]
> removeInlineComments e@(RawInline (Format "html") s) = if isComment s
> then [] else [e]
> removeInlineComments e = [e]
>
> isComment :: String -> Bool
> isComment = isPrefixOf "<!--"
>
> But I must confess that this is the first time I try to write (or even
> read) something in Haskell.
>
> How could be both filters merged?
>
> Many thanks for your help,
>
>
> Pablo
>
>
>
> On 12/13/2014 03:09 PM, Mark Szepieniec wrote:
> > module Main where
> >
> > import Text.Pandoc.JSON
> > import Data.List
> >
> > main = toJSONFilter removeComments
> >
> > removeComments :: Inline -> [Inline]
> > removeComments e@(RawInline (Format "html") s) = if isComment s then [] else [e]
> > removeComments e = [e]
> >
> > isComment :: String -> Bool
> > isComment = isPrefixOf "<!--"
> >
> > This should work, although I've not tested it myself.
> >
> > Mark
>
>
> --
> http://www.ousia.tk


* Re: filter to remove inline HTML comments
       [not found]         ` <548C1181.6010806-S0/GAf8tV78@public.gmane.org>
  2014-12-13 14:09           ` Mark Szepieniec
@ 2014-12-14 21:29           ` BP Jonsson
  1 sibling, 0 replies; 7+ messages in thread
From: BP Jonsson @ 2014-12-14 21:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2014-12-13 11:14, Pablo Rodríguez wrote:
> On 12/13/2014 12:39 AM, BPJ wrote:
>> I thought HTML had only one type of comments, namely delimited, or am I
>> missing something? Does pandoc parse HTML comments sometimes as RawInline
>> and sometimes as RawBlock?
>
> HTML comments themselves may be blocks or inline elements.
>
>      <!--
>
>      This is a block comment.
>
>      Isn’t it?
>
>      -->
>
>      This is a<!--n inline--> comment.
>
>
> pandoc removes them when parsing to other format than HTML. When parsing
> to HTML, it simply passes them to the output text.

My question was whether pandoc parses the one into a RawBlock and
the other into a RawInline, rather than parsing both as RawInline
(the first inside a Para of its own). It turns out it does the
former, but that is not obvious without checking, since the HTML
syntax is the same in both cases.

     [
     {
         "unMeta" : {}
     },
     [
         {
             "c" : [
                 "html",
                 "<!--\n\nThis is a block comment.\n\nIsn’t it?\n\n-->"
             ],
             "t" : "RawBlock"
         },
         {
                 ...
                     {
                 "t" : "RawInline",
                 "c" : [
                     "html",
                     "<!--n inline-->"
                 ]
                 },
                 ...
         }
     ]
     ]

>
>> If the issue is to have comments in your markdown which don't show up in
>> generated HTML you can use YAML blocks which only contain YAML comments:
>
> John told me about this, but YAML comments are harder to write for both
> block and inline comments.
>
>
> Pablo
>

