public inbox archive for pandoc-discuss@googlegroups.com
* Typesetting Markdown - Part 8
@ 2020-04-28  2:30 Dave Jarvis
       [not found] ` <8e93804b-8b3e-48ea-b0a4-620dc0ab77d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Jarvis @ 2020-04-28  2:30 UTC (permalink / raw)
  To: pandoc-discuss


Hi there!

Bored? Feeling socially isolated? Need some fun times with Pandoc?

How about typesetting a 100-year-old poem? Or converting epubs to Markdown, 
then Markdown into PDF documents?

https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/

I hope you find the post useful.

Stay safe.



* Re: Typesetting Markdown - Part 8
       [not found] ` <8e93804b-8b3e-48ea-b0a4-620dc0ab77d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-04-28  8:05   ` Albert Krewinkel
  2020-04-29  0:34   ` T. Kurt Bond
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Albert Krewinkel @ 2020-04-28  8:05 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Dave Jarvis writes:

> Bored? Feeling socially isolated? Need some fun times with Pandoc?
>
> How about typesetting a 100-year-old poem? Or converting epubs to Markdown, 
> then Markdown into PDF documents?
>
> https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/

Wonderful article, thanks for sharing!


-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: Typesetting Markdown - Part 8
       [not found] ` <8e93804b-8b3e-48ea-b0a4-620dc0ab77d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-04-28  8:05   ` Albert Krewinkel
@ 2020-04-29  0:34   ` T. Kurt Bond
  2020-04-30  2:56   ` Lua filters with ms output [Was: Re: Typesetting Markdown - Part 8] T. Kurt Bond
  2020-05-03  1:56   ` Lua filters with ms output T. Kurt Bond
  3 siblings, 0 replies; 7+ messages in thread
From: T. Kurt Bond @ 2020-04-29  0:34 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw,
	dave.jarvis-Re5JQEeQqe8AvxtiuMwx3w

Dave Jarvis <dave.jarvis-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi there!
> 
> Bored? Feeling socially isolated? Need some fun times with Pandoc?
> 
> How about typesetting a 100-year-old poem? Or converting epubs to Markdown, 
> then Markdown into PDF documents?
> 
> https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/
> 
> I hope you find the post useful.

I tried out the classify.lua filter because it looked interesting, and
I noticed that the generated context has a blank line after \startpoem
and a blank line before \stoppoem that are not there in the original
poem.md.  Is there any way to leave those blank lines out?


* Lua filters with ms output [Was: Re: Typesetting Markdown - Part 8]
       [not found] ` <8e93804b-8b3e-48ea-b0a4-620dc0ab77d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-04-28  8:05   ` Albert Krewinkel
  2020-04-29  0:34   ` T. Kurt Bond
@ 2020-04-30  2:56   ` T. Kurt Bond
       [not found]     ` <20200429.225635.1056265120665984150.tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2020-05-03  1:56   ` Lua filters with ms output T. Kurt Bond
  3 siblings, 1 reply; 7+ messages in thread
From: T. Kurt Bond @ 2020-04-30  2:56 UTC (permalink / raw)
  To: Dave Jarvis, pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Dave Jarvis <dave.jarvis-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Bored? Feeling socially isolated? Need some fun times with Pandoc?
>
> How about typesetting a 100-year-old poem? Or converting epubs to Markdown,
> then Markdown into PDF documents?
>
> https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/
>
> I hope you find the post useful.

I wanted to try something similar to the Lua filter classify.lua in
that blog, but for ms output instead of context.  Here's what I came
up with:

===== classify-ms.lua ======================================
-- from: https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/
function Div( element )
   local annotation = element.classes:find_if( matches )

   if annotation then
     annotation = annotation:gsub( "[^%w]*", "" )

     return {
       ms( ".start", annotation ),
       element,
       ms( ".stop", annotation )
     }
   end
end

function Span(element)
    local annotation = element.classes:find_if(matches)

    if annotation then
       annotation = annotation:gsub("[^%w]*", "")

       return {
          ms_inline("\\*[start", annotation, "]"),
          element,
          ms_inline("\\*[stop", annotation, "]")
       }
    end
end

function matches( s )
   return s:match( "^%a+" )
end

function ms( macro, annotation )
   return pandoc.RawBlock( "ms", macro .. annotation )
end

function ms_inline (macro, annotation, stop)
    return pandoc.RawInline ("ms", macro .. annotation .. stop)
end
============================================================

I changed the Div function to use groff syntax.  That worked fine.
(And I was glad to see that in the groff output these didn't have
extra blank lines like they did in the context output. Why were there
blank lines in the context output, BTW?)

And I added a Span function that did something similar to Div, but
instead of using RawBlock it used RawInline.  This way it could handle
ReStructuredText interpreted text roles (like :program:`pandoc`) or
pandoc markdown spans with classes (like [pandoc]{.program}).  That
worked fine.

Here's the markdown input file:

===== poem-plus.md =========================================
<!-- From: https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/ -->
``` {=ms}
.ds startprogram \\f[CW]\\m[red]
.ds stopprogram \\m[]\\fP
.de startpoem
.DS
..
.de stoppoem
.DE
..
```

This is a sentence.  This sentence talks about [pandoc]{.program}.  This is 
another sentence.

::: poem
Some say the world will end in fire,
Some say in ice.
From what I've tasted of desire
I hold with those who favor fire.
But if it had to perish twice,
I think I know enough of hate
To say that for destruction ice
Is also great,
And would suffice.
:::

This is a final sentence.
<!-- 
     Local Variables: 
     compile-command: "pandoc -f markdown -t ms --lua-filter classify-ms.lua --wrap=preserve poem-plus.md"
     End:
-->
============================================================

And here is the ms output from poem-plus.md using
--lua-filter=classify-ms.lua:

===== poem-plus.ms =========================================
.ds startprogram \\f[CW]\\m[red]
.ds stopprogram \\m[]\\fP
.de startpoem
.DS
..
.de stoppoem
.DE
..
.LP
This is a sentence.
This sentence talks about \*[startprogram]pandoc\*[stopprogram].
This is
another sentence.
.startpoem
.LP
Some say the world will end in fire,
Some say in ice.
From what I\[cq]ve tasted of desire
I hold with those who favor fire.
But if it had to perish twice,
I think I know enough of hate
To say that for destruction ice
Is also great,
And would suffice.
.stoppoem
.LP
This is a final sentence.
============================================================

That looks as I expected, and the inline markup works too: the word
pandoc shows up surrounded by \*[startprogram] and \*[stopprogram],
which makes it constant width and red in the PDF output.

And the .startpoem and .stoppoem commands showed up as I expected.
Note that they are defined to start a -ms display and end a -ms
display.  Displays don't fill lines, so the intent is that the lines
of the poem will each be a separate line in the output.

Unfortunately, there is a problem.  See that .LP right after the
.startpoem in the output?  It turns out .LP is not allowed in a
display, so the .LP cancels the display and the lines show up
filled in the output.

I tried defining the .startpoem and .stoppoem macros so that they just
use the raw groff commands ".nf" and ".fi", but there is a problem
with that, too.  It turns out that .LP explicitly resets the fill mode
to on, so the lines are filled in the output.  (It also resets a bunch
of other things as well, including the font and the font family.
There goes my hope of being able to set poems in EBGaramond instead of
the default family.)

I'm not sure that there is any good fix to this.  Do Str elements in
the internal representation have to be in a paragraph?  If not, is
there a way that I could write the function Div in the Lua filter to
walk across the contents of a Div with the annotation "poem" to get
rid of the Para elements and replace them with just a list of Str
elements?

Any ideas would be appreciated.


* Re: Lua filters with ms output [Was: Re: Typesetting Markdown - Part 8]
       [not found]     ` <20200429.225635.1056265120665984150.tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2020-04-30 19:02       ` John MacFarlane
  0 siblings, 0 replies; 7+ messages in thread
From: John MacFarlane @ 2020-04-30 19:02 UTC (permalink / raw)
  To: T. Kurt Bond, Dave Jarvis, pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

T. Kurt Bond <tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Unfortunately, there is a problem.  See that .LP right after the
> .startpoem in the output?  It turns out .LP is not allowed in a
> display, so the .LP cancels the display and the lines show up
> filled in the output.

Just have your lua filter change the Para element inside the
container into a Plain element.
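
A minimal sketch of that suggestion, assuming the container is a Div
carrying the class "poem" (as in the poem-plus.md example quoted
earlier in the thread).  It is an illustration only, not the filter
anyone in this thread actually ran:

```lua
-- Sketch only: inside a Div with class "poem", replace each Para with a
-- Plain holding the same inlines, so the ms writer should not emit .LP
-- for those blocks.
function Div (element)
   if not element.classes:includes ("poem") then
      return nil   -- returning nil leaves other Divs untouched
   end
   return pandoc.walk_block (element, {
      Para = function (para)
         return pandoc.Plain (para.content)
      end
   })
end
```

The same walk could presumably be merged into the Div function of
classify-ms.lua, so that the .startpoem/.stoppoem wrapping still
happens around the rewritten element.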



* Re: Lua filters with ms output
       [not found] ` <8e93804b-8b3e-48ea-b0a4-620dc0ab77d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
                     ` (2 preceding siblings ...)
  2020-04-30  2:56   ` Lua filters with ms output [Was: Re: Typesetting Markdown - Part 8] T. Kurt Bond
@ 2020-05-03  1:56   ` T. Kurt Bond
       [not found]     ` <20200502.215606.548079673999789845.tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  3 siblings, 1 reply; 7+ messages in thread
From: T. Kurt Bond @ 2020-05-03  1:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Dave Jarvis <dave.jarvis-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> started it with:
> How about typesetting a 100-year-old poem? Or converting epubs to Markdown,
> then Markdown into PDF documents?
>
> https://dave.autonoma.ca/blog/2020/04/28/typesetting-markdown-part-8/

T. Kurt Bond <tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> made some changes to the filter
from that page and then noted:

> Unfortunately, there is a problem.  See that .LP right after the
> .startpoem in the output?  It turns out .LP is not allowed in a
> display, so the .LP cancels the display and the lines show up
> filled in the output.

John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:
> Just have your lua filter change the Para element inside the
> container into a Plain element.

That worked very well.

I work with ReStructuredText documents a lot, and wanted to try
something like this with one of them.

This filter wraps spans that have a class, such as those produced by
interpreted text roles defined in the source ReST (like
:program:`pandoc`), in calls to user-defined groff strings
\*[start<class>] and \*[stop<class>].  The definitions are included in
the source ReST as a raw block for ms output; they contain groff
escapes that change the font and the glyph color and then change back
to the previous font and glyph color.

It also wraps divs that have classes in calls to user-defined groff
macros .start<class> and .stop<class> (also included in the source
ReST as a raw block for ms output).

For divs with the poem class, it converts each contained LineBlock
element into a list of Plain elements containing its contents,
avoiding the ms output for the LineBlock starting with .LP, which
would cancel the .DS (start display) macro we want to use in the
.startpoem macro definition.  The .LP would also reset the font family
in use to the default, another reason to avoid it.

It also converts the empty element that occurs in the line block
as a result of a blank line in the line block input into a RawBlock
that creates a blank line in the ms output, to show the division into
stanzas of the poem.

Interestingly, the first Str element in each line of the line block's
content preserves the leading spaces from the input as Unicode
NO-BREAK SPACE characters, preserving the indentation of lines in the
line block.  Unfortunately, the width of those spaces alone is not
enough to create a visually distinct indentation, so this filter
replaces those Str elements with a RawInline that outputs a groff
horizontal movement whose width is based on the number of leading
NO-BREAK SPACE characters, followed by a new Str element that has the
leading NO-BREAK SPACE characters removed.
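
The leading-space handling can also be sketched without a regex
library.  The helper names below are hypothetical, and the sketch
assumes Lua 5.3 or later (for the \u{a0} escape), as the filter below
already does:

```lua
-- Sketch only; split_indent and indent_escape are illustrative names,
-- not part of the filter posted in this thread.
NBSP = "\u{a0}"   -- NO-BREAK SPACE, two bytes in UTF-8

-- Count the leading NO-BREAK SPACE characters in text and return the
-- count together with the remainder of the string.
function split_indent (text)
   local count, pos = 0, 1
   while text:sub (pos, pos + #NBSP - 1) == NBSP do
      count = count + 1
      pos = pos + #NBSP
   end
   return count, text:sub (pos)
end

-- Build a groff horizontal movement of `count` en widths, the same
-- escape the filter emits.
function indent_escape (count)
   return string.format ("\\h'%dn'", count)
end
```

For a Str whose text is three NO-BREAK SPACE characters followed by
"Some", split_indent returns 3 and "Some", and indent_escape (3)
yields \h'3n', the escape that appears in the ms output further down.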

Here is the lua filter:

===== classify-rst-ms.lua ==================================
onig = require ("rex_onig") -- Need a regex package that understands UTF8.
-- Text in a LineBlock preserves leading spaces as Unicode NO-BREAK SPACE
leading_nobreakspace_rx = onig.new ("^(\u{a0}+)(.*)$", nil, "UTF8")

function Div( element )
   local annotation = element.classes:find_if( matches )
  
   if annotation then
      annotation = annotation:gsub( "[^%w]*", "" )
      if annotation == "poem" then
         element = pandoc.walk_block (
            element, {
               -- Replace LineBlock element with a list of Plain elements
               -- containing the LineBlock's subelements.
               LineBlock = function (el)
                  local l = {}
                  for _, subel in ipairs (el.content) do
                     if #subel == 0 then
                        -- If subel is an empty table, output a raw empty line
                        table.insert (l, pandoc.RawBlock ("ms", "\n\n"))
                     else
                        -- Check for leading NO-BREAK SPACE characters
                        local m1, m2 = onig.match (subel[1].text,
                                                   leading_nobreakspace_rx)
                        if m1 then
                           -- Replace the NO-BREAK SPACE characters with a raw
                           -- groff horizontal movement, because the
                           -- NO-BREAK SPACE characters are too narrow.
                           table.insert (subel, 1, pandoc.RawInline ("ms", string.format ("\\h'%dn'", utf8.len (m1))))
                           -- Modify what used to be the first item to just
                           -- include the trailing characters of the match.
                           subel[2] = pandoc.Str (m2)
                           table.insert (l, pandoc.Plain (subel))
                        else
                           -- Just put the subel in Plain element.
                           table.insert (l, (pandoc.Plain (subel)))
                        end
                     end
                  end
                  return l
         end })
      end
      return {
         ms( ".start", annotation ),
         element,
         ms( ".stop", annotation )
      }
   end
end

function Span(element)
   local annotation = element.classes:find_if(matches)

   if annotation then
      annotation = annotation:gsub("[^%w]*", "")

      return {
         ms_inline("\\*[start", annotation, "]"),
         element,
         ms_inline("\\*[stop", annotation, "]")
      }
   end
end

function matches( s )
  return s:match( "^%a+" )
end

function ms( macro, annotation )
  return pandoc.RawBlock( "ms", macro .. annotation )
end

function ms_inline (macro, annotation, stop)
   return pandoc.RawInline ("ms", macro .. annotation .. stop)
end
============================================================

Here is the ReST source of the document:

===== poem-plus.rst ========================================
Lua Filters For Massaging ``ms`` Output
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

.. raw:: ms

    .ds startprogram \\f[CW]\\m[red]
    .ds stopprogram \\m[]\\fP
    .de startpoem
    .ds OLDFAM \\*[FAM]
    .ds FAM BM
    .DS I 3
    ..
    .de stoppoem
    .DE
    .ds FAM \\*[OLDFAM]
    ..

.. role:: program

This is a sentence.  This sentence talks about :program:`pandoc`.  
This is
another sentence.

.. class:: poem

    | Some say the world will end in fire,
    |    Some say in ice.
    | From what I've tasted of desire
    |    I hold with those who favor fire.
    | But if it had to perish twice,
    |    I think I know enough of hate
    |    To say that for destruction ice
    |    Is also great,
    | And would suffice.
    |
    | And another line,
    |    And an indented line.

This is a final sentence.
============================================================

And here is the ms output:

===== poem-plus-rst.ms =====================================
.SH 1
Lua Filters For Massaging \f[CB]ms\f[B] Output
.pdfhref O 1 "Lua Filters For Massaging ms Output"
.pdfhref M "lua-filters-for-massaging-ms-output"
.ds startprogram \\f[CW]\\m[red]
.ds stopprogram \\m[]\\fP
.de startpoem
.ds OLDFAM \\*[FAM]
.ds FAM BM
.DS I 3
..
.de stoppoem
.DE
.ds FAM \\*[OLDFAM]
..
.LP
This is a sentence.
This sentence talks about \*[startprogram]pandoc\*[stopprogram].
This is
another sentence.
.startpoem
Some say the world will end in fire,
\h'3n'Some say in ice.
From what I\[aq]ve tasted of desire
\h'3n'I hold with those who favor fire.
But if it had to perish twice,
\h'3n'I think I know enough of hate
\h'3n'To say that for destruction ice
\h'3n'Is also great,
And would suffice.

And another line,
\h'3n'And an indented line.
.stoppoem
.LP
This is a final sentence.
============================================================

Being able to rewrite the tree and insert RawBlocks and RawInlines is
really powerful when it comes to customizing output for particular
output formats.

I hope this example is useful for others like me just learning to use
Lua filters.



* Re: Lua filters with ms output
       [not found]     ` <20200502.215606.548079673999789845.tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2020-05-03  5:15       ` T. Kurt Bond
  0 siblings, 0 replies; 7+ messages in thread
From: T. Kurt Bond @ 2020-05-03  5:15 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

For completeness I should have mentioned that the command to produce the ms
output was:

    pandoc -f rst -t ms --lua-filter classify-rst-ms.lua --wrap=preserve poem-plus.rst

-- 
T. Kurt Bond, tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org

