public inbox archive for pandoc-discuss@googlegroups.com
* NOTE and WARNING in DocBook -> Word conversions
@ 2023-01-06 19:57 Joseph Ottinger
       [not found] ` <f05af404-d529-4e22-8334-3a637bc98efcn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 2+ messages in thread
From: Joseph Ottinger @ 2023-01-06 19:57 UTC (permalink / raw)
  To: pandoc-discuss



Hi. I have a book written with asciidoctor, from which I generate DocBook, 
and that's fed into pandoc to generate Word output.

However, WARNING and NOTE types aren't rendered very well, I *think* - or 
it's entirely possible that they are rendered according to paragraph type 
and I just can't tell.

In the DocBook, WARNING and NOTE generate XML like this:

<note><simpara /></note> <!-- for NOTE types, obviously -->

When Asciidoctor generates HTML, it labels the content with "NOTE:", followed
by a paragraph with the contents of the note. In Word output generated from
DocBook, this label (the "NOTE:") is lost. That's sort of understandable,
but... how would one replicate the behavior from the HTML backend? The
DocBook *has* the types; the AST in Pandoc contains a DIV with a class that 
maps to NOTE or WARNING, so the context is there, but I have no idea how 
I'd update the Word output to reflect the NOTE or WARNING type.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/f05af404-d529-4e22-8334-3a637bc98efcn%40googlegroups.com.


* Re: NOTE and WARNING in DocBook -> Word conversions
       [not found] ` <f05af404-d529-4e22-8334-3a637bc98efcn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2023-01-06 22:32   ` BPJ
  0 siblings, 0 replies; 2+ messages in thread
From: BPJ @ 2023-01-06 22:32 UTC (permalink / raw)
  To: pandoc-discuss


Since that DocBook note element becomes a div with class "note" in the
Pandoc AST, you can use a Lua filter to modify divs with such a class. The
filter inserts a paragraph with the "Note" label, wrapped in a div with a
'custom-style' attribute, so that you can apply a custom paragraph style
to that paragraph in docx.

    <https://pandoc.org/MANUAL.html#custom-styles>

    <https://pandoc.org/MANUAL.html#option--reference-doc>

    <https://github.com/jgm/pandoc/wiki/Defining-custom-DOCX-styles-in-LibreOffice-(and-Word)>

Here is the Lua filter I use for these admonitions when converting to
docx, slightly modified:

``` lua
-- Edit this with labels and attributes for admonition div classes
local cls_data = {
  note = {
    label = 'Note',
    label_attrs = {
      ['custom-style'] = 'NoteLabel'
    },
    text_attrs = {
      -- ['custom-style'] = 'NoteText'
    }
  },
  warning = {
    label = 'Warning',
    label_attrs = {
      ['custom-style'] = 'WarningLabel'
    },
    text_attrs = {
      -- ['custom-style'] = 'WarningText'
    }
  }
}

-- Get the pandoc library under a shorter name
local p = assert(pandoc, "Cannot find the pandoc library")
if not ('table' == type(p)) then
  error("Expected variable pandoc to be table")
end

-- Create the label divs
for cls, data in pairs(cls_data) do
  data.label = p.Div({
    p.Para({ p.Str(data.label) })
  }, data.label_attrs)
end

-- The filter function
function Div (div)
  for _, cls in ipairs(div.classes) do
    local data = cls_data[cls] -- get data if any
    if data then -- if this is an admonition class
      -- Set the attributes on the div
      for name, val in pairs(data.text_attrs) do
        div.attributes[name] = val
      end
      -- Return a copy of the label div followed by the original div
      return { data.label:clone(), div }
    end
  end
  -- If no class matches
  return nil
end
```

Each key in the table `cls_data` is a class which occurs on a div which
should be styled as an admonition. You probably want to add and/or
modify entries in this table to match the classes of divs you want to
style and the paragraph style names you actually have in your
reference-doc.docx.

Each value in that table is a table with three fields:

1.  `label` is the string you want as text in the label paragraph above
    the text of the div, e.g. "Note".

2.  `label_attrs` is a table with the attributes you want to apply to
    the div containing the label paragraph, notably the custom-style to
    apply to the paragraph.

3.  `text_attrs` is a table containing attributes you want to apply to
    the admonition text, i.e. the original div. I have commented out
    these attributes since a custom-style here will override any
    paragraph styles which Pandoc might apply to paragraphs inside the
    original div. _If_ you want a custom-style here, you might want to
    use it to apply some indentation, for example.
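
As a rough sketch of extending the table to other admonition types: the
helper `admonition` below is hypothetical (it is not part of the filter
above), and the style names it produces ('TipLabel', 'CautionLabel', and
so on) are assumptions that must match paragraph styles you have defined
in your own reference doc. It also assumes that the DocBook `tip`,
`caution` and `important` elements arrive as divs with those class names,
which is worth checking with `pandoc -t native` on your own input.

```lua
-- Hypothetical helper: build one `cls_data` entry from a label string.
-- The style name is derived as <Label>Label, e.g. 'TipLabel' (assumed name).
local function admonition (label)
  return {
    label = label,
    label_attrs = { ['custom-style'] = label .. 'Label' },
    text_attrs = {} -- left empty, as in the original table
  }
end

-- Same shape as the hand-written table, with three more classes added.
local cls_data = {
  note = admonition('Note'),
  warning = admonition('Warning'),
  tip = admonition('Tip'),
  caution = admonition('Caution'),
  important = admonition('Important')
}
```

If you use something like this, it would replace the hand-written
`cls_data` table at the top of the filter; the rest of the filter stays
the same.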

An alternative strategy might be to turn each admonition into a
single-item definition list where the label is the "term" and the div
text is the "definition". Please let me know if you want that.
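
A minimal, untested sketch of that definition-list variant might look
like the following. The `labels` table and its contents are assumptions
for illustration; `pandoc.DefinitionList` and `pandoc.Str` are the
constructors from the pandoc Lua API. Each definition-list item is a pair
of the term (a list of inlines) and a list of definitions (each a list of
blocks), so the original div content becomes the single definition.

```lua
-- Assumed mapping from admonition class to the label used as the "term"
local labels = { note = 'Note', warning = 'Warning' }

-- Replace each admonition div with a single-item definition list
function Div (div)
  for _, cls in ipairs(div.classes) do
    local label = labels[cls]
    if label then
      return pandoc.DefinitionList({
        -- one item: { term inlines, { definition blocks } }
        { { pandoc.Str(label) }, { div.content } }
      })
    end
  end
  -- Leave all other divs unchanged
  return nil
end
```

Note that this drops the div itself, so its id and classes are lost, and
the label and text can then only be styled through whatever paragraph
styles the docx writer applies to definition lists.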

See here if you want to write your own Lua filter:

<https://pandoc.org/lua-filters.html>

