public inbox archive for pandoc-discuss@googlegroups.com
* Behaviour change in docbook reader
@ 2022-11-07  7:43 Erik Rask
From: Erik Rask @ 2022-11-07  7:43 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Hello,
I upgraded parts of our toolchain because its pandoc version was too old,
going all the way from 2.7.3 to 2.19.2. After that, the conversion results
from DocBook to docx were different (granted, that is a huge version jump).
Tracing this, I found one concrete example that seems to originate in the
DocBook reader: a <note> element was previously converted to a BlockQuote;
now it becomes a Div with an identifying class ("note").

I am trawling the changelogs for the releases between those two versions.
In the meantime, does anyone recognize this change offhand and know of an
extension or parameter that lets me revert to the old behaviour while I
identify the remaining changes and put together some Lua to address them?

docbook:
<note><para>Here is a note without a title</para></note>

Pandoc (-t native) 2.7.3:
BlockQuote
 [Para [Strong [Str "Note"]]
 ,Para [Str "Here",Space,Str "is",Space,Str "a",Space,Str "note",Space,Str "without",Space,Str "a",Space,Str "title"]]

2.19.2:
Div
    ( "" , [ "note" ] , [] )
    [ Para [ Str "Here", Space, Str "is", Space, Str "a", Space, Str "note", Space, Str "without", Space
        , Str "a", Space, Str "title"
        ]
    ]

Regards,
Erik Rask
-- 
Hurrying will get you nowhere faster.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAMXDC9%2Bw_75mDebPvJ-RyuUyJkd7x0QTbqUe7ZT1mwuWj_wZqA%40mail.gmail.com.


* Re: Behaviour change in docbook reader
@ 2022-11-08 23:17   ` John MacFarlane
From: John MacFarlane @ 2022-11-08 23:17 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

With the changelog getting to be book-sized, it’s a big undertaking to look at all the docbook changes since then.
A cleaner approach would be to run a git diff on src/Text/Pandoc/Readers/DocBook.hs between the 2.7.3 and 2.19.2 tags.

Anyway, a Lua filter to change the 2.19.2 result to the 2.7.3 result should be quite trivial to write (3-4 lines).
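
For illustration, such a filter might look like the sketch below. It is not taken from the thread: it assumes the pandoc 2.19.2 Lua filter API, handles only the untitled <note> case shown in the original message, and the bold "Note" paragraph is copied from the 2.7.3 output quoted there. How titled notes should be mapped is left open.

```lua
-- Sketch: turn the Div with class "note" produced by the 2.19.2 DocBook
-- reader back into the 2.7.3 shape, a BlockQuote that starts with a
-- bold "Note" paragraph. Titled notes are not handled here.
function Div(el)
  if el.classes:includes("note") then
    local blocks = el.content
    -- Prepend the label paragraph that 2.7.3 emitted for untitled notes.
    table.insert(blocks, 1, pandoc.Para({ pandoc.Strong({ pandoc.Str("Note") }) }))
    return pandoc.BlockQuote(blocks)
  end
  -- Returning nothing leaves every other Div unchanged.
end
```

Saved under a hypothetical name such as note-to-blockquote.lua, it could be applied with `pandoc -f docbook -t docx --lua-filter note-to-blockquote.lua input.xml -o output.docx`, where the input and output file names are placeholders.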



