public inbox archive for pandoc-discuss@googlegroups.com
From: Sigismond <pascal.conil.lacoste-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: docx+styles to dokuwiki somehow ?
Date: Wed, 28 Jun 2023 08:00:05 -0700 (PDT)
Message-ID: <62b0db64-b7ab-48e8-9025-9c969304e1b6n@googlegroups.com>
In-Reply-To: <a62eaa45-0126-4325-878e-4dae06aba21an-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>



@Bastien
I managed to extend your workaround to the other badly handled types, thank 
you.
Then I tried to extend this already very useful Lua filter to convert some 
specific custom-styled divs to level 1 headings in the DokuWiki output.

For example, with `Warburg (Otto Heinrich)` having the custom-style `mots`, 
what I get is
```
<WRAP mots>
Warburg (Otto Heinrich)

</WRAP>
```
what I would like is
``` 
====== Warburg (Otto Heinrich) ======
``` 

So I modified the filter to add a new condition:
```
function Div (div)
local custom_style = div.attributes['custom-style']
if custom_style then
if custom_style == 'mots' then
local pre = pandoc.RawBlock('dokuwiki', '======')
local post = pandoc.RawBlock('dokuwiki', '======')
else 
local pre = pandoc.RawBlock('dokuwiki', '<WRAP ' .. custom_style .. '>')
local post = pandoc.RawBlock('dokuwiki', '</WRAP>')
end
local content = div.content
table.insert(content, 1, pre)
table.insert(content, post)
return content
end
end
```

Well, it fails in several ways:
- First, pandoc reports the error
Block, list of Blocks, or compatible element expected, got Blocks
so I guess my condition is badly formed.

- Then, if I test it without the added (and buggy) condition,
```
function Div (div)
local custom_style = div.attributes['custom-style']
if custom_style then
local pre = pandoc.RawBlock('dokuwiki', '======')
local post = pandoc.RawBlock('dokuwiki', '======')
local content = div.content
table.insert(content, 1, pre)
table.insert(content, post)
return content
end
end
```
I get… a block
``` 
======
Warburg (Otto Heinrich)

======
``` 
which DokuWiki does not render as a proper level 1 heading.

I understand that my desired modification is inline, whereas the original 
filter is designed to handle blocks, but my lack of knowledge of Lua leaves 
me struggling to get past that.

Could you please show me how to do this?
Thanks.
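
A minimal sketch of one possible fix, not a tested solution. Two things seem 
to be going on in the attempts above. In the first filter, `pre` and `post` 
are declared `local` inside the `if`/`else` branches, so they are out of 
scope (and therefore nil) by the time `table.insert` runs, which is probably 
what triggers the error. In the second, each RawBlock is written out as a 
separate block, so the `======` markers end up on their own lines. The sketch 
below returns a real `Header` element instead and lets the DokuWiki writer 
produce the heading markup. It assumes the `mots` div holds a single 
paragraph; `HEADING_STYLE` is a name introduced here for illustration only.

```lua
-- Assumption: 'mots' is the Word style that should become a level 1 heading.
-- HEADING_STYLE is a hypothetical name, not part of the original filter.
local HEADING_STYLE = 'mots'

function Div (div)
  local custom_style = div.attributes['custom-style']
  if not custom_style then
    return nil -- no custom style: leave the div unchanged
  end

  if custom_style == HEADING_STYLE then
    -- Flatten the div's blocks to inlines and build a real Header,
    -- so the writer emits the heading on a single line.
    local inlines = pandoc.utils.blocks_to_inlines(div.content)
    return pandoc.Header(1, inlines)
  end

  -- Every other custom style keeps the <WRAP> treatment.
  -- pre and post are declared in the scope where they are used.
  local pre = pandoc.RawBlock('dokuwiki', '<WRAP ' .. custom_style .. '>')
  local post = pandoc.RawBlock('dokuwiki', '</WRAP>')
  local content = div.content
  table.insert(content, 1, pre)
  table.insert(content, post)
  return content
end
```

It could be run with something like 
`pandoc -f docx+styles -t dokuwiki --lua-filter=styles.lua input.docx`, 
where `styles.lua` and `input.docx` are placeholder names. When combined with 
the list workaround quoted below, this `Div` function would take the place of 
the one in the last entry of the returned filter list.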

On Tuesday, 27 June 2023 at 12:21:20 UTC+2, Sigismond wrote:

> Thank you Bastien.
> I did not find a bug report that specifically covers this issue, though 
> there are many other issues with dokuwiki and lists.
> So I filed bug report #8920 <https://github.com/jgm/pandoc/issues/8920>
>
> On Tuesday, 27 June 2023 at 11:53:48 UTC+2, Bastien DUMONT wrote:
>
>> I think that it is worth a bug report if it has not been done yet. As a 
>> workaround, you can expand the filter to remove all divs with custom-style 
>> from the bullet lists. 
>>
>> ``` 
>> function Div (div) 
>> local custom_style = div.attributes['custom-style'] 
>> if custom_style then 
>> local pre = pandoc.RawBlock('dokuwiki', '<WRAP "' .. custom_style .. '">') 
>> local post = pandoc.RawBlock('dokuwiki', '</WRAP>') 
>> table.insert(div.content, post) 
>> table.insert(div.content, 1, pre) 
>> return div.content 
>> end 
>> end 
>>
>> local remove_custom_styles = { 
>> Div = function(div) 
>> if div.attributes['custom-style'] then 
>> return div.content 
>> end 
>> end 
>> } 
>>
>> function BulletList(list) 
>> -- Do the same for all types that are badly handled with docx+styles 
>> -- (e.g. OrderedList) 
>> return list:walk(remove_custom_styles) 
>> end 
>>
>> return { 
>> -- We must process the bullet lists first to remove the divs 
>> -- before they are converted to raw code. 
>> { BulletList = BulletList }, 
>> { Div = Div } 
>> } 
>>
>> ``` 
>>
>> On Tuesday, 27 June 2023 at 02:35:06 AM, Sigismond wrote: 
>> > Well… it does work but, somehow, docx+styles messes up the lists. 
>> > For a simple docx with just one unordered list, here is what I get with 
>> > -f docx+styles -t dokuwiki: 
>> > <HTML><ul></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>Liste 1<HTML></p></HTML> 
>> > <HTML></li></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 2<HTML></p></HTML> 
>> > <HTML></li></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 3<HTML></p></HTML> 
>> > 
>> > <HTML><ul></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 3a<HTML></p></HTML> 
>> > <HTML></li></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 3b<HTML></p></HTML> 
>> > <HTML></li></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 3c<HTML></p></HTML> 
>> > <HTML></li></HTML><HTML></ul></HTML> 
>> > <HTML></li></HTML> 
>> > <HTML><li></HTML><HTML><p></HTML>liste 4<HTML></p></HTML> 
>> > <HTML></li></HTML><HTML></ul></HTML> 
>> > 
>> > Which is not parsed by dokuwiki. 
>> > 
>> > 
>> > Without +styles: 
>> >   * Liste 1 
>> >   * liste 2 
>> >   * liste 3 
>> >     * liste 3a 
>> >     * liste 3b 
>> >     * liste 3c 
>> >   * liste 4 
>> > 
>> > Which is syntactically correct dokuwiki format. 
>> > 
>> > If I understand correctly, pandoc seems to treat the list as badly 
>> > formatted only when +styles is applied, and it emits raw HTML with <p> 
>> > tags inside the <li>s. 
>> > 
>> > So what is it? A bad implementation in the DokuWiki writer? 
>> > How can I benefit from both +styles, with my Lua filter, and lists? 
>> > 
>> > -- 
>> >   Pascal 
>> > On Monday, 26 June 2023 at 16:04:17 UTC+2, Sigismond wrote: 
>> > 
>> > Thanks a lot Bastien, it works perfectly well. 
>> > 
>> > On Monday, 26 June 2023 at 15:47:00 UTC+2, Bastien DUMONT wrote: 
>> > 
>> > With `-f docx+styles`, you can replace the divs that have custom styles 
>> > using this kind of filter: 
>> > 
>> > ``` 
>> > function Div (div) 
>> > local custom_style = div.attributes['custom-style'] 
>> > if custom_style then 
>> > local pre = pandoc.RawBlock('dokuwiki', '<WRAP "' .. custom_style .. '">') 
>> > local post = pandoc.RawBlock('dokuwiki', '</WRAP>') 
>> > local content = div.content 
>> > table.insert(content, 1, pre) 
>> > table.insert(content, post) 
>> > return content 
>> > end 
>> > end 
>> > ``` 
>> > 
>> > On Monday, 26 June 2023 at 06:16:48 AM, Sigismond wrote: 
>> > > OK, let's try it another way: 
>> > > 
>> > > I plan to use pandoc to convert several docx files to DokuWiki format. 
>> > > I need to retain custom block styles and convert them to custom tags, 
>> > > something like 
>> > > 
>> > > <WRAP my-custom-block-style> 
>> > > my dokuwiki formatted block text 
>> > > </WRAP> 
>> > > 
>> > > Do I need to develop a custom DokuWiki writer from scratch to do that, 
>> > > or is there a way to use Lua filters for this purpose? 
>> > > Sorry if the answer is obvious, but I struggle to find relevant 
>> > > information. 
>> > > 
>> > > Thanks for any help, 
>> > > -- 
>> > >   Pascal 
>> > > 
>> > > 
>> > > On Wednesday, 26 April 2023 at 16:14:20 UTC+2, pascal Conil-lacoste wrote: 
>> > > 
>> > > Hi everybody, 
>> > > 
>> > > I've been using pandoc for some years for very straightforward 
>> > > conversions. 
>> > > Now that what I plan to do is a little more complex, I struggle to find 
>> > > relevant information. 
>> > > 
>> > > I need to convert docx to dokuwiki and retain Word custom styles. I 
>> > > thought I could use docx+styles to get custom-styles in the dokuwiki 
>> > > files, but they are stripped and don't make it to the output. 
>> > > 
>> > > I would be happy with ::: {custom-style="myStyle"} my text here::: 
>> > > 
>> > > If I could get something along these lines, I would be able to apply 
>> > > some other simple transformation to get to the final dokuwiki files and 
>> > > process them with a plugin. 
>> > > 
>> > > What is the best way to achieve this? Filters? Templates? 
>> > > 
>> > > Any help welcome! 
>> > > 

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/62b0db64-b7ab-48e8-9025-9c969304e1b6n%40googlegroups.com.
