From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/32884 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Sigismond Newsgroups: gmane.text.pandoc Subject: Re: docx+styles to dokuwiki somehow ? Date: Tue, 27 Jun 2023 03:21:20 -0700 (PDT) Message-ID: References: <16df0de5-a608-4e6e-9545-3fa338229d8fn@googlegroups.com> Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_2956_1864153716.1687861280517" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="32645"; mail-complaints-to="usenet@ciao.gmane.io" To: pandoc-discuss Original-X-From: pandoc-discuss+bncBCJ4VS5M3INRBIPQ5KSAMGQEBVOXTNA-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Tue Jun 27 12:21:26 2023 Return-path: Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org Original-Received: from mail-oa1-f57.google.com ([209.85.160.57]) by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1qE5p7-0008Fs-Ne for gtp-pandoc-discuss@m.gmane-mx.org; Tue, 27 Jun 2023 12:21:25 +0200 Original-Received: by mail-oa1-f57.google.com with SMTP id 586e51a60fabf-1b052e4fc28sf927668fac.3 for ; Tue, 27 Jun 2023 03:21:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20221208; t=1687861284; x=1690453284; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:reply-to:x-original-sender :mime-version:subject:references:in-reply-to:message-id:to:from:date :sender:from:to:cc:subject:date:message-id:reply-to; bh=1AqhUeXfEQFlGgT9J9NYkouj9yuYnfqxyFdV4qdZY6E=; b=bkMEYdRR+16jfvoRzfaTnl+3mtQEtKOTNy6dtpkVoS/QFm5I29KC/sz5HhmPQyWofi Ghglu+YEzMPI0Bi25cDJmWvIcRcjE27fqWgE8JouXqTHcA2OAuIjANyI68rlkcjX8c7d bhaPL+BTEZtg1MQWy3YgRB9KsoKAwd+/6IMCtwvuIveRhNG2Q3dribKpIGPyivrGpY0N RVwM4U53obaJrGCHIbbEsQCm+QEPgz0gyZ30lTdTmYOPMqphwP74xBJh5Sos8HnWSuru 
RIDI20IKno9qwo86wiqhCvNeHtuo/Ui/ubHKZOZ7NDpjGrXlw2Q1KZEEYNK2UMNvpLyB zloA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687861284; x=1690453284; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:reply-to:x-original-sender :mime-version:subject:references:in-reply-to:message-id:to:from:date :from:to:cc:subject:date:message-id:reply-to; bh=1AqhUeXfEQFlGgT9J9NYkouj9yuYnfqxyFdV4qdZY6E=; b=A6vVys7cJDsU4fqAtr8nh/3sLZs8ZSEnZiE258uZDqlrG/f9sJQRL7uHH6/BjihOze I7s8YYFe4vQQKOsUo71NM1BdSvtAo0D/3kt3DMqRmPlvnS1zMjgvyNwy8Lpt3Joe/LhP SIqJ/Wx8/A5+WQa2uhet/27YABV/YprEw7crv3ckj0R1uBn3gP9qAW3WRASf5WGjNqXY CGRwIGVs17ViBEEJvqiXF+RHcJOdHSReHVy1JP5kodMN6BKkRCi9n3SagnHpHMmGbDxe BGLMWbCV2yAe/bxOLFE0RDajQu0oTJ03mWwGZUYTe/1p4tu6fpQxwM0Ky82C/nu7cdk0 XCtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687861284; x=1690453284; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :x-spam-checked-in-group:list-id:mailing-list:precedence:reply-to :x-original-sender:mime-version:subject:references:in-reply-to :message-id:to:from:date:x-beenthere:x-gm-message-state:sender:from :to:cc:subject:date:message-id:reply-to; bh=1AqhUeXfEQFlGgT9J9NYkouj9yuYnfqxyFdV4qdZY6E=; b=l/gtSEUhtGBmxNuN6k95DWmvcEe1LOcuokxTY0HVlAZFGbQFg+aaMN8yMSD0njDCXD hIg4D7acTbtuDnLhkpL/LrrTPL2WrGtJuOiRBI3DwlTYF2KL117UezhgfK0w6XU30Ww+ licjtUgLMhY/xDorba1mi0404bisZGJQeOhftPx8TqGJ3XMoGWYFQI9UyW+EIHv4NFCl QLmltrttAyu8l9M8vciEkiJ3Sy+4bKPzoqYy0HoWjEtfQ1f8fH2m5Y5eXTnLh1fD+3YL FqNhFStH3vm1eVYBkUBbyBN9D9cC9T8MdEr3eAxnIbm+4mu6G5 Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org X-Gm-Message-State: AC+VfDwi+6BBaESu16CEpIjrRCrB8za08PyONKzDLKNIdjJ02gkxwFqN /wqrP+nmSEiCtJN8PglI4tI= X-Google-Smtp-Source: ACHHUZ5J7K63UC0fli7DN75Y+6n0vAl2LL7MDvB6Bhr+7PcPdOfjuBCTLwTj75NZ0Syi1vO8wW1fbA== X-Received: by 2002:a05:6870:e281:b0:1b0:3821:f09e with SMTP id 
v1-20020a056870e28100b001b03821f09emr4827545oad.18.1687861284536; Tue, 27 Jun 2023 03:21:24 -0700 (PDT) X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Original-Received: by 2002:a05:6870:9d99:b0:1b0:67ac:3bbb with SMTP id pv25-20020a0568709d9900b001b067ac3bbbls166643oab.0.-pod-prod-05-us; Tue, 27 Jun 2023 03:21:21 -0700 (PDT) X-Received: by 2002:a9d:6211:0:b0:6b5:7811:730 with SMTP id g17-20020a9d6211000000b006b578110730mr5871820otj.5.1687861281077; Tue, 27 Jun 2023 03:21:21 -0700 (PDT) In-Reply-To: X-Original-Sender: pascal.conil.lacoste-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org Precedence: list Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org List-ID: X-Google-Group-Id: 1007024079513 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , Xref: news.gmane.io gmane.text.pandoc:32884 Archived-At: ------=_Part_2956_1864153716.1687861280517 Content-Type: multipart/alternative; boundary="----=_Part_2957_557429029.1687861280517" ------=_Part_2957_557429029.1687861280517 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Thank you Bastien.
I did not find a bug report that specifically covers this issue, though there are many other issues about dokuwiki and lists.
So I have filed it as bug report #8920.

On Tuesday 27 June 2023 at 11:53:48 UTC+2, Bastien DUMONT wrote:
> I think that it is worth a bug report if it has not been done yet. As a
> workaround, you can expand the filter to remove all divs with custom-style
> from the bullet lists.
>
> ```
> function Div (div)
>   local custom_style = div.attributes['custom-style']
>   if custom_style then
>     local pre = pandoc.RawBlock('dokuwiki', '<WARP "' .. custom_style .. '">')
>     local post = pandoc.RawBlock('dokuwiki', '</WARP>')
>     table.insert(div.content, post)
>     table.insert(div.content, 1, pre)
>     return div.content
>   end
> end
>
> local remove_custom_styles = {
>   Div = function(div)
>     if div.attributes['custom-style'] then
>       return div.content
>     end
>   end
> }
>
> function BulletList(list)
>   -- Do the same for all types that are badly handled with docx+styles
>   -- (e.g. OrderedList)
>   return list:walk(remove_custom_styles)
> end
>
> return {
>   -- We must process the bullet lists first to remove the divs
>   -- before they are converted to raw code.
>   { BulletList = BulletList },
>   { Div = Div }
> }
> ```
>
> On Tuesday 27 June 2023 at 02:35:06AM, Sigismond wrote:
> > Well… it does work but, somehow, docx+styles messes with the lists:
> > For a simple docx with just one unordered list, here is what I get with
> > -f docx+styles -t dokuwiki:
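> > <HTML><ul></HTML>
> > <HTML><li></HTML><HTML><p></HTML>Liste 1<HTML></p></HTML>
> > <HTML></li></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 2<HTML></p></HTML>
> > <HTML></li></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 3<HTML></p></HTML>
> >
> > <HTML><ul></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 3a<HTML></p></HTML>
> > <HTML></li></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 3b<HTML></p></HTML>
> > <HTML></li></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 3c<HTML></p></HTML>
> > <HTML></li></HTML><HTML></ul></HTML>
> > <HTML></li></HTML>
> > <HTML><li></HTML><HTML><p></HTML>liste 4<HTML></p></HTML>
> > <HTML></li></HTML><HTML></ul></HTML>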
> >
> > Which is not parsed by dokuwiki.
> >
> >
> > Without +styles:
> >   * Liste 1
> >   * liste 2
> >   * liste 3
> >     * liste 3a
> >     * liste 3b
> >     * liste 3c
> >   * liste 4
> >
> > Which is syntactically correct dokuwiki format.
> >
> > If I understand correctly, Pandoc seems to consider an ordered list badly
> > formatted only when +styles is applied, and it spits out some raw HTML with
> > <p> tags inside <li>s.
> >
> > So what is it? A bad implementation in the Dokuwiki writer?
> > How can I benefit from both +styles, with my Lua filter, and lists?
> >
> > --
> >   Pascal
> > On Monday 26 June 2023 at 16:04:17 UTC+2, Sigismond wrote:
> >
> > Thanks a lot Bastien, it works perfectly well.
> >
> > On Monday 26 June 2023 at 15:47:00 UTC+2, Bastien DUMONT wrote:
> >
> > With `-f docx+styles`, you can replace the divs with custom styles with
> > this kind of filter:
> >
> > ```
> > function Div (div)
> >   local custom_style = div.attributes['custom-style']
> >   if custom_style then
> >     local pre = pandoc.RawBlock('dokuwiki', '<WARP "' .. custom_style .. '">')
> >     local post = pandoc.RawBlock('dokuwiki', '</WARP>')
> >     local content = div.content
> >     table.insert(content, 1, pre)
> >     table.insert(content, post)
> >     return content
> >   end
> > end
> > ```
> >
> > On Monday 26 June 2023 at 06:16:48AM, Sigismond wrote:
> > > OK, let's try it another way:
> > >
> > > I plan to use Pandoc to convert several docx files to dokuwiki format.
> > > I need to retain custom block styles and convert them to custom tags,
> > > something like
> > >
> > > <WARP my-custom-block-style>
> > > my dokuwiki formatted block text
> > > </WARP>
> > >
> > > Do I need to develop a custom dokuwiki writer from scratch to do that, or
> > > is there a way to use Lua filters for this purpose?
> > > Sorry if the answer is obvious but I struggle to find relevant information.
> > >
> > > Thanks for any help,
> > > --
> > >   Pascal
> > >
> > >
> > > On Wednesday 26 April 2023 at 16:14:20 UTC+2, pascal Conil-lacoste wrote:
> > >
> > > Hi everybody,
> > >
> > > I've been using pandoc for some years to accomplish very straightforward
> > > conversions.
> > > Now that what I plan to do is a little more complex, I struggle to find
> > > relevant information.
> > >
> > > I need to convert docx to dokuwiki and retain Word custom styles.
> > > I thought I could use docx+styles to get custom-styles in dokuwiki files,
> > > but they don't make it to the output and get stripped.
> > >
> > > I would be happy with ::: {custom-style="myStyle"} my text here:::
> > >
> > > If I could get something along these lines, I would be able to apply some
> > > other simple transformation to get to the final dokuwiki files and treat
> > > them with a plugin.
> > >
> > > What is the best way to achieve this? Filters? Templates?
> > >
> > > Any help welcome!

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/a62eaa45-0126-4325-878e-4dae06aba21an%40googlegroups.com.
------=_Part_2957_557429029.1687861280517 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thank you Bastien.
    ------=_Part_2957_557429029.1687861280517-- ------=_Part_2956_1864153716.1687861280517--
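
Following the comment in the filter above ("Do the same for all types that are badly handled with docx+styles (e.g. OrderedList)"), here is a minimal sketch of how the workaround might be extended to ordered lists. It assumes a pandoc version in which elements have a `walk` method (2.17 or later, as far as I know), and it has not been run against the docx from this thread. The names `wrap_custom_style` and `strip_styles` are made up for this example; whether other block types need the same treatment is not established here.

```lua
-- Sketch only: same workaround as above, applied to both list types.

-- Wrap the content of a styled div in <WARP "style"> ... </WARP> raw blocks.
local function wrap_custom_style(div)
  local custom_style = div.attributes['custom-style']
  if custom_style then
    local pre = pandoc.RawBlock('dokuwiki', '<WARP "' .. custom_style .. '">')
    local post = pandoc.RawBlock('dokuwiki', '</WARP>')
    table.insert(div.content, 1, pre)
    table.insert(div.content, post)
    return div.content
  end
end

-- Inside lists, drop the styled divs and keep only their content,
-- so that the writer sees plain list items.
local remove_custom_styles = {
  Div = function(div)
    if div.attributes['custom-style'] then
      return div.content
    end
  end
}

-- Hypothetical helper shared by both list types.
local function strip_styles(list)
  return list:walk(remove_custom_styles)
end

return {
  -- First pass: clean the lists before any div becomes raw code.
  { BulletList = strip_styles, OrderedList = strip_styles },
  -- Second pass: convert the remaining styled divs.
  { Div = wrap_custom_style }
}
```

If this is saved as, say, `warp.lua` (a placeholder name), it would be passed to pandoc with `--lua-filter=warp.lua` together with `-f docx+styles -t dokuwiki`. The two-pass return table keeps the ordering constraint from the original filter: lists are cleaned before the remaining divs are turned into raw blocks.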