public inbox archive for pandoc-discuss@googlegroups.com
From: Noah Malmed <nmalmed-O2gogPphfo5dNrB6XyqITwC/G2K4zDHf@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Feature Idea: docx -> HTML table styling
Date: Thu, 16 Jun 2022 08:16:25 -0700 (PDT)
Message-ID: <ec31e976-089b-4916-949a-fad874b2a8adn@googlegroups.com>
In-Reply-To: <m2tu8l7dwk.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>


Hi John,

Thanks for responding! I have a few clarifying questions, mainly around
Attr, because I don't quite understand what values are stored in it.

When you say adding `vertical-align` to attributes should be okay, what do
you mean? Would it be more appropriate to store it in the XML format that
docx uses to denote vertical alignment?
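
To make the question concrete, here is a rough sketch of the first option: reading the docx cell property and storing a format-neutral key-value pair. It is plain Python, not pandoc code. The function name `cell_valign_attr`, the mapping table, and the sample cell are hypothetical, and the sketch assumes the cell's alignment lives in a `w:vAlign` element under `w:tcPr` with the values top, center, or bottom.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by docx document parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical mapping from docx vAlign values to CSS-style names.
VALIGN_TO_CSS = {"top": "top", "center": "middle", "bottom": "bottom"}


def cell_valign_attr(tc_xml):
    """Return [("vertical-align", value)] for a <w:tc> element, or [] if unset."""
    tc = ET.fromstring(tc_xml)
    valign = tc.find(f"{{{W_NS}}}tcPr/{{{W_NS}}}vAlign")
    if valign is None:
        return []
    css = VALIGN_TO_CSS.get(valign.get(f"{{{W_NS}}}val"))
    # Unknown values are dropped rather than guessed at.
    return [("vertical-align", css)] if css else []


# Hand-written sample cell, not taken from a real document.
SAMPLE_TC = (
    f'<w:tc xmlns:w="{W_NS}">'
    '<w:tcPr><w:vAlign w:val="center"/></w:tcPr>'
    '<w:p/>'
    '</w:tc>'
)

print(cell_valign_attr(SAMPLE_TC))
```

The resulting pair would sit in the key-value part of the cell's Attr, and a writer would still have to be taught to look for it.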

Also, I think I was a little thrown off by some black-box testing we did
on the HTML reader. When we ran `pandoc -f html -t native` on the
following HTML:

<table>
   <tbody>
      <tr>
         <td style="background-color: green">green background</td>
      </tr>
   </tbody>
</table>

We received the following output:

[ Table
    ( "" , [] , [] )
    (Caption Nothing [])
    [ ( AlignDefault , ColWidthDefault ) ]
    (TableHead ( "" , [] , [] ) [])
    [ TableBody
        ( "" , [] , [] )
        (RowHeadColumns 0)
        []
        [ Row
            ( "" , [] , [] )
            [ Cell
                ( "" , [] , [ ( "style" , "background-color: green" ) ] )
                AlignDefault
                (RowSpan 1)
                (ColSpan 1)
                [ Plain [ Str "green" , Space , Str "background" ] ]
            ]
        ]
    ]
    (TableFoot ( "" , [] , [] ) [])
]

Seeing that the style was preserved led us to believe that it would be
appropriate to store some styling in the AST. Is the problem with our
proposed solution that we would be storing information that is
specific to HTML? Is there perhaps a more generic language that would be
more appropriate for storing that information?
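
As an illustration of what a less HTML-specific form might look like, here is a minimal sketch in plain Python (not pandoc code). It models the Attr triple from the output above as (identifier, classes, key-value pairs) and expands a CSS `style` entry into separate pairs. The function name `split_style` is hypothetical, and the parsing is deliberately naive.

```python
from typing import List, Tuple

# Model of the Attr triple seen in the native output:
# (identifier, classes, key-value pairs).
Attr = Tuple[str, List[str], List[Tuple[str, str]]]


def split_style(attr: Attr) -> Attr:
    """Expand a "style" entry into one key-value pair per CSS declaration.

    Naive on purpose: splits on ";" and ":" only, so values that
    themselves contain ";" (for example some url(...) values) would break.
    """
    ident, classes, kvs = attr
    out = []
    for key, value in kvs:
        if key != "style":
            out.append((key, value))
            continue
        for decl in value.split(";"):
            if ":" in decl:
                prop, _, val = decl.partition(":")
                out.append((prop.strip(), val.strip()))
    return (ident, classes, out)


# The cell Attr from the native output above.
cell_attr = ("", [], [("style", "background-color: green")])
print(split_style(cell_attr))
```

Whether individual pairs like these are any more generic than a raw `style` string is exactly the open question.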

Thanks,

Noah
 


On Wednesday, June 15, 2022 at 5:36:18 PM UTC-5 John MacFarlane wrote:

> Noah Malmed <nma...-O2gogPphfo5dNrB6XyqITwC/G2K4zDHf@public.gmane.org> writes:
>
> > Hello!
> >
> > We use Pandoc often to convert from docx to HTML, and many of the
> > documents we convert include tables. As far as we can tell, almost all
> > of the table styling is lost in the docx reader. Specifically, we care
> > about 5 things:
> >
> > 1. Text justification (left, center, or right)
> >
> > 2. Vertical alignment (top, middle, or bottom)
> >
> > 3. Text indentation
> >
> > 4. Cell shading and text color
> >
> > 5. Table borders 
> >
> > We hope to enhance the docx reader so that these stylings get preserved
> > in the AST.
> >
> > Proposed solutions:
> >
> > 1. It seems like text justification already exists in the AST through
> > the Alignment value. It just needs to get implemented in the docx reader,
> > as described in this issue: https://github.com/jgm/pandoc/issues/6316
>
> Correct.
>
> > 2. Add the vertical alignment style to attributes as suggested here 
> > <https://github.com/jgm/pandoc/issues/7444#issuecomment-881649125>
>
> Should be okay. However, adding `vertical-align` there won't do
> any good for converting to HTML unless the HTML writer is
> modified to be sensitive to this attribute.
>
> > 3. Add text indentation to attributes in the form of the style
> > padding-left
>
> You're talking about directly adding 'style' to attributes, with
> CSS contents? That would make the docx reader very good for
> converting to HTML and not so good for any other format.
>
> Note that in general pandoc does not strive to preserve every
> small detail of formatting, only structure. See the beginning
> of the manual.
>
> > 4. Add cell shading and text color to attributes in the form of the
> > styles background-color and color
>
> See above, also search the issue tracker for 'color'.
>
> > 5. Add table borders to attributes in the form of the style border
>
> I think this falls into the category of things that are beyond
> pandoc's scope. We don't strive to reproduce all the formatting
> details in conversions. Again, see the beginning of the manual.
>
>
