public inbox archive for pandoc-discuss@googlegroups.com
* Please give the Docx reader a test drive
@ 2014-08-11 21:55 Jesse Rosenthal
From: Jesse Rosenthal @ 2014-08-11 21:55 UTC
  To: pandoc-discuss@googlegroups.com

Dear All,

The MS Word docx reader in the new pandoc is working pretty well these
days. Before the next release, though, I'd love it if we could run as
many real-world Word docs through it as possible, to catch any odd
behavior. Documents from as many different academic/professional fields
as possible would be ideal, since I know everyone uses Word a bit
differently. Everyone testing it so far has brought some oversight to my
attention, so I'd love to get more eyes on it.

If you do try it out, and you find something that doesn't behave
correctly, please open an issue on my pandoc fork
(<https://github.com/jkr/pandoc.git>), and send me the document over
email if it's possible to share it. If you can't share it, it would be
great if you could try to reproduce the issue in a different document.

Some notes:

  - All text, and all text formatting (unless it comes from an unusual
    style) should be preserved. If it isn't, it's a bug.

  - There's not much we can do, with a few exceptions, with ad-hoc
    visual stylization: making columns by pressing space a lot, pressing
    return to make the end-of-the-line a bit prettier. The rule of thumb
    is: can the property in question stand a change in margins and font?
    If so, we should probably be able to interpret it. If not, we
    probably can't.

  - Headers, titles and the like will be interpreted correctly if they
    have the correct style. The reader can't guess at a header just
    because some text is in bold, or uses another font. (Though at some
    point in the future, I might introduce a filter with some heuristics
    for guessing.)

  - Block quotes should be picked up either from the Quote or
    BlockQuote style, or from block indentation. If someone uses
    another style to produce a blockquote, please let me know, so I can
    add it to the list.

  - Track changes can be handled with the
    "--track-changes=accept|reject|all" option. accept will take the
    insertions and drop the deleted text, reject will keep the deleted
    text and drop the insertions, and all will put in everything,
    marked up with spans.

  - Equations should appear as LaTeX.
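
One way to compare the three track-changes modes described in the notes
above is to convert the same document once per mode. This is only a
sketch: the file name draft.docx and the output names are hypothetical
placeholders, and the loop just prints what it would have run if pandoc
or the input file is missing.

```shell
# Hypothetical input file; substitute a real Word document with tracked changes.
input="draft.docx"

# Convert once per --track-changes mode so the results can be diffed.
for mode in accept reject all; do
    output="draft-$mode.md"
    if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
        pandoc -f docx -t markdown --track-changes="$mode" "$input" -o "$output"
    else
        # Nothing to convert here; show the command instead of failing.
        echo "skipped: pandoc -f docx -t markdown --track-changes=$mode $input -o $output"
    fi
done
```

Diffing draft-accept.md against draft-reject.md should then show where
the tracked insertions and deletions ended up.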

Anyway, please do give it a try and let me know, through the channels
above, what weirdnesses you encounter.

To get the development pandoc, it's probably best to use a cabal sandbox
(available, I believe, in cabal >= 1.18).

    git clone https://github.com/jgm/pandoc.git
    cd pandoc
    cabal update
    cabal sandbox init
    cabal install --only-dependencies
    cabal install

The binary will then be located in pandoc/.cabal-sandbox/bin.
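
As a rough sketch of trying the sandboxed build on a test document,
assuming the commands are run from the directory that contains the
pandoc clone (sample.docx and sample.md are hypothetical file names):

```shell
# Path to the sandboxed binary, relative to the directory holding the clone.
pandoc_bin="pandoc/.cabal-sandbox/bin/pandoc"

# Hypothetical test document; substitute a real Word file.
doc="sample.docx"

if [ -x "$pandoc_bin" ] && [ -f "$doc" ]; then
    # Confirm which build is being run, then convert the document.
    "$pandoc_bin" --version | head -n 1
    "$pandoc_bin" -f docx -t markdown "$doc" -o sample.md
else
    echo "build pandoc and supply a docx file first"
fi
```

Calling the binary by its full path avoids accidentally testing an older
system-wide pandoc that lacks the docx reader.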

Thanks,
Jesse

