From: russurquhart1
Newsgroups: gmane.text.pandoc
Subject: Re: Please give the Docx reader a test drive
Date: Tue, 17 Feb 2015 10:16:15 -0800 (PST)
Message-ID: <97f48a84-e4f7-4617-95be-69493a3b47fa@googlegroups.com>
References: <871tsmwv2h.fsf@jhu.edu>

Hi,

I have an issue with the latest Docx reader. Is there a way to get your sandbox version for Windows?

Thanks,
Russ

On Monday, August 11, 2014 at 4:52:32 PM UTC-5, Jesse Rosenthal wrote:
Dear All,

The MS Word docx reader in the new pandoc is working pretty well these
days. Before the next release, though, I'd love it if we could run as
many real-world Word docs through it as possible, to catch any odd
behavior. As many different academic/professional fields as possible
would be ideal, since I know everyone uses Word a bit
differently. Everyone testing it so far has brought some oversight to my
attention, so I'd love to get more eyes on it.

If you do try it out, and you find something that doesn't behave
correctly, please open an issue on my pandoc fork
(<https://github.com/jkr/pandoc.git>), and send me the document over
email if it's possible to share it. If you can't share it, it would be
great if you could try to reproduce the issue in a different document.

Some notes:

  - All text, and all text formatting (unless it comes from an unusual
    style) should be preserved. If it isn't, it's a bug.

  - There's not much we can do, with a few exceptions, with ad-hoc
    visual stylization: making columns by pressing space a lot, pressing
    return to make the end-of-the-line a bit prettier. The rule of thumb
    is: can the property in question stand a change in margins and font?
    If so, we should probably be able to interpret it. If not, we
    probably can't.

  - Headers, titles and the like will be interpreted correctly if they
    have the correct style. The reader can't guess at a header just
    because some text is in bold, or uses another font. (Though at some
    point in the future, I might introduce a filter with some heuristics
    for guessing.)

  - Block quotes should be picked up by either styling with Quote or
    BlockQuote, or by block indentation. If someone uses another style
    to produce a blockquote, please let me know, so I can add it to the
    list.

  - Tracked changes can be handled with the
    "--track-changes=accept|reject|all" option. accept keeps the
    insertions and drops the deleted text, reject drops the insertions
    and keeps the deleted text, and all puts in everything, marked up
    with spans.

  - Equations should appear as LaTeX.
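
As a rough illustration of the track-changes note above, the three modes could be compared by converting the same document once per mode. This is only a sketch: the file name draft.docx and the output naming scheme are placeholders that do not come from this thread, and the loop only prints the command when pandoc or the input file is missing.

```shell
# Placeholder input file; substitute a real Word document with tracked changes.
input="draft.docx"

for mode in accept reject all; do
  # Derive an output name such as draft-accept.md from the input name.
  out="${input%.docx}-${mode}.md"
  if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
    pandoc -f docx -t markdown --track-changes="$mode" "$input" -o "$out"
  else
    # Nothing to convert here, so just show what would be run.
    echo "would run: pandoc -f docx -t markdown --track-changes=$mode $input -o $out"
  fi
done
```

Diffing the three output files is a quick way to see which insertions and deletions each mode keeps.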

Anyway, please do give it a try and let me know, through the channels
above, what weirdnesses you encounter.

To get the development pandoc, it's probably best to use a cabal sandbox
(available, I believe, in cabal >= 1.18).

    git clone https://github.com/jgm/pandoc.git
    cd pandoc
    cabal update
    cabal sandbox init
    cabal install --only-dependencies
    cabal install

The binary will then be located in pandoc/.cabal-sandbox/bin.
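
A minimal sketch of trying the freshly built binary, assuming the commands above were run from the current directory so that the clone lives in ./pandoc. The document name sample.docx is a placeholder, not a file mentioned in this thread.

```shell
# Sandbox path as given in the message; relative to where the clone was made.
pandoc_bin="pandoc/.cabal-sandbox/bin/pandoc"

if [ -x "$pandoc_bin" ]; then
  # Confirm which build is being run before testing a document.
  "$pandoc_bin" --version | head -n 1
  # sample.docx is a placeholder document name.
  if [ -f sample.docx ]; then
    "$pandoc_bin" -f docx -t markdown sample.docx -o sample.md
  fi
else
  echo "sandbox binary not found at $pandoc_bin; has 'cabal install' finished?"
fi
```

Calling the binary by its sandbox path avoids accidentally testing an older pandoc that is already on the PATH.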

Thanks,
Jesse
