public inbox archive for pandoc-discuss@googlegroups.com
From: John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org>
To: William Lupton
	<wlupton-QSt+ys/nuMyEUIsrzH9SikB+6BGkLq7r@public.gmane.org>,
	pandoc-discuss
	<pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: MS Word document always differs
Date: Wed, 26 Jan 2022 10:55:21 -0800
Message-ID: <yh480kzgnie38m.fsf@johnmacfarlane.net>
In-Reply-To: <CAEe_xxjoHA0sta+4=eRu4xuYS44e2tpu4_74mmKMjc49e5=fow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>


See the "Reproducible builds" section of the manual:
https://pandoc.org/MANUAL.html#reproducible-builds
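
A minimal sketch of what that section describes: when the SOURCE_DATE_EPOCH environment variable is set, pandoc takes the timestamp it writes into the docx from that variable instead of the current time. The helper names (`reproducible_env`, `pandoc_command`, `build_docx`) and the epoch value are assumptions made up for illustration; the file names are the ones from the original question.

```python
import os
import subprocess

# Arbitrary fixed timestamp in seconds since the Unix epoch (an assumption;
# any constant value should serve, as long as both builds use the same one).
FIXED_EPOCH = 1643223321


def reproducible_env(epoch=FIXED_EPOCH, base=None):
    """Return a copy of the environment with SOURCE_DATE_EPOCH set."""
    env = dict(os.environ if base is None else base)
    env["SOURCE_DATE_EPOCH"] = str(epoch)
    return env


def pandoc_command(output, source="sample.md", reference="sample-reference.docx"):
    """Build the pandoc invocation used in the original question."""
    return [
        "pandoc", "-f", "markdown", "-t", "docx",
        "--reference-doc", reference,
        "-o", output, source,
    ]


def build_docx(output, **kwargs):
    """Run pandoc with the fixed timestamp; requires pandoc on the PATH."""
    subprocess.run(pandoc_command(output, **kwargs),
                   env=reproducible_env(), check=True)
```

Calling `build_docx("expected.docx")` and then `build_docx("sample.docx")` should give two files with the same embedded timestamp, so a byte-wise comparison has a chance of succeeding, provided the timestamp was the only source of difference.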

William Lupton <wlupton-QSt+ys/nuMyEUIsrzH9SikB+6BGkLq7r@public.gmane.org> writes:

> Just commenting on this:
>
>> The two files always differ when I try: diff expected.docx sample.docx. I
>> do not see why they should differ when they were created with the same
>> parameters and source file with only the name changing.
>
> A docx file is a ZIP that contains many resources including a file
> called docProps/core.xml that includes "created" and "modified" properties
> that indicate when the document was created and modified.
>
> Therefore I don't think that you can assume that two docx files with
> identical content are in fact identical.
>
> Perhaps the word/document.xml files (also in the ZIP) will be identical, or
> perhaps it would be better to convert to a different format for comparison
> purposes?
>
> William
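
William's suggestion of comparing word/document.xml rather than the whole archive could be sketched roughly as follows, using only Python's standard `zipfile` module. The function names (`read_member`, `same_body`, `differing_members`) are hypothetical, and whether the document.xml files really match for a given filter is something to check, not something this sketch guarantees.

```python
import zipfile


def read_member(docx_path, member="word/document.xml"):
    """Return the raw bytes of one file stored inside a docx (ZIP) archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read(member)


def same_body(path_a, path_b, member="word/document.xml"):
    """True if the chosen member is byte-identical in both archives."""
    return read_member(path_a, member) == read_member(path_b, member)


def differing_members(path_a, path_b):
    """List the archive members whose contents differ or exist in only one file."""
    with zipfile.ZipFile(path_a) as a, zipfile.ZipFile(path_b) as b:
        contents_a = {name: a.read(name) for name in a.namelist()}
        contents_b = {name: b.read(name) for name in b.namelist()}
    names = sorted(set(contents_a) | set(contents_b))
    return [n for n in names if contents_a.get(n) != contents_b.get(n)]
```

A test could then assert `same_body("expected.docx", "sample.docx")`, and `differing_members` can help show which parts of the archive (for example docProps/core.xml) account for the mismatch that plain diff reports.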
>
> On Wed, 26 Jan 2022 at 09:40, Nandakumar Chandrasekhar <
> navanitachora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Dear Folks,
>>
>> I am in the process of writing a lua-filter that modifies the font of
>> font-awesome icons added to a Word/DOCX document.
>>
>> I am finding that if I generate a file called expected.docx with
>> the command:
>>
>> pandoc -f markdown -t docx --reference-doc sample-reference.docx -o expected.docx sample.md
>>
>> and then generate another docx file with a different name using the same
>> source file and reference docx as below:
>>
>> pandoc -f markdown -t docx --reference-doc sample-reference.docx -o sample.docx sample.md
>>
>> The two files always differ when I try:
>>
>> diff expected.docx sample.docx
>>
>> I do not see why they should differ when they were created with the same
>> parameters and source file with only the name changing.
>>
>> What alternatives do I have to stop the files from differing?
>>
>> I need to write tests for my Lua filter before it can be accepted into
>> pandoc/lua-filters.
>>
>> Therefore, I need to make sure that the expected output and the generated
>> output are exactly the same to pass the test.
>>
>> I hope someone can lend some insight.
>>
>> Many thanks.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "pandoc-discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/pandoc-discuss/a3d1fad4-91f6-4495-aa4c-874f6ca5bb6en%40googlegroups.com
>> <https://groups.google.com/d/msgid/pandoc-discuss/a3d1fad4-91f6-4495-aa4c-874f6ca5bb6en%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>

