ntg-context - mailing list for ConTeXt users
* comparing pdfs
@ 2009-10-21 14:13 luigi scarso
  2009-10-23  8:50 ` luigi scarso
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: luigi scarso @ 2009-10-21 14:13 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Maybe a stupid question.
Consider this (you must have cow.pdf in the same dir):

%%test-1.tex
\setupcolors[state=start]
\setupinteraction[state=start]
\starttext
\startfrontmatter
\completecontent
\stopfrontmatter
\startbodymatter
\chapter{Chapter}
\section{Section}
\input tufte
\externalfigure[cow]
\chapter{Chapter 2}
\section{Section}
\input knuth
\stopbodymatter
\stoptext

With MkII:
$>texexec --pdf test-1.tex
$>cp test-1.pdf test-1a.pdf
$>texexec --pdf test-1.tex
$>pdftoppm test-1a.pdf  test-1a-
$>pdftoppm test-1.pdf  test-1-
$> cmp test-1a.pdf test-1.pdf
test-1a.pdf test-1.pdf differ: byte 35086, line 220
Of course, at least /ID is different.

$>for j in `seq 1 4`; do cmp test-1a--$j.ppm test-1--$j.ppm ; done
and, again of course, there are *no differences*, because the files are equal
page by page, so they are equal.

Is there a quicker and cleaner way than a for loop over ppm files?

--
luigi
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: comparing pdfs
  2009-10-21 14:13 comparing pdfs luigi scarso
@ 2009-10-23  8:50 ` luigi scarso
  2009-10-25  9:16   ` Taco Hoekwater
  2009-10-25 13:21 ` Sanjoy Mahajan
  2009-10-26 20:33 ` Henning Hraban Ramm
  2 siblings, 1 reply; 6+ messages in thread
From: luigi scarso @ 2009-10-23  8:50 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Oct 21, 2009 at 4:13 PM, luigi scarso <luigi.scarso@gmail.com> wrote:
> Is there a way more quick and clean than  for cycle and ppm files?
Any ideas?


-- 
luigi

* Re: comparing pdfs
  2009-10-23  8:50 ` luigi scarso
@ 2009-10-25  9:16   ` Taco Hoekwater
  0 siblings, 0 replies; 6+ messages in thread
From: Taco Hoekwater @ 2009-10-25  9:16 UTC (permalink / raw)
  To: mailing list for ConTeXt users

luigi scarso wrote:
>> $> cmp test-1a.pdf test-1.pdf
>> test-1a.pdf test-1.pdf differ: byte 35086, line 220
>> of course at least /ID is different

You could run

   $ diff -a test-1a.pdf test-1.pdf

instead, because diffs in the binary sections are unlikely
in such cases, and it is (relatively) simple to disregard
changes in /ID and /CreationDate.
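
A minimal sketch of that idea, assuming GNU grep and diff and assuming the
volatile keys sit on lines of their own (a given PDF writer may not guarantee
this). The function names are hypothetical, and /ModDate is an extra key added
here as an assumption; it is not mentioned above.

```shell
#!/bin/bash
# Sketch only: textual PDF comparison that ignores volatile metadata lines.

# Print the file with every line containing /ID, /CreationDate or /ModDate
# dropped. -a makes grep treat the binary PDF as text.
strip_volatile() {
  grep -a -v -e '/ID' -e '/CreationDate' -e '/ModDate' "$1"
}

# Exit status 0 if the two PDFs are identical apart from the dropped lines,
# non-zero otherwise (diff prints the remaining differences).
# The body runs in a subshell, so d and rc stay local to it.
pdf_textdiff() (
  d=$(mktemp -d) || exit 2
  strip_volatile "$1" > "$d/a"
  strip_volatile "$2" > "$d/b"
  diff -a "$d/a" "$d/b"
  rc=$?
  rm -rf "$d"
  exit "$rc"
)

# Usage (file names taken from the example above):
#   pdf_textdiff test-1a.pdf test-1.pdf && echo "same apart from metadata"
```

Note that the pattern '/ID' also drops any other line that happens to contain
that string, so a clean result is a strong hint rather than a proof.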

>> $>for j in `seq 1 4`; do cmp test-1a--$j.ppm test-1--$j.ppm ; done
>> and, again of course, *no differences* because the files are equals page-by-page
>> -- so they are equals
>>
>> Is there a way more quick and clean than  for cycle and ppm files?
> Any idea ?

If the above command shows differences beyond /ID and /CreationDate, then
as far as I know bitmap comparison is the only option left.

Best wishes,
Taco

* Re: comparing pdfs
  2009-10-21 14:13 comparing pdfs luigi scarso
  2009-10-23  8:50 ` luigi scarso
@ 2009-10-25 13:21 ` Sanjoy Mahajan
  2009-10-26 20:33 ` Henning Hraban Ramm
  2 siblings, 0 replies; 6+ messages in thread
From: Sanjoy Mahajan @ 2009-10-25 13:21 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> Is there a way more quick and clean than  for cycle and ppm files?

The following shell script is not quick or clean, but it is thorough and
I use it to find changes from one version of a pdf file to the next.
For example, for my textbook page proofs, after I fix a bad line break,
I compare the new and most recent previous versions to check that the
fix has not created subsequent bad page breaks.

The script runs on GNU/Linux and requires pdftoppm (from xpdf or
poppler) and imagemagick (for the 'compare' utility).

It generates the comparison bitmaps in a /tmp directory.  The output
looks like:

/tmp/tmp.IWHtn0z7hk/diff-1.ppm   4250.32 (0.0648558)
/tmp/tmp.IWHtn0z7hk/diff-2.ppm   3429.2 (0.0523262)
/tmp/tmp.IWHtn0z7hk/diff-3.ppm   2890.33 (0.0441036)
/tmp/tmp.IWHtn0z7hk/diff-4.ppm   1455.9 (0.0222157)

where column 1 is the filename, which tells you which page is being
compared, column 2 is a measure of the difference between the two files
on that page, and column 3 is a normalized measure of column 2.

To view the pages in order of most to least changed:

compare-pdfs a.pdf b.pdf | sort -nr -k2 | awk '{print $1}' | xargs feh -FV

I put this command in the Makefile for a project.
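
For a check that needs no viewer, the same output can be filtered on the
normalized column. The following is a sketch only: the function name
changed_pages and the default threshold of 0.01 are assumptions, not part of
the script below.

```shell
#!/bin/bash
# Sketch only: read compare-pdfs output on stdin and print the bitmap file
# names whose normalized difference (column 3, in parentheses) exceeds the
# threshold given as $1 (hypothetical default: 0.01).
changed_pages() {
  awk -v t="${1:-0.01}" '{
    v = $3
    gsub(/[()]/, "", v)            # "(0.0648558)" -> "0.0648558"
    if (v + 0 > t + 0) print $1    # non-numeric fields count as 0
  }'
}

# Usage:
#   compare-pdfs a.pdf b.pdf | changed_pages 0.05
```

Lines of the form "N: missing from file.pdf" have no numeric third column, so
they are silently skipped by this filter.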

-Sanjoy

#! /bin/bash

# Usage: $0 file1.pdf file2.pdf
#   compares file1.pdf and file2.pdf by converting each page to bitmaps using
#   pdftoppm and then using the 'compare' ImageMagick utility
#
# Copyright 2007-2009 Sanjoy Mahajan.  Licensed under the GNU GPL version 2
# or (at your option) any later version.
#
# HISTORY
#   2009-09-30: Fix capture of dB output; don't use a viewer; use pdftoppm
#   2007-01-15: First version
#

dpi=144

if [ -z "$1" ] || [ -z "$2" ]; then
  echo "Usage: $0 file1.pdf file2.pdf"
  exit 3
fi

# generate the many page images in a temporary directory
# (variables are quoted so that file names with spaces work)
d=$(mktemp -d)
pdftoppm -r "$dpi" "$1" "$d/one" &
pdftoppm -r "$dpi" "$2" "$d/two" &
wait

# find the union of the page numbers (in case one pdf has more pages)
pages=$(ls "$d"/{one,two}-*.ppm | sed 's/.*-\([0-9][0-9]*\)\.ppm/\1/' | sort -un)

# compare each page
for p in $pages ; do
  if ! [ -e "$d/one-$p.ppm" ] ; then
    echo "$p: missing from $1"
    continue
  fi
  if ! [ -e "$d/two-$p.ppm" ] ; then
    echo "$p: missing from $2"
    continue
  fi
  # print the name of the difference image, then the metric from 'compare'
  printf '%s   ' "$d/diff-$p.ppm"
  compare -metric mae "$d"/{one,two}-"$p".ppm "$d/diff-$p.ppm" 2>&1
done

* Re: comparing pdfs
  2009-10-21 14:13 comparing pdfs luigi scarso
  2009-10-23  8:50 ` luigi scarso
  2009-10-25 13:21 ` Sanjoy Mahajan
@ 2009-10-26 20:33 ` Henning Hraban Ramm
  2009-10-26 23:16   ` luigi scarso
  2 siblings, 1 reply; 6+ messages in thread
From: Henning Hraban Ramm @ 2009-10-26 20:33 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On 2009-10-21 at 16:13, luigi scarso wrote:

> Is there a way more quick and clean than  for cycle and ppm files?


If you happen to own an Acrobat Pro license - it has a nice visual  
compare feature.


Greetlings from Lake Constance!
Hraban
---
http://www.fiee.net/texnique/
http://wiki.contextgarden.net
https://www.cacert.org (I'm an assurer)


* Re: comparing pdfs
  2009-10-26 20:33 ` Henning Hraban Ramm
@ 2009-10-26 23:16   ` luigi scarso
  0 siblings, 0 replies; 6+ messages in thread
From: luigi scarso @ 2009-10-26 23:16 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Mon, Oct 26, 2009 at 10:33 PM, Henning Hraban Ramm <hraban@fiee.net> wrote:
> On 2009-10-21 at 16:13, luigi scarso wrote:
>
>> Is there a way more quick and clean than  for cycle and ppm files?
>
>
> If you happen to own an Acrobat Pro license - it has a nice visual compare
> feature.

I can relax the comparison to just verifying whether they differ, not where,
so maybe pdftoppm and sha512sum are sufficient.
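
A minimal sketch of what that could look like, assuming pdftoppm (xpdf or
poppler) and GNU sha512sum are installed. The function names, the page- file
prefix, and the 72 dpi resolution are hypothetical choices for illustration.

```shell
#!/bin/bash
# Sketch only: decide whether two PDFs render identically, without saying where.

# Print one SHA-512 digest covering every rendered page in directory $1.
pages_digest() {
  cat "$1"/page-*.ppm | sha512sum | cut -d' ' -f1
}

# Exit status 0 if both PDFs render to identical bitmaps, 1 if they differ,
# 2 if rendering failed. The body runs in a subshell, so d and rc stay local.
same_render() (
  d=$(mktemp -d) || exit 2
  mkdir "$d/a" "$d/b"
  if ! pdftoppm -r 72 "$1" "$d/a/page" || ! pdftoppm -r 72 "$2" "$d/b/page"; then
    rm -rf "$d"
    exit 2
  fi
  [ "$(pages_digest "$d/a")" = "$(pages_digest "$d/b")" ]
  rc=$?
  rm -rf "$d"
  exit "$rc"
)

# Usage (file names taken from the first message):
#   same_render test-1a.pdf test-1.pdf && echo "pages are identical"
```

Hashing the concatenated pages gives a single yes/no answer; finding the page
that changed still needs a per-page comparison.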

-- 
luigi