ntg-context - mailing list for ConTeXt users
From: Henri Menke <henrimenke@gmail.com>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: PDF/A generation
Date: Thu, 13 Oct 2016 15:40:04 +0200
Message-ID: <3cbb4476-9ac5-0653-4d9c-7c84abbddb84@gmail.com>
In-Reply-To: <CAG5iGsAJpZqEx7=5w=YOE-Z4ub-SacMWUnAr8BOCMZVw-MCZDQ@mail.gmail.com>


Dear Luigi,

thanks for the link.  I ran the validation on a slightly simplified example, shown below.  VeraPDF reports four failed rules; see the attached `test-result.xml`.  Unfortunately, VeraPDF cannot handle my production document and crashes during validation :( (see the attached `production-result.xml`).  Also, leaving out `intent` does not seem to make any difference.

Cheers, Henri

---

\setupinteraction
  [
    title=TITLE,
    subtitle=SUBTITLE,
    author=AUTHOR,
    keyword={KEYWORD1, KEYWORD2, KEYWORD3},
  ]

\setupbackend
  [
    format={pdf/a-2a},
    profile={default_cmyk.icc,default_rgb.icc,default_gray.icc},
  ]
\setupstructure[state=start,method=auto]

\starttext

\startchapter[title=Testing]
  \input knuth
\stopchapter

\stoptext

On 10/13/2016 03:13 PM, luigi scarso wrote:
> On Thu, Oct 13, 2016 at 3:03 PM, Henri Menke <henrimenke@gmail.com> wrote:
>> Dear list, (especially Luigi)
>>
>> for online publication I need to create a PDF/A compliant output file.  Does anyone have any experience with it and can tell me whether my setup will work?  So far I'm using
>>
>> \setupbackend
>>   [
>>     format={pdf/a-2a},
>>     profile={default_cmyk.icc,default_rgb.icc,default_gray.icc},
>>   ]
>> \setupstructure[state=start,method=auto]
>>
>> I chose PDF/A-2a because it allows PDF 1.7, which keeps the file size down, but I could also switch to PDF/A-1a.  I have *no* external pixel graphics, just included PDFs which are also produced by ConTeXt with the same setup.
>>
>> Online I found Luigi's paper on PDF/A-1a [1].  However, even after reading it I'm unsure whether `intent` is optional or required.
>>
>> Since I don't own Adobe Acrobat (nor am I using Windows) I cannot verify the resulting output.  Does anyone know any working free or open-source tools for GNU/Linux to do this task?
> Have a look at
> http://verapdf.org/software/
> and test the file below with
> $> verapdf -v -x -f 1a test.pdf
> It should be ok
> 
> The ICC files default_cmyk.icc, default_gray.icc and default_rgb.icc are
> from ghostscript; put them in the same directory as the test file.
> 
> 
> \nopdfcompression
> \setupinteraction
>   [title=TITLE,
>    subtitle=SUBTITLE,
>    author=AUTHOR,
>    keyword={{KEYWORD1, KEYWORD2}, KEYWORD3}]
> 
> %% For PDF/A
> \setupbackend[
> format={pdf/a-1a:2005}, % or pdf/a-1a:2005
> profile={default_cmyk.icc,default_rgb.icc,default_gray.icc},
> intent=ISO coated v2 300\letterpercent\space (ECI)]
> 
> %% Tagged PDF
> %% method=auto ==> default tags by Adobe
> \setupbackend[export=yes]
> \setupstructure[state=start,method=auto]
> 
> \starttext
> 
> \startchapter[title=Testing]
> \startcolor[red]
> \input knuth
> \stopcolor
> \input tufte
> 
> \input knuth
> 
> \placefigure[middle][fig:foo]
>   {This is an image}
>   {\externalfigure[cow.jpg]}
> 
> \input tufte
> 
> \stopchapter
> 
> \stoptext
> 
> I'm in the middle of something else right now; I will look into it in the
> next few days, but you can play with it a bit and report problems.
> 


[-- Attachment #2: production-result.xml --]
[-- Type: text/xml, Size: 555 bytes --]

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<report xmlns="http://www.verapdf.org/MachineReadableReport" creationDate="2016-10-13T15:35:56.555+02:00" processingTime="00:00:01.537" version="0.24.2" buildDate="2016-10-11T12:21:00+02:00">
    <itemDetails size="13139386">
        <name>/home/user/document.pdf</name>
    </itemDetails>
    <validationReport>
        <statement>Could not finish validation. org.verapdf.core.ValidationException: Caught unexpected runtime exception during validation</statement>
    </validationReport>
</report>

[-- Attachment #3: test-result.xml --]
[-- Type: text/xml, Size: 4027 bytes --]

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<report xmlns="http://www.verapdf.org/MachineReadableReport" creationDate="2016-10-13T15:35:18.537+02:00" processingTime="00:00:01.114" version="0.24.2" buildDate="2016-10-11T12:21:00+02:00">
    <itemDetails size="207910">
        <name>/home/user/test.pdf</name>
    </itemDetails>
    <validationReport profile="PDF/A-2A validation profile" compliant="false">
        <statement>PDF file is not compliant with Validation Profile requirements.</statement>
        <details passedRules="119" failedRules="4" passedChecks="1685" failedChecks="6">
            <rule specification="ISO 19005-2:2011" clause="6.7.2" testNumber="1" status="failed" passedChecks="0" failedChecks="1">
                <description>The document catalog dictionary shall include a MarkInfo dictionary containing an entry, Marked, whose value shall be true.</description>
                <object>CosDocument</object>
                <test>Marked == true</test>
                <check status="failed">
                    <context>root</context>
                </check>
            </rule>
            <rule specification="ISO 19005-2:2011" clause="6.6.2.3" testNumber="7" status="failed" passedChecks="19" failedChecks="2">
                <description>All properties specified in XMP form shall use either the predefined schemas defined in the XMP Specification,
			ISO 19005-1 or this part of ISO 19005, or any extension schemas that comply with 6.6.2.3.2.</description>
                <object>XMPProperty</object>
                <test>(isPredefinedInXMP2005 == true || isDefinedInMainPackage == true || isDefinedInCurrentPackage == true) &amp;&amp; isValueTypeCorrect == true</test>
                <check status="failed">
                    <context>root/document[0]/metadata[0](20 0 obj PDMetadata)/XMPPackage[0]/Properties[18](http://ns.adobe.com/xap/1.0/mm/ - xmpMM:InstanceID)</context>
                </check>
                <check status="failed">
                    <context>root/document[0]/metadata[0](20 0 obj PDMetadata)/XMPPackage[0]/Properties[17](http://ns.adobe.com/xap/1.0/mm/ - xmpMM:DocumentID)</context>
                </check>
            </rule>
            <rule specification="ISO 19005-2:2011" clause="6.2.11.4" testNumber="4" status="failed" passedChecks="0" failedChecks="2">
                <description>If the FontDescriptor dictionary of an embedded CID font contains a CIDSet stream, then it shall identify all CIDs which are present in the font program,
			regardless of whether a CID in the font is referenced or used by the PDF or not.</description>
                <object>PDCIDFont</object>
                <test>fontFile_size == 0 || CIDSet_size == 0 || cidSetListsAllGlyphs == true</test>
                <check status="failed">
                    <context>root/document[0]/pages[0](15 0 obj PDPage)/contentStream[0](16 0 obj PDContentStream)/operators[10]/font[0](XZVPND+LMRoman10-Regular)/DescendantFonts[0](XZVPND+LMRoman10-Regular)</context>
                </check>
                <check status="failed">
                    <context>root/document[0]/pages[0](15 0 obj PDPage)/contentStream[0](16 0 obj PDContentStream)/operators[35]/font[0](HJUCGD+LMRoman12-Regular)/DescendantFonts[0](HJUCGD+LMRoman12-Regular)</context>
                </check>
            </rule>
            <rule specification="ISO 19005-2:2011" clause="6.7.3" testNumber="1" status="failed" passedChecks="0" failedChecks="1">
                <description>The logical structure of the conforming file shall be described by a structure hierarchy rooted in the StructTreeRoot entry 
			of the document's Catalog dictionary, as described in ISO 32000-1:2008, 14.7.</description>
                <object>PDDocument</object>
                <test>StructTreeRoot_size == 1</test>
                <check status="failed">
                    <context>root/document[0]</context>
                </check>
            </rule>
        </details>
    </validationReport>
</report>
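
The failed rules in a report like the one above can also be pulled out programmatically. The following is only a minimal sketch in Python, assuming the report uses the `http://www.verapdf.org/MachineReadableReport` namespace and the element layout shown in the two attachments (veraPDF 0.24.2); other veraPDF versions may lay the report out differently. The function name `summarize_report` is hypothetical and not part of veraPDF.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the root element of the attached reports.
NS = {"v": "http://www.verapdf.org/MachineReadableReport"}


def summarize_report(xml_text):
    """Return (compliant, statement, failed_rules) for one report.

    compliant is True or False, or None when the validationReport
    element has no `compliant` attribute (as in the crashed run).
    failed_rules is a list of (clause, testNumber, failedChecks).
    """
    root = ET.fromstring(xml_text)
    report = root.find("v:validationReport", NS)
    if report is None:
        raise ValueError("no validationReport element found")

    flag = report.get("compliant")
    compliant = None if flag is None else flag == "true"
    statement = report.findtext("v:statement", default="", namespaces=NS)

    # Keep only the rules that the validator marked as failed.
    failed_rules = [
        (rule.get("clause"),
         int(rule.get("testNumber")),
         int(rule.get("failedChecks")))
        for rule in report.iterfind("v:details/v:rule", NS)
        if rule.get("status") == "failed"
    ]
    return compliant, statement, failed_rules
```

Fed the contents of `test-result.xml`, this should report the document as non-compliant and list clauses 6.7.2, 6.6.2.3, 6.2.11.4 and 6.7.3. For `production-result.xml` it returns `None` for compliance and an empty rule list, with the "Could not finish validation" statement left for the caller to inspect.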



Thread overview: 13+ messages
2016-10-13 13:03 Henri Menke
2016-10-13 13:13 ` luigi scarso
2016-10-13 13:40   ` Henri Menke [this message]
2016-10-13 14:23     ` luigi scarso
2016-10-13 14:58       ` Peter Rolf
2016-10-14  9:22         ` luigi scarso
2016-10-14 12:04           ` Peter Rolf
2016-10-14 12:16             ` luigi scarso
2016-10-14 15:10               ` Peter Rolf
2016-10-18 12:54                 ` Peter Rolf
2016-10-18 13:44                   ` Henri Menke
2016-10-18 14:24                     ` Peter Rolf
2016-10-19 14:23                     ` Alan Braslau
