Recording and slides from veraPDF webinar published

The PREFORMA project ran a webinar series throughout September. The final webinar focused on veraPDF.

Overview

The PREFORMA project’s prototyping phase finishes at the end of 2016. The veraPDF consortium will be producing a v1.0 release candidate of the software library and applications in December 2016.

This webinar demonstrates the current state of veraPDF development and presents the consortium’s plans for 2017. Webinar attendees are shown:

  • the consortium’s current development priorities, i.e. what we’re working on;
  • the veraPDF software development plan for the 2016 v1.0 release;
  • how they can help to improve the veraPDF software through testing;
  • a demonstration of the software used in some simple but typical testing scenarios; and
  • an outline of the consortium’s plans for 2017.

The presentations are aimed at non-technical veraPDF users and will benefit anyone wishing to make veraPDF validation part of their digital preservation processes.

Recording and Slides

Recording
PREFORMA introduction slides
veraPDF slides

veraPDF 0.22 released

The latest version of veraPDF is now available to download. Version 0.22 has the following improvements:

Application enhancements:

  • changed default feature generation to document level features;
  • added a new GUI dialog for managing feature generation options;
  • added a user-friendly Java OutOfMemoryError with suggestions for reconfiguration;
  • CLI can now overwrite report files;
  • added help message when CLI processes STDIN stream; and
  • synchronized the Web demo validation report with the CLI and GUI report styles.

Conformance checker fixes:

  • removed the rules for validating file provenance information;
  • fixed an issue with structure type mapping in Level A validation; and
  • implemented resource caching for memory optimization.

Test corpus:

  • converted all ‘fail’ test cases on file provenance information to ‘pass’ tests.

Download veraPDF 0.22:

http://downloads.verapdf.org/rel/verapdf-installer.zip

Release notes:

https://github.com/veraPDF/veraPDF-library/releases/latest

Call for testing

veraPDF is building an open source, industry-supported PDF/A validator. Please support our efforts by downloading and testing the software. If you encounter problems, or wish to suggest improvements, please add them to the project’s GitHub issue tracker (https://github.com/veraPDF/veraPDF-library/issues). Your feedback is very important: it helps to improve the software.

Keep up to date with the latest developments of veraPDF by subscribing to the veraPDF consortium’s newsletter (http://verapdf.org/subscribe/).

Upcoming events

PREFORMA webinar series, September 2016

The PREFORMA project is running a webinar series throughout September. Each webinar will focus on one of the three open source conformance checkers: for TIFF; for Matroska, FFV1 and PCM; and for PDF/A. The webinars will:

  • introduce the PREFORMA project
  • update participants on the current status of the conformance checkers
  • demonstrate the software
  • outline future plans
  • give examples of how the community can contribute, or provide feedback

An overview of each webinar is available on the PREFORMA website at: http://www.digitalmeetsculture.net/article/preforma-webinars-dpf-manager-mediaconch-verapdf/.

Schedule:

Thursday 8 September 15:00 CET: DPF Manager
Thursday 15 September 15:00 CET: MediaConch
Thursday 22 September 15:00 CET: veraPDF

Registration:

There are 50 places available for each webinar, allocated on a first come, first served basis. To register for the webinars visit: http://preforma-webinars.eventbrite.co.uk/.

The webinars will last approximately one hour and will be recorded for those who cannot attend live.

veraPDF 0.20 released

The latest version of veraPDF is now available. Version 0.20 has the following enhancements:

Application enhancements:

  • added signature types to features report;
  • depth of feature reporting now configurable; and
  • altered log level of some validation methods.

Conformance checker fixes:

  • fix for validation of character encoding requirements of invisible fonts; and
  • fix for ICC Profile mluc tag.

Test corpus:

  • 34 new test files for PDF/A-2b.

Download veraPDF 0.20:

http://downloads.verapdf.org/rel/verapdf-installer.zip

Release notes:

https://github.com/veraPDF/veraPDF-library/releases/latest

Call for testing

veraPDF is building an open source, industry-approved PDF/A validator. Please support our efforts by downloading and testing the software. If you encounter problems, or wish to suggest improvements, please add them to the project’s GitHub issue tracker. Your feedback is very important: it helps to improve the software.

Keep up to date with the latest developments of veraPDF by subscribing to the veraPDF consortium’s newsletter.

veraPDF will be demonstrating the software at the PREFORMA Experience workshop on 23 November in Berlin. Find out more at: http://experienceworkshop.preforma-project.eu/.

PREFORMA Experience Workshop – Improving long-term digital preservation

Following the successful Open Source Workshop, organised in Stockholm in April this year, the PREFORMA project invites all the members of the digital preservation community to attend the Experience Workshop – Improving long-term digital preservation, which will be held in Berlin on November 23, 2016.

The aim of the workshop is to demonstrate the use of the conformance checkers for file-formats developed in the project, involve memory institutions outside the PREFORMA consortium in testing, using and further developing the software, and share the experience gained by PREFORMA memory institutions working with developers under R&D service agreements.

Hosted at the Kulturforum in Berlin, the event will feature: keynote presentations by representatives from the European Commission on the opportunities offered by the Pre-Commercial Procurement instrument; talks by international experts in digital preservation on the importance of checking the conformance of digital files against the standard specifications; live demonstrations of the software developed by the three suppliers working in the project (veraPDF, Easy Innova, MediaArea); and an informal networking event where attendees can share experiences, meet the PREFORMA developers and learn about the tools.

This event is aimed at anyone interested in digital preservation and cultural heritage: memory institutions or other cultural heritage organisations involved in (or planning) digital preservation initiatives; developers who want to contribute code to the PREFORMA tools; standardisation bodies maintaining the technical specifications of preservation file formats; any other person interested in cooperating with us in defining open digital preservation standards.

The workshop will be co-located with the Europeana Space final conference, the third edition of the Networking Session for EC projects in the cultural heritage field and the meeting of the German Working Group on PCP and PPI (Pre-Commercial Procurement and Public Procurement of Innovative solutions).

Register before 16 November 2016

http://experienceworkshop.preforma-project.eu/registration/
Participation in the event is free of charge.

Event website

http://experienceworkshop.preforma-project.eu/

The event will be held in English.

If you have any questions or need additional information, please contact Claudio Prandoni (prandoni@promoter.it).

veraPDF 0.18 released

The latest version of veraPDF is now available to download. It is accompanied by a beta version of our new documentation site. The site provides installation and user guides to help you get started with the veraPDF conformance checker. The beta site is online here: http://docs.verapdf.org

veraPDF 0.18 has the following fixes and enhancements:

Application enhancements:

  • suppress all PDFBox warnings in the CLI when parsing PDF documents
  • generate error report instead of the exception in case of broken PDF documents
  • added a new CLI option to save XML report to a separate file in recursive PDF processing

veraPDF characterisation plugins:

  • enhancements to example pure java plugins
  • plugins now configurable through a dedicated config file

Conformance checker fixes:

  • ignore DeviceGray color space in soft mask images
  • treat glyph with GID 0 as “.notdef” in case of Type0 fonts
  • fixed validation of role map for non-standard structure elements (Level A)
  • fixed validation of page size implementation limits in case of negative width or height
  • fixed validation of non-standard embedded CMaps referenced from other CMaps

Test corpus:

  • added 180 new test files for parts 2 and 3

Infrastructure:

  • test coverage now monitored by Codecov online service
  • integration tests for 2u and 3b validation profiles added
  • using codacy and coverity online code QA services

Download veraPDF 0.18:

http://downloads.verapdf.org/rel/verapdf-installer.zip

Release notes:

https://github.com/veraPDF/veraPDF-library/releases/latest

veraPDF is building an open source, industry-approved PDF/A validator. Please support our efforts by downloading and testing the software. If you encounter problems, or wish to suggest improvements, please add them to the project’s GitHub issue tracker. Your feedback is very important: it helps to improve the software.

Keep up to date with the latest developments of veraPDF by subscribing to the veraPDF consortium’s newsletter.

Webinar: Pre Commercial Procurement for the long-term Preservation of Digital Cultural Heritage, 14 June

Programme:

  1. Background and context, Börje Justrell (10 mins)
  2. The PCP/PPI instrument and how it is implemented in PREFORMA, Antonella Fresa (10 mins)
  3. The PREFORMA Challenge, Bert Lemmens (10 mins)
  4. How to contribute and next appointments, Claudio Prandoni (10 mins)
  5. Q&A

Outline:

Pre-Commercial Procurement (PCP) is a competition-like method designed to steer the development of innovative solutions towards concrete public sector needs. These solutions are developed by external suppliers that are awarded a contract through a phased open procurement process. In recent years the PCP instrument has become increasingly popular within the public sector, and the European Union has increased its support for groups of public procurers working together on joint PCPs under Horizon 2020.

PREFORMA is a PCP project co-funded by the European Commission under its FP7-ICT Programme to work on one of the main challenges that memory institutions face today: the long-term preservation of digital data. In particular, the project offers memory institutions an open source conformance checker that checks whether a file complies with the standard specifications and with the acceptance criteria of the institutions, thus giving them full control of the process of conformity testing of files to be created, migrated and ingested into archives. This software development is carried out in a collaborative environment with memory institutions and experts. The aim of the webinar is to present the first results of the project and to invite the wider digital preservation community – open source community, developers, standardization bodies and memory institutions – to participate in this process.

For more information about the PREFORMA project visit: http://www.preforma-project.eu/

Time:

13:30 BST / 14:30 CET. The webinar will last approximately one hour.

Register:

http://opfwebinarpcpdigitalheritage.eventbrite.co.uk

veraPDF 0.16 released with full support for all PDF/A parts and conformance levels

The latest version of veraPDF features full support of all PDF/A-2 and PDF/A-3 requirements (all levels). Together with earlier support of PDF/A-1 validation, it represents the first full support for all PDF/A parts and conformance levels.

Features:

  • Conformance checker
    • validation of digital signature requirements
    • extraction of color space info from JPEG2000 images
    • validation of permissions dictionary
    • PDF/A-2B fix: correct implementation of CIDSystemInfo entry requirements
    • command line support for plugin execution to extend feature extraction
  • veraPDF characterisation plugins
    • first set of example pure java plugins available
    • optional sample plugin pack available through installer

Test corpus:

  • 112 new atomic test files for parts 2 and 3

Infrastructure:

Download veraPDF 0.16:

http://downloads.verapdf.org/rel/verapdf-installer.zip

Release notes:

https://github.com/veraPDF/veraPDF-library/releases/latest  

veraPDF is building an industry-supported, open source PDF/A validator. The project benefits from a high level of development resource and PDF/A expertise. Please support our efforts by downloading and testing the software. If you encounter problems, or wish to suggest improvements, please add them to the project’s GitHub issue tracker. You can expect a speedy response. Your feedback is very important: it helps to improve the software.

Keep up to date with the latest developments of veraPDF by subscribing to the veraPDF consortium’s newsletter.

Update on the Resolution of Ambiguities

In December of last year we reported the development of the PDF Validation TWG’s Resolution of Ambiguities document, with an additional 10 questions added to the 4 previously presented to the ISO committee and resolved in April, 2015 during the meetings in San Jose, California.

Since last November the veraPDF contractor has raised several more ambiguities with the PDF Validation TWG for resolution, and the TWG has addressed them, bringing the total number of ambiguities raised to 24 for all parts of PDF/A.

Since many of these questions pertain to PDF/A-next in addition to previous Parts of ISO 19005, the 10 new questions generated by the TWG between the two ISO meetings were submitted into the formal ISO process for reviewing comments against draft specifications. The ISO WG then duly considered the Resolution of Ambiguities document during its meetings in Ghent, Belgium in May, 2016.

These new questions proved somewhat more contentious than many of the questions formerly raised. To provide a flavor of the issues addressed, the most recent set of ambiguities is summarized here:

veraPDF-A015 discusses the interpretation of Corrigendum 2 to ISO 19005-1, which contains a special clause excluding resources that are not referenced from the corresponding content stream from further requirements.

At Adobe’s request, this item was parked by the WG for further study, to be resolved at the ISO meetings in Sydney, in November, 2016.

veraPDF-A016 remains a sore spot. The keys in question are deprecated in ISO 32000-2, and thus do not affect PDF/A-next. However, the requirement remains for PDF/A-2 and PDF/A-3; it will be left to an industry Application Note to provide a universal reference for relaxing these unnecessary and problematic requirements for CharSet and CIDSet entries.

veraPDF-A017 sought to clarify that XMP metadata streams in PDF/A-1 must be uncompressed. The TWG’s interpretation was accepted, and the WG added an additional clarification: that XMP packages don’t need to conform to XMP or even XML.
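As a rough illustration of the uncompressed-metadata point in veraPDF-A017, the sketch below models a PDF stream dictionary as a plain Python dict and treats the presence of a Filter key as the sign that the stream is compressed or otherwise encoded. The function name and the dict representation are assumptions made for illustration only; this is not veraPDF code and it does not parse real PDF files.

```python
def metadata_stream_is_unfiltered(stream_dict: dict) -> bool:
    """Return True when the (hypothetical) stream dictionary declares no Filter.

    Assumption: a metadata stream counts as "uncompressed" when its stream
    dictionary carries no Filter key at all.
    """
    return "Filter" not in stream_dict


# Hypothetical stream dictionaries, written as plain dicts.
plain_xmp = {"Type": "Metadata", "Subtype": "XML", "Length": 1024}
deflated_xmp = {"Type": "Metadata", "Subtype": "XML", "Length": 512,
                "Filter": "FlateDecode"}

print(metadata_stream_is_unfiltered(plain_xmp))     # True
print(metadata_stream_is_unfiltered(deflated_xmp))  # False
```

A real validator would apply this test to the parsed metadata stream object rather than to a hand-written dict.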

veraPDF-A018 refers to an ambiguity over whether the requirement pertains to the file format or to a means of comparing real values. The WG decided that non-zero values smaller than the minimum are intentionally disallowed in PDF/A-2 (and PDF/A-3).

veraPDF-A019 discusses the problem that clause 6.1.13 in ISO 19005-2 copies the list of limits from ISO 32000-1 and lists them explicitly. However, the word “approximately” was dropped, so the definition of the limits differs between ISO 32000-1 and PDF/A-2, creating an untenable situation for processors encountering files that (may) exceed these limits. The WG elected to leave the matter as-is because, although it differs from the base specification for PDF, the actual requirement for PDF/A-2 is not itself ambiguous.

veraPDF-A020 concerns the “shall” requirement in all three parts of PDF/A to comply with either predefined schemas from the XMP specifications or with an extension schema. The WG accepted the PDF Validation TWG’s recommendation for PDF/A-next.

veraPDF-A021 questions the value and practicality of the requirement in PDF/A-2 and PDF/A-3 to record user actions in the xmpMM:History property. The WG accepted the PDF Validation TWG’s recommendation for PDF/A-next but highlighted that the parameters field is still required in xmpMM:History for conformance with PDF/A-2 and PDF/A-3.

veraPDF-A021a (there was a numbering error, to be corrected in a subsequent Resolutions document) points out that in PDF/A-1 it’s not clear whether every Widget annotation is required to have an appearance dictionary. The WG agreed with the TWG’s interpretation that for PDF/A-1, every button field widget shall have an appearance stream or dictionary.

veraPDF-A022 affects all parts of PDF/A. The requirement for multiple appearance streams misses the case in which a form field (such as a radio button) has multiple widgets associated with it and defined in its /Kids array. The TWG proposed to PASS an otherwise valid PDF/A document if it contains a Widget annotation dictionary with a Parent key referring to a parent form field of type Button, and if the value of the N key in this widget annotation dictionary refers to an appearance subdictionary. The WG agreed.
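The condition proposed for veraPDF-A022 can be sketched roughly as follows. This is a minimal illustration, not veraPDF's implementation: PDF dictionaries are modelled as plain Python dicts, the `Stream` class and the function name are hypothetical, the button type is assumed to appear as `FT` equal to `Btn` on the parent field, and the N entry is assumed to sit inside the widget's `AP` (appearance) dictionary.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Hypothetical stand-in for a PDF appearance stream."""
    data: bytes = b""


def matches_a022_case(widget: dict) -> bool:
    """Check whether a widget fits the case described in veraPDF-A022.

    Assumed reading of the rule: the widget has a Parent that is a form
    field of type Button, and its normal appearance (N) is a subdictionary
    mapping state names to streams rather than a single stream.
    """
    parent = widget.get("Parent")
    if not isinstance(parent, dict) or parent.get("FT") != "Btn":
        return False
    appearance = widget.get("AP")
    if not isinstance(appearance, dict):
        return False
    normal = appearance.get("N")
    # An appearance subdictionary is modelled as a non-empty dict of streams.
    return (
        isinstance(normal, dict)
        and len(normal) > 0
        and all(isinstance(value, Stream) for value in normal.values())
    )


# A radio button field with one child widget listed in its Kids array.
radio_field = {"FT": "Btn", "Kids": []}
radio_widget = {
    "Subtype": "Widget",
    "Parent": radio_field,
    "AP": {"N": {"Off": Stream(), "Yes": Stream()}},
}
radio_field["Kids"].append(radio_widget)

print(matches_a022_case(radio_widget))  # True
```

A widget whose N entry is a single stream, or whose parent is not a button field, falls outside this case and would be judged by the ordinary appearance requirements instead.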

veraPDF-A023 pointed out that some wording pertaining to ICC color spaces was imprecise, and proposed specific replacement text. The WG accepted this interpretation, and the PDF/A-next Project Leader agreed to make this change in the text of PDF/A-next.

Following the ISO meetings in Ghent the PDF Validation TWG will continue its review and test-suite development for PDF/A-2 and PDF/A-3, with its final questions to be put before the ISO WG during the November, 2016 meetings in Sydney, Australia.

The PDF Association is currently considering publication of the final Resolution of Ambiguities document as a formal PDF Association Application Note for PDF/A.