Designed to meet the needs of digital preservationists, and supported by leading members of the PDF software developer community, veraPDF is a purpose-built, open source, permissively licensed file-format validator covering all PDF/A parts and conformance levels, as well as PDF/UA-1 syntax checks. Learn more about what veraPDF is doing, and meet the team.
The veraPDF consortium
Led by the Open Preservation Foundation (OPF) and the PDF Association, the Consortium’s mission is to develop an industry-supported, open-source validator for PDF/A, and to build a community to maintain the project in the long term.
Open Preservation Foundation
The Open Preservation Foundation (OPF) sustains technology and knowledge for the long-term management of digital cultural heritage, in all its forms. It provides its members with reliable solutions to the challenges of digital preservation through technology stewardship, knowledge exchange, and advocacy and alliances. OPF currently stewards the leading portfolio of open-source digital preservation software.
PDF Association
The PDF Association promotes the adoption and implementation of International Standards for PDF technology. It is geared towards developers of PDF solutions; companies that work with PDF in providing output, document management, enterprise content management (ECM), and related applications; interested individuals; and users who want to implement PDF technology in their organizations.
Digital Preservation Coalition
The Digital Preservation Coalition (DPC) is an advocate and catalyst for digital preservation, enabling its members to deliver resilient long-term access to content and services, and helping them derive enduring value from digital collections. DPC raises awareness of the importance of the preservation of digital material and the attendant strategic, cultural and technological issues.
Funded by the European Commission’s PREFORMA Project
veraPDF is funded by the PREFORMA project. PREFORMA (PREservation FORMAts for culture information/e-archives) is a Pre-Commercial Procurement (PCP) project co-funded by the European Commission under its FP7-ICT Programme. The project’s main aim is to address the challenge of implementing standardised file formats for preserving digital objects in the long term, giving memory institutions full control over the acceptance and management of preservation files into digital repositories.
The specifications for PDF/A and PDF/UA are sets of restrictions and requirements applied to the “base” PDF standards (PDF 1.4 for PDF/A-1; ISO 32000-1 for PDF/A-2, PDF/A-3, and PDF/UA-1; and ISO 32000-2 for PDF/A-4), plus a specific set of third-party standards. The veraPDF subsystems include:
veraPDF Implementation Checker
The Implementation Checker parses and analyzes PDF documents. It outputs two types of report: a report describing the PDF document and its metadata, and a Validation Report describing conformance to PDF/A and PDF/UA flavours.
veraPDF Metadata Fixer
The Metadata Fixer makes a limited set of fixes to metadata within PDF documents, such as removing the PDF/A flag from a non-conforming document, or repairing broken XMP metadata when bad XMP is the only error preventing a legitimate PDF/A flag. The Metadata Fixer produces a fixed version of the original document and a Metadata Fixing Report, which describes the fixes attempted and their success or failure.
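The decision rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in the text, not veraPDF's implementation: the `ValidationOutcome` type, its fields, and the `choose_fix` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum


class FixAction(Enum):
    NONE = "no fix needed"
    REPAIR_XMP = "repair broken XMP metadata"
    REMOVE_PDFA_FLAG = "remove PDF/A flag"


@dataclass
class ValidationOutcome:
    """Hypothetical summary of a validation run (not a veraPDF type)."""
    claims_pdfa: bool   # the document carries a PDF/A flag
    xmp_errors: int     # failed checks caused by broken XMP metadata
    other_errors: int   # all other failed checks


def choose_fix(outcome: ValidationOutcome) -> FixAction:
    """Pick the metadata fix suggested by the rule in the text."""
    if not outcome.claims_pdfa:
        # Nothing to repair or remove if no PDF/A flag is present.
        return FixAction.NONE
    if outcome.xmp_errors == 0 and outcome.other_errors == 0:
        # The document conforms; the flag is legitimate.
        return FixAction.NONE
    if outcome.other_errors == 0:
        # Bad XMP is the only obstacle to a legitimate PDF/A flag.
        return FixAction.REPAIR_XMP
    # The document does not conform for other reasons, so the flag goes.
    return FixAction.REMOVE_PDFA_FLAG
```

For example, `choose_fix(ValidationOutcome(claims_pdfa=True, xmp_errors=2, other_errors=0))` selects the XMP repair, while any non-XMP error on a flagged document selects removal of the flag.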
veraPDF Policy Checker
The Policy Checker parses and analyzes a PDF Features Report and generates a Policy Report stating whether the PDF document complies with institutional policy as expressed in a Policy Profile. Note that the Policy Checker can be used to check for almost any quality in a PDF; for example, the use of annotations, irrespective of PDF/A.
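To make the idea of a policy check concrete, here is a minimal Python sketch that tests a features report against a "no annotations" policy. The XML element names, the sample report, and the rule format are all invented for illustration; they are not veraPDF's actual report schema or Policy Profile format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified features report (not veraPDF's schema).
SAMPLE_REPORT = """\
<featuresReport>
  <annotations>
    <annotation subType="Link"/>
    <annotation subType="Text"/>
  </annotations>
  <embeddedFiles/>
</featuresReport>
"""

# Hypothetical policy: (description, element path, maximum allowed count).
POLICY = [
    ("No annotations allowed", ".//annotation", 0),
    ("No embedded files allowed", ".//embeddedFile", 0),
]


def check_policy(report_xml, policy):
    """Evaluate each policy rule against the report and list the results."""
    root = ET.fromstring(report_xml)
    results = []
    for description, path, max_count in policy:
        found = len(root.findall(path))
        results.append(
            {"rule": description, "found": found, "passed": found <= max_count}
        )
    return results


def complies(results):
    """A document complies only if every rule passed."""
    return all(result["passed"] for result in results)


results = check_policy(SAMPLE_REPORT, POLICY)
```

With the sample report above, the annotation rule fails because two annotations are found, so `complies(results)` is `False`, regardless of whether the document is valid PDF/A.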
veraPDF Reporter
The Reporter transforms veraPDF’s machine-readable reports, as generated by the Implementation Checker, Policy Checker, and Metadata Fixer, into other forms for downstream use.
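As a rough illustration of this kind of transformation, the following Python sketch converts a small machine-readable XML report into CSV. The report structure, attribute names, and clause values are made up for the example and do not reflect veraPDF's real report format or the Reporter's actual output options.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical machine-readable report with invented sample values.
SAMPLE_REPORT = """\
<report>
  <check clause="1.1" status="passed"/>
  <check clause="1.2" status="failed"/>
  <check clause="2.1" status="failed"/>
</report>
"""


def report_to_csv(report_xml):
    """Flatten each <check> element into one CSV row."""
    root = ET.fromstring(report_xml)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["clause", "status"])
    for check in root.iter("check"):
        writer.writerow([check.get("clause"), check.get("status")])
    return buffer.getvalue()


csv_text = report_to_csv(SAMPLE_REPORT)
```

The same parsed report could just as easily feed an HTML or plain-text template; CSV is used here only because it keeps the sketch short.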
veraPDF Shell
The Shell manages veraPDF’s other components and ensures that they interact in a coordinated sequence of actions. Users interact with the Shell through the Command Line Interface (CLI), Desktop Graphical User Interface, or Web Graphical User Interface.
veraPDF is open source software, dual-licensed for sustainability and reuse in accordance with PREFORMA’s requirements. Other project outputs, such as test corpora and documentation, are issued under a Creative Commons license.
The Mozilla Public License v2+ allows covered source code to be mixed with other files under a different, even proprietary license. Code licensed under the MPL must remain under the MPL, and freely available in source form.
The GNU General Public License v3 guarantees users the freedom to run, study, share (copy), and modify the software. The copyleft quality of the GPLv3 requires those rights to be retained.