1. PDF/A: The Development of a Digital Preservation Standard
   Stephen Abrams, Harvard University
   Betsy Fanning, Association for Information and Image Management
   Diana Helander, Adobe Systems, Inc.
   Susan Sullivan, CRM, National Archives and Records Administration
   SAA 69th Annual Meeting, New Orleans, August 14-21, 2005

2. Agenda
   - The preservation problem
   - The ISO standards process
   - Benefits of PDF/A
   - Technical overview
   - Questions

3. The preservation problem
   - What is the best option for preserving electronic documents over archival time spans?
   - TIFF?
     - Widely adopted
     - No access to underlying text without OCR
     - No mechanism for capturing logical structure
     - Difficult to create “born-digital” documents
   - XML?
     - Good for describing logical structure, but not appearance
     - Many incompatible domain-specific schemas
   - Native format (e.g., MS Word)?
     - Several ubiquitous, but closed proprietary formats
   - PDF?

4. The preservation problem
   - PDF is a ubiquitous open format for electronic documents
     - Proprietary, but with publicly available specification
   - Many statutory, regulatory, and institutional policies mandate the retention of PDF-based documents over multiple generations of technology
   - The feature-rich nature of PDF can complicate preservation efforts

5. Desirable properties of a preservation format: PDF/A objectives
   - Device independence
     - Can be reliably and consistently rendered without regard to the hardware/software platform
   - Self-contained
     - Contains all resources necessary for rendering
   - Self-documenting
     - Contains its own description
   - Transparency
     - Amenable to direct analysis with basic tools

6. Desirable properties of a preservation format: PDF/A objectives
   - (Lack of) technical protection mechanisms
     - No encryption, passwords, etc.
   - Disclosure
     - Authoritative specification publicly available
   - Adoption
     - Widespread use may be the best deterrent against preservation risk

7. PDF/A usage
   - PDF/A standard may be used by vendors to:
     - Develop applications that read and write and otherwise process PDF/A files
   - These applications will be used by organizations to:
     - Create and process PDF/A conformant files
       - As part of their business processes
       - In conjunction with necessary adjunct archival and records management policies and procedures

8. Current support for PDF/A
   - There is no “formal” support for PDF/A today
     - Acrobat 7 support for “draft” version
   - Nor has PDF/A yet been adopted as a “required” format by any governmental, academic, or commercial body
   - However, once ISO 19005-1 is formally published, we can expect tools to be developed quickly
     - Acartus ApertureONE ERM
     - Many other vendors participated in the standards process: Appligent, Callas, Global Graphics, PDF Sages
   - And we expect that the mandated (or recommended) use of PDF/A will follow

9. PDF/A caveats
   - However…
   - PDF/A alone does not guarantee preservation
   - PDF/A alone does not guarantee exact replication of source material
   - The intent of PDF/A is not to claim that PDF-based solutions are the best way to preserve electronic documents
   - But once you have decided to use a PDF-based approach, PDF/A defines an archival profile of PDF that is more amenable to long-term preservation

10. The PDF/A standard
   - “This International Standard specifies how to use the Portable Document Format (PDF) 1.4 for long-term preservation of electronic documents”
   - Applicable to documents containing character, raster, and vector data
   - The standard does not address:
     - Processes for generating PDF/A files
     - Specific implementation details of rendering PDF/A files
     - Methods for storing PDF/A files
     - Hardware and software dependencies

11. The PDF/A standard
   - PDF/A is a file format standard
   - PDF/A is just one component of a comprehensive preservation strategy
   - Successful implementation depends upon:
     - Records management policies and procedures
     - Additional requirements and conditions
     - Quality assurance processes

13. The PDF/A standard
   - Multi-part ISO International Standard
   - ISO 19005-1:2005, Document management – Electronic document file format for long-term preservation – Part 1: Use of PDF 1.4 (PDF/A-1)
   - Part 2 (19005-2) intended to bring PDF/A into conformance with PDF 1.6
   - And additional future parts, as necessary

14. Time line for Part 1
   - October 2002: Initial meeting of AIIM/NPES PDF/A committee
   - April 2003: Initial Working Draft (WD)
   - August 2003: New Work Item (NWI) approved and Joint Working Group (JWG) formed
   - December 2003: First Committee Draft (CD) approved
   - September 2004: Second CD approved
   - June 2005: Draft International Standard (DIS) unanimously approved

15. Time line for Part 1
   - Submitted to ISO Central Secretariat for publication as International Standard
   - Should be publicly available September 2005
   - Throughout the process, PDF/A has been reviewed by technical experts from 15 national standards bodies

16. ISO/TC 171/SC 2/WG 5
   - ISO Joint Working Group (JWG) for PDF/A
     - ISO/TC 171/SC 2, Document management applications – Application issues
     - ISO/TC 130, Graphic technology
     - ISO/TC 46/SC 11, Information and documentation – Archives/records management
     - ISO/TC 42, Photography

17. Role of AIIM and NPES
   - AIIM, Association for Information and Image Management
     - Secretariat to ISO/TC 171 and ISO/TC 171/SC 2
     - Secretariat to US Technical Advisory Group (TAG) for ISO/TC 171
   - NPES, The Association for Suppliers of Printing, Publishing, and Converting Technologies
     - Secretariat to ANSI Committee for Graphic Arts Technologies Standards (CGATS)
     - Secretariat to US TAG for ISO/TC 130
   - Joint sponsors of the initial US PDF/A committee

18. PDF/A terminology
   - PDF/A-1 refers to the format defined by Part 1 (ISO 19005-1) of the standard
   - Part 2 (ISO 19005-2) will define PDF/A-2
   - New Parts can be added to the PDF/A family of standards without obsoleting previous Parts

20. PDF/A
   - Non-proprietary standard
     - Based on a proprietary, but open format
   - Developed by inclusive set of stakeholders
   - Subject to rigorous technical review
   - Minimal restrictions necessary to facilitate long-term preservation
   - Not reliant on the existence of any particular reader

21. Relationship to other standards
   - PDF/X for pre-press data exchange
     - ISO 15930 parts 4 (PDF/X-1a), 5 (PDF/X-2), and 6 (PDF/X-3)
     - Currently based on PDF 1.4; work underway to extend to PDF 1.6
     - It is possible for a file to be both PDF/A and PDF/X compliant
   - PDF/E for engineering, architectural, and GIS documents
     - Provisionally based on PDF 1.6
   - PDF/UA for accessibility
     - Intended to address Section 508 concerns

22. Intellectual property rights
   - PDF/A is a file format standard
   - Anyone can use the PDF Reference and XMP Specification in conjunction with ISO 19005-1 to create applications that read, write, or process PDF/A files
   - Adobe has granted a general royalty free license to use certain of its patents to create applications that read, write, or process PDF/A files

23. Supplemental information
   - Informative annexes to ISO 19005-1
     - PDF/A-1 conformance summary
     - Best practices
       - Guidelines for capturing or converting electronic documents to PDF/A
       - For documents created according to specific institutional rules
       - Replicates the exact quality and content of source documents within the PDF/A file
       - Required for compliance with NARA’s PDF Transfer Guidance
   - PDF/A FAQ
     - Under development
     - Will be available on AIIM and NPES web sites

24. Supplemental information
   - Application notes
     - Will provide specific guidance on the use of PDF/A
     - Similar in intent to those produced for PDF/X
     - Under development
     - Will be available on AIIM and NPES web sites
   - AIIM and NPES will archive copies of, and maintain public access to, the PDF Reference and XMP Specification
     - As well as other freely available, non-ISO normative references of ISO 19005-1

26. PDF/A
   - PDF/A is intended to address three primary issues:
     - Define a file format that preserves the static visual appearance of electronic documents over time
     - Provide a framework for recording metadata about electronic documents
     - Provide a framework for defining the logical structure and semantic properties of electronic documents

27. Nevertheless…
   - PDF/A may not be the last preservation format you will need
   - However, proper application of PDF/A should result in reliable, predictable, and unambiguous access to the full information content of electronic documents

28. PDF/A conformance
   - Two conformance levels
     - PDF/A-1a: Compliance with all requirements of 19005-1, including those regarding structural and semantic tagging
     - PDF/A-1b: Compliance with all requirements of 19005-1 minimally necessary to preserve the visual appearance of a PDF/A file

29-35. PDF/A requirements
   - Conformance to PDF 1.4
   - With features that are:
     - Required
     - Recommended
     - Restricted
     - Prohibited
   - Reader functional requirements
   - Features not documented in 1.4 are ignored by PDF/A readers

36. General
   - Required
     - Conformance to 1.4 requirements
   - Recommended
     - Linearization hints should be ignored
   - Restricted
     - Document information dictionary must be consistent with XMP metadata
   - Prohibited
     - Encryption
     - LZW compression
     - Embedded files
     - Optional content
     - Sound and movie media types

37. Graphics
   - Required
     - Device independent color
     - Embedded color spaces
   - Restricted
     - Image dictionaries
     - Separation and DeviceN color spaces
     - Form XObjects
     - Extended graphics state
     - Rendering intents
   - Prohibited
     - Reference XObjects
     - PostScript XObjects
     - Non-PDF 1.4 defined operators
     - Transparency

38. Fonts
   - Required
     - Fonts legally embeddable for unlimited, universal rendering
     - Embedded font programs
     - Embedded CMaps
     - Consistent font metrics
     - Unicode character map (for Level A conformance only)
   - Recommended
     - Font subsets
   - Restricted
     - Character encodings

39. Annotations
   - Required
     - Reader mechanism to expose the annotation dictionary Contents key
   - Restricted
     - Annotation dictionaries
   - Prohibited
     - Non-PDF 1.4 defined types
     - FileAttachment, Sound, and Movie types

40. Actions
   - Required
     - Behavior for NextPage, PrevPage, FirstPage, and LastPage actions as defined in PDF 1.4
     - Reader mechanism to expose GoToR dictionary F and D keys, URI action dictionary URI key, and SubmitForm action dictionary F key
   - Prohibited
     - Launch, Sound, Movie, ResetForm, ImportData, and JavaScript actions
     - Deprecated set-state and no-op actions
     - Named actions other than the 4 page navigation actions
     - Widget annotation or Field dictionary AA key

41. Metadata
   - Requires use of Extensible Metadata Platform (XMP)
     - Proprietary, but open format
     - Used for metadata creation, processing, and interchange
     - Based on Resource Description Framework (RDF)
       - Open World Wide Web Consortium (W3C) standard
       - Cornerstone of Semantic Web
     - Pre-defined schemas: Base, DC, DRM, DAM, Workflow, EXIF, PDF, PSD
     - Defined extension mechanism
     - Embedding rules: TIFF, JPEG, JPEG 2000, HTML, AI, PSD, PDF, …

42. Metadata
   - Required
     - Document level XMP metadata
     - Equivalent XMP metadata for all appropriate Document Information Dictionary properties
     - Embedded extension schema
     - Version and conformance self-identification
   - Recommended
     - File identifier
     - File provenance
     - Font metadata
   - Prohibited
     - XMP packet header bytes and encoding attributes

43. Logical structure (Level A conformance only)
   - Required
     - Tagged PDF
     - Explicit word breaks
   - Recommended
     - Tagging for pagination, layout, and page artifacts
     - “Strongly structured” block-level structural tagging
     - Natural language tagging
     - Alternative description, non-textual annotation, replacement text, and abbreviation/acronym expansion tagging

44. Interactive forms
   - Required
     - Field appearance dictionary
   - Restricted
     - NeedAppearances flag
     - Explicit word breaks
   - Prohibited
     - A and AA keys in Widget and Field dictionaries
   - Note: There is no restriction on the use of digital signatures, as defined by PDF 1.4

45. What’s under consideration for Part 2?
   - Based on PDF 1.6
   - The following specific features are under consideration for inclusion in Part 2:
     - JPEG 2000 image compression
     - More sophisticated digital signature support
     - OpenType fonts
     - 3D graphics
     - Audio/video content
   - Consistency with PDF/X, PDF/E, PDF/UA

46. What’s under consideration for Part 2?
   - If PDF/A-1 does not meet your specific needs, get involved in the process
   - Contact Betsy Fanning, Director, AIIM Standards Program

47. PDF/A summary
   - ISO 19005-1 (should be available September 2005)
   - File format standard
     - One component of a comprehensive archival strategy
   - Based on PDF 1.4
   - Two conformance levels
     - Level A for structural/semantic tagging
     - Level B for appearance only
   - Emphasis on reliable and predictable rendering of static visual appearance
     - Do’s: embed fonts, device-independent color, XMP metadata, tagging
     - Don’ts: encryption, LZW, embedded files, external content references, transparency, multi-media, JavaScript

48. PDF/A summary
   - Consistency with PDF/X
   - Work planned for Part 2

50. Questions?
   - http://www.iso.org/
   - http://www.aiim.org/pdfa/app-notes
   - http://www.npes.org/standards/toolspdfa.html
   - stephen_abrams@harvard.edu
   - bfanning@aiim.org
   - helander@adobe.com
   - susan.sullivan@nara.gov
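
The "version and conformance self-identification" requirement (slide 42) and the two conformance levels (slide 28) can be illustrated with a short sketch. The slides do not spell out the mechanism. The example below assumes the PDF/A identification schema as defined in the published ISO 19005-1 (namespace `http://www.aiim.org/pdfa/ns/id/`, properties `part` and `conformance`), and assumes the XMP packet has already been extracted from the file as a string. The function name `pdfa_identification` and the sample packet are hypothetical, and this is not a conformance check.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: PDF/A identification schema namespace from ISO 19005-1,
# not stated in the slides themselves.
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def pdfa_identification(xmp_packet: str):
    """Return (part, conformance) declared in an XMP packet, or None."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(
            f"{{{PDFAID_NS}}}part"
        )
        level = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part and level:
            return part.strip(), level.strip().upper()
    return None


# Hypothetical packet declaring PDF/A-1b.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(pdfa_identification(SAMPLE))  # ('1', 'B')
```

A declaration of this kind only states what the file claims to be. Whether the file actually meets the Level A or Level B requirements still has to be verified separately.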
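
Several of the "Prohibited" items (slides 36 and 40) and the "Don'ts" in the summary (slide 47) correspond to well-known PDF name objects, so a rough first-pass screen can be sketched as a byte scan. This is a heuristic under stated assumptions, not a validator. The mapping from slide items to the names `/Encrypt`, `/LZWDecode`, `/EmbeddedFiles`, `/OCProperties`, `/JavaScript`, and `/Launch` is an illustrative choice, and the function name `flag_prohibited` is hypothetical. A raw scan can also give false positives (a name appearing inside string data) and false negatives (names written with `#xx` escapes).

```python
import re

# Assumed mapping from PDF name objects to the slide's prohibited items.
PROHIBITED_NAMES = {
    b"/Encrypt": "encryption",
    b"/LZWDecode": "LZW compression",
    b"/EmbeddedFiles": "embedded files",
    b"/OCProperties": "optional content",
    b"/JavaScript": "JavaScript actions",
    b"/Launch": "Launch actions",
}

# A PDF name ends at whitespace, a delimiter, or end of data.
NAME_END = rb"(?=[\s/<>\[\](){}%]|\Z)"


def flag_prohibited(pdf_bytes: bytes) -> list[str]:
    """Return labels of prohibited features whose names occur in the bytes."""
    found = []
    for name, label in PROHIBITED_NAMES.items():
        if re.search(re.escape(name) + NAME_END, pdf_bytes):
            found.append(label)
    return found


# Hypothetical fragment containing a JavaScript action and an Encrypt entry.
SAMPLE_PDF = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    b"trailer\n<< /Root 2 0 R /Encrypt 3 0 R >>\n"
)

print(flag_prohibited(SAMPLE_PDF))  # ['encryption', 'JavaScript actions']
```

A screen like this can help triage a collection before running a real conformance tool. It says nothing about the required features, such as embedded fonts or device-independent color.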

