PDF A11y 101: Don't Remediate your PDFs
- Read Me
- A Brief History of the Portable Document Format
- PDF vs Web Pages
- Demonstration: Three Identical PDFs
- PDF Tutorial (Video)
- A11y PDF Basics
- PDF Titles
- Optical Character Recognition (OCR)
- Fillable PDF Forms
- Using Tables for Layout
- Remediation
- Embedding Fonts
- Bookmarks
- PDF Tools
- Links for PDFs
- Similar pages on BoutonJones.com
Read Me
This is not meant to be a complete or exhaustive explanation of PDF Accessibility.
A Brief History of the Portable Document Format
Adobe created the Portable Document Format (PDF) in 1993 to address the incompatibility of different computer systems, which often distorted document formatting during transfer. PDFs offered a solution by enabling the sharing and printing of documents with consistent layout, fonts, and images across various platforms. Regardless of the computer platform, the PDF will always display exactly the same.
Since then, PDFs have evolved beyond simple printing. They are now widely used for eBooks, contracts, interactive forms, and digital archiving. Features like clickable links, multimedia, and digital signatures have made PDFs adaptable to modern digital needs.
Adobe released the PDF as an ISO open standard in 2008. If Adobe were ever to go out of business, the PDF format would remain available and supported.
PDF allows the user to view a file precisely—down to the pixel, essentially, of what the author had intended.
- Bob Wulff, Adobe's Senior Vice President of Cloud Technology
Cited in Who Created the PDF? on Adobe.com
PDF vs Web Pages
PDFs present some limitations. Many lack proper tagging or alternative text, hindering accessibility for users of assistive technologies. Additionally, they can become bloated with unnecessary data, impacting loading and sharing speeds. While PDFs excel at preserving document appearance, they are less suitable for editing and adapting to smaller screens compared to other formats.
| Feature | PDFs | Web Pages |
|---|---|---|
| Accessibility | Limited if not properly tagged | Highly customizable for accessibility |
| Layout Consistency | Always consistent | Can vary based on screen size and browser (e.g., Responsive Design) |
| Editability | Difficult to edit without special tools | Easy to update |
| Interactivity | Supports forms and multimedia | Highly interactive with JavaScript and CSS |
| File Size | Can become large with images or media | Typically smaller. Loads dynamically |
When are PDFs the best option? PDFs are good for:
- printing
- fillable forms
- signing
Don't use a PDF where a web page will do as well --- or better.
Demonstration: Three Identical PDFs
The three PDFs in this demonstration have the exact same content, title, headers, font, and colors. So how are they different?
Blank PDFs
To users of assistive technology, scanned PDFs are blank. Click the next button to hear how JAWS perceives a scanned PDF.
JAWS identifies the first PDF as a "document," but it doesn't provide any further information. Click the next button to hear how NVDA understands the document.
NVDA is more informative. It announces "Alert: Empty document. Edit list: Document appears to be empty. It may be a scanned image that needs OCR or it may be a malformed document read only."
"A picture may be worth a thousand words --- but not if it's a picture of a thousand words that you're trying to read with accessible technology."
- David Ondich
ADA Program Manager for the City of Austin
When you scan a printed document – e.g., a signed contract, police report, corporate policy, or anything requested for discovery – it is no more than an image of the hard copy. It will not be accessible to screen readers in its current form.
If you can't "copy and paste" the text, it is most likely an image.
Untagged PDFs
Untagged PDFs contain true text but they can't be navigated. Screen readers can read the content but they can't find the headers or the lists. The tags have been stripped from the document.
Question: when is a PDF with text an "untagged" document?
Answer 1 of 2: When semantic markup is not used to tag the structural elements in a document, the document is called "untagged." If the source of a PDF is an untagged document, the PDF will also be untagged.
Tagged PDFs:
- "Tagged PDF (PDF 1.4) is a stylized use of PDF that builds on PDF's logical structure framework. It defines a set of standard structure types and attributes that allow page content (text, graphics, and images) to be extracted and reused for other purposes." Cite: PDF Technology Notes
- A PDF file that -- in addition to text and graphics -- contains meta-data for text-extraction, content-reflow, document accessibility, geographic information in PDF containing maps, etc.
With the correct tags, a screen reader can:
- Understand where headings fall
- Follow the correct reading order
- Identify footnotes & graphics
- Understand the structure of tables
- Complete fillable forms
In most cases, tags are necessary in order to make a PDF file comply with Section 508.
Answer 2 of 2: When you print a document to PDF. The text is saved to the PDF but the tags for indicating structure are lost.
One surprising exception is tables. I printed (to PDF) a Word document containing a table. Both cells of the table included multiple lines of text. I expected the screen reader to ignore the table cells and read the entire content left to right. But both JAWS and NVDA read all the lines of text in the first cell before reading all the lines of text in the next cell. I also tried using Word's column function. NVDA was able to correctly read a two-column PDF printed from Word.
Export to PDF
Never print to PDF. Always export to --- or save as --- PDF. That way the semantic formatting (the tags) will be included.
Export to PDF is a better option than Save as PDF.
Accessible PDFs
This third PDF is properly tagged and contains real text.
Screen readers (such as JAWS and NVDA) will identify headers and lists in a tagged PDF.
PDF Tutorial (Video)
How to Create Accessible in PDF [SIC] by Using Adobe Acrobat Pro [2017] (1:50)
Transcript:
00:00 In this video, PDF Tutorial: How to Create Accessible in pdf by using adobe acrobat pro
00:01 Go to the tool menu and click the Action Wizard and Click Create Accessibly
00:11 select the Add Document Description and click ok and select destination
00:22 fill the information
01:00 now your accessibility is created
01:40 Please Subscribe My channel Thank you for watching
A11y PDF Basics
- Only use PDFs on websites when they are the best option (e.g., attestation by electronic signatures, fillable forms, and printing.) Don't use a PDF where a web page will do as well.
- It's better to remediate and export an accessible source document to PDF than to remediate a PDF exported from an inaccessible document. Check the accessibility of your source documents using automated and manual testing and then remediate any errors.
- Never print to PDF. Always export to --- or save as --- PDF. While the "print to PDF" document will retain the same appearance as the original Microsoft Office document, printing removes the hidden tags that allow screen reader users to navigate the document. Instead use "Export to PDF" or "Save as PDF."
- Before remediating a document (or making any edits), make a backup copy. Every time you make a major change to a remediated document make another backup. Remediation errors can be difficult or even impossible to recover from.
- Test the accessibility of your PDFs using automated and manual testing.
- If you must scan to PDF, check the accuracy of the OCR and correct where necessary. OCR can be deceptive. Even when it looks correct, the OCR might not have mapped the new characters correctly. One way to check is to copy the text in the PDF and paste it into a text editor like Notepad. Check that the text in the text editor is accurate. Another check is to read the document with a screen reader.
- Avoid using tables for layout. But if you must, remove the tags for tables (used for layout) in Adobe Acrobat's tag tree. Visually the layout will remain the same, but it will not cause issues for AT.
PDF Titles
In a PDF, the "title style" refers to the text displayed as the title within the document itself.
The "title metadata" is a separate piece of information embedded within the PDF file that describes the document's title and is used for searching and indexing purposes but isn't necessarily visible directly in the document itself.
Saving the Title Metadata
- Open the document in the Adobe Acrobat editor.
- Navigate to "File" > "Properties."
- Select the "Description" tab.
- Enter the desired title in the "Title" field.
This will embed the title as metadata within the PDF file.
Display the Title (Metadata)
By default, the Title (Metadata) is empty and the PDF is set to display the File Name instead of the Document Title.
You must manually enter the Title and set Show to display the Document Title.
- Open the PDF in Adobe Acrobat.
- Go to "File" > "Properties."
- Select the "Initial View" tab.
- In "Window Options," choose "Document Title" from the "Show" dropdown.
Optical Character Recognition (OCR)
OCR Demo (Video)
OCR: How To Convert Scanned Document (PDF) To Text Using Adobe Acrobat Professional (:37)
This video has no narration, transcript, or closed captioning.
In this video a PDF is displayed inside Adobe Acrobat. The user tries to select the text, but the PDF is image only, so the text cannot be selected. The user runs OCR on the entire PDF. The pages are de-skewed (i.e., straightened). The text inside the images is converted into real text. It appears to retain the original fancy font face.
Recognize Text (OCR) Output Options in Adobe Acrobat 2017
If you click the Settings button on the Recognize Text menu, you can choose among three options in the Recognize Text dialog box.
- Searchable Image
- Searchable Image (exact)
- Editable Text and Image
These three settings produce two kinds of output:
- Editable Text and Image
  - Converts the graphical text into vector text. The vector text will replace the "text in image."
  - It creates and embeds a custom font that looks like the original font in the image.
  - You can now edit the text.
  - The file size is larger.
- Searchable Image and Searchable Image (Exact)
  - Places an invisible "layer" (not really a layer in a strict technical sense) on top of the original image.
  - You can't easily edit the document, but the invisible layer can be read by assistive technology (such as JAWS and NVDA).
Searchable Image (Exact)
"Searchable Image" may slightly alter the image (by deskewing it) to improve text recognition, while "Searchable Image (Exact)" preserves the original.
Generally the file size of a deskewed PDF is smaller than that of the skewed version because of the improved text recognition.
Additional Resources on the Output Options
- Understanding OCR options in Adobe Acrobat: "Searchable Image", "Searchable Image (Exact)", and "Editable Text and Images" (Superuser)
- Searchable Image vs. Searchable Image (Exact) - Quality of OCR (Adobe Community)
Validating OCR
OCR can be deceptive. It might appear to the human eye to match the original document exactly while being wildly inaccurate to assistive technology. It's always best to check the accuracy. Here are three methods.
- Compare the visible text to what a screen reader finds.
- Copy and paste the newly editable text into a text document.
- Change the font face --- of the editable text --- throughout the revised document. (Later you can change it back to the fonts closest to the original document's fonts.)
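The copy-and-paste check can be partly automated. The following is a minimal sketch, not a definitive tool: it assumes you have already pasted the OCR text into a string and that you have a trusted transcription of the same passage to compare against. The function names `ocr_similarity` and `mismatched_words` are hypothetical, and the sketch uses only Python's standard `difflib` module.

```python
import difflib


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks do not count as differences."""
    return " ".join(text.split())


def ocr_similarity(expected: str, ocr_text: str) -> float:
    """Return a 0.0-1.0 similarity ratio between the trusted text and the OCR text."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(ocr_text))
    return matcher.ratio()


def mismatched_words(expected: str, ocr_text: str) -> list[tuple[str, str]]:
    """List (expected, found) word runs where the OCR text differs from the trusted text."""
    a = normalize(expected).split()
    b = normalize(ocr_text).split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

A low similarity ratio, or a long list of mismatched words, suggests the OCR layer needs correcting even when the page looks right to the eye. What counts as "low" is a judgment call for your own documents.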
Fillable PDF Forms
- Make sure the labels and names of the input fields are correct.
- Make sure the tab order of the input fields is correct.
  Form fields are listed in the order they were added. So, if you started with a single "Name" field and later renamed it to "Last Name" and added a "First Name" field, then the "First Name" field will be listed much later in the form. The screen reader will announce something like this: "Last name, street address, city, zip code, phone number, email address, first name."
- The "tool tips" should match the "labels."
- "Date field tool tips" should contain a text string showing the correct date format (e.g. "mm/dd/yyyy.")
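The tool tip rules above can be expressed as a small check. This is a sketch under stated assumptions: the `FormField` class and `check_fields` function are hypothetical, and the field data is typed in by hand in tab order rather than read from a real PDF. "Match" is interpreted loosely here as "the tool tip contains the label," so that a date tool tip can also carry its format string.

```python
from dataclasses import dataclass


@dataclass
class FormField:
    label: str             # visible label printed next to the field
    tooltip: str           # text a screen reader announces for the field
    is_date: bool = False  # True when the field expects a date


def check_fields(fields: list[FormField], date_format: str = "mm/dd/yyyy") -> list[str]:
    """Return one human-readable problem for each rule a field breaks."""
    problems = []
    for position, form_field in enumerate(fields, start=1):
        tooltip = form_field.tooltip.lower()
        if form_field.label.lower() not in tooltip:
            problems.append(
                f"Field {position}: tool tip {form_field.tooltip!r} "
                f"does not match label {form_field.label!r}"
            )
        if form_field.is_date and date_format not in tooltip:
            problems.append(
                f"Field {position}: date tool tip is missing the format {date_format!r}"
            )
    return problems
```

Because the list is entered in tab order, the position in each message also tells you where in the tab sequence the problem field sits.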
Using Tables for Layout
It's a best practice not to use tables for layout. But if you do, it's a good idea to remove the tags for tables (used for layout) in Adobe Acrobat's tag tree.
Remove Layout Tables in Adobe Acrobat 2017
- Select Tools
- Select Edit PDF
- Select Tag to see the tag tree
- Select the table with a right click
- Select the Delete Tag option from the context menu
The challenge here is to identify the correct tag. The tags are not well labeled and it's hard to identify which item in the document each one corresponds to.
Remediation
I am not an expert on PDF remediation. I am focused on creating accessible PDFs by starting with accessible documents and exporting them into PDF.
But sometimes remediation is unavoidable.
My Best Advice on Remediation: If you must remediate PDFs, save the documents periodically under different names (e.g., ebook_ver01, ebook_ver02, ebook_ver03). It can be hard to recover from mistakes made during remediation and it is often easier to start over with your last saved copy than undo the mistake.
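The versioned-copy habit can be scripted so it takes one call instead of a manual "Save As." This is a minimal sketch, assuming the working file lives on disk and that backups go into a separate folder. The function name `save_next_version` and the folder layout are hypothetical; the naming follows the `ebook_ver01` pattern above.

```python
import re
import shutil
from pathlib import Path


def save_next_version(working_file: Path, backup_dir: Path) -> Path:
    """Copy working_file into backup_dir as <stem>_verNN<suffix>, using the next free number."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(
        re.escape(working_file.stem) + r"_ver(\d+)" + re.escape(working_file.suffix)
    )
    # Collect the version numbers already present in the backup folder.
    used = [
        int(match.group(1))
        for path in backup_dir.iterdir()
        if (match := pattern.fullmatch(path.name))
    ]
    next_number = max(used, default=0) + 1
    target = backup_dir / f"{working_file.stem}_ver{next_number:02d}{working_file.suffix}"
    shutil.copy2(working_file, target)
    return target
```

Calling `save_next_version(Path("ebook.pdf"), Path("backups"))` before each major change leaves a numbered trail of copies, so starting over from the last good version is a matter of opening the highest-numbered file.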
Embedding Fonts
Adobe Acrobat's Base 14 Fonts
The Base 14 Fonts are defined in ISO 32000-1:2008(E) in Section 9.6.2.2.
They are also referred to as the "Standard 14 Fonts", the "Standard Type 1 Fonts", and the "Standard Fonts".
Monospaced Fonts: Courier, Courier Bold, Courier Oblique, Courier Bold-Oblique
Proportional Fonts (Sans Serif): Helvetica, Helvetica Bold, Helvetica Oblique, Helvetica Bold-Oblique
Proportional Fonts (Serif): Times Roman, Times Bold, Times Italic, Times Bold-Italic
Symbol
Zapf Dingbats
Since these fonts are standard and assumed to be available on most PDF readers, they don't need to be embedded within the document, resulting in a smaller file size. If you limit your document to these 14 fonts, it should not be necessary to embed any additional fonts.
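To see which fonts in a document fall outside this list, a simple lookup is enough. The sketch below assumes you already have the font names, for example copied by hand from the Fonts tab of the document properties. The function name `fonts_needing_embedding` is hypothetical. The set uses the PostScript-style names for the 14 fonts (e.g., `Times-Roman`, `ZapfDingbats`); a font listed under a different spelling would be reported as non-standard.

```python
# The Standard 14 fonts, written with their PostScript-style names.
STANDARD_14 = frozenset({
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Symbol", "ZapfDingbats",
})


def fonts_needing_embedding(font_names: list[str]) -> list[str]:
    """Return the distinct fonts, sorted by name, that are not among the Standard 14."""
    return sorted({name for name in font_names if name not in STANDARD_14})
```

For example, a document using Helvetica, Times-Bold, and Calibri would report only Calibri as a font to embed.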
Checking If Fonts are Embedded in a PDF
To check whether the fonts are all embedded in your PDF file:
- Open your PDF file
- Click File > Document Properties
- Click on the Fonts Tab to display the list of all fonts
- All fonts are either Type 1 or TrueType fonts
- All fonts should show as "Embedded Subset"
Cite: How to Embed Fonts in a PDF Document on qoppa.com
Embedding Fonts in PDFs
Note: a font can only be embedded if it contains a setting by the font vendor that permits it to be embedded.
- Open the file in Adobe Acrobat (the editor, not the free reader).
- In the File menu, click Print.
- Click Adobe PDF
- Click the Properties button to the right of the Printer Name text box
- Select the tab Adobe PDF Settings
- Edit the Default Settings
- Click Fonts
- For Subset embedded fonts when percent of characters used is less than: Set the percentage to 100%
- Select the Embed all Fonts option
- For Embedding, select the folder with the fonts you want to embed from the drop-down list
- Make sure the fonts you need to embed are in the Always Embed box and not in the Never Embed box
Cite: How to Embed Fonts in PDF on printivity.com
Here's a short video (less than 2 minutes long): How to Embed Fonts in a PDF using Acrobat pro-2017
Bookmarks
What Are Bookmarks?
You might know bookmarks from web browsers, where they save web pages so you can find them later. In PDFs, bookmarks work a little differently. They show up in the navigation panel and let you jump to different sections within the same document.
So, while browser bookmarks help you move between websites, PDF bookmarks help you move around within a document.
Are Bookmarks Required?
Some tools, like Adobe Acrobat's checker and the PDF Accessibility Checker (PAC), may flag an error if a PDF over 20 pages doesn't have bookmarks.
The WCAG guidelines don't strictly require bookmarks, but it's considered a good practice to include them in long PDFs.
Some A11y professionals advise that if you use bookmarks, they should mirror headings. For example, in the U.S. Department of Health and Human Services' Adobe Acrobat PDF Accessibility Reference (in the "Take Additional Measures" section), the HHS advises: "Adding bookmarks to lengthy documents can aid navigation. Open the Bookmarks pane and confirm bookmarks are present. Insert bookmarks by activating the Options menu and selecting New Bookmarks from Structure… Typically heading structure (i.e., H1-H6) is used. Bookmarks must be organized and properly nested."
(That document was last revised in August 2020.)
Prior to that, the HHS's advice (in a no longer extant web page) was a little more emphatic.
This screen capture from the "Required Fixes for PDF Files" page on the HHS.gov website was taken before 2008. It states: "The document contains at least 10 pages and does not contain proper bookmarks. This issue is a violation of section 508 and WCAG 2.0 Success Criterion 2.4.5."
It goes on to recommend: "Add bookmarks for major divisions of the document. Recommend creating based on the heading structure or Table of Contents if one exists. For assistance see: W3 PDF Technique #2"
It concludes with this link: Adobe Bookmark
As mentioned earlier, WCAG doesn't require bookmarks. What WCAG Success Criterion 2.4.5 (Level AA) actually requires is that "More than one way is available to locate a Web page within a set of Web pages except where the Web Page is the result of, or a step in, a process."
The intent of the success criterion is to make it possible for users to locate content in a manner that best meets their needs. Users may find one technique easier or more comprehensible to use than another.
Applying that web page criterion to PDFs, I interpret it to mean the bookmark technique is not required, but it is one technique that can be applied. The goal is to help users find what they need in the way that works best for them. For PDFs, bookmarks are one way to do that.
W3C's PDF Technique #2 (i.e., "Creating bookmarks in PDF documents")
The intent of this technique is to make it possible for users to locate content using bookmarks (outline entries in an Outline dictionary) in long documents.
Furthermore, "A person with cognitive disabilities may prefer a hierarchical outline that provides an overview of the document rather than reading and traversing through many pages. This is also a conventional means of navigating a document that benefits all users."
Notice that the W3C is offering this as a technique, not as a success criterion.
Is it Redundant to Use Both Headings and Bookmarks?
It can seem redundant to include headings as well as bookmarks that mirror headings, but bookmarks offer an additional benefit. They show up in the navigation pane, so users can quickly jump between sections without scrolling or using a mouse. This helps everyone - not just people using screen readers. Headings alone are hidden unless you use assistive tech.
How Do You Use Word Headings for Bookmarks?
How to add bookmarks based on a document's heading structure via Adobe Acrobat Pro DC:
- Select the Bookmarks icon on the Accessibility Checker panel
- Select the Options icon
- Select New Bookmarks from Structure
- In the Structure Elements dialog box, select the element(s) (e.g., headings) that you want to use as bookmarks.
- Click OK.
Citation and Image Source: Accessible PDFs: When Bookmarks Are Required Posted on June 1, 2019 by Mary Gillen
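To illustrate what "bookmarks that mirror headings" means structurally, here is a minimal sketch that nests a flat list of headings into a bookmark tree. It is not how Acrobat implements New Bookmarks from Structure. The `Bookmark` class and both function names are hypothetical, and the headings are supplied by hand as (level, title) pairs rather than read from a PDF.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    title: str
    level: int
    children: list["Bookmark"] = field(default_factory=list)


def bookmarks_from_headings(headings: list[tuple[int, str]]) -> list[Bookmark]:
    """Nest (level, title) headings into a bookmark tree that mirrors the heading structure."""
    roots: list[Bookmark] = []
    stack: list[Bookmark] = []  # chain of open ancestors, shallowest first
    for level, title in headings:
        node = Bookmark(title, level)
        # Close every open bookmark that is at the same depth as this heading or deeper.
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


def outline_lines(bookmarks: list[Bookmark], depth: int = 0) -> list[str]:
    """Render the tree as indented text, two spaces per nesting level."""
    lines = []
    for bookmark in bookmarks:
        lines.append("  " * depth + bookmark.title)
        lines.extend(outline_lines(bookmark.children, depth + 1))
    return lines
```

Printing `outline_lines(...)` for a document's headings gives a quick preview of how the bookmark pane should look when the bookmarks are organized and properly nested.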
In Conclusion (Bookmarks)
For long PDFs with headings, it seems prudent and helpful to include bookmarks based on the original document's headings. While not strictly a WCAG success criterion, it will improve the PDF's accessibility, which is the ultimate goal.
That said, I welcome hearing from PDF Accessibility experts who have a different opinion or offer new information.
PDF Tools
Adobe Acrobat
The most popular software for creating PDFs is Adobe Acrobat. Among other functions it provides OCR, editing, and remediation. It's not to be confused with the free Adobe Reader.
For document authors who already have licenses for Acrobat, the built-in Accessibility Checker is an obvious choice for auditing PDFs. In most cases, that checker will be enough.
However, document authors should consider which guidelines they mean to follow when choosing the options for the checker. If they are not following the PDF/UA guidelines, they should not select the "accessibility permission flag" or "missing bookmarks" criteria. The remaining criteria will work for both WCAG and PDF/UA.
Other Adobe PDF Products
- Acrobat Distiller: A software application that converts documents from PostScript format to Adobe PDF. (A PostScript file, or PS file, is a sequence of drawing instructions that specify how to create text, images, shapes, and other graphical components on a page.)
- Acrobat Pro: A version of Adobe Acrobat that includes more advanced features like editing scanned documents, comparing PDFs, and redacting sensitive information.
- Acrobat Reader: Free software mostly for viewing PDFs. You cannot create PDFs with Acrobat Reader.
- Acrobat Sign: A cloud-based service that allows users to send, sign, track, and manage electronic signatures for documents.
- Acrobat Standard: A subscription-based version of Adobe Acrobat that allows users to create, edit, sign, and track PDFs on multiple devices.
- FrameMaker: A tool for creating and publishing technical content for a variety of devices, including mobile, web, desktop, and print. It's designed for technical communicators, information architects, designers, developers, and other documentation specialists.
- Illustrator: A vector graphics design program that lets users create a variety of digital and printed images. You can export Illustrator files to PDF, but you have to manually tag the PDFs outside of Illustrator to make them accessible.
- InDesign: A desktop publishing and page layout software application that allows users to create designs and content for print and digital media (including PDFs.) InDesign replaced Adobe PageMaker in 1999.
Some Alternative Free PDF Editors
This is not meant to be a complete list or intended as recommendations.
- Indigo PDF Tools
- PDFGear
- Stirling PDF
Additional PDF Tools
Depending on how much PDF remediation you perform, other tools may be helpful or necessary.
I'm not personally recommending any of the following. I have limited experience using any of them. (As stated previously, I avoid PDF remediation as much as possible, but it can't always be avoided.)
- ABBYY FineReader PDF (OCR) by ABBYY
- CommonLook PDF is "the world's leading PDF remediation software plugin for Adobe Acrobat, enabling users to test, repair, and report on accessible PDF documents."
- PAC (PDF Accessibility Checker) 2024 (Free)
- AxesPDF from axes4.
- See Dax Castro's AxesPDF's Table Editor Video
- iText PDF: "Our PDF toolkit offers you one of the best-documented and most versatile PDF engines in the world (written in Java and .NET), which allows you to not only integrate PDF functionalities into your workflow, but also in your applications, processes or products."
- pdfGoHTML: "pdfGoHTML substantially speeds up the creation and evaluation of tagged PDFs and ensures a much higher degree of usability of tagged PDF files. One simple click on the plug-in button converts the tagged PDF into HTML making it easy to examine the tagging structure, have a more flexible reading experience or make the document accessible for people with visual disabilities or dyslexia."
Links for PDFs
- PDF Accessibility group on Facebook
- PDF Accessibility on the Web: Tricks and Traps: Ricky Onsman WordPress Accessibility Meetup on date (VIDEO 1:23:23)
- Tagged PDF channel on YouTube
- PDF Accessibility Testing with Denis Boudreau (55:05)
- HHS - PDF File 508 Checklist
- HHS - HHS Section 508 Accessibility checklists (for PDF and MS Office Docs)
- Webaim - PDF Accessibility
- W3C - PDF Techniques for Web Content Accessibility Guidelines 1.0 and 2.0
- Evaluating the Acrobat PDF accessibility checker
- W3C - PDF Techniques for WCAG 2.0
- PDF Accessibility Overview from Adobe
- Create Accessible PDFs from Section508.gov
- Creating Accessible PDFs: The ultimate guide to accessible PDFs from LinkedIn Learning (5h 33m course)