Redaction of PDF Files Using Adobe Acrobat Professional X
Enterprise Applications Division
of the
Systems and Network Analysis Center (SNAC)
Information Assurance Directorate
Executive Summary
Redaction is the process of selectively removing visible and non-visible classified or sensitive information
from a document for release to recipient(s) not cleared to view that information. The goal of the
process is to prevent inadvertent release of the information. The Portable Document Format (PDF) is a
common format for publishing documents on the web and for exchanging files between government
entities and government contractors. Redaction of PDF files is an ongoing challenge for these entities.
This document describes how to use Adobe Acrobat Professional X for redaction of PDF files.
Table of Contents
1. Introduction
2.
3.
4. Conclusion
Table of Figures
Figure 1: Preferences Dialog
Figure 2: Protection Menu
Figure 3: Redaction Dialog
Figure 4: Redaction Outline Marks
Figure 5: Redaction Black Box Marks
Figure 6: Sanitize Document Dialog
Figure 7a: Default Redaction Properties Dialog
Figure 7b: Modified Redaction Properties Dialog
Figure 8: Redaction Mark with Custom Text
Figure 9: Remove Hidden Information Check List
Figure 10: PDF Optimizer Dialog
1. Introduction
Redaction of information from documents is an ongoing challenge. There are multiple tools to assist the
user depending on the source file and the desired destination file type. NSA has released several papers
on the topic of redaction and the removal of hidden data in Microsoft Word and PDF files. These papers
are on the NSA.gov website. Adobe Acrobat Professional X (AAPX) includes improvements to the
redaction process and a new feature, Sanitize Document, that greatly enhances redaction capabilities. This
paper describes the use of AAPX in redaction.
2.
Prior versions of Acrobat Pro had redaction functions spread across two menus, and the user had to
perform multiple steps to remove hidden data. Even after completing the steps, some complex
structures may have retained hidden data, such as layers in diagrams. In version X, Adobe added a
Sanitize Document function to accomplish a one-button sanitization that removes metadata, embedded
content and attached files, scripts, hidden layers, embedded search indexes, bookmarks, stored form
data, review and comment data, hidden data from previous document saves, obscured text and images,
comments hidden within the body of the PDF file, unreferenced data, links, actions and JavaScripts, and
overlapping objects.
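As a rough illustration of what these categories look like inside a file, the following Python sketch scans the raw bytes of a PDF for name tokens that commonly mark such content. The token list is an illustrative assumption based on standard PDF names, not a list published by Adobe, and the function name is hypothetical. The scan cannot see tokens inside compressed object streams, so an empty result is not proof that a file is clean; it is only a quick spot check.

```python
import re

# Standard PDF name tokens that often mark the kinds of content listed above.
# This selection is an assumption for illustration, not an Adobe specification.
SUSPECT_NAMES = (
    b"/JavaScript",    # JavaScript actions
    b"/JS",            # JavaScript source entry
    b"/OpenAction",    # action run when the document opens
    b"/AA",            # additional actions
    b"/EmbeddedFile",  # attached files
    b"/Metadata",      # XMP metadata streams
    b"/Outlines",      # bookmarks
    b"/AcroForm",      # form fields and stored form data
    b"/Annots",        # comments, links, and other annotations
    b"/OCProperties",  # optional content groups (layers)
)

# A name token ends at whitespace, a PDF delimiter, or the end of the data.
_NAME_END = rb"(?=[\s()<>\[\]{}/%]|\Z)"


def find_suspect_names(pdf_bytes: bytes) -> dict[str, int]:
    """Count occurrences of each suspect name token in uncompressed PDF data."""
    hits = {}
    for name in SUSPECT_NAMES:
        count = len(re.findall(re.escape(name) + _NAME_END, pdf_bytes))
        if count:
            hits[name.decode("ascii")] = count
    return hits
```

A caller would read the file with `open(path, "rb").read()` and pass the bytes to `find_suspect_names`. Any hit in a file that is supposed to be sanitized is a reason to inspect the file more closely.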
The new Sanitize Document feature is different from the Remove Hidden Information feature, which
allows the user to choose what content to remove and what to leave in. When the user selects Remove
Hidden Information, AAPX creates a list of items to remove and the user can deselect any items to keep in
the document. The user must then click the Remove button to remove all of the selected items. Sanitize
Document, by contrast, removes everything in one step, creating a "what you see is what you
get" document. In environments where the user must prepare many documents for declassification or
for release where no advanced functionality is necessary in the final copy, Sanitize Document will save
time and reduce the likelihood of process errors.
Complex images with hidden layers had to be handled individually in prior versions of Acrobat Pro.
For documents with diagrams, embedded spreadsheets, etc., the whole document had to be converted to
JPEG images to ensure removal of all hidden layers. AAPX now handles hidden layers correctly with both
Sanitize Document and Remove Hidden Information. With the Remove Hidden Information option, the user
can choose to leave those items as they are.
3.
Assuming the user is starting the redaction process with a PDF document (rather than a Microsoft Word
or other format), these are the high-level steps to perform redaction in AAPX (detailed steps are listed in
the next section):
1. Start AAPX.
2. Turn off JavaScript. If the document might contain hidden malicious code, disabling JavaScript
first keeps that code from running while you work on the document.
3. Open a copy of the original document in AAPX (always work with a copy).
4. Redact Information.
5. Apply Redactions.
6. Sanitize Document.
7. (Optional) Save Document with File->SaveAs->Optimized PDF.
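Step 3 (always work with a copy) can be scripted outside of AAPX. The following is a minimal Python sketch, assuming the original is an ordinary file on disk. The function names and the "_redaction_copy" suffix are hypothetical conventions chosen for illustration and are not part of AAPX.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original: Path, work_dir: Path) -> Path:
    """Copy `original` into `work_dir` under a distinct name and verify the copy."""
    work_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming convention so the copy is never mistaken for the original.
    copy_path = work_dir / f"{original.stem}_redaction_copy{original.suffix}"
    if copy_path.exists():
        raise FileExistsError(f"refusing to overwrite {copy_path}")
    shutil.copyfile(original, copy_path)
    # Confirm the copy is byte-for-byte identical before any redaction begins.
    if sha256_of(original) != sha256_of(copy_path):
        raise OSError("working copy does not match the original")
    return copy_path
```

The returned path is the file to open in AAPX. Recording the digest of the original also makes it possible to confirm later that the original was never modified.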
4. Detailed Steps
4.1.
For documents created by the user or from a known source, turning off JavaScript is optional. However,
this step is essential for sanitization of documents of unknown origin that might contain malware or
other executable content.
JavaScript is one of the things that should be removed from PDF documents during the redaction
process when the goal is to produce a "what you see is what you get" document. Because some PDF
documents might contain malicious JavaScript, the user should avoid triggering it while trying to
remove it. After opening AAPX and before opening a document, the user should turn off JavaScript:
1. On the top menu, select Edit->Preferences to bring up the Preferences dialog.
2. Under Categories, select JavaScript. This brings up the JavaScript palette shown in Figure 1.
3. JavaScript is enabled by default, as shown in Figure 1. Uncheck Enable Acrobat JavaScript,
leave the other settings unchanged, then click OK.
Now it is safe to open the document for redaction.
4.2.
6. After applying redactions, you will see a dialog box that says "Redactions have been successfully
applied. Would you like to also find and remove hidden information in your document?" Select
No. You can choose this option from the right panel later, or choose Sanitize Document
instead.
7. You can continue to mark and apply redactions in the document. When you have finished marking
and applying all of the redactions, you are ready to sanitize the document.
4.3. Sanitize Document
After marking and applying all of the redactions for the document:
1. Select Sanitize Document from the right panel menu. This brings up a dialog shown in Figure 6.
The user can customize the redaction properties rather than using the default black box mark. The right
panel menu includes a Redaction Properties choice which displays the dialog in Figure 7a.
Figure 7a shows the default settings. Some redactions may require redaction marks that
explain the reason for the redaction, such as for U.S. FOIA or U.S. Privacy Act requirements. AAPX allows
the user to insert custom text in the redaction marks. To insert custom text, check Use Overlay Text at
the top right of the dialog. The greyed-out fields become usable and the user can either insert custom
text or choose a Redaction Code from the codes shown in the dialog. Be careful not to use classified or
sensitive text as the custom overlay text.
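The caution about overlay text can be backed by a simple check before the text is entered in the dialog. The sketch below is an illustration only: it assumes the user maintains a list of terms that must not appear in released material, and the function name and example terms are hypothetical. It performs a case-insensitive substring match and will not catch paraphrases or abbreviations.

```python
def overlay_text_problems(overlay_text: str, sensitive_terms: list[str]) -> list[str]:
    """Return the sensitive terms that appear in the overlay text (case-insensitive)."""
    folded = overlay_text.casefold()
    return [
        term
        for term in sensitive_terms
        if term.strip() and term.casefold() in folded
    ]
```

An empty list means none of the listed terms were found. For example, checking "MY OVERLAY TEXT" against a list containing only the hypothetical term "Project Falcon" returns an empty list.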
Figure 8 shows how the redaction marks appear with the Custom Text set to MY OVERLAY TEXT, as
shown in Figure 7b.
4.5.
Sanitize Document removes everything: all form and JavaScript functionality, all tags, and all
comments. In cases where the user needs to leave some functionality in the
document, the user can use the Remove Hidden Information option and select individual items to leave
in the document; however, this reduces the assurance that all hidden data is removed and means the
user must save the document as Optimized PDF (see next section). Sanitize Document includes
optimizing the PDF format. The Remove Hidden Information option should be used only in special
circumstances; otherwise, Sanitize Document is the recommended option.
When the user selects Remove Hidden Information, AAPX searches the document for hidden
information and displays a checklist on the left menu pane as shown in Figure 9:
Multiple vendors have applications that create PDF files. These files are not always well-formed or
standardized PDF and could contain sensitive or classified remnant data within the file format. The
Sanitize Document option includes optimizing the PDF format to correct these problems. However, if the
user chose Remove Hidden Information instead of Sanitize Document (see previous section), the user should
then save the file as Optimized PDF.
1. Select File->SaveAs->Optimized PDF from the top menu. This brings up the dialog in Figure 10.
5. Conclusion
When using AAPX for redaction of sensitive or classified information from documents meant for public
release, the steps outlined in this paper with the Sanitize Document option are the recommended
procedure. This produces a "what you see is what you get" final document.