Text removal made fast and simple

Redax software removes selected text and images from PDF files

Many government agencies are searching for an automated solution to help them deal with new regulations covering the release of information. The problem gets even more complex when the documents are in formats other than Microsoft Corp.'s Word and include things such as rotated text or images.

Fortunately, Appligent Inc.'s Redax 3.5 specifically addresses the problem of removing selected text and images from Adobe Systems Inc.'s Acrobat PDF files.

After you install the program, a new menu option, Redax, appears at the top of the Adobe Acrobat screen. Before using the plug-in, you must go through a setup process to configure the default settings. You must also create a text file listing the words to be redacted from prospective documents. This file includes the specific words to look for as well as a code denoting the type of redaction to apply. Redax ships with two code lists derived from the Freedom of Information Act and the U.S. Privacy Act.

There is also a manual method of text removal involving searching the document in Adobe Acrobat and marking text you wish to have excluded from the released version. This process uses a pop-up palette of exemption codes, making it possible to associate a specific reason for removal with each marked section.

Redax will not detect words from scanned documents or images in PostScript format. Tools such as Find Text Areas and Find Image Areas help identify these potential problems so they don't go undetected. Once found, the affected areas can be manually marked for exclusion.

Redax templates make it easy to deal with forms, which keep the same information in the same location on every copy. A template can be applied across any number of forms to remove a specific field on each one. To create a template, simply mark the areas to exclude from one document and then choose the Export Redax Template option from the menu. To apply a template to other documents, use the Import Redax Template option. The only downside is that you must either combine or manually load all of the pages that need to be processed into a single document.

A detailed report of each item removed from the document is useful for determining how many items were identified. The Report option generates a tab-delimited file containing the page number, creation date and time, color, exemption code, author (as defined in Redax's preferences) and any note associated with each item.
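
Because the report is tab-delimited, it can be tallied with a few lines of script. The Python sketch below counts redactions per exemption code. The column order, the lack of a header row and the sample rows are all assumptions made for illustration; none of them is taken from Redax's documentation, and the function name is hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed column order, inferred only from the fields the review lists.
# The real report layout may differ.
FIELDS = ["page", "created", "color", "exemption_code", "author", "note"]


def count_by_exemption(report_text):
    """Return a Counter of redactions per exemption code."""
    reader = csv.DictReader(
        io.StringIO(report_text), fieldnames=FIELDS, delimiter="\t"
    )
    return Counter(row["exemption_code"] for row in reader)


# Hypothetical sample rows, not real Redax output.
SAMPLE = (
    "1\t2004-06-01 09:15\tblack\t(b)(6)\tjdoe\tHome address\n"
    "2\t2004-06-01 09:17\tblack\t(b)(6)\tjdoe\tPhone number\n"
    "4\t2004-06-01 09:20\tblack\t(b)(4)\tjdoe\tPricing data\n"
)

if __name__ == "__main__":
    for code, count in count_by_exemption(SAMPLE).most_common():
        print(code, count)
```

A tally like this gives a reviewer a quick cross-check against the number of items expected to be withheld under each exemption.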

One caveat from the documentation encourages users to "check each redaction individually" for missed exemptions caused by typographical errors, hyphenation or other irregularities. That could be tedious for large documents.

The documentation warning highlights the point that no computer program is 100 percent accurate in performing the redaction process. Human intervention is still necessary to give the final product a thorough examination. Redax provides the tools necessary to help automate the release process as much as possible and to document what information was removed. The final results will depend a lot on the person operating the tool.
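
The hyphenation problem mentioned in the documentation's caveat is easy to demonstrate. The Python sketch below is not how Redax searches; it is a generic illustration, with hypothetical function names and sample text, of why an exact word match misses a word split across a line break and how a more tolerant pattern could catch it.

```python
import re


def find_exact(word, text):
    """Start offsets of exact, case-insensitive matches of word."""
    return [m.start() for m in re.finditer(re.escape(word), text, re.IGNORECASE)]


def find_tolerant(word, text):
    """Like find_exact, but allows a hyphen plus whitespace between letters."""
    # Between every pair of letters, optionally accept "-" followed by
    # whitespace, which is what a line-break hyphen looks like in text.
    pattern = r"(?:-\s+)?".join(re.escape(ch) for ch in word)
    return [m.start() for m in re.finditer(pattern, text, re.IGNORECASE)]


# Hypothetical sample text: the second occurrence is split across a line.
TEXT = "Contact the Lancaster office. Mail goes to the Lan-\ncaster depot."

if __name__ == "__main__":
    print("exact matches:", len(find_exact("Lancaster", TEXT)))
    print("tolerant matches:", len(find_tolerant("Lancaster", TEXT)))
```

Even a tolerant pattern will not catch misspellings, which is one reason a manual check of the final document remains necessary.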

Ferrill, based in Lancaster, Calif., has been writing about software for almost 15 years. He can be reached at paul.ferrill@verizon.net.

***

Redax benefits

Using Appligent Inc.'s Redax, users can:

* Remove text and images.

* Replace removed text with text characters and images with black pixels or blank space.

* Customize removal templates.

* Comply with Freedom of Information Act and Privacy Act rules.

REPORT CARD

Redax 3.5

Appligent Inc.
(610) 284-4006
www.appligent.com

Redax 3.5 costs $349.

Redax 3.5 helps automate the tedious task of removing text and images that cannot be released from official documents. Although it doesn't eliminate the need for human inspection, it does provide tools for finding and deleting nonreleasable information. It cannot handle a number of special cases, such as PostScript images and form fields, but the documentation offers a workaround for those situations. It runs on Microsoft Corp.'s Windows 98, NT 4.0, 2000 Professional and XP. It requires Adobe Systems Inc.'s Acrobat 5.0 or higher and will not work with Acrobat Reader.

We tested Redax 3.5 on a Hewlett-Packard Co. Compaq Evo desktop system with a 1.7 GHz processor, 1G of memory and Windows XP.
