SAN MATEO (04/24/2000) - You've got 5,000 pages of human resources manuals your boss wants you to post on the intranet by tomorrow, and the documents must retain the original formatting and be searchable by users. You can either stay up all night scanning, formatting, and indexing, or you can load Adobe Systems Inc.'s Acrobat Capture.
Capture has a narrowly defined purpose: converting scanned hard-copy documents into collections of searchable electronic documents. It controls the scanning, zoning, and recognition of scanned pages and then converts them into Acrobat's PDF files.
With the just-released Version 3.0, Cluster Edition, Adobe has transformed Capture into a tool suitable for large enterprises with heavy loads of documents to move to the Web.
The new Cluster Edition has no page limitation, offers load balancing across multiple workstations, and supports both dual- and quad-processor systems.
Administrators can dole out various processing jobs to specified workstations; for example, they can route all documents to a designated staff person for zoning and character recognition.
By automating the processing of documents from hard copy to searchable electronic text and by allowing you to distribute the workload within the enterprise, Capture can save your staff scads of time, not to mention saving them from tedium.
Other programs cover the same basic ground, most notably ZyLab's ZyImage, which offers more flexibility than Capture. For example, Capture can handle only scanned documents or TIF images for conversion, whereas ZyImage can incorporate a wide variety of formats.
But Capture is decidedly easier to use than ZyImage and, thanks to its multiprocessor capabilities, is more scalable. If you're only going to be dealing with scanned documents and PDF files are suitable output, you won't find a more powerful, easier-to-use tool than Adobe's.
Capture's new interface is at first a tad daunting, in part because it displays so much information. The left-hand panel has four tabs: Configure, Scan, Submit, and Watch. Clicking on the Configure tab causes Capture to display all of the available workflows, which are recipes for how documents will be processed, indicating which steps will be performed by whom. Creating new workflows, as well as editing existing ones, is easy with a generous set of templates. You can drag and drop steps from one workflow to another, and customizing each step is straightforward via intelligently designed dialog boxes.
The Scan tab provides access to scanner configuration tools. The Submit tab summons a dialog box that lets you specify files for processing in a workflow.
With the Watch tab, you can set up folders to be watched for new files to be processed through workflows. After you've set up directories to be watched, they'll be checked periodically for new files.
Once you've set up the workgroup by designating which users are authorized to access workflows, it's easy to specify the steps in a workflow -- from scanning to zoning, from optical character recognition to exporting -- that can be performed on selected workstations.
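For readers curious about what a watched folder involves, the idea can be sketched in a few lines of Python. This is not Capture's code or API. The function names, the 30-second interval, and the TIFF-only filter are all assumptions made for illustration.

```python
import os
import time


def find_new_files(folder, seen, extensions=(".tif", ".tiff")):
    """Return matching files in `folder` not yet in `seen`, and mark them seen.

    Hypothetical helper: the extension filter is an assumption, not
    documented Capture behavior.
    """
    new_files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        if not name.lower().endswith(extensions):
            continue  # skip files that are not scanned images
        if path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def watch(folder, handle, interval_seconds=30, max_polls=None):
    """Poll `folder` periodically and pass each new file to `handle`.

    `max_polls=None` polls forever; a number stops after that many checks.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(folder, seen):
            handle(path)  # e.g. submit the file to a workflow
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)
```

Calling `watch("incoming", print, max_polls=1)` would check a hypothetical `incoming` folder once and print any TIFF files found there.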
I noticed that network traffic increases with the use of a workgroup, and the workstation that serves as the hub must dedicate a significant share of processor time to Capture. Accordingly, you'll want either to devote a workstation to serve as a hub or to use your fastest system as the hub.
Also new is the QuickFix utility, which provides excellent tools for checking and repairing suspect words. You can make the submission of suspect words to QuickFix part of any workflow, in which case Capture will pause at the appropriate stage for an editor to complete the checking procedure.
QuickFix offers a very effective interface for document checking. Suspect words are presented in a table with the suggested spellings. The editor is given three options for each word: Accept the suggestion, delete the word, or edit it.
QuickFix also offers unexpected flexibility in that it allows the editor to sort the suspects alphabetically, by their order of appearance, by degree of confidence, or by reason.
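The four sort orders are easy to picture with a small Python sketch. The `Suspect` record and its field names are hypothetical and stand in for whatever QuickFix stores internally. Only the sort criteria come from the product.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    """One suspect word; all field names are assumptions for illustration."""
    word: str
    suggestion: str
    position: int      # order of appearance in the document
    confidence: float  # recognition confidence, 0.0 to 1.0
    reason: str        # why the word was flagged


# One key function per sort order named in the review.
SORT_KEYS = {
    "alphabetical": lambda s: s.word.lower(),
    "appearance": lambda s: s.position,
    "confidence": lambda s: s.confidence,
    "reason": lambda s: s.reason,
}


def sort_suspects(suspects, order):
    """Return a new list of suspects sorted by the chosen order."""
    return sorted(suspects, key=SORT_KEYS[order])
```

Sorting by confidence puts the least certain words first, which is probably where an editor on a deadline would want to start.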
But Capture's optical character-recognition engine is accurate enough that you'll rarely need to make corrections. I found that it did an excellent job of zoning graphics, headlines, and body text into separate regions and recognizing text, even on complex pages loaded with graphics. If your documents are in good shape, you may never need QuickFix, but it comes in handy if you're converting tattered, faded, or otherwise degraded pages.
Another feature new to Version 3.0 is the automatic creation of document links during recognition. You might, for example, set the program to create a table of contents, bookmarks, indexes, e-mail addresses, or URLs when the program finds appropriately formatted text.
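As a rough illustration of how recognized text might be scanned for link candidates, here is a minimal Python sketch. It is not Adobe's method. The regular expressions are deliberately simple assumptions, and `find_link_targets` is a hypothetical name.

```python
import re

# Simplified patterns, assumed for illustration; real-world matching is messier.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
URL_RE = re.compile(r"\b(?:https?://|www\.)[^\s<>()]+", re.IGNORECASE)


def find_link_targets(text):
    """Return (offset, target) pairs for e-mail addresses and URLs in `text`."""
    links = []
    for match in EMAIL_RE.finditer(text):
        links.append((match.start(), "mailto:" + match.group()))
    for match in URL_RE.finditer(text):
        url = match.group().rstrip(".,;:")  # drop trailing sentence punctuation
        if not url.lower().startswith("http"):
            url = "http://" + url  # give bare "www." addresses a scheme
        links.append((match.start(), url))
    return sorted(links)  # order by position in the text
```

Each pair gives the character offset where the match begins and the target a link would point to. A converter could use those offsets to place clickable regions on the page.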
Capture's main display panel allows administrators to track the status of documents, view the current status of workflow steps on the local workstation, and view the status of all workflows on all of the workstations in the workgroup.
The interface is concise and easy to grasp. The only thing I found myself wishing for was a more flexible set of alert tools. The program provides audio and visual alerts if a warning is logged or a manual step is awaiting execution, but they require that Capture be running on your local workstation. It would be useful, especially for larger workgroups, if a logged warning could be configured to trigger an e-mail alert to an administrator.
The end result of a Capture workflow is a cross-platform PDF document that is eminently searchable on the LAN or via the Internet using Adobe Acrobat 4.0.
You can also make pushing a copy of the PDF document to specified e-mail addresses a part of the workflow.
In addition, Capture supports the ODMA (Open Document Management API), and Adobe provides a software development kit that allows programmers to integrate Capture with document workflow, fax, and e-mail applications.
Acrobat Capture is a powerful, scalable, and easy-to-use solution for turning hard-copy documents into searchable, online documents that faithfully reproduce the layout and formatting of the original.
If your company has large volumes of documents that you need to get online quickly and in searchable form, Acrobat Capture 3.0, Cluster Edition is an elegant solution. Perhaps best of all, it will save your staff lots of time and may encourage your company to make information available online to customers and employees that would otherwise remain inaccessible.
Patrick Marshall (firstname.lastname@example.org) is an InfoWorld contributing editor.
He is reviews editor at Federal Computer Week.
THE BOTTOM LINE: EXCELLENT
Adobe Acrobat Capture 3.0, Cluster Edition
Business Case: Acrobat Capture quickly converts paper documents to searchable electronic files with a minimum of staff time. This time-saving product should encourage organizations to put volumes of otherwise unavailable documents online.
Technology Case: Acrobat Capture turns hard-copy pages into cross-platform PDF files viewable via Adobe Acrobat. The Cluster Edition's support for multiple processors and distributed processing means volumes of text can be processed without delay.
Pros:
+ Easy-to-manage workflows
+ Distributed processing and load balancing
+ Multiprocessor support
Cons:
- Limited alert capabilities
Cost: $699, Standard Edition; $7,000 per processor, Cluster Edition
Platform(s): Windows NT 4.0, Service Pack 3 or later
Adobe Systems Inc., San Jose, Calif.; (800) 833-6687; www.adobe.com.