When & Where
Date: Tue, September 12, 2017 - 1:00 PM to 2:00 PM
Location: Barrows 356: Convening Room
Description

Getting research materials in a digital form that you can search and computationally analyze can be a time-consuming initial step in the research process. While Adobe Acrobat can do basic optical character recognition (OCR, transforming an image of a text into editable text), it performs poorly on documents with complex layouts or non-English text.

This workshop will cover how to use ABBYY FineReader, a professional-level OCR package, via the OCR virtual research desktop provided by Research IT or in the D-Lab. It will also briefly compare the pros and cons of FineReader with those of the open-source OCR package Tesseract, and show how you can run Tesseract on the Savio high-performance computing cluster for large-scale OCR jobs.

Prior knowledge: No prior knowledge is required for this workshop. Register if you have any interest in learning more about OCR tools and resources.

Technology requirement: None. This workshop will demonstrate realistic applications of OCR software.

Details
Training Host: 
D-Lab Facilitator: Evan Muzzall