
OCR for Canvas (for faculty)

Overview

OCR for Canvas is a collaborative effort among departmental administrative assistants, Collins Library, Student Accessibility and Accommodation, and faculty to ensure the accessibility of course readings made available to our students via Canvas. OCR (optical character recognition) in documents supports the principles of universal design for learning. Running scanned documents through an OCR application is critical for our students who use screen readers or text-to-speech software, and it makes texts searchable for all students.

Link to OCR for Canvas Request Form

[Please make sure you are logged in to the University of Puget Sound Google Suite.]

Please submit all OCR for Canvas requests via the form.

What happens after I submit my request?

After the faculty member and/or departmental administrative assistant submits the request, the following occurs:

  • Library staff will quickly check library holdings. 
  • If the material is already available electronically via one of the library's databases--nearly all of which already provide high-quality OCR--we will send you the permalink.
    • It is best practice to provide the permalink rather than upload the PDF to Canvas.  
  • If the library owns only the print format of the material, we will scan the material, within the limits of copyright law, run it through an OCR application, and perform a quality check, including making corrections as needed. 
    • You will receive a link to a Google Drive folder containing the final PDF(s).
    • Please make sure that you are logged in to the Puget Sound Google Suite in order to access the materials.
    • Please download the PDFs and then upload them to your Canvas course site, where access is limited to enrolled students, faculty, and instructional staff.  
    • Copyrighted materials should not be posted to the open web or shared via email. 
  • If the library does not own the material, we will quickly inform the faculty member and/or administrative assistant and the appropriate Liaison Librarian will be consulted to help fulfill the request.
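
For readers who find a decision table easier to follow, the three outcomes above can be sketched in a few lines of Python. This is only an illustration of the steps as described on this page, not software the library runs; the names `Holding` and `next_step` are hypothetical.

```python
from enum import Enum


class Holding(Enum):
    """Possible results of the holdings check (hypothetical labels)."""
    ELECTRONIC = "electronic"   # available through a library database
    PRINT_ONLY = "print_only"   # library owns only the print format
    NOT_OWNED = "not_owned"     # library does not own the material


def next_step(holding: Holding) -> str:
    """Return the outcome described above for a given holdings result."""
    if holding is Holding.ELECTRONIC:
        return "send permalink"
    if holding is Holding.PRINT_ONLY:
        return "scan, OCR, quality check, share Google Drive folder"
    return "notify requester and consult liaison librarian"


print(next_step(Holding.PRINT_ONLY))
```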

 

Frequently Asked Questions

What is the turnaround time for requests?

  • Most requests are completed within 2-4 days.
  • Requests may take longer to fulfill when:
    • there is a high volume of requests (usually at the beginning of the semester).
    • the library does not own the material.
  • We ordinarily will process requests in the order we receive them, but if there is a high volume of requests, we will use the needs-by dates to prioritize our work.
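
The ordering rule above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the library's actual queue: the `Request` fields, the `high_volume` flag, and the tie-break on received date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Request:
    title: str
    received: date    # date the form was submitted
    needed_by: date   # needs-by date entered by the requester


def processing_order(requests, high_volume=False):
    """Order requests as described above.

    Normally: first come, first served.
    High volume: earliest needs-by date first (ties broken by
    received date, which is an assumption of this sketch).
    """
    if high_volume:
        return sorted(requests, key=lambda r: (r.needed_by, r.received))
    return sorted(requests, key=lambda r: r.received)


queue = [
    Request("Chapter 3", date(2024, 1, 10), date(2024, 2, 1)),
    Request("Article A", date(2024, 1, 11), date(2024, 1, 15)),
]
print([r.title for r in processing_order(queue)])
print([r.title for r in processing_order(queue, high_volume=True)])
```

In this made-up example, the later-submitted article moves ahead of the chapter only when volume is high, because its needs-by date is sooner.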

What OCR application does Collins Library use?

We use ABBYY FineReader, which not only recognizes text characters but also accurately captures charts, tables, images, and other media. It recognizes 192 languages and can accurately capture multilingual documents.

What copyright concerns should I be aware of?

Copyright and fair use are complex issues.  Please see the library's Copyright Guide for detailed information, including helpful checklists for making decisions in an educational setting.  

What will an OCR'd PDF look and act like?

  • The text will be searchable (e.g., with Ctrl+F).
  • Two-page spreads will be split and straightened.
    • Non-relevant pages will be deleted (e.g., the last page of the preceding chapter that falls on the page facing the start of the relevant chapter).
    • If text runs across a full page spread, we will keep that original format.
  • Pages may be cropped to make for a cleaner image.
  • Pictures, handwriting, maps, graphs, and other visual elements will be tagged as images, not text.
    • If you provide alt text describing those items, we can insert it into the document.
    • Handwritten notes, annotations, and other markings will be removed if possible so that the most accessible version of the document can be presented to students. If you'd prefer we leave your annotations, that is an option in the form.

What does the OCR process in ABBYY look like?

During import into ABBYY, pages are automatically split and recognized.

 

OCR staff then process the images: pages are cropped, backgrounds are whitened (high contrast is best for accessibility), and stray marks are removed.

This is another example of removal of markings and annotations.

 

 

After the images are edited, the pages are re-recognized. During verification, OCR staff confirm that each flagged word was recognized correctly or fix the error; anything highlighted in blue is a potential error that staff must review. The document is then saved as a searchable PDF. The final product can be seen below!

Contact

Please send questions about any aspect of the OCR for Canvas process to OCR@pugetsound.libanswers.com. A member of the OCR for Canvas team in Collins Library will respond as quickly as possible.