Abstract

As part of the Department of Energy document declassification program, we have developed a numerical rating system to predict the OCR error rate that we expect to encounter when processing a particular document. The rating algorithm produces a quality vector containing scores for different document image attributes, such as speckle and touching characters. The predicted OCR error rate for a document is computed as a weighted sum of the elements of its quality vector. The predicted OCR error rate will be used to screen out documents that existing document processing products would not handle properly.
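The weighted-sum prediction described above can be illustrated with a minimal sketch. Only the attributes "speckle" and "touching characters" come from the abstract; the function names, the weight values, the score values, and the screening threshold below are hypothetical placeholders, not the values or the implementation used in the report.

```python
from typing import Mapping


def predict_ocr_error_rate(quality: Mapping[str, float],
                           weights: Mapping[str, float]) -> float:
    """Return the weighted sum of the quality-vector elements.

    `quality` maps attribute names to scores; `weights` maps the same
    names to weights. Both are illustrative assumptions.
    """
    missing = set(weights) - set(quality)
    if missing:
        raise KeyError(f"quality vector lacks attributes: {sorted(missing)}")
    return sum(weights[name] * quality[name] for name in weights)


def should_screen_out(predicted_error_rate: float, threshold: float) -> bool:
    """Flag a document whose predicted error rate exceeds a chosen threshold."""
    return predicted_error_rate > threshold


# Hypothetical scores and weights, chosen only to show the arithmetic.
example_quality = {"speckle": 0.5, "touching_characters": 0.25}
example_weights = {"speckle": 0.2, "touching_characters": 0.4}

predicted = predict_ocr_error_rate(example_quality, example_weights)
flagged = should_screen_out(predicted, threshold=0.15)
```

With these placeholder numbers the prediction is 0.2 × 0.5 + 0.4 × 0.25, which exceeds the assumed threshold of 0.15, so the document would be flagged for screening. In practice the weights would have to be fitted against measured OCR error rates; the abstract does not say how.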

M. Cannon, P. Kelly, S. Sitharama Iyengar and Nathan Brener. An automated system for numerically rating document image quality. In SPIE Vol. 3027 Document Recognition IV, pp. 161-167, 1997. Los Alamos National Laboratory Technical Report LA-UR-97-0214.