Analysis of Convolutional Neural Networks for Document Image Classification

Tensmeyer, Chris; Martinez, Tony

Computer Science > Computer Vision and Pattern Recognition

arXiv:1708.03273 (cs)

[Submitted on 10 Aug 2017]

Abstract: Convolutional Neural Networks (CNNs) are state-of-the-art models for document image classification tasks. However, many of these approaches rely on parameters and architectures designed for classifying natural images, which differ from document images. We question whether this is appropriate and conduct a large empirical study to find what aspects of CNNs most affect performance on document images. Among other results, we exceed the state-of-the-art on the RVL-CDIP dataset by using shear transform data augmentation and an architecture designed for a larger input image. Additionally, we analyze the learned features and find evidence that CNNs trained on RVL-CDIP learn region-specific layout features.
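
The abstract names shear transform data augmentation without giving details, so the following is only a minimal sketch of what a horizontal shear on a grayscale page image can look like. The function names `shear_image` and `random_shear`, the `max_shear` range, the white fill value, and the nearest-neighbour sampling are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def shear_image(image, shear, fill=255):
    """Horizontally shear a 2-D grayscale image about its vertical centre.

    Output pixel (y, x) is read from input pixel (y, x - shear * (y - cy))
    with nearest-neighbour lookup. Pixels whose source falls outside the
    input are set to `fill` (white by default, an assumed page background).
    """
    height, width = image.shape
    ys, xs = np.mgrid[0:height, 0:width]
    cy = (height - 1) / 2.0
    # Source column for every output pixel; rows far from the centre shift more.
    src_x = np.rint(xs - shear * (ys - cy)).astype(int)
    valid = (src_x >= 0) & (src_x < width)
    out = np.full_like(image, fill)
    out[valid] = image[ys[valid], src_x[valid]]
    return out


def random_shear(image, max_shear=0.1, rng=None):
    """Apply a shear drawn uniformly from [-max_shear, max_shear].

    `max_shear` is a hypothetical hyperparameter, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    return shear_image(image, rng.uniform(-max_shear, max_shear))
```

In a training loop, a call such as `random_shear(page, max_shear=0.1)` would be applied to each image before it is fed to the network, so that the model sees a slightly different slant of the same page on every epoch.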

Comments:	Accepted ICDAR 2017
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1708.03273 [cs.CV]
	(or arXiv:1708.03273v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1708.03273

Submission history

From: Chris Tensmeyer
[v1] Thu, 10 Aug 2017 15:50:30 UTC (1,779 KB)
