A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts
Date
2014
Abstract
Historical and artistic handwritten books are valuable cultural heritage (CH) items, as they provide information about tangible and intangible cultural aspects of the past. Massive digitization projects have made this kind of data available to a worldwide population, and pose real challenges for automatic processing. In this scenario, document layout analysis plays a significant role, being a fundamental step of any document image understanding system. In this paper, we present a completely automatic algorithm to perform a robust text segmentation of old handwritten manuscripts on a per-book basis, and we show how to exploit this outcome to find two layout elements, i.e., text blocks and text lines. Our proposed technique has been evaluated on a large and heterogeneous corpus, and our experimental results demonstrate that this approach is efficient and reliable, even when applied to very noisy and damaged books.
BibTeX
@inproceedings {10.2312:gch.20141302,
booktitle = {Eurographics Workshop on Graphics and Cultural Heritage},
editor = {Reinhard Klein and Pedro Santos},
title = {{A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts}},
author = {Pintus, Ruggero and Yang, Ying and Gobbetti, Enrico and Rushmeier, Holly},
year = {2014},
publisher = {The Eurographics Association},
ISSN = {2312-6124},
ISBN = {978-3-905674-63-7},
DOI = {10.2312/gch.20141302}
}