Sensors, Vol. 24, Pages 3892: A Pix2Pix Architecture for Complete Offline Handwritten Text Normalization



Sensors doi: 10.3390/s24123892

Authors: Alvaro Barreiro-Garrido, Victoria Ruiz-Parrado, A. Belen Moreno, Jose F. Velez

In offline handwritten text recognition, numerous normalization algorithms have been developed over the years as preprocessing steps applied to scanned images of handwritten text before automatic recognition. These algorithms have proven effective in improving the overall performance of recognition architectures. However, many of them rely heavily on heuristic strategies that are not integrated with the recognition architecture itself. This paper introduces the use of a trainable Pix2Pix model, a specific type of conditional generative adversarial network, to normalize handwritten text images. The model can be integrated as the initial stage of any deep learning architecture designed for handwriting recognition, which allows the normalization and recognition components to be trained as a unified whole while retaining some interpretability of each module. The proposed normalization approach learns from a blend of heuristic transformations applied to text images, aiming to mitigate the impact of intra-personal handwriting variability among different writers. As a result, it achieves slope and slant normalization, alongside other conventional preprocessing objectives such as normalizing the size of text ascenders and descenders. We demonstrate that the proposed architecture replicates, and in certain cases surpasses, the results of a widely used heuristic algorithm, both across two metrics and when integrated as the first step of a deep recognition architecture.
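To make "slant normalization" concrete, the following is a minimal sketch of one common heuristic of the kind a learned normalizer could be trained to imitate: shear the image by several candidate amounts and keep the shear that makes the column ink counts most concentrated. It is not the authors' method or the heuristic algorithm they compare against; the function names (`shear_image`, `column_score`, `estimate_shear`), the scoring rule, and the toy image are all assumptions chosen for illustration.

```python
def shear_image(image, shear):
    """Shift each row left by shear * (its height above the bottom row).

    `image` is a list of equal-length rows of 0/1 values (1 = ink).
    Pixels shifted outside the canvas are dropped.
    """
    height = len(image)
    width = len(image[0])
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(image):
        offset = round(shear * (height - 1 - y))
        for x, pixel in enumerate(row):
            new_x = x - offset
            if pixel and 0 <= new_x < width:
                out[y][new_x] = 1
    return out


def column_score(image):
    """Sum of squared column ink counts (hypothetical uprightness score)."""
    return sum(sum(col) ** 2 for col in zip(*image))


def estimate_shear(image, candidates):
    """Return the candidate shear whose sheared image scores highest."""
    return max(candidates, key=lambda s: column_score(shear_image(image, s)))


# Toy 4x6 stroke leaning to the right (illustrative data only).
slanted = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
best = estimate_shear(slanted, [-1.0, -0.5, 0.0, 0.5, 1.0])
upright = shear_image(slanted, best)
```

In this toy case the diagonal stroke collapses into a single vertical column. A heuristic like this has to be hand-tuned (candidate range, scoring rule), which is the kind of dependency the abstract argues a trainable image-to-image model can absorb.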
