SpringerPlus

Table 1 The UPTI dataset splits used in this work

From: Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks

Sets	Text-lines	Characters
Train set	6,800	635,107
Validation set	1,600	162,513
Test set	1,600	173,029
Total	10,000	970,649

The total number of text-lines and characters in the training, validation and test sets
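As a quick consistency check on the table, the following minimal Python sketch recomputes the totals and the share of text-lines in each split. The counts are copied from Table 1; the names `splits` and `summarize` are hypothetical and chosen only for this illustration, not taken from the article.

```python
# Split sizes copied from Table 1: (text-lines, characters).
splits = {
    "Train set": (6800, 635107),
    "Validation set": (1600, 162513),
    "Test set": (1600, 173029),
}


def summarize(splits):
    """Return total text-lines, total characters, and each split's share of text-lines."""
    total_lines = sum(lines for lines, _ in splits.values())
    total_chars = sum(chars for _, chars in splits.values())
    shares = {name: lines / total_lines for name, (lines, _) in splits.items()}
    return total_lines, total_chars, shares


total_lines, total_chars, shares = summarize(splits)

for name, (lines, chars) in splits.items():
    # Average characters per text-line is derived here, not reported in the table.
    print(f"{name}: {shares[name]:.0%} of text-lines, {chars / lines:.1f} characters per line")
print(f"Total: {total_lines} text-lines, {total_chars} characters")
```

The recomputed totals match the "Total" row (10,000 text-lines and 970,649 characters), and the text-lines are divided 68% / 16% / 16% across the training, validation and test sets.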
