Table 1 The UPTI dataset splits used in this work

From: Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks

Sets             Text-lines   Characters
Train set        6800         6,35,107
Validation set   1600         1,62,513
Test set         1600         1,73,029
Total            10,000       9,70,649

  1. The total number of instances of each character in training, testing and validation sets
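
The split sizes in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and is not part of the original work: the helper `to_int` and the `splits` dictionary are hypothetical names, and the figures are copied from the table as printed. It confirms that the three splits add up to the reported totals and derives each split's share of the text-lines.

```python
# Figures copied from Table 1. The character counts use lakh-style
# digit grouping (e.g. "6,35,107"), so commas are stripped before parsing.
def to_int(text: str) -> int:
    """Parse a comma-grouped number regardless of grouping style."""
    return int(text.replace(",", ""))

# Hypothetical container: split name -> (text-lines, characters)
splits = {
    "Train set": ("6800", "6,35,107"),
    "Validation set": ("1600", "1,62,513"),
    "Test set": ("1600", "1,73,029"),
}
reported_total = ("10,000", "9,70,649")

line_total = sum(to_int(lines) for lines, _ in splits.values())
char_total = sum(to_int(chars) for _, chars in splits.values())

# Both column sums should match the "Total" row of the table.
assert line_total == to_int(reported_total[0])
assert char_total == to_int(reported_total[1])

# Fraction of all text-lines that falls in each split.
line_share = {
    name: to_int(lines) / line_total
    for name, (lines, _) in splits.items()
}

for name, share in line_share.items():
    print(f"{name}: {share:.0%} of text-lines")
```

Under these figures the text-lines divide 68% / 16% / 16% across the train, validation and test sets.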