From: Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks
Sets             Text-lines    Characters
Train set        6800          6,35,107
Validation set   1600          1,62,513
Test set         1600          1,73,029
Total            10,000        9,70,649