Parameter | Value(s) | Error rate (%), training set / validation set | Number of passes | Approx. average time per epoch (minutes) |
---|---|---|---|---|
Learning rate | \(1\times 10^{-3}\) | 0.96/1.98 | 332 | 36 |
Learning rate | \(1\times 10^{-4}\) | 0.85/1.83 | 227 | 35 |
Learning rate | \(1\times 10^{-5}\) | 99.508/99.65 | 398 (experiment was terminated) | 40 |
Learning rate | \(1\times 10^{-6}\) | 98.67/98.86 | 403 (experiment was terminated) | 40 |
Sub-sampling | 6 and 20 | 0.85/1.83 | 227 | 35 |
Sub-sampling | 6 and 40 | 1.73/3.93 | 256 | 40 |
Sub-sampling | 12 and 40 | 2.14/3.64 | 307 | 30 |
Sub-sampling | 24 and 80 | 0.8/4.47 | 289 | 55 |
Hidden layer sizes | 2, 4 and 20 | 25.88/25.69 | 251 | 36 |
Hidden layer sizes | 4, 10 and 30 | 13.46/19.20 | 256 | 45 |
Hidden layer sizes | 2, 10 and 50 | 0.85/1.83 | 227 | 35 |
Hidden layer sizes | 4, 20 and 100 | 0.82/1.80 | 236 | 75 |
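
As a reading aid, the table can be summarised programmatically. The following is a minimal sketch, not part of the original experiments: it transcribes the error rates from the table and, for each parameter, picks the setting with the lowest validation error and reports the gap between validation and training error. The names `rows`, `best_by_validation` and `generalization_gap` are hypothetical, and the learning-rate labels (`1e-3` and so on) are shorthand for the values in the table.

```python
# Rows transcribed from the table above:
# (parameter, setting, training error %, validation error %)
rows = [
    ("Learning rate", "1e-3", 0.96, 1.98),
    ("Learning rate", "1e-4", 0.85, 1.83),
    ("Learning rate", "1e-5", 99.508, 99.65),
    ("Learning rate", "1e-6", 98.67, 98.86),
    ("Sub-sampling", "6 and 20", 0.85, 1.83),
    ("Sub-sampling", "6 and 40", 1.73, 3.93),
    ("Sub-sampling", "12 and 40", 2.14, 3.64),
    ("Sub-sampling", "24 and 80", 0.8, 4.47),
    ("Hidden layer sizes", "2, 4 and 20", 25.88, 25.69),
    ("Hidden layer sizes", "4, 10 and 30", 13.46, 19.20),
    ("Hidden layer sizes", "2, 10 and 50", 0.85, 1.83),
    ("Hidden layer sizes", "4, 20 and 100", 0.82, 1.80),
]


def best_by_validation(rows):
    """Return, per parameter, the setting with the lowest validation error."""
    best = {}
    for param, setting, train, val in rows:
        if param not in best or val < best[param][2]:
            best[param] = (setting, train, val)
    return best


def generalization_gap(train, val):
    """Validation error minus training error, in percentage points."""
    return round(val - train, 2)


best = best_by_validation(rows)
for param, (setting, train, val) in best.items():
    gap = generalization_gap(train, val)
    print(f"{param}: {setting} (validation {val}%, gap {gap} points)")
```

Ranking by validation error alone ignores cost: the largest hidden-layer setting has a marginally lower validation error than "2, 10 and 50" but roughly doubles the time per epoch reported in the table.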