
Table 3 Test collection characteristics

From: Econo-ESA in semantic text similarity

                                Document terms
Dataset    Docs.   Qys.  Rel.   Min.  Q1     Med.  Q3      Max.
LISA       6,004   35    335    11    68     96    128.25  352
NPL        11,429  93    2,083  3     25     39    58      293
CACM       3,204   64    796    3     10     23    108     455
CISI       1,460   112   3,114  13    97     137   186     676
Cranfield  1,400   225   1,838  1     113    165   241.25  738
Time       423     83    324    91    399    612   918     6,618
Medline    1,033   30    696    24    107    159   226     758
ADI        82      35    170    28    60.25  70.5  80      216

           Query terms
Dataset    Min.  Q1     Med.  Q3      Max.  Explanation
LISA       23    49.5   64    85      142   Abstracts collection
NPL        4     9      12    15      24    Short text
CACM       3     8.75   16    30      62    CACM articles index
CISI       4     20     72    122.75  335   Index of articles
Cranfield  6     12     16    21      43    Index of articles
Time       8     15     20    23.5    46    Short text
Medline    3     9.25   16.5  23.75   60    Medical text
ADI        4     8      13    21.5    57    Short articles