Example of using Terrier to index a TREC collection

Below, we give an example of using Terrier to index WT2G, a standard TREC test collection, and then to run and evaluate the topics against the resulting index. We assume that the operating system is Linux, and that the collection, along with the topics and the relevance assessments, is stored in the directory /local/ir.collections2/WT2G/.

#go to the terrier folder
cd terrier

#set up terrier for using a trec collection
bin/trec_setup.sh /local/ir.collections2/WT2G/

#rebuild the collection.spec file correctly
find /local/ir.collections2/WT2G/ -type f | grep -v info > etc/collection.spec
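
Note that grep -v info drops every path that contains the string "info" anywhere, not only the files under the info/ directory. The sketch below reproduces the same filter on a throwaway directory, so the effect can be checked before indexing. The directory layout and file names in it are made up for illustration.

```shell
# Build a throwaway directory that mimics the assumed layout
# (all names below are hypothetical, not real WT2G file names)
demo=$(mktemp -d)
mkdir -p "$demo/Wt01" "$demo/info"
touch "$demo/Wt01/B01.gz" "$demo/Wt01/B02.gz" "$demo/info/topics.gz"

# Same filter as above, run on relative paths so that the temporary
# directory name itself cannot match "info"
spec_count=$(cd "$demo" && find . -type f | grep -v info | wc -l | tr -d ' ')
echo "files listed: $spec_count"

# Clean up the throwaway directory
rm -rf "$demo"
```

On the real collection, wc -l etc/collection.spec gives the corresponding count of files that will be indexed.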

#use In_expB2 DFR model for querying
echo uk.ac.gla.terrier.matching.models.In_expB2 > etc/trec.models

#use this file for the topics
echo /local/ir.collections2/WT2G/info/topics.401-450.gz >> etc/trec.topics.list

#use this file for query relevance assessments
echo /local/ir.collections2/WT2G/info/qrels.trec8.small_web.gz >> etc/trec.qrels
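
A mistyped path in one of these list files is easy to miss, so it can help to confirm that every listed file is readable before indexing. The following is a minimal sketch of such a check. It runs against a temporary list file with made-up entries rather than the real etc/trec.topics.list or etc/trec.qrels.

```shell
# Hypothetical list file standing in for etc/trec.topics.list
work=$(mktemp -d)
touch "$work/topics.gz"
printf '%s\n' "$work/topics.gz" "$work/missing.gz" > "$work/list"

# Report every listed path that is not a readable file
missing=0
while IFS= read -r path; do
  if [ ! -r "$path" ]; then
    echo "not found: $path"
    missing=$((missing + 1))
  fi
done < "$work/list"
echo "missing entries: $missing"

# Clean up the temporary files
rm -rf "$work"
```

To check the real configuration, point the loop at etc/trec.topics.list or etc/trec.qrels instead of the temporary list.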

#index the collection
bin/trec_terrier.sh -i

#add the language modelling indices
bin/trec_terrier.sh -i -l

#run the topics, with suggested c value 10.99 (see weighting_models.txt)
bin/trec_terrier.sh -r -c 10.99
#run topics again with query expansion enabled
bin/trec_terrier.sh -r -q -c 10.99
#run topics again, using language modelling instead of statistical models
bin/trec_terrier.sh -r -l

#evaluate the results in var/results/
bin/trec_terrier.sh -e

#display the Mean Average Precision
tail -n 1 var/results/*.eval
#MAP should be 
#PL2 Average Precision: 0.3140
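
Because several runs were performed above, var/results/ will usually hold more than one .eval file. A small loop can print the last line of each file next to its name. This is only a sketch: the two files it creates are placeholders with invented contents, not real Terrier output, and it assumes the summary figure of interest is on the last line of each file.

```shell
# Placeholder evaluation files standing in for var/results/*.eval
# (contents are invented for illustration)
results=$(mktemp -d)
printf 'placeholder line\nAverage Precision: 0.1000\n' > "$results/run_a.eval"
printf 'placeholder line\nAverage Precision: 0.2000\n' > "$results/run_b.eval"

# Print "<file name>: <last line>" for each evaluation file
summary=$(for f in "$results"/*.eval; do
  printf '%s: %s\n' "$(basename "$f")" "$(tail -n 1 "$f")"
done)
echo "$summary"

# Clean up the placeholder files
rm -rf "$results"
```

Replacing "$results" with var/results applies the same loop to the real evaluation files.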