Welcome to the documentation for the Terrier IR platform v4.2. If you are a new user, we recommend that you begin with one of the quickstart guides listed below. The quickstart guides introduce the core concepts of using Terrier in different scenarios. If you are looking for information about a particular function or component of Terrier, scroll down this page to the main Table of Contents.
This quickstart guide is designed for information retrieval students and researchers looking to use Terrier to experiment with or learn about some aspect of a search engine. The main learning outcomes are: how to download and install a local copy of the Terrier platform; how to produce an on-disk index from a collection of documents; and how to issue single queries as well as batches of queries over that index from the command line.
This quickstart guide is for software developers who want to use Terrier as a search engine within their own application. The guide covers how to import Terrier as an application dependency using Maven, how to create an index, how to index files within your Java program, and how to issue queries to the index. A variant of the quickstart shows the same steps using exclusively in-memory data structures.
An overview of what the Terrier platform is, and what it can be used for.
What has changed in the Terrier platform in recent releases.
An overview of the main components of Terrier.
A description of the query language that Terrier supports.
Future Features & Known Issues
Upcoming features in future releases.
Running Batch IR Experiments with Terrier
A quickstart guide designed for information retrieval students and researchers.
Integrating Terrier as a Search Engine into your Java Application with a persistent index
Integrating Terrier as a Search Engine into your Java Application, with a memory index
Quickstart guides for software developers.
A brief introduction to the configuration of Terrier
A guide to indexing, and how it can be configured to suit your needs.
A guide to the retrieval functionality, covering frequently used retrieval models such as TF-IDF, Okapi’s BM25, language models (Hiemstra and Ponte & Croft) and weighting models from the probabilistic Divergence From Randomness (DFR) framework, as well as query expansion (pseudo-relevance feedback).
Configuring Real-time Index Structures
An introduction to the real-time index structures in Terrier.
Learning to Rank with Terrier
A guide to using multiple retrieval features with learning to rank techniques to enhance search effectiveness.
A guide to configuring byte-level compression schemes to reduce the size of Terrier’s index structures.
Non-English language support
A description of Terrier’s support for indexing and retrieving documents written in languages other than English.
Hadoop MapReduce Indexing with Terrier
A guide to using the Hadoop MapReduce indexer in Terrier.
A guide to configuring Terrier to use a Hadoop cluster.
A guide to using the Web-based application of Terrier.
Website Search Application
A guide to using the website search application, which illustrates real-time crawling, indexing and retrieval functionalities in Terrier.
A summary of the Desktop Search application of Terrier, available from GitHub.
TREC Experiment Examples
An example of how to create an index and produce a TREC run on the WT2G and Blogs06 collections.
Evaluation of Experiments
Shows how the results of experiments can be evaluated using the in-built evaluation package in Terrier.
Developing with Terrier
Introduction to developing applications using Terrier.
In-depth guide to extending indexing
More information about the roles of various classes in the indexing process.
In-depth guide to retrieval, covering how various retrieval functionalities can be integrated into Terrier, as well as how you can use Terrier to obtain various statistics about the terms and the collection.
Terrier API Javadoc
API documentation for each class in Terrier.
Description of DFR
Description of the Divergence From Randomness framework that Terrier implements.
The Terrier discussion forum is for developers and users of the Terrier platform to discuss the software, ask questions, post patches and share tips.
Hints, tips, and configurations for various well-known corpora.
If you use Terrier in your research, please cite us!