Documentation for Terrier v4.2

Welcome to the documentation for the Terrier IR platform v4.2. If you are a new user, we recommend that you begin with a quickstart guide from those listed below. The quickstart guides will introduce you to core concepts when using Terrier within different scenarios. If you are looking to find out about a particular function or component of Terrier, scroll down this page to the main Table of Contents.

Quickstart Guides

Running Batch IR Experiments with Terrier

This quickstart guide is designed for information retrieval students and researchers looking to use Terrier to experiment with or learn about some aspect of a search engine. The main learning outcomes are: how to download and install a local copy of the Terrier platform; how to produce an on-disk index from a collection of documents; and how to issue single queries as well as batches of queries over that index from the command line.

Integrating Terrier as a Search Engine into your Java Application

This quickstart guide is for software developers who want to use Terrier as a search engine within their own application. The guide covers how to import Terrier as an application dependency using Maven, how to create an index, how to index files within your Java program, and how to issue queries to the index. A variant of the quickstart shows the same steps using exclusively in-memory data structures.
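To make the index-then-query workflow concrete before you open the quickstart, here is a minimal, self-contained sketch of an in-memory inverted index in plain Java. It is an illustration only: `ToyIndex`, `indexDocument` and `search` are hypothetical names invented for this example and are not part of the Terrier API, and the scoring is a simple TF-IDF sum rather than any of the weighting models that Terrier provides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory inverted index; illustrative only, NOT a Terrier class. */
class ToyIndex {
    // term -> (docno -> term frequency in that document)
    private final Map<String, Map<String, Integer>> postings = new HashMap<>();
    private int numDocs = 0;

    /** Lowercases the text, splits on non-alphanumeric characters, and records term frequencies. */
    void indexDocument(String docno, String text) {
        numDocs++;
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(token, t -> new HashMap<>())
                    .merge(docno, 1, Integer::sum);
        }
    }

    /** Returns docnos ranked by a simple TF-IDF sum over the query terms (ties broken by docno). */
    List<String> search(String query) {
        Map<String, Double> scores = new HashMap<>();
        for (String term : query.toLowerCase().split("[^a-z0-9]+")) {
            Map<String, Integer> docs = postings.get(term);
            if (docs == null) continue; // term not in the index
            double idf = Math.log(1.0 + (double) numDocs / docs.size());
            for (Map.Entry<String, Integer> e : docs.entrySet()) {
                scores.merge(e.getKey(), e.getValue() * idf, Double::sum);
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ranked;
    }
}
```

Indexing a few short documents with `indexDocument` and then calling `search("retrieval platform")` returns the matching document identifiers, best match first. The quickstart guide shows how to do the equivalent with Terrier's own indexing and querying classes.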

Table of Contents

Platform Information

An overview of what the Terrier platform is, and what it can be used for.

What's New
What has changed in the Terrier platform in recent releases.

Terrier Components
An overview of the main components of Terrier.

Query Language
A description of the query language that Terrier supports.

Future Features & Known Issues
Features planned for future releases, and known issues in the current release.

Quickstart Guides

Running Batch IR Experiments with Terrier
A quickstart guide designed for information retrieval students and researchers.

Integrating Terrier as a Search Engine into your Java Application, with a persistent index
Integrating Terrier as a Search Engine into your Java Application, with a memory index
Quickstart guides for software developers.

Common Configuration Options

Configuring Terrier
A brief introduction to the configuration of Terrier.

Configuring Indexing
A guide to indexing, and how it can be configured to your needs.

Configuring Retrieval
A guide to the retrieval functionality of Terrier, covering frequently used retrieval methods such as TF-IDF, Okapi’s BM25, language models (Hiemstra and Ponte & Croft), and weighting models from the probabilistic Divergence From Randomness (DFR) framework, as well as query expansion (pseudo-relevance feedback).

Configuring Real-time Index Structures
An introduction to the real-time index structures in Terrier.

Advanced Functionality

Learning to Rank with Terrier
A guide to using multiple retrieval features with learning to rank techniques to enhance search effectiveness.

Pluggable Compression
A guide to configuring byte-level compression schemes to reduce the size of Terrier’s index structures.

Non-English language support
A description of the support in Terrier for indexing and retrieving documents written in languages other than English.

Hadoop MapReduce Indexing with Terrier
A guide to using the Hadoop MapReduce indexer in Terrier.

Terrier/Hadoop Configuration
A guide to configuring Terrier to use a Hadoop cluster.

Search Applications

Web-based Terrier
A guide to using the Web-based application of Terrier.

Website Search Application
A guide to using the website search application, which illustrates real-time crawling, indexing and retrieval functionalities in Terrier.

Desktop Search
A summary of the Desktop Search application of Terrier, available from GitHub.

Experiment Support

TREC Experiment Examples
An example of how to create an index and produce a TREC run on the WT2G and Blogs06 collections.

Evaluation of Experiments
Shows how the results of experiments can be evaluated using the in-built evaluation package in Terrier.

Extending Terrier

Developing with Terrier
Introduction to developing applications using Terrier.

Extending Indexing
An in-depth guide to extending indexing.

Indexer Details
More information about the roles of various classes in the indexing process.

Extending Retrieval
An in-depth guide to retrieval, covering how various retrieval functionalities can be integrated into Terrier and how you can use Terrier to obtain statistics about the terms and the collection.

Other Resources

Terrier API Javadoc
API documentation of each class in Terrier.

Description of DFR
Description of the Divergence From Randomness framework that Terrier implements.

Terrier Forum
The Terrier discussion forum is for developers and users of the Terrier platform to discuss the software, ask questions, post patches and share tips.

Terrier Wiki
Hints and tips, and configurations for various well-known corpora.

If you use Terrier in your research, please cite us!

Terrier Contacts

Contact: School of Computing Science
Copyright (C) 2004-2016 University of Glasgow. All Rights Reserved.