Query Language

Terrier offers two query languages: a high-level, user-facing query language, and a low-level query language for developers, which is expressed in terms of matching operations (matching ops). All user queries are rewritten into matching operations. The matching op query language borrows from the Indri and Galago query languages.

User Query Language

Terrier offers a flexible user query language for searching with phrases or fields, or for specifying that terms are required to appear in the retrieved documents.

Some examples of Terrier's query language are the following:

Combinations of the different constructs are possible as well. For example, the query term1 term2 -"term1 term2" would retrieve all the documents that contain at least one of the terms term1 and term2, but not the documents where the phrase "term1 term2" appears.
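To make the semantics of the combined query concrete, the following is a toy sketch in Python. It is not how Terrier evaluates queries; the function name matches, the whitespace tokenisation and the four sample documents are all assumptions made purely for illustration.

```python
# Toy illustration only: this is NOT Terrier's matching implementation.
def matches(doc_text, any_terms, excluded_phrase):
    """Return True if the document contains at least one of any_terms
    and does not contain excluded_phrase as consecutive tokens."""
    tokens = doc_text.lower().split()
    phrase = excluded_phrase.lower().split()
    has_term = any(t in tokens for t in any_terms)
    has_phrase = any(
        tokens[i:i + len(phrase)] == phrase
        for i in range(len(tokens) - len(phrase) + 1)
    )
    return has_term and not has_phrase


# Hypothetical documents, keyed by a made-up document id.
docs = {
    "d1": "term1 appears alone here",
    "d2": "here term1 term2 appear as a phrase",
    "d3": "term2 comes before term1",
    "d4": "neither appears",
}

# Emulates the query: term1 term2 -"term1 term2"
retrieved = [doc_id for doc_id, text in docs.items()
             if matches(text, ["term1", "term2"], "term1 term2")]
print(retrieved)
```

In this sketch, d2 is dropped because it contains the phrase, and d4 is dropped because it contains neither term.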

Note that in some configurations, the Terrier query language may not be available by default. In particular, when batch processing queries from a file using a class that extends TRECQuery, the queries are pre-processed by a tokeniser that may remove the query language characters (e.g. brackets and colons). To use the Terrier query language in this case, use SingleLineTRECQuery and set SingleLineTRECQuery.tokenise to false in the terrier.properties file.
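The effect of such pre-processing can be illustrated with a deliberately naive tokeniser. This is a sketch under assumptions: naive_tokenise is a hypothetical stand-in and does not reproduce the tokeniser Terrier actually uses, and the sample query string is only illustrative.

```python
import re


def naive_tokenise(query):
    # Hypothetical stand-in for a topic tokeniser: keep only runs of
    # letters and digits, discarding every other character.
    return " ".join(re.findall(r"[A-Za-z0-9]+", query))


# Illustrative query using a field, a required marker and a phrase.
raw = 'title:retrieval +"query language"'
tokenised = naive_tokenise(raw)
print(tokenised)
```

Once the colon, plus sign and quotes are gone, the remaining text is just a bag of terms, which is why tokenisation has to be disabled for the query language to take effect.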

Matching Op Query Language

In general, this follows a subset of the Indri query language:

Using the Matching Op Query Language

You can use the matching op query language with the interactive querying command by passing the -m option. The prompt will be matchop query>, as shown in the example below:

$ bin/terrier interactive -m
Setting TERRIER_HOME to /home/Terrier
23:33:14.496 [main] INFO o.t.structures.CompressingMetaIndex - Structure meta reading lookup file into memory
23:33:14.503 [main] INFO o.t.structures.CompressingMetaIndex - Structure meta loading data file into memory
matchop query> #combine:0=0.85:1=0.15:2=0.05(#combine(dramatise personae) #1(dramatise personae) #uw8(dramatise personae))
etc
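Queries with this shape are tedious to type by hand, so they are often generated. The sketch below builds the same string as the example session from a list of terms. The function name proximity_query and its parameters are hypothetical, and the default weights and window size are simply copied from the example rather than recommended values.

```python
def proximity_query(terms, weights=(0.85, 0.15, 0.05), window=8):
    """Build a weighted #combine over three sub-queries: the plain terms,
    the exact phrase (#1) and an unordered window (#uwN)."""
    joined = " ".join(terms)
    # Weights are written as index=value pairs, e.g. 0=0.85:1=0.15:2=0.05
    weight_spec = ":".join(f"{i}={w}" for i, w in enumerate(weights))
    return (
        f"#combine:{weight_spec}("
        f"#combine({joined}) #1({joined}) #uw{window}({joined}))"
    )


query = proximity_query(["dramatise", "personae"])
print(query)
```

The resulting string can be pasted at the matchop query> prompt.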

Similarly, the batchretrieve command also takes a -m option, in which case the queries are assumed to be in the matching op query language.

$ cat mytopics
1 terrier #1(information retrieval)
2 systems
$ bin/terrier batchretrieve -s -m -t mytopics

Here, -m specifies that the matching op query language is used, and -s specifies that the topics are in single-line format.
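A topics file in this single-line layout can be produced programmatically. The following is a minimal sketch, assuming each line is a query id, a space, then the query text, as in the mytopics listing; the helper name write_single_line_topics and the temporary directory are assumptions for illustration.

```python
import os
import tempfile


def write_single_line_topics(path, queries):
    # One topic per line: query id, a single space, then the query text.
    with open(path, "w", encoding="utf-8") as f:
        for qid, query in queries:
            f.write(f"{qid} {query}\n")


topics = [(1, "terrier #1(information retrieval)"), (2, "systems")]
topics_path = os.path.join(tempfile.mkdtemp(), "mytopics")
write_single_line_topics(topics_path, topics)

# Read the file back to show what would be passed with -t.
with open(topics_path, encoding="utf-8") as f:
    content = f.read()
print(content, end="")
```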


Webpage: http://terrier.org
Contact: School of Computing Science
Copyright (C) 2004-2018 University of Glasgow. All Rights Reserved.