[TR-239] Clarify when Terrier Query language can be used viz TREC Created: 10/Jan/14  Updated: 01/Apr/14  Resolved: 01/Apr/14

Status: Resolved
Project: Terrier Core
Component/s: .querying
Affects Version/s: 3.5
Fix Version/s: 3.6

Type: Bug Priority: Major
Reporter: chedi bechikh Assignee: Craig Macdonald
Resolution: Fixed  
Labels: None


I have indexed 2 documents that each contain 4 words, using block indexing.

first doc: english football player dance
second doc: football player dance english

I have a phrase query "english football", but the results for the two documents are the same as for the plain query english football. Is there something I am missing in order to use phrases?

I also tested english football^2.3; the result is the same.

What is the problem?


Comment by chedi bechikh [ 12/Jan/14 ]

I fixed it. The problem is that the documentation is incomplete: please note in the documentation that, to use the query language, we must use a single line query.
Best regards

Comment by Craig Macdonald [ 04/Mar/14 ]

Documentation fix is required

Comment by Richard McCreadie [ 06/Mar/14 ]

I think this is a legacy of the Tokenisation change in 3.5.

SingleLineTRECQuery uses the new Tokeniser.getTokeniser() method, which defaults to EnglishTokenizer (which does not keep query language characters).
TRECQuery uses TRECFullTokenizer, which keeps all characters.

The tokeniser should be set to one that supports the query language.
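
To illustrate the tokenisation effect described above, here is a minimal, self-contained sketch. It does not use any Terrier classes: the class QueryTokenDemo and both methods are hypothetical stand-ins, assumed only for illustration. One method discards every non-alphanumeric character, the other splits on whitespace alone. Under that assumption, a phrase query and a plain query become indistinguishable once the operator characters are dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Terrier.
public class QueryTokenDemo {

    /** Keeps only runs of letters and digits (lowercased); any other character acts as a separator. */
    static List<String> alphanumericOnly(String query) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : query.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Splits on whitespace only, so operator characters such as quotes and ^ survive. */
    static List<String> whitespaceOnly(String query) {
        List<String> tokens = new ArrayList<>();
        for (String t : query.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "\"english football\"";
        String plain = "english football";
        // Quotes are lost, so the phrase query looks identical to the plain one.
        System.out.println(alphanumericOnly(phrase).equals(alphanumericOnly(plain))); // true
        // Quotes are kept, so a later query parser could still see the phrase.
        System.out.println(whitespaceOnly(phrase).equals(whitespaceOnly(plain)));     // false
        // The weight operator is broken up into stray tokens.
        System.out.println(alphanumericOnly("english football^2.3")); // [english, football, 2, 3]
    }
}
```

This matches the symptom in the report only in spirit: if the operator characters never reach the query parser, the phrase and weighting syntax cannot have any effect on the ranking.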

Comment by Richard McCreadie [ 31/Mar/14 ]

Tagging for addition in configure_retrieval.html

Possible additions

  • Difference between TRECQuerying, TRECQuery and SingleLineQuery
  • Effect of Tokenisation
  • Query language

Comment by Richard McCreadie [ 01/Apr/14 ]

Updated query language documentation. Resolving issue.
