Terrier IR Platform
2.2.1

uk.ac.gla.terrier.applications
Class TRECQuerying

java.lang.Object
  extended by uk.ac.gla.terrier.applications.TRECQuerying
Direct Known Subclasses:
TRECLMQuerying, TRECQueryingExpansion

public class TRECQuerying
extends java.lang.Object

This class performs a batch mode retrieval from a set of TREC queries.

Configuring

Topics

Topics can be specified for this class in two different ways. Firstly, by placing the name(s) of the file(s) to be processed for topics in the file etc/trec.topics.list. Secondly, by specifying the name of the file to be processed for topics in the property trec.topics. If the trec.topics property exists, then only the specified file will be processed; otherwise, the class will attempt to read the filename(s) from the trec.topics.list file. The location of the trec.topics.list file can be altered from the default by setting the property of the same name.
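
The precedence described above can be sketched in plain Java. This is an illustration only, not Terrier's own code: the class name TopicFileResolver, the method resolve, and the skipping of blank lines in the list file are assumptions made for the example. The caller is assumed to have already read the lines of the trec.topics.list file (for instance with java.nio.file.Files.readAllLines).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper mirroring the documented precedence; not part of Terrier. */
final class TopicFileResolver {

    /**
     * @param props          configuration properties (may contain trec.topics)
     * @param topicListLines lines already read from the trec.topics.list file
     * @return the topic file name(s) to process, in order
     */
    static List<String> resolve(Properties props, List<String> topicListLines) {
        List<String> files = new ArrayList<>();
        String single = props.getProperty("trec.topics");
        if (single != null && !single.trim().isEmpty()) {
            // trec.topics is set: only this file is used, the list is ignored.
            files.add(single.trim());
            return files;
        }
        // Otherwise fall back to the names in the list file.
        // Assumption: blank lines are skipped.
        for (String line : topicListLines) {
            String name = line.trim();
            if (!name.isEmpty()) {
                files.add(name);
            }
        }
        return files;
    }
}
```

With trec.topics unset, every non-blank line of the list is returned; once the property is set, the list is not consulted at all.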

Models

If the trec.model property is specified, then all runs will be made using that weighting model. Otherwise, one run is done for each weighting model specified in the file etc/trec.models. The location of the trec.models file can be altered from the default by altering the property of the same name.

Result Files

The results from the system are output in a trec_eval compatible format. The results file is named WEIGHTINGMODELNAME_cCVALUE.RUNNO.res and is written to the var/results folder. RUNNO is (usually) a constantly increasing number, as specified by a file in the results folder. The location of the results folder can be altered by the trec.results property. If the property trec.querycounter.type is not set to sequential, RUNNO will be a string including the time and a randomly generated number. This is the better choice when many instances of Terrier are writing to the same results folder, as the incrementing RUNNO method is not multi-process safe (e.g. one Terrier could delete that file while another is reading it).
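
As a rough sketch of the naming pattern above, the following self-contained Java composes a file name from its parts. The class and method names are hypothetical, and several details are assumptions rather than Terrier's actual behaviour: how CVALUE is rendered (here via Java's default double formatting), and the separator and range used for the non-sequential run identifier.

```java
import java.util.Random;

/** Hypothetical illustration of the documented naming pattern; not part of Terrier. */
final class ResultFileNames {

    /** Builds WEIGHTINGMODELNAME_cCVALUE.RUNNO.res from its parts. */
    static String resultFileName(String modelName, double c, String runNo) {
        // Assumption: CVALUE is rendered with Double.toString.
        return modelName + "_c" + c + "." + runNo + ".res";
    }

    /**
     * A run identifier built from the time and a random number, as used when
     * trec.querycounter.type is not "sequential".
     * Assumption: the "-" separator and the 0..999 range are made up here.
     */
    static String nonSequentialRunNo(long timeMillis, Random rnd) {
        return timeMillis + "-" + rnd.nextInt(1000);
    }

    public static void main(String[] args) {
        System.out.println(resultFileName("PL2", 1.0, "3"));
        System.out.println(resultFileName("PL2", 1.0,
                nonSequentialRunNo(System.currentTimeMillis(), new Random())));
    }
}
```

Because the non-sequential identifier does not depend on a shared counter file, concurrent processes do not need to coordinate to obtain one.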

Properties

  • trec.topics.parser - the query parser that parses the topic file(s). TRECQuery by default. Subclass the TRECQuery class and alter this property if your topics come in a very different format to those of TREC.
  • trec.topics - the name of the topic file.
  • trec.topics.list - the name of the file containing the name(s) of the topic file(s).
  • trec.model - the name of the weighting model used during retrieval.
  • trec.models - the name of the file containing the name(s) of the weighting model(s), one run being made for each.
  • c - the term frequency normalisation parameter value. A value specified at runtime as an API parameter (e.g. TrecTerrier -c) overrides this property.
  • trec.matching - the name of the matching model that is used for retrieval. Defaults to Matching.
  • trec.manager - the name of the Manager that is used for retrieval. Defaults to Manager.
  • trec.results - the location of the results folder. Defaults to TERRIER_VAR/results/
  • trec.output.format.length - the maximum number of results output per query into the results file. Default value 1000.
  • trec.querycounter.type - how the number (RUNNO) at the end of a run file should be generated. Defaults to sequential, in which case RUNNO is a constantly increasing number. Otherwise it is a string including the time and a randomly generated number.
  • trec.iteration - the contents of the Iteration column in the trec_eval compatible results. Defaults to 0.
  • trec.querying.dump.settings - controls whether the settings used to generate a results file should be dumped to a .settings file in conjunction with the .res file. Defaults to true.
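
The defaults listed above can be illustrated with java.util.Properties. This is a sketch only: the QueryingSettings class and its field names are hypothetical and do not exist in Terrier, and "random" is just a made-up example of a value other than sequential.

```java
import java.util.Properties;

/** Hypothetical holder for a few documented properties and their defaults. */
final class QueryingSettings {
    final int maxResultsPerQuery;       // trec.output.format.length
    final String iteration;             // trec.iteration
    final boolean sequentialRunCounter; // trec.querycounter.type
    final boolean dumpSettings;         // trec.querying.dump.settings

    QueryingSettings(Properties p) {
        // getProperty(key, default) returns the default when the key is absent.
        maxResultsPerQuery =
                Integer.parseInt(p.getProperty("trec.output.format.length", "1000"));
        iteration = p.getProperty("trec.iteration", "0");
        sequentialRunCounter =
                "sequential".equals(p.getProperty("trec.querycounter.type", "sequential"));
        dumpSettings =
                Boolean.parseBoolean(p.getProperty("trec.querying.dump.settings", "true"));
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("trec.querycounter.type", "random"); // any value other than sequential
        QueryingSettings s = new QueryingSettings(p);
        System.out.println(s.maxResultsPerQuery + " " + s.sequentialRunCounter);
    }
}
```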

    Version:
    $Revision: 1.87 $
    Author:
    Gianni Amati, Vassilis Plachouras, Ben He, Craig Macdonald

    Constructor Summary
    TRECQuerying()
              TRECQuerying default constructor initialises the inverted index, the lexicon and the document index structures.
    TRECQuerying(Index i)
              TRECQuerying constructor initialises the specified inverted index, the lexicon and the document index structures.
     
    Method Summary
     void close()
              Closes the used structures.
     Index getIndex()
              Get the index pointer.
     Manager getManager()
              Get the querying manager.
     java.io.PrintWriter getResultFile(java.lang.String predefinedName)
              Returns a PrintWriter used to store the results.
     void printResults(java.io.PrintWriter pw, SearchRequest q)
              Prints the results for the given search request, using the specified destination.
     void printSettings(SearchRequest default_q, java.lang.String[] topicsFiles, java.lang.String otherComments)
              Prints the current settings to a file with the same name as the current results file.
     java.lang.String processQueries()
              Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism.
     java.lang.String processQueries(double c)
              Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism.
     java.lang.String processQueries(double c, boolean c_set)
              Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism.
     SearchRequest processQuery(java.lang.String queryId, java.lang.String query)
              According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.
     SearchRequest processQuery(java.lang.String queryId, java.lang.String query, double cParameter)
              According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.
     SearchRequest processQuery(java.lang.String queryId, java.lang.String query, double cParameter, boolean c_set)
              According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.
     void setIndex(Index i)
              Set the index pointer.
     
    Methods inherited from class java.lang.Object
    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Constructor Detail

    TRECQuerying

    public TRECQuerying()
    TRECQuerying default constructor initialises the inverted index, the lexicon and the document index structures.


    TRECQuerying

    public TRECQuerying(Index i)
    TRECQuerying constructor initialises the specified inverted index, the lexicon and the document index structures.

    Parameters:
    i - The specified index.
    Method Detail

    getIndex

    public Index getIndex()
    Get the index pointer.

    Returns:
    The index pointer.

    setIndex

    public void setIndex(Index i)
    Set the index pointer.

    Parameters:
    i - The index pointer.

    getManager

    public Manager getManager()
    Get the querying manager.

    Returns:
    The querying manager.

    close

    public void close()
    Closes the used structures.


    getResultFile

    public java.io.PrintWriter getResultFile(java.lang.String predefinedName)
    Returns a PrintWriter used to store the results.

    Parameters:
    predefinedName - java.lang.String a non-standard prefix for the result file.
    Returns:
    a handle used as a destination for storing results.

    processQuery

    public SearchRequest processQuery(java.lang.String queryId,
                                      java.lang.String query)
    According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.

    Parameters:
    queryId - the identifier of the query to process.
    query - the query to process.

    processQuery

    public SearchRequest processQuery(java.lang.String queryId,
                                      java.lang.String query,
                                      double cParameter)
    According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.

    Parameters:
    queryId - the identifier of the query to process.
    query - the query to process.
    cParameter - double the value of the parameter to use.

    processQuery

    public SearchRequest processQuery(java.lang.String queryId,
                                      java.lang.String query,
                                      double cParameter,
                                      boolean c_set)
    According to the given parameters, it sets up the correct matching class and performs retrieval for the given query.

    Parameters:
    queryId - the identifier of the query to process.
    query - the query to process.
    cParameter - double the value of the parameter to use.
    c_set - boolean specifies whether the parameter c is set.

    processQueries

    public java.lang.String processQueries()
    Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism. It parses the file with the queries (the name of the file is defined in the address_query file), creates the file of results, and for each query, gets the relevant documents, scores them, and outputs the results to the result file.

    Returns:
    String the filename that the results have been written to

    processQueries

    public java.lang.String processQueries(double c)
    Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism. It parses the file with the queries, creates the file of results, and for each query, gets the relevant documents, scores them, and outputs the results to the result file. It sets the term frequency normalisation parameter equal to the given value.

    Parameters:
    c - double the value of the term frequency parameter to use.
    Returns:
    String the filename that the results have been written to

    processQueries

    public java.lang.String processQueries(double c,
                                           boolean c_set)
    Performs the matching using the specified weighting model from the setup and possibly a combination of evidence mechanism. It parses the file with the queries, creates the file of results, and for each query, gets the relevant documents, scores them, and outputs the results to the result file.

    Queries
    Queries are parsed from a file. The filename can be given in the trec.topics property; otherwise the file named in the trec.topics.list property is read, and each file listed in it is used for queries.

    Parameters:
    c - the value of c.
    c_set - specifies whether a value for c has been specified.
    Returns:
    String the filename that the results have been written to
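
One plausible reading of the c and c_set parameters, combined with the statement in the Properties section that a runtime value overrides the c property, is sketched below. The CParameter class and effectiveC method are hypothetical and not part of Terrier; returning null when neither source supplies a value is an assumption of this example.

```java
import java.util.Properties;

/** Hypothetical sketch of choosing between an explicit c and the "c" property. */
final class CParameter {

    /**
     * @param c     value passed at runtime
     * @param cSet  whether the runtime value was actually specified
     * @param props configuration properties (may contain "c")
     * @return the runtime value when cSet is true, else the "c" property,
     *         else null (assumption: no value available)
     */
    static Double effectiveC(double c, boolean cSet, Properties props) {
        if (cSet) {
            return c; // the explicit value overrides the property
        }
        String fromProps = props.getProperty("c");
        return fromProps == null ? null : Double.valueOf(fromProps.trim());
    }
}
```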

    printSettings

    public void printSettings(SearchRequest default_q,
                              java.lang.String[] topicsFiles,
                              java.lang.String otherComments)
    Prints the current settings to a file with the same name as the current results file. This assists in tracing the settings used to generate a given run.


    printResults

    public void printResults(java.io.PrintWriter pw,
                             SearchRequest q)
    Prints the results for the given search request, using the specified destination.

    Parameters:
    pw - PrintWriter the destination where to save the results.
    q - SearchRequest the object encapsulating the query and the results.
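
For orientation, trec_eval reads one line per retrieved document with six whitespace-separated columns: query id, iteration, document number, rank, score and run id. The sketch below writes such lines to a PrintWriter and truncates at a maximum number of results, in the spirit of trec.output.format.length. It is not Terrier's implementation: the class name, the parameters, the zero-based rank and the contents of the run id column are assumptions of this example.

```java
import java.io.PrintWriter;

/** Hypothetical writer for trec_eval style result lines; not part of Terrier. */
final class TrecEvalWriter {

    /**
     * Writes at most maxResults lines of the form
     * "queryId iteration docno rank score runId".
     * docnos and scores are assumed to be sorted by descending score already.
     */
    static void print(PrintWriter pw, String queryId, String iteration,
                      String[] docnos, double[] scores, int maxResults, String runId) {
        int n = Math.min(docnos.length, maxResults);
        for (int rank = 0; rank < n; rank++) { // assumption: ranks start at 0
            pw.print(queryId + " " + iteration + " " + docnos[rank] + " "
                    + rank + " " + scores[rank] + " " + runId + "\n");
        }
        pw.flush();
    }

    public static void main(String[] args) {
        PrintWriter out = new PrintWriter(System.out);
        print(out, "401", "0", new String[] {"D1", "D2", "D3"},
                new double[] {3.5, 2.0, 1.25}, 2, "run");
    }
}
```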


    Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow