Terrier IR Platform
2.2.1

uk.ac.gla.terrier.matching.models.queryexpansion
Class QueryExpansionModel

java.lang.Object
  extended by uk.ac.gla.terrier.matching.models.queryexpansion.QueryExpansionModel
Direct Known Subclasses:
Bo1, Bo2, KL

public abstract class QueryExpansionModel
extends java.lang.Object

This class should be extended by the classes used for weighting terms and documents.

Properties:

Version:
$Revision: 1.22 $
Author:
Gianni Amati, Ben He, Vassilis Plachouras

Field Summary
 boolean PARAMETER_FREE
          Boolean flag indicating whether to apply parameter-free query expansion.
 double ROCCHIO_BETA
          Rocchio's beta for query expansion.
 
Constructor Summary
QueryExpansionModel()
          A default constructor for the class that initialises the idf attribute.
 
Method Summary
abstract  java.lang.String getInfo()
          Returns the name of the model.
 void initialise()
          Initialises Rocchio's beta for query expansion.
abstract  double parameterFreeNormaliser()
          This method provides the contract for computing the normaliser of parameter-free query expansion.
abstract  double parameterFreeNormaliser(double maxTermFrequency, double collectionLength, double totalDocumentLength)
          This method provides the contract for computing the normaliser of parameter-free query expansion.
abstract  double score(double withinDocumentFrequency, double termFrequency)
          This method provides the contract for implementing query expansion models.
abstract  double score(double withinDocumentFrequency, double termFrequency, double totalDocumentLength, double collectionLength, double averageDocumentLength)
          This method provides the contract for implementing query expansion models.
 void setAverageDocumentLength(double averageDocumentLength)
          Set the average document length.
 void setCollectionLength(double collectionLength)
          Set the collection length.
 void setDocumentFrequency(double documentFrequency)
          Set the document frequency.
 void setMaxTermFrequency(double maxTermFrequency)
          This method sets the maximum of the term frequency values of query terms.
 void setNumberOfDocuments(long numberOfDocuments)
          Set the number of documents.
 void setTotalDocumentLength(double totalDocumentLength)
          Set the total document length.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ROCCHIO_BETA

public double ROCCHIO_BETA
Rocchio's beta for query expansion. Its default value is 0.4.


PARAMETER_FREE

public boolean PARAMETER_FREE
Boolean flag indicating whether to apply parameter-free query expansion.
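
The two fields above can be read together: ROCCHIO_BETA scales the contribution of an expansion weight, while PARAMETER_FREE switches to a mode that does not rely on that parameter. The following is a minimal sketch of one plausible way such a switch might be applied when reweighting a query term. The class RocchioBetaSketch, the method expandedWeight and the combination rule itself are assumptions made for illustration; they are not taken from Terrier's source.

```java
public class RocchioBetaSketch {

    /**
     * Hypothetical helper, not part of Terrier: combines a term's original
     * query weight with its expansion weight.
     *
     * @param originalWeight  weight of the term in the original query
     * @param expansionWeight weight assigned by a query expansion model
     * @param normaliser      positive value used to normalise the expansion weight
     * @param rocchioBeta     Rocchio's beta (0.4 is the documented default)
     * @param parameterFree   if true, beta is not applied (assumed behaviour)
     */
    public static double expandedWeight(double originalWeight,
                                        double expansionWeight,
                                        double normaliser,
                                        double rocchioBeta,
                                        boolean parameterFree) {
        if (normaliser <= 0.0) {
            throw new IllegalArgumentException("normaliser must be positive");
        }
        double normalised = expansionWeight / normaliser;
        // Assumption: parameter-free mode adds the normalised weight directly,
        // otherwise the normalised weight is scaled by Rocchio's beta.
        return parameterFree
                ? originalWeight + normalised
                : originalWeight + rocchioBeta * normalised;
    }

    public static void main(String[] args) {
        // Same inputs, with and without the parameter-free switch.
        System.out.println(expandedWeight(1.0, 2.0, 4.0, 0.4, false));
        System.out.println(expandedWeight(1.0, 2.0, 4.0, 0.4, true));
    }
}
```

With beta applied, the expansion weight contributes less to the final term weight than in the parameter-free branch, which is the practical effect of choosing a beta below 1.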

Constructor Detail

QueryExpansionModel

public QueryExpansionModel()
A default constructor for the class that initialises the idf attribute.

Method Detail

initialise

public void initialise()
Initialises Rocchio's beta for query expansion.


setNumberOfDocuments

public void setNumberOfDocuments(long numberOfDocuments)
Parameters:
numberOfDocuments - long The number of documents.

getInfo

public abstract java.lang.String getInfo()
Returns the name of the model. Creation date: (19/06/2003 12:09:55)

Returns:
java.lang.String The name of the model.

setAverageDocumentLength

public void setAverageDocumentLength(double averageDocumentLength)
Set the average document length.

Parameters:
averageDocumentLength - double The average document length.

setCollectionLength

public void setCollectionLength(double collectionLength)
Set the collection length.

Parameters:
collectionLength - double The number of tokens in the collection.

setDocumentFrequency

public void setDocumentFrequency(double documentFrequency)
Set the document frequency.

Parameters:
documentFrequency - double The document frequency of a term.

setTotalDocumentLength

public void setTotalDocumentLength(double totalDocumentLength)
Set the total document length.

Parameters:
totalDocumentLength - double The total document length.

setMaxTermFrequency

public void setMaxTermFrequency(double maxTermFrequency)
This method sets the maximum of the term frequency values of query terms.

Parameters:
maxTermFrequency - double The maximum of the term frequency values of query terms.

parameterFreeNormaliser

public abstract double parameterFreeNormaliser()
This method provides the contract for computing the normaliser of parameter-free query expansion.

Returns:
The normaliser.

parameterFreeNormaliser

public abstract double parameterFreeNormaliser(double maxTermFrequency,
                                               double collectionLength,
                                               double totalDocumentLength)
This method provides the contract for computing the normaliser of parameter-free query expansion.

Parameters:
maxTermFrequency - The maximum in-collection term frequency among the terms in the pseudo-relevance set.
collectionLength - The number of tokens in the collection.
totalDocumentLength - The sum of the lengths of the top-ranked documents.
Returns:
The normaliser.

score

public abstract double score(double withinDocumentFrequency,
                             double termFrequency)
This method provides the contract for implementing query expansion models.

Parameters:
withinDocumentFrequency - double The term frequency in the X top-retrieved documents.
termFrequency - double The term frequency in the collection.
Returns:
double The score returned by the implemented model, computed from the given parameters and any preset parameters.

score

public abstract double score(double withinDocumentFrequency,
                             double termFrequency,
                             double totalDocumentLength,
                             double collectionLength,
                             double averageDocumentLength)
This method provides the contract for implementing query expansion models. Some models also require the beta and the documentFrequency of a term to be set.

Parameters:
withinDocumentFrequency - double The term frequency in the X top-retrieved documents.
termFrequency - double The term frequency in the collection.
totalDocumentLength - double The sum of the lengths of the X top-retrieved documents.
collectionLength - double The number of tokens in the whole collection.
averageDocumentLength - double The average document length in the collection.
Returns:
double The score returned by the implemented model.
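
The relationship between the two score overloads can be sketched as follows: the two-argument form relies on statistics preset through the setters, while the five-argument form receives them explicitly. To keep the example runnable without the Terrier jar, it uses a hypothetical stand-in base class (ExpansionModelStandIn) that mirrors only some of the signatures documented here. The weighting formula is a common KL-divergence-style term weight chosen for illustration; it is not claimed to match Terrier's KL, Bo1 or Bo2 implementations.

```java
public class KlStyleExpansionSketch {

    /** Hypothetical stand-in mirroring part of the documented contract; not the Terrier class. */
    public abstract static class ExpansionModelStandIn {
        protected double totalDocumentLength;
        protected double collectionLength;
        protected double averageDocumentLength;

        public void setTotalDocumentLength(double v) { this.totalDocumentLength = v; }
        public void setCollectionLength(double v) { this.collectionLength = v; }
        public void setAverageDocumentLength(double v) { this.averageDocumentLength = v; }

        public abstract String getInfo();
        public abstract double score(double withinDocumentFrequency, double termFrequency);
        public abstract double score(double withinDocumentFrequency, double termFrequency,
                                     double totalDocumentLength, double collectionLength,
                                     double averageDocumentLength);
    }

    /** Illustrative model: weights a term by how much more likely it is in the top documents. */
    public static class KlStyleModel extends ExpansionModelStandIn {
        @Override
        public String getInfo() { return "KlStyleSketch"; }

        /** Uses the statistics preset through the setters. */
        @Override
        public double score(double withinDocumentFrequency, double termFrequency) {
            return score(withinDocumentFrequency, termFrequency,
                         totalDocumentLength, collectionLength, averageDocumentLength);
        }

        /** Uses explicitly supplied statistics; averageDocumentLength is unused by this formula. */
        @Override
        public double score(double withinDocumentFrequency, double termFrequency,
                            double totalDocumentLength, double collectionLength,
                            double averageDocumentLength) {
            if (totalDocumentLength <= 0.0 || collectionLength <= 0.0) {
                throw new IllegalArgumentException("lengths must be positive");
            }
            double pTop = withinDocumentFrequency / totalDocumentLength; // P(term | top documents)
            double pColl = termFrequency / collectionLength;             // P(term | collection)
            if (pColl <= 0.0 || pTop <= pColl) {
                return 0.0; // term is not more frequent in the top documents than in the collection
            }
            return pTop * (Math.log(pTop / pColl) / Math.log(2.0));
        }
    }

    public static void main(String[] args) {
        KlStyleModel model = new KlStyleModel();
        model.setTotalDocumentLength(1000.0);
        model.setCollectionLength(1_000_000.0);
        model.setAverageDocumentLength(250.0);
        System.out.println(model.getInfo() + ": " + model.score(10.0, 100.0));
    }
}
```

Delegating the short overload to the long one keeps a single formula in the class, so the preset and explicit paths cannot drift apart.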

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow