Class BA
- java.lang.Object
  - org.terrier.matching.models.queryexpansion.QueryExpansionModel
    - org.terrier.matching.models.queryexpansion.BA
public class BA extends QueryExpansionModel
This class implements an approximation of the binomial distribution through the Kullback-Leibler divergence to weight query terms for query expansion. The class is named BA, which stands for Binomial Approximation. The weight is

F * D(f, p) + 0.5 * log_2(2 * PI * tf * (1 - f))

with D the Kullback-Leibler divergence, tf the frequency of the term in the retrieved set (sample), F the sample size, f the MLE estimate of the term frequency in the sample (f = tf / F), and p the prior of the term.

See Equation (8) on page 365 of the paper: Gianni Amati and Cornelis Joost Van Rijsbergen. 2002. Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst. 20, 4 (October 2002), 357-389. DOI=10.1145/582415.582416 http://doi.acm.org/10.1145/582415.582416

The description of the query expansion technique and models can be found in Amati, Giambattista (2003), Probability Models for Information Retrieval based on Divergence from Randomness. PhD thesis, University of Glasgow.
- Author:
- Gianni Amati
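The weight described above can be sketched as a standalone Java class. This is an illustration of the formula only, not Terrier's implementation: the class name `BinomialApproximationSketch`, the method names, and the input guards are all hypothetical. It assumes D(f, p) is the Kullback-Leibler divergence between two Bernoulli distributions, as in the cited paper, and that tf = f * F.

```java
/** Standalone sketch of the BA weight; hypothetical names, not the Terrier source. */
final class BinomialApproximationSketch {

    private BinomialApproximationSketch() { }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Kullback-Leibler divergence D(f, p) between two Bernoulli
     * distributions, in bits. Assumes 0 < f < 1 and 0 < p < 1.
     */
    static double klDivergence(double f, double p) {
        return f * log2(f / p) + (1.0 - f) * log2((1.0 - f) / (1.0 - p));
    }

    /**
     * Computes F * D(f, p) + 0.5 * log_2(2 * PI * tf * (1 - f)).
     *
     * @param tf               frequency of the term in the sample
     * @param sampleSize       F, the number of tokens in the sample
     * @param collectionTf     frequency of the term in the collection
     * @param collectionLength number of tokens in the collection
     */
    static double weight(double tf, double sampleSize,
                         double collectionTf, double collectionLength) {
        // Guards chosen for this sketch only: they keep every logarithm finite.
        if (tf <= 0 || tf >= sampleSize
                || collectionTf <= 0 || collectionTf >= collectionLength) {
            throw new IllegalArgumentException("frequencies must lie strictly inside (0, length)");
        }
        double f = tf / sampleSize;               // MLE of the term frequency in the sample
        double p = collectionTf / collectionLength; // prior of the term
        return sampleSize * klDivergence(f, p)
                + 0.5 * log2(2.0 * Math.PI * tf * (1.0 - f));
    }
}
```

For instance, `BinomialApproximationSketch.weight(50, 1000, 1000, 1_000_000)` weights a term that occurs 50 times in a 1000-token sample and 1000 times in a million-token collection. A term that is more frequent in the sample than in the collection receives a larger weight as its sample frequency grows.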
Field Summary
Fields inherited from class org.terrier.matching.models.queryexpansion.QueryExpansionModel
averageDocumentLength, collectionLength, documentFrequency, EXPANSION_DOCUMENTS, EXPANSION_TERMS, idf, maxTermFrequency, numberOfDocuments, PARAMETER_FREE, ROCCHIO_BETA, totalDocumentLength
Constructor Summary
BA()
A default constructor.
Method Summary
java.lang.String getInfo()
Returns the name of the model.
double parameterFreeNormaliser()
This method provides the contract for computing the normaliser of parameter-free query expansion.
double parameterFreeNormaliser(double maxTermFrequency, double collectionLength, double totalDocumentLength)
This method provides the contract for computing the normaliser of parameter-free query expansion.
double score(double withinDocumentFrequency, double termFrequency)
This method implements the query expansion model.
double score(double withinDocumentFrequency, double termFrequency, double totalDocumentLength, double collectionLength, double averageDocumentLength)
This method implements the query expansion model.
Methods inherited from class org.terrier.matching.models.queryexpansion.QueryExpansionModel
initialise, setAverageDocumentLength, setCollectionLength, setDocumentFrequency, setMaxTermFrequency, setNumberOfDocuments, setTotalDocumentLength
Method Detail
getInfo
public final java.lang.String getInfo()
Returns the name of the model.
- Specified by:
- getInfo in class QueryExpansionModel
- Returns:
- the name of the model
parameterFreeNormaliser
public final double parameterFreeNormaliser()
This method provides the contract for computing the normaliser of parameter-free query expansion.
- Specified by:
- parameterFreeNormaliser in class QueryExpansionModel
- Returns:
- The normaliser.
parameterFreeNormaliser
public final double parameterFreeNormaliser(double maxTermFrequency, double collectionLength, double totalDocumentLength)
This method provides the contract for computing the normaliser of parameter-free query expansion.
- Specified by:
- parameterFreeNormaliser in class QueryExpansionModel
- Parameters:
- maxTermFrequency - The maximum of the in-collection term frequency of the terms in the pseudo relevance set.
- collectionLength - The number of tokens in the collection.
- totalDocumentLength - The sum of the lengths of the top-ranked documents.
- Returns:
- The normaliser.
score
public final double score(double withinDocumentFrequency, double termFrequency)
This method implements the query expansion model.
- Specified by:
- score in class QueryExpansionModel
- Parameters:
- withinDocumentFrequency - The term frequency in the X top-retrieved documents.
- termFrequency - The term frequency in the collection.
- Returns:
- The query expansion weight using the complete Kullback-Leibler divergence.
score
public final double score(double withinDocumentFrequency, double termFrequency, double totalDocumentLength, double collectionLength, double averageDocumentLength)
This method implements the query expansion model.
- Specified by:
- score in class QueryExpansionModel
- Parameters:
- withinDocumentFrequency - The term frequency in the X top-retrieved documents.
- termFrequency - The term frequency in the collection.
- totalDocumentLength - The sum of the lengths of the X top-retrieved documents.
- collectionLength - The number of tokens in the whole collection.
- averageDocumentLength - The average document length in the collection.
- Returns:
- The score returned by the implemented model.
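The two score overloads differ in where the collection statistics come from: the five-argument form receives them as parameters, while the two-argument form presumably reads values set beforehand through setters such as setTotalDocumentLength and setCollectionLength. This page does not say how either is implemented, so the following is only a sketch of that pairing. The class `BaOverloadSketch` is hypothetical, the delegation from one overload to the other is an assumption, and averageDocumentLength is accepted but ignored because the formula in the class description does not reference it.

```java
/** Hypothetical stand-in showing a stateful and an explicit score overload. */
final class BaOverloadSketch {

    // Statistics held on the instance, mirroring the inherited setters' names.
    private double totalDocumentLength;
    private double collectionLength;
    private double averageDocumentLength;

    void setTotalDocumentLength(double value) { this.totalDocumentLength = value; }
    void setCollectionLength(double value) { this.collectionLength = value; }
    void setAverageDocumentLength(double value) { this.averageDocumentLength = value; }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Stateful overload: uses the statistics previously set on this instance. */
    double score(double withinDocumentFrequency, double termFrequency) {
        // Assumption: delegate to the explicit overload with the stored values.
        return score(withinDocumentFrequency, termFrequency,
                totalDocumentLength, collectionLength, averageDocumentLength);
    }

    /** Explicit overload: every statistic is passed by the caller. */
    double score(double withinDocumentFrequency, double termFrequency,
                 double totalDocumentLength, double collectionLength,
                 double averageDocumentLength) {
        // averageDocumentLength is unused in this sketch (see lead-in).
        double f = withinDocumentFrequency / totalDocumentLength;
        double p = termFrequency / collectionLength;
        double d = f * log2(f / p) + (1.0 - f) * log2((1.0 - f) / (1.0 - p));
        return totalDocumentLength * d
                + 0.5 * log2(2.0 * Math.PI * withinDocumentFrequency * (1.0 - f));
    }
}
```

With this shape, a caller that scores many candidate terms against the same retrieved set can configure the statistics once and use the short overload, while the long overload suits one-off calls with different statistics.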