Terrier IR Platform
2.2.1

uk.ac.gla.terrier.matching.models.languagemodel
Class PonteCroft

java.lang.Object
  extended by uk.ac.gla.terrier.matching.models.languagemodel.LanguageModel
      extended by uk.ac.gla.terrier.matching.models.languagemodel.PonteCroft
All Implemented Interfaces:
Model

public class PonteCroft
extends LanguageModel

This class implements Ponte & Croft's language modelling approach.

Version:
$Revision: 1.13 $
Author:
Ben He

Constructor Summary
PonteCroft()
          The default constructor.
 
Method Summary
 double averageTermGenerationProbability(int[] tf, double[] docLength)
          The method computes the average term generation probability of a term in the vocabulary.
 java.lang.String getInfo()
          Returns the name of the model.
 double risk(double tf, double docLength, double termEstimate)
          The method computes the risk of retrieving a seen query term.
 double scoreSeenNonQuery(double tf, double docLength, double termFrequency, double termEstimate)
          The method assigns a score to a seen non-query term.
 double scoreSeenQuery(double tf, double docLength, double termFrequency, double termEstimate)
          The method assigns a score to a seen query term.
 double scoreUnseenNonQuery(double termFrequency)
          The method assigns a score to an unseen non-query term.
 double scoreUnseenQuery(double termFrequency)
          The method assigns a score to an unseen query term.
 void setAverageDocumentLength(double a)
          Set the average document length in the collection.
 void setNumberOfPointers(double n)
           
 void setNumberOfTokens(double value)
          Set the number of tokens in the whole collection.
 void setNumberOfUniqueTerms(double n)
          Set the number of unique terms in the collection.
 
Methods inherited from class uk.ac.gla.terrier.matching.models.languagemodel.LanguageModel
getParameter, setNumberOfDocuments, setParameter, setTermFrequency
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PonteCroft

public PonteCroft()
The default constructor.

Method Detail

getInfo

public java.lang.String getInfo()
Returns the name of the model.

Specified by:
getInfo in interface Model
Specified by:
getInfo in class LanguageModel
Returns:
The name of the model.

scoreSeenQuery

public double scoreSeenQuery(double tf,
                             double docLength,
                             double termFrequency,
                             double termEstimate)
The method assigns a score to a seen query term.

Specified by:
scoreSeenQuery in class LanguageModel
Parameters:
tf - The within-document frequency.
docLength - The length of the weighted document.
termFrequency - The term frequency in the collection.
termEstimate - The term estimate of the query term.
Returns:
The score for a seen query term.
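
As an illustration only: in Ponte and Croft's published model, the probability of a query term that occurs in the document is a geometric blend of the maximum-likelihood estimate tf / docLength and the term's average generation probability, weighted by the risk. The sketch below assumes that termEstimate is that average probability and takes the risk as an explicit argument. The class name, the method name and the reduced parameter list are hypothetical; whether this class returns the probability itself or a transformed value (for example a logarithm) is not stated on this page.

```java
final class SeenQueryTermSketch {

    /**
     * Blends the maximum-likelihood estimate with the average estimate:
     * p = (tf / docLength)^(1 - risk) * termEstimate^risk.
     * Assumes risk lies in [0, 1] and docLength is positive.
     */
    static double seenQueryProbability(double tf, double docLength,
                                       double termEstimate, double risk) {
        double maximumLikelihood = tf / docLength;
        return Math.pow(maximumLikelihood, 1.0 - risk) * Math.pow(termEstimate, risk);
    }

    public static void main(String[] args) {
        // A term seen twice in a 100-token document, average estimate 0.01, risk 0.125.
        System.out.println(seenQueryProbability(2.0, 100.0, 0.01, 0.125));
    }
}
```

With a risk of 0 the result is the maximum-likelihood estimate alone; with a risk of 1 it is the average estimate alone.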

scoreSeenNonQuery

public double scoreSeenNonQuery(double tf,
                                double docLength,
                                double termFrequency,
                                double termEstimate)
The method assigns a score to a seen non-query term.

Specified by:
scoreSeenNonQuery in class LanguageModel
Parameters:
tf - The within-document frequency.
docLength - The length of the weighted document.
termFrequency - The term frequency in the collection.
termEstimate - The term estimate of the query term.
Returns:
The score for a seen non-query term.

scoreUnseenQuery

public double scoreUnseenQuery(double termFrequency)
The method assigns a score to an unseen query term.

Specified by:
scoreUnseenQuery in class LanguageModel
Parameters:
termFrequency - The term frequency in the collection.
Returns:
The score for an unseen query term.
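
As an illustration only: in Ponte and Croft's published model, a term that does not occur in the document backs off to its relative frequency in the whole collection. The sketch below assumes that termFrequency is the collection frequency and that the collection size is the value supplied through setNumberOfTokens. The class and method names are hypothetical, and this page does not confirm that the class computes exactly this value.

```java
final class UnseenTermSketch {

    /**
     * Collection back-off estimate: collection frequency divided by the
     * total number of tokens in the collection.
     */
    static double unseenQueryProbability(double termFrequency, double numberOfTokens) {
        if (numberOfTokens <= 0.0) {
            throw new IllegalArgumentException("numberOfTokens must be positive");
        }
        return termFrequency / numberOfTokens;
    }

    public static void main(String[] args) {
        // A term occurring 50 times in a collection of one million tokens.
        System.out.println(unseenQueryProbability(50.0, 1_000_000.0));
    }
}
```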

scoreUnseenNonQuery

public double scoreUnseenNonQuery(double termFrequency)
The method assigns a score to an unseen non-query term.

Specified by:
scoreUnseenNonQuery in class LanguageModel
Parameters:
termFrequency - The term frequency in the collection.
Returns:
The score for an unseen non-query term.
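
As an illustration only: the four scoring methods appear to correspond to the four cases of Ponte and Croft's query likelihood, in which query terms contribute p(t|d) and non-query terms contribute 1 - p(t|d), with seen and unseen terms differing only in how p(t|d) is estimated. The sketch below combines already-estimated probabilities in log space. The class name, the method name and the two-array input layout are hypothetical, and the way Terrier actually accumulates these scores is not described on this page.

```java
final class QueryLikelihoodSketch {

    /**
     * Sums log p(t|d) over query terms and log(1 - p(t|d)) over non-query terms.
     * termProbability[i] is the estimated p(t_i|d); inQuery[i] marks query terms.
     * Assumes every probability lies strictly between 0 and 1.
     */
    static double logQueryLikelihood(double[] termProbability, boolean[] inQuery) {
        if (termProbability.length != inQuery.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double logScore = 0.0;
        for (int i = 0; i < termProbability.length; i++) {
            double p = termProbability[i];
            logScore += Math.log(inQuery[i] ? p : 1.0 - p);
        }
        return logScore;
    }

    public static void main(String[] args) {
        // A toy vocabulary of two terms: the first is in the query, the second is not.
        double[] termProbability = {0.5, 0.25};
        boolean[] inQuery = {true, false};
        System.out.println(logQueryLikelihood(termProbability, inQuery));
    }
}
```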

risk

public double risk(double tf,
                   double docLength,
                   double termEstimate)
The method computes the risk of retrieving a seen query term.

Specified by:
risk in class LanguageModel
Parameters:
tf - The within-document frequency.
docLength - The length of the weighted document.
termEstimate - The term estimate of the query term.
Returns:
The risk.
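
As an illustration only: Ponte and Croft's published risk function is a geometric distribution over the within-document frequency, parameterised by the frequency the term would be expected to have in a document of this length. The sketch below assumes that termEstimate is the term's average generation probability, so that the expected frequency is termEstimate * docLength. The class name is hypothetical, and this page does not confirm that the class uses exactly this formula.

```java
final class RiskSketch {

    /**
     * risk = (1 / (1 + f)) * (f / (1 + f))^tf, where f = termEstimate * docLength
     * is the expected frequency of the term in a document of this length.
     */
    static double risk(double tf, double docLength, double termEstimate) {
        double expectedFrequency = termEstimate * docLength;
        double ratio = expectedFrequency / (1.0 + expectedFrequency);
        return (1.0 / (1.0 + expectedFrequency)) * Math.pow(ratio, tf);
    }

    public static void main(String[] args) {
        // Expected frequency is 0.01 * 100 = 1, observed frequency is 2.
        System.out.println(risk(2.0, 100.0, 0.01));
    }
}
```

Under this formula the risk shrinks as the observed frequency grows, so a frequently occurring term relies more on its own maximum-likelihood estimate and less on the collection-wide average.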

averageTermGenerationProbability

public double averageTermGenerationProbability(int[] tf,
                                               double[] docLength)
The method computes the average term generation probability of a term in the vocabulary.

Specified by:
averageTermGenerationProbability in class LanguageModel
Parameters:
tf - An array of the within-document frequencies of a query term, one entry per document in which it occurs.
docLength - The lengths of the documents in which the term occurs.
Returns:
The average generation probability.
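
As an illustration only: in Ponte and Croft's published model, the average generation probability of a term is the mean of its maximum-likelihood estimates tf / docLength over the documents that contain it. The sketch below assumes that the two arrays are parallel, so that tf[i] and docLength[i] describe the same document. The class name is hypothetical, and the handling of empty input is a choice made for this sketch, not documented behaviour.

```java
final class AverageGenerationSketch {

    /**
     * Mean of tf[i] / docLength[i] over the documents in which the term occurs.
     * Returns 0 for a term that occurs in no document (a choice made in this sketch).
     */
    static double averageTermGenerationProbability(int[] tf, double[] docLength) {
        if (tf.length != docLength.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        if (tf.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < tf.length; i++) {
            sum += tf[i] / docLength[i];
        }
        return sum / tf.length;
    }

    public static void main(String[] args) {
        // Three documents in which the term makes up 2% of the tokens each.
        int[] tf = {2, 1, 4};
        double[] docLength = {100.0, 50.0, 200.0};
        System.out.println(averageTermGenerationProbability(tf, docLength));
    }
}
```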

setNumberOfTokens

public void setNumberOfTokens(double value)
Description copied from interface: Model
Set the number of tokens in the whole collection.

Parameters:
value - The number of tokens in the whole collection.

setAverageDocumentLength

public void setAverageDocumentLength(double a)
Description copied from interface: Model
Set the average document length in the collection.

Parameters:
a - The average document length in the collection.

setNumberOfUniqueTerms

public void setNumberOfUniqueTerms(double n)
Description copied from interface: Model
Set the number of unique terms in the collection.

Parameters:
n - The number of unique terms in the collection.

setNumberOfPointers

public void setNumberOfPointers(double n)

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow