java.lang.Object
  org.terrier.matching.models.WeightingModel
      org.terrier.matching.models.DirichletLM
public class DirichletLM
Bayesian smoothing with a Dirichlet prior. The model has one parameter, mu > 0. "The optimal value of mu also tends to be larger for long queries than for title queries. The optimal ... seems to vary from collection to collection, though in most cases, it is around 2,000. The tail of the curves is generally flat." This class sets mu to 2500 by default. With this default, it gives higher retrieval performance than BM25 (b=0.75) on the TREC Terabyte track 2004.
The retrieval performance of this weighting model has been empirically verified to be similar to that reported in the paper cited below. The model is formulated such that all scores are > 0.
A Study of Smoothing Methods for Language Models Applied to Information Retrieval. Zhai & Lafferty, ACM Transactions on Information Systems, Vol. 22, No. 2, April 2004, Pages 179--214.
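To make the role of mu concrete, the following is a minimal sketch of the textbook Dirichlet-smoothed term probability, p(t|d) = (tf + mu * p(t|C)) / (docLength + mu), where p(t|C) is the term's relative frequency in the collection. It is not the Terrier source code and may differ from how this class computes its score. The class name `DirichletSmoothingSketch`, the method `smoothedProbability`, and the parameter `collectionTermFrequency` are hypothetical names chosen for illustration.

```java
/**
 * Illustrative sketch only: the standard Dirichlet-smoothed language model
 * estimate. This is NOT the Terrier implementation; names are hypothetical.
 */
final class DirichletSmoothingSketch {

    /**
     * Smoothed probability of a term in a document.
     *
     * @param tf                      term frequency in the document
     * @param docLength               length of the document in tokens
     * @param collectionTermFrequency term frequency in the whole collection (assumed name)
     * @param numberOfTokens          total number of tokens in the collection
     * @param mu                      Dirichlet prior parameter, mu > 0
     */
    static double smoothedProbability(double tf, double docLength,
                                      double collectionTermFrequency,
                                      double numberOfTokens, double mu) {
        // Background (collection) language model probability of the term.
        double collectionProbability = collectionTermFrequency / numberOfTokens;
        // Mix document evidence with mu pseudo-counts drawn from the collection model.
        return (tf + mu * collectionProbability) / (docLength + mu);
    }

    public static void main(String[] args) {
        double mu = 2500.0; // the default value stated for this class
        // A term seen 3 times in a 500-token document, 1000 times in a 1,000,000-token collection.
        System.out.println(smoothedProbability(3, 500, 1000, 1_000_000, mu));
        // A term absent from the document still receives non-zero probability.
        System.out.println(smoothedProbability(0, 500, 1000, 1_000_000, mu));
    }
}
```

In this form, a larger mu pulls the estimate toward the collection model, which matters most for short documents where docLength is small relative to mu.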
Field Summary |
---|
Fields inherited from class org.terrier.matching.models.WeightingModel |
---|
averageDocumentLength, c, documentFrequency, i, keyFrequency, numberOfDocuments, numberOfPointers, numberOfTokens, numberOfUniqueTerms, termFrequency |
Constructor Summary | |
---|---|
DirichletLM() | Constructs an instance of DirichletLM. |
Method Summary | |
---|---|
java.lang.String | getInfo(): Returns the name of the model. |
double | score(double tf, double docLength): This method provides the contract for implementing weighting models. |
double | score(double tf, double docLength, double n_t, double F_t, double keyFrequency): This method provides the contract for implementing weighting models. |
Methods inherited from class org.terrier.matching.models.WeightingModel |
---|
clone, getOverflowed, getParameter, prepare, score, setAverageDocumentLength, setCollectionStatistics, setDocumentFrequency, setEntryStatistics, setKeyFrequency, setNumberOfDocuments, setNumberOfPointers, setNumberOfTokens, setNumberOfUniqueTerms, setParameter, setRequest, setTermFrequency, stirlingPower |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DirichletLM()
Method Detail |
---|
public java.lang.String getInfo()
Description inherited from WeightingModel: returns the name of the model.
Related declarations: getInfo in interface Model, getInfo in class WeightingModel.
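public double score(double tf, double docLength)
Description inherited from WeightingModel: this method provides the contract for implementing weighting models.
Related declaration: score in class WeightingModel.
Parameters:
tf - the term frequency in the document
docLength - the document's length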
public double score(double tf, double docLength, double n_t, double F_t, double keyFrequency)
Description inherited from WeightingModel: this method provides the contract for implementing weighting models.
Related declaration: score in class WeightingModel.
Parameters:
tf - the term frequency in the document
docLength - the document's length
n_t - the document frequency of the term
F_t - the term frequency in the collection
keyFrequency - the term frequency in the query