org.terrier.matching.models
Class Js_KLs

java.lang.Object
  extended by org.terrier.matching.models.WeightingModel
      extended by org.terrier.matching.models.Js_KLs
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Model

public class Js_KLs
extends WeightingModel

This class implements the Js_KLs weighting model, which is the product of two measures: Jeffreys' divergence and the Kullback-Leibler divergence. Both measures are obtained by adding one query token: the model computes Jeffreys' divergence and the information growth in the document, measured by the Kullback-Leibler divergence, and takes the product of these two information measures as the amount of information carried by a single query token. Js_KLs is an unsupervised (parameter-free) model of IR.
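
As background for the two measures named above, the following is a minimal sketch of the textbook Kullback-Leibler and Jeffreys divergences between two discrete distributions. It is not the formula used by Js_KLs.score, which this page does not state. The class name DivergenceSketch and the example probabilities are hypothetical and chosen only for illustration.

```java
/** Textbook divergences between two discrete distributions (illustration only, not Terrier code). */
final class DivergenceSketch {

    /** Kullback-Leibler divergence KL(p || q) = sum_i p_i * ln(p_i / q_i), in nats. */
    static double kl(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] == 0.0) {
                continue; // by convention, a term with p_i = 0 contributes 0
            }
            sum += p[i] * Math.log(p[i] / q[i]);
        }
        return sum;
    }

    /** Jeffreys divergence: the symmetrised form KL(p || q) + KL(q || p). */
    static double jeffreys(double[] p, double[] q) {
        return kl(p, q) + kl(q, p);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: a term makes up 3% of a document but 1% of the collection.
        double[] inDocument = {0.03, 0.97};
        double[] inCollection = {0.01, 0.99};
        System.out.println("KL       = " + kl(inDocument, inCollection));
        System.out.println("Jeffreys = " + jeffreys(inDocument, inCollection));
    }
}
```

The Kullback-Leibler divergence is asymmetric in its arguments, whereas Jeffreys' divergence is symmetric by construction.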

Js_KLs performs well and can be used with verbose queries. In particular, with long queries on the terabyte collection (GOV2), its MAP is better than that of most of the supervised models, with statistical or moderate significance, the exception being PL2. The MAP for long topics and the p-values (two-tailed paired t-test) against the supervised models (each with its optimal MAP parameter values) are as follows:

Queries   MAP of Js_KLs   LGD             Dirichlet_LM   PL2            BM25           In_expB2
long      0.3178          (>) p=1.7E-17   (>) p=0.0544   (<) p=0.3155   (>) p=0.7866   (>) p=5151
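
The p-values above come from a two-tailed paired t-test. As a rough sketch of the statistic behind such a test, the code below computes the paired t value from two arrays of per-topic scores. The class name PairedTTestSketch and its inputs are hypothetical; this is not part of Terrier, and the per-topic data behind the table is not given on this page.

```java
/** Paired t statistic over per-topic scores of two systems (illustration only, not Terrier code). */
final class PairedTTestSketch {

    /**
     * Returns t = mean(d) / (sd(d) / sqrt(n)) with d_i = a_i - b_i,
     * where sd is the sample standard deviation (n - 1 in the denominator).
     */
    static double tStatistic(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = a.length;

        // Mean of the paired differences.
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += a[i] - b[i];
        }
        mean /= n;

        // Sum of squared deviations of the differences from their mean.
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double deviation = (a[i] - b[i]) - mean;
            sumSquares += deviation * deviation;
        }
        double sd = Math.sqrt(sumSquares / (n - 1));

        return mean / (sd / Math.sqrt(n));
    }
}
```

The two-tailed p-value is then read from Student's t distribution with n - 1 degrees of freedom. That step is left out here because the JDK does not ship a t-distribution function.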

References

  1. G. Amati. Frequentist and Bayesian approach to Information Retrieval. In Proceedings of the 28th European Conference on IR Research (ECIR 2006). LNCS vol. 3936, pages 13-24.

Since:
3.5
Author:
Gianni Amati
See Also:
Serialized Form

Field Summary
 
Fields inherited from class org.terrier.matching.models.WeightingModel
averageDocumentLength, c, documentFrequency, i, keyFrequency, numberOfDocuments, numberOfPointers, numberOfTokens, numberOfUniqueTerms, termFrequency
 
Constructor Summary
Js_KLs()
          A default constructor to make this model.
 
Method Summary
 java.lang.String getInfo()
          Returns the name of the model, in this case "Js_KLs"
 double score(double tf, double docLength)
          Uses Js_KLs to compute a weight for a term in a document.
 double score(double tf, double docLength, double documentFrequency, double termFrequency, double keyFrequency)
          Uses Js_KLs to compute a weight for a term in a document.
 
Methods inherited from class org.terrier.matching.models.WeightingModel
clone, getOverflowed, getParameter, prepare, score, setAverageDocumentLength, setCollectionStatistics, setDocumentFrequency, setEntryStatistics, setKeyFrequency, setNumberOfDocuments, setNumberOfPointers, setNumberOfTokens, setNumberOfUniqueTerms, setParameter, setRequest, setTermFrequency, stirlingPower
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Js_KLs

public Js_KLs()
A default constructor to make this model.

Method Detail

getInfo

public final java.lang.String getInfo()
Returns the name of the model, in this case "Js_KLs"

Specified by:
getInfo in interface Model
Specified by:
getInfo in class WeightingModel
Returns:
the name of the model

score

public final double score(double tf,
                          double docLength)
Uses Js_KLs to compute a weight for a term in a document.

Specified by:
score in class WeightingModel
Parameters:
tf - The term frequency of the term in the document
docLength - the document's length
Returns:
the score assigned to a document with the given tf and docLength, and other preset parameters

score

public final double score(double tf,
                          double docLength,
                          double documentFrequency,
                          double termFrequency,
                          double keyFrequency)
Uses Js_KLs to compute a weight for a term in a document.

Specified by:
score in class WeightingModel
Parameters:
tf - The term frequency of the term in the document
docLength - the document's length
documentFrequency - The document frequency of the term (ignored)
termFrequency - the term frequency in the collection (ignored)
keyFrequency - the term frequency in the query (ignored).
Returns:
the score assigned by the weighting model Js_KLs.


Terrier 3.5. Copyright © 2004-2011 University of Glasgow