java.lang.Object
  org.terrier.matching.dsms.DependenceScoreModifier
public abstract class DependenceScoreModifier
Base class for dependence models. Document scores are modified using n-grams, approximating the dependence of terms between documents. Implemented as a document score modifier, similarly to PhraseScoreModifier. Postings lists are traversed in a document-at-a-time (DAAT) fashion.
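As an illustration of DAAT traversal in general, the following is a minimal sketch that walks several sorted docid lists in lockstep and visits each document exactly once. The class `DaatSketch`, the method `traverse`, and the use of plain `int[]` arrays in place of posting iterators are assumptions made for this example; they are not part of Terrier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain sorted int arrays stand in for postings lists.
final class DaatSketch {

    /**
     * Visits docids in ascending order and records, for each docid,
     * the indices of the terms whose postings list contains it.
     */
    static Map<Integer, List<Integer>> traverse(int[][] postings) {
        int[] cursor = new int[postings.length]; // one read position per term
        Map<Integer, List<Integer>> matches = new LinkedHashMap<>();
        while (true) {
            // Find the smallest docid under any cursor that is not exhausted.
            boolean any = false;
            int current = 0;
            for (int t = 0; t < postings.length; t++) {
                if (cursor[t] < postings[t].length) {
                    int docid = postings[t][cursor[t]];
                    if (!any || docid < current) {
                        current = docid;
                    }
                    any = true;
                }
            }
            if (!any) {
                break; // every list has been consumed
            }
            // Collect the terms positioned on that docid and advance their cursors.
            List<Integer> terms = new ArrayList<>();
            for (int t = 0; t < postings.length; t++) {
                if (cursor[t] < postings[t].length && postings[t][cursor[t]] == current) {
                    terms.add(t);
                    cursor[t]++;
                }
            }
            matches.put(current, terms);
        }
        return matches;
    }
}
```

Because every term's information for a document is available at the same moment, a proximity-based score for that document can be computed before moving on to the next docid.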
Properties
QTW Combination Functions
Field Summary | |
---|---|
protected double | avgDocLen |
protected java.lang.String | dependency: type of proximity to use |
protected int | ngramLength: the size of the considered n-grams |
protected double | numTokens |
protected int | phraseQTWfnid |
protected java.lang.String[] | phraseTerms: a list of the strings of the phrase terms |
protected double | w_o: weight of ordered dependence model |
protected double | w_t: weight of unigram model |
protected double | w_u: weight of unordered dependence model |
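The three weight fields suggest that the unigram, ordered and unordered components are mixed into one score. A minimal sketch of such a mixture follows, assuming a plain linear combination; the class `WeightedCombinationSketch`, the method `combine`, and the linear form itself are assumptions for illustration and are not confirmed by this page.

```java
// Hypothetical illustration only: the linear form is an assumption, not taken from Terrier.
final class WeightedCombinationSketch {

    /**
     * Mixes three component scores using the weights w_t (unigram),
     * w_o (ordered dependence) and w_u (unordered dependence).
     */
    static double combine(double w_t, double w_o, double w_u,
                          double unigramScore, double orderedScore, double unorderedScore) {
        return w_t * unigramScore + w_o * orderedScore + w_u * unorderedScore;
    }
}
```

For example, `combine(0.5, 0.3, 0.2, 2.0, 1.0, 4.0)` weights the unigram score most heavily; the weight values here are arbitrary.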
Constructor Summary | |
---|---|
DependenceScoreModifier() | Constructs an instance of the DependenceScoreModifier. |
Method Summary | |
---|---|
java.lang.Object | clone(): Creates a clone of this object. |
protected static int | countTrue(boolean[] in) |
protected void | determineGlobalStatistics(java.lang.String[] terms, EntryStatistics[] es, boolean SD): Unused hook method. |
protected void | doDependency(Index index, EntryStatistics[] es, IterablePosting[] ips, ResultSet rs, double[] phraseTermWeights, boolean SD) |
java.lang.String | getName(): Returns the name of the modifier. |
boolean | modifyScores(Index index, MatchingQueryTerms terms, ResultSet set): Modifies the scores of documents according to whether or not a given phrase occurs in them. |
protected static boolean | NOR(boolean[] in) |
double | score(Posting[] postings): Calculates the score for a document (from the given postings for that document). |
protected double | scoreFDSD(boolean SD, int i, Posting ip1, int j, Posting ip2, double _avgDocLen): How likely it is that these two postings have so many near-occurrences, given the length of this document. |
protected abstract double | scoreFDSD(int matchingNGrams, int docLength) |
void | setCollectionStatistics(CollectionStatistics cs, Index _index): Sets the collection statistics used to score the documents (number of documents in the collection, etc.). |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected int ngramLength
protected java.lang.String dependency
protected final int phraseQTWfnid
protected double w_t
protected double w_o
protected double w_u
protected java.lang.String[] phraseTerms
protected double avgDocLen
protected double numTokens
Constructor Detail |
---|
public DependenceScoreModifier()
Method Detail |
---|
public java.lang.Object clone()
Specified by: clone in interface DocumentScoreModifier
Overrides: clone in class java.lang.Object
protected abstract double scoreFDSD(int matchingNGrams, int docLength)
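Concrete subclasses supply this method. The sketch below shows the shape of such an override against a stand-in abstract class, so that it compiles without Terrier on the classpath. `DependenceScorerSketch`, `WindowFractionScorer`, and the scoring formula (the fraction of n-gram windows that match) are all hypothetical and chosen only for illustration; they do not reproduce any model shipped with Terrier.

```java
// Hypothetical stand-in that mirrors only the abstract hook; it is NOT the Terrier class.
abstract class DependenceScorerSketch {
    /** The size of the considered n-grams (value assumed for this example). */
    protected int ngramLength = 2;

    protected abstract double scoreFDSD(int matchingNGrams, int docLength);
}

// Toy scoring rule: the share of the document's n-gram windows that matched.
final class WindowFractionScorer extends DependenceScorerSketch {
    @Override
    protected double scoreFDSD(int matchingNGrams, int docLength) {
        // A document of length L contains L - n + 1 windows of n tokens; guard against short documents.
        int windows = Math.max(1, docLength - ngramLength + 1);
        return matchingNGrams / (double) windows;
    }
}
```

Normalising by the number of windows keeps long documents from scoring higher merely because they offer more opportunities for a match; a real model would typically apply a more principled length normalisation.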
public java.lang.String getName()
Specified by: getName in interface DocumentScoreModifier
protected static boolean NOR(boolean[] in)
public boolean modifyScores(Index index, MatchingQueryTerms terms, ResultSet set)
Specified by: modifyScores in interface DocumentScoreModifier
Parameters:
index - Index the data structures to use.
terms - MatchingQueryTerms the terms to be matched for the query. This does not correspond to the phrase terms necessarily, but to all the terms of the query.
set - ResultSet the result set for the query.
protected void determineGlobalStatistics(java.lang.String[] terms, EntryStatistics[] es, boolean SD) throws java.io.IOException
Throws: java.io.IOException
protected void doDependency(Index index, EntryStatistics[] es, IterablePosting[] ips, ResultSet rs, double[] phraseTermWeights, boolean SD) throws java.io.IOException
Throws: java.io.IOException
protected static int countTrue(boolean[] in)
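Neither NOR(boolean[]) nor countTrue(boolean[]) carries a description on this page. Going by the names alone, a plausible reading is sketched below: a logical NOR over the array and a count of its true entries. The class `BooleanArraySketch` and both method bodies are assumptions, not the Terrier source.

```java
// Hypothetical illustration only: behaviour inferred from the method names.
final class BooleanArraySketch {

    /** Returns true only when no element of the array is true (logical NOR). */
    static boolean nor(boolean[] in) {
        for (boolean b : in) {
            if (b) {
                return false;
            }
        }
        return true;
    }

    /** Returns the number of true elements in the array. */
    static int countTrue(boolean[] in) {
        int count = 0;
        for (boolean b : in) {
            if (b) {
                count++;
            }
        }
        return count;
    }
}
```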
public void setCollectionStatistics(CollectionStatistics cs, Index _index)
public double score(Posting[] postings)
protected double scoreFDSD(boolean SD, int i, Posting ip1, int j, Posting ip2, double _avgDocLen)
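The summary for this method speaks of near-occurrences of two postings. One way to count them, sketched below, is to compare the term positions of the two postings and count the pairs that fall within a window of tokens. The class `NearOccurrenceSketch`, the method `countNear`, the `ordered` flag, and the pairwise counting rule are assumptions for illustration; the counting actually performed by this class may differ.

```java
// Hypothetical illustration only: int arrays stand in for the positions held by two postings.
final class NearOccurrenceSketch {

    /**
     * Counts position pairs (p1, p2) that lie fewer than `window` tokens apart.
     * When `ordered` is true, the second term must follow the first.
     */
    static int countNear(int[] pos1, int[] pos2, int window, boolean ordered) {
        int count = 0;
        for (int p1 : pos1) {
            for (int p2 : pos2) {
                int gap = p2 - p1;
                boolean close = ordered
                        ? (gap > 0 && gap < window)
                        : (gap != 0 && Math.abs(gap) < window);
                if (close) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

The nested loop is quadratic in the number of positions, which keeps the example short; since positions are sorted, a two-pointer scan would do the same work in linear time.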