java.lang.Object
  org.terrier.querying.ExpansionTerms
      org.terrier.querying.DFRBagExpansionTerms
public class DFRBagExpansionTerms extends ExpansionTerms
This class implements a data structure of terms in the top-retrieved documents. In particular, this implementation treats the entire feedback set as a bag of words, and weights term occurrences in this bag.
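The bag-of-words treatment described above can be illustrated with a minimal, library-independent sketch. `FeedbackBagSketch` and its methods are hypothetical names and are not Terrier code. The sketch assumes only that each feedback document adds its within-document term frequencies to one shared bag, and that a term's probability is its bag frequency divided by the total length of the bag.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a feedback "bag"; not part of Terrier. */
final class FeedbackBagSketch {
    /** termid -> frequency summed over all feedback documents. */
    private final Map<Integer, Double> frequency = new HashMap<>();
    /** termid -> number of feedback documents that contain the term. */
    private final Map<Integer, Integer> documentFrequency = new HashMap<>();
    /** Total number of tokens in the feedback documents inserted so far. */
    private double totalLength = 0.0;

    /** Adds one feedback document, given as termid -> within-document frequency. */
    void insertDocument(Map<Integer, Integer> termFrequencies) {
        for (Map.Entry<Integer, Integer> e : termFrequencies.entrySet()) {
            double tf = e.getValue().doubleValue();
            frequency.merge(e.getKey(), tf, Double::sum);
            documentFrequency.merge(e.getKey(), 1, Integer::sum);
            totalLength += tf;
        }
    }

    /** Frequency of the term in the whole bag (0 if unseen). */
    double getFrequency(int termId) {
        return frequency.getOrDefault(termId, 0.0);
    }

    /** Number of feedback documents the term occurs in (0 if unseen). */
    int getDocumentFrequency(int termId) {
        return documentFrequency.getOrDefault(termId, 0);
    }

    /** Assumed definition: bag frequency divided by total bag length. */
    double getExpansionProbability(int termId) {
        return totalLength == 0.0 ? 0.0 : getFrequency(termId) / totalLength;
    }

    /** Number of distinct terms seen in the bag. */
    int getNumberOfUniqueTerms() {
        return frequency.size();
    }
}
```

Keeping one shared map, rather than one map per document, is what makes the feedback set behave as a single bag: a term's weight then depends on its pooled frequency, not on which document it came from.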
Properties:
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.terrier.querying.ExpansionTerms |
---|
ExpansionTerms.ExpansionTerm |
Field Summary | |
---|---|
protected double |
averageDocumentLength
The average document length in the collection. |
protected PostingIndex<BitIndexPointer> |
directIndex
|
protected DocumentIndex |
documentIndex
|
protected int |
feedbackDocumentCount
|
protected Lexicon<java.lang.String> |
lexicon
The lexicon used for retrieval. |
double |
normaliser
The parameter-free term weight normaliser. |
protected int |
numberOfDocuments
The number of documents in the collection. |
protected long |
numberOfTokens
The number of tokens in the collection. |
protected gnu.trove.TIntObjectHashMap<ExpansionTerms.ExpansionTerm> |
terms
The terms in the top-retrieved documents. |
protected double |
totalDocumentLength
The number of tokens in the X top-ranked documents. |
Fields inherited from class org.terrier.querying.ExpansionTerms |
---|
EXPANSIONTERM_DESC_SCORE_SORTER, model, originalTermFreqs, originalTermids |
Constructor Summary | |
---|---|
DFRBagExpansionTerms(CollectionStatistics collStats,
Lexicon<java.lang.String> _lexicon,
PostingIndex<BitIndexPointer> _directIndex,
DocumentIndex _documentIndex)
Constructs an instance of DFRBagExpansionTerms. |
Method Summary | |
---|---|
void |
assignWeights(QueryExpansionModel QEModel)
Assigns weights to the terms stored in the terms map. |
void |
deleteTerm(int termid)
Removes the records for a given term. |
double |
getDocumentFrequency(int termId)
Returns the number of the top-ranked documents a given term occurs in. |
SingleTermQuery[] |
getExpandedTerms(int numberOfExpandedTerms)
This method implements the functionality of assigning expansion weights to the terms in the top-retrieved documents, and returns the most informative terms among them. |
protected SingleTermQuery[] |
getExpandedTerms(int numberOfExpandedTerms,
QueryExpansionModel QEModel)
|
double |
getExpansionProbability(int termId)
Returns the probability of a given termid occurring in the expansion documents. |
gnu.trove.TIntObjectHashMap<ExpansionTerms.ExpansionTerm> |
getExpansionTerms()
Returns the expansion terms. |
double |
getExpansionWeight(int termId)
Returns the weight of a term with the given term identifier. |
double |
getExpansionWeight(int termId,
QueryExpansionModel model)
Returns the weight of a term with the given term identifier, computed by the specified query expansion model. |
double |
getExpansionWeight(java.lang.String term)
Returns the weight of a given term. |
double |
getExpansionWeight(java.lang.String term,
QueryExpansionModel model)
Returns the weight of a given term, computed by the specified query expansion model. |
double |
getFrequency(int termId)
Returns the frequency of a given term in the top-ranked documents. |
double |
getFrequency(java.lang.String term)
Returns the frequency of a given term in the top-ranked documents. |
int |
getNumberOfUniqueTerms()
Returns the number of unique terms found in all the top-ranked documents. |
double |
getOriginalExpansionWeight(java.lang.String term)
Returns the un-normalised weight of a given term. |
int[] |
getTermIds()
Returns the termids of all terms found in the top-ranked documents. |
void |
insertDocument(FeedbackDocument doc)
Adds the feedback document to the feedback set. |
void |
insertDocument(int docid,
int rank,
double score)
Adds the feedback document with the given docid from the index. |
protected void |
insertTerm(int termID,
double withinDocumentFrequency)
Adds a term from the X top-retrieved documents as a candidate expansion term. |
void |
setTotalDocumentLength(double totalLength)
Allows the totalDocumentLength to be set after construction. |
Methods inherited from class org.terrier.querying.ExpansionTerms |
---|
setModel, setOriginalQueryTerms |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected gnu.trove.TIntObjectHashMap<ExpansionTerms.ExpansionTerm> terms
protected Lexicon<java.lang.String> lexicon
protected PostingIndex<BitIndexPointer> directIndex
protected DocumentIndex documentIndex
protected int numberOfDocuments
protected long numberOfTokens
protected double averageDocumentLength
protected double totalDocumentLength
public double normaliser
protected int feedbackDocumentCount
Constructor Detail |
---|
public DFRBagExpansionTerms(CollectionStatistics collStats, Lexicon<java.lang.String> _lexicon, PostingIndex<BitIndexPointer> _directIndex, DocumentIndex _documentIndex)
collStats
- Statistics of the used corpora
_lexicon
- Lexicon The lexicon used for retrieval.
_directIndex
- DirectIndex to use for finding terms for documents
_documentIndex
- DocumentIndex to use for finding statistics about documents
Method Detail |
---|
public void setTotalDocumentLength(double totalLength)
public int[] getTermIds()
public int getNumberOfUniqueTerms()
getNumberOfUniqueTerms
in class ExpansionTerms
public gnu.trove.TIntObjectHashMap<ExpansionTerms.ExpansionTerm> getExpansionTerms()
public SingleTermQuery[] getExpandedTerms(int numberOfExpandedTerms)
getExpandedTerms
in class ExpansionTerms
numberOfExpandedTerms
- int The number of terms to extract from the top-retrieved documents. ConservativeQE is set if this parameter is set to 0.
Returns: SingleTermQuery[] The expanded terms.
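As a rough illustration of the selection step described above, the sketch below ranks candidate terms by descending weight and keeps the first `numberOfExpandedTerms` of them. `TopTermsSketch` and `WeightedTerm` are hypothetical names, not Terrier classes; the weights are assumed to have been assigned already, and the special handling of 0 (ConservativeQE) is deliberately not modelled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of top-N term selection; not part of Terrier. */
final class TopTermsSketch {

    /** A candidate expansion term with an already-assigned weight. */
    static final class WeightedTerm {
        final int termId;
        final double weight;

        WeightedTerm(int termId, double weight) {
            this.termId = termId;
            this.weight = weight;
        }
    }

    /**
     * Returns up to numberOfExpandedTerms candidates, highest weight first.
     * The input list is copied and left unmodified.
     */
    static List<WeightedTerm> select(List<WeightedTerm> candidates,
                                     int numberOfExpandedTerms) {
        List<WeightedTerm> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((WeightedTerm t) -> t.weight)
                .reversed());
        // Clamp the requested count to the range [0, sorted.size()].
        int n = Math.min(Math.max(numberOfExpandedTerms, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, n));
    }
}
```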
protected SingleTermQuery[] getExpandedTerms(int numberOfExpandedTerms, QueryExpansionModel QEModel)
public void deleteTerm(int termid)
public double getExpansionWeight(java.lang.String term, QueryExpansionModel model)
term
- String the term to get the weight for.
model
- QueryExpansionModel the used query expansion model.
public double getExpansionWeight(java.lang.String term)
term
- String the term to get the weight for.
public double getOriginalExpansionWeight(java.lang.String term)
term
- String the given term.
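The relationship between a normalised weight and the un-normalised weight returned here can be sketched as follows. This is an assumption made for illustration, not a statement of how this class computes its values: the sketch takes the normaliser to be the largest raw weight, divides every raw weight by it, and recovers the original weight by multiplying back. `WeightNormaliserSketch` is a hypothetical name.

```java
/** Hypothetical illustration of weight normalisation; not part of Terrier. */
final class WeightNormaliserSketch {
    /** Weights after division by the normaliser. */
    private final double[] normalised;
    /** Assumed here to be the maximum raw weight (1.0 if all weights are 0). */
    final double normaliser;

    WeightNormaliserSketch(double[] rawWeights) {
        double max = 0.0;
        for (double w : rawWeights) {
            max = Math.max(max, w);
        }
        normaliser = max > 0.0 ? max : 1.0;
        normalised = new double[rawWeights.length];
        for (int i = 0; i < rawWeights.length; i++) {
            normalised[i] = rawWeights[i] / normaliser;
        }
    }

    /** Normalised weight of the term at the given position. */
    double getExpansionWeight(int index) {
        return normalised[index];
    }

    /** Un-normalised weight, recovered by undoing the division. */
    double getOriginalExpansionWeight(int index) {
        return normalised[index] * normaliser;
    }
}
```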
public double getFrequency(java.lang.String term)
term
- String the term to get the frequency for.
public double getFrequency(int termId)
termId
- int the id of the term to get the frequency for.
public double getDocumentFrequency(int termId)
termId
- int the id of the term to get the document frequency for.
public void assignWeights(QueryExpansionModel QEModel)
QEModel
- QueryExpansionModel the used query expansion model.
public double getExpansionWeight(int termId, QueryExpansionModel model)
termId
- int the term identifier to get the weight for.
model
- QueryExpansionModel the used query expansion model.
public double getExpansionWeight(int termId)
termId
- int the term identifier to get the weight for.
public double getExpansionProbability(int termId)
termId
- int the term identifier to obtain the probability for.
public void insertDocument(FeedbackDocument doc) throws java.io.IOException
insertDocument
in class ExpansionTerms
java.io.IOException
public void insertDocument(int docid, int rank, double score) throws java.io.IOException
java.io.IOException
protected void insertTerm(int termID, double withinDocumentFrequency)
termID
- int the integer identifier of a term
withinDocumentFrequency
- double the within-document frequency of a term