java.lang.Object
  org.terrier.structures.indexing.LexiconMap
public class LexiconMap
This class keeps track of the total counts of terms within a bundle of documents being indexed. Internally, it uses hash maps. This class replaces LexiconTree and related classes.
Properties
Field Summary | |
---|---|
protected static int | BUNDLE_AVG_UNIQUE_TERMS - Number of unique terms expected to be indexed in a bundle of documents. |
protected gnu.trove.TObjectIntHashMap<java.lang.String> | nts - Mapping: term to document frequency. |
protected int | numberOfNodes - Number of different terms. |
protected int | numberOfPointers - Number of different entries there will be in the inverted index. |
protected gnu.trove.TObjectIntHashMap<java.lang.String> | tfs - Mapping: term to term frequency in the collection. |
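
The fields above suggest how the counters relate to one another. The following is a minimal sketch of that bookkeeping, not the Terrier implementation: it uses `java.util.HashMap` as a stand-in for `gnu.trove.TObjectIntHashMap`, the class name `LexiconMapSketch` is hypothetical, and it assumes that each call to `insert` represents one term occurring in one document.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in for the counters kept by a lexicon map (hypothetical class). */
class LexiconMapSketch {
    // term -> total frequency over the bundle (plays the role of tfs)
    final Map<String, Integer> tfs = new HashMap<>();
    // term -> number of documents containing the term (plays the role of nts)
    final Map<String, Integer> nts = new HashMap<>();
    int numberOfNodes = 0;     // distinct terms seen so far
    int numberOfPointers = 0;  // (term, document) pairs seen so far

    /** Assumption: one call means "one document contains term with frequency tf". */
    void insert(String term, int tf) {
        tfs.merge(term, tf, Integer::sum);
        int nt = nts.merge(term, 1, Integer::sum);
        numberOfPointers++;
        if (nt == 1) {
            numberOfNodes++;   // first time this term is seen in the bundle
        }
    }

    public static void main(String[] args) {
        LexiconMapSketch lex = new LexiconMapSketch();
        lex.insert("index", 2);  // document 1
        lex.insert("term", 1);   // document 1
        lex.insert("index", 3);  // document 2
        System.out.println("tf(index)=" + lex.tfs.get("index"));
        System.out.println("nt(index)=" + lex.nts.get("index"));
        System.out.println("nodes=" + lex.numberOfNodes);
        System.out.println("pointers=" + lex.numberOfPointers);
    }
}
```

Under these assumptions, `numberOfPointers` grows with every insertion while `numberOfNodes` grows only when a term is new to the bundle.
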
Constructor Summary | |
---|---|
LexiconMap() | |
Method Summary | |
---|---|
void | clear() - Clears the lexicon map. |
int | getNumberOfNodes() - Returns the number of nodes (distinct terms) in the lexicon map. |
int | getNumberOfPointers() - Returns the number of pointers in the lexicon map. |
void | insert(DocumentPostingList doc) - Inserts all the terms from a document posting list into the lexicon map. |
void | insert(java.lang.String term, int tf) - Inserts a new term in the lexicon map. |
void | storeToStream(LexiconOutputStream<java.lang.String> lexiconStream) - Stores the lexicon map to a lexicon stream as a sequence of entries. |
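
To illustrate what storing the map "as a sequence of entries" and then clearing it could look like at the end of a bundle, here is a hedged sketch. The `LexiconOutputStream` API is not described on this page, so a plain `List<String>` acts as a hypothetical sink; the class name `LexiconFlushSketch`, the tab-separated line format, and the choice to write entries sorted by term are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Illustrative flush of per-bundle counts to a sink (hypothetical class). */
class LexiconFlushSketch {
    final Map<String, Integer> tfs = new HashMap<>();  // term -> term frequency
    final Map<String, Integer> nts = new HashMap<>();  // term -> document frequency

    /** Appends one "term<TAB>documentFrequency<TAB>termFrequency" line per term. */
    void storeTo(List<String> sink) {
        // Assumption: entries are emitted in lexicographic term order.
        for (String term : new TreeSet<>(nts.keySet())) {
            sink.add(term + "\t" + nts.get(term) + "\t" + tfs.get(term));
        }
    }

    /** Empties both maps so the next bundle starts from scratch. */
    void clear() {
        tfs.clear();
        nts.clear();
    }

    public static void main(String[] args) {
        LexiconFlushSketch lex = new LexiconFlushSketch();
        lex.nts.put("term", 1);
        lex.tfs.put("term", 1);
        lex.nts.put("index", 2);
        lex.tfs.put("index", 5);

        List<String> sink = new ArrayList<>();
        lex.storeTo(sink);   // write the bundle's entries
        lex.clear();         // reset before indexing the next bundle
        sink.forEach(System.out::println);
    }
}
```

Copying the keys into a `TreeSet` keeps the hash maps unordered during indexing and pays the sorting cost only once per flush.
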
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final int BUNDLE_AVG_UNIQUE_TERMS
protected int numberOfNodes
protected int numberOfPointers
protected final gnu.trove.TObjectIntHashMap<java.lang.String> tfs
protected final gnu.trove.TObjectIntHashMap<java.lang.String> nts
Constructor Detail |
---|
public LexiconMap()
Method Detail |
---|
public void clear()
public void insert(java.lang.String term, int tf)
Parameters:
term - The term to be inserted.
tf - The frequency of the term.
public void insert(DocumentPostingList doc)
Parameters:
doc - The posting list for that document.
public void storeToStream(LexiconOutputStream<java.lang.String> lexiconStream) throws java.io.IOException
Parameters:
lexiconStream - The lexicon output stream to store to.
Throws:
java.io.IOException
public int getNumberOfNodes()
public int getNumberOfPointers()