public class LexiconMap extends Object
Properties
Modifier and Type | Field and Description |
---|---|
protected static int | BUNDLE_AVG_UNIQUE_TERMS: Number of unique terms expected to be indexed in a bundle of documents. |
protected gnu.trove.TObjectIntHashMap<String> | maxtfs: Mapping from term to maximum term frequency (max tf). |
protected gnu.trove.TObjectIntHashMap<String> | nts: Mapping from term to document frequency. |
protected int | numberOfNodes: Number of distinct terms. |
protected int | numberOfPointers: Number of distinct entries the inverted index will contain. |
protected gnu.trove.TObjectIntHashMap<String> | tfs: Mapping from term to term frequency in the collection. |
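To make the relationship between these fields concrete, here is a minimal sketch of how they might be updated when a term is inserted. It is not Terrier's implementation: the class name LexiconMapSketch is hypothetical, java.util.HashMap stands in for gnu.trove.TObjectIntHashMap, and it assumes that each insert call records one term occurrence list for one document, with tf being the within-document frequency.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for LexiconMap's bookkeeping; not Terrier's code.
class LexiconMapSketch {
    // term -> total occurrences in the collection (models tfs)
    final Map<String, Integer> tfs = new HashMap<>();
    // term -> number of documents containing the term (models nts)
    final Map<String, Integer> nts = new HashMap<>();
    // term -> largest within-document frequency seen so far (models maxtfs)
    final Map<String, Integer> maxtfs = new HashMap<>();
    // number of distinct terms (models numberOfNodes)
    int numberOfNodes = 0;
    // number of (term, document) pairs (models numberOfPointers)
    int numberOfPointers = 0;

    // Assumption: one call corresponds to one term in one document.
    void insert(String term, int tf) {
        tfs.merge(term, tf, Integer::sum);          // accumulate collection frequency
        int df = nts.merge(term, 1, Integer::sum);  // one more document contains the term
        maxtfs.merge(term, tf, Math::max);          // keep the largest tf seen
        if (df == 1) {
            numberOfNodes++;                        // first time this term is seen
        }
        numberOfPointers++;                         // one more posting in the inverted index
    }
}
```

Under these assumptions, inserting "retrieval" with tf 3 and then with tf 5 leaves one node, two pointers, a collection frequency of 8, a document frequency of 2, and a max tf of 5 for that term.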
Constructor and Description |
---|
LexiconMap() |
Modifier and Type | Method and Description |
---|---|
void | clear(): Clears the lexicon map. |
int | getNumberOfNodes(): Returns the number of nodes in the tree. |
int | getNumberOfPointers(): Returns the number of pointers in the tree. |
void | insert(DocumentPostingList doc): Inserts all the terms from a document posting list into the lexicon map. |
void | insert(String term, int tf): Inserts a new term in the lexicon map. |
void | storeToStream(LexiconOutputStream<String> lexiconStream, TermCodes termCodes): Stores the lexicon tree to a lexicon stream as a sequence of entries. |
protected static final int BUNDLE_AVG_UNIQUE_TERMS
protected int numberOfNodes
protected int numberOfPointers
protected final gnu.trove.TObjectIntHashMap<String> tfs
protected final gnu.trove.TObjectIntHashMap<String> nts
protected final gnu.trove.TObjectIntHashMap<String> maxtfs
public void clear()
public void insert(String term, int tf)
term - The term to be inserted.
tf - The term frequency (tf) of the term.
public void insert(DocumentPostingList doc)
doc - The posting list for that document.
public void storeToStream(LexiconOutputStream<String> lexiconStream, TermCodes termCodes) throws IOException
lexiconStream - The lexicon output stream to store to.
Throws: IOException
public int getNumberOfNodes()
public int getNumberOfPointers()
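The idea of storing the lexicon "as a sequence of entries" can be illustrated with a small sketch. This is only a model: the class LexiconEntriesSketch and its toEntries method are hypothetical, plain java.util collections replace LexiconOutputStream and TermCodes, and emitting the terms in lexicographic order is an assumption that the documentation above does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; models storeToStream without any Terrier classes.
class LexiconEntriesSketch {
    // Builds one text entry per term from the three term statistics maps.
    static List<String> toEntries(Map<String, Integer> tfs,
                                  Map<String, Integer> nts,
                                  Map<String, Integer> maxtfs) {
        List<String> entries = new ArrayList<>();
        // Assumption: entries are written in lexicographic term order.
        for (String term : new TreeMap<>(tfs).keySet()) {
            entries.add(term
                    + " df=" + nts.get(term)
                    + " tf=" + tfs.get(term)
                    + " maxtf=" + maxtfs.get(term));
        }
        return entries;
    }
}
```

Each entry here carries the term with its document frequency, collection frequency, and max tf, which are the three statistics the class keeps per term. The real method also takes a TermCodes argument, which this sketch leaves out.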
Terrier Information Retrieval Platform 5.1. Copyright © 2004-2019, University of Glasgow