Terrier IR Platform 2.2.1 |
java.lang.Object
  uk.ac.gla.terrier.structures.indexing.LexiconBuilder
public class LexiconBuilder
Builds temporary lexicons while a collection is being indexed and merges them once indexing of the collection has finished.
Constructor Summary | |
---|---|
LexiconBuilder() | Deprecated. |
LexiconBuilder(Index i) | |
LexiconBuilder(java.lang.String pathname, java.lang.String prefix) | Creates an instance of the class, given the path to save the temporary lexicons. |
Method Summary | | |
---|---|---|
void | addDocumentTerms(DocumentPostingList terms) | Adds the terms of a document to the temporary lexicon in memory. |
void | addTemporaryLexicon(java.lang.String filename) | If the application code generated lexicons itself, use this method to add them to the merge list. Otherwise, do not call this method. |
void | addTerm(java.lang.String term, int tf) | Adds a single term to the lexicon being built. |
static void | createLexiconHash(Index index) | Creates a lexicon hash for the specified index. |
void | createLexiconHash(LexiconInputStream lexStream) | Creates a lexicon hash for the current index. |
static void | createLexiconHash(LexiconInputStream lexStream, java.io.OutputStream out) | |
static void | createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix) | Creates a lexicon hash. |
static void | createLexiconIndex(Index index) | Creates a lexicon index for the specified index. |
void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) | Creates the lexicon index file, which maps a given term id to the offset in the lexicon so that the term information can be retrieved by term identifier. |
static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.io.DataOutputStream dosLexid) | |
static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) | Creates the lexicon index file, which maps a given term id to the offset in the lexicon so that the term information can be retrieved by term identifier. |
void | finishedDirectIndexBuild() | Processes the lexicon after the direct and document indexes have been created. |
void | finishedInvertedIndexBuild() | Processes the lexicon after the inverted index has been created. |
void | flush() | Forces a temporary lexicon to be flushed. |
int | getFinalNumberOfTerms() | Returns the number of terms in the final lexicon. |
static void | main(java.lang.String[] args) | |
void | merge(java.util.LinkedList<java.lang.String> filesToMerge) | Merges the intermediate lexicon files created during indexing. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
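The summary above says that addTerm and addDocumentTerms fill a temporary lexicon in memory and that flush forces it out. The following is a minimal Java sketch of that idea, not Terrier's implementation: the class TempLexiconSketch, its runs() accessor, and the use of sorted in-memory maps in place of temporary lexicon files are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy stand-in for an in-memory temporary lexicon (hypothetical, not Terrier code). */
final class TempLexiconSketch {
    // Sorted map, so that a flushed run lists its terms in lexicographic order.
    private final TreeMap<String, Integer> current = new TreeMap<>();
    // Each flushed run stands in for one temporary lexicon file on disk.
    private final List<TreeMap<String, Integer>> flushedRuns = new ArrayList<>();

    /** Adds tf occurrences of term to the run currently being built. */
    void addTerm(String term, int tf) {
        current.merge(term, tf, Integer::sum);
    }

    /** Closes the current run, if it holds any terms, and starts an empty one. */
    void flush() {
        if (current.isEmpty()) {
            return;
        }
        flushedRuns.add(new TreeMap<>(current));
        current.clear();
    }

    /** Returns the runs flushed so far, oldest first. */
    List<TreeMap<String, Integer>> runs() {
        return flushedRuns;
    }
}
```

In this sketch, calling addTerm("retrieval", 2) and then addTerm("retrieval", 3) before a flush leaves one run in which "retrieval" has frequency 5.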
Constructor Detail |
---|
public LexiconBuilder()
public LexiconBuilder(Index i)
public LexiconBuilder(java.lang.String pathname, java.lang.String prefix)
Parameters:
pathname - String the path to save the temporary lexicons.
Method Detail |
---|
public int getFinalNumberOfTerms()
public void addTemporaryLexicon(java.lang.String filename)
Parameters:
filename - Full path to a lexicon to merge.
public void addTerm(java.lang.String term, int tf)
Parameters:
term - The String term.
tf - The frequency of the term.
public void addDocumentTerms(DocumentPostingList terms)
Parameters:
terms - DocumentPostingList the terms of the document to add to the temporary lexicon.
public void flush()
public void finishedInvertedIndexBuild()
public void finishedDirectIndexBuild()
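The merge method on this page combines the intermediate lexicon files written during indexing. One common way to do this, assumed here and not taken from the Terrier source, is a k-way merge over sorted runs that sums the frequencies of equal terms. LexiconMergeSketch and Cursor are hypothetical names, and in-memory TreeMaps stand in for the temporary files.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Toy k-way merge of sorted term runs (hypothetical, not Terrier code). */
final class LexiconMergeSketch {

    /** Cursor over one sorted run; head is the entry currently at the front. */
    private static final class Cursor {
        final Iterator<Map.Entry<String, Integer>> it;
        Map.Entry<String, Integer> head;

        Cursor(Iterator<Map.Entry<String, Integer>> it) {
            this.it = it;
            this.head = it.next(); // callers only pass non-empty runs
        }

        /** Moves to the next entry; returns false when the run is exhausted. */
        boolean advance() {
            if (!it.hasNext()) {
                return false;
            }
            head = it.next();
            return true;
        }
    }

    /** Merges the runs into one term-ordered map, summing frequencies of equal terms. */
    static Map<String, Integer> merge(List<TreeMap<String, Integer>> runs) {
        // The queue always yields the cursor whose head term is smallest.
        PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>(
                (a, b) -> a.head.getKey().compareTo(b.head.getKey()));
        for (TreeMap<String, Integer> run : runs) {
            if (!run.isEmpty()) {
                queue.add(new Cursor(run.entrySet().iterator()));
            }
        }
        // Insertion order equals term order, because terms arrive smallest first.
        Map<String, Integer> merged = new LinkedHashMap<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            merged.merge(c.head.getKey(), c.head.getValue(), Integer::sum);
            if (c.advance()) {
                queue.add(c);
            }
        }
        return merged;
    }
}
```

A real merge over files would stream entries to an output file instead of collecting them in a map; the map is used here only to keep the sketch short.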
public void merge(java.util.LinkedList<java.lang.String> filesToMerge) throws java.io.IOException
Parameters:
filesToMerge - java.util.LinkedList the list containing the filenames of the temporary files.
Throws:
java.io.IOException - an input/output exception is thrown if a problem is encountered.
public void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) throws java.io.IOException
Parameters:
lexicon - The input stream of the lexicon that we are creating the lexid file for.
lexiconEntries - The number of entries in this lexicon.
lexiconEntrySize - The size of one entry in this lexicon.
Throws:
java.io.IOException - if there is an input/output error.
public static void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) throws java.io.IOException
Parameters:
lexicon - The input stream of the lexicon that we are creating the lexid file for.
lexiconEntries - The number of entries in this lexicon.
lexiconEntrySize - The size of one entry in this lexicon.
path - The path to the index containing the lexicon.
prefix - The prefix of the index containing the lexicon.
Throws:
java.io.IOException - if there is an input/output error.
public static void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.io.DataOutputStream dosLexid) throws java.io.IOException
Throws:
java.io.IOException
public static void createLexiconIndex(Index index) throws java.io.IOException
Parameters:
index - Index to make the lexicon index for.
Throws:
java.io.IOException
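The createLexiconIndex overloads build a file that maps a term id to the offset of that term in the lexicon. A minimal sketch of such a mapping follows. It assumes fixed-size lexicon entries and dense term ids in the range 0 to lexiconEntries - 1, and it writes one long per term id; none of this is stated on this page. LexiconIndexSketch, buildOffsets, and write are hypothetical names, and an int array of term ids stands in for a LexiconInputStream.

```java
import java.io.DataOutputStream;
import java.io.IOException;

/** Toy term-id-to-offset table (hypothetical, not Terrier code). */
final class LexiconIndexSketch {

    /**
     * termIdsInLexiconOrder[i] is the term id stored in the i-th lexicon entry.
     * Returns an array in which element t is the byte offset of the entry for term id t.
     * Assumes the ids are exactly 0 .. n-1, each appearing once.
     */
    static long[] buildOffsets(int[] termIdsInLexiconOrder, int lexiconEntrySize) {
        long[] offsets = new long[termIdsInLexiconOrder.length];
        for (int position = 0; position < termIdsInLexiconOrder.length; position++) {
            // Entries have a fixed size, so entry i starts at i * lexiconEntrySize.
            offsets[termIdsInLexiconOrder[position]] = (long) position * lexiconEntrySize;
        }
        return offsets;
    }

    /** Writes the offsets in term-id order, 8 bytes each. */
    static void write(long[] offsets, DataOutputStream out) throws IOException {
        for (long offset : offsets) {
            out.writeLong(offset);
        }
        out.flush();
    }
}
```

With such a layout, the offset for term id t can be read back by seeking to byte 8 * t in the written stream.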
public void createLexiconHash(LexiconInputStream lexStream)
Parameters:
lexStream - LexiconInputStream to process.
public static void createLexiconHash(Index index) throws java.io.IOException
Parameters:
index - Index to make the LexiconHash for.
Throws:
java.io.IOException
public static void createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix)
Parameters:
lexStream - LexiconInputStream to process.
path - Path to the index containing the lexicon.
prefix - Prefix of the index containing the lexicon.
public static void createLexiconHash(LexiconInputStream lexStream, java.io.OutputStream out)
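This page does not say what a lexicon hash contains. One plausible reading, which is an assumption here and not a statement about Terrier's file format, is a table from the first character of a term to the range of positions in the sorted lexicon that hold terms starting with that character, which would narrow a later binary search. The sketch below illustrates only that reading; LexiconHashSketch and build are hypothetical names, and a sorted list of strings stands in for a LexiconInputStream.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy first-character range table over a sorted term list (hypothetical, not Terrier code). */
final class LexiconHashSketch {

    /**
     * Returns, for each first character, a two-element array holding the first and
     * last position (both inclusive) of the terms that start with that character.
     * The input list must already be sorted, as a lexicon would be.
     */
    static Map<Character, int[]> build(List<String> sortedTerms) {
        Map<Character, int[]> bounds = new LinkedHashMap<>();
        for (int i = 0; i < sortedTerms.size(); i++) {
            String term = sortedTerms.get(i);
            if (term.isEmpty()) {
                continue; // an empty term has no first character to key on
            }
            char first = term.charAt(0);
            int[] range = bounds.get(first);
            if (range == null) {
                bounds.put(first, new int[] {i, i});
            } else {
                range[1] = i; // extend the range to the latest matching position
            }
        }
        return bounds;
    }
}
```

A lookup for a term would then restrict its binary search to the positions between the two bounds stored for the term's first character.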
public static void main(java.lang.String[] args)