Terrier IR Platform 1.1.1 |
java.lang.Object
  uk.ac.gla.terrier.structures.indexing.LexiconBuilder
public class LexiconBuilder
Builds temporary lexicons while a collection is being indexed and merges them once the indexing of the collection has finished.
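The following is a minimal, self-contained sketch of the idea described above: term statistics are accumulated in an in-memory temporary lexicon, which is set aside once a given number of documents has been processed. It does not use the Terrier classes. The class `TempLexiconSketch`, the `Stats` holder, the `docsPerLexicon` threshold and the use of in-memory maps in place of files are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of Terrier. */
public class TempLexiconSketch {
    /** Per-term statistics kept in a temporary lexicon. */
    static final class Stats {
        int docFreq;   // number of documents containing the term
        int termFreq;  // total occurrences of the term
    }

    private final int docsPerLexicon;  // assumed flush threshold
    private final TreeMap<String, Stats> current = new TreeMap<>();
    private final List<TreeMap<String, Stats>> flushed = new ArrayList<>();
    private int docsInCurrent = 0;

    TempLexiconSketch(int docsPerLexicon) {
        this.docsPerLexicon = docsPerLexicon;
    }

    /** Adds one document's term frequencies to the in-memory temporary lexicon. */
    void addDocumentTerms(Map<String, Integer> termFreqs) {
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            Stats s = current.computeIfAbsent(e.getKey(), k -> new Stats());
            s.docFreq += 1;
            s.termFreq += e.getValue();
        }
        docsInCurrent++;
        if (docsInCurrent >= docsPerLexicon) {
            flush();
        }
    }

    /** Sets the current lexicon aside; a real builder would write it to disk here. */
    void flush() {
        if (!current.isEmpty()) {
            flushed.add(new TreeMap<>(current));
            current.clear();
        }
        docsInCurrent = 0;
    }

    /** Returns the temporary lexicons produced so far, each sorted by term. */
    List<TreeMap<String, Stats>> temporaryLexicons() {
        return flushed;
    }
}
```

Keeping each temporary lexicon sorted by term is what makes a later merge a simple sequential pass over the files.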
Constructor Summary | |
---|---|
LexiconBuilder()
A default constructor of the class. |
|
LexiconBuilder(java.lang.String pathname,
java.lang.String prefix)
Creates an instance of the class, given the path to save the temporary lexicons. |
Method Summary | |
---|---|
void |
addDocumentTerms(DocumentPostingList terms)
Adds the terms of a document to the temporary lexicon in memory. |
void |
addDocumentTerms(FieldDocumentTreeNode[] terms)
Adds the terms of a document to the temporary lexicon in memory. |
void |
addTemporaryLexicon(java.lang.String filename)
If the application code generated lexicons itself, use this method to add them to the merge list. Otherwise, do not call this method. |
void |
createLexiconHash(LexiconInputStream lexStream)
|
static void |
createLexiconHash(LexiconInputStream lexStream,
java.lang.String path,
java.lang.String prefix)
Reads the lexicon and finds the entries that start with a different letter. |
void |
createLexiconIndex(LexiconInputStream lexicon,
int lexiconEntries,
int lexiconEntrySize)
Creates the lexicon index file, which maps each term id to the offset of its entry in the lexicon, so that term information can be retrieved by term identifier. |
static void |
createLexiconIndex(LexiconInputStream lexicon,
int lexiconEntries,
int lexiconEntrySize,
java.lang.String path,
java.lang.String prefix)
|
void |
finishedDirectIndexBuild()
Processes the lexicon after the direct and document indexes have been created. |
void |
finishedInvertedIndexBuild()
Processes the lexicon after the inverted index has been created. |
int |
getFinalNumberOfTerms()
Returns the number of terms in the final lexicon. |
LexiconInputStream |
getLexInputStream(java.lang.String filename)
|
LexiconOutputStream |
getLexOutputStream(java.lang.String filename)
|
static void |
main(java.lang.String[] args)
|
void |
merge(java.util.LinkedList<java.lang.String> filesToMerge)
Merges the intermediate lexicon files created during the indexing. |
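As an illustration of what merging intermediate lexicons involves, here is a hedged sketch of a k-way merge over term-sorted inputs, in which the frequencies of a term that appears in several inputs are summed. It is not the algorithm used by `merge(java.util.LinkedList)`, whose internals are not described on this page. The class `LexiconMergeSketch`, the `Cursor` helper and the use of in-memory sorted maps in place of lexicon files are assumptions.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.SortedMap;

/** Hypothetical illustration only; not part of Terrier. */
public class LexiconMergeSketch {
    /** Read position within one term-sorted temporary lexicon. */
    private static final class Cursor {
        final Iterator<Map.Entry<String, Integer>> it;
        Map.Entry<String, Integer> head;

        Cursor(Iterator<Map.Entry<String, Integer>> it) {
            this.it = it;
            this.head = it.next();
        }

        /** Moves to the next entry; returns false when the lexicon is exhausted. */
        boolean advance() {
            if (!it.hasNext()) {
                return false;
            }
            head = it.next();
            return true;
        }
    }

    /** Merges term-sorted lexicons (term to frequency) into one term-sorted lexicon. */
    static LinkedHashMap<String, Integer> merge(List<? extends SortedMap<String, Integer>> lexicons) {
        // The queue always yields the cursor whose current term is smallest.
        PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> a.head.getKey().compareTo(b.head.getKey()));
        for (SortedMap<String, Integer> lexicon : lexicons) {
            if (!lexicon.isEmpty()) {
                queue.add(new Cursor(lexicon.entrySet().iterator()));
            }
        }
        // Terms arrive in sorted order, so insertion order equals term order.
        LinkedHashMap<String, Integer> merged = new LinkedHashMap<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            merged.merge(c.head.getKey(), c.head.getValue(), Integer::sum);
            if (c.advance()) {
                queue.add(c);
            }
        }
        return merged;
    }
}
```

Because every input is already sorted, each one is read only once from start to end, which is why the same approach works when the inputs are files too large to hold in memory.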
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public LexiconBuilder()
public LexiconBuilder(java.lang.String pathname, java.lang.String prefix)
pathname
- String the path to save the temporary lexicons.
Method Detail |
---|
public int getFinalNumberOfTerms()
public void addTemporaryLexicon(java.lang.String filename)
filename
- Full path to a lexicon to merge.
public void addDocumentTerms(FieldDocumentTreeNode[] terms)
terms
- FieldDocumentTreeNode[] the terms of the document to add to the temporary lexicon in memory.
public void addDocumentTerms(DocumentPostingList terms)
terms
- DocumentPostingList the terms of the document to add to the temporary lexicon.
public void finishedInvertedIndexBuild()
public void finishedDirectIndexBuild()
public void merge(java.util.LinkedList<java.lang.String> filesToMerge) throws java.io.IOException
filesToMerge
- java.util.LinkedList the list containing the filenames of the temporary files.
java.io.IOException
- thrown if an input/output problem is encountered.
public void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) throws java.io.IOException
lexicon
- The input stream of the lexicon for which the lexid file is being created.
lexiconEntries
- The number of entries in this lexicon.
lexiconEntrySize
- The size of one entry in this lexicon.
java.io.IOException
- thrown if there is an input/output error.
public static void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) throws java.io.IOException
java.io.IOException
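The mapping from term id to lexicon offset that `createLexiconIndex` is documented to produce can be sketched as follows. This is an assumption-laden illustration, not the Terrier implementation: it assumes fixed-size lexicon entries, term ids that are dense in the range 0 to `lexiconEntries - 1`, and an in-memory array in place of the index file. The class `LexiconIndexSketch` and the method `buildOffsets` are hypothetical names.

```java
/** Hypothetical illustration only; not part of Terrier. */
public class LexiconIndexSketch {
    /**
     * Builds a table of byte offsets indexed by term id.
     *
     * @param termIdsInLexiconOrder the term id stored in each lexicon entry,
     *        in the order the entries appear in the lexicon (assumed dense, 0..n-1)
     * @param lexiconEntrySize the size in bytes of one lexicon entry (assumed fixed)
     */
    static long[] buildOffsets(int[] termIdsInLexiconOrder, int lexiconEntrySize) {
        long[] offsets = new long[termIdsInLexiconOrder.length];
        for (int position = 0; position < termIdsInLexiconOrder.length; position++) {
            int termId = termIdsInLexiconOrder[position];
            // Entry number `position` starts at position * entrySize bytes.
            offsets[termId] = (long) position * lexiconEntrySize;
        }
        return offsets;
    }
}
```

With such a table, the entry for a term id can be found with a single seek rather than a search through the term-sorted lexicon.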
public void createLexiconHash(LexiconInputStream lexStream)
public static void createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix)
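The summary for `createLexiconHash` says only that the method finds the entries that start with a different letter. One plausible reading, sketched below, is that it records where each initial letter begins and ends in the term-sorted lexicon, so that a lookup can be restricted to that range. What the method actually stores is not stated on this page. The class `LexiconHashSketch`, the method `firstLetterRanges` and the in-memory list of terms are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Terrier. */
public class LexiconHashSketch {
    /**
     * For a term-sorted lexicon, returns for each initial character the
     * inclusive range {first index, last index} of entries starting with it.
     */
    static Map<Character, int[]> firstLetterRanges(List<String> sortedTerms) {
        Map<Character, int[]> ranges = new LinkedHashMap<>();
        for (int i = 0; i < sortedTerms.size(); i++) {
            String term = sortedTerms.get(i);
            if (term.isEmpty()) {
                continue;  // an empty term has no initial letter
            }
            char first = term.charAt(0);
            int[] range = ranges.get(first);
            if (range == null) {
                // First entry seen for this letter: the range starts and ends here.
                ranges.put(first, new int[] {i, i});
            } else {
                range[1] = i;  // extend the range to the current entry
            }
        }
        return ranges;
    }
}
```

A lookup for a term would then consult the range for its first character and search only between those two positions.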
public static void main(java.lang.String[] args)
public LexiconInputStream getLexInputStream(java.lang.String filename)
public LexiconOutputStream getLexOutputStream(java.lang.String filename)