Terrier IR Platform 2.2.1

java.lang.Object
  uk.ac.gla.terrier.structures.indexing.LexiconBuilder

public class LexiconBuilder
Builds temporary lexicons while a collection is being indexed, and merges them once the indexing of the collection has finished.

| Constructor Summary | |
|---|---|
| LexiconBuilder() | Deprecated. |
| LexiconBuilder(Index i) | |
| LexiconBuilder(java.lang.String pathname, java.lang.String prefix) | Creates an instance of the class, given the path to save the temporary lexicons. |

| Method Summary | | |
|---|---|---|
| void | addDocumentTerms(DocumentPostingList terms) | Adds the terms of a document to the temporary lexicon in memory. |
| void | addTemporaryLexicon(java.lang.String filename) | If the application code generated lexicons itself, use this method to add them to the merge list. Otherwise, do not call this method. |
| void | addTerm(java.lang.String term, int tf) | Adds a single term to the lexicon being built. |
| static void | createLexiconHash(Index index) | Creates a lexicon hash for the specified index. |
| void | createLexiconHash(LexiconInputStream lexStream) | Creates a lexicon hash for the current index. |
| static void | createLexiconHash(LexiconInputStream lexStream, java.io.OutputStream out) | |
| static void | createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix) | Creates a lexicon hash. |
| static void | createLexiconIndex(Index index) | Creates a lexicon index for the specified index. |
| void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) | Creates the lexicon index file, which maps each term id to the offset of its entry in the lexicon, so that term information can be retrieved by term identifier. |
| static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.io.DataOutputStream dosLexid) | |
| static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) | Creates the lexicon index file, which maps each term id to the offset of its entry in the lexicon, so that term information can be retrieved by term identifier. |
| void | finishedDirectIndexBuild() | Processes the lexicon after the direct and document indexes have been created. |
| void | finishedInvertedIndexBuild() | Processes the lexicon after the inverted index has been created. |
| void | flush() | Forces a temporary lexicon to be flushed. |
| int | getFinalNumberOfTerms() | Returns the number of terms in the final lexicon. |
| static void | main(java.lang.String[] args) | |
| void | merge(java.util.LinkedList<java.lang.String> filesToMerge) | Merges the intermediate lexicon files created during the indexing. |

| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Constructor Detail |
|---|

public LexiconBuilder()
Deprecated.

public LexiconBuilder(Index i)

public LexiconBuilder(java.lang.String pathname, java.lang.String prefix)
Parameters:
pathname - the path to save the temporary lexicons.

| Method Detail |
|---|

public int getFinalNumberOfTerms()

public void addTemporaryLexicon(java.lang.String filename)
Parameters:
filename - the full path to a lexicon to merge

public void addTerm(java.lang.String term, int tf)
Parameters:
term - the String term
tf - the frequency of the term

public void addDocumentTerms(DocumentPostingList terms)
Parameters:
terms - the DocumentPostingList holding the terms of the document to add to the temporary lexicon

public void flush()

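
As a rough illustration of how addTerm, addDocumentTerms and flush could cooperate, the sketch below accumulates term frequencies in a sorted in-memory map and emits one sorted run per flush. The class TemporaryLexiconSketch is a hypothetical stand-in written for this page, not Terrier's implementation: it takes a plain Map where Terrier takes a DocumentPostingList, and it keeps the flushed runs in memory where a real builder would presumably write temporary files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a temporary lexicon; not part of Terrier. */
final class TemporaryLexiconSketch {
    // Sorted map, so each flushed run lists its terms in lexicographic order.
    private final TreeMap<String, Integer> inMemory = new TreeMap<>();
    // Assumption: runs are kept in memory here instead of temporary files.
    private final List<Map<String, Integer>> flushedRuns = new ArrayList<>();

    /** Same shape as addTerm(String term, int tf): add tf to the term's total. */
    void addTerm(String term, int tf) {
        inMemory.merge(term, tf, Integer::sum);
    }

    /** Assumption: a Map of term to frequency stands in for DocumentPostingList. */
    void addDocumentTerms(Map<String, Integer> documentTerms) {
        for (Map.Entry<String, Integer> e : documentTerms.entrySet()) {
            addTerm(e.getKey(), e.getValue());
        }
    }

    /** Same shape as flush(): emit the current terms as one run, then start afresh. */
    void flush() {
        if (inMemory.isEmpty()) {
            return; // nothing to flush
        }
        flushedRuns.add(new TreeMap<>(inMemory));
        inMemory.clear();
    }

    List<Map<String, Integer>> flushedRuns() {
        return flushedRuns;
    }

    public static void main(String[] args) {
        TemporaryLexiconSketch lexicon = new TemporaryLexiconSketch();
        lexicon.addTerm("retrieval", 2);
        lexicon.addTerm("index", 1);
        lexicon.addTerm("retrieval", 3);
        lexicon.flush();
        System.out.println(lexicon.flushedRuns()); // prints [{index=1, retrieval=5}]
    }
}
```

Keeping each run sorted is the design choice that matters here: it lets a later merge step read the runs sequentially instead of loading them all at once.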
public void finishedInvertedIndexBuild()
public void finishedDirectIndexBuild()
public void merge(java.util.LinkedList<java.lang.String> filesToMerge) throws java.io.IOException
Parameters:
filesToMerge - the java.util.LinkedList containing the filenames of the temporary files.
Throws:
java.io.IOException - if an input/output problem is encountered.

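
The merge of intermediate lexicons can be pictured as repeatedly merging sorted runs and summing the frequencies of terms that occur in more than one run. The sketch below is only a minimal illustration under that assumption; the class LexiconMergeSketch and its methods are hypothetical, the runs are in-memory lists rather than the files named in filesToMerge, and Terrier's actual merge strategy and on-disk entry format may differ.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of merging sorted lexicon runs; not part of Terrier. */
final class LexiconMergeSketch {
    static Map.Entry<String, Integer> entry(String term, int tf) {
        return new SimpleImmutableEntry<String, Integer>(term, tf);
    }

    /** Merges two runs sorted by term, summing frequencies of shared terms. */
    static List<Map.Entry<String, Integer>> mergeTwo(
            List<Map.Entry<String, Integer>> a, List<Map.Entry<String, Integer>> b) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>(a.size() + b.size());
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).getKey().compareTo(b.get(j).getKey());
            if (cmp < 0) {
                out.add(a.get(i++));
            } else if (cmp > 0) {
                out.add(b.get(j++));
            } else {
                // Same term in both runs: emit it once with the combined frequency.
                out.add(entry(a.get(i).getKey(), a.get(i).getValue() + b.get(j).getValue()));
                i++;
                j++;
            }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    /** Merges runs pairwise until one remains; the input list is left unmodified. */
    static List<Map.Entry<String, Integer>> mergeAll(
            LinkedList<List<Map.Entry<String, Integer>>> runsToMerge) {
        if (runsToMerge.isEmpty()) {
            return new ArrayList<>();
        }
        LinkedList<List<Map.Entry<String, Integer>>> queue = new LinkedList<>(runsToMerge);
        while (queue.size() > 1) {
            queue.addLast(mergeTwo(queue.removeFirst(), queue.removeFirst()));
        }
        return queue.getFirst();
    }

    public static void main(String[] args) {
        LinkedList<List<Map.Entry<String, Integer>>> runs = new LinkedList<>();
        runs.add(Arrays.asList(entry("index", 1), entry("term", 2)));
        runs.add(Arrays.asList(entry("index", 4), entry("lexicon", 1)));
        System.out.println(mergeAll(runs)); // prints [index=5, lexicon=1, term=2]
    }
}
```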
public void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) throws java.io.IOException
Parameters:
lexicon - the input stream of the lexicon that the lexid file is being created for
lexiconEntries - the number of entries in this lexicon
lexiconEntrySize - the size of one entry in this lexicon
Throws:
java.io.IOException - if there is an input/output error.

public static void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) throws java.io.IOException
Parameters:
lexicon - the input stream of the lexicon that the lexid file is being created for
lexiconEntries - the number of entries in this lexicon
lexiconEntrySize - the size of one entry in this lexicon
path - the path to the index containing the lexicon
prefix - the prefix of the index containing the lexicon
Throws:
java.io.IOException - if there is an input/output error.

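
The mapping from term id to lexicon offset can be made concrete with a small sketch. It assumes fixed-size lexicon entries, term ids that are dense in the range 0 to lexiconEntries - 1, and one 8-byte offset written per term id; none of these details is confirmed by this page, and the class LexiconIndexSketch with its methods is hypothetical, not Terrier's code or file format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical illustration of a term-id-to-offset index; not part of Terrier. */
final class LexiconIndexSketch {
    /**
     * termIdsInLexiconOrder[p] is the term id stored in the p-th lexicon entry.
     * Writes one long per term id: the byte offset of that term's lexicon entry.
     */
    static void writeLexiconIndex(int[] termIdsInLexiconOrder, int lexiconEntrySize,
                                  DataOutputStream out) throws IOException {
        long[] offsets = new long[termIdsInLexiconOrder.length];
        for (int position = 0; position < termIdsInLexiconOrder.length; position++) {
            // Fixed-size entries: the p-th entry starts at p * lexiconEntrySize.
            offsets[termIdsInLexiconOrder[position]] = (long) position * lexiconEntrySize;
        }
        for (long offset : offsets) {
            out.writeLong(offset);
        }
        out.flush();
    }

    /** Convenience wrapper that builds the index in memory. */
    static byte[] buildLexiconIndex(int[] termIdsInLexiconOrder, int lexiconEntrySize) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeLexiconIndex(termIdsInLexiconOrder, lexiconEntrySize, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Looks up the lexicon offset for a term id with a single positioned read. */
    static long offsetOf(byte[] lexiconIndex, int termId) {
        return ByteBuffer.wrap(lexiconIndex).getLong(termId * Long.BYTES);
    }

    public static void main(String[] args) {
        // Three entries of 32 bytes each, holding term ids 2, 0 and 1 in that order.
        byte[] index = buildLexiconIndex(new int[] {2, 0, 1}, 32);
        System.out.println(offsetOf(index, 0)); // prints 32
    }
}
```

With such a layout, retrieving a term by identifier costs one read from the index followed by one seek into the lexicon, rather than a scan over the lexicon.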
public static void createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.io.DataOutputStream dosLexid) throws java.io.IOException
Throws:
java.io.IOException

public static void createLexiconIndex(Index index) throws java.io.IOException
Parameters:
index - the Index to make the lexicon index for
Throws:
java.io.IOException

public void createLexiconHash(LexiconInputStream lexStream)
Parameters:
lexStream - the LexiconInputStream to process

public static void createLexiconHash(Index index) throws java.io.IOException
Parameters:
index - the Index to make the lexicon hash for
Throws:
java.io.IOException

public static void createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix)
Parameters:
lexStream - the LexiconInputStream to process
path - the path to the index containing the lexicon
prefix - the prefix of the index containing the lexicon

public static void createLexiconHash(LexiconInputStream lexStream,
java.io.OutputStream out)
public static void main(java.lang.String[] args)