Terrier IR Platform 2.2.1
java.lang.Object
  uk.ac.gla.terrier.structures.indexing.LexiconBuilder
public class LexiconBuilder
Builds temporary lexicons while a collection is being indexed, and merges them once the indexing of the collection has finished.
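The first half of that description (accumulating terms in memory and flushing them as temporary lexicons) can be pictured with the minimal sketch below. It is not Terrier's implementation: the class `TempLexiconSketch`, the `maxTermsInMemory` threshold, and the use of in-memory sorted maps in place of temporary files are all assumptions made for illustration. Only the method names `addTerm` and `flush` mirror the API documented on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for the in-memory stage; not Terrier's code. */
final class TempLexiconSketch {
    // Terms are kept sorted so that every flushed run is already in term order.
    private final TreeMap<String, Integer> current = new TreeMap<>();
    // Each flushed run stands in for one temporary lexicon file on disk.
    private final List<TreeMap<String, Integer>> flushedRuns = new ArrayList<>();
    // Assumed flush threshold; the real class may decide when to flush differently.
    private final int maxTermsInMemory;

    TempLexiconSketch(int maxTermsInMemory) {
        this.maxTermsInMemory = maxTermsInMemory;
    }

    /** Adds tf occurrences of term and flushes once the map reaches the threshold. */
    void addTerm(String term, int tf) {
        current.merge(term, tf, Integer::sum);
        if (current.size() >= maxTermsInMemory) {
            flush();
        }
    }

    /** Moves the in-memory terms into a new run; does nothing if there are none. */
    void flush() {
        if (current.isEmpty()) {
            return;
        }
        flushedRuns.add(new TreeMap<>(current));
        current.clear();
    }

    List<TreeMap<String, Integer>> runs() {
        return flushedRuns;
    }

    public static void main(String[] args) {
        TempLexiconSketch lex = new TempLexiconSketch(3);
        for (String t : "to be or not to be".split(" ")) {
            lex.addTerm(t, 1);
        }
        lex.flush(); // nothing left in memory here, so this is a no-op
        System.out.println(lex.runs()); // prints [{be=1, or=1, to=1}, {be=1, not=1, to=1}]
    }
}
```

Because each run is sorted by term, the later merge step only needs a single sequential pass over every run.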
| Constructor Summary | |
|---|---|
| LexiconBuilder() | Deprecated. |
| LexiconBuilder(Index i) | |
| LexiconBuilder(java.lang.String pathname, java.lang.String prefix) | Creates an instance of the class, given the path to save the temporary lexicons. |
| Method Summary | | |
|---|---|---|
| void | addDocumentTerms(DocumentPostingList terms) | Adds the terms of a document to the temporary lexicon in memory. |
| void | addTemporaryLexicon(java.lang.String filename) | If the application code generated lexicons itself, use this method to add them to the merge list. Otherwise, do not call this method. |
| void | addTerm(java.lang.String term, int tf) | Adds a single term to the lexicon being built. |
| static void | createLexiconHash(Index index) | Creates a lexicon hash for the specified index. |
| void | createLexiconHash(LexiconInputStream lexStream) | Creates a lexicon hash for the current index. |
| static void | createLexiconHash(LexiconInputStream lexStream, java.io.OutputStream out) | |
| static void | createLexiconHash(LexiconInputStream lexStream, java.lang.String path, java.lang.String prefix) | Creates a lexicon hash. |
| static void | createLexiconIndex(Index index) | Creates a lexicon index for the specified index. |
| void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize) | Creates the lexicon index file, which maps each term id to the offset of its entry in the lexicon, so that term information can be retrieved by term identifier. |
| static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.io.DataOutputStream dosLexid) | |
| static void | createLexiconIndex(LexiconInputStream lexicon, int lexiconEntries, int lexiconEntrySize, java.lang.String path, java.lang.String prefix) | Creates the lexicon index file, which maps each term id to the offset of its entry in the lexicon, so that term information can be retrieved by term identifier. |
| void | finishedDirectIndexBuild() | Processes the lexicon after the direct and document indexes have been created. |
| void | finishedInvertedIndexBuild() | Processes the lexicon after the inverted index has been created. |
| void | flush() | Forces a temporary lexicon to be flushed. |
| int | getFinalNumberOfTerms() | Returns the number of terms in the final lexicon. |
| static void | main(java.lang.String[] args) | |
| void | merge(java.util.LinkedList<java.lang.String> filesToMerge) | Merges the intermediate lexicon files created during indexing. |
| Methods inherited from class java.lang.Object | 
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public LexiconBuilder()
public LexiconBuilder(Index i)
public LexiconBuilder(java.lang.String pathname,
                      java.lang.String prefix)
Parameters:
pathname - String the path to save the temporary lexicons.
| Method Detail |
|---|
public int getFinalNumberOfTerms()
public void addTemporaryLexicon(java.lang.String filename)
Parameters:
filename - Full path to a lexicon to merge
public void addTerm(java.lang.String term,
                    int tf)
Parameters:
term - The String term
tf - the frequency of the term
public void addDocumentTerms(DocumentPostingList terms)
Parameters:
terms - DocumentPostingList the terms of the document to add to the temporary lexicon
public void flush()
public void finishedInvertedIndexBuild()
public void finishedDirectIndexBuild()
public void merge(java.util.LinkedList<java.lang.String> filesToMerge)
           throws java.io.IOException
Parameters:
filesToMerge - java.util.LinkedList the list containing the filenames of the temporary files.
Throws:
java.io.IOException - an input/output exception is thrown if a problem is encountered.
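To make the merge step concrete, here is a minimal sketch of one common way to combine several term-sorted temporary lexicons: a k-way merge driven by a priority queue, summing the frequencies of terms that appear in more than one run. This is an illustration, not Terrier's code. `LexiconMergeSketch` and `Cursor` are hypothetical names, and sorted in-memory maps stand in for the temporary files that `filesToMerge` names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Hypothetical k-way merge of term-sorted runs; not Terrier's implementation. */
final class LexiconMergeSketch {

    /** Tracks the entry currently at the head of one run. */
    private static final class Cursor {
        final Iterator<Map.Entry<String, Integer>> rest;
        Map.Entry<String, Integer> head;

        Cursor(Iterator<Map.Entry<String, Integer>> it) {
            rest = it;
            head = it.next();
        }

        /** Moves to the next entry; returns false when the run is exhausted. */
        boolean advance() {
            if (!rest.hasNext()) {
                return false;
            }
            head = rest.next();
            return true;
        }
    }

    static List<Map.Entry<String, Integer>> merge(List<TreeMap<String, Integer>> runs) {
        // The queue always yields the cursor whose head term is smallest.
        PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> a.head.getKey().compareTo(b.head.getKey()));
        for (TreeMap<String, Integer> run : runs) {
            if (!run.isEmpty()) {
                queue.add(new Cursor(run.entrySet().iterator()));
            }
        }
        List<Map.Entry<String, Integer>> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            String term = c.head.getKey();
            int tf = c.head.getValue();
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).getKey().equals(term)) {
                // Same term seen in an earlier run: add the frequencies together.
                merged.set(last, Map.entry(term, merged.get(last).getValue() + tf));
            } else {
                merged.add(Map.entry(term, tf));
            }
            if (c.advance()) {
                queue.add(c); // re-insert with its new head term
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> first = new TreeMap<>(Map.of("be", 1, "or", 1, "to", 1));
        TreeMap<String, Integer> second = new TreeMap<>(Map.of("be", 1, "not", 1, "to", 1));
        System.out.println(merge(List.of(first, second))); // prints [be=2, not=1, or=1, to=2]
    }
}
```

Each cursor is removed from the queue before its head changes and re-added afterwards, so the queue's ordering is never invalidated by the mutation.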
public void createLexiconIndex(LexiconInputStream lexicon,
                               int lexiconEntries,
                               int lexiconEntrySize)
                        throws java.io.IOException
Parameters:
lexicon - The input stream of the lexicon that we are creating the lexid file for
lexiconEntries - The number of entries in this lexicon
lexiconEntrySize - The size of one entry in this lexicon
Throws:
java.io.IOException - if there is an input/output error.
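One way to picture the term-id-to-offset mapping that the lexid file holds is the sketch below. It rests on assumptions that this page does not state: that lexicon entries have a fixed size of `lexiconEntrySize` bytes, that the offset of an entry is its position multiplied by that size, and that term ids are dense integers starting at 0. `LexiconIndexSketch` and `buildIndex` are hypothetical names, and an array stands in for the output file.

```java
import java.util.Arrays;

/** Hypothetical illustration of a term id to lexicon offset table; not Terrier's code. */
final class LexiconIndexSketch {

    /**
     * termIdsInLexiconOrder[i] is the term id stored in the i-th lexicon entry.
     * The result maps each term id to the assumed byte offset of its entry.
     */
    static long[] buildIndex(int[] termIdsInLexiconOrder, int lexiconEntrySize) {
        long[] offsets = new long[termIdsInLexiconOrder.length];
        for (int entry = 0; entry < termIdsInLexiconOrder.length; entry++) {
            int termId = termIdsInLexiconOrder[entry];
            offsets[termId] = (long) entry * lexiconEntrySize;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // The lexicon is sorted by term, so term ids appear in no particular order.
        int[] termIds = {2, 0, 3, 1};
        int entrySize = 36; // illustrative value only
        System.out.println(Arrays.toString(buildIndex(termIds, entrySize)));
        // prints [36, 108, 0, 72]
    }
}
```

With such a table, looking up a term by id becomes a single array read followed by one seek into the lexicon, instead of a scan over term-sorted entries.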
public static void createLexiconIndex(LexiconInputStream lexicon,
                                      int lexiconEntries,
                                      int lexiconEntrySize,
                                      java.lang.String path,
                                      java.lang.String prefix)
                               throws java.io.IOException
Parameters:
lexicon - The input stream of the lexicon that we are creating the lexid file for
lexiconEntries - The number of entries in this lexicon
lexiconEntrySize - The size of one entry in this lexicon
path - The path to the index containing the lexicon
prefix - The prefix of the index containing the lexicon
Throws:
java.io.IOException - if there is an input/output error.
public static void createLexiconIndex(LexiconInputStream lexicon,
                                      int lexiconEntries,
                                      int lexiconEntrySize,
                                      java.io.DataOutputStream dosLexid)
                               throws java.io.IOException
Throws:
java.io.IOException
public static void createLexiconIndex(Index index)
                               throws java.io.IOException
Parameters:
index - Index to make the lexicon index for
Throws:
java.io.IOException
public void createLexiconHash(LexiconInputStream lexStream)
Parameters:
lexStream - LexiconInputStream to process
public static void createLexiconHash(Index index)
                              throws java.io.IOException
Parameters:
index - Index to make the LexiconHash for
Throws:
java.io.IOException
public static void createLexiconHash(LexiconInputStream lexStream,
                                     java.lang.String path,
                                     java.lang.String prefix)
Parameters:
lexStream - LexiconInputStream to process
path - Path to the index containing the lexicon
prefix - Prefix of the index containing the lexicon
public static void createLexiconHash(LexiconInputStream lexStream,
                                     java.io.OutputStream out)
public static void main(java.lang.String[] args)