Terrier IR Platform 2.2.1
java.lang.Object
  uk.ac.gla.terrier.structures.LexiconOutputStream
public class LexiconOutputStream
This class implements an output stream for the lexicon structure.
| Constructor Summary | |
|---|---|
| LexiconOutputStream() | A default constructor. |
| LexiconOutputStream(java.io.DataOutput out) | Creates a lexicon output stream that writes to the specified data stream. |
| LexiconOutputStream(java.io.File file) | A constructor given the lexicon file. |
| LexiconOutputStream(java.lang.String filename) | A constructor given the filename. |
| LexiconOutputStream(java.lang.String path, java.lang.String prefix) | A constructor for a LexiconOutputStream given the index path and prefix. |
| Method Summary | | |
|---|---|---|
| void | close() | Closes the lexicon stream. |
| long | getNumberOfPointersWritten() | Returns the number of pointers there would be in an inverted index built using this lexicon (thus far). |
| int | getNumberOfTermsWritten() | Returns the number of terms written so far by this LexiconOutputStream. |
| long | getNumberOfTokensWritten() | Returns the number of tokens there are in the entire collection represented by this lexicon (thus far). |
| void | setEndBitOffset(byte _endBitOffset) | Deprecated. |
| void | setEndOffset(long _endOffset) | Deprecated. |
| void | setNt(int _Nt) | Deprecated. |
| void | setTerm(java.lang.String _term) | Deprecated. |
| void | setTermId(int _termId) | Deprecated. |
| void | setTF(int _termFrequency) | Deprecated. |
| int | writeNextEntry(byte[] _term, int _termId, int _documentFrequency, int _termFrequency, long _endOffset, byte _endBitOffset) | Writes a lexicon entry. |
| int | writeNextEntry(java.lang.String _term, int _termId, int _documentFrequency, int _termFrequency, long _endOffset, byte _endBitOffset) | Writes a lexicon entry. |
| Methods inherited from class java.lang.Object | 
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail |
|---|

public LexiconOutputStream()

public LexiconOutputStream(java.io.DataOutput out)

public LexiconOutputStream(java.lang.String filename)
Parameters:
filename - java.lang.String the name of the lexicon file.

public LexiconOutputStream(java.io.File file)
Parameters:
file - java.io.File the name of the lexicon file.

public LexiconOutputStream(java.lang.String path,
                           java.lang.String prefix)
Parameters:
path - String the path to the index
prefix - String the prefix of the filenames in the index

| Method Detail |
|---|
public void close()
Specified by:
close in interface Closeable
Throws:
java.io.IOException - if an I/O error occurs while closing the stream.

public int writeNextEntry(java.lang.String _term,
                          int _termId,
                          int _documentFrequency,
                          int _termFrequency,
                          long _endOffset,
                          byte _endBitOffset)
                   throws java.io.IOException
Parameters:
_term - the string representation of the term
_termId - the term's integer identifier
_documentFrequency - the term's document frequency in the collection
_termFrequency - the term's frequency in the collection
_endOffset - the term's ending byte offset in the inverted file
_endBitOffset - the term's ending bit offset within that byte in the inverted file
Throws:
java.io.IOException - if an I/O error occurs
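
The six parameters above map naturally onto one fixed-width record written through java.io.DataOutput. The following is a minimal sketch of that idea, not Terrier's implementation: the class LexiconEntrySketch, the 20-byte zero-padded term field, the field order, the UTF-8 encoding, and the choice to return the number of bytes written are all assumptions made for illustration, since this page does not state the on-disk layout or the meaning of the int return value.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; this is NOT Terrier's LexiconOutputStream.
class LexiconEntrySketch {
    // Assumed fixed width of the term field in bytes; shorter terms are zero-padded.
    static final int TERM_BYTES = 20;
    // Assumed record size: term + termId + document frequency + term frequency
    // + end byte offset (long) + end bit offset (byte).
    static final int ENTRY_BYTES = TERM_BYTES + 4 + 4 + 4 + 8 + 1;

    // byte[] variant: the caller supplies already-encoded term bytes.
    static int writeEntry(DataOutput out, byte[] term, int termId,
                          int documentFrequency, int termFrequency,
                          long endOffset, byte endBitOffset) throws IOException {
        if (term.length > TERM_BYTES) {
            throw new IllegalArgumentException("term longer than " + TERM_BYTES + " bytes");
        }
        out.write(term);
        out.write(new byte[TERM_BYTES - term.length]); // zero padding
        out.writeInt(termId);
        out.writeInt(documentFrequency);
        out.writeInt(termFrequency);
        out.writeLong(endOffset);
        out.writeByte(endBitOffset);
        return ENTRY_BYTES; // assumption: report the bytes written
    }

    // String variant: encodes the term, then delegates to the byte[] variant.
    static int writeEntry(DataOutput out, String term, int termId,
                          int documentFrequency, int termFrequency,
                          long endOffset, byte endBitOffset) throws IOException {
        return writeEntry(out, term.getBytes(StandardCharsets.UTF_8), termId,
                documentFrequency, termFrequency, endOffset, endBitOffset);
    }

    // Convenience helper: serialises one entry into an in-memory buffer.
    static byte[] encodeEntry(String term, int termId, int documentFrequency,
                              int termFrequency, long endOffset, byte endBitOffset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeEntry(new DataOutputStream(buffer), term, termId,
                    documentFrequency, termFrequency, endOffset, endBitOffset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] entry = encodeEntry("terrier", 0, 3, 7, 42L, (byte) 5);
        System.out.println(entry.length + " bytes per entry");
    }
}
```

In this sketch the String variant pays for an encoding step on every call, while the byte[] variant lets a caller that already holds encoded term bytes pass them through unchanged.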

public int writeNextEntry(byte[] _term,
                          int _termId,
                          int _documentFrequency,
                          int _termFrequency,
                          long _endOffset,
                          byte _endBitOffset)
                   throws java.io.IOException
Parameters:
_term - the byte[] representation of the term. Using this format means that the term does not have to be decoded and re-encoded every time.
_termId - the term's integer identifier
_documentFrequency - the term's document frequency in the collection
_termFrequency - the term's frequency in the collection
_endOffset - the term's ending byte offset in the inverted file
_endBitOffset - the term's ending bit offset within that byte in the inverted file
Throws:
java.io.IOException - if an I/O error occurs

public long getNumberOfPointersWritten()

public long getNumberOfTokensWritten()

public int getNumberOfTermsWritten()
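
The three counters can be read as running totals over the entries written so far: one term per entry, one pointer per document a term occurs in, and one token per occurrence of the term. The sketch below illustrates that reading under stated assumptions: the class LexiconCountersSketch and its recordEntry method are hypothetical, and the way Terrier itself maintains these totals is not shown on this page.

```java
// Hypothetical illustration of the counter semantics; NOT Terrier's code.
class LexiconCountersSketch {
    private int termsWritten;     // one per lexicon entry
    private long pointersWritten; // sum of document frequencies
    private long tokensWritten;   // sum of collection term frequencies

    // Called once per entry, with the same frequencies passed to writeNextEntry.
    void recordEntry(int documentFrequency, int termFrequency) {
        termsWritten++;
        pointersWritten += documentFrequency;
        tokensWritten += termFrequency;
    }

    int getNumberOfTermsWritten() { return termsWritten; }
    long getNumberOfPointersWritten() { return pointersWritten; }
    long getNumberOfTokensWritten() { return tokensWritten; }

    public static void main(String[] args) {
        LexiconCountersSketch counters = new LexiconCountersSketch();
        counters.recordEntry(3, 7); // a term in 3 documents, 7 occurrences overall
        counters.recordEntry(1, 1); // a term in 1 document, 1 occurrence
        System.out.println(counters.getNumberOfTermsWritten() + " terms, "
                + counters.getNumberOfPointersWritten() + " pointers, "
                + counters.getNumberOfTokensWritten() + " tokens");
    }
}
```

The pointer and token totals are held as long because they can exceed the int range for large collections, which matches the return types listed for the two methods above.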

public void setEndBitOffset(byte _endBitOffset)
Deprecated.
Parameters:
_endBitOffset - byte the bit offset in the last byte of the term's entry in the inverted file.

public void setEndOffset(long _endOffset)
Deprecated.
Parameters:
_endOffset - long the ending byte of the term's entry in the inverted file.

public void setNt(int _Nt)
Deprecated.
Parameters:
_Nt - int the document frequency for the given term.

public void setTerm(java.lang.String _term)
Deprecated.
Parameters:
_term - java.lang.String the string representation of the term.

public void setTermId(int _termId)
Deprecated.
Parameters:
_termId - int the term's identifier.

public void setTF(int _termFrequency)
Deprecated.
Parameters:
_termFrequency - int the term frequency in the collection.