Terrier IR Platform 1.1.1
java.lang.Object
  uk.ac.gla.terrier.structures.LexiconOutputStream
public class LexiconOutputStream
This class implements an output stream for the lexicon structure.
| Constructor Summary | |
|---|---|
| LexiconOutputStream() | A default constructor. |
| LexiconOutputStream(java.io.File file) | A constructor given the lexicon file. |
| LexiconOutputStream(java.lang.String filename) | A constructor given the filename. |
| LexiconOutputStream(java.lang.String path, java.lang.String prefix) | A constructor for a LexiconOutputStream given the index path and prefix. |
| Method Summary | |
|---|---|
| void | close() - Closes the lexicon stream. |
| long | getNumberOfPointersWritten() - Returns the number of pointers there would be in an inverted index built using this lexicon. |
| int | getNumberOfTermsWritten() |
| long | getNumberOfTokensWritten() - Returns the number of tokens there are in the entire collection represented by this lexicon. |
| void | setEndBitOffset(byte _endBitOffset) - Deprecated. |
| void | setEndOffset(long _endOffset) - Deprecated. |
| void | setNt(int _Nt) - Deprecated. |
| void | setTerm(java.lang.String _term) - Deprecated. |
| void | setTermId(int _termId) - Deprecated. |
| void | setTF(int _termFrequency) - Deprecated. |
| int | writeNextEntry(byte[] _term, int _termId, int _documentFrequency, int _termFrequency, long _endOffset, byte _endBitOffset) - Writes a lexicon entry. |
| int | writeNextEntry(java.lang.String _term, int _termId, int _documentFrequency, int _termFrequency, long _endOffset, byte _endBitOffset) - Writes a lexicon entry. |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|

public LexiconOutputStream()
A default constructor.

public LexiconOutputStream(java.lang.String filename)
A constructor given the filename.
Parameters:
filename - java.lang.String the name of the lexicon file.

public LexiconOutputStream(java.io.File file)
A constructor given the lexicon file.
Parameters:
file - java.io.File the lexicon file.

public LexiconOutputStream(java.lang.String path,
                           java.lang.String prefix)
A constructor for a LexiconOutputStream given the index path and prefix.
Parameters:
path - String the path to the index
prefix - String the prefix of the filenames in the index

| Method Detail |
|---|

public void close()
Closes the lexicon stream.
Specified by:
close in interface Closeable
Throws:
java.io.IOException - if an I/O error occurs while closing the stream.

public int writeNextEntry(java.lang.String _term,
                          int _termId,
                          int _documentFrequency,
                          int _termFrequency,
                          long _endOffset,
                          byte _endBitOffset)
                   throws java.io.IOException
Writes a lexicon entry.
Parameters:
_term - the string representation of the term
_termId - the term's integer identifier
_documentFrequency - the term's document frequency in the collection
_termFrequency - the term's frequency in the collection
_endOffset - the term's ending byte offset in the inverted file
_endBitOffset - the bit offset within the term's ending byte in the inverted file
Throws:
java.io.IOException - if an I/O error occurs
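
The _endOffset and _endBitOffset parameters together identify a single bit position in the inverted file. The following is a minimal sketch of that addressing, assuming bits are numbered 0 to 7 within a byte and that a term's entry begins immediately after the previous term's end position. Neither assumption is stated on this page. The class BitOffsetSketch and its methods are hypothetical and are not part of Terrier.

```java
/** Hypothetical helper for illustration only; not part of Terrier. */
final class BitOffsetSketch {

    /** Converts a (byte offset, bit offset) pair into one absolute bit position. */
    static long absoluteBit(long byteOffset, byte bitOffset) {
        // Assumption: the bit offset indexes a bit inside one byte, so 0..7.
        if (bitOffset < 0 || bitOffset > 7) {
            throw new IllegalArgumentException("bit offset must be in 0..7");
        }
        return byteOffset * 8L + bitOffset;
    }

    /**
     * Length in bits of an entry, assuming it starts right after the
     * previous entry's end position and runs up to its own end position.
     */
    static long entryLengthInBits(long prevEndByte, byte prevEndBit,
                                  long endByte, byte endBit) {
        return absoluteBit(endByte, endBit) - absoluteBit(prevEndByte, prevEndBit);
    }

    public static void main(String[] args) {
        // Previous term ends at byte 10, bit 3; this term ends at byte 12, bit 1.
        long bits = entryLengthInBits(10L, (byte) 3, 12L, (byte) 1);
        System.out.println(bits); // (12*8 + 1) - (10*8 + 3) = 14
    }
}
```

Under these assumptions, storing only the end position of each term is enough to recover where every entry starts, provided the entries are read back in the order they were written.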

public int writeNextEntry(byte[] _term,
                          int _termId,
                          int _documentFrequency,
                          int _termFrequency,
                          long _endOffset,
                          byte _endBitOffset)
                   throws java.io.IOException
Writes a lexicon entry.
Parameters:
_term - the byte[] representation of the term. Using this format means that the term does not have to be decoded and re-encoded every time.
_termId - the term's integer identifier
_documentFrequency - the term's document frequency in the collection
_termFrequency - the term's frequency in the collection
_endOffset - the term's ending byte offset in the inverted file
_endBitOffset - the bit offset within the term's ending byte in the inverted file
Throws:
java.io.IOException - if an I/O error occurs

public long getNumberOfPointersWritten()
Returns the number of pointers there would be in an inverted index built using this lexicon.

public long getNumberOfTokensWritten()
Returns the number of tokens there are in the entire collection represented by this lexicon.

public int getNumberOfTermsWritten()
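
To illustrate how the two writeNextEntry overloads and the three counters might relate, here is a minimal in-memory sketch. It is a hypothetical stand-in, not the Terrier implementation. It assumes that the pointer count is the sum of the document frequencies written, that the token count is the sum of the term frequencies, that the term count is the number of entries, and that the int return value is the number of bytes written. The record layout (a length-prefixed term followed by the numeric fields) and the UTF-8 encoding are also assumptions made only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical in-memory stand-in; NOT the Terrier implementation. */
final class LexiconCountersSketch {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);
    private long pointersWritten; // assumed: sum of document frequencies
    private long tokensWritten;   // assumed: sum of term frequencies
    private int termsWritten;     // assumed: number of entries written

    /** String overload: encodes the term once, then delegates to the byte[] overload. */
    int writeNextEntry(String term, int termId, int documentFrequency,
                       int termFrequency, long endOffset, byte endBitOffset)
            throws IOException {
        return writeNextEntry(term.getBytes(StandardCharsets.UTF_8), termId,
                documentFrequency, termFrequency, endOffset, endBitOffset);
    }

    /** byte[] overload: writes already-encoded bytes; returns bytes written (assumed). */
    int writeNextEntry(byte[] term, int termId, int documentFrequency,
                       int termFrequency, long endOffset, byte endBitOffset)
            throws IOException {
        int before = out.size();
        out.writeInt(term.length); // assumed layout: length-prefixed term
        out.write(term);
        out.writeInt(termId);
        out.writeInt(documentFrequency);
        out.writeInt(termFrequency);
        out.writeLong(endOffset);
        out.writeByte(endBitOffset);
        pointersWritten += documentFrequency;
        tokensWritten += termFrequency;
        termsWritten++;
        return out.size() - before;
    }

    long getNumberOfPointersWritten() { return pointersWritten; }
    long getNumberOfTokensWritten() { return tokensWritten; }
    int getNumberOfTermsWritten() { return termsWritten; }

    public static void main(String[] args) throws IOException {
        LexiconCountersSketch lex = new LexiconCountersSketch();
        lex.writeNextEntry("apple", 0, 3, 7, 10L, (byte) 3);
        // A caller that already holds the encoded term skips the String round trip.
        byte[] banana = "banana".getBytes(StandardCharsets.UTF_8);
        lex.writeNextEntry(banana, 1, 2, 2, 12L, (byte) 1);
        System.out.println(lex.getNumberOfPointersWritten()); // 3 + 2 = 5
        System.out.println(lex.getNumberOfTokensWritten());   // 7 + 2 = 9
        System.out.println(lex.getNumberOfTermsWritten());    // 2
    }
}
```

In this sketch the String overload is a thin wrapper, so code that keeps terms as byte arrays avoids one encoding step per entry, which is the benefit the byte[] parameter description points to.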

public void setEndBitOffset(byte _endBitOffset)
Deprecated.
Parameters:
_endBitOffset - byte the bit offset in the last byte of the term's entry in the inverted file.

public void setEndOffset(long _endOffset)
Deprecated.
Parameters:
_endOffset - long the ending byte of the term's entry in the inverted file.

public void setNt(int _Nt)
Deprecated.
Parameters:
_Nt - int the document frequency for the given term.

public void setTerm(java.lang.String _term)
Deprecated.
Parameters:
_term - java.lang.String the string representation of the sought term.

public void setTermId(int _termId)
Deprecated.
Parameters:
_termId - int the term's identifier.

public void setTF(int _termFrequency)
Deprecated.
Parameters:
_termFrequency - int the term frequency in the collection.