Terrier IR Platform
2.2.1

uk.ac.gla.terrier.structures
Class UTFBlockLexiconOutputStream

java.lang.Object
  extended by uk.ac.gla.terrier.structures.LexiconOutputStream
      extended by uk.ac.gla.terrier.structures.BlockLexiconOutputStream
          extended by uk.ac.gla.terrier.structures.UTFBlockLexiconOutputStream
All Implemented Interfaces:
Closeable

public class UTFBlockLexiconOutputStream
extends BlockLexiconOutputStream

An output stream for writing the lexicon to a file sequentially.

Version:
$Revision: 1.10 $
Author:
Douglas Johnson, Vassilis Plachouras

Constructor Summary
UTFBlockLexiconOutputStream()
          A default constructor.
UTFBlockLexiconOutputStream(java.io.DataOutput out)
          Creates a lexicon output stream that writes to the specified data stream.
UTFBlockLexiconOutputStream(java.io.File file)
          A constructor given the file.
UTFBlockLexiconOutputStream(java.lang.String filename)
          A constructor given the filename.
 
Method Summary
 void setBF(int blockFrequency)
          Sets the block frequency for the given term.
 int writeNextEntry(byte[] term, int termId, int documentFrequency, int termFrequency, int blockFrequency, long endOffset, byte endBitOffset)
          Write a lexicon entry.
 int writeNextEntry(java.lang.String term, int termId, int documentFrequency, int termFrequency, int blockFrequency, long endOffset, byte endBitOffset)
          Write a lexicon entry.
 
Methods inherited from class uk.ac.gla.terrier.structures.LexiconOutputStream
close, getNumberOfPointersWritten, getNumberOfTermsWritten, getNumberOfTokensWritten, setEndBitOffset, setEndOffset, setNt, setTerm, setTermId, setTF, writeNextEntry, writeNextEntry
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

UTFBlockLexiconOutputStream

public UTFBlockLexiconOutputStream()
A default constructor.


UTFBlockLexiconOutputStream

public UTFBlockLexiconOutputStream(java.lang.String filename)
A constructor given the filename.

Parameters:
filename - the name of the lexicon file.

UTFBlockLexiconOutputStream

public UTFBlockLexiconOutputStream(java.io.File file)
A constructor given the file.

Parameters:
file - the lexicon file.

UTFBlockLexiconOutputStream

public UTFBlockLexiconOutputStream(java.io.DataOutput out)
Creates a lexicon output stream that writes to the specified data stream.

Parameters:
out - the data stream to which the lexicon is written.

Method Detail

writeNextEntry

public int writeNextEntry(java.lang.String term,
                          int termId,
                          int documentFrequency,
                          int termFrequency,
                          int blockFrequency,
                          long endOffset,
                          byte endBitOffset)
                   throws java.io.IOException
Write a lexicon entry.

Overrides:
writeNextEntry in class BlockLexiconOutputStream
Parameters:
term - the string representation of the term
termId - the term's integer identifier
documentFrequency - the term's document frequency in the collection
termFrequency - the term's frequency in the collection
blockFrequency - the term's block frequency in the collection
endOffset - the term's ending byte offset in the inverted file
endBitOffset - the bit offset within the term's ending byte in the inverted file
Returns:
the number of bytes written if there is no error, or -1 in case of EOF
Throws:
java.io.IOException - if an I/O error occurs
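
The shape of this call can be illustrated without the Terrier classes on the classpath. The following is a minimal sketch, not Terrier's implementation: the class LexiconEntrySketch, its two methods, and the field order and encoding (the term written with DataOutputStream.writeUTF, followed by fixed-width fields) are assumptions made for illustration only. The actual on-disk layout used by UTFBlockLexiconOutputStream is not described on this page. The sketch only shows how an entry with the same seven parameters can be written to a data stream while reporting the number of bytes written.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for illustration only; not part of Terrier.
class LexiconEntrySketch {

    // Writes one entry and returns the number of bytes written.
    // The field order and widths here are assumed, not Terrier's format.
    static int writeEntry(DataOutputStream out, String term, int termId,
                          int documentFrequency, int termFrequency,
                          int blockFrequency, long endOffset,
                          byte endBitOffset) throws IOException {
        int before = out.size();          // bytes written to this stream so far
        out.writeUTF(term);               // 2-byte length + modified UTF-8 bytes
        out.writeInt(termId);
        out.writeInt(documentFrequency);
        out.writeInt(blockFrequency);
        out.writeInt(termFrequency);
        out.writeLong(endOffset);
        out.writeByte(endBitOffset);
        return out.size() - before;
    }

    // Convenience wrapper: encodes a single entry into an in-memory buffer.
    static byte[] encode(String term, int termId, int documentFrequency,
                         int termFrequency, int blockFrequency,
                         long endOffset, byte endBitOffset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeEntry(out, term, termId, documentFrequency, termFrequency,
                       blockFrequency, endOffset, endBitOffset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] entry = encode("terrier", 7, 3, 12, 12, 1024L, (byte) 5);
        System.out.println(entry.length + " bytes written");
    }
}
```

Under this assumed layout, a seven-character ASCII term produces 2 + 7 bytes for the term, 16 bytes for the four int fields, 8 for the offset and 1 for the bit offset. A term containing non-ASCII characters takes more bytes than it has characters, which is why the byte count, rather than the term length, is the useful return value.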

writeNextEntry

public int writeNextEntry(byte[] term,
                          int termId,
                          int documentFrequency,
                          int termFrequency,
                          int blockFrequency,
                          long endOffset,
                          byte endBitOffset)
                   throws java.io.IOException
Write a lexicon entry.

Overrides:
writeNextEntry in class BlockLexiconOutputStream
Parameters:
term - the byte array representation of the term
termId - the term's integer identifier
documentFrequency - the term's document frequency in the collection
termFrequency - the term's frequency in the collection
blockFrequency - the term's block frequency in the collection
endOffset - the term's ending byte offset in the inverted file
endBitOffset - the bit offset within the term's ending byte in the inverted file
Returns:
the number of bytes written if there is no error, or -1 in case of EOF
Throws:
java.io.IOException - if an I/O error occurs
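
A caller that holds the term as a String needs a byte array to use this overload. The following is a minimal sketch, assuming the array is expected to hold the UTF-8 encoding of the term; the class name suggests this, but the page does not state it. The helper class TermBytesSketch and its methods are hypothetical and not part of Terrier.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Terrier.
class TermBytesSketch {

    // Encodes a term as UTF-8 bytes (assumed encoding for the byte[] overload).
    static byte[] toBytes(String term) {
        return term.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back into the term string.
    static String fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("caf\u00e9");
        // Four characters, but five bytes: the accented letter takes two.
        System.out.println(bytes.length + " bytes for " + fromBytes(bytes));
    }
}
```

Naming the charset explicitly avoids depending on the platform default encoding, which can differ between the machine that builds the lexicon and the one that reads it.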

setBF

public void setBF(int blockFrequency)
Sets the block frequency for the given term.

Overrides:
setBF in class BlockLexiconOutputStream
Parameters:
blockFrequency - The new block frequency

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow