org.terrier.structures.indexing.singlepass
Class RunWriter

java.lang.Object
  extended by org.terrier.structures.indexing.singlepass.RunWriter
Direct Known Subclasses:
HadoopRunWriter

public class RunWriter
extends Object

This class writes a run to disk. The data written depends on the specific subclass; this one writes the Nt, the TF and the sequence. It also writes the maximum frequency of a term in the run, which is useful for allocating memory during the merging phase.

Author:
Roi Blanco

Field Summary
protected  BitOutputStream bos
          Underlying BitOutputStream to write the compressed objects
protected  String info
          Debug String representation of this RunWriter
protected  DataOutputStream stringDos
          Underlying DataOutputStream to write the term Strings
 
Constructor Summary
protected RunWriter()
           
protected RunWriter(BitOutputStream _bos, DataOutputStream _stringDos)
          Alternative constructor for use by subclasses.
  RunWriter(String fileName, String termsFile)
          Instantiates a RunWriter, given the names of the files to write.
 
Method Summary
 void beginWrite(int maxSize, int size)
          Writes the headers of the run.
 void finishWrite()
          Closes the output streams.
 String toString()
          
 boolean writeSorted()
          Returns true if this RunWriter requires writeTerm() to be called with terms in sorted order.
 void writeTerm(String term, Posting post)
          Writes the information for a given term.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

bos

protected final BitOutputStream bos
Underlying BitOutputStream to write the compressed objects


stringDos

protected final DataOutputStream stringDos
Underlying DataOutputStream to write the term Strings


info

protected String info
Debug String representation of this RunWriter

Constructor Detail

RunWriter

protected RunWriter()

RunWriter

protected RunWriter(BitOutputStream _bos,
                    DataOutputStream _stringDos)
             throws IOException
Alternative constructor for use by subclasses.

Throws:
IOException - if an I/O error occurs.

RunWriter

public RunWriter(String fileName,
                 String termsFile)
          throws IOException
Instantiates a RunWriter, given the names of the files to write.

Parameters:
fileName - name of the file to write the posting lists data.
termsFile - name of the file to write the terms.
Throws:
IOException - if an I/O error occurs.
Method Detail

writeSorted

public boolean writeSorted()
Returns true if this RunWriter requires writeTerm() to be called with terms in sorted order.
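
A caller would typically consult writeSorted() before handing terms to the writer. The sketch below illustrates that pattern under stated assumptions: TermSink is a hypothetical stand-in interface (it is not part of Terrier), its writeTerm takes only the term string, and lexicographic String order is assumed to be the required sort order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Illustration only: shows how a caller might honour writeSorted(). */
final class WriteSortedSketch {

    /** Hypothetical minimal view of a run writer; not a Terrier type. */
    interface TermSink {
        boolean writeSorted();
        void writeTerm(String term);
    }

    /** Passes every term to the sink, sorting first only if the sink asks for it. */
    static void flush(Collection<String> terms, TermSink sink) {
        List<String> ordered = new ArrayList<>(terms);
        if (sink.writeSorted()) {
            Collections.sort(ordered); // assumption: natural String order
        }
        for (String term : ordered) {
            sink.writeTerm(term);
        }
    }

    /** Runs flush() against a recording sink and returns the order it received. */
    static List<String> flushAndRecord(List<String> terms, final boolean sorted) {
        final List<String> seen = new ArrayList<>();
        flush(terms, new TermSink() {
            @Override
            public boolean writeSorted() {
                return sorted;
            }

            @Override
            public void writeTerm(String term) {
                seen.add(term);
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("pear", "apple", "fig");
        System.out.println(flushAndRecord(terms, true));
        System.out.println(flushAndRecord(terms, false));
    }
}
```

Sorting only on request keeps the cost out of the path for writers that accept terms in any order.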


beginWrite

public void beginWrite(int maxSize,
                       int size)
                throws IOException
Writes the headers of the run.

Parameters:
maxSize - max size of a posting.
size - number of postings in the run.
Throws:
IOException - if an I/O error occurs.

writeTerm

public void writeTerm(String term,
                      Posting post)
               throws IOException
Writes the information for a given term.

Parameters:
term - the term to write.
post - the Posting with the data of the term.
Throws:
IOException - if an I/O error occurs.

finishWrite

public void finishWrite()
                 throws IOException
Closes the output streams.

Throws:
IOException - if an I/O error occurs.
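
The call order implied by the methods above (beginWrite once, then writeTerm for each term, then finishWrite) can be sketched as follows. This is a minimal, self-contained illustration and not Terrier's implementation: SketchRunWriter, writeRun and readHeader are hypothetical names, the record layout is invented for the example, and a plain int frequency stands in for the Posting argument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Illustration only: mirrors the RunWriter call sequence with an invented format. */
final class RunLifecycleSketch {

    /** Hypothetical stand-in for a run writer; NOT Terrier's on-disk format. */
    static final class SketchRunWriter {
        private final DataOutputStream out;

        SketchRunWriter(DataOutputStream out) {
            this.out = out;
        }

        /** Header: the maximum size first, then the number of entries in the run. */
        void beginWrite(int maxSize, int size) throws IOException {
            out.writeInt(maxSize);
            out.writeInt(size);
        }

        /** One record per term: the term string followed by its frequency. */
        void writeTerm(String term, int frequency) throws IOException {
            out.writeUTF(term);
            out.writeInt(frequency);
        }

        /** Closes the underlying stream. */
        void finishWrite() throws IOException {
            out.close();
        }
    }

    /** Writes one run for the given term frequencies and returns its bytes. */
    static byte[] writeRun(Map<String, Integer> termFrequencies) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            SketchRunWriter writer = new SketchRunWriter(new DataOutputStream(bytes));
            int max = 0;
            for (int frequency : termFrequencies.values()) {
                max = Math.max(max, frequency);
            }
            writer.beginWrite(max, termFrequencies.size());
            // Terms are written in sorted order here; whether that is required
            // is what writeSorted() reports in the real class.
            for (Map.Entry<String, Integer> e : new TreeMap<>(termFrequencies).entrySet()) {
                writer.writeTerm(e.getKey(), e.getValue());
            }
            writer.finishWrite();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back only the header, returned as {maxSize, size}. */
    static int[] readHeader(byte[] run) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(run))) {
            int maxSize = in.readInt();
            int size = in.readInt();
            return new int[] {maxSize, size};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = new TreeMap<>();
        tf.put("retrieval", 4);
        tf.put("index", 9);
        int[] header = readHeader(writeRun(tf));
        System.out.println("maxSize=" + header[0] + ", size=" + header[1]);
    }
}
```

Because the header comes first, a reader in a later merge step could size its buffers from maxSize before touching any term record, which is the kind of use the class description has in mind.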

toString

public String toString()

Overrides:
toString in class Object


Terrier 3.6. Copyright © 2004-2011 University of Glasgow