org.terrier.structures.indexing.singlepass
Class RunWriter

java.lang.Object
  extended by org.terrier.structures.indexing.singlepass.RunWriter
Direct Known Subclasses:
HadoopRunWriter

public class RunWriter
extends java.lang.Object

This class writes a run to disk. The data written depends on the specific subclass; this implementation writes the Nt, the TF and the sequence. It also writes the maximum frequency of a term in the run, which is useful for allocating memory during the merging phase.

Author:
Roi Blanco

Field Summary
protected  BitOutputStream bos
          Underlying BitOutputStream to write the compressed objects
protected  java.lang.String info
          Debug String representation of this RunWriter
protected  java.io.DataOutputStream stringDos
          Underlying DataOutputStream to write the term Strings
 
Constructor Summary
protected RunWriter()
           
protected RunWriter(BitOutputStream _bos, java.io.DataOutputStream _stringDos)
          Constructor for use by subclasses that supply their own output streams.
  RunWriter(java.lang.String fileName, java.lang.String termsFile)
          Instantiates a RunWriter, given the names of the files to write.
 
Method Summary
 void beginWrite(int maxSize, int size)
          Writes the headers of the run.
 void finishWrite()
          Closes the output streams.
 java.lang.String toString()
          
 boolean writeSorted()
          Returns true if this RunWriter requires writeTerm() to be called in sorted term order
 void writeTerm(java.lang.String term, Posting post)
          Writes the information for a given term.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

bos

protected final BitOutputStream bos
Underlying BitOutputStream to write the compressed objects


stringDos

protected final java.io.DataOutputStream stringDos
Underlying DataOutputStream to write the term Strings


info

protected java.lang.String info
Debug String representation of this RunWriter

Constructor Detail

RunWriter

protected RunWriter()

RunWriter

protected RunWriter(BitOutputStream _bos,
                    java.io.DataOutputStream _stringDos)
             throws java.io.IOException
Constructor for use by subclasses that supply their own output streams.

Throws:
java.io.IOException

RunWriter

public RunWriter(java.lang.String fileName,
                 java.lang.String termsFile)
          throws java.io.IOException
Instantiates a RunWriter, given the names of the files to write.

Parameters:
fileName - name of the file to write the posting lists data.
termsFile - name of the file to write the terms.
Throws:
java.io.IOException - if an I/O error occurs.
Method Detail

writeSorted

public boolean writeSorted()
Returns true if this RunWriter requires writeTerm() to be called in sorted term order
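
One way a caller might honour this flag is sketched below. This is an illustration only, not Terrier code: TermSink is a hypothetical stand-in for the two RunWriter methods involved, and an Integer stands in for the Posting argument, whose API is not described on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WriteSortedSketch {

    /** Hypothetical stand-in for the two RunWriter methods used here. */
    interface TermSink {
        boolean writeSorted();
        void writeTerm(String term, Integer frequency);
    }

    /** Feeds every term to the sink, sorting first only if the sink asks for it. */
    static void flush(Map<String, Integer> terms, TermSink sink) {
        // A TreeMap iterates its keys in natural (lexicographic) String order.
        Map<String, Integer> ordered = sink.writeSorted() ? new TreeMap<>(terms) : terms;
        for (Map.Entry<String, Integer> e : ordered.entrySet()) {
            sink.writeTerm(e.getKey(), e.getValue());
        }
    }

    /** Records the order in which terms reach a sink that requires sorted input. */
    static List<String> flushAndRecord(Map<String, Integer> terms) {
        List<String> seen = new ArrayList<>();
        flush(terms, new TermSink() {
            public boolean writeSorted() { return true; }
            public void writeTerm(String term, Integer frequency) { seen.add(term); }
        });
        return seen;
    }

    public static void main(String[] args) {
        Map<String, Integer> terms = new HashMap<>();
        terms.put("retrieval", 3);
        terms.put("index", 5);
        terms.put("terrier", 1);
        System.out.println(flushAndRecord(terms)); // [index, retrieval, terrier]
    }
}
```

Copying into a TreeMap only when the flag is set keeps the unsorted path free of the sorting cost.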


beginWrite

public void beginWrite(int maxSize,
                       int size)
                throws java.io.IOException
Writes the headers of the run.

Parameters:
maxSize - max size of a posting.
size - number of postings in the run.
Throws:
java.io.IOException - if an I/O error occurs.

writeTerm

public void writeTerm(java.lang.String term,
                      Posting post)
               throws java.io.IOException
Writes the information for a given term.

Parameters:
term - the term to write.
post - the Posting with the data of the term.
Throws:
java.io.IOException - if an I/O error occurs.

finishWrite

public void finishWrite()
                 throws java.io.IOException
Closes the output streams.

Throws:
java.io.IOException - if an I/O error occurs.
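
The three methods above appear to form a fixed sequence: beginWrite once, writeTerm once per term, then finishWrite. The sketch below mirrors that sequence with a hypothetical SketchWriter over a plain DataOutputStream. The record layout is an assumption made for illustration and is not Terrier's on-disk format; a single int stands in for the Posting.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RunLifecycleSketch {

    /** Hypothetical stand-in mirroring RunWriter's three-step protocol. */
    static class SketchWriter {
        private final DataOutputStream out;

        SketchWriter(DataOutputStream out) { this.out = out; }

        /** Header: the two values a reader needs before the first term. */
        void beginWrite(int maxSize, int size) throws IOException {
            out.writeInt(maxSize);
            out.writeInt(size);
        }

        /** One record per term; a real posting carries more than one int. */
        void writeTerm(String term, int termFrequency) throws IOException {
            out.writeUTF(term);
            out.writeInt(termFrequency);
        }

        /** Flushes and closes the underlying stream. */
        void finishWrite() throws IOException {
            out.close();
        }
    }

    /** Writes a two-term run into memory and returns the raw bytes. */
    static byte[] writeRun() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            SketchWriter writer = new SketchWriter(new DataOutputStream(buffer));
            writer.beginWrite(5, 2);      // assumed: largest posting is 5, two postings follow
            writer.writeTerm("index", 5);
            writer.writeTerm("terrier", 1);
            writer.finishWrite();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(writeRun()));
        int maxSize = in.readInt();       // known up front, so a reader can size buffers once
        int size = in.readInt();
        System.out.println("maxSize=" + maxSize + " size=" + size);
        for (int i = 0; i < size; i++) {
            System.out.println(in.readUTF() + " " + in.readInt());
        }
    }
}
```

Writing maxSize in the header is what lets a merging reader allocate its working memory before it sees the first term.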

toString

public java.lang.String toString()

Overrides:
toString in class java.lang.Object


Terrier 3.5. Copyright © 2004-2011 University of Glasgow