Terrier IR Platform
2.2.1

uk.ac.gla.terrier.structures
Class DocumentIndexInputStream

java.lang.Object
  extended by uk.ac.gla.terrier.structures.DocumentIndexInputStream
All Implemented Interfaces:
IndexConfigurable
Direct Known Subclasses:
SimpleDocumentIndexInputStream

public class DocumentIndexInputStream
extends java.lang.Object
implements IndexConfigurable

This class provides sequential access to the document index file, as a stream. Each entry in the document index consists of a document id, a document number (docno), and the length of the document, that is, the number of terms that make up the document.

Version:
$Revision: 1.28 $
Author:
Vassilis Plachouras

Constructor Summary
DocumentIndexInputStream()
          A default constructor of a document index input stream, taking no arguments.
DocumentIndexInputStream(java.io.File file)
          A constructor of a document index, from a given file.
DocumentIndexInputStream(java.io.InputStream is)
          A constructor of a document index, from a given input stream.
DocumentIndexInputStream(java.lang.String filename)
          A constructor of a document index, from a given filename.
DocumentIndexInputStream(java.lang.String path, java.lang.String prefix)
          A constructor of a document index input stream from an index path and prefix.
 
Method Summary
 void close()
          Closes the stream.
 int getDocumentId()
          Returns the id of the document in the current entry.
 int getDocumentLength()
          Returns the length of the document in the current entry.
 java.lang.String getDocumentNumber()
          Returns the document number (docno) of the document in the current entry.
 byte getEndBitOffset()
          Returns the bit offset in the ending byte in the direct file's entry for this document.
 long getEndOffset()
          Returns the offset of the ending byte in the direct file for this document.
 byte getStartBitOffset()
          Returns the bit offset in the starting byte in the direct file's entry for this document.
 long getStartOffset()
          Returns the offset of the starting byte in the direct file for this document.
 void print()
          Prints out to the standard error stream the contents of the document index file.
 int readNextEntry()
          Reads the next entry from the stream.
 void setDocnoEntryLength(int l)
          Sets the length of docnos in the index file.
 void setIndex(Index i)
          This structure can be configured by the Index object.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentIndexInputStream

public DocumentIndexInputStream(java.io.InputStream is)
A constructor of a document index, from a given input stream.

Parameters:
is - java.io.InputStream The underlying input stream

DocumentIndexInputStream

public DocumentIndexInputStream(java.lang.String filename)
A constructor of a document index, from a given filename.

Parameters:
filename - java.lang.String The name of the document index file.

DocumentIndexInputStream

public DocumentIndexInputStream()
A default constructor of a document index input stream, taking no arguments.


DocumentIndexInputStream

public DocumentIndexInputStream(java.io.File file)
A constructor of a document index, from a given file.

Parameters:
file - java.io.File The document index file.

DocumentIndexInputStream

public DocumentIndexInputStream(java.lang.String path,
                                java.lang.String prefix)
A constructor of a document index input stream from an index path and prefix.

Parameters:
path - String path to the index
prefix - String prefix of the filenames of the index

Method Detail

setIndex

public void setIndex(Index i)
This structure can be configured by the Index object. In particular, the docno length, docno.byte.length, can be picked up automatically from the index for non-default installations.

Specified by:
setIndex in interface IndexConfigurable
Parameters:
i - Index object to use

setDocnoEntryLength

public void setDocnoEntryLength(int l)
Sets the length of docnos in the index file.
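
A configurable docno length suggests that docnos occupy a fixed-width field in each entry. The sketch below shows one way such a field could be written and read back. FixedWidthDocnoSketch is a hypothetical helper, not part of Terrier, and both the zero padding and the 20-byte width in the usage are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper; NOT part of Terrier. Padding scheme is an assumption. */
class FixedWidthDocnoSketch {

    /** Encodes a docno into exactly `width` bytes, padding with zero bytes. */
    static byte[] encode(String docno, int width) {
        byte[] raw = docno.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > width) {
            throw new IllegalArgumentException("docno longer than " + width + " bytes");
        }
        return Arrays.copyOf(raw, width); // copyOf pads the tail with zeros
    }

    /** Decodes a fixed-width field, dropping the trailing padding. */
    static String decode(byte[] field) {
        // String.trim() strips leading and trailing chars <= U+0020, including NUL.
        return new String(field, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] field = encode("WT01-B01-1", 20); // 20 is an assumed width
        System.out.println(field.length + " bytes, docno=" + decode(field));
    }
}
```

With a fixed-width layout of this kind, a reader configured with the wrong width would misalign every later entry, which is presumably why the length has to match the one used when the index was built.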


close

public void close()
Closes the stream.


readNextEntry

public int readNextEntry()
                  throws java.io.IOException
Reads the next entry from the stream.

Returns:
the number of bytes read from the stream, or -1 if EOF has been reached.
Throws:
java.io.IOException - if an I/O error occurs.
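
The read-then-query pattern implied by readNextEntry() and the getters can be sketched with a self-contained stand-in. EntryStreamSketch below is hypothetical and is not the Terrier class. Its record layout (a 4-byte id, a 4-byte length, then a 20-byte docno field) is an assumption made for illustration, not the documented on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in; NOT the Terrier class. The layout is an assumption. */
class EntryStreamSketch {
    static final int DOCNO_BYTES = 20; // assumed fixed docno width

    private final DataInputStream in;
    private int docid;
    private int length;
    private String docno;

    EntryStreamSketch(InputStream is) {
        this.in = new DataInputStream(is);
    }

    /** Reads one entry; returns the bytes consumed, or -1 at end of stream. */
    int readNextEntry() throws IOException {
        try {
            docid = in.readInt();
        } catch (EOFException eof) {
            return -1; // no further entries
        }
        length = in.readInt();
        byte[] raw = new byte[DOCNO_BYTES];
        in.readFully(raw);
        docno = new String(raw, StandardCharsets.US_ASCII).trim();
        return 4 + 4 + DOCNO_BYTES;
    }

    // The getters describe the entry most recently read.
    int getDocumentId() { return docid; }
    int getDocumentLength() { return length; }
    String getDocumentNumber() { return docno; }

    void close() {
        try { in.close(); } catch (IOException e) { /* ignored in this sketch */ }
    }

    /** Builds three sample entries in memory using the assumed layout. */
    static byte[] sample() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        String[] docnos = {"DOC-A", "DOC-B", "DOC-C"};
        int[] lengths = {120, 45, 300};
        for (int i = 0; i < docnos.length; i++) {
            out.writeInt(i);
            out.writeInt(lengths[i]);
            out.write(Arrays.copyOf(docnos[i].getBytes(StandardCharsets.US_ASCII), DOCNO_BYTES));
        }
        out.flush();
        return buf.toByteArray();
    }

    /** Iterates over all entries, collecting "id docno length" per entry. */
    static List<String> demo() {
        try {
            EntryStreamSketch s = new EntryStreamSketch(new ByteArrayInputStream(sample()));
            List<String> rows = new ArrayList<>();
            while (s.readNextEntry() != -1) {
                rows.add(s.getDocumentId() + " " + s.getDocumentNumber() + " " + s.getDocumentLength());
            }
            s.close();
            return rows;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String row : demo()) {
            System.out.println(row);
        }
    }
}
```

The loop condition relies only on the documented contract that readNextEntry() returns -1 at end of file. Against the real class, the same loop shape would apply, with the getters called after each successful read.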

print

public void print()
Prints out to the standard error stream the contents of the document index file.


getDocumentId

public int getDocumentId()
Returns the id of the document in the current entry.

Returns:
int The document's id

getDocumentLength

public int getDocumentLength()
Returns the length of the document in the current entry.

Returns:
int The document's length

getDocumentNumber

public java.lang.String getDocumentNumber()
Returns the document number (docno) of the document in the current entry.

Returns:
the document number of the document in the current entry.

getEndBitOffset

public byte getEndBitOffset()
Returns the bit offset in the ending byte in the direct file's entry for this document.

Returns:
byte the bit offset in the ending byte in the direct file's entry for this document

getEndOffset

public long getEndOffset()
Returns the offset of the ending byte in the direct file for this document.

Returns:
long the offset of the ending byte in the direct file for this document

getStartBitOffset

public byte getStartBitOffset()
Returns the bit offset in the starting byte in the direct file's entry for this document.

Returns:
byte the bit offset in the starting byte in the entry in the direct file.

getStartOffset

public long getStartOffset()
Returns the offset of the starting byte in the direct file for this document.

Returns:
long the offset of the starting byte in the direct file
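
Taken together, the four offset getters describe a bit-level span in the direct file. As a rough illustration, the size of that span could be computed as below. DirectFileSpanSketch is a hypothetical helper, not part of Terrier, and it assumes that bit offsets run from 0 to 7 within a byte and that both the start and end positions are inclusive; neither assumption is stated on this page.

```java
/** Hypothetical helper; NOT part of Terrier. Inclusive bounds are an assumption. */
class DirectFileSpanSketch {

    /**
     * Number of bits from (startOffset, startBitOffset) to
     * (endOffset, endBitOffset), counting both end points.
     */
    static long spanInBits(long startOffset, byte startBitOffset,
                           long endOffset, byte endBitOffset) {
        return (endOffset - startOffset) * 8L
                + (endBitOffset - startBitOffset)
                + 1; // +1 because the end bit itself is counted
    }

    public static void main(String[] args) {
        // Byte 10 bits 3..7 (5 bits) + byte 11 (8 bits) + byte 12 bits 0..5 (6 bits)
        System.out.println(spanInBits(10L, (byte) 3, 12L, (byte) 5));
    }
}
```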

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow