Terrier IR Platform
2.2.1

uk.ac.gla.terrier.structures
Class DocumentIndexInMemory

java.lang.Object
  extended by uk.ac.gla.terrier.structures.DocumentIndex
      extended by uk.ac.gla.terrier.structures.DocumentIndexInMemory
All Implemented Interfaces:
Closeable, IndexConfigurable

public class DocumentIndexInMemory
extends DocumentIndex

This class extends DocumentIndex, but instead of accessing the disk file each time, the data are loaded into memory, in order to decrease access time.

Version:
$Revision: 1.33 $
Author:
Vassilis Plachouras

Field Summary
 
Fields inherited from class uk.ac.gla.terrier.structures.DocumentIndex
entryLength
 
Constructor Summary
DocumentIndexInMemory()
          The default constructor for DocumentIndexInMemory.
DocumentIndexInMemory(java.lang.String filename)
          A constructor for DocumentIndexInMemory that specifies the file to open.
DocumentIndexInMemory(java.lang.String path, java.lang.String prefix)
           
 
Method Summary
 FilePosition getDirectIndexEndOffset()
          Returns the ending offset of the current document's entry in the direct index.
 FilePosition getDirectIndexStartOffset()
          Returns the starting offset of the current document's entry in the direct index.
 int getDocumentId(java.lang.String docno)
          Returns the id of a document with a given document number.
 int getDocumentLength(int docid)
          Returns the length of a document with a given id.
 int getDocumentLength(java.lang.String docno)
          Returns the length of the document with a given document number.
 java.lang.String getDocumentNumber(int docid)
          Returns the number of a document with a given id.
 int getNumberOfDocuments()
          Returns the number of documents.
 void loadIntoMemory(java.io.DataInputStream dis, int numOfEntries)
          This method loads the data into memory.
 void print()
          Prints the in-memory document index structure to standard error.
 boolean seek(int i)
          This method overrides the seek(int docid) method of the DocumentIndex class.
 void setDocnoEntryLength(int l)
          Sets the length of docnos in the index file.
 
Methods inherited from class uk.ac.gla.terrier.structures.DocumentIndex
close, main, seek, setIndex
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentIndexInMemory

public DocumentIndexInMemory()
The default constructor for DocumentIndexInMemory. Opens the document index file and reads its contents into memory.


DocumentIndexInMemory

public DocumentIndexInMemory(java.lang.String filename)
A constructor for DocumentIndexInMemory that specifies the file to open. Opens the document index file and reads its contents into memory. The name of the document pointers file is derived by replacing the extension of the document index file with the default extension for that file.

Parameters:
filename - java.lang.String The name of the document index file to open

DocumentIndexInMemory

public DocumentIndexInMemory(java.lang.String path,
                             java.lang.String prefix)

Method Detail

setDocnoEntryLength

public void setDocnoEntryLength(int l)
Sets the length of docnos in the index file.

Overrides:
setDocnoEntryLength in class DocumentIndex

print

public void print()
Prints the in-memory document index structure to standard error.

Overrides:
print in class DocumentIndex

getDocumentId

public int getDocumentId(java.lang.String docno)
Returns the id of a document with a given document number.

Overrides:
getDocumentId in class DocumentIndex
Parameters:
docno - java.lang.String The document's number
Returns:
int The document's id, or a negative number if a document with the given number doesn't exist.
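
The contract described for getDocumentId (a negative return value when the document number is unknown) can be illustrated with a small stand-alone sketch. This is not the Terrier implementation: the class DocnoLookupSketch, its hash map and the sample document numbers are hypothetical, and the real class may locate document numbers in a different way.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a docno-to-docid lookup; not Terrier code. */
final class DocnoLookupSketch {
    // Assumption: docids are the positions of the docnos in this array.
    private final String[] docnos;
    private final Map<String, Integer> idByDocno = new HashMap<>();

    DocnoLookupSketch(String[] docnos) {
        this.docnos = docnos.clone();
        for (int docid = 0; docid < docnos.length; docid++) {
            idByDocno.put(docnos[docid], docid);
        }
    }

    /** Returns the docid for a docno, or -1 if the docno is unknown. */
    int getDocumentId(String docno) {
        Integer docid = idByDocno.get(docno);
        return docid == null ? -1 : docid;
    }

    /** Returns the docno stored for a docid. */
    String getDocumentNumber(int docid) {
        return docnos[docid];
    }

    public static void main(String[] args) {
        DocnoLookupSketch index =
                new DocnoLookupSketch(new String[] {"DOC-001", "DOC-002", "DOC-003"});
        int known = index.getDocumentId("DOC-002");
        int unknown = index.getDocumentId("DOC-999");
        System.out.println("DOC-002 -> " + known);
        System.out.println("DOC-999 -> " + unknown);
        if (unknown < 0) {
            System.out.println("Callers should check for a negative id before using it.");
        }
    }
}
```

Checking for a negative result before passing the id to other methods avoids indexing into the in-memory arrays with an invalid position.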

getDocumentLength

public int getDocumentLength(int docid)
Returns the length of a document with a given id.

Overrides:
getDocumentLength in class DocumentIndex
Parameters:
docid - the document's id
Returns:
int The document's length

getDocumentLength

public int getDocumentLength(java.lang.String docno)
Returns the length of the document with a given document number.

Overrides:
getDocumentLength in class DocumentIndex
Parameters:
docno - java.lang.String The document's number
Returns:
int The document's length

getDocumentNumber

public java.lang.String getDocumentNumber(int docid)
Returns the number of a document with a given id.

Overrides:
getDocumentNumber in class DocumentIndex
Parameters:
docid - int The document's id
Returns:
java.lang.String The document's number

getDirectIndexEndOffset

public FilePosition getDirectIndexEndOffset()
Returns the ending offset of the current document's entry in the direct index.

Overrides:
getDirectIndexEndOffset in class DocumentIndex
Returns:
FilePosition an offset in the direct index.

getNumberOfDocuments

public int getNumberOfDocuments()
Returns the number of documents.

Overrides:
getNumberOfDocuments in class DocumentIndex
Returns:
int the total number of indexed documents.

getDirectIndexStartOffset

public FilePosition getDirectIndexStartOffset()
Returns the starting offset of the current document's entry in the direct index.

Overrides:
getDirectIndexStartOffset in class DocumentIndex
Returns:
FilePosition an offset in the direct index.

loadIntoMemory

public void loadIntoMemory(java.io.DataInputStream dis,
                           int numOfEntries)
                    throws java.io.IOException
This method loads the data into memory.

Parameters:
dis - java.io.DataInputStream The input stream from which the data are read
numOfEntries - int The number of entries to read
Throws:
java.io.IOException - An input/output exception is thrown if there is any error while reading from disk.
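
As a rough illustration of what loading a fixed number of entries from a DataInputStream into memory can look like, the sketch below reads records into parallel arrays. The record layout (docid, document length, a 20-byte padded docno, an end byte offset and an end bit offset) is an assumption made for this example, as are the class name InMemoryLoadSketch and the sample values; the page above does not specify the file format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of loading index entries into memory; not Terrier code. */
final class InMemoryLoadSketch {
    /** Assumed fixed width of the docno field, in bytes. */
    static final int DOCNO_BYTES = 20;

    final int[] docids;
    final int[] lengths;
    final String[] docnos;
    final long[] endByteOffsets;
    final byte[] endBitOffsets;

    /** Reads numOfEntries fixed-length records into parallel arrays. */
    InMemoryLoadSketch(DataInputStream dis, int numOfEntries) throws IOException {
        docids = new int[numOfEntries];
        lengths = new int[numOfEntries];
        docnos = new String[numOfEntries];
        endByteOffsets = new long[numOfEntries];
        endBitOffsets = new byte[numOfEntries];
        byte[] docnoBuffer = new byte[DOCNO_BYTES];
        for (int i = 0; i < numOfEntries; i++) {
            docids[i] = dis.readInt();
            lengths[i] = dis.readInt();
            dis.readFully(docnoBuffer);
            // trim() drops the NUL padding that follows the docno.
            docnos[i] = new String(docnoBuffer, StandardCharsets.US_ASCII).trim();
            endByteOffsets[i] = dis.readLong();
            endBitOffsets[i] = dis.readByte();
        }
    }

    /** Builds two sample records in the assumed layout. */
    static byte[] sampleData() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeEntry(out, 0, 120, "DOC-001", 63L, (byte) 7);
            writeEntry(out, 1, 250, "DOC-002", 190L, (byte) 3);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeEntry(DataOutputStream out, int docid, int length,
                                   String docno, long endByte, byte endBit) throws IOException {
        out.writeInt(docid);
        out.writeInt(length);
        // copyOf pads (or truncates) the docno to the fixed field width.
        out.write(Arrays.copyOf(docno.getBytes(StandardCharsets.US_ASCII), DOCNO_BYTES));
        out.writeLong(endByte);
        out.writeByte(endBit);
    }

    /** Loads the sample data, wrapping checked I/O errors for convenience. */
    static InMemoryLoadSketch loadSample() {
        try {
            return new InMemoryLoadSketch(
                    new DataInputStream(new ByteArrayInputStream(sampleData())), 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InMemoryLoadSketch index = loadSample();
        for (int i = 0; i < index.docnos.length; i++) {
            System.out.println(index.docids[i] + " " + index.docnos[i]
                    + " length=" + index.lengths[i]);
        }
    }
}
```

Once the arrays are filled, every lookup becomes an array access rather than a disk read, which is the trade-off the class description refers to: more memory in exchange for lower access time.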

seek

public boolean seek(int i)
This method overrides the seek(int docid) method of the DocumentIndex class.

Overrides:
seek in class DocumentIndex
Parameters:
i - the docid of the document we are looking for.
Returns:
always true because we are handling a stream and arbitrary seeking is not allowed.
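
The offset getters documented above refer to the "current document", which suggests a cursor that seek(int) positions before the offsets are read. The sketch below shows one way such a cursor could work. It is an assumption-laden illustration, not the Terrier implementation: the class SeekCursorSketch is hypothetical, offsets are modelled as {byte, bit} pairs rather than FilePosition objects, and the start of an entry is assumed to be the bit immediately after the previous entry's end.

```java
/** Hypothetical sketch of a seek-then-read cursor over in-memory offsets; not Terrier code. */
final class SeekCursorSketch {
    private final long[] endByte; // byte offset at which each entry ends
    private final byte[] endBit;  // bit offset (0-7) within that byte
    private int current = 0;      // the "current document" set by seek

    SeekCursorSketch(long[] endByte, byte[] endBit) {
        this.endByte = endByte.clone();
        this.endBit = endBit.clone();
    }

    /** Positions the cursor on the given docid; this sketch always returns true. */
    boolean seek(int docid) {
        current = docid;
        return true;
    }

    /** Returns {byte, bit} of the first bit of the current entry. */
    long[] getStartOffset() {
        if (current == 0) {
            return new long[] {0L, 0L};
        }
        // Assumption: an entry starts one bit after the previous entry ends.
        long startByte = endByte[current - 1];
        long startBit = endBit[current - 1] + 1;
        if (startBit == 8) {
            startByte++;
            startBit = 0;
        }
        return new long[] {startByte, startBit};
    }

    /** Returns {byte, bit} of the last bit of the current entry. */
    long[] getEndOffset() {
        return new long[] {endByte[current], endBit[current]};
    }

    public static void main(String[] args) {
        SeekCursorSketch index = new SeekCursorSketch(
                new long[] {10L, 25L, 25L}, new byte[] {7, 3, 6});
        for (int docid = 0; docid < 3; docid++) {
            index.seek(docid);
            long[] start = index.getStartOffset();
            long[] end = index.getEndOffset();
            System.out.println("doc " + docid + ": start=(" + start[0] + "," + start[1]
                    + ") end=(" + end[0] + "," + end[1] + ")");
        }
    }
}
```

Because the getters depend on the cursor, callers in this sketch must call seek before reading offsets, and a shared instance would not be safe to use from several threads without synchronization.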

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow