Terrier IR Platform
2.2.1

uk.ac.gla.terrier.indexing
Interface Collection

All Known Implementing Classes:
SimpleFileCollection, SimpleMedlineXMLCollection, SimpleXMLCollection, TRECCollection, TRECUTFCollection

public interface Collection

This interface encapsulates the most fundamental concept in indexing with Terrier: a Collection. Anyone using Terrier to encapsulate a new source of data (a corpus, a collection, etc.) needs to create an object that implements this Collection interface.
The Collection interface is essentially an Iterator over a series of documents. It produces a Document object for each document requested from the collection. It is aware of the types of Document objects available and of how to instantiate them.
Terrier core provides two Collection implementations: TRECCollection and SimpleFileCollection.
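
The iterator-style contract described above can be illustrated with a minimal sketch. It does not use the Terrier classes themselves: the nested Collection and Document types below are stand-ins that only mirror the six method signatures listed on this page, and InMemoryCollection and collectDocids are hypothetical names introduced for illustration. The meaning given to endOfCollection() (true once nextDocument() has returned false) is an assumption, as is the null check on getDocument().

```java
import java.util.ArrayList;
import java.util.List;

public class CollectionSketch {

    /** Stand-in for Terrier's Document; its real methods are not shown on this page. */
    interface Document {}

    /** Stand-in mirroring the six method signatures in the Method Summary. */
    interface Collection {
        boolean nextDocument();
        Document getDocument();
        String getDocid();
        boolean endOfCollection();
        void reset();
        void close();
    }

    /** Hypothetical in-memory collection: one (empty) document per identifier. */
    static final class InMemoryCollection implements Collection {
        private final String[] docids;
        private int current = -1; // index of the current document; -1 before the first

        InMemoryCollection(String... docids) {
            this.docids = docids;
        }

        public boolean nextDocument() {
            if (current + 1 >= docids.length) {
                current = docids.length; // step past the last document
                return false;
            }
            current++;
            return true;
        }

        public Document getDocument() {
            return new Document() {};
        }

        /** Only meaningful after nextDocument() has returned true. */
        public String getDocid() {
            return docids[current];
        }

        /** Assumption: true once nextDocument() has returned false. */
        public boolean endOfCollection() {
            return current >= docids.length;
        }

        public void reset() {
            current = -1;
        }

        public void close() {
            // nothing to release for an in-memory source
        }
    }

    /** Walks the collection once and records each document identifier. */
    static List<String> collectDocids(Collection collection) {
        List<String> seen = new ArrayList<String>();
        while (collection.nextDocument()) {
            Document doc = collection.getDocument();
            if (doc == null) {
                continue; // assumption: skip documents that could not be instantiated
            }
            seen.add(collection.getDocid());
        }
        return seen;
    }

    public static void main(String[] args) {
        Collection collection = new InMemoryCollection("doc1", "doc2", "doc3");
        System.out.println(collectDocids(collection)); // prints [doc1, doc2, doc3]
        collection.close();
    }
}
```

A real implementation would replace the String array with whatever backs the corpus (files, a database, a feed) and return Document objects that expose the document content.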

Version:
$Revision: 1.12 $
Author:
Craig Macdonald

Method Summary
 void close()
          Closes the collection and any files that may be open.
 boolean endOfCollection()
          Returns true if the end of the collection has been reached.
 java.lang.String getDocid()
          Get the String document identifier of the current document.
 Document getDocument()
          Get the document object representing the current document.
 boolean nextDocument()
          Move the collection to the start of the next document.
 void reset()
          Resets the Collection iterator to the start of the collection.
 

Method Detail

nextDocument

boolean nextDocument()
Move the collection to the start of the next document.

Returns:
boolean true if there is another document in the collection, false otherwise.

getDocument

Document getDocument()
Get the document object representing the current document.

Returns:
Document the current document.

getDocid

java.lang.String getDocid()
Get the String document identifier of the current document.

Returns:
String the document identifier of the current document.

endOfCollection

boolean endOfCollection()
Returns true if the end of the collection has been reached.

Returns:
boolean true if the end of the collection has been reached, false otherwise.

reset

void reset()
Resets the Collection iterator to the start of the collection.


close

void close()
Closes the collection and any files that may be open. The Collection can no longer be used after this method has been called.
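
The interplay of reset() and close() can be sketched as follows. LineCollection is a hypothetical, simplified class (it omits getDocument() and does not implement the Terrier interface) that treats each line of a string as one document identifier; the string stands in for a file on disk. Throwing IllegalStateException when the collection is used after close() is an assumption of this sketch, since this page only states that the Collection can no longer be used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LifecycleSketch {

    /** Hypothetical line-oriented source: each line is one document identifier. */
    static final class LineCollection {
        private final String data;     // stands in for a file on disk
        private BufferedReader reader; // null while no reader is open
        private String currentDocid;
        private boolean atEnd;
        private boolean closed;

        LineCollection(String data) {
            this.data = data;
            reset();
        }

        public boolean nextDocument() {
            checkOpen();
            try {
                currentDocid = reader.readLine(); // null at end of input
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            atEnd = (currentDocid == null);
            return !atEnd;
        }

        public String getDocid() {
            return currentDocid;
        }

        public boolean endOfCollection() {
            return atEnd;
        }

        /** Re-opens the underlying source so that iteration starts again. */
        public void reset() {
            checkOpen();
            closeReader();
            reader = new BufferedReader(new StringReader(data));
            currentDocid = null;
            atEnd = false;
        }

        /** Releases the reader; the collection is unusable afterwards. */
        public void close() {
            closeReader();
            closed = true;
        }

        private void closeReader() {
            if (reader == null) {
                return;
            }
            try {
                reader.close();
            } catch (IOException e) {
                // nothing useful to do when closing fails in this sketch
            }
            reader = null;
        }

        private void checkOpen() {
            if (closed) {
                throw new IllegalStateException("collection is closed");
            }
        }
    }

    /** Counts the documents remaining from the current position. */
    static int countDocuments(LineCollection collection) {
        int count = 0;
        while (collection.nextDocument()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        LineCollection collection = new LineCollection("d1\nd2\nd3");
        System.out.println(countDocuments(collection)); // first pass
        collection.reset();
        System.out.println(countDocuments(collection)); // second pass after reset()
        collection.close();
        try {
            collection.nextDocument();
        } catch (IllegalStateException e) {
            System.out.println("closed: " + e.getMessage());
        }
    }
}
```

Both passes see the same three documents because reset() rebuilds the reader from the start of the source; after close(), any further call that needs the reader fails fast rather than returning stale data.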


Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow