Package org.terrier.indexing
Interface Collection
- All Superinterfaces:
java.lang.AutoCloseable, java.io.Closeable
- All Known Implementing Classes:
CollectionDocumentList, MultiDocumentFileCollection, SimpleFileCollection, SimpleMedlineXMLCollection, SimpleXMLCollection, TRECCollection, TRECUTFCollection, TRECWebCollection, TwitterJSONCollection, WARC018Collection, WARC09Collection, WARC10Collection
public interface Collection extends java.io.Closeable
This interface encapsulates the most fundamental concept in indexing with Terrier: a Collection. Anyone using Terrier to encapsulate a new source of data (a corpus, collection, etc.) needs to create an object that implements this Collection interface.
The Collection interface is essentially an Iterator over a series of documents. It generates a Document object for each document requested from the collection. It is aware of the type of Document objects available, and how to instantiate them.
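The iterator contract described above can be illustrated with a small in-memory example. This is a minimal sketch, not Terrier code: the nested Document and Collection interfaces are stand-ins that mirror only the four methods documented on this page, and text(), ListCollection and readAll are hypothetical names introduced for illustration. Returning null from getDocument() when there is no current document, and having endOfCollection() turn true only after nextDocument() has returned false, are assumptions; the source does not specify either behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InMemoryCollectionSketch {

    /** Simplified stand-in for org.terrier.indexing.Document; text() is a hypothetical accessor. */
    public interface Document {
        String text();
    }

    /** Stand-in mirroring the four methods documented for org.terrier.indexing.Collection. */
    public interface Collection extends java.io.Closeable {
        boolean nextDocument();
        Document getDocument();
        boolean endOfCollection();
        void reset();
    }

    /** A collection backed by a list of strings, one string per document. */
    public static class ListCollection implements Collection {
        private final List<String> texts;
        private int current = -1; // index of the current document; -1 before the first nextDocument()

        public ListCollection(List<String> texts) {
            this.texts = new ArrayList<>(texts);
        }

        @Override
        public boolean nextDocument() {
            if (current + 1 >= texts.size()) {
                current = texts.size(); // step past the last document
                return false;
            }
            current++;
            return true;
        }

        @Override
        public Document getDocument() {
            if (current < 0 || current >= texts.size()) {
                return null; // assumed behaviour when there is no current document
            }
            final String text = texts.get(current);
            return () -> text; // Document has a single method, so a lambda is enough here
        }

        @Override
        public boolean endOfCollection() {
            return current >= texts.size();
        }

        @Override
        public void reset() {
            current = -1;
        }

        @Override
        public void close() {
            // nothing to release for an in-memory list
        }
    }

    /** Walks the collection once and returns the text of every document, in order. */
    public static List<String> readAll(Collection collection) {
        List<String> result = new ArrayList<>();
        while (collection.nextDocument()) {
            Document doc = collection.getDocument();
            if (doc != null) {
                result.add(doc.text());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ListCollection collection = new ListCollection(Arrays.asList("first doc", "second doc"));
        System.out.println(readAll(collection));
    }
}
```

A real implementation would typically read from files or a stream and build Terrier Document objects, but the control flow (advance with nextDocument(), then fetch with getDocument()) would follow the same shape.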
Terrier core provides two Collection implementations: TRECCollection and SimpleFileCollection.
- Author:
- Craig Macdonald
Method Summary
Modifier and Type   Method              Description
boolean             endOfCollection()   Returns true if the end of the collection has been reached.
Document            getDocument()       Get the document object representing the current document.
boolean             nextDocument()      Move the collection to the start of the next document.
void                reset()             Resets the Collection iterator to the start of the collection.
Method Detail
-
nextDocument
boolean nextDocument()
Move the collection to the start of the next document.
- Returns:
- true if there is another document in the collection, false otherwise.
-
getDocument
Document getDocument()
Get the document object representing the current document.
- Returns:
- the current Document.
-
endOfCollection
boolean endOfCollection()
Returns true if the end of the collection has been reached.
- Returns:
- true if the end of the collection has been reached, false otherwise.
-
reset
void reset()
Resets the Collection iterator to the start of the collection.
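As an illustration of reset(), the sketch below makes two passes over the same collection and rewinds in between. Because Collection extends java.io.Closeable, it can also be managed with try-with-resources. The Document and Collection interfaces here are simplified stand-ins for the Terrier types, and CountingCollection, countDocuments and countTwice are hypothetical names used only for this example. Whether a particular Terrier implementation can rewind cheaply is not stated in the source.

```java
public class ResetSketch {

    /** Simplified stand-in for org.terrier.indexing.Document (no methods needed here). */
    public interface Document { }

    /** Stand-in mirroring the four methods documented for org.terrier.indexing.Collection. */
    public interface Collection extends java.io.Closeable {
        boolean nextDocument();
        Document getDocument();
        boolean endOfCollection();
        void reset();
    }

    /** Hypothetical collection of `size` empty documents, used only to exercise the loop. */
    public static class CountingCollection implements Collection {
        private final int size;
        private int position = 0;          // number of documents handed out so far
        private boolean exhausted = false; // set once nextDocument() has returned false
        public boolean closed = false;

        public CountingCollection(int size) {
            this.size = size;
        }

        @Override
        public boolean nextDocument() {
            if (position >= size) {
                exhausted = true;
                return false;
            }
            position++;
            return true;
        }

        @Override
        public Document getDocument() {
            // A document is only available between a successful nextDocument() and the end.
            return (position > 0 && !exhausted) ? new Document() { } : null;
        }

        @Override
        public boolean endOfCollection() {
            return exhausted;
        }

        @Override
        public void reset() {
            position = 0;
            exhausted = false;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Counts the documents seen in a single pass. */
    public static int countDocuments(Collection collection) {
        int count = 0;
        while (collection.nextDocument()) {
            if (collection.getDocument() != null) {
                count++;
            }
        }
        return count;
    }

    /** Runs two passes over the same collection, rewinding with reset() in between. */
    public static int[] countTwice(Collection collection) {
        int first = countDocuments(collection);
        collection.reset();
        int second = countDocuments(collection);
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        // try-with-resources closes the collection when the block exits
        try (CountingCollection collection = new CountingCollection(3)) {
            int[] counts = countTwice(collection);
            System.out.println(counts[0] + " then " + counts[1]); // prints "3 then 3"
        }
    }
}
```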