Terrier IR Platform
2.2.1

uk.ac.gla.terrier.indexing
Class SimpleXMLCollection

java.lang.Object
  extended by uk.ac.gla.terrier.indexing.SimpleXMLCollection
All Implemented Interfaces:
Collection
Direct Known Subclasses:
SimpleMedlineXMLCollection

public class SimpleXMLCollection
extends java.lang.Object
implements Collection

Initial implementation of a class that generates a Collection with Documents from a series of XML files.

Properties:


Field Summary
static java.lang.String ELEMENT_ATTR_SEPARATOR
           
static int tokenMaximumLength
           
 
Constructor Summary
SimpleXMLCollection()
           
SimpleXMLCollection(java.lang.String CollectionSpecFilename, java.lang.String BlacklistSpecFilename)
           
 
Method Summary
 void close()
          Closes the collection and any files that may be open.
 boolean endOfCollection()
          Returns true if the end of the collection has been reached.
 java.lang.String getDocid()
          Get the String document identifier of the current document.
 Document getDocument()
          Get the document object representing the current document.
static void main(java.lang.String[] args)
           
 boolean nextDocument()
          Move the collection to the start of the next document.
 void reset()
          Resets the Collection iterator to the start of the collection.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ELEMENT_ATTR_SEPARATOR

public static final java.lang.String ELEMENT_ATTR_SEPARATOR
See Also:
Constant Field Values

tokenMaximumLength

public static final int tokenMaximumLength

Constructor Detail

SimpleXMLCollection

public SimpleXMLCollection()

SimpleXMLCollection

public SimpleXMLCollection(java.lang.String CollectionSpecFilename,
                           java.lang.String BlacklistSpecFilename)

Method Detail

close

public void close()
Description copied from interface: Collection
Closes the collection and any files that may be open. The collection can no longer be used after this method has been called.

Specified by:
close in interface Collection

endOfCollection

public boolean endOfCollection()
Description copied from interface: Collection
Returns true if the end of the collection has been reached.

Specified by:
endOfCollection in interface Collection
Returns:
true if the end of the collection has been reached, false otherwise.

getDocid

public java.lang.String getDocid()
Description copied from interface: Collection
Get the String document identifier of the current document.

Specified by:
getDocid in interface Collection
Returns:
the String identifier of the current document.

nextDocument

public boolean nextDocument()
Description copied from interface: Collection
Move the collection to the start of the next document.

Specified by:
nextDocument in interface Collection
Returns:
true if there is another document in the collection, false otherwise.

getDocument

public Document getDocument()
Description copied from interface: Collection
Get the document object representing the current document.

Specified by:
getDocument in interface Collection
Returns:
the Document object for the current document.
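
The methods nextDocument, getDocid and getDocument together suggest a cursor-style iteration loop. The following is a minimal sketch of that loop, not Terrier code: DocCollection is a hypothetical stand-in that mirrors only the method names listed on this page, getDocument returns a String in place of a real Document, and InMemoryCollection is an illustrative implementation backed by a map. The behaviour of the real SimpleXMLCollection may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollectionIterationSketch {

    /** Hypothetical stand-in mirroring three Collection methods from this page. */
    interface DocCollection {
        boolean nextDocument();
        String getDocid();
        String getDocument(); // assumption: a String stands in for Document
    }

    /** Illustrative in-memory implementation; not part of Terrier. */
    static class InMemoryCollection implements DocCollection {
        private final List<Map.Entry<String, String>> docs;
        private int position = -1; // cursor sits before the first document

        InMemoryCollection(Map<String, String> docidToText) {
            this.docs = new ArrayList<>(docidToText.entrySet());
        }

        public boolean nextDocument() {
            if (position + 1 >= docs.size()) {
                return false; // no further documents
            }
            position++;
            return true;
        }

        public String getDocid() {
            return current().getKey();
        }

        public String getDocument() {
            return current().getValue();
        }

        private Map.Entry<String, String> current() {
            if (position < 0) {
                throw new IllegalStateException("call nextDocument() first");
            }
            return docs.get(position);
        }
    }

    /** Advances the cursor until nextDocument() returns false, reading each document. */
    static Map<String, String> readAll(DocCollection collection) {
        Map<String, String> result = new LinkedHashMap<>();
        while (collection.nextDocument()) {
            result.put(collection.getDocid(), collection.getDocument());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("doc1", "alpha");
        source.put("doc2", "beta");
        System.out.println(readAll(new InMemoryCollection(source)));
    }
}
```

In this sketch getDocid and getDocument are only valid after nextDocument has returned true, which is an assumption about the calling order rather than something this page states.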

reset

public void reset()
Description copied from interface: Collection
Resets the Collection iterator to the start of the collection.

Specified by:
reset in interface Collection
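
Because reset returns the iterator to the start and close makes the collection unusable, a caller that needs two passes would reset between them and close once at the end. The sketch below illustrates that lifecycle under stated assumptions: ResettableCollection and ListBackedCollection are hypothetical stand-ins, not Terrier classes, and throwing IllegalStateException after close is one possible way to model "can no longer be used", not documented behaviour.

```java
import java.util.Arrays;
import java.util.List;

public class TwoPassSketch {

    /** Hypothetical stand-in mirroring three Collection methods from this page. */
    interface ResettableCollection {
        boolean nextDocument();
        void reset();
        void close();
    }

    /** Illustrative list-backed implementation; not part of Terrier. */
    static class ListBackedCollection implements ResettableCollection {
        private final List<String> docids;
        private int position = -1;
        private boolean closed = false;

        ListBackedCollection(List<String> docids) {
            this.docids = docids;
        }

        public boolean nextDocument() {
            ensureOpen();
            if (position + 1 >= docids.size()) {
                return false;
            }
            position++;
            return true;
        }

        public void reset() {
            ensureOpen();
            position = -1; // back to before the first document
        }

        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }

        // Assumption: use after close() is rejected with an exception.
        private void ensureOpen() {
            if (closed) {
                throw new IllegalStateException("collection is closed");
            }
        }
    }

    /** Counts the documents remaining from the current cursor position. */
    static int countDocuments(ResettableCollection collection) {
        int count = 0;
        while (collection.nextDocument()) {
            count++;
        }
        return count;
    }

    /** Runs two full passes, resetting in between, and always closes the collection. */
    static int[] countTwice(ResettableCollection collection) {
        try {
            int first = countDocuments(collection);
            collection.reset();
            int second = countDocuments(collection);
            return new int[] {first, second};
        } finally {
            collection.close();
        }
    }

    public static void main(String[] args) {
        int[] counts = countTwice(new ListBackedCollection(Arrays.asList("d1", "d2", "d3")));
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Placing close in a finally block keeps any underlying files from staying open if a pass fails part-way through.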

main

public static void main(java.lang.String[] args)

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow