Terrier IR Platform
2.2.1

uk.ac.gla.terrier.indexing
Class SimpleFileCollection

java.lang.Object
  extended by uk.ac.gla.terrier.indexing.SimpleFileCollection
All Implemented Interfaces:
Collection

public class SimpleFileCollection
extends java.lang.Object
implements Collection

Implements a collection that can read arbitrary files on disk. It uses either the file list passed to its constructor or the file named by the property collection.spec.

Version:
$Revision: 1.39 $
Author:
Craig Macdonald & Vassilis Plachouras

Field Summary
static java.lang.String NAMESPACE_DOCUMENTS
          The default namespace for all parsers to be loaded from.
 
Constructor Summary
SimpleFileCollection()
          Default constructor. Processes the files specified by the property collection.spec.
SimpleFileCollection(java.util.List<java.lang.String> filelist, boolean recurse)
          Constructs an instance of the class with the given list of files.
SimpleFileCollection(java.lang.String addressCollectionFilename)
          Creates an instance of the class.
 
Method Summary
 void close()
          Closes the collection and any files that may be open.
 boolean endOfCollection()
          Checks whether there are more documents in the collection.
 java.lang.String getDocid()
          Returns the current document's identifier string.
 Document getDocument()
          Return the current document in the collection.
 java.util.List<java.lang.String> getFileList()
          Returns the list of indexed files, in the order in which they were indexed.
static void main(java.lang.String[] args)
          Simple test case.
 boolean nextDocument()
          Move onto the next document in the collection to be processed.
 void reset()
          Starts again from the beginning of the collection.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NAMESPACE_DOCUMENTS

public static final java.lang.String NAMESPACE_DOCUMENTS
The default namespace for all parsers to be loaded from. Only used if the specified class name does not contain any periods ('.').

See Also:
Constant Field Values
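
The namespace rule above can be sketched as follows. This is a minimal illustration, not Terrier's own code: the class ParserNameSketch, the method qualify, the namespace "org.example.parsers." and both parser names are hypothetical, and the sketch assumes the namespace string already ends with a period.

```java
// Hypothetical helper illustrating the documented rule; not part of Terrier.
class ParserNameSketch {

    // Returns className unchanged if it contains a period; otherwise
    // prefixes it with the default namespace.
    // Assumption: defaultNamespace already ends with '.'.
    static String qualify(String className, String defaultNamespace) {
        if (className.indexOf('.') >= 0) {
            return className; // already looks fully qualified
        }
        return defaultNamespace + className;
    }

    public static void main(String[] args) {
        // Both names and the namespace below are made up for illustration.
        System.out.println(qualify("HypotheticalParser", "org.example.parsers."));
        System.out.println(qualify("com.acme.CustomParser", "org.example.parsers."));
    }
}
```

The actual value of NAMESPACE_DOCUMENTS is listed on the Constant Field Values page.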
Constructor Detail

SimpleFileCollection

public SimpleFileCollection(java.util.List<java.lang.String> filelist,
                            boolean recurse)
Constructs an instance of the class with the given list of files.

Parameters:
filelist - List the files to be processed by this collection.

SimpleFileCollection

public SimpleFileCollection()
Default constructor. Processes the files specified by the property collection.spec.


SimpleFileCollection

public SimpleFileCollection(java.lang.String addressCollectionFilename)
Creates an instance of the class. The files to be processed are specified in the file with the given name.

Parameters:
addressCollectionFilename - String the name of the file that contains the list of files to be processed by this collection.
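
As an illustration of what such a list file might look like, the sketch below reads one path per line and skips blank lines. The one-path-per-line format is an assumption rather than something this page states, and FileListSketch, readFileList, demo and the sample paths are hypothetical names that do not belong to Terrier.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Terrier.
class FileListSketch {

    // Reads a list file, assuming one path per line, and skips blank lines.
    static List<String> readFileList(Path listFile) throws IOException {
        List<String> files = new ArrayList<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                files.add(trimmed);
            }
        }
        return files;
    }

    // Writes a small temporary list file, reads it back, and cleans up.
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("collection", ".spec");
            try {
                Files.write(tmp, Arrays.asList("/data/a.txt", "", "/data/b.html"),
                        StandardCharsets.UTF_8);
                return readFileList(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [/data/a.txt, /data/b.html]
    }
}
```

The resulting list has the same shape as the filelist argument of the two-argument constructor.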
Method Detail

nextDocument

public boolean nextDocument()
Move onto the next document in the collection to be processed.

Specified by:
nextDocument in interface Collection
Returns:
boolean true if there are more documents in the collection, otherwise false.
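
A typical way to drive a collection is to call nextDocument in a loop, read the current document's identifier, and close the collection when done. The sketch below shows that pattern without Terrier on the classpath: DocSource and InMemorySource are hypothetical stand-ins that mirror only a subset of the method signatures documented on this page, and the stand-in assumes nextDocument returns true whenever it has advanced to a document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in mirroring a subset of the documented signatures.
// It is NOT a Terrier type.
interface DocSource {
    boolean nextDocument();
    String getDocid();
    boolean endOfCollection();
    void close();
}

// Hypothetical in-memory implementation, present only so the sketch runs.
class InMemorySource implements DocSource {
    private final List<String> ids;
    private int pos = -1; // index of the current document; -1 before the first advance

    InMemorySource(List<String> ids) {
        this.ids = new ArrayList<>(ids);
    }

    // Assumption: returns true when the source advanced to a document.
    public boolean nextDocument() {
        if (pos + 1 >= ids.size()) {
            return false;
        }
        pos++;
        return true;
    }

    public String getDocid() {
        return ids.get(pos);
    }

    public boolean endOfCollection() {
        return pos + 1 >= ids.size();
    }

    public void close() {
        // Nothing to release in this in-memory stand-in.
    }
}

class IterationSketch {
    // Advances through the source, recording each identifier, then closes it.
    static List<String> collectIds(DocSource source) {
        List<String> seen = new ArrayList<>();
        try {
            while (source.nextDocument()) {
                seen.add(source.getDocid());
            }
        } finally {
            source.close(); // the source must not be used after this call
        }
        return seen;
    }

    public static void main(String[] args) {
        DocSource source = new InMemorySource(Arrays.asList("a.txt", "b.html"));
        System.out.println(collectIds(source)); // prints [a.txt, b.html]
    }
}
```

With the real class, the same loop shape would apply to a SimpleFileCollection instance, with getDocument called inside the loop to obtain each Document.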

getDocument

public Document getDocument()
Return the current document in the collection.

Specified by:
getDocument in interface Collection
Returns:
Document the current document object from the collection.

endOfCollection

public boolean endOfCollection()
Checks whether there are more documents in the collection.

Specified by:
endOfCollection in interface Collection
Returns:
boolean true if there are no more documents in the collection, otherwise false.

reset

public void reset()
Starts again from the beginning of the collection.

Specified by:
reset in interface Collection

getDocid

public java.lang.String getDocid()
Returns the current document's identifier string.

Specified by:
getDocid in interface Collection
Returns:
String the identifier of the current document.

close

public void close()
Description copied from interface: Collection
Closes the collection and any files that may be open. The collection can no longer be used after this method has been called.

Specified by:
close in interface Collection

getFileList

public java.util.List<java.lang.String> getFileList()
Returns the list of indexed files, in the order in which they were indexed.


main

public static void main(java.lang.String[] args)
Simple test case. Pass the filename of a file that lists files to be processed to this test case.



Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow