public class SimpleFileCollection extends Object implements Collection
| Modifier and Type | Field and Description | 
|---|---|
| protected InputStream | currentStream: The InputStream of the most recently opened document. |
| protected int | Docid: The identifier of a document in the collection. |
| protected Map<String,Class<? extends Document>> | extension_DocumentClass: Maps filename extensions to Document classes. |
| protected LinkedList<String> | FileList: The list of files to index. |
| protected List<String> | firstList: Contains the list of files first handed to the SimpleFileCollection, allowing the SimpleFileCollection instance to be simply reset. |
| protected List<String> | indexedFiles: This is filled during traversal, so document IDs can be matched with filenames. |
| protected static org.slf4j.Logger | logger |
| static String | NAMESPACE_DOCUMENTS: The default namespace for all parsers to be loaded from. |
| protected boolean | Recurse: Whether directories should be recursed into by this class. |
| protected String | thisFilename: The filename of the current file we are processing. |
| protected Tokeniser | tokeniser |
| Constructor and Description | 
|---|
| SimpleFileCollection(): A default constructor that uses the files to be processed by this collection, as specified by the property collection.spec. |
| SimpleFileCollection(List<String> filelist, boolean recurse): Constructs an instance of the class with the given list of files. |
| SimpleFileCollection(String addressCollectionFilename): Creates an instance of the class. |
| Modifier and Type | Method and Description | 
|---|---|
| protected void | addDirectoryListing(): Called when thisFile is identified as a directory; adds the entire contents of the directory to the list to be processed. |
| void | close() |
| protected void | createExtensionDocumentMapping(): Parses the properties indexing.simplefilecollection.extensionsparsers and indexing.simplefilecollection.defaultparser and attempts to load all the classes they mention into a hashtable mapping filename extensions to their respective parsers. |
| boolean | endOfCollection(): Checks whether there are more documents in the collection. |
| String | getDocid(): Returns the current document's identifier string. |
| Document | getDocument(): Returns the current document in the collection. |
| List<String> | getFileList(): Returns the list of indexed files in the order they were indexed. |
| boolean | hasNext(): Checks whether there is a next document in the collection to be processed. |
| static void | main(String[] args): Simple test case. |
| protected Document | makeDocument(String Filename, InputStream in): Given the opened stream in of the document Filename, works out which parser to try and instantiates it. |
| Document | next(): Moves on to the next document in the collection to be processed. |
| boolean | nextDocument(): Moves on to the next document in the collection to be processed. |
| void | remove(): Unsupported by this Collection implementation; every invocation throws UnsupportedOperationException. |
| void | reset(): Starts again from the beginning of the collection. |
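
The methods nextDocument(), getDocument(), getDocid(), endOfCollection() and reset() are typically driven from a loop. The sketch below imitates that protocol with a hypothetical stand-in class, ListBackedCollection, which is not part of Terrier: it reuses the field names from the table above but yields file names rather than Document objects, so it runs without the Terrier jars. The starting value of the document identifier and the exact reset behaviour are assumptions, and the real class may differ in detail.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in, NOT the Terrier class: it mirrors the documented
// field and method names but returns file names instead of Documents.
class ListBackedCollection {
    private final List<String> firstList;                            // original input, kept for reset()
    private final LinkedList<String> fileList = new LinkedList<>();  // files still to visit
    private final List<String> indexedFiles = new ArrayList<>();     // files visited, in order
    private String thisFilename;                                     // file currently being processed
    private int docid = -1;                                          // assumption: first document gets id 0

    ListBackedCollection(List<String> files) {
        firstList = new ArrayList<>(files);
        fileList.addAll(firstList);
    }

    // Moves to the next file; returns false once nothing is left.
    boolean nextDocument() {
        if (fileList.isEmpty()) {
            return false;
        }
        thisFilename = fileList.removeFirst();
        indexedFiles.add(thisFilename);
        docid++;
        return true;
    }

    String getDocument() { return thisFilename; }
    String getDocid() { return String.valueOf(docid); }
    boolean endOfCollection() { return fileList.isEmpty(); }
    List<String> getFileList() { return indexedFiles; }

    // Starts again from the files first handed to the constructor.
    void reset() {
        fileList.clear();
        fileList.addAll(firstList);
        indexedFiles.clear();
        thisFilename = null;
        docid = -1;
    }
}

class IterationDemo {
    // Drains the collection and returns "docid:filename" for each visit.
    static List<String> drain(ListBackedCollection c) {
        List<String> seen = new ArrayList<>();
        while (c.nextDocument()) {
            seen.add(c.getDocid() + ":" + c.getDocument());
        }
        return seen;
    }
}
```

Keeping firstList separate from the working queue is what makes reset() cheap here: the queue is consumed during traversal, while the original input is never modified.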
protected static final org.slf4j.Logger logger
public static final String NAMESPACE_DOCUMENTS
protected LinkedList<String> FileList
protected List<String> firstList
protected List<String> indexedFiles
protected int Docid
protected boolean Recurse
protected Map<String,Class<? extends Document>> extension_DocumentClass
protected String thisFilename
protected InputStream currentStream
protected Tokeniser tokeniser
public SimpleFileCollection(List<String> filelist, boolean recurse)
filelist - the list of files to be processed by this collection.
public SimpleFileCollection()
public SimpleFileCollection(String addressCollectionFilename)
addressCollectionFilename - the name of the file that contains the list of files to be processed by this collection.
protected void createExtensionDocumentMapping()
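
As a rough illustration of how an extension-to-parser table can be built from a property value, the sketch below parses a specification string into a map and looks up a class by filename extension. The property format (comma-separated extension:ClassName pairs), the rule that names without a dot are resolved inside a namespace, and the helper names ExtensionMapping, parse and classFor are all assumptions made for this example, not the Terrier implementation. JDK classes stand in for Document implementations so that the code runs on its own.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Terrier.
class ExtensionMapping {
    // Builds extension -> class from a spec such as "txt:ArrayList,html:LinkedList".
    // Assumed format: comma-separated extension:ClassName pairs; class names
    // without a dot are resolved inside the given namespace.
    static Map<String, Class<?>> parse(String spec, String namespace) {
        Map<String, Class<?>> mapping = new HashMap<>();
        for (String pair : spec.split("\\s*,\\s*")) {
            String[] parts = pair.trim().split(":", 2);
            if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
                continue; // skip malformed entries
            }
            String className = parts[1].contains(".") ? parts[1] : namespace + parts[1];
            try {
                mapping.put(parts[0].toLowerCase(Locale.ROOT), Class.forName(className));
            } catch (ClassNotFoundException e) {
                // Loading is only attempted: a class that cannot be found is left out.
            }
        }
        return mapping;
    }

    // Looks up the class for a filename, falling back to a default (may be null).
    static Class<?> classFor(String filename, Map<String, Class<?>> mapping, Class<?> fallback) {
        int dot = filename.lastIndexOf('.');
        String ext = dot < 0 ? "" : filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return mapping.getOrDefault(ext, fallback);
    }
}
```

For example, `ExtensionMapping.parse("txt:ArrayList,html:LinkedList", "java.util.")` yields a two-entry map, and a filename with no recognised extension resolves to whatever fallback class the caller passes in, playing the role of the default parser.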
public boolean hasNext()
public Document next()
public void remove()
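
The hasNext(), next() and remove() methods follow the java.util.Iterator pattern, with removal always rejected. A minimal sketch of such a read-only iterator is shown below. ReadOnlyFileIterator is a hypothetical class that iterates over file names rather than Document objects, and throwing NoSuchElementException when the input is exhausted is an assumption borrowed from the general Iterator contract, not something this page states.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical read-only iterator; the element type is String here,
// whereas the documented next() returns a Document.
class ReadOnlyFileIterator implements Iterator<String> {
    private final List<String> files;
    private int position = 0;

    ReadOnlyFileIterator(List<String> files) {
        this.files = files;
    }

    @Override
    public boolean hasNext() {
        return position < files.size();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException(); // assumption, see lead-in
        }
        return files.get(position++);
    }

    @Override
    public void remove() {
        // Removal is never allowed, matching the documented behaviour.
        throw new UnsupportedOperationException();
    }
}
```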
public boolean nextDocument()
nextDocument in interface Collection
public Document getDocument()
getDocument in interface Collection
protected Document makeDocument(String Filename, InputStream in)
Filename - the filename of the currently open document
in - the stream of the currently open document
public boolean endOfCollection()
endOfCollection in interface Collection
public void reset()
reset in interface Collection
public String getDocid()
public void close()
close in interface Closeable
close in interface AutoCloseable
public List<String> getFileList()
protected void addDirectoryListing()
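
One way to picture directory expansion on a work list is sketched below: paths are taken from the front of a queue, and when a path turns out to be a directory its contents are added back onto the queue. DirectoryWalkSketch is a hypothetical helper, not the Terrier code. Appending entries to the end of the queue, sorting them by name, and skipping directories entirely when recursion is off are all assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper, not part of Terrier.
class DirectoryWalkSketch {
    // Returns the regular files reachable from the given paths.
    // Directories are expanded only when recurse is true.
    static List<String> listFiles(List<String> paths, boolean recurse) {
        LinkedList<String> pending = new LinkedList<>(paths); // work queue
        List<String> found = new ArrayList<>();
        while (!pending.isEmpty()) {
            File current = new File(pending.removeFirst());
            if (current.isDirectory()) {
                if (!recurse) {
                    continue; // assumption: directories are skipped when recursion is off
                }
                String[] names = current.list();
                if (names == null) {
                    continue; // unreadable directory
                }
                Arrays.sort(names); // File.list() gives no ordering guarantee
                for (String name : names) {
                    // Assumption: contents are appended to the end of the queue.
                    pending.addLast(new File(current, name).getPath());
                }
            } else if (current.isFile()) {
                found.add(current.getPath());
            }
        }
        return found;
    }
}
```

Using an explicit queue instead of recursive calls keeps the traversal state in one list, which is the same shape as a FileList that is consumed as documents are processed.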
public static void main(String[] args)
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow