public abstract class MultiDocumentFileCollection extends Object implements Collection

| Modifier and Type | Field | Description |
|---|---|---|
| `protected String` | `currentFilename` | Filename of current file |
| `protected String` | `desiredEncoding` | Encoding to be used to open all files. |
| `protected Map<String,String>` | `DocProperties` | Properties for the current document |
| `protected Class<? extends Document>` | `documentClass` | Class to use for all documents parsed by this class |
| `protected int` | `documentsInThisFile` | Counts the number of documents that have been found in this file. |
| `protected boolean` | `eoc` | Are we at the end of the collection? |
| `protected boolean` | `eof` | Has the end of the current input file been reached? |
| `protected int` | `FileNumber` | The index in the FilesToProcess of the currently processed file. |
| `protected ArrayList<String>` | `FilesToProcess` | The list of files to process. |
| `protected boolean` | `forceUTF8` | Should UTF8 encoding be assumed? |
| `protected InputStream` | `is` | The input stream of the current input file |
| `protected static org.slf4j.Logger` | `logger` | Logger for this class |
| `protected boolean` | `SkipFile` | A boolean which is true when a new file is open. |
| `protected Tokeniser` | `tokeniser` | Tokeniser to use for all documents parsed by this class |

| Modifier | Constructor and Description |
|---|---|
| `protected` | `MultiDocumentFileCollection()` |

| Modifier and Type | Method | Description |
|---|---|---|
| `void` | `close()` | Closes the collection and any files that may be open. |
| `boolean` | `endOfCollection()` | Returns true if the end of the collection has been reached |
| `protected void` | `extractCharset()` | |
| `abstract Document` | `getDocument()` | Get the document object representing the current document. |
| `boolean` | `hasNext()` | Check whether it is the last document in the collection |
| `protected void` | `loadDocumentClass()` | Loads the class that will supply all documents for this Collection. |
| `Document` | `next()` | Return the next document |
| `abstract boolean` | `nextDocument()` | Move the collection to the start of the next document. |
| `protected void` | `openNewFile()` | |
| `protected boolean` | `openNextFile()` | Opens the next document from the collection specification. |
| `protected void` | `readCollectionSpec(String CollectionSpecFilename)` | Read in the collection.spec |
| `void` | `reset()` | Resets the Collection iterator to the start of the collection. |

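
The method summary suggests a cursor-style traversal: `nextDocument()` advances to the next document, `getDocument()` returns the current one, and `endOfCollection()` reports exhaustion. The following is a minimal, self-contained sketch of that loop. It does not use the Terrier classes: `DocSource` and `ListSource` are hypothetical stand-ins, documents are plain strings, and the assumption that `nextDocument()` returns `false` once nothing is left is inferred from the method descriptions rather than confirmed by them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectionLoopSketch {
    /** Hypothetical stand-in for the three Collection methods used below. */
    interface DocSource {
        boolean nextDocument();
        String getDocument();
        boolean endOfCollection();
    }

    /** Toy in-memory source; a real subclass would read documents from files. */
    static class ListSource implements DocSource {
        private final List<String> docs;
        private int pos = -1; // cursor starts before the first document

        ListSource(List<String> docs) { this.docs = docs; }

        public boolean nextDocument() {
            if (pos + 1 >= docs.size()) {
                pos = docs.size(); // mark the source as exhausted
                return false;
            }
            pos++;
            return true;
        }

        public String getDocument() { return docs.get(pos); }

        public boolean endOfCollection() { return pos >= docs.size(); }
    }

    /** Drains a source using the nextDocument()/getDocument() pattern. */
    static List<String> readAll(DocSource source) {
        List<String> out = new ArrayList<>();
        while (source.nextDocument()) {
            out.add(source.getDocument());
        }
        return out;
    }

    public static void main(String[] args) {
        DocSource source = new ListSource(Arrays.asList("doc A", "doc B"));
        System.out.println(readAll(source));
        System.out.println(source.endOfCollection());
    }
}
```

In this sketch the caller never touches file handling; by analogy, a concrete subclass of this class would presumably only need to supply `nextDocument()` and `getDocument()`, the two abstract methods listed above.
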
protected static final org.slf4j.Logger logger
protected int documentsInThisFile
protected boolean eoc
protected boolean eof
protected boolean SkipFile
protected String currentFilename
protected final boolean forceUTF8
protected InputStream is
protected int FileNumber
protected String desiredEncoding
protected Class<? extends Document> documentClass
protected Tokeniser tokeniser
public abstract Document getDocument()
Specified by: getDocument in interface Collection
protected void loadDocumentClass()
public boolean hasNext()
public Document next()
public void close()
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
public boolean endOfCollection()
Specified by: endOfCollection in interface Collection
protected boolean openNextFile() throws IOException
Throws: IOException - if there is an exception while opening the collection files.
protected void extractCharset()
public abstract boolean nextDocument()
Specified by: nextDocument in interface Collection
protected void readCollectionSpec(String CollectionSpecFilename)
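
As a rough illustration of what reading a collection specification into a list of files to process might involve, here is a self-contained sketch. It rests on assumptions not stated on this page: that the specification lists one file path per line, and that blank lines and lines starting with `#` are skipped. The class `CollectionSpecSketch` and its `parseSpec` helper are hypothetical, and the helper parses text directly rather than taking a filename as `readCollectionSpec` does.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectionSpecSketch {
    /**
     * Parses specification text into a list of file paths.
     * Assumed format (not confirmed by this page): one path per line;
     * blank lines and lines starting with '#' are ignored.
     */
    static List<String> parseSpec(String specText) {
        List<String> files = new ArrayList<>();
        for (String line : specText.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blanks and comment lines
            }
            files.add(trimmed);
        }
        return files;
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        String spec = "# sample spec\n/data/part1.txt\n\n/data/part2.txt\n";
        System.out.println(parseSpec(spec));
    }
}
```

The resulting list plays the role that `FilesToProcess` has in this class, with an index such as `FileNumber` tracking which entry is currently open.
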
public void reset()
Specified by: reset in interface Collection
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow