org.terrier.indexing
Class MSPowerpointDocument
java.lang.Object
org.terrier.indexing.FileDocument
org.terrier.indexing.MSPowerpointDocument
- All Implemented Interfaces:
- Document
public class MSPowerpointDocument
- extends FileDocument
Implements a Document object for reading Microsoft PowerPoint files.
This implementation uses the Jakarta-POI (POIFS) library, so to compile
or use this module, you must have the poi-?.?./-final-*.jar in your
classpath.
- Author:
- Craig Macdonald
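Because the class needs the POI jar on the classpath, it can help to check for the library before indexing starts. The following is only a minimal sketch: `ClasspathCheck` and `isClassAvailable` are hypothetical names, and probing for `org.apache.poi.poifs.filesystem.POIFSFileSystem` is an assumption about which POI class is a suitable marker, not something this page states.

```java
/** Hypothetical helper, not part of Terrier. */
class ClasspathCheck {

    /** Returns true if the named class can be located, without initialising it. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class missing, or present but unloadable (for example a broken jar).
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed marker class for the POI (POIFS) library.
        String poiClass = "org.apache.poi.poifs.filesystem.POIFSFileSystem";
        if (isClassAvailable(poiClass)) {
            System.out.println("POI found on the classpath.");
        } else {
            System.out.println("POI not found; add the POI jar to the classpath.");
        }
    }
}
```

A check like this only reports whether the library can be loaded; it does not verify the POI version.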
Field Summary
protected static org.apache.log4j.Logger
logger
Constructor Summary
MSPowerpointDocument(java.io.InputStream docStream,
java.util.Map<java.lang.String,java.lang.String> docProperties,
Tokeniser tok)
Constructs a new MSPowerpointDocument object for the passed InputStream
MSPowerpointDocument(java.io.Reader docReader,
java.util.Map<java.lang.String,java.lang.String> docProperties,
Tokeniser tok)
Constructs a new MSPowerpointDocument object for the passed Reader
MSPowerpointDocument(java.lang.String filename,
java.io.InputStream docStream,
Tokeniser tokeniser)
Constructs a new MSPowerpointDocument object for the passed InputStream
MSPowerpointDocument(java.lang.String filename,
java.io.Reader docReader,
Tokeniser tok)
Constructs a new MSPowerpointDocument object for the passed Reader
Method Summary
protected java.io.Reader
getReader(java.io.InputStream docStream)
Returns the Reader for the docStream file stream.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
logger
protected static final org.apache.log4j.Logger logger
MSPowerpointDocument
public MSPowerpointDocument(java.lang.String filename,
java.io.InputStream docStream,
Tokeniser tokeniser)
- Constructs a new MSPowerpointDocument object for the passed InputStream
- Parameters:
filename
- the file that has been opened
docStream
- the stream of the file
MSPowerpointDocument
public MSPowerpointDocument(java.io.InputStream docStream,
java.util.Map<java.lang.String,java.lang.String> docProperties,
Tokeniser tok)
- Constructs a new MSPowerpointDocument object for the passed InputStream
- Parameters:
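docStream
- the stream of the file
docProperties
- the document's properties, as a map of string keys to string values
tok
- the Tokeniser for this document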
MSPowerpointDocument
public MSPowerpointDocument(java.io.Reader docReader,
java.util.Map<java.lang.String,java.lang.String> docProperties,
Tokeniser tok)
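- Constructs a new MSPowerpointDocument object for the passed Reader
- Parameters:
docReader
- the Reader for the file
docProperties
- the document's properties, as a map of string keys to string values
tok
- the Tokeniser for this document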
MSPowerpointDocument
public MSPowerpointDocument(java.lang.String filename,
java.io.Reader docReader,
Tokeniser tok)
- Constructs a new MSPowerpointDocument object for the passed Reader
- Parameters:
filename
- the file that has been opened
docReader
- the Reader for the file
tok
- the Tokeniser for this document
getReader
protected java.io.Reader getReader(java.io.InputStream docStream)
- Returns the Reader for the docStream file stream.
This involves loading and converting the PowerPoint
document.
On failure, returns null and sets EOD to true, so no terms can
be read from this object.
- Overrides:
getReader
in class FileDocument
- Parameters:
docStream
- an input stream that we want to
access as a buffered reader.
- Returns:
- the buffered reader that encapsulates the
given input stream, or null on failure.
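The failure contract described for getReader (return null and set EOD to true) can be illustrated with a small stand-alone sketch. Nothing below is Terrier code: `SketchDocument`, `convert`, `endOfDocument` and `firstLine` are hypothetical names, and treating an empty stream as an unconvertible file is an assumption made only so the failure path can be exercised without the POI library.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a document class; not part of Terrier. */
class SketchDocument {
    /** End-of-document flag, mirroring the EOD flag mentioned above. */
    protected boolean EOD = false;
    protected Reader reader;

    SketchDocument(InputStream docStream) {
        this.reader = getReader(docStream);
    }

    /** Converts the stream to a Reader; on failure returns null and sets EOD. */
    protected Reader getReader(InputStream docStream) {
        try {
            return new BufferedReader(new StringReader(convert(docStream)));
        } catch (IOException e) {
            EOD = true; // no terms can be read from this document
            return null;
        }
    }

    /** Placeholder conversion: decodes the bytes as UTF-8, fails on an empty stream. */
    protected String convert(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        if (buf.size() == 0) {
            throw new IOException("nothing to convert");
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    boolean endOfDocument() {
        return EOD;
    }

    /** Returns the first line of text, or null if the document could not be read. */
    String firstLine() {
        if (EOD || reader == null) {
            return null;
        }
        try {
            return new BufferedReader(reader).readLine();
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] text = "slide text".getBytes(StandardCharsets.UTF_8);
        SketchDocument ok = new SketchDocument(new ByteArrayInputStream(text));
        SketchDocument bad = new SketchDocument(new ByteArrayInputStream(new byte[0]));
        System.out.println("ok:  EOD=" + ok.endOfDocument() + ", first line=" + ok.firstLine());
        System.out.println("bad: EOD=" + bad.endOfDocument() + ", first line=" + bad.firstLine());
    }
}
```

The point of the pattern is that callers check the end-of-document flag rather than catching an exception, so a single unreadable file does not interrupt the processing of the remaining documents.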
Terrier 3.5. Copyright © 2004-2011 University of Glasgow