Terrier IR Platform 2.2.1
java.lang.Object
  uk.ac.gla.terrier.indexing.FileDocument
      uk.ac.gla.terrier.indexing.MSWordDocument
public class MSWordDocument extends FileDocument
This class is used for indexing MS Word document files (i.e. files ending in .doc). It does this by using the textmining.org MSWord conversion library (tm-extractors), which in turn uses the Jakarta-POI libraries. To compile or use this class, you need to ensure that poi-?.?.?-final-*.jar and tm-extractors.jar are on your classpath.
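The classpath requirement above can be checked before indexing starts. The following is a minimal sketch, not part of Terrier: the class name WordIndexingClasspathCheck and its methods are hypothetical, and it assumes that in the jar name pattern each ? stands for exactly one character and * for any run of characters. It only inspects the java.class.path system property, so jars loaded by other means would not be seen.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Terrier: reports whether the jars named in
// the MSWordDocument description appear among the classpath entries.
class WordIndexingClasspathCheck {

    // Assumed reading of "poi-?.?.?-final-*.jar":
    // '?' is exactly one character, '*' is any run of characters.
    static final Pattern POI_JAR =
        Pattern.compile("poi-.\\..\\..-final-.*\\.jar");

    static final Pattern TM_EXTRACTORS_JAR =
        Pattern.compile("tm-extractors\\.jar");

    // Returns the last path component of a classpath entry, accepting
    // either '/' or '\' as the directory separator.
    static String fileName(String entry) {
        int cut = Math.max(entry.lastIndexOf('/'), entry.lastIndexOf('\\'));
        return entry.substring(cut + 1);
    }

    // True if any classpath entry's file name matches the given pattern in full.
    static boolean anyEntryMatches(String classpath, String separator, Pattern jarName) {
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (jarName.matcher(fileName(entry)).matches()) {
                return true;
            }
        }
        return false;
    }

    static boolean hasPoiJar(String classpath, String separator) {
        return anyEntryMatches(classpath, separator, POI_JAR);
    }

    static boolean hasTmExtractorsJar(String classpath, String separator) {
        return anyEntryMatches(classpath, separator, TM_EXTRACTORS_JAR);
    }

    public static void main(String[] args) {
        // Inspect the classpath of the running JVM.
        String classpath = System.getProperty("java.class.path", "");
        String separator = File.pathSeparator;
        System.out.println("Jakarta-POI jar found:   " + hasPoiJar(classpath, separator));
        System.out.println("tm-extractors jar found: " + hasTmExtractorsJar(classpath, separator));
    }
}
```

Running the class with the same classpath that will be used for indexing shows whether both libraries are visible by file name; a missing jar would otherwise typically surface only as a class-loading error when a .doc file is first processed.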
Field Summary |
---|
Fields inherited from class uk.ac.gla.terrier.indexing.FileDocument |
---|
counter |
Constructor Summary | |
---|---|
MSWordDocument(java.io.File f,
java.io.InputStream docStream)
Constructs a new MSWordDocument object for the file represented by docStream. |
Method Summary |
---|
Methods inherited from class uk.ac.gla.terrier.indexing.FileDocument |
---|
endOfDocument, getAllProperties, getFields, getNextTerm, getProperty, getReader |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MSWordDocument(java.io.File f, java.io.InputStream docStream)