Terrier IR Platform
1.1.1

uk.ac.gla.terrier.indexing
Class MSExcelDocument

java.lang.Object
  extended by uk.ac.gla.terrier.indexing.FileDocument
      extended by uk.ac.gla.terrier.indexing.MSExcelDocument
All Implemented Interfaces:
Document

public class MSExcelDocument
extends FileDocument

Implements a Document object for a Microsoft Excel spreadsheet. It relies on the HSSF and POIFS components of the Jakarta POI project, so poi-?.?.?-final-*.jar must be on the classpath to compile or use this class.

A bug in the current stable POI library appears to prevent large Excel files from being parsed. The MAXFILESIZE field sets the maximum file size that this class will attempt to read.
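The size limit described above can also be applied by the caller before a stream is opened at all. The following is a minimal sketch of such a pre-check using only java.io. The class name ExcelSizeGuard, its methods, and the 8 MB figure are hypothetical and chosen for illustration; the actual limit is whatever MAXFILESIZE holds, and whether the real comparison is inclusive is an assumption here.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ExcelSizeGuard {
    // Hypothetical limit for illustration only; not the value of MAXFILESIZE.
    static final long ASSUMED_MAX_BYTES = 8L * 1024 * 1024;

    /** Returns true when f is a regular file no larger than maxBytes (inclusive, by assumption). */
    static boolean withinLimit(File f, long maxBytes) {
        return f.isFile() && f.length() <= maxBytes;
    }

    /** Opens the file only if it passes the size check; returns null otherwise. */
    static InputStream openIfSmallEnough(File f, long maxBytes) throws IOException {
        if (!withinLimit(f, maxBytes)) {
            return null; // too large (or not a regular file): skip rather than risk a parse failure
        }
        return new FileInputStream(f);
    }
}
```

A caller could skip any file for which openIfSmallEnough returns null and pass the remaining streams on to the document constructor.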

Version:
$Revision: 1.11 $
Author:
Craig Macdonald

Field Summary
 
Fields inherited from class uk.ac.gla.terrier.indexing.FileDocument
counter
 
Constructor Summary
MSExcelDocument(java.io.File f, java.io.InputStream docStream)
          Constructs a new MSExcelDocument object.
 
Method Summary
 
Methods inherited from class uk.ac.gla.terrier.indexing.FileDocument
endOfDocument, getAllProperties, getFields, getNextTerm, getProperty, getReader
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MSExcelDocument

public MSExcelDocument(java.io.File f,
                       java.io.InputStream docStream)
Constructs a new MSExcelDocument object.

Parameters:
f - the file opened for this document
docStream - the input stream of the already opened file
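Once constructed, the document is read through the inherited getNextTerm and endOfDocument methods. The sketch below shows the usual shape of that loop. So that it compiles without Terrier or POI on the classpath, it uses a hypothetical stand-in interface, TermSource, with just those two methods; the class TermLoopSketch and the in-memory fromTokens helper are also illustrative and not part of Terrier. Treating a null return from getNextTerm as "no term this call" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class TermLoopSketch {
    /** Hypothetical stand-in for the two Document methods the loop needs. */
    interface TermSource {
        String getNextTerm();
        boolean endOfDocument();
    }

    /** Reads terms until the source reports its end, skipping null terms. */
    static List<String> collectTerms(TermSource doc) {
        List<String> terms = new ArrayList<>();
        while (!doc.endOfDocument()) {
            String term = doc.getNextTerm();
            if (term != null) {
                terms.add(term);
            }
        }
        return terms;
    }

    /** Small in-memory source so the sketch can run on its own. */
    static TermSource fromTokens(String... tokens) {
        final Iterator<String> it = Arrays.asList(tokens).iterator();
        return new TermSource() {
            public String getNextTerm() { return it.hasNext() ? it.next() : null; }
            public boolean endOfDocument() { return !it.hasNext(); }
        };
    }
}
```

With Terrier available, the same loop would presumably be written against the Document interface, with the object created as new MSExcelDocument(f, docStream).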

Terrier Information Retrieval Platform 1.1.1. Copyright 2004-2007 University of Glasgow