Terrier IR Platform 2.2.1 |
java.lang.Object
  uk.ac.gla.terrier.indexing.SimpleXMLCollection
    uk.ac.gla.terrier.indexing.SimpleMedlineXMLCollection
public class SimpleMedlineXMLCollection
Initial implementation of a class that generates a Collection of Documents from a series of XML files in the Medline format. It processes a limited number of documents from an XML file, to avoid OutOfMemory problems when the XML file is too large.
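The description above can be illustrated with a minimal, self-contained sketch of the general idea. It is not the Terrier implementation. The class name, method name, and tag values are hypothetical, and the sketch assumes each file-level and document-level tag sits on its own line. It reads the input line by line and hands out small XML strings, each wrapping at most `maxDocs` documents in the file-level tags, so that only one chunk needs to be held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Illustrative sketch only; not part of Terrier.
class ChunkedXmlSketch {
    // Hypothetical tag values, assumed for illustration.
    static final String FILE_TAG = "<MedlineCitationSet>";
    static final String FILE_END_TAG = "</MedlineCitationSet>";
    static final String DOC_END_TAG = "</MedlineCitation>";
    static final String EOL = "\n";

    /**
     * Passes self-contained XML chunks to the handler, each holding at most
     * maxDocs documents wrapped in the file-level tags.
     */
    static void forEachChunk(BufferedReader in, int maxDocs, Consumer<String> handler) {
        StringBuilder buf = new StringBuilder();
        int docs = 0;
        boolean inside = false; // true between the file start and end tags
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String t = line.trim();
                if (t.equals(FILE_TAG)) { inside = true; continue; }
                if (t.equals(FILE_END_TAG)) { inside = false; continue; }
                if (!inside) continue; // skip the XML declaration and similar lines
                buf.append(line).append(EOL);
                if (t.endsWith(DOC_END_TAG)) docs++;
                if (docs == maxDocs) {
                    handler.accept(wrap(buf));
                    buf.setLength(0); // release the chunk before reading further
                    docs = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit the last, possibly smaller, chunk; an incomplete document is dropped.
        if (docs > 0) handler.accept(wrap(buf));
    }

    // Re-adds the file-level wrapper so the chunk parses as a standalone XML file.
    private static String wrap(StringBuilder body) {
        return FILE_TAG + EOL + body + FILE_END_TAG + EOL;
    }
}
```

Each chunk could then be given to an ordinary XML parser. Passing chunks to a handler, rather than returning them all in a list, keeps memory use bounded by the chunk size.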
Properties:
Field Summary | |
---|---|
java.lang.String | docEndTag: The end tag of documents in the XML files. |
java.lang.String | docTag: The tag of documents in the XML files. |
java.lang.String | EOL: The end of line string. |
java.lang.String | fileEndTag: The tag indicating the end of an XML file. |
java.lang.String | fileTag: The tag indicating the start of an XML file. |
Fields inherited from class uk.ac.gla.terrier.indexing.SimpleXMLCollection |
---|
ELEMENT_ATTR_SEPARATOR, tokenMaximumLength |
Constructor Summary | |
---|---|
SimpleMedlineXMLCollection() | The default constructor. |
SimpleMedlineXMLCollection(java.lang.String CollectionSpecFilename, java.lang.String BlacklistSpecFilename) | An alternative constructor. |
Method Summary |
---|
Methods inherited from class uk.ac.gla.terrier.indexing.SimpleXMLCollection |
---|
close, endOfCollection, getDocid, getDocument, main, nextDocument, reset |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public final java.lang.String docTag
public final java.lang.String docEndTag
public final java.lang.String fileTag
public final java.lang.String fileEndTag
public final java.lang.String EOL
Constructor Detail |
---|
public SimpleMedlineXMLCollection()
public SimpleMedlineXMLCollection(java.lang.String CollectionSpecFilename, java.lang.String BlacklistSpecFilename)
CollectionSpecFilename - The name of the file containing the location of XML files in the collection.
BlacklistSpecFilename - The name of the file containing the location of the blacklisted XML files in the collection.
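To illustrate how the two constructor arguments could relate, here is a hedged, self-contained sketch. It is not Terrier code. It assumes both specification files list one file location per line, which this page does not state, and the class and method names are hypothetical. It keeps every location from the collection specification that does not also appear in the blacklist specification.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not part of Terrier.
class CollectionSpecSketch {
    /** Returns the non-blank collection entries that are not blacklisted, in order. */
    static List<String> filter(List<String> collectionLines, List<String> blacklistLines) {
        Set<String> blacklisted = new HashSet<>();
        for (String line : blacklistLines) {
            String t = line.trim();
            if (!t.isEmpty()) blacklisted.add(t);
        }
        List<String> result = new ArrayList<>();
        for (String line : collectionLines) {
            String t = line.trim();
            if (!t.isEmpty() && !blacklisted.contains(t)) result.add(t);
        }
        return result;
    }

    /** Convenience wrapper reading both specification files (assumed: one location per line). */
    static List<String> filesToIndex(String collectionSpecFilename, String blacklistSpecFilename) {
        try {
            return filter(Files.readAllLines(Paths.get(collectionSpecFilename)),
                          Files.readAllLines(Paths.get(blacklistSpecFilename)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Locations are compared as plain strings after trimming, so the same file written with two different paths would not be matched in this sketch.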