Terrier IR Platform 2.2.1
java.lang.Object
  uk.ac.gla.terrier.indexing.SimpleXMLCollection
      uk.ac.gla.terrier.indexing.SimpleMedlineXMLCollection
public class SimpleMedlineXMLCollection extends SimpleXMLCollection
An initial implementation of a class that generates a Collection of Documents from a series of XML files in the Medline format. It processes a limited number of documents from an XML file at a time, to avoid OutOfMemory errors when the XML file is too large.
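The batching idea described above can be illustrated with a minimal, self-contained sketch. It is not the Terrier implementation: the class name `BatchedXmlReader`, the method `nextBatch`, and the tag values are hypothetical, and the sketch assumes that each document tag sits on its own line without attributes. The actual values of docTag and docEndTag are not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Terrier.
class BatchedXmlReader {
    // Assumed tag values; the real docTag/docEndTag values are not given in this page.
    static final String DOC_TAG = "<MedlineCitation>";
    static final String DOC_END_TAG = "</MedlineCitation>";
    static final String EOL = System.lineSeparator();

    /**
     * Consumes lines until maxDocs complete documents have been collected
     * or the input is exhausted. A document with no end tag is dropped.
     */
    static List<String> nextBatch(Iterator<String> lines, int maxDocs) {
        List<String> batch = new ArrayList<>();
        StringBuilder current = null; // non-null while inside a document
        while (batch.size() < maxDocs && lines.hasNext()) {
            String line = lines.next();
            if (current == null && line.contains(DOC_TAG)) {
                current = new StringBuilder();
            }
            if (current != null) {
                current.append(line).append(EOL);
                if (line.contains(DOC_END_TAG)) {
                    batch.add(current.toString());
                    current = null;
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        // Three small documents; only two are held in memory at once.
        Iterator<String> lines = Arrays.asList(
            "<MedlineCitationSet>",
            "<MedlineCitation>", "<PMID>1</PMID>", "</MedlineCitation>",
            "<MedlineCitation>", "<PMID>2</PMID>", "</MedlineCitation>",
            "<MedlineCitation>", "<PMID>3</PMID>", "</MedlineCitation>",
            "</MedlineCitationSet>").iterator();
        List<String> batch;
        while (!(batch = nextBatch(lines, 2)).isEmpty()) {
            System.out.println("Batch of " + batch.size() + " document(s)");
        }
    }
}
```

For a real file, the line iterator could come from `java.io.BufferedReader.lines().iterator()`, so that only the current batch, rather than the whole file, is held in memory.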
Properties:
| Field Summary | |
|---|---|
| java.lang.String | docEndTag: The end tag of documents in the XML files. |
| java.lang.String | docTag: The tag of documents in the XML files. |
| java.lang.String | EOL: The end of line string. |
| java.lang.String | fileEndTag: The tag indicating the end of an XML file. |
| java.lang.String | fileTag: The tag indicating the start of an XML file. |
| Fields inherited from class uk.ac.gla.terrier.indexing.SimpleXMLCollection |
|---|
| ELEMENT_ATTR_SEPARATOR, tokenMaximumLength |
| Constructor Summary | |
|---|---|
| SimpleMedlineXMLCollection() | The default constructor. |
| SimpleMedlineXMLCollection(java.lang.String CollectionSpecFilename, java.lang.String BlacklistSpecFilename) | An alternative constructor. |
| Method Summary |
|---|
| Methods inherited from class uk.ac.gla.terrier.indexing.SimpleXMLCollection |
|---|
| close, endOfCollection, getDocid, getDocument, main, nextDocument, reset |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public final java.lang.String docTag
public final java.lang.String docEndTag
public final java.lang.String fileTag
public final java.lang.String fileEndTag
public final java.lang.String EOL
| Constructor Detail |
|---|
public SimpleMedlineXMLCollection()
public SimpleMedlineXMLCollection(java.lang.String CollectionSpecFilename,
java.lang.String BlacklistSpecFilename)
Parameters:
CollectionSpecFilename - The name of the file containing the locations of the XML files in the collection.
BlacklistSpecFilename - The name of the file containing the locations of the blacklisted XML files in the collection.
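How the two specification files might relate can be sketched as follows. This is an illustration under stated assumptions, not Terrier's own code: the class `CollectionSpecFilter` and its methods are hypothetical, each specification is assumed to list one file location per line, and treating blank lines and lines starting with `#` as ignorable is also an assumption. The paths in the example are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Terrier.
class CollectionSpecFilter {
    /** Trims each line and drops blank lines and (assumed) '#' comment lines. */
    static List<String> parseSpec(List<String> lines) {
        List<String> files = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                files.add(trimmed);
            }
        }
        return files;
    }

    /** Returns the collection files that are not blacklisted, in their original order. */
    static List<String> filesToIndex(List<String> collectionSpec, List<String> blacklistSpec) {
        Set<String> blacklisted = new HashSet<>(parseSpec(blacklistSpec));
        List<String> result = new ArrayList<>();
        for (String file : parseSpec(collectionSpec)) {
            if (!blacklisted.contains(file)) {
                result.add(file);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up file locations, standing in for the contents of the two spec files.
        List<String> collection = Arrays.asList(
            "# Medline XML files", "/data/medline/a.xml", "/data/medline/b.xml");
        List<String> blacklist = Arrays.asList("/data/medline/b.xml");
        System.out.println(filesToIndex(collection, blacklist));
    }
}
```

The lines of a real specification file could be read with `java.nio.file.Files.readAllLines` before being passed to `filesToIndex`.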