Packages that use Collection | |
---|---|
org.terrier.indexing | Provides classes and interfaces related to the indexing of documents. |
org.terrier.structures.indexing.singlepass.hadoop | Provides classes implementing the Hadoop MapReduce indexing in Terrier. |
Uses of Collection in org.terrier.indexing |
---|
Classes in org.terrier.indexing that implement Collection | |
---|---|
class |
SimpleFileCollection
Implements a collection that can read arbitrary files on disk. |
class |
SimpleMedlineXMLCollection
Initial implementation of a class that generates a Collection with Documents from a series of XML files in the Medline format. |
class |
SimpleXMLCollection
Initial implementation of a class that generates a Collection with Documents from a series of XML files. |
class |
TRECCollection
Models a TREC test collection by implementing the interfaces Collection and DocumentExtractor. |
class |
TRECUTFCollection
Deprecated. |
class |
TRECWebCollection
Version of TRECCollection which can parse standard-form DOCHDR tags in TREC Web corpora. |
class |
WARC018Collection
This object is used to parse WARC format web crawls, version 0.18. |
class |
WARC09Collection
This object is used to parse WARC format web crawls, version 0.9. |
Methods in org.terrier.indexing that return Collection | |
---|---|
static Collection |
CollectionFactory.loadCollection(java.lang.String CollectionName)
Load collection(s) of the specified name. |
static Collection |
CollectionFactory.loadCollection(java.lang.String CollectionName,
java.lang.Class<?>[] contructorTypes,
java.lang.Object[] constructorValues)
Load collection(s) of the specified name. |
static Collection |
CollectionFactory.loadCollections()
Loads the collection(s) named by the property trec.collection.class, or by its default value, TRECCollection. |
static Collection |
CollectionFactory.loadCollections(java.lang.String[] collNames)
Load collection(s) of the specified name. |
static Collection |
CollectionFactory.loadCollections(java.lang.String[] collNames,
java.lang.Class<?>[] contructorTypes,
java.lang.Object[] constructorValues)
Load collection(s) of the specified name. |
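
The overloads above that take `contructorTypes` and `constructorValues` suggest that a collection class is looked up by name and built through a matching constructor. The following is a minimal sketch of that general pattern using plain Java reflection. It is not Terrier's implementation: `DemoCollection`, `DemoFileCollection` and `ReflectiveLoaderSketch` are hypothetical stand-ins, and throwing `IllegalArgumentException` on failure is an assumption made for the example.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for a collection type; NOT Terrier's Collection interface. */
interface DemoCollection {
    String describe();
}

/** Hypothetical implementation with a single-argument public constructor. */
class DemoFileCollection implements DemoCollection {
    private final String path;

    public DemoFileCollection(String path) {
        this.path = path;
    }

    @Override
    public String describe() {
        return "DemoFileCollection(" + path + ")";
    }
}

/** Sketch of name-based loading; the error handling is an assumption of this example. */
final class ReflectiveLoaderSketch {
    private ReflectiveLoaderSketch() {}

    static DemoCollection load(String className,
                               Class<?>[] constructorTypes,
                               Object[] constructorValues) {
        try {
            // Resolve the class by name and check that it is a DemoCollection.
            Class<? extends DemoCollection> clazz =
                    Class.forName(className).asSubclass(DemoCollection.class);
            // Pick the public constructor whose parameter types match exactly.
            Constructor<? extends DemoCollection> ctor =
                    clazz.getConstructor(constructorTypes);
            return ctor.newInstance(constructorValues);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }
}
```

A caller would pass the class name together with parallel arrays of parameter types and values, for example `ReflectiveLoaderSketch.load(DemoFileCollection.class.getName(), new Class<?>[] { String.class }, new Object[] { "docs/a.txt" })`.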
Methods in org.terrier.indexing with parameters of type Collection | |
---|---|
abstract void |
Indexer.createDirectIndex(Collection[] collections)
An abstract method for creating the direct index, the document index and the lexicon for the given collections. |
void |
BlockIndexer.createDirectIndex(Collection[] collections)
Iterates through the documents of the given collections and creates the direct index, document index and lexicon, using information about blocks and possibly fields. |
void |
BasicSinglePassIndexer.createDirectIndex(Collection[] collections)
|
void |
BasicIndexer.createDirectIndex(Collection[] collections)
Creates the direct index, the document index and the lexicon. |
void |
ExtensibleSinglePassIndexer.createInvertedIndex(Collection[] collections)
Builds the inverted file and lexicon file for the given collections. Loops through each document in each of the collections, extracting terms and pushing these through the Term Pipeline. |
void |
BasicSinglePassIndexer.createInvertedIndex(Collection[] collections)
Builds the inverted file and lexicon file for the given collections. Loops through each document in each of the collections, extracting terms and pushing these through the Term Pipeline (e.g. stemming, stopping, lowercasing). |
void |
Indexer.index(Collection[] collections)
Creates the data structures for a set of collections. |
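
The indexing methods above are described as looping over every document in every collection and pushing each term through a term pipeline before recording it. The toy sketch below illustrates that loop in memory. It is not Terrier's indexer: the nested-list representation of collections, the stopword set, and the trailing-"s" stripper standing in for a stemmer are all assumptions made for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy in-memory inverted index; all names here are hypothetical. */
final class InvertedIndexSketch {
    // Assumed stopword list, for illustration only.
    private static final Set<String> STOPWORDS = Set.of("the", "a", "of", "and");

    private InvertedIndexSketch() {}

    /** Toy pipeline: lowercase, drop stopwords, strip a trailing "s". Returns null for dropped terms. */
    static String pipeline(String rawTerm) {
        String term = rawTerm.toLowerCase(Locale.ROOT);
        if (STOPWORDS.contains(term)) {
            return null;
        }
        if (term.length() > 3 && term.endsWith("s")) {
            term = term.substring(0, term.length() - 1);
        }
        return term;
    }

    /**
     * Each collection is a list of documents; each document is a list of raw terms.
     * Returns term -> (document id -> term frequency), with document ids assigned
     * in iteration order across all collections.
     */
    static Map<String, Map<Integer, Integer>> index(List<List<List<String>>> collections) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        int docId = 0;
        for (List<List<String>> collection : collections) {
            for (List<String> document : collection) {
                for (String raw : document) {
                    String term = pipeline(raw);
                    if (term != null) {
                        postings.computeIfAbsent(term, t -> new TreeMap<>())
                                .merge(docId, 1, Integer::sum);
                    }
                }
                docId++;
            }
        }
        return postings;
    }
}
```

The keys of the returned map play the role of a lexicon, and each inner map plays the role of a posting list. A real indexer would write these structures to disk rather than hold them in memory.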
Uses of Collection in org.terrier.structures.indexing.singlepass.hadoop |
---|
Fields in org.terrier.structures.indexing.singlepass.hadoop declared as Collection | |
---|---|
protected Collection |
CollectionRecordReader.documentCollection
The document collection currently being iterated through. |
Methods in org.terrier.structures.indexing.singlepass.hadoop that return Collection | |
---|---|
protected Collection |
FileCollectionRecordReader.openCollectionSplit(int index)
Opens a collection on the next file. |
protected abstract Collection |
CollectionRecordReader.openCollectionSplit(int index)
Opens a collection for the index'th part of the current split. |
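
The pairing of an abstract `openCollectionSplit(int index)` with a file-based override resembles a template-method design: a base reader tracks which part of the split comes next, and subclasses decide how a part is opened. The sketch below shows that shape only. It does not use Hadoop or Terrier types: `SplitReaderSketch`, `FileListReaderSketch`, `openPart` and `advance` are hypothetical names, and "opening" a part is reduced to building a string.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base reader: walks the parts of a split, delegating the opening step. */
abstract class SplitReaderSketch<C> {
    private final int partCount;
    private int nextPart = 0;
    /** The part currently being read, or null once the split is exhausted. */
    private C currentPart;

    SplitReaderSketch(int partCount) {
        this.partCount = partCount;
    }

    /** Subclasses open the index'th part of the split. */
    protected abstract C openPart(int index);

    /** Moves to the next part; returns false when no parts remain. */
    boolean advance() {
        if (nextPart >= partCount) {
            currentPart = null;
            return false;
        }
        currentPart = openPart(nextPart++);
        return true;
    }

    C currentPart() {
        return currentPart;
    }
}

/** Hypothetical file-based reader: each part of the split is one file path. */
final class FileListReaderSketch extends SplitReaderSketch<String> {
    private final List<String> paths;

    FileListReaderSketch(List<String> paths) {
        super(paths.size());
        this.paths = new ArrayList<>(paths);
    }

    @Override
    protected String openPart(int index) {
        // A real reader would open a stream here; the sketch only labels the path.
        return "opened:" + paths.get(index);
    }
}
```

Calling `advance()` repeatedly steps through the parts in order and returns `false` after the last one.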