Package | Description |
---|---|
org.terrier.applications |
Provides application-level code that uses the Terrier platform to
perform indexing and retrieval on standard test collections, or
interactive querying of an indexed collection.
|
org.terrier.indexing |
Provides classes and interfaces related to the indexing of documents.
|
org.terrier.structures.indexing |
Provides the classes used for creating the data structures of
the Terrier platform.
|
org.terrier.structures.indexing.classical |
Provides functionality for creating on-disk indices via indexer classes.
|
org.terrier.structures.indexing.singlepass |
Provides implementations of the structures needed for performing
single-pass indexing.
|
Modifier and Type | Field and Description |
---|---|
protected Collection |
TRECIndexing.collectionTREC
The collection to index.
|
Modifier and Type | Method and Description |
---|---|
protected static Collection |
ThreadedBatchIndexing.loadCollection(List<String> files) |
protected Collection |
TRECIndexing.loadCollection(String collectionSpec) |
Constructor and Description |
---|
TRECIndexing(String _path,
String _prefix,
Collection c)
A constructor that initialises the data structures
to use for indexing.
|
TRECIndexingSinglePass(String _path,
String _prefix,
Collection c) |
Modifier and Type | Class and Description |
---|---|
class |
CollectionDocumentList |
class |
MultiDocumentFileCollection |
class |
SimpleFileCollection
Implements a collection that can read arbitrary files on disk.
|
class |
SimpleMedlineXMLCollection
Initial implementation of a class that generates a Collection with Documents from a
series of XML files in the Medline format.
|
class |
SimpleXMLCollection
Initial implementation of a class that generates a Collection with Documents from a
series of XML files.
|
class |
TRECCollection
Models a TREC test collection by implementing the interfaces
Collection and DocumentExtractor.
|
class |
TRECUTFCollection
Deprecated.
|
class |
TRECWebCollection
Version of TRECCollection which can parse
standard form DOCHDR tags in TREC Web corpora.
|
class |
TwitterJSONCollection
This class represents a collection of tweets stored in JSON
format.
|
class |
WARC018Collection
This object is used to parse WARC format web crawls, version 0.18.
|
class |
WARC09Collection
This object is used to parse WARC format web crawls, version 0.9.
|
class |
WARC10Collection
This object is used to parse WARC format web crawls, version 0.10.
|
Modifier and Type | Method and Description |
---|---|
static Collection |
CollectionFactory.loadCollection(String CollectionName)
Load collection(s) of the specified name.
|
static Collection |
CollectionFactory.loadCollection(String CollectionName,
Class<?>[] contructorTypes,
Object[] constructorValues)
Load collection(s) of the specified name.
|
static Collection |
CollectionFactory.loadCollections()
Uses the property trec.collection.class, or its default value TRECCollection.
|
static Collection |
CollectionFactory.loadCollections(String[] collNames)
Load collection(s) of the specified name.
|
static Collection |
CollectionFactory.loadCollections(String[] collNames,
Class<?>[] contructorTypes,
Object[] constructorValues)
Load collection(s) of the specified name.
|
static Collection |
IndexTestUtils.makeCollection(String[] docnos,
String[] documents) |
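
The `loadCollection(String, Class<?>[], Object[])` signature above suggests that a collection is resolved from a class name plus constructor argument types and values. The following is a minimal sketch of that general reflective-construction technique, not Terrier's implementation. `SketchCollection`, `ArrayBackedCollection` and `SketchCollectionFactory` are hypothetical stand-ins introduced only for illustration, so the example runs without the Terrier jar.

```java
/** Hypothetical stand-in for a document collection; NOT org.terrier.indexing.Collection. */
interface SketchCollection {
    boolean nextDocument();   // advance to the next document; false when exhausted
    String getDocument();     // text of the current document
}

/** Hypothetical collection backed by an in-memory array of document strings. */
class ArrayBackedCollection implements SketchCollection {
    private final String[] docs;
    private int pos = -1;

    public ArrayBackedCollection(String[] docs) {
        this.docs = docs;
    }

    public boolean nextDocument() {
        return ++pos < docs.length;
    }

    public String getDocument() {
        return docs[pos];
    }
}

/** Hypothetical factory: resolves a class name to an instance via reflection. */
class SketchCollectionFactory {
    static SketchCollection loadCollection(String className,
            Class<?>[] constructorTypes, Object[] constructorValues) {
        try {
            // Look up the class, pick the constructor matching the given
            // parameter types, and invoke it with the given values.
            Class<?> cls = Class.forName(className);
            Object instance = cls.getConstructor(constructorTypes)
                                 .newInstance(constructorValues);
            return (SketchCollection) instance;
        } catch (ReflectiveOperationException e) {
            // Wrap checked reflection failures in an unchecked exception.
            throw new IllegalArgumentException("Cannot load collection " + className, e);
        }
    }

    public static void main(String[] args) {
        SketchCollection c = loadCollection(
            ArrayBackedCollection.class.getName(),
            new Class<?>[] { String[].class },
            new Object[] { new String[] { "first document", "second document" } });
        while (c.nextDocument()) {
            System.out.println(c.getDocument());
        }
    }
}
```

Passing the types and values as parallel arrays lets a caller select any public constructor of the named class without the factory knowing that class at compile time.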
Modifier and Type | Method and Description |
---|---|
abstract void |
Indexer.createDirectIndex(Collection[] collections)
An abstract method for creating the direct index, the document index
and the lexicon for the given collections.
|
void |
Indexer.index(Collection[] collections)
Creates the data structures for a set of collections.
|
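
To make the terms used above concrete, here is a minimal in-memory sketch of what a direct index, a document index and a lexicon can look like, followed by an inversion step. It is an illustration under stated assumptions, not Terrier code: the class `TwoStepIndexSketch`, its field names, the whitespace tokenisation, and the use of plain string arrays in place of `Collection` objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical two-step indexer over in-memory documents; not Terrier code. */
class TwoStepIndexSketch {
    /** Direct index: one term-frequency map per document, in docid order. */
    final List<Map<String, Integer>> directIndex = new ArrayList<>();
    /** Document index: length in tokens of each document, by docid. */
    final List<Integer> documentLengths = new ArrayList<>();
    /** Lexicon: collection-wide frequency of each term. */
    final Map<String, Integer> lexicon = new TreeMap<>();

    /** Step 1: read every document of every collection exactly once. */
    void createDirectIndex(String[][] collections) {
        for (String[] collection : collections) {
            for (String document : collection) {
                Map<String, Integer> termFrequencies = new TreeMap<>();
                int length = 0;
                for (String token : document.toLowerCase().split("\\s+")) {
                    if (token.isEmpty()) {
                        continue; // leading whitespace yields an empty token
                    }
                    termFrequencies.merge(token, 1, Integer::sum);
                    lexicon.merge(token, 1, Integer::sum);
                    length++;
                }
                directIndex.add(termFrequencies);
                documentLengths.add(length);
            }
        }
    }

    /** Step 2: invert the direct index into term -> (docid -> frequency). */
    Map<String, Map<Integer, Integer>> createInvertedIndex() {
        Map<String, Map<Integer, Integer>> inverted = new TreeMap<>();
        for (int docid = 0; docid < directIndex.size(); docid++) {
            for (Map.Entry<String, Integer> e : directIndex.get(docid).entrySet()) {
                inverted.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                        .put(docid, e.getValue());
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        TwoStepIndexSketch sketch = new TwoStepIndexSketch();
        sketch.createDirectIndex(new String[][] { { "a b a" }, { "b c" } });
        System.out.println(sketch.lexicon);
        System.out.println(sketch.createInvertedIndex());
    }
}
```

Docids here are assigned consecutively across all collections, so several collections are indexed as if they were one.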
Modifier and Type | Method and Description |
---|---|
void |
BlockIndexer.createDirectIndex(Collection[] collections)
For the given collections, iterates through the documents and
creates the direct index, document index and lexicon, using
information about blocks and possibly fields.
|
void |
BasicIndexer.createDirectIndex(Collection[] collections)
Creates the direct index, the document index and the lexicon.
|
Modifier and Type | Method and Description |
---|---|
void |
BasicSinglePassIndexer.createDirectIndex(Collection[] collections) |
void |
ExtensibleSinglePassIndexer.createInvertedIndex(Collection[] collections)
Builds the inverted file and lexicon file for the given collections.
Loops through each document in each of the collections,
extracting terms and pushing these through the Term Pipeline
(e.g.
|
void |
BasicSinglePassIndexer.createInvertedIndex(Collection[] collections)
Builds the inverted file and lexicon file for the given collections.
Loops through each document in each of the collections,
extracting terms and pushing these through the Term Pipeline
(e.g. stemming, stopping, lowercase).
|
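
The loop described above (each document of each collection, each term pushed through a term pipeline, postings built as the terms arrive) can be sketched in memory as follows. This is a simplified illustration, not Terrier's single-pass indexer: the class `SinglePassSketch`, the three pipeline stages, the tiny stopword list and the crude plural-stripping "stemmer" are all hypothetical, and the sketch omits the on-disk structures a real indexer would write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical single-pass inverted index builder; not Terrier code. */
class SinglePassSketch {
    /** Illustrative stopword list only. */
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    /** Each stage maps a term to a transformed term, or to null to drop it. */
    static final List<Function<String, String>> PIPELINE = new ArrayList<>();
    static {
        PIPELINE.add(String::toLowerCase);                       // lowercasing
        PIPELINE.add(t -> STOPWORDS.contains(t) ? null : t);     // stopping
        PIPELINE.add(t -> t.length() > 3 && t.endsWith("s")      // crude stand-in
                ? t.substring(0, t.length() - 1) : t);           // for a stemmer
    }

    /** Pushes one term through every stage; returns null if a stage drops it. */
    static String process(String term) {
        for (Function<String, String> stage : PIPELINE) {
            term = stage.apply(term);
            if (term == null) {
                return null;
            }
        }
        return term;
    }

    /** Builds term -> (docid -> frequency) while reading each document once. */
    static Map<String, Map<Integer, Integer>> createInvertedIndex(String[][] collections) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        int docid = 0;
        for (String[] collection : collections) {
            for (String document : collection) {
                for (String raw : document.split("\\W+")) {
                    if (raw.isEmpty()) {
                        continue;
                    }
                    String term = process(raw);
                    if (term == null) {
                        continue; // removed by the pipeline
                    }
                    postings.computeIfAbsent(term, k -> new TreeMap<>())
                            .merge(docid, 1, Integer::sum);
                }
                docid++;
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        String[][] collections = { { "The cats of Rome", "A cat" }, { "Rome" } };
        System.out.println(createInvertedIndex(collections));
    }
}
```

Unlike a two-step build, no per-document direct index is kept here: postings are appended straight into the inverted structure as each term leaves the pipeline.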
Terrier Information Retrieval Platform 5.1. Copyright © 2004-2019, University of Glasgow