public class BasicIndexer extends Indexer
See Also: Indexer, BlockIndexer
Modifier and Type | Class and Description |
---|---|
protected class | BasicIndexer.BasicTermProcessor: This class implements an end of a TermPipeline that adds the term to the DocumentTree. |
protected class | BasicIndexer.FieldTermProcessor: This class implements an end of a TermPipeline that adds the term to the DocumentTree. |
Modifier and Type | Field and Description |
---|---|
protected CompressionFactory.CompressionConfiguration | compressionDirectConfig: The compression configuration for the direct index. |
protected CompressionFactory.CompressionConfiguration | compressionInvertedConfig: The compression configuration for the inverted index. |
protected int | numOfTokensInDocument: The number of tokens found in the current document so far. |
protected Set<String> | termFields: The fields in which a term appears. |
protected DocumentPostingList | termsInDocument: The structure that holds the terms found in a document. |
Fields inherited from class Indexer: BUILDER_BOUNDARY_DOCUMENTS, currentIndex, directIndexBuilder, docIndexBuilder, emptyDocIndexEntry, fieldNames, fileNameNoExtension, IndexEmptyDocuments, invertedIndexBuilder, lexiconBuilder, logger, MAX_DOCS_PER_BUILDER, MAX_TOKENS_IN_DOCUMENT, metaBuilder, numFields, path, pipeline_first, prefix, useFieldInformation
Modifier | Constructor and Description |
---|---|
protected | BasicIndexer(long a, long b, long c): Protected do-nothing constructor for use by child classes. |
public | BasicIndexer(String path, String prefix): Constructs an instance of a BasicIndexer, using the given path name for storing the data structures. |
Modifier and Type | Method and Description |
---|---|
void | createDirectIndex(Collection[] collections): Creates the direct index, the document index and the lexicon. |
protected void | createDocumentPostings(): Hook method that creates the right type of DocumentTree class. |
void | createInvertedIndex(): Creates the inverted index after having created the direct index, document index and lexicon. |
protected void | finishedInvertedIndexBuild(): Hook method, called when the inverted index is finished, i.e. when the lexicon is finished. |
protected TermPipeline | getEndOfPipeline(): Returns the end of the term pipeline, which corresponds to an instance of either BasicIndexer.BasicTermProcessor or BasicIndexer.FieldTermProcessor, depending on whether field information is stored. |
protected void | indexDocument(Map<String,String> docProperties, DocumentPostingList _termsInDocument): Adds a document to the direct and document indexes, and adds its terms to the lexicon. |
Methods inherited from class Indexer: createMetaIndexBuilder, finishedDirectIndexBuild, index, indexEmpty, init, load_builder_boundary_documents, load_field_ids, load_indexer_properties, load_pipeline, main, merge, merge, mergeTwoIndices, parseInts, useFieldInformation
protected Set<String> termFields
protected DocumentPostingList termsInDocument
protected int numOfTokensInDocument
protected CompressionFactory.CompressionConfiguration compressionDirectConfig
protected CompressionFactory.CompressionConfiguration compressionInvertedConfig
protected BasicIndexer(long a, long b, long c)
public BasicIndexer(String path, String prefix)
Parameters:
path - String the path where the data structures will be created. This is assumed to be absolute.
prefix - String the filename component of the data structures.
protected TermPipeline getEndOfPipeline()
getEndOfPipeline in class Indexer
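The choice that getEndOfPipeline() makes between a plain and a field-aware terminal stage can be illustrated with a minimal, self-contained sketch. TermSink, PlainSink, FieldSink and the "body" field name are hypothetical stand-ins invented for this example; they are not the Terrier classes, and the real processors add terms to the document's posting structure rather than to a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Terrier classes.
public class EndOfPipelineSketch {
    /** Minimal pipeline stage: receives one term at a time. */
    interface TermSink {
        void processTerm(String term);
    }

    /** Records plain terms (loosely analogous to BasicTermProcessor). */
    static class PlainSink implements TermSink {
        final List<String> seen = new ArrayList<>();

        @Override
        public void processTerm(String term) {
            if (term != null) {
                seen.add(term);
            }
        }
    }

    /** Records terms together with a field name (loosely analogous to FieldTermProcessor). */
    static class FieldSink implements TermSink {
        final List<String> seen = new ArrayList<>();
        String currentField = "body"; // assumed field name for the example

        @Override
        public void processTerm(String term) {
            if (term != null) {
                seen.add(currentField + ":" + term);
            }
        }
    }

    /** Chooses the terminal stage depending on whether field information is kept. */
    static TermSink endOfPipeline(boolean useFieldInformation) {
        return useFieldInformation ? new FieldSink() : new PlainSink();
    }

    public static void main(String[] args) {
        TermSink sink = endOfPipeline(true);
        sink.processTerm("terrier");
        System.out.println(((FieldSink) sink).seen);
    }
}
```

Earlier pipeline stages only need to know the TermSink interface, so switching field support on or off does not affect them.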
public void createDirectIndex(Collection[] collections)
createDirectIndex in class Indexer
Parameters:
collections - Collection[] the collections to be indexed.
protected void indexDocument(Map<String,String> docProperties, DocumentPostingList _termsInDocument) throws Exception
Parameters:
docProperties - Map<String,String>
_termsInDocument - DocumentPostingList the terms in the document.
Throws:
Exception
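As a rough illustration of what adding a document to the direct index, the document index and the lexicon involves, here is a toy in-memory sketch. It assumes a plain Map<String, Integer> of term frequencies in place of DocumentPostingList, and all class and field names are hypothetical; Terrier's actual builders write compressed on-disk structures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the structures an indexer updates per document.
// None of these types are Terrier classes.
public class IndexDocumentSketch {
    /** term -> total frequency across all indexed documents (toy lexicon). */
    final Map<String, Integer> lexicon = new HashMap<>();
    /** docid -> term frequencies of that document (toy direct index). */
    final List<Map<String, Integer>> directIndex = new ArrayList<>();
    /** docid -> document length in tokens (toy document index). */
    final List<Integer> documentLengths = new ArrayList<>();
    /** docid -> document properties such as an external identifier. */
    final List<Map<String, String>> metadata = new ArrayList<>();

    /** Adds one document to all structures and returns the docid assigned to it. */
    int indexDocument(Map<String, String> docProperties, Map<String, Integer> termsInDocument) {
        int length = 0;
        for (Map.Entry<String, Integer> e : termsInDocument.entrySet()) {
            lexicon.merge(e.getKey(), e.getValue(), Integer::sum);
            length += e.getValue();
        }
        directIndex.add(new HashMap<>(termsInDocument));
        documentLengths.add(length);
        metadata.add(new HashMap<>(docProperties));
        return directIndex.size() - 1;
    }

    public static void main(String[] args) {
        IndexDocumentSketch idx = new IndexDocumentSketch();
        Map<String, String> props = new HashMap<>();
        props.put("docno", "doc-1");
        Map<String, Integer> terms = new HashMap<>();
        terms.put("index", 2);
        terms.put("term", 1);
        int docid = idx.indexDocument(props, terms);
        System.out.println(docid + " " + idx.documentLengths.get(docid) + " " + idx.lexicon);
    }
}
```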
public void createInvertedIndex()
createInvertedIndex in class Indexer
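The inverted index is built only after the direct index exists, because inversion regroups the per-document postings by term. The sketch below shows that regrouping on small in-memory maps. It is an assumption-laden illustration with hypothetical names, not Terrier's inversion code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: inverts a toy in-memory direct index.
public class InvertSketch {
    /** Turns docid -> (term -> tf) into term -> (docid -> tf), sorted by term and docid. */
    static SortedMap<String, SortedMap<Integer, Integer>> invert(List<Map<String, Integer>> direct) {
        SortedMap<String, SortedMap<Integer, Integer>> inverted = new TreeMap<>();
        for (int docid = 0; docid < direct.size(); docid++) {
            for (Map.Entry<String, Integer> e : direct.get(docid).entrySet()) {
                inverted.computeIfAbsent(e.getKey(), t -> new TreeMap<>())
                        .put(docid, e.getValue());
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> direct = new ArrayList<>();
        Map<String, Integer> doc0 = new HashMap<>();
        doc0.put("index", 2);
        doc0.put("term", 1);
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("term", 3);
        direct.add(doc0);
        direct.add(doc1);
        System.out.println(invert(direct));
    }
}
```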
protected void createDocumentPostings()
protected void finishedInvertedIndexBuild()
finishedInvertedIndexBuild in class Indexer
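Hook methods such as createDocumentPostings() and finishedInvertedIndexBuild() follow the template-method pattern: the base class fixes the order of the steps and subclasses override only the hooks. The sketch below illustrates the pattern with hypothetical BaseBuilder and FieldAwareBuilder classes. The point at which Terrier actually invokes each hook is not shown in this page, so the call order here is an assumption.

```java
// Illustrative template-method sketch; class names and call order are assumptions.
public class HookSketch {
    static class BaseBuilder {
        final StringBuilder log = new StringBuilder();

        /** Fixed build sequence; subclasses customise only the hooks. */
        final void build() {
            createDocumentPostings();
            log.append("invert;");
            finishedInvertedIndexBuild();
        }

        /** Hook: choose the per-document posting structure. */
        protected void createDocumentPostings() {
            log.append("plain-postings;");
        }

        /** Hook: called once the inverted index is complete; does nothing by default. */
        protected void finishedInvertedIndexBuild() {
        }
    }

    static class FieldAwareBuilder extends BaseBuilder {
        @Override
        protected void createDocumentPostings() {
            log.append("field-postings;");
        }

        @Override
        protected void finishedInvertedIndexBuild() {
            log.append("finished;");
        }
    }

    public static void main(String[] args) {
        BaseBuilder builder = new FieldAwareBuilder();
        builder.build();
        System.out.println(builder.log);
    }
}
```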
Terrier 4.0. Copyright © 2004-2014 University of Glasgow