java.lang.Object
  org.terrier.indexing.Indexer
    org.terrier.indexing.BasicIndexer
      org.terrier.indexing.BasicSinglePassIndexer
        org.terrier.indexing.ExtensibleSinglePassIndexer
public abstract class ExtensibleSinglePassIndexer
Directly based on BasicSinglePassIndexer, with just a few modifications to enable some extra hooks.
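As a rough illustration of what "extra hooks" means here, the sketch below shows the template-method pattern with hook names borrowed from this class. `SketchIndexer`, `LoggingIndexer`, the `String`-based `preProcess` signature, and the order in which the hooks are called are all assumptions made for the example; they are not taken from the Terrier source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an indexer base class; this is NOT the Terrier API.
abstract class SketchIndexer {
    protected final List<String> events = new ArrayList<>();

    // Hooks that a concrete subclass must supply (names mirror this page).
    protected abstract void createMemoryPostings();
    protected abstract void createDocumentPostings();
    protected abstract void preProcess(String docId, String term);

    // Fixed driver: the base class decides when each hook runs (assumed order).
    public final List<String> index(String docId, List<String> terms) {
        createMemoryPostings();
        createDocumentPostings();
        for (String term : terms) {
            preProcess(docId, term);          // hook runs before the term is processed
            events.add("pipeline:" + term);   // stand-in for the term pipeline
        }
        return events;
    }
}

// Hypothetical subclass that only records which hooks were called.
class LoggingIndexer extends SketchIndexer {
    @Override protected void createMemoryPostings() { events.add("createMemoryPostings"); }
    @Override protected void createDocumentPostings() { events.add("createDocumentPostings"); }
    @Override protected void preProcess(String docId, String term) {
        events.add("preProcess:" + docId + ":" + term);
    }
}

class HookSketch {
    public static void main(String[] args) {
        System.out.println(new LoggingIndexer().index("d1", Arrays.asList("alpha", "beta")));
    }
}
```

The point of the pattern is that subclasses change what is created or done at each hook without touching the indexing loop itself.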
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.terrier.indexing.BasicIndexer |
---|
BasicIndexer.BasicTermProcessor, BasicIndexer.FieldTermProcessor |
Field Summary | |
---|---|
protected SinglePassIndexerFlushDelegate | flushDelegate - Delegate for HadoopIndexerMapper to intercept flushes |
Fields inherited from class org.terrier.indexing.BasicIndexer |
---|
numOfTokensInDocument, termFields, termsInDocument |
Constructor Summary | |
---|---|
ExtensibleSinglePassIndexer(java.lang.String pathname, java.lang.String prefix) - Default constructor |
Method Summary | |
---|---|
protected abstract void | createDocumentPostings() - Hook method that creates the right type of DocumentTree class. |
void | createInvertedIndex(Collection[] collections) - Builds the inverted file and lexicon file for the given collections. Loops through each document in each of the collections, extracting terms and pushing these through the Term Pipeline. |
protected abstract void | createMemoryPostings() - Hook method that creates the right type of MemoryPostings class. |
protected void | createRunMerger(java.lang.String[][] files) - Hook method that creates a RunsMerger instance. |
protected void | forceFlush() - Force the indexer to flush everything and free memory. |
Index | getCurrentIndex() - Get the index currently being constructed by this indexer. |
protected abstract TermPipeline | getEndOfPipeline() - Returns the end of the term pipeline, which corresponds to an instance of either BasicIndexer.BasicTermProcessor or BasicIndexer.FieldTermProcessor, depending on whether field information is stored. |
protected SinglePassIndexerFlushDelegate | getFlushDelegate() - Get the flushDelegate. |
protected abstract java.lang.Class<? extends PostingInRun> | getPostingInRunClass() - Get the class for storing postings in runs. |
protected abstract void | preProcess(Document doc, java.lang.String term) - Perform an operation before the term pipeline is initiated. |
protected void | setFlushDelegate(SinglePassIndexerFlushDelegate _flushDelegate) - Set the flushDelegate. |
Methods inherited from class org.terrier.indexing.BasicSinglePassIndexer |
---|
checkFlush, createDirectIndex, createFieldRunMerger, createInvertedIndex, finishMemoryPosting, getFileNames, indexDocument, load_indexer_properties, performMultiWayMerge |
Methods inherited from class org.terrier.indexing.BasicIndexer |
---|
finishedInvertedIndexBuild |
Methods inherited from class org.terrier.indexing.Indexer |
---|
createMetaIndexBuilder, finishedDirectIndexBuild, index, indexEmpty, init, load_builder_boundary_documents, load_field_ids, load_pipeline, main, merge, merge, mergeTwoIndices, parseInts, useFieldInformation |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected SinglePassIndexerFlushDelegate flushDelegate
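The idea of a delegate that intercepts flushes can be sketched as follows. This is only an illustration under stated assumptions: `SketchFlushDelegate`, its single `flush` method, and `SketchFlushingIndexer` are hypothetical names, and the methods of the real SinglePassIndexerFlushDelegate are not documented on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical delegate interface; the real interface's methods are not shown on this page.
interface SketchFlushDelegate {
    void flush(List<String> bufferedPostings);
}

// Hypothetical indexer that hands every flush to whichever delegate is installed.
class SketchFlushingIndexer {
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> localRuns = new ArrayList<>();
    private SketchFlushDelegate flushDelegate;

    SketchFlushingIndexer() {
        // Default behaviour (assumed): the indexer keeps flushed runs itself.
        this.flushDelegate = postings -> localRuns.add(new ArrayList<>(postings));
    }

    void setFlushDelegate(SketchFlushDelegate delegate) { this.flushDelegate = delegate; }
    SketchFlushDelegate getFlushDelegate() { return flushDelegate; }

    void add(String posting) { buffer.add(posting); }

    // Pass the buffered postings to the delegate, then free the buffer.
    void forceFlush() {
        flushDelegate.flush(buffer);
        buffer.clear();
    }

    int localRunCount() { return localRuns.size(); }
}

class FlushDelegateSketch {
    public static void main(String[] args) {
        SketchFlushingIndexer indexer = new SketchFlushingIndexer();
        List<String> intercepted = new ArrayList<>();
        // An outside caller (for example a mapper) takes over the flush.
        indexer.setFlushDelegate(postings -> intercepted.addAll(postings));
        for (String p : Arrays.asList("term1:doc1", "term2:doc1")) {
            indexer.add(p);
        }
        indexer.forceFlush();
        System.out.println(intercepted + " localRuns=" + indexer.localRunCount());
    }
}
```

With a delegate installed, the indexer no longer decides where flushed data goes, which is the kind of interception the field description refers to.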
Constructor Detail |
---|
public ExtensibleSinglePassIndexer(java.lang.String pathname, java.lang.String prefix)
pathname - String the path where the data structures will be created. This is assumed to be absolute.
prefix - String the prefix of the index, usually "data".
Method Detail |
---|
protected abstract TermPipeline getEndOfPipeline()
getEndOfPipeline in class BasicIndexer
protected abstract java.lang.Class<? extends PostingInRun> getPostingInRunClass()
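One common reason for a hook to return a `Class` token rather than an instance is that the caller can then create as many objects of that type as it needs. The sketch below illustrates that idea only; `SketchPostingInRun`, its subclasses, `SketchRunReader`, and the reflective `newPosting` helper are hypothetical, and nothing here is claimed about how Terrier uses the returned class.

```java
// Hypothetical stand-ins for posting types; these are NOT the Terrier classes.
abstract class SketchPostingInRun {
    abstract String describe();
}

class SimpleSketchPosting extends SketchPostingInRun {
    @Override String describe() { return "simple"; }
}

class FieldSketchPosting extends SketchPostingInRun {
    @Override String describe() { return "with-fields"; }
}

// Hypothetical consumer of the class token returned by the hook.
abstract class SketchRunReader {
    // Hook: the subclass names the posting type to use.
    protected abstract Class<? extends SketchPostingInRun> getPostingInRunClass();

    // Create a fresh posting object of the type chosen by the subclass.
    SketchPostingInRun newPosting() {
        try {
            return getPostingInRunClass().getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("posting class needs a no-argument constructor", e);
        }
    }
}

class PostingClassSketch {
    public static void main(String[] args) {
        SketchRunReader reader = new SketchRunReader() {
            @Override
            protected Class<? extends SketchPostingInRun> getPostingInRunClass() {
                return FieldSketchPosting.class;
            }
        };
        System.out.println(reader.newPosting().describe());
    }
}
```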
protected void createRunMerger(java.lang.String[][] files) throws java.lang.Exception
createRunMerger in class BasicSinglePassIndexer
java.io.IOException - if an I/O error occurs.
java.lang.Exception
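To give a feel for what merging runs involves, here is a minimal sketch of a multi-way merge over term-sorted runs using a priority queue. It is an illustration, not the RunsMerger implementation: runs are modelled as in-memory sorted maps from a term to document ids rather than as the files passed to createRunMerger, and `RunMergeSketch`, `Cursor`, and `mergeRuns` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.SortedMap;
import java.util.TreeMap;

class RunMergeSketch {
    // Cursor over one run: its current entry plus the remaining entries.
    private static final class Cursor {
        final int runIndex;
        final Iterator<Map.Entry<String, List<Integer>>> rest;
        Map.Entry<String, List<Integer>> current;

        Cursor(int runIndex, Iterator<Map.Entry<String, List<Integer>>> it) {
            this.runIndex = runIndex;
            this.rest = it;
            this.current = it.next();
        }
    }

    // Merge term-sorted runs into one term-ordered map of concatenated postings.
    static Map<String, List<Integer>> mergeRuns(List<SortedMap<String, List<Integer>>> runs) {
        // Order by term; ties go to the earlier run so postings keep flush order.
        PriorityQueue<Cursor> heap = new PriorityQueue<>((a, b) -> {
            int byTerm = a.current.getKey().compareTo(b.current.getKey());
            return byTerm != 0 ? byTerm : Integer.compare(a.runIndex, b.runIndex);
        });
        for (int i = 0; i < runs.size(); i++) {
            Iterator<Map.Entry<String, List<Integer>>> it = runs.get(i).entrySet().iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(i, it));
            }
        }
        Map<String, List<Integer>> merged = new LinkedHashMap<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            merged.computeIfAbsent(c.current.getKey(), k -> new ArrayList<>())
                  .addAll(c.current.getValue());
            if (c.rest.hasNext()) {
                c.current = c.rest.next();   // advance this run and put it back
                heap.add(c);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SortedMap<String, List<Integer>> first = new TreeMap<>();
        first.put("apple", Arrays.asList(1));
        first.put("cat", Arrays.asList(1, 2));
        SortedMap<String, List<Integer>> second = new TreeMap<>();
        second.put("apple", Arrays.asList(3));
        second.put("bat", Arrays.asList(4));
        System.out.println(mergeRuns(Arrays.asList(first, second)));
    }
}
```

Because each run is already sorted by term, the merge only ever holds one entry per run in the heap, which is what makes flushing partial runs and merging them later memory-friendly.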
protected abstract void createMemoryPostings()
createMemoryPostings in class BasicSinglePassIndexer
protected abstract void createDocumentPostings()
createDocumentPostings in class BasicIndexer
public void createInvertedIndex(Collection[] collections)
createInvertedIndex in class BasicSinglePassIndexer
collections - Collection[] the collections to be indexed.
protected abstract void preProcess(Document doc, java.lang.String term)
doc - Current document
term - Current term
public Index getCurrentIndex()
protected void setFlushDelegate(SinglePassIndexerFlushDelegate _flushDelegate)
_flushDelegate - the flushDelegate to set
protected SinglePassIndexerFlushDelegate getFlushDelegate()
protected void forceFlush() throws java.io.IOException
forceFlush in class BasicSinglePassIndexer
java.io.IOException
BasicSinglePassIndexer.forceFlush()