java.lang.Object
  org.terrier.indexing.Indexer
    org.terrier.indexing.BasicIndexer
      org.terrier.indexing.BasicSinglePassIndexer
        org.terrier.indexing.hadoop.Hadoop_BasicSinglePassIndexer
          org.terrier.indexing.hadoop.Hadoop_BlockSinglePassIndexer
public class Hadoop_BlockSinglePassIndexer extends Hadoop_BasicSinglePassIndexer
A MapReduce single-pass indexer that records term positions (blocks).
All normal block properties are supported. For more information, see BlockIndexer.
Nested Class Summary | |
---|---|
protected class |
Hadoop_BlockSinglePassIndexer.BasicTermProcessor
This class implements an end of a TermPipeline that adds the term to the DocumentTree. |
protected class |
Hadoop_BlockSinglePassIndexer.DelimFieldTermProcessor
This class behaves similarly to FieldTermProcessor, except that it uses blocks bounded by delimiters instead of fixed-size blocks. |
protected class |
Hadoop_BlockSinglePassIndexer.DelimTermProcessor
This class behaves similarly to BasicTermProcessor, except that it uses blocks bounded by delimiters instead of fixed-size blocks. |
protected class |
Hadoop_BlockSinglePassIndexer.FieldTermProcessor
This class implements an end of a TermPipeline that adds the term to the DocumentTree. |
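The delimiter-bounded blocks described for DelimTermProcessor and DelimFieldTermProcessor can be illustrated with a small stand-alone sketch. It is not Terrier code: the class `DelimBlockSketch`, the method `blockIdsFor`, and the choices that delimiter terms are not recorded themselves and that the last block absorbs any overflow are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of delimiter-bounded block numbering; not part of Terrier. */
class DelimBlockSketch {

    /**
     * Returns the block id assigned to each non-delimiter term, in order.
     * Assumption: a delimiter closes the current block and is not recorded itself.
     * Assumption: once maxBlocks blocks exist, later terms stay in the last block.
     */
    static List<Integer> blockIdsFor(List<String> terms, Set<String> delimiters, int maxBlocks) {
        List<Integer> ids = new ArrayList<>();
        int blockId = 0;
        for (String term : terms) {
            if (delimiters.contains(term)) {
                if (blockId < maxBlocks - 1) {
                    blockId++; // start a new block after the delimiter
                }
                continue;
            }
            ids.add(blockId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("a", "b", ".", "c", ".", "d", "e");
        Set<String> delimiters = new HashSet<>(Arrays.asList("."));
        System.out.println(blockIdsFor(terms, delimiters, 10)); // [0, 0, 1, 2, 2]
    }
}
```

With a fixed block size, block boundaries depend only on token counts; here they depend on where the delimiter terms fall, so blocks can have different lengths.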
Field Summary | |
---|---|
protected int |
BLOCK_SIZE
The maximum number of terms allowed in a block. |
protected int |
blockId
The block number in the current document. |
protected int |
MAX_BLOCKS
The maximum number of blocks allowed in a document. |
protected int |
numOfTokensInBlock
The number of tokens in the current block of the current document. |
Fields inherited from class org.terrier.indexing.hadoop.Hadoop_BasicSinglePassIndexer |
---|
currentReporter, flushList, flushNo, jc, lastReporter, lexstream, MapIndexPrefixes, mapTaskID, mutipleIndices, outputPostingListCollector, reduceId, reduceStarted, RunData, runIteratorF, splitnum, start |
Fields inherited from class org.terrier.indexing.BasicIndexer |
---|
numOfTokensInDocument, termFields, termsInDocument |
Constructor Summary | |
---|---|
Hadoop_BlockSinglePassIndexer()
Constructs an instance of this class. The constructor takes no path argument. |
Method Summary | |
---|---|
protected void |
createDocumentPostings()
Hook method that creates the right type of DocumentTree class. |
void |
createMemoryPostings()
Hook method that creates the right type of MemoryPostings class. |
protected RunsMerger |
createtheRunMerger()
Creates the RunsMerger and the RunIteratorFactory. |
protected TermPipeline |
getEndOfPipeline()
Returns the object that is to be the end of the TermPipeline. |
protected void |
load_indexer_properties()
|
Methods inherited from class org.terrier.indexing.hadoop.Hadoop_BasicSinglePassIndexer |
---|
close, closeMap, closeReduce, configure, configureMap, configureReduce, createMetaIndexBuilder, finish, forceFlush, indexEmpty, load_builder_boundary_documents, loadRunData, main, map, mergeDocumentIndex, reduce, startReduce |
Methods inherited from class org.terrier.indexing.BasicSinglePassIndexer |
---|
checkFlush, createDirectIndex, createFieldRunMerger, createInvertedIndex, createInvertedIndex, createRunMerger, finishMemoryPosting, getFileNames, indexDocument, performMultiWayMerge |
Methods inherited from class org.terrier.indexing.BasicIndexer |
---|
finishedInvertedIndexBuild |
Methods inherited from class org.terrier.indexing.Indexer |
---|
finishedDirectIndexBuild, index, init, load_field_ids, load_pipeline, merge, merge, mergeTwoIndices, parseInts, useFieldInformation |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected int numOfTokensInBlock
protected int blockId
protected int BLOCK_SIZE
protected int MAX_BLOCKS
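How the four fields above could interact when fixed-size blocks are used can be sketched as follows. This is a minimal stand-alone illustration, not the Terrier implementation: the class `FixedBlockSketch` and its methods are hypothetical, only the field names are borrowed from this page, and keeping overflow tokens in the last block once MAX_BLOCKS is reached is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of fixed-size block numbering; not part of Terrier. */
class FixedBlockSketch {
    // Field names mirror this page; values are supplied by the caller.
    final int BLOCK_SIZE;        // maximum number of terms allowed in a block
    final int MAX_BLOCKS;        // maximum number of blocks in a document
    int blockId = 0;             // block number in the current document
    int numOfTokensInBlock = 0;  // tokens seen so far in the current block

    FixedBlockSketch(int blockSize, int maxBlocks) {
        this.BLOCK_SIZE = blockSize;
        this.MAX_BLOCKS = maxBlocks;
    }

    /** Returns the block id for the next token, then advances the counters. */
    int nextToken() {
        int assigned = blockId;
        numOfTokensInBlock++;
        // Assumption: once the last allowed block is reached, it absorbs all later tokens.
        if (numOfTokensInBlock >= BLOCK_SIZE && blockId < MAX_BLOCKS - 1) {
            numOfTokensInBlock = 0;
            blockId++;
        }
        return assigned;
    }

    /** Convenience helper: block ids for a document of tokenCount tokens. */
    static List<Integer> blockIdsFor(int tokenCount, int blockSize, int maxBlocks) {
        FixedBlockSketch counter = new FixedBlockSketch(blockSize, maxBlocks);
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < tokenCount; i++) {
            ids.add(counter.nextToken());
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(blockIdsFor(7, 3, 2)); // [0, 0, 0, 1, 1, 1, 1]
    }
}
```

A block size of 1 would give every token its own position, so the block id doubles as a term position until the block limit is reached.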
Constructor Detail |
---|
public Hadoop_BlockSinglePassIndexer()
Method Detail |
---|
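protected TermPipeline getEndOfPipeline()
Returns the object that is to be the end of the TermPipeline.
Overrides: getEndOfPipeline in class BasicIndexer
public void createMemoryPostings()
Hook method that creates the right type of MemoryPostings class.
Overrides: createMemoryPostings in class BasicSinglePassIndexer
protected void createDocumentPostings()
Description copied from class: BasicIndexer
Hook method that creates the right type of DocumentTree class.
Overrides: createDocumentPostings in class BasicIndexer
protected RunsMerger createtheRunMerger()
Description copied from class: Hadoop_BasicSinglePassIndexer
Creates the RunsMerger and the RunIteratorFactory.
Overrides: createtheRunMerger in class Hadoop_BasicSinglePassIndexer
protected void load_indexer_properties()
Overrides: load_indexer_properties in class BasicSinglePassIndexer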
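The hook methods listed here follow the usual pattern in which a base class drives indexing and a subclass overrides small factory methods to change what is recorded. The sketch below illustrates that pattern only. All of its types (`TermSinkSketch`, `BaseIndexerSketch`, `BlockIndexerSketch`) are hypothetical stand-ins rather than Terrier classes, and tagging each term with a running position is a simplification assumed for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the end of a term pipeline. */
interface TermSinkSketch {
    void processTerm(String term);
}

/** Hypothetical base indexer: drives indexing and exposes one hook method. */
class BaseIndexerSketch {
    final List<String> postings = new ArrayList<>();

    /** Hook: subclasses override this to change what is recorded per term. */
    protected TermSinkSketch getEndOfPipeline() {
        return term -> postings.add(term);
    }

    /** Fixed driver logic that relies on the hook above. */
    final void indexDocument(List<String> terms) {
        TermSinkSketch end = getEndOfPipeline();
        for (String term : terms) {
            end.processTerm(term);
        }
    }
}

/** Hypothetical block-aware variant: records each term with its position. */
class BlockIndexerSketch extends BaseIndexerSketch {
    private int position = 0;

    @Override
    protected TermSinkSketch getEndOfPipeline() {
        return term -> postings.add(term + "@" + position++);
    }

    public static void main(String[] args) {
        BlockIndexerSketch indexer = new BlockIndexerSketch();
        indexer.indexDocument(Arrays.asList("x", "y"));
        System.out.println(indexer.postings); // [x@0, y@1]
    }
}
```

The driver method never changes; only the object returned by the hook differs, which is why a block-aware indexer can reuse the rest of the single-pass machinery it inherits.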