Class BlockSinglePassIndexer
- java.lang.Object
  - org.terrier.structures.indexing.Indexer
    - org.terrier.structures.indexing.classical.BasicIndexer
      - org.terrier.structures.indexing.singlepass.BasicSinglePassIndexer
        - org.terrier.structures.indexing.singlepass.BlockSinglePassIndexer

public class BlockSinglePassIndexer extends BasicSinglePassIndexer

Indexes a document collection, saving block information for the indexed terms. It performs a single-pass inversion (see BasicSinglePassIndexer). All normal block properties are supported. For more information, see BlockIndexer.

- Author: Roi Blanco, Craig Macdonald, Rodrygo Santos

Nested Class Summary

- protected class BlockSinglePassIndexer.BasicTermProcessor: This class implements an end of a TermPipeline that adds the term to the DocumentTree.
- protected class BlockSinglePassIndexer.DelimFieldTermProcessor: This class behaves in a similar fashion to FieldTermProcessor, except that it treats blocks as bounded by delimiters instead of as fixed-size blocks.
- protected class BlockSinglePassIndexer.DelimTermProcessor: This class behaves in a similar fashion to BasicTermProcessor, except that it treats blocks as bounded by delimiters instead of as fixed-size blocks.
- protected class BlockSinglePassIndexer.FieldTermProcessor: This class implements an end of a TermPipeline that adds the term to the DocumentTree.
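
The difference between the delimiter-based processors and the fixed-size ones can be illustrated with a small standalone sketch. It does not use Terrier classes and is not the library's implementation: the class `DelimiterBlockSketch`, the method `assignBlocks`, the `-1` marker, and the choice to leave delimiter tokens unindexed are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: block ids advance at delimiter tokens rather than
// after a fixed number of terms. Not Terrier code.
final class DelimiterBlockSketch {

    /**
     * Returns the block id for each token, or -1 for delimiter tokens.
     * Assumption: delimiters themselves are not indexed, and block ids
     * are capped at maxBlocks - 1.
     */
    static int[] assignBlocks(String[] tokens, Set<String> delimiters, int maxBlocks) {
        int[] blockIds = new int[tokens.length];
        int blockId = 0;
        for (int i = 0; i < tokens.length; i++) {
            if (delimiters.contains(tokens[i])) {
                blockIds[i] = -1;              // delimiter: closes the current block
                if (blockId < maxBlocks - 1) {
                    blockId++;                 // the next term starts a new block
                }
            } else {
                blockIds[i] = blockId;         // ordinary term: stays in the current block
            }
        }
        return blockIds;
    }

    public static void main(String[] args) {
        String[] tokens = {"a", "b", "<p>", "c", "<p>", "d", "<p>", "e"};
        Set<String> delimiters = new HashSet<>(Arrays.asList("<p>"));
        // prints [0, 0, -1, 1, -1, 2, -1, 2]
        System.out.println(Arrays.toString(assignBlocks(tokens, delimiters, 3)));
    }
}
```

With delimiters, block boundaries follow the structure of the text (for example sentences or paragraphs), so blocks can differ in length.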

Field Summary

- protected int BLOCK_SIZE: The maximum number of terms allowed in a block.
- protected int blockId: The block number in the current document.
- protected int MAX_BLOCKS: The maximum number of blocks allowed in a document.
- protected int numOfTokensInBlock: The number of tokens in the current block of the current document.

Fields inherited from class org.terrier.structures.indexing.singlepass.BasicSinglePassIndexer: basicInvertedIndexPostingIteratorClass, currentFile, currentId, docsPerCheck, fieldInvertedIndexPostingIteratorClass, fileNames, invertedIndexClass, invertedIndexInputStreamClass, maxDocsPerFlush, maxMemory, memoryAfterFlush, memoryCheck, merger, mp, numberOfDocsSinceCheck, numberOfDocsSinceFlush, numberOfDocuments, numberOfPointers, numberOfTokens, numberOfUniqueTerms, runtime

Fields inherited from class org.terrier.structures.indexing.classical.BasicIndexer: compressionDirectConfig, compressionInvertedConfig, numOfTokensInDocument, termCodes, termFields, termsInDocument

Fields inherited from class org.terrier.structures.indexing.Indexer: blocks, BUILDER_BOUNDARY_DOCUMENTS, currentIndex, directIndexBuilder, docIndexBuilder, emptyDocCount, emptyDocIndexEntry, externalParalllism, fieldNames, fileNameNoExtension, IndexEmptyDocuments, invertedIndexBuilder, lexiconBuilder, logger, MAX_DOCS_PER_BUILDER, MAX_TOKENS_IN_DOCUMENT, metaBuilder, numFields, path, pipeline_first, prefix, useFieldInformation

Constructor Summary

- BlockSinglePassIndexer(java.lang.String pathname, java.lang.String prefix): Constructs an instance of this block indexer, which uses the single-pass strategy.

Method Summary

- protected void createDocumentPostings(): Hook method that creates the right type of DocumentTree class.
- protected void createFieldRunMerger(java.lang.String[][] files): Hook method that creates a FieldRunMerger instance.
- protected void createMemoryPostings(): Hook method that creates the right type of MemoryPostings class.
- protected void createRunMerger(java.lang.String[][] files): Hook method that creates a RunsMerger instance.
- protected TermPipeline getEndOfPipeline(): Returns the object that is to be the end of the TermPipeline.
- void performMultiWayMerge(): Uses the merger class to perform a k-way merge over a set of previously written runs.

Methods inherited from class org.terrier.structures.indexing.singlepass.BasicSinglePassIndexer: checkFlush, createDirectIndex, createInvertedIndex, createInvertedIndex, finishMemoryPosting, forceFlush, getFileNames, indexDocument, load_indexer_properties

Methods inherited from class org.terrier.structures.indexing.classical.BasicIndexer: finishedInvertedIndexBuild

Methods inherited from class org.terrier.structures.indexing.Indexer: createMetaIndexBuilder, finishedDirectIndexBuild, getExternalParalllism, index, indexEmpty, init, load_builder_boundary_documents, load_field_ids, load_pipeline, main, merge, merge, mergeTwoIndices, parseInts, setExternalParalllism, useFieldInformation

Field Detail

- numOfTokensInBlock
  protected int numOfTokensInBlock
  The number of tokens in the current block of the current document.
- blockId
  protected int blockId
  The block number in the current document.
- BLOCK_SIZE
  protected int BLOCK_SIZE
  The maximum number of terms allowed in a block.
- MAX_BLOCKS
  protected int MAX_BLOCKS
  The maximum number of blocks allowed in a document. Once this limit is reached, all remaining terms are placed in the final block.
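
How these four fields might interact for fixed-size blocks can be sketched in standalone Java. This is an illustration under stated assumptions, not Terrier's code: the class `FixedBlockSketch` and the method `assignBlocks` are hypothetical, and the sketch assumes block ids start at 0 and that a document uses at most `maxBlocks` distinct ids.

```java
import java.util.Arrays;

// Hypothetical sketch of fixed-size block assignment. Not Terrier code.
final class FixedBlockSketch {

    /**
     * Returns the block id assigned to each of tokenCount consecutive tokens.
     * Assumption: ids run from 0 to maxBlocks - 1, and every token after the
     * cap is reached goes into the final block.
     */
    static int[] assignBlocks(int tokenCount, int blockSize, int maxBlocks) {
        int[] blockIds = new int[tokenCount];
        int blockId = 0;              // plays the role of the blockId field
        int numOfTokensInBlock = 0;   // plays the role of the numOfTokensInBlock field
        for (int i = 0; i < tokenCount; i++) {
            blockIds[i] = blockId;
            numOfTokensInBlock++;
            // Open a new block once the current one is full, unless the
            // document is already in its final permitted block.
            if (numOfTokensInBlock >= blockSize && blockId < maxBlocks - 1) {
                numOfTokensInBlock = 0;
                blockId++;
            }
        }
        return blockIds;
    }

    public static void main(String[] args) {
        // 7 tokens, 2 terms per block, at most 3 blocks
        // prints [0, 0, 1, 1, 2, 2, 2]
        System.out.println(Arrays.toString(assignBlocks(7, 2, 3)));
    }
}
```

In the example the last block holds three tokens even though the block size is 2, because the cap on the number of blocks takes precedence over the block size.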
 
Constructor Detail

- BlockSinglePassIndexer
  public BlockSinglePassIndexer(java.lang.String pathname, java.lang.String prefix)
  Constructs an instance of this block indexer, which uses the single-pass strategy.
  - Parameters:
    - pathname - String location of the index
    - prefix - String prefix of the index files

Method Detail

- getEndOfPipeline
  protected TermPipeline getEndOfPipeline()
  Returns the object that is to be the end of the TermPipeline. This method is used at construction time of the parent object.
  - Overrides: getEndOfPipeline in class BasicIndexer
  - Returns: TermPipeline, the last component of the term pipeline.

- createFieldRunMerger
  protected void createFieldRunMerger(java.lang.String[][] files) throws java.io.IOException
  Description copied from class: BasicSinglePassIndexer
  Hook method that creates a FieldRunMerger instance.
  - Overrides: createFieldRunMerger in class BasicSinglePassIndexer
  - Throws:
    - java.io.IOException - if an I/O error occurs.

- createRunMerger
  protected void createRunMerger(java.lang.String[][] files) throws java.lang.Exception
  Description copied from class: BasicSinglePassIndexer
  Hook method that creates a RunsMerger instance.
  - Overrides: createRunMerger in class BasicSinglePassIndexer
  - Throws:
    - java.io.IOException - if an I/O error occurs.
    - java.lang.Exception

- createMemoryPostings
  protected void createMemoryPostings()
  Description copied from class: BasicSinglePassIndexer
  Hook method that creates the right type of MemoryPostings class.
  - Overrides: createMemoryPostings in class BasicSinglePassIndexer

- createDocumentPostings
  protected void createDocumentPostings()
  Description copied from class: BasicIndexer
  Hook method that creates the right type of DocumentTree class.
  - Overrides: createDocumentPostings in class BasicIndexer

- performMultiWayMerge
  public void performMultiWayMerge() throws java.io.IOException
  Description copied from class: BasicSinglePassIndexer
  Uses the merger class to perform a k-way merge over a set of previously written runs. The file names and the number of runs are given by the private queue.
  - Overrides: performMultiWayMerge in class BasicSinglePassIndexer
  - Throws:
    - java.io.IOException
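
The general idea of a k-way merge over sorted runs can be sketched with a priority queue. This is a generic illustration, not the merger class used by Terrier: `KWayMergeSketch`, `RunHead`, and `merge` are hypothetical names, and the runs are plain in-memory lists of terms rather than files of postings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of a k-way merge over sorted runs. Not Terrier code.
final class KWayMergeSketch {

    /** The current head term of one run, plus the iterator over the rest of it. */
    private static final class RunHead implements Comparable<RunHead> {
        final String term;
        final Iterator<String> rest;

        RunHead(String term, Iterator<String> rest) {
            this.term = term;
            this.rest = rest;
        }

        @Override
        public int compareTo(RunHead other) {
            return term.compareTo(other.term);
        }
    }

    /** Merges individually sorted runs into one sorted list of distinct terms. */
    static List<String> merge(List<List<String>> runs) {
        PriorityQueue<RunHead> queue = new PriorityQueue<>();
        for (List<String> run : runs) {
            Iterator<String> it = run.iterator();
            if (it.hasNext()) {
                queue.add(new RunHead(it.next(), it));
            }
        }
        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            RunHead head = queue.poll();   // smallest head term across all runs
            // Assumption: a term seen in several runs is emitted once; a real
            // indexer would combine the posting lists of that term instead.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(head.term)) {
                merged.add(head.term);
            }
            if (head.rest.hasNext()) {
                queue.add(new RunHead(head.rest.next(), head.rest));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
            Arrays.asList("apple", "cat"),
            Arrays.asList("bat", "cat", "dog"),
            Arrays.asList("ant"));
        // prints [ant, apple, bat, cat, dog]
        System.out.println(merge(runs));
    }
}
```

The queue holds at most one entry per run, so each step costs O(log k) for k runs regardless of how long the runs are.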
 
 