Terrier IR Platform
2.2.1

uk.ac.gla.terrier.indexing
Class BlockIndexer

java.lang.Object
  extended by uk.ac.gla.terrier.indexing.Indexer
      extended by uk.ac.gla.terrier.indexing.BlockIndexer

public class BlockIndexer
extends Indexer

An indexer that saves block information for the indexed terms. Block information is usually recorded in terms of relative term positions (position 1, position 2, etc.). However, since 2.2, Terrier also supports "marker terms" in the term stream during indexing, which are used to increment the block counter.

Markered Blocks
Markers are terms (artificially inserted or otherwise) in the term stream that denote when the block counter should be incremented. This functionality is enabled using the block.delimiters.enabled property, while the marker terms are specified as a comma-delimited list in the block.delimiters property.
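
The difference between positional and marker-based block numbering can be illustrated with a small standalone sketch. It does not use any Terrier classes: the class name BlockCounterSketch, both method names, the marker term "DELIM", and the choice to start counting at 0 are all hypothetical. It also assumes that marker terms are not themselves recorded, which the description above does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BlockCounterSketch {

    /** Positional numbering: every term gets the next block id (0, 1, 2, ...). */
    static List<Integer> positionalBlocks(List<String> terms) {
        List<Integer> blocks = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            blocks.add(i);
        }
        return blocks;
    }

    /**
     * Marker-based numbering: the block counter only advances when a marker
     * term is seen. Assumption: the marker itself is dropped from the output.
     */
    static List<Integer> markerBlocks(List<String> terms, Set<String> markers) {
        List<Integer> blocks = new ArrayList<>();
        int block = 0;
        for (String term : terms) {
            if (markers.contains(term)) {
                block++;            // marker term: start a new block
            } else {
                blocks.add(block);  // ordinary term: record its current block
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("first", "sentence", "DELIM", "second", "one");
        // Hypothetical marker set, as if block.delimiters were set to "DELIM".
        Set<String> markers = new HashSet<>(Arrays.asList("DELIM"));

        System.out.println(positionalBlocks(terms));      // prints [0, 1, 2, 3, 4]
        System.out.println(markerBlocks(terms, markers)); // prints [0, 0, 1, 1]
    }
}
```

With markers placed at sentence boundaries, for example, two terms share a block id exactly when they occur in the same sentence.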

Version:
$Revision: 1.49 $
Author:
Craig Macdonald, Vassilis Plachouras, Rodrygo Santos

Constructor Summary
BlockIndexer(java.lang.String pathname, java.lang.String prefix)
          Constructs an instance of this class, where the created data structures are stored in the given path, with the given prefix on the filenames.
 
Method Summary
 void createDirectIndex(Collection[] collections)
          For the given collections, iterates through the documents and creates the direct index, document index and lexicon, using information about blocks and possibly fields.
 void createInvertedIndex()
          Creates the inverted index from the already created direct index, document index and lexicon.
 
Methods inherited from class uk.ac.gla.terrier.indexing.Indexer
index, isUTFIndexing, main, merge, merge, useFieldInformation
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BlockIndexer

public BlockIndexer(java.lang.String pathname,
                    java.lang.String prefix)
Constructs an instance of this class, where the created data structures are stored in the given path, with the given prefix on the filenames.

Parameters:
pathname - String the path in which the created data structures will be saved
prefix - String the prefix on the filenames of the created data structures

Method Detail

createDirectIndex

public void createDirectIndex(Collection[] collections)
For the given collections, iterates through the documents and creates the direct index, document index and lexicon, using information about blocks and possibly fields.

Specified by:
createDirectIndex in class Indexer
Parameters:
collections - Collection[] the collections to index.
See Also:
Indexer.createDirectIndex(uk.ac.gla.terrier.indexing.Collection[])

createInvertedIndex

public void createInvertedIndex()
Creates the inverted index from the already created direct index, document index and lexicon. It saves block information and possibly field information as well.

Specified by:
createInvertedIndex in class Indexer
See Also:
Indexer.createInvertedIndex()
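
The two methods above describe a two-pass build: the direct index is created first, and the inverted index is then derived from it. The following toy sketch shows that idea with in-memory maps. It is not Terrier's implementation and uses no Terrier classes; the class name TwoPassIndexSketch, the methods buildDirect and invert, and the use of term positions as block ids are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TwoPassIndexSketch {

    /** Pass 1: document id -> (term -> block ids). A toy stand-in for a direct index. */
    static List<Map<String, List<Integer>>> buildDirect(String[][] docs) {
        List<Map<String, List<Integer>>> direct = new ArrayList<>();
        for (String[] doc : docs) {
            Map<String, List<Integer>> postings = new TreeMap<>();
            for (int pos = 0; pos < doc.length; pos++) {
                // Here the block id is simply the term position within the document.
                postings.computeIfAbsent(doc[pos], t -> new ArrayList<>()).add(pos);
            }
            direct.add(postings);
        }
        return direct;
    }

    /** Pass 2: term -> (document id -> block ids), derived only from the direct index. */
    static Map<String, Map<Integer, List<Integer>>> invert(List<Map<String, List<Integer>>> direct) {
        Map<String, Map<Integer, List<Integer>>> inverted = new TreeMap<>();
        for (int docId = 0; docId < direct.size(); docId++) {
            for (Map.Entry<String, List<Integer>> e : direct.get(docId).entrySet()) {
                inverted.computeIfAbsent(e.getKey(), t -> new TreeMap<>())
                        .put(docId, e.getValue());
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"to", "be", "or", "not", "to", "be"},
            {"be", "quick"}
        };
        Map<String, Map<Integer, List<Integer>>> inverted = invert(buildDirect(docs));
        System.out.println(inverted.get("be")); // prints {0=[1, 5], 1=[0]}
    }
}
```

Because the second pass reads only the output of the first, the block information recorded in the direct index carries over to the inverted index unchanged.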

Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow