org.terrier.structures.indexing.singlepass
Class BlockInverted2DirectIndexBuilder

java.lang.Object
  extended by org.terrier.structures.indexing.singlepass.Inverted2DirectIndexBuilder
      extended by org.terrier.structures.indexing.singlepass.BlockInverted2DirectIndexBuilder

public class BlockInverted2DirectIndexBuilder
extends Inverted2DirectIndexBuilder

Creates a block direct index from a BlockInvertedIndex.

Properties:

  1. inverted2direct.processtokens - total number of tokens to attempt to process in each iteration. Defaults to 50000000. Memory usage is more likely to be linked to the number of pointers and the number of blocks; however, as the document index does not record these statistics on a per-document basis, they are impossible to estimate. Note that this default is lower than the default used by Inverted2DirectIndexBuilder.
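
As a rough illustration of how a per-iteration token budget such as inverted2direct.processtokens can bound the work done in each pass, the sketch below splits a collection into docid ranges whose document lengths sum to at most the budget. It is not the Terrier implementation: the class TokenBudgetBatcher, the method planRanges and the in-memory docLengths array are hypothetical names introduced for this example only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Terrier: plans docid ranges under a token budget. */
class TokenBudgetBatcher {

    /**
     * Returns inclusive {firstDocid, lastDocid} ranges. Each range holds at most
     * processTokens tokens, unless a single document alone exceeds the budget,
     * in which case that document forms a range by itself.
     */
    static List<int[]> planRanges(int[] docLengths, long processTokens) {
        List<int[]> ranges = new ArrayList<>();
        int first = 0;
        long tokens = 0;
        for (int docid = 0; docid < docLengths.length; docid++) {
            // Close the current range before adding this document would exceed the budget.
            if (docid > first && tokens + docLengths[docid] > processTokens) {
                ranges.add(new int[] {first, docid - 1});
                first = docid;
                tokens = 0;
            }
            tokens += docLengths[docid];
        }
        // Emit the final, possibly partial, range.
        if (first < docLengths.length) {
            ranges.add(new int[] {first, docLengths.length - 1});
        }
        return ranges;
    }

    public static void main(String[] args) {
        int[] docLengths = {40, 30, 50, 20, 10}; // assumed document lengths in tokens
        for (int[] range : planRanges(docLengths, 80)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

With a budget of 80 tokens, the five assumed documents are split into the ranges 0-1 and 2-4. A smaller budget produces more iterations over the inverted file, each holding fewer postings in memory at once.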

Since:
2.0
Author:
Craig Macdonald

Field Summary
 
Fields inherited from class org.terrier.structures.indexing.singlepass.Inverted2DirectIndexBuilder
basicDirectIndexPostingIteratorClass, destinationStructure, directIndexClass, directIndexInputStreamClass, fieldCount, fieldDirectIndexPostingIteratorClass, index, logger, processTokens, saveTagInformation, sourceStructure
 
Constructor Summary
BlockInverted2DirectIndexBuilder(Index i)
          constructor
 
Method Summary
protected  PostingInRun getPostingReader()
          returns the SPIR implementation that should be used for reading the postings written earlier
protected  Posting[] getPostings(int count)
          get an array of posting objects of the specified size.
static void main(java.lang.String[] args)
          main
protected  long traverseInvertedFile(InvertedIndexInputStream iiis, int firstDocid, int lastDocid, Posting[] directPostings)
          traverse the inverted file, looking for all occurrences of documents in the given range
 
Methods inherited from class org.terrier.structures.indexing.singlepass.Inverted2DirectIndexBuilder
createDirectIndex, scanDocumentIndexForTokens
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BlockInverted2DirectIndexBuilder

public BlockInverted2DirectIndexBuilder(Index i)
constructor

Parameters:
i -
Method Detail

getPostings

protected Posting[] getPostings(int count)
get an array of posting objects of the specified size. These will be used to hold the postings for a range of documents

Overrides:
getPostings in class Inverted2DirectIndexBuilder

getPostingReader

protected PostingInRun getPostingReader()
returns the SPIR implementation that should be used for reading the postings written earlier

Overrides:
getPostingReader in class Inverted2DirectIndexBuilder

traverseInvertedFile

protected long traverseInvertedFile(InvertedIndexInputStream iiis,
                                    int firstDocid,
                                    int lastDocid,
                                    Posting[] directPostings)
                             throws java.io.IOException
traverse the inverted file, looking for all occurrences of documents in the given range

Overrides:
traverseInvertedFile in class Inverted2DirectIndexBuilder
Returns:
the number of tokens found in all of the documents.
Throws:
java.io.IOException
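
The traversal described above can be pictured with a small in-memory model. The sketch below is only an illustration under stated assumptions, not the Terrier code: InvertedToDirectSketch, TermPosting, DocEntry, emptyDirect, sampleIndex and traverse are hypothetical names, and plain lists stand in for InvertedIndexInputStream and Posting[]. It scans every term's posting list, keeps the postings whose docid falls in the requested range, appends them (with their positions, standing in for blocks) to the per-document lists, and returns the number of tokens copied.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not the Terrier classes: turns term-to-document postings into document-to-term postings. */
class InvertedToDirectSketch {

    /** One posting of a term: the document it occurs in and its positions there. */
    static final class TermPosting {
        final int docid;
        final int[] positions;
        TermPosting(int docid, int... positions) { this.docid = docid; this.positions = positions; }
    }

    /** One entry of a document's direct posting list: the term and its positions. */
    static final class DocEntry {
        final int termid;
        final int[] positions;
        DocEntry(int termid, int[] positions) { this.termid = termid; this.positions = positions; }
    }

    /** Allocates one empty direct posting list per document in the range. */
    static List<List<DocEntry>> emptyDirect(int count) {
        List<List<DocEntry>> direct = new ArrayList<>();
        for (int i = 0; i < count; i++) direct.add(new ArrayList<>());
        return direct;
    }

    /** Copies postings with docid in [firstDocid, lastDocid] into direct; returns tokens copied. */
    static long traverse(List<List<TermPosting>> inverted, int firstDocid, int lastDocid,
                         List<List<DocEntry>> direct) {
        long tokens = 0;
        for (int termid = 0; termid < inverted.size(); termid++) {
            for (TermPosting p : inverted.get(termid)) {
                if (p.docid < firstDocid || p.docid > lastDocid) continue; // outside this pass
                direct.get(p.docid - firstDocid).add(new DocEntry(termid, p.positions));
                tokens += p.positions.length; // term frequency equals the number of positions
            }
        }
        return tokens;
    }

    /** Assumed sample data: term 0 occurs in docs 0 and 2, term 1 in docs 1 and 2. */
    static List<List<TermPosting>> sampleIndex() {
        List<List<TermPosting>> inverted = new ArrayList<>();
        inverted.add(List.of(new TermPosting(0, 0, 3), new TermPosting(2, 1)));
        inverted.add(List.of(new TermPosting(1, 0), new TermPosting(2, 0, 2)));
        return inverted;
    }

    public static void main(String[] args) {
        List<List<DocEntry>> direct = emptyDirect(2); // holds docids 1 and 2
        long tokens = traverse(sampleIndex(), 1, 2, direct);
        System.out.println("tokens=" + tokens + ", terms in doc 2=" + direct.get(1).size());
    }
}
```

Because terms are visited in ascending termid order, each document's list in this sketch ends up sorted by termid without a separate sort step.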

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
main

Parameters:
args -
Throws:
java.lang.Exception


Terrier 3.5. Copyright © 2004-2011 University of Glasgow