Class BlockInverted2DirectIndexBuilder


  • public class BlockInverted2DirectIndexBuilder
    extends Inverted2DirectIndexBuilder
    Create a block direct index from a BlockInvertedIndex.

    Properties:

    1. inverted2direct.processtokens - total number of tokens to attempt in each iteration. Defaults to 50000000. Memory usage is more likely to be linked to the number of pointers and the number of blocks; however, as the document index does not contain these statistics on a per-document basis, they are impossible to estimate. Note that the default is lower than that of Inverted2DirectIndexBuilder.
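
    As an illustration of what a per-iteration token budget implies, the following is a minimal sketch and not Terrier's actual code. It assumes document lengths are available as a plain int array and that the property is read from a java.util.Properties object. The class TokenBudgetSketch and its methods are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Terrier: shows how a token budget
// could split a collection into consecutive docid ranges.
final class TokenBudgetSketch {
    /** Default value as documented for inverted2direct.processtokens. */
    static final long DEFAULT_PROCESS_TOKENS = 50_000_000L;

    /** Reads the property, falling back to the documented default. */
    static long processTokens(Properties props) {
        return Long.parseLong(props.getProperty(
                "inverted2direct.processtokens",
                Long.toString(DEFAULT_PROCESS_TOKENS)));
    }

    /**
     * Splits docids 0..docLengths.length-1 into consecutive ranges whose
     * summed length stays within the budget. Each element of the result is
     * {firstDocid, countDocuments}. A single document longer than the
     * budget still receives a range of its own.
     */
    static List<int[]> planIterations(int[] docLengths, long budget) {
        List<int[]> ranges = new ArrayList<>();
        int first = 0;
        long tokens = 0;
        for (int docid = 0; docid < docLengths.length; docid++) {
            // Close the current range if adding this document would exceed the budget.
            if (docid > first && tokens + docLengths[docid] > budget) {
                ranges.add(new int[] {first, docid - first});
                first = docid;
                tokens = 0;
            }
            tokens += docLengths[docid];
        }
        if (first < docLengths.length) {
            ranges.add(new int[] {first, docLengths.length - first});
        }
        return ranges;
    }
}
```

    Under this model, a smaller budget means fewer postings held in memory at once, at the cost of more iterations over the collection.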
    Since:
    2.0
    Author:
    Craig Macdonald
    • Constructor Detail

      • BlockInverted2DirectIndexBuilder

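        public BlockInverted2DirectIndexBuilder(IndexOnDisk i)
        Constructs a builder that creates a block direct index for the given index.
        Parameters:
        i - the on-disk index whose block inverted index is used to build the direct index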
    • Method Detail

      • getPostings

        protected org.terrier.structures.indexing.singlepass.Posting[] getPostings(int count)
        Gets an array of posting objects of the specified size. These will be used to hold the postings for a range of documents.
        Overrides:
        getPostings in class Inverted2DirectIndexBuilder
      • getPostingReader

        protected org.terrier.structures.indexing.singlepass.PostingInRun getPostingReader()
        Returns the SPIR implementation that should be used for reading the postings written earlier.
        Overrides:
        getPostingReader in class Inverted2DirectIndexBuilder
      • traverseInvertedFile

        protected long traverseInvertedFile(PostingIndexInputStream iiis,
                                            int firstDocid,
                                            int countDocuments,
                                            org.terrier.structures.indexing.singlepass.Posting[] directPostings)
                                     throws java.io.IOException
        Traverses the inverted file, looking for all occurrences of documents in the given range.
        Overrides:
        traverseInvertedFile in class Inverted2DirectIndexBuilder
        Returns:
        the number of tokens found in all of the documents in the range.
        Throws:
        java.io.IOException
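
        The traversal described above can be pictured with a simplified in-memory model. This is a sketch under stated assumptions, not the Terrier implementation: posting lists are plain arrays sorted by docid, a posting's frequency is taken to be the number of positions (blocks) it carries, and TraverseSketch, InvPosting, DirectEntry and newDirect are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of inverting a docid range; not Terrier code.
final class TraverseSketch {
    /** One inverted posting: a docid plus the positions (blocks) of the term in it. */
    static final class InvPosting {
        final int docid;
        final int[] positions;
        InvPosting(int docid, int... positions) {
            this.docid = docid;
            this.positions = positions;
        }
    }

    /** One direct posting: a termid plus its positions in the document. */
    static final class DirectEntry {
        final int termid;
        final int[] positions;
        DirectEntry(int termid, int[] positions) {
            this.termid = termid;
            this.positions = positions;
        }
    }

    /** Allocates one empty direct posting list per document in the range. */
    static List<List<DirectEntry>> newDirect(int count) {
        List<List<DirectEntry>> direct = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            direct.add(new ArrayList<>());
        }
        return direct;
    }

    /**
     * Scans every term's posting list (inverted[termid], sorted by docid) and
     * copies postings for docids in [firstDocid, firstDocid + countDocuments)
     * into the matching direct list. Returns the number of tokens copied.
     */
    static long traverse(InvPosting[][] inverted, int firstDocid,
                         int countDocuments, List<List<DirectEntry>> direct) {
        long tokens = 0;
        int lastDocid = firstDocid + countDocuments - 1;
        for (int termid = 0; termid < inverted.length; termid++) {
            for (InvPosting p : inverted[termid]) {
                if (p.docid < firstDocid) continue; // before the range
                if (p.docid > lastDocid) break;     // sorted, so nothing more in range
                direct.get(p.docid - firstDocid).add(new DirectEntry(termid, p.positions));
                tokens += p.positions.length;
            }
        }
        return tokens;
    }
}
```

        Because terms are visited in ascending termid order, each document's direct list in this model ends up sorted by termid without a separate sort.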