InvertedIndex (Terrier Information Retrieval Platform version 1.1.1 API Specification)

uk.ac.gla.terrier.structures
Class InvertedIndex

java.lang.Object
  uk.ac.gla.terrier.structures.InvertedIndex

Direct Known Subclasses: BlockInvertedIndex

public class InvertedIndex
extends java.lang.Object

This class implements the inverted index used for retrieval, optionally with field information.

Version: $Revision: 1.34 $
Author: Douglas Johnson, Vassilis Plachouras, Craig Macdonald

Field Summary
`static double`	`FIELD_LOAD_FACTOR` Used during retrieval with fields to make a rough guess at the initial size of the temporaryTerms ArrayList in getDocuments().
`static double`	`NORMAL_LOAD_FACTOR` Used during retrieval to make a rough guess at the initial size of the temporaryTerms ArrayList in getDocuments().

Constructor Summary
`InvertedIndex(Lexicon lexicon)` Creates an instance of the InvertedIndex class using the given lexicon.
`InvertedIndex(Lexicon lexicon, java.lang.String filename)` Creates an instance of the InvertedIndex class using the given lexicon and inverted file name.
`InvertedIndex(Lexicon lexicon, java.lang.String path, java.lang.String prefix)`

Method Summary
`void`	`close()` Closes the underlying bit file.
`BitFile`	`getBitFile()` Returns the underlying bit file, so that callers can use it more efficiently when assigning scores to the retrieved documents.
`int[][]`	`getDocuments(int termid)` Returns a two-dimensional array containing the document ids, term frequencies and field scores for the documents that contain the given term.
`int[][]`	`getDocuments(int termid, int startDocid, int endDocid)` Returns a two-dimensional array of five rows containing the document ids, the term frequencies, the field scores, the block frequencies and the block ids for the given term, restricted to a range of docids.
`int[][]`	`getDocuments(LexiconEntry lEntry)`
`int[][]`	`getDocuments(long sOffset, byte sBitOffset, long eOffset, byte eBitOffset)` Returns a two-dimensional array containing the document ids, term frequencies and field scores for the postings stored between the given offsets in the inverted file.
`void`	`print()` Prints out the inverted index file.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

NORMAL_LOAD_FACTOR

public static final double NORMAL_LOAD_FACTOR

Used during retrieval to make a rough guess at the initial size of the temporaryTerms ArrayList in getDocuments(). The higher this value, the lower the chance that the ArrayList will have to be grown (growing is expensive), but the more memory may be used unnecessarily.

See Also: Constant Field Values

FIELD_LOAD_FACTOR

public static final double FIELD_LOAD_FACTOR

Used during retrieval with fields to make a rough guess at the initial size of the temporaryTerms ArrayList in getDocuments(). The higher this value, the lower the chance that the ArrayList will have to be grown (growing is expensive), but the more memory may be used unnecessarily.

See Also: Constant Field Values
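
The trade-off described for both load factors can be illustrated with a minimal sketch that presizes an ArrayList from an estimated entry count scaled by a load factor. The class name, the `initialCapacity` helper, the two factor values and the sizing rule itself are assumptions made for illustration only; this page does not state the actual constants or how getDocuments() applies them.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadFactorSketch {
    // Hypothetical values, NOT the real constants (see "Constant Field Values").
    public static final double ASSUMED_NORMAL_LOAD_FACTOR = 1.0;
    public static final double ASSUMED_FIELD_LOAD_FACTOR = 1.5;

    // Assumed sizing rule: scale the estimated number of entries by the load factor.
    public static int initialCapacity(int estimatedEntries, double loadFactor) {
        return (int) Math.ceil(estimatedEntries * loadFactor);
    }

    public static void main(String[] args) {
        int estimatedEntries = 1000;
        // A larger factor reserves more memory up front but avoids regrowing the list.
        List<int[]> temporaryTerms =
            new ArrayList<>(initialCapacity(estimatedEntries, ASSUMED_FIELD_LOAD_FACTOR));
        for (int i = 0; i < estimatedEntries; i++) {
            temporaryTerms.add(new int[] {i, 1});
        }
        System.out.println(temporaryTerms.size());
    }
}
```

With a factor that is too low the list is copied into a larger backing array as it fills; with one that is too high the unused capacity is simply wasted.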

Constructor Detail

InvertedIndex

public InvertedIndex(Lexicon lexicon,
                     java.lang.String path,
                     java.lang.String prefix)

InvertedIndex

public InvertedIndex(Lexicon lexicon)

Creates an instance of the InvertedIndex class using the given lexicon.

Parameters:
lexicon - The lexicon used for retrieval

InvertedIndex

public InvertedIndex(Lexicon lexicon,
                     java.lang.String filename)

Creates an instance of the InvertedIndex class using the given lexicon and inverted file name.

Parameters:
lexicon - The lexicon used for retrieval
filename - The name of the inverted file

Method Detail

print

public void print()

Prints out the inverted index file.

getDocuments

public int[][] getDocuments(LexiconEntry lEntry)

getDocuments

public int[][] getDocuments(int termid)

Returns a two-dimensional array containing the document ids, term frequencies and field scores for the documents that contain the given term.

Parameters:
termid - the identifier of the term whose documents we are looking for.
Returns: a two-dimensional [3][n] array containing the n document identifiers, term frequencies and field scores. If fields are not enabled, the array is [2][n].
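
A minimal sketch of how a caller might walk the returned array, assuming row 0 holds the document ids, row 1 the term frequencies and row 2 (when present) the field scores, in the order the description lists them. The class, the `describe` helper and the sample data are hypothetical; the arrays below stand in for a real getDocuments(termid) result.

```java
public class PostingsLayoutSketch {
    // Formats each posting as "docid:frequency" or "docid:frequency:fieldScore".
    public static String describe(int[][] postings) {
        boolean hasFields = postings.length == 3; // [3][n] with fields, [2][n] without
        int n = postings[0].length;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(postings[0][i]).append(':').append(postings[1][i]);
            if (hasFields) {
                sb.append(':').append(postings[2][i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data, not read from a real index.
        int[][] withFields = { {3, 7, 12}, {2, 1, 5}, {1, 0, 3} };
        int[][] withoutFields = { {3, 7, 12}, {2, 1, 5} };
        System.out.println(describe(withFields));
        System.out.println(describe(withoutFields));
    }
}
```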

getDocuments

public int[][] getDocuments(long sOffset,
                            byte sBitOffset,
                            long eOffset,
                            byte eBitOffset)

Returns a two-dimensional array containing the document ids, term frequencies and field scores for the postings stored between the given offsets in the inverted file.

Parameters:
sOffset - start byte of the postings in the inverted file
sBitOffset - start bit of the postings in the inverted file
eOffset - end byte of the postings in the inverted file
eBitOffset - end bit of the postings in the inverted file
Returns: a two-dimensional [3][n] array containing the n document identifiers, term frequencies and field scores. If fields are not enabled, the array is [2][n].
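
To make the byte-plus-bit addressing concrete, the sketch below computes how many bits a posting list spans between a start and an end position. It assumes bit offsets run from 0 to 7 within a byte and that the end position is inclusive; neither convention is confirmed on this page, and the class and method names are hypothetical.

```java
public class BitSpanSketch {
    // Number of bits from (sOffset, sBitOffset) to (eOffset, eBitOffset).
    // ASSUMPTIONS: bit offsets are in 0..7 and the end position is inclusive.
    public static long bitsSpanned(long sOffset, byte sBitOffset,
                                   long eOffset, byte eBitOffset) {
        long startBit = sOffset * 8 + sBitOffset;
        long endBit = eOffset * 8 + eBitOffset;
        return endBit - startBit + 1;
    }

    public static void main(String[] args) {
        // A posting list starting at byte 2, bit 3 and ending at byte 4, bit 1.
        System.out.println(bitsSpanned(2L, (byte) 3, 4L, (byte) 1));
    }
}
```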

getDocuments

public int[][] getDocuments(int termid,
                            int startDocid,
                            int endDocid)

Returns a two-dimensional array of five rows containing the document ids, the term frequencies, the field scores, the block frequencies and the block ids for the given term. The returned postings are restricted to documents within the specified range of docids.

Parameters:
termid - the id of the term whose documents we are looking for.
startDocid - The starting docid that will be returned.
endDocid - The last possible docid that will be returned.
Returns: a two-dimensional [5][] array containing the document ids, term frequencies, field scores and block frequencies; the last row contains the block identifiers and has a different length from the row of document identifiers.
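
Because the last row has a different length from the others, a caller needs some rule to find the block ids that belong to one document. The sketch below assumes that row 3 holds one block frequency per document and that row 4 holds all block ids concatenated in document order, so each document owns the next `blockFreqs[i]` entries. That layout is an assumption, not something this page states, and the class, helper and sample data are hypothetical.

```java
import java.util.Arrays;

public class BlockPostingsSketch {
    // Returns the block ids of the document at position docIndex.
    // ASSUMPTION: row 3 = per-document block frequencies,
    //             row 4 = block ids of all documents, concatenated in order.
    public static int[] blockIdsFor(int[][] postings, int docIndex) {
        int[] blockFreqs = postings[3];
        int[] blockIds = postings[4];
        int start = 0;
        for (int i = 0; i < docIndex; i++) {
            start += blockFreqs[i]; // skip the block ids of earlier documents
        }
        return Arrays.copyOfRange(blockIds, start, start + blockFreqs[docIndex]);
    }

    public static void main(String[] args) {
        // Made-up sample data, not read from a real index.
        int[][] postings = {
            {4, 9},          // document ids
            {2, 3},          // term frequencies
            {0, 1},          // field scores
            {2, 3},          // block frequencies
            {1, 5, 0, 2, 8}  // block ids (2 for doc 4, then 3 for doc 9)
        };
        System.out.println(Arrays.toString(blockIdsFor(postings, 1)));
    }
}
```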

close

public void close()

Closes the underlying bit file.

getBitFile

public BitFile getBitFile()

Returns the underlying bit file, so that callers can use it more efficiently when assigning scores to the retrieved documents.

Returns: the underlying bit file
