Package org.terrier.structures.indexing
Class BlockDocumentPostingList
- java.lang.Object
- org.terrier.structures.indexing.DocumentPostingList
- org.terrier.structures.indexing.BlockDocumentPostingList
- All Implemented Interfaces:
java.io.Serializable,org.apache.hadoop.io.Writable
public class BlockDocumentPostingList extends DocumentPostingList
Represents the postings of one document, and saves block (term position) information. Uses HashMaps internally.
Properties:
- indexing.avg.unique.terms.per.doc - the average number of unique terms per document, used to tune the initial size of the hash maps used in this class.
- See Also:
DocumentPostingList, Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.terrier.structures.indexing.DocumentPostingList
DocumentPostingList.postingIterator
-
-
Field Summary
Fields
- protected int blockCount - number of blocks in this document.
- protected gnu.trove.THashMap<java.lang.String,gnu.trove.TIntHashSet> term_blocks - mapping of term to block ids in this document.
Fields inherited from class org.terrier.structures.indexing.DocumentPostingList
AVG_DOCUMENT_UNIQUE_TERMS, documentLength, occurrences
-
-
Constructor Summary
Constructors
- BlockDocumentPostingList() - Instantiate a new block document posting list.
-
Method Summary
Methods
- void clear() - Removes all postings from this document.
- int[] getBlocks(java.lang.String term) - Returns the blocks for the given term.
- int[][] getPostings(TermCodes termCodes) - Returns the postings suitable to be written into the block direct index.
- void insert(java.lang.String t, int blockId) - Inserts a term into this document, occurring at the given block id.
- protected IterablePosting makePostingIterator(java.lang.String[] _terms, int[] termIds)
- void readFields(java.io.DataInput in)
- void write(java.io.DataOutput out)
Methods inherited from class org.terrier.structures.indexing.DocumentPostingList
forEachTerm, getDocumentLength, getDocumentStatistics, getFrequency, getNumberOfPointers, getPostings2, insert, insert, termSet
-
-
-
-
Method Detail
-
insert
public void insert(java.lang.String t, int blockId)
Inserts a term into this document, occurring at the given block id.
-
getBlocks
public int[] getBlocks(java.lang.String term)
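Returns the block ids recorded for the given term in this document.
- Parameters:
term - the term to look up
- Returns:
int[] - the block ids for the term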
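To make the relationship between insert and getBlocks concrete, the following is a minimal sketch using only java.util collections. It is not the Terrier implementation: the class name BlockPostingSketch, the ascending order of the returned block ids, and the empty array returned for an unknown term are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in, NOT org.terrier's BlockDocumentPostingList.
// It only illustrates the idea of mapping each term to its block ids.
class BlockPostingSketch {
    // term -> number of occurrences in this document
    private final Map<String, Integer> occurrences = new HashMap<>();
    // term -> distinct block ids (positions) at which the term occurs
    private final Map<String, TreeSet<Integer>> termBlocks = new HashMap<>();

    // Record one occurrence of the term at the given block id.
    void insert(String term, int blockId) {
        occurrences.merge(term, 1, Integer::sum);
        termBlocks.computeIfAbsent(term, k -> new TreeSet<>()).add(blockId);
    }

    // Block ids for the term in ascending order (assumption);
    // an empty array when the term was never inserted (assumption).
    int[] getBlocks(String term) {
        TreeSet<Integer> blocks = termBlocks.get(term);
        if (blocks == null) {
            return new int[0];
        }
        return blocks.stream().mapToInt(Integer::intValue).toArray();
    }

    int getFrequency(String term) {
        return occurrences.getOrDefault(term, 0);
    }

    public static void main(String[] args) {
        BlockPostingSketch doc = new BlockPostingSketch();
        String[] tokens = {"to", "be", "or", "not", "to", "be"};
        // Here each token position is used as its block id.
        for (int i = 0; i < tokens.length; i++) {
            doc.insert(tokens[i], i);
        }
        System.out.println(Arrays.toString(doc.getBlocks("to"))); // [0, 4]
        System.out.println(doc.getFrequency("be"));               // 2
    }
}
```

In this sketch one block corresponds to one token position; an indexer could equally assign several consecutive tokens to the same block id, in which case repeated inserts of a term within one block would add only a single block id while still increasing the frequency.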
-
getPostings
public int[][] getPostings(TermCodes termCodes)
Returns the postings suitable to be written into the block direct index.
- Overrides:
getPostings in class DocumentPostingList
-
makePostingIterator
protected IterablePosting makePostingIterator(java.lang.String[] _terms, int[] termIds)
- Overrides:
makePostingIterator in class DocumentPostingList
-
clear
public void clear()
Description copied from class: DocumentPostingList
Removes all postings from this document.
- Overrides:
clear in class DocumentPostingList
-
readFields
public void readFields(java.io.DataInput in) throws java.io.IOException
- Specified by:
readFields in interface org.apache.hadoop.io.Writable
- Overrides:
readFields in class DocumentPostingList
- Throws:
java.io.IOException
-
write
public void write(java.io.DataOutput out) throws java.io.IOException
- Specified by:
write in interface org.apache.hadoop.io.Writable
- Overrides:
write in class DocumentPostingList
- Throws:
java.io.IOException
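The write and readFields pair follows the usual Writable pattern: state is written to a java.io.DataOutput and later restored from a java.io.DataInput. The sketch below shows that round trip with plain JDK streams. The class name BlockPostingRoundTripSketch and the byte layout (term count, then each term followed by its block ids) are invented for illustration and should not be taken as the format this class actually writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a Writable-style round trip; the layout is an assumption.
class BlockPostingRoundTripSketch {

    // Assumed layout: term count, then per term: UTF string, block count, block ids.
    static void write(Map<String, int[]> termBlocks, DataOutput out) throws IOException {
        out.writeInt(termBlocks.size());
        for (Map.Entry<String, int[]> entry : termBlocks.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeInt(entry.getValue().length);
            for (int blockId : entry.getValue()) {
                out.writeInt(blockId);
            }
        }
    }

    // Reads back exactly what write() produced, in the same order.
    static Map<String, int[]> readFields(DataInput in) throws IOException {
        int termCount = in.readInt();
        Map<String, int[]> termBlocks = new HashMap<>();
        for (int i = 0; i < termCount; i++) {
            String term = in.readUTF();
            int[] blocks = new int[in.readInt()];
            for (int j = 0; j < blocks.length; j++) {
                blocks[j] = in.readInt();
            }
            termBlocks.put(term, blocks);
        }
        return termBlocks;
    }

    // Serialises to an in-memory buffer and deserialises it again.
    static Map<String, int[]> roundTrip(Map<String, int[]> termBlocks) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(termBlocks, new DataOutputStream(bytes));
            return readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, int[]> doc = new HashMap<>();
        doc.put("to", new int[]{0, 4});
        doc.put("be", new int[]{1, 5});
        Map<String, int[]> copy = roundTrip(doc);
        System.out.println(Arrays.toString(copy.get("to"))); // [0, 4]
    }
}
```

The point of the sketch is the symmetry: readFields must consume fields in exactly the order write emitted them, which is why a subclass that adds block information typically overrides both methods together.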