public class DocumentPostingList extends Object implements org.apache.hadoop.io.Writable
 Properties:
 
| Modifier and Type | Class and Description | 
|---|---|
| protected class  | DocumentPostingList.postingIterator | 

| Modifier and Type | Field | Description |
|---|---|---|
| protected static int | AVG_DOCUMENT_UNIQUE_TERMS | Average number of unique terms per document, used to tune the initial size of the hash maps used in this class. |
| protected int | documentLength | Length of the document so far. |
| protected gnu.trove.TObjectIntHashMap<String> | occurrences | Mapping from each term to its term frequency (tf). |

| Constructor | Description |
|---|---|
| DocumentPostingList() | Create a new DocumentPostingList object. |

| Modifier and Type | Method | Description |
|---|---|---|
| void | clear() | Removes all postings from this document. |
| void | forEachTerm(gnu.trove.TObjectIntProcedure<String> proc) | Execute the specified method for each term. |
| int | getDocumentLength() | Returns the total number of tokens in this document. |
| DocumentIndexEntry | getDocumentStatistics() | Return a DocumentIndexEntry for this document. |
| int | getFrequency(String term) | Return the frequency of the specified term in this document. |
| int | getNumberOfPointers() | Returns the number of unique terms in this document. |
| int[][] | getPostings() | Returns the postings suitable to be written into the direct index. |
| IterablePosting | getPostings2() | Returns a posting iterator suitable to be written into the direct index. |
| protected int | getTermId(String term) | Used by getPostings() and getPostings2() to obtain the term id of the term. |
| void | insert(int tf, String term) | Insert a term into the posting list of this document. |
| void | insert(String term) | Insert a term into the posting list of this document. |
| protected IterablePosting | makePostingIterator(String[] _terms, int[] termIds) | |
| void | readFields(DataInput in) | |
| String[] | termSet() | Returns all terms in this posting list. |
| void | write(DataOutput out) | |

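
The bookkeeping described by `insert`, `getFrequency`, `getDocumentLength`, `getNumberOfPointers` and `clear` can be illustrated with a minimal, self-contained sketch. `SimplePostingList` is a hypothetical stand-in built on `java.util.HashMap`; it is not Terrier's implementation, which uses `gnu.trove.TObjectIntHashMap`. The sketch assumes that an unseen term has frequency 0 and that `insert(int tf, String term)` adds `tf` to the document length.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not Terrier's DocumentPostingList. */
class SimplePostingList {
    // term -> term frequency (tf) within the current document
    private final Map<String, Integer> occurrences = new HashMap<>();
    // total number of tokens inserted so far
    private int documentLength = 0;

    /** Record a single occurrence of the term. */
    void insert(String term) {
        insert(1, term);
    }

    /** Record tf occurrences of the term at once (assumed semantics). */
    void insert(int tf, String term) {
        occurrences.merge(term, tf, Integer::sum);
        documentLength += tf;
    }

    /** Frequency of the term in this document; 0 if unseen (assumption). */
    int getFrequency(String term) {
        return occurrences.getOrDefault(term, 0);
    }

    /** Total number of tokens in this document. */
    int getDocumentLength() {
        return documentLength;
    }

    /** Number of unique terms in this document. */
    int getNumberOfPointers() {
        return occurrences.size();
    }

    /** Remove all postings so the object can be reused for another document. */
    void clear() {
        occurrences.clear();
        documentLength = 0;
    }

    public static void main(String[] args) {
        SimplePostingList doc = new SimplePostingList();
        for (String token : "to be or not to be".split(" ")) {
            doc.insert(token);
        }
        System.out.println(doc.getFrequency("to"));     // 2
        System.out.println(doc.getDocumentLength());    // 6
        System.out.println(doc.getNumberOfPointers());  // 4
    }
}
```

Keeping the running token count separate from the map means the document length is available without summing every entry.
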
protected static final int AVG_DOCUMENT_UNIQUE_TERMS
protected int documentLength
protected final gnu.trove.TObjectIntHashMap<String> occurrences
public DocumentPostingList()
public String[] termSet()
public int getFrequency(String term)
public void clear()
public int getDocumentLength()
public int getNumberOfPointers()
public void insert(String term)
    term - the Term being inserted
public void insert(int tf, String term)
    tf - frequency
    term - the Term being inserted
public DocumentIndexEntry getDocumentStatistics()
public void forEachTerm(gnu.trove.TObjectIntProcedure<String> proc)
protected int getTermId(String term)
public int[][] getPostings()
public IterablePosting getPostings2()
protected IterablePosting makePostingIterator(String[] _terms, int[] termIds)
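
The class implements `org.apache.hadoop.io.Writable`, whose contract is the `write(DataOutput)` and `readFields(DataInput)` pair declared by this class. The sketch below shows that round trip using only `java.io`. `TermCountsWritable`, its `roundTrip` helper and the byte layout (entry count, then each term as UTF followed by its frequency) are assumptions made for illustration; this page does not document Terrier's actual serialization format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical Writable-style class; the byte layout here is an assumption. */
class TermCountsWritable {
    // term -> term frequency; a TreeMap keeps the written order deterministic
    final Map<String, Integer> occurrences = new TreeMap<>();

    /** Serialize as: entry count, then (term, tf) pairs. */
    public void write(DataOutput out) throws IOException {
        out.writeInt(occurrences.size());
        for (Map.Entry<String, Integer> e : occurrences.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeInt(e.getValue());
        }
    }

    /** Replace the current state with what write() produced. */
    public void readFields(DataInput in) throws IOException {
        occurrences.clear();
        int entries = in.readInt();
        for (int i = 0; i < entries; i++) {
            String term = in.readUTF();
            int tf = in.readInt();
            occurrences.put(term, tf);
        }
    }

    /** Serialize to an in-memory buffer and read the bytes back into a new object. */
    static TermCountsWritable roundTrip(TermCountsWritable source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));
            TermCountsWritable copy = new TermCountsWritable();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because `readFields` clears the existing state before reading, a single instance can be reused across records, which is how Writable objects are commonly handled in Hadoop jobs.
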
public void readFields(DataInput in) throws IOException
    Specified by: readFields in interface org.apache.hadoop.io.Writable
    Throws: IOException
public void write(DataOutput out) throws IOException
    Specified by: write in interface org.apache.hadoop.io.Writable
    Throws: IOException

Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow