java.lang.Object
  org.terrier.structures.indexing.singlepass.Inverted2DirectIndexBuilder
public class Inverted2DirectIndexBuilder
Creates a direct index from an InvertedIndex. The algorithm is similar to the one followed by InvertedIndexBuilder: where InvertedIndexBuilder builds an InvertedIndex from a DirectIndex, this class does the opposite, building a DirectIndex from an InvertedIndex.
Algorithm:
For a selection of document ids
    (Scan the inverted index looking for postings with these document ids)
    For each term in the inverted index
        Select required postings from all the postings of that term
        Add these to the posting objects that represent each document
    For each posting object
        Write out the postings for that document
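The steps above can be sketched with plain in-memory arrays. This is an illustration only, not the Terrier implementation: the class name `DirectFromInvertedSketch`, the method `invert`, the `batchSize` parameter, and the `{docId, frequency}` array layout are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative in-memory sketch; names and data layout are hypothetical, not Terrier code. */
public class DirectFromInvertedSketch {

    /**
     * inverted[termId] holds {docId, frequency} pairs for that term.
     * Returns a list indexed by docId, each entry holding {termId, frequency} pairs.
     * batchSize is the number of document ids handled per pass over the inverted index.
     */
    public static List<List<int[]>> invert(int[][][] inverted, int numDocs, int batchSize) {
        List<List<int[]>> direct = new ArrayList<>();
        // For a selection of document ids
        for (int firstDocId = 0; firstDocId < numDocs; firstDocId += batchSize) {
            int lastDocId = Math.min(firstDocId + batchSize, numDocs) - 1;
            // One posting object per document in the current selection
            List<List<int[]>> batch = new ArrayList<>();
            for (int d = firstDocId; d <= lastDocId; d++) {
                batch.add(new ArrayList<>());
            }
            // For each term in the inverted index
            for (int termId = 0; termId < inverted.length; termId++) {
                for (int[] posting : inverted[termId]) {
                    int docId = posting[0];
                    // Select only the postings whose document id falls in the selection
                    if (docId >= firstDocId && docId <= lastDocId) {
                        batch.get(docId - firstDocId).add(new int[] {termId, posting[1]});
                    }
                }
            }
            // For each posting object, "write out" the postings for that document
            direct.addAll(batch);
        }
        return direct;
    }
}
```

Because terms are visited in ascending termid order on every pass, each document's postings come out sorted by termid without a separate sort. The trade-off is one full scan of the inverted index per selection of documents, so a larger selection means fewer scans but more memory.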
Notes:
This algorithm assumes that termids start at 0 and are strictly increasing. This assumption holds true
only for inverted indices generated by the single pass indexing method.
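The assumption in this note could be checked before running the builder. The following is a minimal sketch, assuming the termids have already been read into an array in the order their posting lists appear in the inverted index; the class `TermIdOrderCheck` and its method are hypothetical and not part of Terrier.

```java
/** Hypothetical helper, not part of Terrier. */
public class TermIdOrderCheck {

    /** Returns true if the termids start at 0 and each one is greater than the one before it. */
    public static boolean startsAtZeroAndStrictlyIncreases(int[] termIds) {
        if (termIds.length == 0) {
            return true; // an empty index cannot violate the assumption
        }
        if (termIds[0] != 0) {
            return false;
        }
        for (int i = 1; i < termIds.length; i++) {
            if (termIds[i] <= termIds[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```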
Properties:
Field Summary | | |
---|---|---|
protected java.lang.String | basicDirectIndexPostingIteratorClass | |
protected java.lang.String | destinationStructure | |
protected java.lang.String | directIndexClass | Class to read the generated direct index |
protected java.lang.String | directIndexInputStreamClass | Class to read the generated direct index as a stream |
protected int | fieldCount | The number of different fields that are used for indexing field information. |
protected java.lang.String | fieldDirectIndexPostingIteratorClass | |
protected Index | index | The index currently being used |
protected static org.apache.log4j.Logger | logger | The logger used |
protected long | processTokens | Limit on the number of tokens processed per iteration |
protected boolean | saveTagInformation | Indicates whether field information is used. |
protected java.lang.String | sourceStructure | |
Constructor Summary | |
---|---|
Inverted2DirectIndexBuilder(Index i) | Construct a new instance of this builder class |
Method Summary | | |
---|---|---|
void | createDirectIndex() | Creates the direct index when the collection contains an existing inverted index |
protected PostingInRun | getPostingReader() | Returns the SPIR implementation that should be used for reading the postings written earlier |
protected Posting[] | getPostings(int count) | Gets an array of posting objects of the specified size. |
static void | main(java.lang.String[] args) | Command-line entry point |
protected int | scanDocumentIndexForTokens(long _processTokens, java.util.Iterator<DocumentIndexEntry> docidStream) | Iterates through the document index until it has reached the given number of tokens |
protected long | traverseInvertedFile(InvertedIndexInputStream iiis, int firstDocid, int lastDocid, Posting[] directPostings) | Traverses the inverted file, looking for all occurrences of documents in the given range |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final org.apache.log4j.Logger logger
protected Index index
protected final int fieldCount
protected final boolean saveTagInformation
protected java.lang.String directIndexClass
protected java.lang.String directIndexInputStreamClass
protected java.lang.String basicDirectIndexPostingIteratorClass
protected java.lang.String fieldDirectIndexPostingIteratorClass
protected long processTokens
protected java.lang.String sourceStructure
protected java.lang.String destinationStructure
Constructor Detail |
---|
public Inverted2DirectIndexBuilder(Index i)
Method Detail |
---|
public void createDirectIndex()
protected Posting[] getPostings(int count)
protected PostingInRun getPostingReader()
protected long traverseInvertedFile(InvertedIndexInputStream iiis, int firstDocid, int lastDocid, Posting[] directPostings) throws java.io.IOException
Throws:
java.io.IOException
protected int scanDocumentIndexForTokens(long _processTokens, java.util.Iterator<DocumentIndexEntry> docidStream) throws java.io.IOException
Parameters:
_processTokens - Number of tokens to stop reading the lexicon after
docidStream - the document index stream to read
Throws:
java.io.IOException
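The token-limited scan described for scanDocumentIndexForTokens can be illustrated with a simplified stand-in. This sketch assumes the return value is the number of documents consumed from the stream, which the documentation above does not state. It also uses plain document lengths in place of DocumentIndexEntry objects, and the class `TokenBudgetScanSketch` and its method are hypothetical.

```java
import java.util.Iterator;

/** Hypothetical stand-in for the token-limited scan; not the Terrier implementation. */
public class TokenBudgetScanSketch {

    /**
     * Consumes document lengths from the iterator until the running token total
     * reaches tokenLimit or the iterator is exhausted.
     * Returns how many documents were consumed (an assumed meaning of the int result).
     */
    public static int scanForTokens(long tokenLimit, Iterator<Integer> docLengths) {
        long tokens = 0;
        int docs = 0;
        while (tokens < tokenLimit && docLengths.hasNext()) {
            tokens += docLengths.next();
            docs++;
        }
        return docs;
    }
}
```

Calling the method repeatedly on the same iterator yields successive selections of documents, each holding roughly `tokenLimit` tokens, which matches the role of processTokens as a per-iteration limit.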
public static void main(java.lang.String[] args) throws java.lang.Exception
Parameters:
args -
Throws:
java.lang.Exception