java.lang.Object
  org.terrier.structures.merging.StructureMerger
public class StructureMerger
This class merges the structures created by Terrier, so that we use fewer and larger inverted and direct files.
Properties:
Field Summary | |
---|---|
protected String |
basicDirectIndexPostingIteratorClass
|
protected String |
basicInvertedIndexPostingIteratorClass
|
protected Index |
destIndex
destination index |
protected String |
directFileInputClass
class to use to read the direct file |
protected String |
directFileInputStreamClass
class to use to read the direct file as a stream |
protected Class<? extends DirectInvertedOutputStream> |
directFileOutputStreamClass
class to use to write direct file |
protected Class<? extends DirectInvertedOutputStream> |
fieldDirectFileOutputStreamClass
|
protected String |
fieldDirectIndexPostingIteratorClass
|
protected Class<? extends DirectInvertedOutputStream> |
fieldInvertedFileOutputStreamClass
class to use to write inverted file |
protected String |
fieldInvertedIndexPostingIteratorClass
|
protected String |
invertedFileInputClass
class to use to read the inverted file |
protected String |
invertedFileInputStreamClass
class to use to read the inverted file as a stream |
protected Class<? extends DirectInvertedOutputStream> |
invertedFileOutputStreamClass
class to use to write inverted file |
protected boolean |
keepTermCodeMap
|
protected static org.apache.log4j.Logger |
logger
the logger used |
protected boolean |
MetaReverse
|
protected int |
numberOfDocuments
The number of documents in the merged structures. |
protected long |
numberOfPointers
The number of pointers in the merged structures. |
protected int |
numberOfTerms
The number of terms in the collection. |
protected Index |
srcIndex1
source index 1 |
protected Index |
srcIndex2
source index 2 |
protected gnu.trove.TIntIntHashMap |
termcodeHashmap
A hashmap for converting the codes of terms appearing only in the vocabulary of the second set of data structures into a new set of term codes for the merged set of data structures. |
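The role of termcodeHashmap can be illustrated with a small, self-contained sketch. It is not Terrier's implementation: the class TermCodeRemapSketch, the method buildRemap, and the use of plain java.util maps in place of gnu.trove.TIntIntHashMap and Terrier's lexicon classes are all assumptions made for illustration. The sketch also assumes that terms of the first lexicon keep their codes, and it records a mapping for every term of the second lexicon (shared terms included), which goes slightly beyond the field description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Terrier): remaps term codes of a second lexicon. */
final class TermCodeRemapSketch {

    /**
     * Builds a map from term codes used in lexicon 2 to term codes in the merged lexicon.
     * Assumptions: terms already in lexicon 1 keep their codes; terms seen only in
     * lexicon 2 receive fresh codes, handed out in iteration order from nextFreeCode.
     */
    static Map<Integer, Integer> buildRemap(Map<String, Integer> lexicon1,
                                            Map<String, Integer> lexicon2,
                                            int nextFreeCode) {
        Map<Integer, Integer> remap = new HashMap<>();
        for (Map.Entry<String, Integer> entry : lexicon2.entrySet()) {
            Integer codeInLexicon1 = lexicon1.get(entry.getKey());
            if (codeInLexicon1 != null) {
                // Shared term: reuse the code it already has in lexicon 1.
                remap.put(entry.getValue(), codeInLexicon1);
            } else {
                // Term unique to lexicon 2: assign the next unused code.
                remap.put(entry.getValue(), nextFreeCode++);
            }
        }
        return remap;
    }
}
```

With such a map in hand, any structure of the second index that stores term codes could be rewritten by looking up each old code and emitting the mapped one. Passing a sorted map (for example a TreeMap) as lexicon2 makes the assignment of fresh codes deterministic.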
Constructor Summary | |
---|---|
StructureMerger(Index _srcIndex1,
Index _srcIndex2,
Index _destIndex)
constructor |
Method Summary | |
---|---|
protected void |
createLexidFile()
creates the final term code to offset file, and the lexicon hash if enabled. |
protected static Class<?>[] |
getInterfaces(Object o)
|
static void |
main(String[] args)
Usage: java org.terrier.structures.merging.StructureMerger [binary bits] [inverted file 1] [inverted file 2] [output inverted file] |
protected void |
mergeDirectFiles()
Merges the two direct files and the corresponding document id files. |
protected void |
mergeDocumentIndexFiles()
Merges the two document index files, and the meta files. |
protected void |
mergeInvertedFiles()
Merges the two lexicons into one. |
void |
mergeStructures()
Merges the structures created by Terrier. |
void |
setOutputIndex(Index _outputIndex)
Sets the output index. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
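The merge methods summarised above combine posting data from two source indices. As a rough illustration of one common way to do this, the sketch below appends the postings of a term from the second index to those from the first, shifting each document id by the number of documents in the first index. This is an assumption about the general technique, not a description of Terrier's code; PostingMergeSketch, Posting, and merge are hypothetical names, and real posting lists are compressed streams rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not part of Terrier): merges two posting lists for one term. */
final class PostingMergeSketch {

    /** A (document id, term frequency) pair. */
    static final class Posting {
        final int docId;
        final int tf;

        Posting(int docId, int tf) {
            this.docId = docId;
            this.tf = tf;
        }
    }

    /**
     * Concatenates the two lists, adding numDocsInIndex1 to every document id
     * that comes from the second index. If both inputs are sorted by docId and
     * all ids in fromIndex1 are below numDocsInIndex1, the result is sorted too.
     */
    static List<Posting> merge(List<Posting> fromIndex1,
                               List<Posting> fromIndex2,
                               int numDocsInIndex1) {
        List<Posting> merged = new ArrayList<>(fromIndex1.size() + fromIndex2.size());
        merged.addAll(fromIndex1);
        for (Posting p : fromIndex2) {
            merged.add(new Posting(p.docId + numDocsInIndex1, p.tf));
        }
        return merged;
    }
}
```

Under this scheme the merged list holds exactly as many postings as the two inputs combined, so a pointer count for the merged structures would be the sum of the two source counts.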
Field Detail |
---|
protected static final org.apache.log4j.Logger logger
protected gnu.trove.TIntIntHashMap termcodeHashmap
protected boolean keepTermCodeMap
protected int numberOfDocuments
protected long numberOfPointers
protected int numberOfTerms
protected boolean MetaReverse
protected Index srcIndex1
protected Index srcIndex2
protected Index destIndex
protected Class<? extends DirectInvertedOutputStream> directFileOutputStreamClass
protected Class<? extends DirectInvertedOutputStream> fieldDirectFileOutputStreamClass
protected Class<? extends DirectInvertedOutputStream> invertedFileOutputStreamClass
protected Class<? extends DirectInvertedOutputStream> fieldInvertedFileOutputStreamClass
protected String directFileInputClass
protected String directFileInputStreamClass
protected String invertedFileInputClass
protected String invertedFileInputStreamClass
protected String basicInvertedIndexPostingIteratorClass
protected String fieldInvertedIndexPostingIteratorClass
protected String basicDirectIndexPostingIteratorClass
protected String fieldDirectIndexPostingIteratorClass
Constructor Detail |
---|
public StructureMerger(Index _srcIndex1, Index _srcIndex2, Index _destIndex)
_srcIndex1 - source index 1
_srcIndex2 - source index 2
_destIndex - destination index
Method Detail |
---|
public void setOutputIndex(Index _outputIndex)
_outputIndex - the index to be merged to
protected void mergeInvertedFiles()
protected void mergeDirectFiles()
protected static Class<?>[] getInterfaces(Object o)
protected void mergeDocumentIndexFiles()
protected void createLexidFile()
public void mergeStructures()
public static void main(String[] args) throws Exception
The [binary bits] argument concerns the number of fields in use in the index.
Throws: Exception