public class RunsMerger extends Object
Merges a set of sorted runs using a queue (heap) of RunIterator objects,
 each one pointing at a different run on disk. Each run is sorted, so each merging step only needs to compare the
 heads of the runs held in the queue.
 As the runs are being merged, they are written (to disk) using a BitOut.

| Modifier and Type | Class and Description |
|---|---|
| static class | RunsMerger.PostingComparator - Implements a comparator for RunIterators (so it can be used by the queue). |
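
The merging idea described above (a queue ordered by the head of each sorted run) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Terrier's implementation: `KWayMergeSketch` and `RunCursor` are hypothetical names, runs are in-memory lists of strings rather than postings on disk, and the comparator stands in for the role that RunsMerger.PostingComparator plays for the real queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of a multiway merge; not part of Terrier. */
class KWayMergeSketch {

    /** Hypothetical stand-in for a run iterator: wraps one sorted run and exposes its head. */
    static final class RunCursor {
        private final Iterator<String> it;
        private String head;

        RunCursor(List<String> sortedRun) {
            this.it = sortedRun.iterator();
            advance();
        }

        String head() { return head; }

        boolean exhausted() { return head == null; }

        void advance() { head = it.hasNext() ? it.next() : null; }
    }

    /** Merges already-sorted runs by repeatedly taking the run with the smallest head. */
    static List<String> merge(List<List<String>> runs) {
        // Only the heads are compared; the rest of each run is already in order.
        Comparator<RunCursor> byHead = Comparator.comparing(RunCursor::head);
        PriorityQueue<RunCursor> queue = new PriorityQueue<>(byHead);
        for (List<String> run : runs) {
            RunCursor cursor = new RunCursor(run);
            if (!cursor.exhausted()) {
                queue.add(cursor);
            }
        }
        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            RunCursor smallest = queue.poll();
            merged.add(smallest.head());
            smallest.advance();
            if (!smallest.exhausted()) {
                queue.add(smallest); // re-insert so the queue sees the new head
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
                Arrays.asList("apple", "kiwi"),
                Arrays.asList("banana", "pear"),
                Arrays.asList("cherry"));
        System.out.println(merge(runs)); // prints [apple, banana, cherry, kiwi, pear]
    }
}
```

Each cursor is removed from the queue before it is advanced and re-inserted afterwards, so the queue never holds an element whose ordering key has changed underneath it.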

| Modifier and Type | Field and Description |
|---|---|
| protected BitOut | bos - BitOut used to write the merged postings to disk |
| protected int | currentTerm - Number of terms written |
| protected int | lastDocFreq - Last document's frequency |
| protected int | lastDocument - Last document written in the stream |
| protected int | lastFreq - Frequency in the run of the last term merged |
| protected String | lastTermWritten - Last term written to disk (useful for terms appearing in multiple runs) |
| protected RunIterator | myRun - RunReader reference for merging |
| protected int | numberOfPointers - Number of pointers written |
| protected Queue<RunIterator> | queue - Heap for the postings coming from different runs. |
| protected RunIteratorFactory | runsSource |
| protected BitFilePosition | startOffset |
| protected LexiconEntry | termStatistics |

| Constructor and Description |
|---|
| RunsMerger(RunIteratorFactory _runsSource) - Constructor |

| Modifier and Type | Method and Description |
|---|---|
| void | beginMerge(int size, String fileName) - Begins the multiway merging phase. |
| void | endMerge(LexiconOutputStream<String> lexStream) - Ends the merging phase, writes the last entry and closes the streams. |
| byte | getBitOffset() |
| BitOut | getBos() - Getter for bos. |
| long | getByteOffset() |
| int | getLastDocFreq() |
| int | getLastFreq() |
| String | getLastTermWritten() |
| int | getNumberOfPointers() |
| int | getNumberOfTerms() |
| protected void | init(int size, BitOut invertedFile) |
| protected void | init(int size, String fileName) - Begins the merge, initialising the structures. |
| boolean | isDone() - Indicates whether the merging is done or not. |
| void | mergeOne(LexiconOutputStream<String> lexStream) - Merges one term in the runs. |
| void | setBos(BitOut _bos) - Setter for bos. |
| void | setLastTermWritten(String _lastTermWritten) - Setter for the last term written. |

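
The method summary suggests a lifecycle of beginMerge, repeated mergeOne calls until isDone returns true, and a final endMerge. The sketch below imitates that loop with a self-contained toy class so it can run without Terrier. Everything in it is an assumption made for illustration: `MergeLoopSketch`, `ToyMerger`, `Posting` and `Run` are hypothetical, the "inverted file" is a string instead of a BitOut, no lexicon stream is written, and ties between runs are broken by run number on the assumption that earlier runs hold lower document ids.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of a term-at-a-time merge loop; not part of Terrier. */
class MergeLoopSketch {

    /** One posting in a run: term, document id, in-document frequency. */
    static final class Posting {
        final String term; final int doc; final int freq;
        Posting(String term, int doc, int freq) { this.term = term; this.doc = doc; this.freq = freq; }
    }

    /** A sorted run plus its number, used to break ties between equal terms. */
    static final class Run {
        final int id; final Deque<Posting> postings;
        Run(int id, List<Posting> postings) { this.id = id; this.postings = new ArrayDeque<>(postings); }
        String headTerm() { return postings.peekFirst().term; }
    }

    /** Toy stand-in that mimics the beginMerge / mergeOne / isDone / endMerge lifecycle. */
    static final class ToyMerger {
        private final PriorityQueue<Run> queue = new PriorityQueue<>(
                Comparator.comparing(Run::headTerm).thenComparingInt(r -> r.id));
        private final StringBuilder out = new StringBuilder(); // stands in for the inverted file
        private int terms;
        private int pointers;

        void beginMerge(List<List<Posting>> runs) {
            int id = 0;
            for (List<Posting> run : runs) {
                if (!run.isEmpty()) queue.add(new Run(id, run));
                id++;
            }
        }

        boolean isDone() { return queue.isEmpty(); }

        /** Writes every posting of the smallest pending term, across all runs that contain it. */
        void mergeOne() {
            String term = queue.peek().headTerm();
            out.append(term).append(':');
            while (!queue.isEmpty() && queue.peek().headTerm().equals(term)) {
                Run run = queue.poll();
                while (!run.postings.isEmpty() && run.headTerm().equals(term)) {
                    Posting p = run.postings.pollFirst();
                    out.append(" (").append(p.doc).append(',').append(p.freq).append(')');
                    pointers++;
                }
                if (!run.postings.isEmpty()) queue.add(run); // run still has later terms
            }
            out.append('\n');
            terms++;
        }

        String endMerge() { return out.toString(); }
        int getNumberOfTerms() { return terms; }
        int getNumberOfPointers() { return pointers; }
    }

    public static void main(String[] args) {
        ToyMerger merger = new ToyMerger();
        merger.beginMerge(Arrays.asList(
                Arrays.asList(new Posting("apple", 1, 2), new Posting("cat", 2, 1)),
                Arrays.asList(new Posting("apple", 5, 1), new Posting("bird", 6, 3))));
        while (!merger.isDone()) {
            merger.mergeOne();
        }
        System.out.print(merger.endMerge());
    }
}
```

In this toy, "apple" appears in both runs and is written once with both postings, which is the situation the lastTermWritten field description refers to (terms appearing in multiple runs).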
protected Queue<RunIterator> queue
protected BitOut bos
protected String lastTermWritten
protected LexiconEntry termStatistics
protected int lastFreq
protected int lastDocument
protected int lastDocFreq
protected RunIterator myRun
protected int currentTerm
protected int numberOfPointers
protected BitFilePosition startOffset
protected RunIteratorFactory runsSource
public RunsMerger(RunIteratorFactory _runsSource)
_runsSource -
public int getLastFreq()
public int getLastDocFreq()
public int getNumberOfTerms()
public int getNumberOfPointers()
public boolean isDone()
public long getByteOffset()
public byte getBitOffset()
public String getLastTermWritten()
public void setLastTermWritten(String _lastTermWritten)
_lastTermWritten - String with the last term written.
protected void init(int size, String fileName) throws Exception
size - number of runs on disk.
fileName - String with the file name of the final inverted file.
IOException - if an I/O error occurs.
Exception
public void beginMerge(int size, String fileName) throws Exception
size - number of runs to be merged.
fileName - output filename.
Exception - if an I/O error occurs.
public void mergeOne(LexiconOutputStream<String> lexStream) throws Exception
lexStream - LexiconOutputStream used to write the lexicon.
Exception - if an I/O error occurs.
public void endMerge(LexiconOutputStream<String> lexStream) throws IOException
lexStream - LexiconOutputStream used to write the lexicon.
IOException - if an I/O error occurs.
public BitOut getBos()
public void setBos(BitOut _bos)
_bos -
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow