Terrier IR Platform
1.1.1

uk.ac.gla.terrier.structures.merging
Class StructureMerger

java.lang.Object
  extended by uk.ac.gla.terrier.structures.merging.StructureMerger
Direct Known Subclasses:
BlockStructureMerger

public class StructureMerger
extends java.lang.Object

Merges the index structures created by Terrier, so that an index consists of fewer, larger inverted and direct files.

Version:
$Revision: 1.17 $
Author:
Vassilis Plachouras

Constructor Summary
StructureMerger(java.lang.String _filename1, java.lang.String _filename2)
          A constructor that sets the filenames of the inverted files to merge
 
Method Summary
static void main(java.lang.String[] args)
          Usage: java uk.ac.gla.terrier.structures.merging.StructureMerger [binary bits] [inverted file 1] [inverted file 2] [output inverted file]
 void mergeStructures()
          Merges the structures created by Terrier.
 void setNumberOfBits(int bits)
          Sets the number of bits to write or read for binary encoded numbers
 void setOutputFilename(java.lang.String _outputName)
          Sets the output filename of the merged inverted file
static void writeFieldPostings(int[][] postings, int firstId, BitOutputStream output, int binaryBits)
          Writes the given postings to a bit file.
static void writeNoFieldPostings(int[][] postings, int firstId, BitOutputStream output)
          Writes the given postings to a bit file.
static void writePostings(int[][] postings, int firstId, BitOutputStream output, int binaryBits)
          Writes the given postings to a bit file.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StructureMerger

public StructureMerger(java.lang.String _filename1,
                       java.lang.String _filename2)
A constructor that sets the filenames of the inverted files to merge

Parameters:
_filename1 - the first inverted file to merge
_filename2 - the second inverted file to merge

Method Detail

setNumberOfBits

public void setNumberOfBits(int bits)
Sets the number of bits to write or read for binary encoded numbers

Parameters:
bits - the number of bits to write or read

setOutputFilename

public void setOutputFilename(java.lang.String _outputName)
Sets the output filename of the merged inverted file

Parameters:
_outputName - the filename of the merged inverted file

writePostings

public static void writePostings(int[][] postings,
                                 int firstId,
                                 BitOutputStream output,
                                 int binaryBits)
                          throws java.io.IOException
Writes the given postings to a bit file. Depending on the value of the parameter binaryBits, this method calls either writeFieldPostings or writeNoFieldPostings.

Parameters:
postings - the postings list to write.
firstId - the first identifier to write. This can be an id plus one, or the gap of the current id and the previous one.
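output - the output bit file.
binaryBits - the number of bits to write for binary encoded numbers.
Throws:
java.io.IOException - if an error occurs during writing to a file.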

writeFieldPostings

public static void writeFieldPostings(int[][] postings,
                                      int firstId,
                                      BitOutputStream output,
                                      int binaryBits)
                               throws java.io.IOException
Writes the given postings to a bit file. This method assumes that field information is available as well.

Parameters:
postings - the postings list to write.
firstId - the first identifier to write. This can be an id plus one, or the gap of the current id and the previous one.
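output - the output bit file.
binaryBits - the number of bits to write for binary encoded numbers.
Throws:
java.io.IOException - if an error occurs during writing to a file.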

writeNoFieldPostings

public static void writeNoFieldPostings(int[][] postings,
                                        int firstId,
                                        BitOutputStream output)
                                 throws java.io.IOException
Writes the given postings to a bit file. This method assumes that field information is not available.

Parameters:
postings - the postings list to write.
firstId - the first identifier to write. This can be an id plus one, or the gap of the current id and the previous one.
output - the output bit file.
Throws:
java.io.IOException - if an error occurs during writing to a file.
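
The firstId parameter is described as either "an id plus one" or "the gap of the current id and the previous one". The following is a minimal sketch of that gap computation only. It assumes the ids of a postings list are stored in ascending order in a single int array, which this page does not state, and it does not use BitOutputStream. The class GapSketch and the method toGaps are hypothetical names, not part of Terrier.

```java
import java.util.Arrays;

/** Hypothetical illustration of gap computation; not part of Terrier. */
final class GapSketch {

    /**
     * Returns the values a writer would emit for ascending ids:
     * firstId for the first posting, then the gap to the previous id.
     */
    static int[] toGaps(int[] ids, int firstId) {
        int[] out = new int[ids.length];
        if (ids.length == 0) {
            return out;
        }
        out[0] = firstId;
        for (int i = 1; i < ids.length; i++) {
            out[i] = ids[i] - ids[i - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ids = {3, 7, 8, 20};
        // Case 1: nothing was written before, so pass the first id plus one
        // (presumably so that an id of 0 still yields a positive value).
        System.out.println(Arrays.toString(toGaps(ids, ids[0] + 1)));
        // Case 2: a previous id of 1 was already written, so pass the gap.
        System.out.println(Arrays.toString(toGaps(ids, ids[0] - 1)));
    }
}
```

With the ids above, the first call yields 4, 4, 1, 12 and the second yields 2, 4, 1, 12; only the first value differs between the two cases.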

mergeStructures

public void mergeStructures()
Merges the structures created by Terrier.
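
This page does not describe how the merge proceeds. As a rough illustration of one common technique, the sketch below combines two postings lists for the same term by appending the second list to the first and shifting its ids by the number of documents in the first index. This is an assumption about the general approach, not a description of Terrier's code. The layout (row 0 holds ids, row 1 holds frequencies) and the names PostingsMergeSketch, merge and docsInFirst are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical illustration of merging two postings lists; not part of Terrier. */
final class PostingsMergeSketch {

    /**
     * Appends the second list after the first, shifting its ids by
     * docsInFirst so that ids stay unique and ascending.
     * Assumed layout: row 0 holds ids, row 1 holds frequencies.
     */
    static int[][] merge(int[][] first, int[][] second, int docsInFirst) {
        int n1 = first[0].length;
        int n2 = second[0].length;
        int[][] merged = new int[2][n1 + n2];
        for (int row = 0; row < 2; row++) {
            System.arraycopy(first[row], 0, merged[row], 0, n1);
            System.arraycopy(second[row], 0, merged[row], n1, n2);
        }
        // Renumber the ids that came from the second index.
        for (int i = n1; i < n1 + n2; i++) {
            merged[0][i] += docsInFirst;
        }
        return merged;
    }

    public static void main(String[] args) {
        int[][] a = {{0, 2}, {1, 3}}; // ids 0 and 2 in a 3-document index
        int[][] b = {{1, 2}, {5, 1}}; // ids 1 and 2 in the second index
        int[][] m = merge(a, b, 3);
        System.out.println(Arrays.toString(m[0])); // [0, 2, 4, 5]
        System.out.println(Arrays.toString(m[1])); // [1, 3, 5, 1]
    }
}
```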


main

public static void main(java.lang.String[] args)
Usage: java uk.ac.gla.terrier.structures.merging.StructureMerger [binary bits] [inverted file 1] [inverted file 2] [output inverted file]

The binary bits argument corresponds to the number of fields in use in the index.
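
The usage line lists four positional arguments. The sketch below shows one way such arguments could be validated and how they would presumably map onto the constructor and setters documented on this page. MergerArgsSketch and its parse method are hypothetical, the file names are placeholders, and the calls to StructureMerger appear only in comments because they require Terrier on the classpath.

```java
/** Hypothetical holder for the four command-line arguments; not part of Terrier. */
final class MergerArgsSketch {
    final int binaryBits;
    final String input1;
    final String input2;
    final String output;

    private MergerArgsSketch(int binaryBits, String input1, String input2, String output) {
        this.binaryBits = binaryBits;
        this.input1 = input1;
        this.input2 = input2;
        this.output = output;
    }

    /** Parses [binary bits] [inverted file 1] [inverted file 2] [output inverted file]. */
    static MergerArgsSketch parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "expected: [binary bits] [inverted file 1] [inverted file 2] [output inverted file]");
        }
        // Integer.parseInt throws NumberFormatException, a subclass of
        // IllegalArgumentException, when the first argument is not a number.
        int bits = Integer.parseInt(args[0]);
        if (bits < 0) {
            throw new IllegalArgumentException("binary bits must not be negative");
        }
        return new MergerArgsSketch(bits, args[1], args[2], args[3]);
    }

    public static void main(String[] args) {
        // Placeholder file names, chosen for illustration only.
        MergerArgsSketch a = parse(new String[] {"2", "index1.if", "index2.if", "merged.if"});
        System.out.println(a.binaryBits + " " + a.input1 + " " + a.input2 + " " + a.output);
        // With Terrier available, these values would presumably feed the documented calls:
        //   StructureMerger merger = new StructureMerger(a.input1, a.input2);
        //   merger.setNumberOfBits(a.binaryBits);
        //   merger.setOutputFilename(a.output);
        //   merger.mergeStructures();
    }
}
```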


Terrier Information Retrieval Platform 1.1.1. Copyright 2004-2007 University of Glasgow