Terrier IR Platform
2.2.1

uk.ac.gla.terrier.compression
Class BitOutputStream

java.lang.Object
  extended by uk.ac.gla.terrier.compression.BitOutputStream
All Implemented Interfaces:
java.io.Closeable, BitOut
Direct Known Subclasses:
BitByteOutputStream, OldBitOutputStream

public class BitOutputStream
extends java.lang.Object
implements BitOut

This class provides methods to write compressed integers to an output stream.
The numbers are written into a byte starting from the most significant bit (i.e., left to right). An internal int buffer is used before writing the bytes to the underlying stream, and the bytes are written as 32-bit integers.
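
As a rough illustration of the most-significant-bit-first layout described above, the following standalone sketch packs bits into bytes from left to right. It is not Terrier's implementation: the class MsbFirstWriterSketch and its methods writeBits and finish are hypothetical names, and for simplicity it buffers a single byte at a time rather than a 32-bit int.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not part of the Terrier API.
class MsbFirstWriterSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current = 0; // bits collected so far for the byte in progress
    private int used = 0;    // how many bits of the byte in progress are filled (0..7)

    /** Writes the len low-order bits of x, most significant bit first. */
    void writeBits(int x, int len) {
        for (int i = len - 1; i >= 0; i--) {
            current = (current << 1) | ((x >>> i) & 1);
            if (++used == 8) {       // byte is full: emit it and start a new one
                out.write(current);
                current = 0;
                used = 0;
            }
        }
    }

    /** Pads the byte in progress with zero bits and returns all bytes written. */
    byte[] finish() {
        if (used > 0) {
            out.write(current << (8 - used));
            current = 0;
            used = 0;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        MsbFirstWriterSketch w = new MsbFirstWriterSketch();
        w.writeBits(5, 3); // bits 101
        w.writeBits(1, 2); // bits 01
        for (byte b : w.finish()) {
            System.out.println(String.format("%02x", b & 0xff));
        }
    }
}
```

In this sketch, finish() plays a role loosely comparable to padding the last byte before flushing: the unused low-order bits of the final byte are zero.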

Author:
Roi Blanco

Constructor Summary
BitOutputStream()
          Empty constructor
BitOutputStream(java.io.OutputStream os)
          Constructs an instance of the class for a given OutputStream
BitOutputStream(java.lang.String filename)
          Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.
 
Method Summary
 void append(byte[] toAppend, int len)
          Appends a byte array to the current stream.
 void append(byte[] toAppend, int len, byte newByte, int bitswritten)
          Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int, the buffer and then writes the new sequence of bytes.
 void close()
          Closes the BitOutputStream.
 void flush()
          Deprecated.  
 byte getBitOffset()
          Returns the bit offset in the last byte.
 long getByteOffset()
          Returns the byte offset of the stream.
 void padAndFlush()
          Pads the current byte and writes the current int into the buffer.
 int writeBinary(int len, int x)
          Writes an integer in binary format to the stream.
 int writeDelta(int x)
          Writes an integer x into the stream using delta encoding.
 int writeGamma(int x)
          Writes an integer x into the stream using gamma encoding.
 int writeGolomb(int x, int b)
          Writes an integer x into the stream using Golomb coding.
 int writeInt(int x, int len)
          Writes an integer x into the underlying OutputStream.
 int writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi)
          Writes a sequence of integers using interpolative coding.
 int writeMinimalBinary(int x, int b)
          Writes an integer x using minimal binary encoding, given an upper bound.
 int writeSkewedGolomb(int x, int b)
          Writes an integer x into the stream using skewed-Golomb coding.
 int writeUnary(int x)
          Writes an integer x using unary encoding.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BitOutputStream

public BitOutputStream()
Empty constructor


BitOutputStream

public BitOutputStream(java.io.OutputStream os)
                throws java.io.IOException
Constructs an instance of the class for a given OutputStream

Parameters:
os - the java.io.OutputStream used for writing
Throws:
java.io.IOException - if an I/O error occurs

BitOutputStream

public BitOutputStream(java.lang.String filename)
                throws java.io.IOException
Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.

Parameters:
filename - String with the name of the underlying file
Throws:
java.io.IOException - if an I/O error occurs
Method Detail

getByteOffset

public long getByteOffset()
Returns the byte offset of the stream. It corresponds to the position of the byte in which the next bit will be written.

Specified by:
getByteOffset in interface BitOut
Returns:
the byte offset in the stream.

getBitOffset

public byte getBitOffset()
Returns the bit offset in the last byte. It corresponds to the position in which the next bit will be written.

Specified by:
getBitOffset in interface BitOut
Returns:
the bit offset in the stream.

writeUnary

public int writeUnary(int x)
               throws java.io.IOException
Writes an integer x using unary encoding. The encoding is a sequence of x - 1 zeros followed by a single one: 1, 01, 001, 0001, etc. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeUnary in interface BitOut
Parameters:
x - the number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
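
The unary code described above (x - 1 zeros followed by a one) can be sketched as a standalone helper that returns the code as a string of '0' and '1' characters. The class UnarySketch and the method unary are hypothetical names used for illustration only. Unlike writeUnary, the sketch rejects non-positive arguments.

```java
// Hypothetical illustration only; not part of the Terrier API.
class UnarySketch {
    /** Returns the unary code of x (x >= 1): x - 1 zeros followed by a one. */
    static String unary(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 4; x++) {
            System.out.println(x + " -> " + unary(x));
        }
    }
}
```

The length of the returned string is x, which is the number of bits such a code occupies.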

writeGamma

public int writeGamma(int x)
               throws java.io.IOException
Writes an integer x into the stream using gamma encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeGamma in interface BitOut
Parameters:
x - the int number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
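
This page does not spell out the gamma code, so the following is a minimal sketch of the textbook Elias gamma code, assuming the unary convention documented for writeUnary (zeros followed by a one). With n = floor(log2 x), the code is unary(n + 1) followed by the n low-order bits of x. The class GammaSketch and its methods are hypothetical names, and the sketch is not guaranteed to match Terrier's bit layout.

```java
// Hypothetical illustration only; not part of the Terrier API.
class GammaSketch {
    /** Returns the textbook Elias gamma code of x (x >= 1) as a bit string. */
    static String gamma(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive");
        }
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('0');                 // unary(n + 1): n zeros ...
        }
        sb.append('1');                     // ... followed by a one
        for (int i = n - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);       // the n low-order bits of x
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 5; x++) {
            System.out.println(x + " -> " + gamma(x));
        }
    }
}
```

Under these assumptions a value x takes 2 * floor(log2 x) + 1 bits, so small values are cheap and the cost grows logarithmically.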

writeDelta

public int writeDelta(int x)
               throws java.io.IOException
Writes an integer x into the stream using delta encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.

Parameters:
x - the int number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
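
As with gamma, the delta code is not defined on this page. The sketch below follows the textbook Elias delta code: with n = floor(log2 x), it emits the gamma code of n + 1 followed by the n low-order bits of x. The class DeltaSketch and its helpers are hypothetical names, and the output is an assumption about the general technique rather than a statement about Terrier's exact bit layout.

```java
// Hypothetical illustration only; not part of the Terrier API.
class DeltaSketch {
    /** Returns the len low-order bits of x, most significant bit first. */
    static String lowBits(int x, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }

    /** Textbook Elias gamma code: floor(log2 x) zeros, a one, then the low bits. */
    static String gamma(int x) {
        int n = 31 - Integer.numberOfLeadingZeros(x);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('0');
        }
        return sb.append('1').append(lowBits(x, n)).toString();
    }

    /** Textbook Elias delta code: gamma(floor(log2 x) + 1), then the low bits of x. */
    static String delta(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive");
        }
        int n = 31 - Integer.numberOfLeadingZeros(x);
        return gamma(n + 1) + lowBits(x, n);
    }

    public static void main(String[] args) {
        for (int x : new int[] {1, 2, 5, 8}) {
            System.out.println(x + " -> " + delta(x));
        }
    }
}
```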

writeInt

public int writeInt(int x,
                    int len)
             throws java.io.IOException
Writes an integer x into the underlying OutputStream. First, it checks whether x fits into the current byte being written, and then it writes as many bytes as necessary.

Parameters:
x - the int to write
len - length of the int in bits
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.

flush

public void flush()
Deprecated. 


close

public void close()
           throws java.io.IOException
Closes the BitOutputStream. It flushes the variables and buffer first.

Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException - if an I/O error occurs when closing the underlying OutputStream

writeSkewedGolomb

public int writeSkewedGolomb(int x,
                             int b)
                      throws java.io.IOException
Writes an integer x into the stream using skewed-Golomb coding. Consider a bucket vector v = (b, 2b, 4b, ..., 2^i b, ...).
An integer x is coded as unary(k+1), where k is the index such that sum_{i=0..k} v_i < x <= sum_{i=0..k+1} v_i, followed by the remainder in binary using log(v_k) bits.
Here k = log(x/b + 1), and sum_i = b(2^n - 1) (geometric progression). So if lower = ceil(x/b), then lower = 2^i * b and i = log(ceil(x/b)) + 1; the remainder x - sum_i 2^i*b - 1 = x - b(2^n - 1) - 1 is coded with floor(log(v_k)) bits.
This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.

Parameters:
x - the number to write
b - the parameter for golomb coding
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs

writeInterpolativeCode

public int writeInterpolativeCode(int[] data,
                                  int offset,
                                  int len,
                                  int lo,
                                  int hi)
                           throws java.io.IOException
Writes a sequence of integers using interpolative coding. The data must be sorted (increasing order).

Parameters:
data - the vector containing the integer sequence.
offset - the offset into data where the sequence starts.
len - the number of integers to code.
lo - a lower bound (must be smaller than or equal to the first integer in the sequence).
hi - an upper bound (must be greater than or equal to the last integer in the sequence).
Returns:
the number of written bits.
Throws:
java.io.IOException - if an I/O error occurs.
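
The parameters above match the usual recursive formulation of binary interpolative coding, which can be sketched as follows. The sketch assumes a strictly increasing sequence within [lo, hi] and assumes that each middle element is written with a minimal (truncated) binary code over its feasible range. The class InterpolativeSketch and its methods are hypothetical names, and the encoding order (middle, then left half, then right half) is an assumption, not a confirmed detail of Terrier's implementation.

```java
// Hypothetical illustration only; not part of the Terrier API.
class InterpolativeSketch {
    /** Returns the len low-order bits of x, most significant bit first. */
    static String bits(int x, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }

    /** Minimal (truncated) binary code of x in [0, b). */
    static String minimalBinary(int x, int b) {
        int log2b = 31 - Integer.numberOfLeadingZeros(b);
        int m = (1 << (log2b + 1)) - b; // values below m need only log2b bits
        return x < m ? bits(x, log2b) : bits(x + m, log2b + 1);
    }

    /** Appends the code of data[offset .. offset+len-1], all within [lo, hi]. */
    static void encode(int[] data, int offset, int len, int lo, int hi, StringBuilder out) {
        if (len == 0) {
            return;
        }
        if (len == 1) {
            out.append(minimalBinary(data[offset] - lo, hi - lo + 1));
            return;
        }
        int h = len / 2;
        int mid = data[offset + h];
        // h smaller values precede mid and len-h-1 larger values follow it,
        // so mid must lie in [lo + h, hi - (len - h - 1)].
        int low = lo + h;
        int high = hi - (len - h - 1);
        out.append(minimalBinary(mid - low, high - low + 1));
        encode(data, offset, h, lo, mid - 1, out);
        encode(data, offset + h + 1, len - h - 1, mid + 1, hi, out);
    }

    static String encode(int[] data, int lo, int hi) {
        StringBuilder out = new StringBuilder();
        encode(data, 0, data.length, lo, hi, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new int[] {2, 5, 7}, 1, 8));
    }
}
```

A useful property of this formulation is that a value whose feasible range has a single element costs zero bits, which is why tight lo and hi bounds matter.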

writeGolomb

public int writeGolomb(int x,
                       int b)
                throws java.io.IOException
Writes an integer x into the stream using Golomb coding. This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.

Parameters:
x - the number to write
b - the parameter for golomb coding
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs
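
The Golomb code itself is not defined on this page. A common formulation for positive integers, sketched below, writes the quotient q = (x - 1) / b in unary as unary(q + 1) and the remainder x - 1 - q*b in minimal binary with bound b. The class GolombSketch and its helpers are hypothetical names, and the exact split between quotient and remainder is an assumption about the general technique rather than a confirmed detail of writeGolomb.

```java
// Hypothetical illustration only; not part of the Terrier API.
class GolombSketch {
    /** Returns the len low-order bits of x, most significant bit first. */
    static String bits(int x, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }

    /** Unary code: x - 1 zeros followed by a one. */
    static String unary(int x) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    /** Minimal (truncated) binary code of x in [0, b). */
    static String minimalBinary(int x, int b) {
        int log2b = 31 - Integer.numberOfLeadingZeros(b);
        int m = (1 << (log2b + 1)) - b; // values below m need only log2b bits
        return x < m ? bits(x, log2b) : bits(x + m, log2b + 1);
    }

    /** Golomb code of x >= 1 with parameter b >= 1 (assumed formulation). */
    static String golomb(int x, int b) {
        if (x < 1 || b < 1) {
            throw new IllegalArgumentException("x and b must be positive");
        }
        int q = (x - 1) / b;       // quotient, written in unary
        int r = x - 1 - q * b;     // remainder in [0, b), written in minimal binary
        return unary(q + 1) + minimalBinary(r, b);
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 6; x++) {
            System.out.println(x + " -> " + golomb(x, 3));
        }
    }
}
```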

writeMinimalBinary

public int writeMinimalBinary(int x,
                              int b)
                       throws java.io.IOException
Writes an integer x using minimal binary encoding, given an upper bound. This method is not failsafe: it does not check whether the argument is 0 or negative.

Parameters:
x - the number to write
b - a strict upper bound for x
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
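
Minimal binary coding is commonly implemented as truncated binary coding: for a strict upper bound b, the smallest values use floor(log2 b) bits and the rest use one bit more, so no code word is wasted when b is not a power of two. The sketch below assumes that formulation and accepts x in [0, b). The class MinimalBinarySketch and its methods are hypothetical names, not Terrier code.

```java
// Hypothetical illustration only; not part of the Terrier API.
class MinimalBinarySketch {
    /** Returns the len low-order bits of x, most significant bit first. */
    static String bits(int x, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }

    /** Truncated binary code of x, where 0 <= x < b. */
    static String minimalBinary(int x, int b) {
        if (b < 1 || x < 0 || x >= b) {
            throw new IllegalArgumentException("require 0 <= x < b");
        }
        int log2b = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int m = (1 << (log2b + 1)) - b;  // number of values that get the short code
        if (x < m) {
            return bits(x, log2b);       // short code: floor(log2 b) bits
        }
        return bits(x + m, log2b + 1);   // long code: one extra bit
    }

    public static void main(String[] args) {
        for (int x = 0; x < 5; x++) {
            System.out.println(x + " -> " + minimalBinary(x, 5));
        }
    }
}
```

With b = 5, three values receive 2-bit codes and two values receive 3-bit codes, and the resulting set of code words is prefix-free.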

append

public void append(byte[] toAppend,
                   int len)
            throws java.io.IOException
Appends a byte array to the current stream. Flushes the current int, the buffer and then writes the new sequence of bytes.

Parameters:
toAppend - the byte[] to be written to the stream.
len - length in bytes of the byte buffer (number of elements of the array).
Throws:
java.io.IOException - if an I/O exception occurs.

append

public void append(byte[] toAppend,
                   int len,
                   byte newByte,
                   int bitswritten)
            throws java.io.IOException
Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int, the buffer and then writes the new sequence of bytes.

Parameters:
toAppend - the byte[] to be written to the stream.
len - length in bytes of the byte buffer (number of elements of the array).
newByte - last byte (the one not fully written)
bitswritten - number of bits written in the last byte
Throws:
java.io.IOException - if an I/O exception occurs.

padAndFlush

public void padAndFlush()
                 throws java.io.IOException
Pads the current byte and writes the current int into the buffer. Then, it flushes the buffer to the underlying OutputStream.

Throws:
java.io.IOException - if an I/O error occurs.

writeBinary

public int writeBinary(int len,
                       int x)
                throws java.io.IOException
Writes an integer in binary format to the stream.

Specified by:
writeBinary in interface BitOut
Parameters:
len - size in bits of the number.
x - the integer to write.
Returns:
the number of bits written.
Throws:
java.io.IOException - if an I/O error occurs.


Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow