org.terrier.compression
Class BitOutputStream

java.lang.Object
  extended by org.terrier.compression.BitOutputStream
All Implemented Interfaces:
java.io.Closeable, BitOut
Direct Known Subclasses:
BitByteOutputStream

public class BitOutputStream
extends java.lang.Object
implements BitOut

This class provides methods to write compressed integers to an output stream.
Bits are written into each byte starting from the most significant bit (i.e., left to right). An internal int buffer is used before the bytes are written to the underlying stream, and the bytes are packed into 32-bit integers.
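
The most-significant-bit-first layout described above can be illustrated with a minimal, self-contained sketch. It is not Terrier's implementation: the class name MsbFirstBitWriter and its methods are hypothetical, and it accumulates bits in a single byte rather than the 32-bit int the real class uses, purely to keep the example short.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of Terrier.
final class MsbFirstBitWriter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current = 0;   // bits accumulated for the byte in progress
    private int bitOffset = 0; // number of bits already used in that byte (0..7)

    /** Writes the len lowest bits of x, most significant bit first. */
    void writeBits(int x, int len) {
        for (int i = len - 1; i >= 0; i--) {
            int bit = (x >>> i) & 1;
            current |= bit << (7 - bitOffset); // fill the byte from the left
            if (++bitOffset == 8) {            // byte complete: emit it
                out.write(current);
                current = 0;
                bitOffset = 0;
            }
        }
    }

    /** Pads the byte in progress with zeros and returns all bytes written. */
    byte[] padAndGetBytes() {
        if (bitOffset > 0) {
            out.write(current);
            current = 0;
            bitOffset = 0;
        }
        return out.toByteArray();
    }

    /** Number of complete bytes emitted so far. */
    int getByteOffset() { return out.size(); }

    /** Position inside the current byte where the next bit will go. */
    int getBitOffset() { return bitOffset; }
}
```

Writing the 3-bit value 5 (101) followed by a single 1 bit fills the first four positions of the byte, so padding yields the single byte 1011 0000 (0xB0). Before padding, the byte offset is 0 and the bit offset is 4.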

Author:
Roi Blanco

Field Summary
protected  int bitOffset
          The bit offset.
protected  byte[] buffer
          Writing buffer
protected  int bufferPointer
          Pointer for the buffer
protected  int bufferSize
          Size of the buffer; it has to be 4 * k (a multiple of 4)
protected  long byteOffset
          The byte offset.
protected  int byteToWrite
          An int to write to the stream.
protected static int DEFAULT_SIZE
          Default size for the buffer
protected  java.io.DataOutputStream dos
          The private output stream used internally.
protected static org.apache.log4j.Logger logger
          the logger for this class
 
Constructor Summary
BitOutputStream()
          Empty constructor
BitOutputStream(java.io.OutputStream os)
          Constructs an instance of the class for a given OutputStream
BitOutputStream(java.lang.String filename)
          Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.
 
Method Summary
 void append(byte[] toAppend, int len)
          Appends a byte array to the current stream.
 void append(byte[] toAppend, int len, byte newByte, int bitswritten)
          Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int and the buffer, then writes the new sequence of bytes.
 void close()
          Closes the BitOutputStream.
 void flush()
          Deprecated.  
 byte getBitOffset()
          Returns the bit offset in the last byte.
 long getByteOffset()
          Returns the byte offset of the stream.
 void padAndFlush()
          Pads the current byte and writes the current int into the buffer.
 int writeBinary(int len, int x)
          Writes an integer in binary format to the stream.
 int writeDelta(int x)
          Writes an integer x into the stream using delta encoding.
 int writeGamma(int x)
          Writes an integer x into the stream using gamma encoding.
 int writeGolomb(int x, int b)
          Writes an integer x into the stream using Golomb coding.
 int writeInt(int x, int len)
          Writes an integer x into the underlying OutputStream.
 int writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi)
          Writes a sequence of integers using interpolative coding.
 int writeMinimalBinary(int x, int b)
          Writes an integer x using minimal binary encoding, given an upper bound.
 int writeSkewedGolomb(int x, int b)
          Writes an integer x into the stream using skewed-Golomb coding.
 int writeUnary(int x)
          Writes an integer x using unary encoding.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

protected static final org.apache.log4j.Logger logger
the logger for this class


buffer

protected byte[] buffer
Writing buffer


bufferPointer

protected int bufferPointer
Pointer for the buffer


bufferSize

protected int bufferSize
Size of the buffer; it has to be 4 * k (a multiple of 4)


DEFAULT_SIZE

protected static final int DEFAULT_SIZE
Default size for the buffer

See Also:
Constant Field Values

dos

protected java.io.DataOutputStream dos
The private output stream used internally.


byteOffset

protected long byteOffset
The byte offset.


bitOffset

protected int bitOffset
The bit offset.


byteToWrite

protected int byteToWrite
An int to write to the stream.

Constructor Detail

BitOutputStream

public BitOutputStream()
Empty constructor


BitOutputStream

public BitOutputStream(java.io.OutputStream os)
                throws java.io.IOException
Constructs an instance of the class for a given OutputStream

Parameters:
os - the java.io.OutputStream used for writing
Throws:
java.io.IOException - if an I/O error occurs

BitOutputStream

public BitOutputStream(java.lang.String filename)
                throws java.io.IOException
Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.

Parameters:
filename - String with the name of the underlying file
Throws:
java.io.IOException - if an I/O error occurs
Method Detail

getByteOffset

public long getByteOffset()
Returns the byte offset of the stream. It corresponds to the position of the byte in which the next bit will be written.

Specified by:
getByteOffset in interface BitOut
Returns:
the byte offset in the stream.

getBitOffset

public byte getBitOffset()
Returns the bit offset in the last byte. It corresponds to the position in which the next bit will be written.

Specified by:
getBitOffset in interface BitOut
Returns:
the bit offset in the stream.

append

public void append(byte[] toAppend,
                   int len)
            throws java.io.IOException
Appends a byte array to the current stream. Flushes the current int and the buffer, then writes the new sequence of bytes.

Parameters:
toAppend - the byte[] to be written to the stream.
len - length in bytes of the byte buffer (number of elements of the array).
Throws:
java.io.IOException - if an I/O exception occurs.

append

public void append(byte[] toAppend,
                   int len,
                   byte newByte,
                   int bitswritten)
            throws java.io.IOException
Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int and the buffer, then writes the new sequence of bytes.

Parameters:
toAppend - the byte[] to be written to the stream.
len - length in bytes of the byte buffer (number of elements of the array).
newByte - last byte (the one not fully written)
bitswritten - number of bits written in the last byte
Throws:
java.io.IOException - if an I/O exception occurs.

padAndFlush

public void padAndFlush()
                 throws java.io.IOException
Pads the current byte and writes the current int into the buffer. Then, it flushes the buffer to the underlying OutputStream.

Throws:
java.io.IOException - if an I/O error occurs.

flush

public void flush()
Deprecated. 


close

public void close()
           throws java.io.IOException
Closes the BitOutputStream. It flushes the variables and buffer first.

Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException - if an I/O error occurs when closing the underlying OutputStream

writeUnary

public int writeUnary(int x)
               throws java.io.IOException
Writes an integer x using unary encoding. The encoding is a sequence of x - 1 zeros followed by a single one: 1, 01, 001, 0001, etc. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeUnary in interface BitOut
Parameters:
x - the number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
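
The unary code described above can be reproduced with a short standalone sketch that builds the code as a string of '0' and '1' characters instead of writing to a stream. The class name UnarySketch is hypothetical, and unlike writeUnary this sketch does validate its argument.

```java
// Hypothetical illustration class; not part of Terrier.
final class UnarySketch {
    /** Returns the unary code of x (x >= 1): x - 1 zeros followed by a one. */
    static String unary(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        StringBuilder sb = new StringBuilder(x);
        for (int i = 1; i < x; i++) sb.append('0');
        return sb.append('1').toString();
    }
}
```

For instance, unary(1) is "1" and unary(4) is "0001", so the code for x always occupies exactly x bits.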

writeGamma

public int writeGamma(int x)
               throws java.io.IOException
Writes an integer x into the stream using gamma encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeGamma in interface BitOut
Parameters:
x - the int number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
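
As an illustration, here is a minimal sketch of the standard Elias gamma code: the position of the most significant bit of x, plus one, in unary (zeros followed by a one), then the remaining lower bits of x in binary. That writeGamma produces exactly this bit layout is an assumption, and the class name GammaSketch is hypothetical.

```java
// Hypothetical illustration class; not part of Terrier.
final class GammaSketch {
    /** Returns the Elias gamma code of x (x >= 1) as a string of bits. */
    static String gamma(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        int msb = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        StringBuilder sb = new StringBuilder(2 * msb + 1);
        for (int i = 0; i < msb; i++) sb.append('0');   // unary(msb + 1): msb zeros...
        sb.append('1');                                 // ...followed by a one
        for (int i = msb - 1; i >= 0; i--) {            // lower msb bits of x
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }
}
```

Under these assumptions gamma(1) is "1", gamma(2) is "010" and gamma(5) is "00101"; the code length is 2 * floor(log2 x) + 1 bits.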

writeDelta

public int writeDelta(int x)
               throws java.io.IOException
Writes an integer x into the stream using delta encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeDelta in interface BitOut
Parameters:
x - the int number to write
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
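
A minimal sketch of the standard Elias delta code follows: the bit length of x is written in gamma code, followed by the lower bits of x in binary. That writeDelta uses exactly this layout is an assumption, and the class name DeltaSketch is hypothetical.

```java
// Hypothetical illustration class; not part of Terrier.
final class DeltaSketch {
    /** Appends the len lowest bits of value to sb, most significant bit first. */
    private static void appendBits(StringBuilder sb, int value, int len) {
        for (int i = len - 1; i >= 0; i--) sb.append((value >>> i) & 1);
    }

    /** Elias gamma code of x (x >= 1). */
    static String gamma(int x) {
        int msb = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < msb; i++) sb.append('0');
        sb.append('1');
        appendBits(sb, x, msb);
        return sb.toString();
    }

    /** Elias delta code of x (x >= 1): gamma(msb + 1), then the lower msb bits of x. */
    static String delta(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        int msb = 31 - Integer.numberOfLeadingZeros(x);
        StringBuilder sb = new StringBuilder(gamma(msb + 1));
        appendBits(sb, x, msb);
        return sb.toString();
    }
}
```

Under these assumptions delta(1) is "1", delta(5) is "01101" and delta(9) is "00100001". Delta codes are longer than gamma codes for small values but shorter for large ones, because the length prefix grows logarithmically rather than linearly.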

writeInt

public int writeInt(int x,
                    int len)
             throws java.io.IOException
Writes an integer x into the underlying OutputStream. First, it checks whether x fits into the byte currently being written, and then it writes as many bytes as necessary.

Specified by:
writeInt in interface BitOut
Parameters:
x - the int to write
len - length of the int in bits
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.

writeSkewedGolomb

public int writeSkewedGolomb(int x,
                             int b)
                      throws java.io.IOException
Writes an integer x into the stream using skewed-Golomb coding. Consider a bucket vector v = <b, 2b, 4b, ... , 2^i b, ...>. An integer x is coded as unary(k+1), where k is the index such that sum(i=0)(k) v_i < x <= sum(i=0)(k+1) v_i, so k = log(x/b + 1), using sum_i = b(2^n - 1) (geometric progression), followed by the remainder in binary with log(v_k) bits.
If lower = ceil(x/b), then lower = 2^i * b, so i = log(ceil(x/b)) + 1.
The remainder x - sum_i 2^i*b - 1 = x - b(2^n - 1) - 1 is coded with floor(log(v_k)) bits.
This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.

Specified by:
writeSkewedGolomb in interface BitOut
Parameters:
x - the number to write
b - the parameter for golomb coding
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs

writeInterpolativeCode

public int writeInterpolativeCode(int[] data,
                                  int offset,
                                  int len,
                                  int lo,
                                  int hi)
                           throws java.io.IOException
Writes a sequence of integers using interpolative coding. The data must be sorted (increasing order).

Specified by:
writeInterpolativeCode in interface BitOut
Parameters:
data - the vector containing the integer sequence.
offset - the offset into data where the sequence starts.
len - the number of integers to code.
lo - a lower bound (must be smaller than or equal to the first integer in the sequence).
hi - an upper bound (must be greater than or equal to the last integer in the sequence).
Returns:
the number of written bits.
Throws:
java.io.IOException - if an I/O error occurs.
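
The following sketch shows one common formulation of binary interpolative coding: the middle element of the sequence is written in minimal binary within the range its neighbours allow, and the two halves are then coded recursively with tightened bounds. It assumes strictly increasing values and produces a string of bits. Whether writeInterpolativeCode follows exactly this recursion is an assumption, and the class name InterpolativeSketch and its helpers are hypothetical.

```java
// Hypothetical illustration class; not part of Terrier.
final class InterpolativeSketch {
    /** Returns the len lowest bits of value, most significant bit first. */
    static String bits(int value, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = len - 1; i >= 0; i--) sb.append((value >>> i) & 1);
        return sb.toString();
    }

    /** Minimal binary code of x, where 0 <= x < b. */
    static String minimalBinary(int x, int b) {
        int log2b = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int m = (1 << (log2b + 1)) - b;                   // number of short codewords
        return x < m ? bits(x, log2b) : bits(x + m, log2b + 1);
    }

    /** Codes len strictly increasing values from data[offset], all within [lo, hi]. */
    static String encode(int[] data, int offset, int len, int lo, int hi) {
        if (len == 0) return "";
        if (len == 1) return minimalBinary(data[offset] - lo, hi - lo + 1);
        int h = len / 2;
        int m = data[offset + h];      // middle element
        int low = lo + h;              // h smaller values must fit below it
        int high = hi - (len - h - 1); // len - h - 1 larger values must fit above it
        return minimalBinary(m - low, high - low + 1)
            + encode(data, offset, h, lo, m - 1)
            + encode(data, offset + h + 1, len - h - 1, m + 1, hi);
    }
}
```

A dense run costs nothing under this scheme: encoding {1, 2, 3, 4} with lo = 1 and hi = 4 yields the empty string, because every element is fully determined by the bounds. Encoding {1, 3} with the same bounds yields "100".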

writeGolomb

public int writeGolomb(int x,
                       int b)
                throws java.io.IOException
Writes an integer x into the stream using Golomb coding. This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.

Specified by:
writeGolomb in interface BitOut
Parameters:
x - the number to write
b - the parameter for golomb coding
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs
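
A minimal sketch of Golomb coding for positive integers follows: the quotient of (x - 1) by b is written in unary and the remainder in minimal binary with bound b. The 1-based convention (quotient computed from x - 1) and the exact bit layout are assumptions about writeGolomb, and the class name GolombSketch and its helpers are hypothetical.

```java
// Hypothetical illustration class; not part of Terrier.
final class GolombSketch {
    /** Returns the len lowest bits of value, most significant bit first. */
    static String bits(int value, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = len - 1; i >= 0; i--) sb.append((value >>> i) & 1);
        return sb.toString();
    }

    /** Unary code of x (x >= 1): x - 1 zeros followed by a one. */
    static String unary(int x) {
        StringBuilder sb = new StringBuilder(x);
        for (int i = 1; i < x; i++) sb.append('0');
        return sb.append('1').toString();
    }

    /** Minimal binary code of x, where 0 <= x < b. */
    static String minimalBinary(int x, int b) {
        int log2b = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int m = (1 << (log2b + 1)) - b;                   // number of short codewords
        return x < m ? bits(x, log2b) : bits(x + m, log2b + 1);
    }

    /** Golomb code of x (x >= 1) with parameter b (b >= 1). */
    static String golomb(int x, int b) {
        if (x < 1 || b < 1) throw new IllegalArgumentException("x and b must be >= 1");
        int q = (x - 1) / b;     // quotient, written in unary as q + 1
        int r = x - q * b - 1;   // remainder in [0, b)
        return unary(q + 1) + minimalBinary(r, b);
    }
}
```

Under these assumptions golomb(1, 3) is "10", golomb(5, 3) is "0110" and golomb(4, 4) is "111".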

writeMinimalBinary

public int writeMinimalBinary(int x,
                              int b)
                       throws java.io.IOException
Writes an integer x using minimal binary encoding, given an upper bound. This method is not failsafe: it does not check whether the argument is 0 or negative.

Specified by:
writeMinimalBinary in interface BitOut
Parameters:
x - the number to write
b - a strict upper bound for x
Returns:
the number of bits written
Throws:
java.io.IOException - if an I/O error occurs.
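
Minimal binary (also called truncated binary) coding assigns shorter codewords to the smallest values when the bound b is not a power of two. The sketch below shows one common formulation for 0 <= x < b. That writeMinimalBinary uses exactly this assignment of codewords is an assumption, and the class name MinimalBinarySketch is hypothetical.

```java
// Hypothetical illustration class; not part of Terrier.
final class MinimalBinarySketch {
    /** Returns the len lowest bits of value, most significant bit first. */
    static String bits(int value, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = len - 1; i >= 0; i--) sb.append((value >>> i) & 1);
        return sb.toString();
    }

    /** Minimal binary code of x, where 0 <= x < b and b >= 1. */
    static String minimalBinary(int x, int b) {
        if (b < 1 || x < 0 || x >= b) {
            throw new IllegalArgumentException("need 0 <= x < b");
        }
        int log2b = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int m = (1 << (log2b + 1)) - b;                   // number of short codewords
        // The first m values take log2b bits; the rest take log2b + 1 bits.
        return x < m ? bits(x, log2b) : bits(x + m, log2b + 1);
    }
}
```

With b = 5 the values 0, 1 and 2 are coded in two bits ("00", "01", "10"), while 3 and 4 take three bits ("110", "111"); no codeword is a prefix of another. When b is a power of two every value takes exactly log2(b) bits.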

writeBinary

public int writeBinary(int len,
                       int x)
                throws java.io.IOException
Writes an integer in binary format to the stream.

Specified by:
writeBinary in interface BitOut
Parameters:
len - size in bits of the number.
x - the integer to write.
Returns:
the number of bits written.
Throws:
java.io.IOException - if an I/O error occurs.


Terrier 3.5. Copyright © 2004-2011 University of Glasgow