Package org.terrier.compression.bit
Class BitOutputStream
- java.lang.Object
-
- org.terrier.compression.bit.BitOutputStream
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, BitOut
- Direct Known Subclasses:
BitByteOutputStream
public class BitOutputStream extends java.lang.Object implements BitOut
This class provides methods to write compressed integers to an output stream.
The numbers are written into a byte starting from the most significant bit (i.e., left to right). An internal int buffer is used before the bytes are written to the underlying stream, and the bytes are written as 32-bit integers.
- Author:
- Roi Blanco
-
-
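The left-to-right bit order described above can be illustrated with a small stand-alone sketch. MsbFirstPacker and its methods are hypothetical names, not part of Terrier, and the sketch packs bits into single bytes rather than the 32-bit int buffer this class uses internally.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration only; not Terrier's implementation. */
class MsbFirstPacker {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current = 0;   // bits accumulated for the byte in progress
    private int bitOffset = 0; // number of bits already used in that byte (0..7)

    /** Writes the lowest len bits of x, most significant bit first. */
    void writeBits(int x, int len) {
        for (int i = len - 1; i >= 0; i--) {
            int bit = (x >>> i) & 1;
            current |= bit << (7 - bitOffset); // fill the byte from the left
            if (++bitOffset == 8) {            // byte complete: emit it
                out.write(current);
                current = 0;
                bitOffset = 0;
            }
        }
    }

    /** Pads the byte in progress with zeros and returns everything written. */
    byte[] padAndGetBytes() {
        if (bitOffset > 0) {
            out.write(current);
            current = 0;
            bitOffset = 0;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        MsbFirstPacker p = new MsbFirstPacker();
        p.writeBits(0b101, 3); // bits 101
        p.writeBits(0b11, 2);  // bits 11
        byte[] bytes = p.padAndGetBytes();
        // The five bits sit at the left of the byte, padded with zeros: 10111000
        System.out.println(Integer.toBinaryString(bytes[0] & 0xFF));
    }
}
```

In this sketch the pair (number of bytes emitted, bitOffset) plays the role that getByteOffset() and getBitOffset() play in the real class.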
Field Summary
Fields
Modifier and Type | Field | Description
protected int | bitOffset | The bit offset.
protected byte[] | buffer | Writing buffer
protected int | bufferPointer | Pointer for the buffer
protected int | bufferSize | Size of the buffer; it has to be a multiple of 4 (4 * k)
protected long | byteOffset | The byte offset.
protected int | byteToWrite | An int to write to the stream.
protected static int | DEFAULT_SIZE | Default size for the buffer
protected java.io.DataOutputStream | dos | The private output stream used internally.
protected static org.slf4j.Logger | logger | the logger for this class
-
Constructor Summary
Constructors
Constructor | Description
BitOutputStream() | Empty constructor
BitOutputStream(java.io.OutputStream os) | Constructs an instance of the class for a given OutputStream
BitOutputStream(java.lang.String filename) | Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.
-
Method Summary
Modifier and Type | Method | Description
void | append(byte[] toAppend, int len) | Appends a byte array to the current stream.
void | append(byte[] toAppend, int len, byte newByte, int bitswritten) | Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int and the buffer, and then writes the new sequence of bytes.
void | close() | Closes the BitOutputStream.
void | flush() | Deprecated.
byte | getBitOffset() | Returns the bit offset in the last byte.
long | getByteOffset() | Returns the byte offset of the stream.
void | padAndFlush() | Pads the current byte and writes the current int into the buffer.
int | writeBinary(int len, int x) | Writes an integer in binary format to the stream.
int | writeDelta(int x) | Writes an integer x into the stream using delta encoding.
int | writeGamma(int x) | Writes an integer x into the stream using gamma encoding.
int | writeGolomb(int x, int b) | Writes an integer x into the stream using Golomb coding.
int | writeInt(int x, int len) | Writes an integer x into the underlying OutputStream.
int | writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi) | Writes a sequence of integers using interpolative coding.
int | writeMinimalBinary(int x, int b) | Writes an integer x using minimal binary encoding, given an upper bound.
int | writeSkewedGolomb(int x, int b) | Writes an integer x into the stream using skewed-Golomb coding.
int | writeUnary(int x) | Writes an integer x using unary encoding.
-
-
-
Field Detail
-
logger
protected static final org.slf4j.Logger logger
the logger for this class
-
buffer
protected byte[] buffer
Writing buffer
-
bufferPointer
protected int bufferPointer
Pointer for the buffer
-
bufferSize
protected int bufferSize
Size of the buffer; it has to be a multiple of 4 (4 * k)
-
DEFAULT_SIZE
protected static final int DEFAULT_SIZE
Default size for the buffer
- See Also:
- Constant Field Values
-
dos
protected java.io.DataOutputStream dos
The private output stream used internally.
-
byteOffset
protected long byteOffset
The byte offset.
-
bitOffset
protected int bitOffset
The bit offset.
-
byteToWrite
protected int byteToWrite
An int to write to the stream.
-
-
Constructor Detail
-
BitOutputStream
public BitOutputStream()
Empty constructor
-
BitOutputStream
public BitOutputStream(java.io.OutputStream os) throws java.io.IOException
Constructs an instance of the class for a given OutputStream.
- Parameters:
os
- the java.io.OutputStream used for writing
- Throws:
java.io.IOException
- if an I/O error occurs
-
BitOutputStream
public BitOutputStream(java.lang.String filename) throws java.io.IOException
Constructs an instance of the class for a given filename. Note that on a FileNotFoundException, this constructor will sleep for 2 seconds before retrying to open the file.
- Parameters:
filename
- String with the name of the underlying file
- Throws:
java.io.IOException
- if an I/O error occurs
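The sleep-and-retry behaviour noted for this constructor might look roughly like the following sketch. RetryingOpen and open are hypothetical names, and the single retry is an assumption: this page only states that the constructor sleeps for 2 seconds before retrying, not how many attempts it makes.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper; not Terrier's implementation. */
class RetryingOpen {
    /**
     * Opens filename for writing. On a FileNotFoundException it sleeps for
     * sleepMillis and tries once more (the single retry is an assumption).
     */
    static OutputStream open(String filename, long sleepMillis) throws IOException {
        try {
            return new FileOutputStream(filename);
        } catch (FileNotFoundException first) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IOException("Interrupted while waiting to retry", ie);
            }
            return new FileOutputStream(filename); // a second failure propagates
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("bits", ".bf"); // scratch file for the demo
        try (OutputStream os = open(tmp.getPath(), 2000L)) {
            os.write(42);
        }
        System.out.println(tmp.length() + " byte(s) written");
        tmp.delete();
    }
}
```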
-
-
Method Detail
-
getByteOffset
public long getByteOffset()
Returns the byte offset of the stream. It corresponds to the position of the byte in which the next bit will be written.
- Specified by:
getByteOffset
in interface BitOut
- Returns:
- the byte offset in the stream.
-
getBitOffset
public byte getBitOffset()
Returns the bit offset in the last byte. It corresponds to the position in which the next bit will be written.
- Specified by:
getBitOffset
in interface BitOut
- Returns:
- the bit offset in the stream.
-
append
public void append(byte[] toAppend, int len) throws java.io.IOException
Appends a byte array to the current stream. Flushes the current int and the buffer, and then writes the new sequence of bytes.
- Parameters:
toAppend
- the byte[] to be written to the stream.
len
- length in bytes of the byte buffer (number of elements of the array).
- Throws:
java.io.IOException
- if an I/O exception occurs.
-
append
public void append(byte[] toAppend, int len, byte newByte, int bitswritten) throws java.io.IOException
Appends a byte array to the current stream, where the last byte is not fully written. Flushes the current int and the buffer, and then writes the new sequence of bytes.
- Parameters:
toAppend
- the byte[] to be written to the stream.
len
- length in bytes of the byte buffer (number of elements of the array).
newByte
- last byte (the one not fully written)
bitswritten
- number of bits written in the last byte
- Throws:
java.io.IOException
- if an I/O exception occurs.
-
padAndFlush
public void padAndFlush() throws java.io.IOException
Pads the current byte and writes the current int into the buffer. Then, it flushes the buffer to the underlying OutputStream.
- Throws:
java.io.IOException
- if an I/O error occurs.
-
flush
public void flush()
Deprecated.
-
close
public void close() throws java.io.IOException
Closes the BitOutputStream. It flushes the variables and buffer first.
- Specified by:
close
in interface java.lang.AutoCloseable
- Specified by:
close
in interface java.io.Closeable
- Throws:
java.io.IOException
- if an I/O error occurs when closing the underlying OutputStream
-
writeUnary
public int writeUnary(int x) throws java.io.IOException
Writes an integer x using unary encoding. The encoding is a sequence of x - 1 zeros followed by a single one: 1, 01, 001, 0001, etc. This method is not failsafe: it does not check whether the argument is 0 or negative.
- Specified by:
writeUnary
in interface BitOut
- Parameters:
x
- the number to write
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs.
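The unary code described above can be reproduced with a short stand-alone sketch. UnarySketch is a hypothetical name; it returns the code as a string of '0' and '1' characters instead of writing to a stream, and unlike writeUnary it rejects non-positive arguments.

```java
/** Hypothetical illustration; returns the code as a string of '0'/'1'. */
class UnarySketch {
    /** Encodes x >= 1 as x - 1 zeros followed by a single one. */
    static String encode(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive: " + x);
        }
        StringBuilder bits = new StringBuilder(x);
        for (int i = 1; i < x; i++) {
            bits.append('0');
        }
        return bits.append('1').toString();
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 4; x++) {
            System.out.println(x + " -> " + encode(x)); // 1, 01, 001, 0001
        }
    }
}
```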
-
writeGamma
public int writeGamma(int x) throws java.io.IOException
Writes an integer x into the stream using gamma encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.
- Specified by:
writeGamma
in interface BitOut
- Parameters:
x
- the int number to write
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs.
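As a rough illustration of gamma encoding, the sketch below implements the textbook Elias gamma code: the position of the most significant bit of x is written in unary (using the zeros-then-one convention of writeUnary), followed by the remaining low bits of x. GammaSketch is a hypothetical name, and this page does not confirm that writeGamma produces exactly this bit layout.

```java
/** Hypothetical illustration of textbook Elias gamma; not Terrier's code. */
class GammaSketch {
    /** Encodes x >= 1 and returns the code as a string of '0'/'1'. */
    static String encode(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive: " + x);
        }
        int msb = 31 - Integer.numberOfLeadingZeros(x); // floor(log2(x))
        StringBuilder bits = new StringBuilder(2 * msb + 1);
        for (int i = 0; i < msb; i++) {
            bits.append('0');            // unary(msb + 1): msb zeros ...
        }
        bits.append('1');                // ... followed by a one
        for (int i = msb - 1; i >= 0; i--) {
            bits.append((x >>> i) & 1);  // the msb low-order bits of x
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 5; x++) {
            System.out.println(x + " -> " + encode(x)); // 1, 010, 011, 00100, 00101
        }
    }
}
```

A value x costs 2 * floor(log2(x)) + 1 bits under this scheme, which is why gamma coding suits sequences dominated by small numbers.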
-
writeDelta
public int writeDelta(int x) throws java.io.IOException
Writes an integer x into the stream using delta encoding. This method is not failsafe: it does not check whether the argument is 0 or negative.
- Specified by:
writeDelta
in interface BitOut
- Parameters:
x
- the int number to write
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs.
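Delta encoding can be sketched in the same spirit. The code below is the textbook Elias delta code: the bit length of x is written in gamma, followed by the low bits of x. DeltaSketch and its helpers are hypothetical names, and the exact bit layout produced by writeDelta is not confirmed by this page.

```java
/** Hypothetical illustration of textbook Elias delta; not Terrier's code. */
class DeltaSketch {
    /** Appends the n low-order bits of x, most significant first. */
    private static void appendLowBits(StringBuilder bits, int x, int n) {
        for (int i = n - 1; i >= 0; i--) {
            bits.append((x >>> i) & 1);
        }
    }

    /** Textbook Elias gamma code of x >= 1. */
    static String gamma(int x) {
        int msb = 31 - Integer.numberOfLeadingZeros(x); // floor(log2(x))
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < msb; i++) {
            bits.append('0');
        }
        bits.append('1');
        appendLowBits(bits, x, msb);
        return bits.toString();
    }

    /** Encodes x >= 1 as gamma(bit length of x) followed by its low bits. */
    static String encode(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be positive: " + x);
        }
        int msb = 31 - Integer.numberOfLeadingZeros(x);
        StringBuilder bits = new StringBuilder(gamma(msb + 1));
        appendLowBits(bits, x, msb);
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(1));  // 1
        System.out.println(encode(2));  // 0100
        System.out.println(encode(10)); // 00100010
    }
}
```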
-
writeInt
public int writeInt(int x, int len) throws java.io.IOException
Writes an integer x into the underlying OutputStream. First, it checks whether x fits into the current byte being written, and then it writes as many bytes as necessary.
-
writeSkewedGolomb
public int writeSkewedGolomb(int x, int b) throws java.io.IOException
Writes an integer x into the stream using skewed-Golomb coding. Consider a bucket-vector v = <b, 2b, 4b, ... , 2^i b, ...>.
An integer x is coded as unary(k+1), where k is the index such that sum(i=0)(k) v_i > x <= sum(i=0)(k+1), so k = log(x/b + 1) and sum_i = b(2^n - 1) (geometric progression), and the remainder with log(v_k) bits in binary.
If lower = ceil(x/b) -> lower = 2^i * b -> i = log(ceil(x/b)) + 1, the remainder x - sum_i 2^i*b - 1 = x - b(2^n - 1) - 1 is coded with floor(log(v_k)) bits.
This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.
- Specified by:
writeSkewedGolomb
in interface BitOut
- Parameters:
x
- the number to write
b
- the parameter for golomb coding
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs
-
writeInterpolativeCode
public int writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi) throws java.io.IOException
Writes a sequence of integers using interpolative coding. The data must be sorted (increasing order).
- Specified by:
writeInterpolativeCode
in interface BitOut
- Parameters:
data
- the vector containing the integer sequence.
offset
- the offset into data where the sequence starts.
len
- the number of integers to code.
lo
- a lower bound (must be smaller than or equal to the first integer in the sequence).
hi
- an upper bound (must be greater than or equal to the last integer in the sequence).
- Returns:
- the number of written bits.
- Throws:
java.io.IOException
- if an I/O error occurs.
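The recursive idea behind interpolative coding can be sketched as follows: the middle element is coded relative to the range it can still occupy, and the two halves are then coded with tightened bounds. InterpolativeSketch is a hypothetical name. The sketch assumes strictly increasing data and writes each value in plain fixed-width binary, which is a simplification and not necessarily the code this class uses for each value.

```java
/** Hypothetical illustration of binary interpolative coding; not Terrier's code. */
class InterpolativeSketch {
    /** Appends v, with 0 <= v < range, in ceil(log2(range)) bits (none if range is 1). */
    static void writeFixed(int v, int range, StringBuilder out) {
        int width = 32 - Integer.numberOfLeadingZeros(range - 1);
        for (int i = width - 1; i >= 0; i--) {
            out.append((v >>> i) & 1);
        }
    }

    /**
     * Encodes data[offset .. offset + len - 1], assumed strictly increasing
     * and contained in [lo, hi], appending the bits to out.
     */
    static void encode(int[] data, int offset, int len, int lo, int hi, StringBuilder out) {
        if (len == 0) {
            return;
        }
        int half = len / 2;
        int mid = data[offset + half];
        // half smaller values precede mid and len - half - 1 larger values
        // follow it, so mid can only lie in [min, max].
        int min = lo + half;
        int max = hi - (len - half - 1);
        writeFixed(mid - min, max - min + 1, out);
        encode(data, offset, half, lo, mid - 1, out);                      // left half
        encode(data, offset + half + 1, len - half - 1, mid + 1, hi, out); // right half
    }

    public static void main(String[] args) {
        StringBuilder bits = new StringBuilder();
        encode(new int[] {2, 5, 6, 9}, 0, 4, 1, 10, bits);
        System.out.println(bits); // 011110110
    }
}
```

When the bounds leave only one possible value, as for {1, 2, 3} with lo = 1 and hi = 3, the sketch emits no bits at all, which is the main attraction of the technique for dense runs.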
-
writeGolomb
public int writeGolomb(int x, int b) throws java.io.IOException
Writes an integer x into the stream using Golomb coding. This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.
- Specified by:
writeGolomb
in interface BitOut
- Parameters:
x
- the number to write
b
- the parameter for golomb coding
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs
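A minimal sketch of Golomb coding with modulus b: the quotient is written in unary and the remainder in minimal binary. GolombSketch is a hypothetical name. Treating x as 1-based (so that x - 1 is split into quotient and remainder) is an assumption suggested by the note that 0 is not a valid argument; the exact convention used by writeGolomb is not stated on this page.

```java
/** Hypothetical illustration of Golomb coding; not Terrier's code. */
class GolombSketch {
    /** Appends the n low-order bits of v, most significant first. */
    private static void appendBits(StringBuilder bits, int v, int n) {
        for (int i = n - 1; i >= 0; i--) {
            bits.append((v >>> i) & 1);
        }
    }

    /** Minimal (truncated) binary code of r, with 0 <= r < b. */
    static String minimalBinary(int r, int b) {
        int k = 31 - Integer.numberOfLeadingZeros(b); // floor(log2(b))
        int shortCodes = (1 << (k + 1)) - b;          // values that get k bits
        StringBuilder bits = new StringBuilder();
        if (r < shortCodes) {
            appendBits(bits, r, k);
        } else {
            appendBits(bits, r + shortCodes, k + 1);
        }
        return bits.toString();
    }

    /** Encodes x >= 1 with modulus b >= 1 (1-based x is an assumption). */
    static String encode(int x, int b) {
        if (x < 1 || b < 1) {
            throw new IllegalArgumentException("x and b must be positive");
        }
        int q = (x - 1) / b; // quotient, written as unary(q + 1)
        int r = (x - 1) % b; // remainder, written in minimal binary
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < q; i++) {
            bits.append('0');
        }
        bits.append('1');
        return bits.append(minimalBinary(r, b)).toString();
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 5; x++) {
            System.out.println(x + " -> " + encode(x, 3)); // 10, 110, 111, 010, 0110
        }
    }
}
```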
-
writeMinimalBinary
public int writeMinimalBinary(int x, int b) throws java.io.IOException
Writes an integer x using minimal binary encoding, given an upper bound. This method is not failsafe: it does not check whether the argument is 0 or negative.
- Specified by:
writeMinimalBinary
in interface BitOut
- Parameters:
x
- the number to write
b
- a strict upper bound for x
- Returns:
- the number of bits written
- Throws:
java.io.IOException
- if an I/O error occurs.
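Minimal binary encoding (also known as truncated binary) spends floor(log2(b)) bits on the smallest values and one more bit on the rest. The sketch below is the textbook construction for 0 <= x < b; MinimalBinarySketch is a hypothetical name, and reading "strict bound" as an exclusive upper bound is an assumption.

```java
/** Hypothetical illustration of minimal binary coding; not Terrier's code. */
class MinimalBinarySketch {
    /** Appends the n low-order bits of v, most significant first. */
    private static void appendBits(StringBuilder bits, int v, int n) {
        for (int i = n - 1; i >= 0; i--) {
            bits.append((v >>> i) & 1);
        }
    }

    /** Encodes x with 0 <= x < b and returns the code as a string of '0'/'1'. */
    static String encode(int x, int b) {
        if (b < 1 || x < 0 || x >= b) {
            throw new IllegalArgumentException("need 0 <= x < b");
        }
        int k = 31 - Integer.numberOfLeadingZeros(b); // floor(log2(b))
        int shortCodes = (1 << (k + 1)) - b;          // how many values fit in k bits
        StringBuilder bits = new StringBuilder();
        if (x < shortCodes) {
            appendBits(bits, x, k);                   // short code: k bits
        } else {
            appendBits(bits, x + shortCodes, k + 1);  // long code: k + 1 bits
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        for (int x = 0; x < 5; x++) {
            System.out.println(x + " -> " + encode(x, 5)); // 00, 01, 10, 110, 111
        }
    }
}
```

When b is a power of two, every value receives exactly log2(b) bits and the code reduces to plain fixed-width binary.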
-
writeBinary
public int writeBinary(int len, int x) throws java.io.IOException
Writes an integer in binary format to the stream.
- Specified by:
writeBinary
in interface BitOut
- Parameters:
len
- size in bits of the number.
x
- the integer to write.
- Returns:
- the number of bits written.
- Throws:
java.io.IOException
- if an I/O error occurs.
-
-