Package org.terrier.compression

Provides implementations of random access files and of input and output streams in which unary, gamma, delta, binary and Golomb encoded integers can be read or written.

Interface Summary
BitIn Interface describing the read compression methods supported by the BitFileBuffered and BitInputStream classes.
BitInSeekable Interface for reading a bit compressed file in a random access manner.
BitOut Interface describing the writing compression methods supported by the BitOutputStream classes.
BitWritable Like org.apache.hadoop.io.Writable, but using BitIn and BitOut.
 

Class Summary
BitByteOutputStream An implementation of BitOutputStream that does no buffering.
BitFile Deprecated. Use BitFileBuffered and BitOutputStream instead
BitFileBuffered Implementation of BitInSeekable/BitIn interfaces similar to BitFile.
BitFileBuffered.BitInBuffered Implements a BitIn around a RandomDataInput
BitFileInMemory Class which enables a bit compressed file to be read wholly into memory and accessed from there with low latency.
BitFileInMemoryLarge Allows access to bit compressed files that are loaded entirely into memory.
BitInputStream Reads integers, coded with different encoding algorithms, from a file or an InputStream.
BitOutputStream Provides methods to write compressed integers to an output stream. The numbers are written into a byte starting from the most significant bit (i.e., left to right).
BitUtilities Utility methods for use in the BitFile classes.
DebuggingBitIn This class provides debugging at the bit stream level.
LinkedBuffer Implements an element of a linked list that contains a byte array
MemoryLinkedOutputStream This class implements an OutputStream that writes everything in memory, and never flushes the data to disk.
MemoryOutputStream This class extends an ordinary OutputStream to transparently handle writes in memory.
MemorySBOS This class extends BitByteOutputStream, so it provides the compression writing functions, but it uses a MemoryOutputStream as the underlying OutputStream, so it needs to be flushed to disk separately.
 

Package org.terrier.compression Description

Provides implementations of random access files and of input and output streams in which unary, gamma, delta, binary and Golomb encoded integers can be read or written.
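
To make the encodings named above concrete, the following is a minimal sketch that prints unary, gamma and delta codes as bit strings. It assumes one common textbook convention (unary of x is x - 1 one-bits followed by a zero-bit); the exact bit layout used by the classes in this package may differ. The class name BitCodesSketch and its methods are hypothetical and are not part of Terrier.

```java
public class BitCodesSketch {
    // Unary code of x >= 1 under the assumed convention: (x - 1) one-bits, then a zero-bit.
    static String unary(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) sb.append('1');
        return sb.append('0').toString();
    }

    // The lowest n bits of x, most significant bit first.
    static String lowBits(int x, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = n - 1; i >= 0; i--) sb.append((x >>> i) & 1);
        return sb.toString();
    }

    // Gamma code: unary(floor(log2 x) + 1), then the bits of x below its leading one.
    static String gamma(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        return unary(n + 1) + lowBits(x, n);
    }

    // Delta code: gamma(floor(log2 x) + 1), then the bits of x below its leading one.
    static String delta(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        return gamma(n + 1) + lowBits(x, n);
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 9; x++) {
            System.out.println(x + "\t" + unary(x) + "\t" + gamma(x) + "\t" + delta(x));
        }
    }
}
```

Under these assumptions, 5 is coded as 11110 in unary, 11001 in gamma and 10101 in delta. All three codes are defined only for integers of 1 or more.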

Reading and Writing Stream Examples

Streams of compressed integers are written and read using the BitOutputStream and BitInputStream classes, while the general contracts are specified by the BitOut and BitIn interfaces.

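//Golomb coding parameter
final int GOLOM_B = 10;
//exclusive upper bound on the integers written; an arbitrary value chosen for this example
final int number = 1000;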

//write a bit compressed stream to the file test.bf
BitOut out = new BitOutputStream("test" + BitIn.USUAL_EXTENSION);
//note that the numbers written must be greater than 0. The result for writing numbers less
//than 1 is undefined
for(int i=1;i<number;i++)
{
 //unary, gamma, delta, and int write compressed integers
 out.writeUnary(i);
 out.writeGamma(i);
 out.writeDelta(i);
 out.writeInt(i);
 //write a number given knowledge of how large it can be
 out.writeMinimalBinary(i, number);
 out.writeGolomb(i, GOLOM_B);
 out.writeSkewedGolomb(i, GOLOM_B);
 //get the position. This is used for creating pointers into the bit file
 long byteOffset = out.getByteOffset();
 byte bitOffset = out.getBitOffset();
}
out.close();

//now read in the compressed stream
BitIn in = new BitInputStream("test" + BitIn.USUAL_EXTENSION);
for(int i=1;i<number;i++)
{
 int num;
 //unary, gamma, delta, and int read compressed integers
 num = in.readUnary();
 num = in.readGamma();
 num = in.readDelta();
 //assumed read counterpart of writeInt; check the BitIn Javadoc for the exact method
 num = in.readInt();
 //read a number given knowledge of how large it can be
 num = in.readMinimalBinary(number);
 num = in.readGolomb(GOLOM_B);
 num = in.readSkewedGolomb(GOLOM_B);
 //get the position. This is used for creating pointers into the bit file
 long byteOffset = in.getByteOffset();
 byte bitOffset = in.getBitOffset();
 //save or write the pointer for later use
}
in.close();
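
The byte offset and bit offset recorded in the loops above act as a pointer to the position where the next integer starts. The following is a minimal, self-contained sketch of that idea. It fills each byte from the most significant bit, as described for BitOutputStream, and assumes a gamma code made of a run of one-bits, a zero-bit and the remaining low bits. The class GammaBitWriterSketch is hypothetical and does not reproduce the Terrier implementation.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class GammaBitWriterSketch {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private int current = 0;   // the partially filled byte
    private int bitOffset = 0; // bits already used in the current byte (0..7)

    // Places one bit into the current byte, most significant bit first.
    void writeBit(int bit) {
        current |= (bit & 1) << (7 - bitOffset);
        if (++bitOffset == 8) {
            bytes.write(current);
            current = 0;
            bitOffset = 0;
        }
    }

    // Assumed gamma convention: n one-bits, a zero-bit, then the low n bits of x,
    // where n = floor(log2 x). Only defined for x >= 1.
    void writeGamma(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be >= 1");
        int n = 31 - Integer.numberOfLeadingZeros(x);
        for (int i = 0; i < n; i++) writeBit(1);
        writeBit(0);
        for (int i = n - 1; i >= 0; i--) writeBit((x >>> i) & 1);
    }

    // Number of completed bytes, i.e. the index of the byte currently being filled.
    long getByteOffset() { return bytes.size(); }

    // Position of the next free bit inside the current byte.
    byte getBitOffset() { return (byte) bitOffset; }

    // All bytes written so far; a trailing partial byte is padded with zero-bits.
    byte[] toByteArray() {
        byte[] done = bytes.toByteArray();
        if (bitOffset == 0) return done;
        byte[] padded = Arrays.copyOf(done, done.length + 1);
        padded[done.length] = (byte) current;
        return padded;
    }

    public static void main(String[] args) {
        GammaBitWriterSketch out = new GammaBitWriterSketch();
        for (int x : new int[] {5, 3, 9}) {
            System.out.println(x + " starts at byte " + out.getByteOffset()
                + ", bit " + out.getBitOffset());
            out.writeGamma(x);
        }
        for (byte b : out.toByteArray()) System.out.printf("%02X ", b);
        System.out.println();
    }
}
```

With these assumptions, the integers 5, 3 and 9 start at (byte 0, bit 0), (byte 0, bit 5) and (byte 1, bit 0), and the resulting bytes are CD E2.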

Random Access Reading

As an alternative to reading and writing streams, a BitInSeekable implementation can be used to access a random point within a bit compressed file. In general, BitFileBuffered is the preferred BitInSeekable implementation, but BitFileInMemory and BitFileInMemoryLarge are also available for keeping files in memory.

BitInSeekable bitFile = new BitFileBuffered("test" + BitIn.USUAL_EXTENSION);
//position to seek to
long byteOffset = ?;
byte bitOffset = ?;
BitIn in = bitFile.readReset(byteOffset, bitOffset);
int num;
num = in.readUnary();
num = in.readGamma();
num = in.readDelta();
//assumed read counterpart of writeInt; check the BitIn Javadoc for the exact method
num = in.readInt();
//read a number given knowledge of how large it can be
num = in.readMinimalBinary(number);
num = in.readGolomb(GOLOM_B);
num = in.readSkewedGolomb(GOLOM_B);
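
To illustrate what seeking to a (byte offset, bit offset) pair involves, here is a minimal in-memory sketch. It assumes bytes are read from the most significant bit and that a gamma code is a run of one-bits, a zero-bit and then the remaining low bits. Under that assumption, the two bytes 0xCD 0xE2 hold the gamma codes of 5, 3 and 9 back to back. The class SeekableBitsSketch and its seek method are hypothetical stand-ins and are not the BitInSeekable API.

```java
public class SeekableBitsSketch {
    private final byte[] data;
    private int bytePos; // index of the byte being read
    private int bitPos;  // next bit within that byte, 0 = most significant

    SeekableBitsSketch(byte[] data) { this.data = data; }

    // Repositions the reader at an absolute (byte, bit) pointer.
    void seek(long byteOffset, byte bitOffset) {
        bytePos = (int) byteOffset;
        bitPos = bitOffset;
    }

    // Reads one bit, most significant bit of each byte first.
    int readBit() {
        int bit = (data[bytePos] >>> (7 - bitPos)) & 1;
        if (++bitPos == 8) {
            bitPos = 0;
            bytePos++;
        }
        return bit;
    }

    // Assumed gamma convention: count one-bits up to the zero-bit to get n,
    // then read n further bits below an implicit leading one.
    int readGamma() {
        int n = 0;
        while (readBit() == 1) n++;
        int x = 1;
        for (int i = 0; i < n; i++) x = (x << 1) | readBit();
        return x;
    }

    public static void main(String[] args) {
        SeekableBitsSketch in = new SeekableBitsSketch(new byte[] {(byte) 0xCD, (byte) 0xE2});
        in.seek(0L, (byte) 5); // pointer to the second integer
        System.out.println(in.readGamma()); // prints 3
        in.seek(1L, (byte) 0); // pointer to the third integer
        System.out.println(in.readGamma()); // prints 9
    }
}
```

Because each pointer marks where a code starts, any integer can be decoded without reading the ones before it, which is the access pattern that BitInSeekable is described as supporting.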



Terrier 3.5. Copyright © 2004-2011 University of Glasgow