Package org.terrier.compression.bit

Provides random-access, input stream and output stream implementations for reading and writing integers encoded with unary, gamma, delta, binary and Golomb codes.

Reading and Writing Stream Examples

Streams of compressed integers can be written and read using the BitOutputStream and BitInputStream classes; the general contracts are specified by the BitOut and BitIn interfaces.

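//Golomb coding parameter
final int GOLOM_B = 10;
//exclusive upper bound on the integers written; 100 is an arbitrary illustrative value
final int number = 100;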

//write a bit compressed stream to the file test.bf
BitOut out = new BitOutputStream("test" + BitIn.USUAL_EXTENSION);
//note that the numbers written must be greater than 0. The result for writing numbers less
//than 1 is undefined
for(int i=1;i<number;i++)
{
 //unary, gamma, delta, and int write compressed integers
 out.writeUnary(i);
 out.writeGamma(i);
 out.writeDelta(i);
 out.writeInt(i);
 //write a number given knowledge of how large it can be
 out.writeMinimalBinary(i, number);
 out.writeGolomb(i, GOLOM_B);
 out.writeSkewedGolomb(i, GOLOM_B);
 //get the position. This is used for creating pointers into the bit file
 long byteOffset = out.getByteOffset();
 byte bitOffset = out.getBitOffset();
}
out.close();

//now read in the compressed stream
BitIn in = new BitInputStream("test" + BitIn.USUAL_EXTENSION);
for(int i=1;i<number;i++)
{
 int num;
 //unary, gamma, delta, and int read compressed integers, in the order they were written
 num = in.readUnary();
 num = in.readGamma();
 num = in.readDelta();
 num = in.readInt();
 //read a number given knowledge of how large it can be
 num = in.readMinimalBinary(number);
 num = in.readGolomb(GOLOM_B);
 num = in.readSkewedGolomb(GOLOM_B);
 //get the position. This is used for creating pointers into the bit file
 long byteOffset = in.getByteOffset();
 byte bitOffset = in.getBitOffset();
 //save or write the pointer for later use
}
in.close();
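
The requirement that written numbers be greater than 0 follows from how these codes are defined. As a rough illustration, the sketch below renders the textbook unary, gamma and delta codes as strings of '0' and '1'. It is not Terrier code: the class name BitCodeSketch and its methods are hypothetical, and the bit layout Terrier actually writes (for example the polarity of the unary bits) may differ from this convention.

```java
// Textbook unary, gamma and delta codes rendered as strings of '0' and '1'.
// Illustration only: BitCodeSketch is hypothetical and not part of Terrier,
// and the bit layout Terrier writes may differ.
class BitCodeSketch {

    // Unary code of x >= 1: (x - 1) one bits followed by a single zero bit.
    static String unary(int x) {
        requirePositive(x);
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) {
            sb.append('1');
        }
        return sb.append('0').toString();
    }

    // The lowest `count` bits of x, most significant bit first.
    static String lowBits(int x, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = count - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1);
        }
        return sb.toString();
    }

    // Gamma code: unary code of (floor(log2 x) + 1), then x without its leading 1 bit.
    static String gamma(int x) {
        requirePositive(x);
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        return unary(n + 1) + lowBits(x, n);
    }

    // Delta code: gamma code of (floor(log2 x) + 1), then x without its leading 1 bit.
    static String delta(int x) {
        requirePositive(x);
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        return gamma(n + 1) + lowBits(x, n);
    }

    // None of the three codes has a code word for zero or for negative values.
    static void requirePositive(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("value must be >= 1: " + x);
        }
    }

    public static void main(String[] args) {
        for (int x : new int[] {1, 2, 9}) {
            System.out.println(x + " unary=" + unary(x)
                    + " gamma=" + gamma(x) + " delta=" + delta(x));
        }
    }
}
```

In this convention small values get short code words, which is why these codes suit streams dominated by small positive integers.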

Reading with Random Access

As an alternative to reading and writing streams, a BitInSeekable implementation can be used to read from an arbitrary position within a bit-compressed file. BitFileBuffered is generally the preferred BitInSeekable implementation; BitFileInMemory and BitFileInMemoryLarge are also available for keeping files in memory.

BitInSeekable bitFile = new BitFileBuffered("test" + BitIn.USUAL_EXTENSION);
//position to seek to
long byteOffset = ?;
byte bitOffset = ?;
BitIn in = bitFile.readReset(byteOffset, bitOffset);
int num;
num = in.readUnary();
num = in.readGamma();
num = in.readDelta();
num = in.readInt();
//read a number given knowledge of how large it can be
num = in.readMinimalBinary(number);
num = in.readGolomb(GOLOM_B);
num = in.readSkewedGolomb(GOLOM_B);
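
The byte offset and bit offset used for seeking together identify a single bit position in the file. One way to carry such a pointer around between writing and reading is a small value class, sketched below. This helper is hypothetical and not part of Terrier: the name BitPointerSketch and its methods are invented for illustration, and it assumes the bit offset counts bits within a byte in the range 0 to 7.

```java
// Hypothetical helper, not part of Terrier: a pointer into a bit file,
// held as a byte offset plus a bit offset within that byte.
final class BitPointerSketch {
    final long byteOffset;
    final byte bitOffset;

    BitPointerSketch(long byteOffset, byte bitOffset) {
        // Assumption: bit offsets lie in the range 0..7.
        if (byteOffset < 0 || bitOffset < 0 || bitOffset > 7) {
            throw new IllegalArgumentException(
                    "invalid pointer: byte " + byteOffset + ", bit " + bitOffset);
        }
        this.byteOffset = byteOffset;
        this.bitOffset = bitOffset;
    }

    // Collapse the pair into one absolute bit position, e.g. for storing in an index.
    long toBitPosition() {
        return byteOffset * 8 + bitOffset;
    }

    // Recover the (byteOffset, bitOffset) pair from an absolute bit position.
    static BitPointerSketch fromBitPosition(long bitPosition) {
        return new BitPointerSketch(bitPosition / 8, (byte) (bitPosition % 8));
    }

    public static void main(String[] args) {
        BitPointerSketch p = new BitPointerSketch(3, (byte) 5);
        long position = p.toBitPosition();
        BitPointerSketch q = fromBitPosition(position);
        System.out.println(position + " -> byte " + q.byteOffset + ", bit " + q.bitOffset);
    }
}
```

The two fields of such a pointer are the values that would be recorded from getByteOffset() and getBitOffset() while writing and later passed to readReset.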

Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow