Terrier IR Platform 1.1.1 |
java.lang.Object
  uk.ac.gla.terrier.compression.BitFile
public class BitFile
This class encapsulates a random access file and provides
the functionality to write highly compressed data structures, e.g. binary-encoded, unary-encoded and gamma-encoded
integers greater than zero, and to specify their offset in the file. It
is employed by the DirectIndex and the InvertedIndex classes.
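As an illustration of the encodings named above, the following sketch prints unary and gamma codes as bit strings. It assumes the textbook conventions (unary: n - 1 one-bits followed by a zero-bit; gamma: the unary code of the bit length, then the binary digits without the leading 1). The class `GammaUnarySketch` is hypothetical and not part of Terrier, and the bit layout BitFile actually writes may differ, for example with ones and zeros swapped.

```java
/** Hypothetical helper, not part of Terrier: shows unary and gamma codes as bit strings. */
final class GammaUnarySketch {

    /** Unary code under the assumed convention: (n - 1) one-bits, then a single zero-bit. */
    static String unary(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be greater than zero");
        }
        StringBuilder bits = new StringBuilder();
        for (int i = 1; i < n; i++) {
            bits.append('1');
        }
        return bits.append('0').toString();
    }

    /** Gamma code: unary code of n's bit length, then n's binary digits minus the leading 1. */
    static String gamma(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be greater than zero");
        }
        String binary = Integer.toBinaryString(n); // always starts with '1' for n >= 1
        return unary(binary.length()) + binary.substring(1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 5}) {
            System.out.println(n + ": unary=" + unary(n) + " gamma=" + gamma(n));
        }
    }
}
```

Both codes are defined only for integers greater than zero, which matches the restriction stated in the class description. Small values get short codes, which is why such encodings suit index data dominated by small numbers.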
The sequence of method calls to write a sequence of gamma encoded
and unary encoded numbers is:
file.writeReset();
long startByte1 = file.getByteOffset();
byte startBit1 = file.getBitOffset();
file.writeGamma(20000);
file.writeUnary(2);
file.writeGamma(35000);
file.writeUnary(1);
file.writeGamma(3);
file.writeUnary(2);
file.writeFlush();
long endByte1 = file.getByteOffset();
byte endBit1 = file.getBitOffset();
if (endBit1 == 0 && endByte1 > 0) {
    endBit1 = 7;
    endByte1--;
}
while for reading a sequence of numbers the sequence of calls is:
file.readReset((long) startByte1, (byte) startBit1, (long) endByte1, (byte) endBit1);
int gamma = file.readGamma();
int unary = file.readUnary();
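The end-offset adjustment in the write example can be isolated as a small helper. The sketch below mirrors that `if` statement exactly: an end position with bit offset 0 in a byte after the first is moved back to bit 7 of the preceding byte. The class and method names are hypothetical, and the reading that this is needed because `readReset` treats the end offsets as the last bit of the entry is an assumption based on the `endBitOffset` description.

```java
/** Hypothetical helper, not part of Terrier: mirrors the end-offset adjustment shown above. */
final class EndOffsetSketch {

    /** Returns {endByte, endBit} after applying the same adjustment as the write example. */
    static long[] adjustEnd(long endByte, byte endBit) {
        if (endBit == 0 && endByte > 0) {
            // The position sits on a byte boundary: step back to the last bit of the previous byte.
            endBit = 7;
            endByte--;
        }
        return new long[] {endByte, endBit};
    }

    public static void main(String[] args) {
        long[] adjusted = adjustEnd(10L, (byte) 0);
        System.out.println("byte=" + adjusted[0] + " bit=" + adjusted[1]);
    }
}
```

Positions with a non-zero bit offset, and the position at the very start of the file, pass through unchanged, just as in the original snippet.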
Constructor Summary | |
---|---|
BitFile(java.io.File file) | A constructor for an instance of this class, given an abstract file. |
BitFile(java.io.File file, java.lang.String access) | A constructor for an instance of this class, given an abstract file. |
BitFile(java.lang.String filename) | A constructor for an instance of this class. |
BitFile(java.lang.String filename, java.lang.String access) | A constructor for an instance of this class. |
Method Summary | | |
---|---|---|
void | close() | Closes the random access file. |
byte | getBitOffset() | Returns the bit offset in the current byte of the buffer. |
long | getByteOffset() | Returns the byte offset in the buffer. |
byte[] | getInBuffer() | Returns the current buffer being processed. |
int | readBinary(int noBits) | Reads a binary integer from the already read buffer. |
int | readGamma() | Reads and decodes a gamma-encoded integer from the already read buffer. |
void | readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset) | Reads a specific number of bytes from the file; after this call, a sequence of read calls may follow. |
int | readUnary() | Reads a unary integer from the already read buffer. |
void | writeBinary(int bitsToWrite, int n) | Writes a binary integer, of a given length, to the buffer. |
void | writeFlush() | Flushes the in-memory buffer to the file after finishing a sequence of write calls. |
void | writeGamma(int n) | Writes a gamma-encoded integer to the buffer. |
void | writeReset() | Prepares for writing unary or gamma-encoded integers to the file. |
void | writeUnary(int n) | Writes a unary integer to the buffer. |
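To make the byte-offset and bit-offset bookkeeping behind these methods concrete, here is a minimal in-memory sketch of fixed-width binary writes and reads. The class `BitBufferSketch`, its `seek` method, and the most-significant-bit-first ordering are all assumptions made for illustration; this is not Terrier's implementation and it does no file I/O.

```java
/** Hypothetical in-memory bit buffer, not Terrier's BitFile: illustrates byte and bit offsets. */
final class BitBufferSketch {
    private final byte[] buffer;
    private int byteOffset = 0; // index of the byte currently being written or read
    private int bitOffset = 0;  // next bit within that byte, 0 (most significant) to 7

    BitBufferSketch(int capacityBytes) {
        buffer = new byte[capacityBytes];
    }

    /** Appends the lowest bitsToWrite bits of n, most significant bit first (assumed order). */
    void writeBinary(int bitsToWrite, int n) {
        for (int i = bitsToWrite - 1; i >= 0; i--) {
            if (((n >>> i) & 1) == 1) {
                buffer[byteOffset] |= (byte) (0x80 >>> bitOffset);
            }
            advance();
        }
    }

    /** Reads noBits bits from the current position, most significant bit first. */
    int readBinary(int noBits) {
        int value = 0;
        for (int i = 0; i < noBits; i++) {
            int bit = (buffer[byteOffset] >>> (7 - bitOffset)) & 1;
            value = (value << 1) | bit;
            advance();
        }
        return value;
    }

    /** Moves the position to a previously recorded byte and bit offset. */
    void seek(int newByteOffset, int newBitOffset) {
        byteOffset = newByteOffset;
        bitOffset = newBitOffset;
    }

    int getByteOffset() {
        return byteOffset;
    }

    int getBitOffset() {
        return bitOffset;
    }

    /** Steps to the next bit, rolling over into the next byte after bit 7. */
    private void advance() {
        if (++bitOffset == 8) {
            bitOffset = 0;
            byteOffset++;
        }
    }
}
```

For example, after `writeBinary(3, 5)` and `writeBinary(7, 100)` on a fresh buffer, ten bits have been written, so the sketch reports byte offset 1 and bit offset 2; calling `seek(0, 0)` and then `readBinary(3)` and `readBinary(7)` returns 5 and 100.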
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public BitFile(java.io.File file)
public BitFile(java.io.File file, java.lang.String access)
public BitFile(java.lang.String filename)
public BitFile(java.lang.String filename, java.lang.String access)
Method Detail |
---|
public int readBinary(int noBits)
noBits - the number of binary bits to read
public void writeBinary(int bitsToWrite, int n)
bitsToWrite - the number of bits to write
n - the integer to write
public void close()
public byte getBitOffset()
public long getByteOffset()
public int readGamma()
public void readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset)
startByteOffset - the starting byte to read from
startBitOffset - the bit offset in the starting byte
endByteOffset - the ending byte
endBitOffset - the bit offset in the ending byte. This bit is the last bit of this entry.
public int readUnary()
public byte[] getInBuffer()
public void writeFlush()
public void writeGamma(int n)
n - the integer to be encoded and saved in the buffer
public void writeReset()
public void writeUnary(int n)
n - the integer to be encoded and written in the buffer
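The read side can be sketched in the same spirit. The decoder below works on a string of '0' and '1' characters and assumes the textbook conventions: a unary code is n - 1 one-bits followed by a zero-bit, and a gamma code is the unary code of the value's bit length followed by its binary digits without the leading 1. `GammaDecodeSketch` is a hypothetical class; the bit layout that Terrier's `readGamma` and `readUnary` expect may differ.

```java
/** Hypothetical decoder over a '0'/'1' string, not Terrier's BitFile. */
final class GammaDecodeSketch {
    private final String bits;
    private int pos = 0; // index of the next unread bit

    GammaDecodeSketch(String bits) {
        this.bits = bits;
    }

    /** Counts one-bits up to the terminating zero-bit; the value is that count plus one. */
    int readUnary() {
        int n = 1;
        while (bits.charAt(pos++) == '1') {
            n++;
        }
        return n;
    }

    /** Reads the bit length in unary, then rebuilds the value from the remaining digits. */
    int readGamma() {
        int length = readUnary(); // number of binary digits in the value
        int value = 1;            // the leading 1 is implicit in the code
        for (int i = 1; i < length; i++) {
            value = (value << 1) | (bits.charAt(pos++) - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        // gamma(5) = "11001", unary(2) = "10", gamma(3) = "101", unary(1) = "0"
        GammaDecodeSketch in = new GammaDecodeSketch("11001" + "10" + "101" + "0");
        System.out.println(in.readGamma() + " " + in.readUnary() + " "
                + in.readGamma() + " " + in.readUnary());
    }
}
```

The alternating `readGamma` and `readUnary` calls follow the same pattern as the read sequence in the class description, where each gamma-encoded number is followed by a unary-encoded one.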