Terrier IR Platform 2.2.1
java.lang.Object uk.ac.gla.terrier.compression.BitFile
public class BitFile
This class encapsulates a random access file and provides
the functionality to write binary-encoded, unary-encoded and gamma-encoded
integers greater than zero, as well as to specify their offset in the file. It
is employed by the DirectFile and the InvertedFile classes.
Use the getBitOffset and getByteOffset methods only for writing, and not for reading.
This class contains the methods in both BitInputStream and BitOutputStream.
The numbers are written into a byte starting from the most significant bit (i.e., left to right).
The sequence of method calls to write a sequence of gamma-encoded
and unary-encoded numbers is:
file.writeReset();
long startByte1 = file.getByteOffset();
byte startBit1 = file.getBitOffset();
file.writeGamma(20000);
file.writeUnary(2);
file.writeGamma(35000);
file.writeUnary(1);
file.writeGamma(3);
file.writeUnary(2);
long endByte1 = file.getByteOffset();
byte endBit1 = file.getBitOffset();
if (endBit1 == 0 && endByte1 > 0) {
    endBit1 = 7;
    endByte1--;
}
To read a sequence of numbers, the sequence of calls is:
file.readReset(startByte1, startBit1, endByte1, endBit1);
int gamma = file.readGamma();
int unary = file.readUnary();
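To make the unary and gamma codes used in the calls above more concrete, the following sketch builds the textbook unary and Elias gamma codewords as strings of bits. It is an illustration only: the class name `GammaSketch` is hypothetical, and the unary convention (x-1 zeros followed by a one) is an assumption, since this page does not state which convention BitFile uses.

```java
// Illustrative sketch only; not part of Terrier.
// Assumption: unary(x) is (x - 1) zeros followed by a single one.
final class GammaSketch {

    /** Unary code of an integer x >= 1 under the assumed convention. */
    static String unary(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be >= 1");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    /** Elias gamma code of x >= 1: unary(floor(log2 x) + 1), then the low floor(log2 x) bits of x. */
    static String gamma(int x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be >= 1");
        }
        int n = 31 - Integer.numberOfLeadingZeros(x); // floor(log2 x)
        StringBuilder sb = new StringBuilder(unary(n + 1));
        for (int i = n - 1; i >= 0; i--) {
            sb.append((x >>> i) & 1); // emit bits from most to least significant
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int x : new int[] {1, 2, 3, 4, 20000}) {
            System.out.println(x + " -> " + gamma(x));
        }
    }
}
```

Under these assumptions a gamma codeword for x takes 2*floor(log2 x) + 1 bits, which is why small values such as the 3 in the example are much cheaper to store than 20000 or 35000.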
Constructor Summary | |
---|---|
BitFile(java.io.File file) | |
BitFile(java.io.File _file, java.lang.String access) | Constructs an instance of the class for a given file and an access method to the file. |
BitFile(java.lang.String filename) | Constructs an instance of the class for a given filename, with "rw" permissions. |
BitFile(java.lang.String filename, java.lang.String access) | Constructs an instance of the class for a given filename and an access method to the file. |
Method Summary | | |
---|---|---|
void | align() | Aligns the stream to the next byte. |
void | close() | Closes the file. |
byte | getBitOffset() | Returns the bit offset in the last byte. |
long | getByteOffset() | Returns the byte offset of the stream. |
int | readBinary(int len) | Reads a binary integer from the already read buffer. |
int | readGamma() | Reads a gamma-encoded integer from the underlying stream. |
int | readGolomb(int b) | Reads a Golomb-encoded integer. |
void | readInterpolativeCoding(int[] data, int offset, int len, int lo, int hi) | Reads a sequence of interpolative-coded numbers from the stream. |
int | readMinimalBinary(int b) | Reads a binary-encoded integer, given an upper bound. |
int | readMinimalBinaryZero(int b) | Reads a minimal binary encoded number, when the upper bound can be zero. |
BitIn | readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset) | Reads a specific number of bytes from the file; after this call, a sequence of read calls may follow. |
int | readSkewedGolomb(int b) | Reads a skewed-Golomb encoded integer from the underlying stream. Consider a bucket-vector v = <0, 2b, 4b, ... |
int | readUnary() | Reads a unary-encoded integer from the underlying stream. |
void | skipBits(int len) | Skips a number of bits in the current input stream. |
int | writeBinary(int len, int x) | Writes an integer in binary format to the stream. |
void | writeFlush() | Flushes the OutputStream (empty method). |
int | writeGamma(int x) | Writes an integer x into the stream using gamma encoding. |
int | writeGolomb(int x, int b) | Writes an integer x into the stream using Golomb coding. |
int | writeInt(int x, int len) | Writes an integer x into the underlying OutputStream. |
int | writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi) | Writes a sequence of integers using interpolative coding. |
int | writeMinimalBinary(int x, int b) | Writes an integer x using minimal binary encoding, given an upper bound. |
void | writeReset() | Sets the write mode to true. |
int | writeSkewedGolomb(int x, int b) | Writes an integer x into the stream using skewed-Golomb coding. |
int | writeUnary(int x) | Writes an integer x using unary encoding. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public BitFile(java.io.File _file, java.lang.String access)
Parameters:
_file - File to read/write
access - String indicating the access permissions of the file
Throws:
java.io.IOException - if an I/O error occurs

public BitFile(java.lang.String filename, java.lang.String access)
Parameters:
filename - java.lang.String the name of the underlying file
access - String indicating the access permissions of the file
Throws:
java.io.IOException - if an I/O error occurs

public BitFile(java.lang.String filename)
Parameters:
filename - java.lang.String the name of the underlying file
Throws:
java.io.IOException - if an I/O error occurs

public BitFile(java.io.File file)
Method Detail |
---|
public long getByteOffset()
Specified by:
getByteOffset in interface BitIn
getByteOffset in interface BitOut

public byte getBitOffset()
Specified by:
getBitOffset in interface BitIn
getBitOffset in interface BitOut
public int writeUnary(int x) throws java.io.IOException
Specified by:
writeUnary in interface BitOut
Parameters:
x - the number to write
Throws:
java.io.IOException - if an I/O error occurs.

public int writeGamma(int x) throws java.io.IOException
Specified by:
writeGamma in interface BitOut
Parameters:
x - the int number to write
Throws:
java.io.IOException - if an I/O error occurs.

public int writeInt(int x, int len) throws java.io.IOException
Parameters:
x - the int to write
len - length of the int in bits
Throws:
java.io.IOException - if an I/O error occurs.

public void writeFlush()
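The class overview states that numbers are written into a byte starting from the most significant bit. The sketch below shows one way such packing, together with byte and bit offset tracking, can work. It is not Terrier's implementation: the class `MsbBitWriterSketch` and its `writeBits` method are hypothetical, and only the accessor names mirror the ones documented on this page.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative sketch only; not part of Terrier.
final class MsbBitWriterSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current = 0;     // bits collected so far for the byte in progress
    private int bitOffset = 0;   // number of bits already used in that byte (0..7)
    private long byteOffset = 0; // number of completed bytes

    /** Writes the low len bits of x, most significant bit first. */
    void writeBits(int x, int len) {
        for (int i = len - 1; i >= 0; i--) {
            int bit = (x >>> i) & 1;
            current |= bit << (7 - bitOffset); // fill the byte from left to right
            if (++bitOffset == 8) {            // byte complete: emit it and start a new one
                out.write(current);
                current = 0;
                bitOffset = 0;
                byteOffset++;
            }
        }
    }

    long getByteOffset() {
        return byteOffset;
    }

    byte getBitOffset() {
        return (byte) bitOffset;
    }

    /** Returns the bytes written so far; a partially filled last byte is zero padded. */
    byte[] toByteArray() {
        byte[] full = out.toByteArray();
        if (bitOffset == 0) {
            return full;
        }
        byte[] result = Arrays.copyOf(full, full.length + 1);
        result[full.length] = (byte) current;
        return result;
    }
}
```

For example, writing the 3-bit value 5 (101) followed by the 5-bit value 1 (00001) fills exactly one byte, 10100001, and leaves the writer at byte offset 1, bit offset 0.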
public BitIn readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset)
Specified by:
readReset in interface BitInSeekable
Parameters:
startByteOffset - the starting byte to read from
startBitOffset - the bit offset in the starting byte
endByteOffset - the ending byte
endBitOffset - the bit offset in the ending byte. This bit is the last bit of this entry.
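Because endBitOffset identifies the last bit of the entry, the class overview adjusts the offsets reported after writing before they are passed to readReset. The helper below only repackages that adjustment so it can be reused. The class and method names are hypothetical, and the sketch deliberately does nothing beyond what the overview shows; whether other cases need adjusting is not stated on this page.

```java
// Illustrative sketch only; not part of Terrier.
final class EndOffsetSketch {

    /**
     * Applies the end-offset adjustment from the class overview.
     * Returns a two-element array: {endByte, endBit}.
     */
    static long[] adjustEnd(long endByte, byte endBit) {
        if (endBit == 0 && endByte > 0) {
            // The write position sits at the start of a fresh byte, so the
            // entry's last bit is bit 7 of the previous byte.
            endBit = 7;
            endByte--;
        }
        return new long[] {endByte, endBit};
    }

    public static void main(String[] args) {
        long[] end = adjustEnd(5L, (byte) 0);
        System.out.println("endByte=" + end[0] + ", endBit=" + end[1]);
    }
}
```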
public int readGamma()
Specified by:
readGamma in interface BitIn
Throws:
java.io.IOException - if an I/O error occurs

public int readUnary()
Specified by:
readUnary in interface BitIn
Throws:
java.io.IOException - if an I/O error occurs

public void align()
Specified by:
align in interface BitIn
Throws:
java.io.IOException - if an I/O error occurs

public int readBinary(int len)
Specified by:
readBinary in interface BitIn
Parameters:
len - the number of binary bits to read
Throws:
java.io.IOException - if an I/O error occurs

public void skipBits(int len)
Specified by:
skipBits in interface BitIn
Parameters:
len - the number of bits to skip

public void close()
Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException - if an I/O error occurs.

public void writeReset() throws java.io.IOException
Throws:
java.io.IOException
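The read methods above consume bits from a buffer that was loaded in advance. As a rough illustration of how unary, gamma and fixed-width binary decoding fit together on an MSB-first buffer, here is a small in-memory reader. It is a sketch, not Terrier's code: the class `MsbBitReaderSketch` is hypothetical, and it assumes the unary convention of x-1 zeros followed by a one, which this page does not confirm.

```java
// Illustrative sketch only; not part of Terrier.
// Assumption: unary(x) is (x - 1) zeros followed by a single one.
final class MsbBitReaderSketch {
    private final byte[] data;
    private int bytePos = 0;
    private int bitPos = 0; // 0 is the most significant bit of the current byte

    MsbBitReaderSketch(byte[] data) {
        this.data = data;
    }

    private int readBit() {
        int bit = (data[bytePos] >>> (7 - bitPos)) & 1;
        if (++bitPos == 8) {
            bitPos = 0;
            bytePos++;
        }
        return bit;
    }

    /** Reads len bits as an unsigned integer, most significant bit first. */
    int readBinary(int len) {
        int x = 0;
        for (int i = 0; i < len; i++) {
            x = (x << 1) | readBit();
        }
        return x;
    }

    /** Counts zeros up to and including the terminating one. */
    int readUnary() {
        int x = 1;
        while (readBit() == 0) {
            x++;
        }
        return x;
    }

    /** Elias gamma: a unary length prefix, then the remaining low bits of the value. */
    int readGamma() {
        int n = readUnary() - 1;
        return (1 << n) | readBinary(n);
    }

    /** Moves to the start of the next byte unless already on a byte boundary. */
    void align() {
        if (bitPos != 0) {
            bitPos = 0;
            bytePos++;
        }
    }
}
```

With the two bytes 00100010 10100000, readGamma() yields 4, readUnary() yields 2, align() moves to the second byte, and readBinary(3) yields 5.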
public int writeBinary(int len, int x) throws java.io.IOException
Specified by:
writeBinary in interface BitOut
Parameters:
len - size in bits of the number.
x - the integer to write.
Throws:
java.io.IOException - if an I/O error occurs.

public int writeMinimalBinary(int x, int b) throws java.io.IOException
Parameters:
x - the number to write
b - a strict bound for x
Throws:
java.io.IOException - if an I/O error occurs.

public int readMinimalBinary(int b) throws java.io.IOException
Parameters:
b - the upper bound
Throws:
java.io.IOException - if an I/O error occurs

public int writeGolomb(int x, int b) throws java.io.IOException
Parameters:
x - the number to write
b - the parameter for Golomb coding
Throws:
java.io.IOException - if an I/O error occurs

public int readGolomb(int b) throws java.io.IOException
Parameters:
b - the Golomb modulus
Throws:
java.io.IOException - if an I/O error occurs

public int writeSkewedGolomb(int x, int b) throws java.io.IOException
Consider a bucket vector v = <b, 2b, 4b, ... , 2^i b, ...>. An integer x is coded as unary(k+1), where k is the index such that sum_{i=0}^{k} v_i < x <= sum_{i=0}^{k+1} v_i, that is, k = log(x/b + 1), using sum_i = b(2^n - 1) (geometric progression), and the remainder is coded with log(v_k) bits in binary.
If lower = ceil(x/b), then lower = 2^i * b, so i = log(ceil(x/b)) + 1. The remainder x - sum_i 2^i*b - 1 = x - b(2^n - 1) - 1 is coded with floor(log(v_k)) bits.
This method is not failsafe: it does not check whether the argument or the modulus is 0 or negative.
Parameters:
x - the number to write
b - the parameter for Golomb coding
Throws:
java.io.IOException - if an I/O error occurs

public int writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi) throws java.io.IOException
Parameters:
data - the vector containing the integer sequence.
offset - the offset into data where the sequence starts.
len - the number of integers to code.
lo - a lower bound (must be smaller than or equal to the first integer in the sequence).
hi - an upper bound (must be greater than or equal to the last integer in the sequence).
Throws:
java.io.IOException - if an I/O error occurs.

public int readSkewedGolomb(int b) throws java.io.IOException
Consider a bucket vector v = <0, 2b, 4b, ... , 2^i b, ...>. The sums of the elements in the vector go b, 3b, 7b, ..., (2^i - 1)*b.
Throws:
java.io.IOException - if an I/O error occurs

public void readInterpolativeCoding(int[] data, int offset, int len, int lo, int hi) throws java.io.IOException
Parameters:
data - the result vector
offset - offset where to write in the vector
len - the number of integers to decode.
lo - a lower bound (the same one passed to writeInterpolativeCode)
hi - an upper bound (the same one passed to writeInterpolativeCode)
Throws:
java.io.IOException - if an I/O error occurs

public int readMinimalBinaryZero(int b) throws java.io.IOException
Parameters:
b - the upper bound
Throws:
java.io.IOException - if an I/O error occurs
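The Golomb and minimal binary methods above are documented only by their parameters, so a small worked sketch may help. It builds textbook codewords as bit strings: a Golomb code with modulus b writes the quotient in unary and the remainder in minimal (truncated) binary. Everything here is an assumption for illustration: the class `GolombSketch` is hypothetical, values are taken to start at 1, unary(x) is taken to be x-1 zeros followed by a one, and BitFile's exact bit layout may differ.

```java
// Illustrative sketch only; not part of Terrier.
// Assumptions: x >= 1, b >= 1, unary(x) is (x - 1) zeros followed by a one.
final class GolombSketch {

    static String unary(int x) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < x; i++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    /** The low len bits of v, most significant bit first. */
    static String bits(int v, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            sb.append((v >>> i) & 1);
        }
        return sb.toString();
    }

    /** Minimal binary code of r in [0, b): the first u values get k bits, the rest k + 1. */
    static String minimalBinary(int r, int b) {
        if (b < 1 || r < 0 || r >= b) {
            throw new IllegalArgumentException("need 0 <= r < b");
        }
        int k = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int u = (1 << (k + 1)) - b;                   // number of short codewords
        return r < u ? bits(r, k) : bits(r + u, k + 1);
    }

    /** Golomb code of x >= 1 with modulus b: unary quotient, then minimal binary remainder. */
    static String golomb(int x, int b) {
        if (x < 1 || b < 1) {
            throw new IllegalArgumentException("need x >= 1 and b >= 1");
        }
        int q = (x - 1) / b;
        int r = (x - 1) % b;
        return unary(q + 1) + minimalBinary(r, b);
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 6; x++) {
            System.out.println(x + " -> " + golomb(x, 3));
        }
    }
}
```

With b = 3 the remainders 0, 1 and 2 map to the codewords 0, 10 and 11, so no remainder codeword is a prefix of another; with b = 1 the remainder takes no bits and the code reduces to plain unary.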