java.lang.Object org.terrier.compression.BitFile
public class BitFile
This class encapsulates a random access file and provides
the functionality to read and write binary-encoded, unary-encoded and gamma-encoded
integers greater than zero, and to specify their offset in the file. It
is employed by the DirectFile and the InvertedFile classes.
Use the getBit/ByteOffset methods only for writing, and not for reading.
This class contains the methods in both BitInputStream and BitOutputStream.
The numbers are written into a byte starting from the most significant bit (i.e., left to right).
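
To make the "most significant bit first" layout concrete, here is a minimal in-memory sketch. The class MsbFirstBitPacker and its methods are hypothetical names introduced for illustration only; they are not part of Terrier and do not reproduce BitFile's buffering.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Illustrative MSB-first bit packer; a hypothetical helper, not part of Terrier. */
final class MsbFirstBitPacker {
    private final ByteArrayOutputStream full = new ByteArrayOutputStream();
    private int current = 0; // byte under construction
    private int used = 0;    // number of bits already placed in current (0..7)

    /** Appends the low len bits of value, highest of those bits first. */
    void write(int value, int len) {
        for (int i = len - 1; i >= 0; i--) {
            int bit = (value >>> i) & 1;
            current |= bit << (7 - used); // the first bit lands in position 7 (leftmost)
            used++;
            if (used == 8) {
                full.write(current);
                current = 0;
                used = 0;
            }
        }
    }

    /** Completed bytes plus the partially filled byte, if any (zero padded on the right). */
    byte[] toByteArray() {
        byte[] done = full.toByteArray();
        if (used == 0) {
            return done;
        }
        byte[] withPartial = Arrays.copyOf(done, done.length + 1);
        withPartial[done.length] = (byte) current;
        return withPartial;
    }

    public static void main(String[] args) {
        MsbFirstBitPacker packer = new MsbFirstBitPacker();
        packer.write(0b101, 3); // occupies the three leftmost bits of byte 0
        packer.write(0xFF, 8);  // fills the rest of byte 0 and spills into byte 1
        for (byte b : packer.toByteArray()) {
            System.out.printf("%02X ", b & 0xFF); // prints: BF E0
        }
    }
}
```

In this sketch the bit offset within the current byte corresponds to the `used` counter, which is the kind of position that getBitOffset reports while writing.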
The sequence of method calls to write a sequence of gamma encoded
and unary encoded numbers is:
    file.writeReset();
    long startByte1 = file.getByteOffset();
    byte startBit1 = file.getBitOffset();
    file.writeGamma(20000);
    file.writeUnary(2);
    file.writeGamma(35000);
    file.writeUnary(1);
    file.writeGamma(3);
    file.writeUnary(2);
    long endByte1 = file.getByteOffset();
    byte endBit1 = file.getBitOffset();
    if (endBit1 == 0 && endByte1 > 0) {
        endBit1 = 7;
        endByte1--;
    }

while for reading a sequence of numbers the sequence of calls is:

    file.readReset(startByte1, startBit1, endByte1, endBit1);
    int gamma = file.readGamma();
    int unary = file.readUnary();
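
For readers unfamiliar with the codes themselves, the following is a self-contained sketch of unary and Elias gamma coding applied to the same numbers as the calls above. It works on a string of '0'/'1' characters rather than on a file. The class GammaUnarySketch is a hypothetical name, and the unary bit polarity (zeros terminated by a one) is an assumption of this sketch that may differ from what BitFile writes.

```java
/** Illustrative unary and Elias gamma codes over a string of '0'/'1' characters. */
final class GammaUnarySketch {
    /** Unary code of x >= 1: (x - 1) zero bits, then a one bit (assumed polarity). */
    static String unary(int x) {
        StringBuilder bits = new StringBuilder();
        for (int i = 1; i < x; i++) {
            bits.append('0');
        }
        return bits.append('1').toString();
    }

    /** Gamma code of x >= 1: unary(n + 1), then the low n bits of x, with n = floor(log2 x). */
    static String gamma(int x) {
        int n = 31 - Integer.numberOfLeadingZeros(x);
        StringBuilder bits = new StringBuilder(unary(n + 1));
        for (int i = n - 1; i >= 0; i--) {
            bits.append((x >>> i) & 1);
        }
        return bits.toString();
    }

    /** Sequential decoder for a bit string produced by the methods above. */
    static final class Reader {
        private final String bits;
        private int pos = 0;

        Reader(String bits) {
            this.bits = bits;
        }

        int readUnary() {
            int x = 1;
            while (bits.charAt(pos++) == '0') {
                x++;
            }
            return x;
        }

        int readBinary(int len) {
            int value = 0;
            for (int i = 0; i < len; i++) {
                value = (value << 1) | (bits.charAt(pos++) - '0');
            }
            return value;
        }

        int readGamma() {
            int n = readUnary() - 1;             // number of bits after the implicit leading one
            return (1 << n) | readBinary(n);
        }
    }

    public static void main(String[] args) {
        String bits = gamma(20000) + unary(2) + gamma(35000) + unary(1) + gamma(3) + unary(2);
        Reader in = new Reader(bits);
        // Reads must follow the same order as the writes.
        System.out.println(in.readGamma() + " " + in.readUnary() + " " + in.readGamma()
                + " " + in.readUnary() + " " + in.readGamma() + " " + in.readUnary());
    }
}
```

Neither code can represent zero, which is consistent with the class description restricting these methods to integers greater than zero.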
Field Summary | |
---|---|
protected int | bitOffset: Deprecated. The bit offset. |
protected byte[] | buffer: Deprecated. Write buffer |
protected int | bufferPointer: Deprecated. Pointer for the buffer |
protected int | bufferSize: Deprecated. Size of the buffer (it has to be 4 * k) |
protected long | byteOffset: Deprecated. The byte offset. |
protected int | byteToWrite: Deprecated. An int to write to the stream. |
protected static java.lang.String | DEFAULT_FILE_MODE: Deprecated. Default file mode access for a BitFile object. |
protected static int | DEFAULT_SIZE: Deprecated. Default size |
protected RandomDataInput | file: Deprecated. The underlying file |
protected byte[] | inBuffer: Deprecated. Buffer for reads |
protected boolean | isWriteMode: Deprecated. Indicates if we are writing or reading |
protected static org.apache.log4j.Logger | logger: Deprecated. The logger used |
protected int | readBits: Deprecated. Number of bits read so far |
protected int | readByteOffset: Deprecated. The current byte offset to be read |
protected RandomDataOutput | writeFile: Deprecated. Same object as file, but cast to RandomDataOutput |
Fields inherited from interface org.terrier.compression.BitIn |
---|
USUAL_EXTENSION |
Constructor Summary | |
---|---|
protected | BitFile(): Deprecated. Do-nothing constructor |
public | BitFile(java.io.File _file): Deprecated. Constructs an instance of the class for a given file, with "rw" permissions |
public | BitFile(java.io.File _file, java.lang.String access): Deprecated. Constructs an instance of the class for a given file and an access method to the file |
public | BitFile(RandomDataInput data): Deprecated. Constructs an instance of the class for a given RandomDataInput instance accessing a bit compressed file/stream |
public | BitFile(java.lang.String filename): Deprecated. Constructs an instance of the class for a given filename, with "rw" permissions |
public | BitFile(java.lang.String filename, java.lang.String access): Deprecated. Constructs an instance of the class for a given filename and an access method to the file |
Method Summary | |
---|---|
void | align(): Deprecated. Aligns the stream to the next byte |
void | close(): Deprecated. Closes the file. |
byte | getBitOffset(): Deprecated. Returns the bit offset in the last byte. |
long | getByteOffset(): Deprecated. Returns the byte offset of the stream. |
protected void | init(): Deprecated. Initialises the variables, used internally |
int | readBinary(int len): Deprecated. Reads a binary integer from the already read buffer. |
int | readDelta(): Deprecated. Reads a delta encoded integer from the underlying stream |
int | readGamma(): Deprecated. Reads a gamma encoded integer from the underlying stream |
int | readGolomb(int b): Deprecated. Reads a Golomb encoded integer |
protected void | readIn(): Deprecated. Reads a new byte from the InputStream if we have finished with the current one. |
void | readInterpolativeCoding(int[] data, int offset, int len, int lo, int hi): Deprecated. Reads a sequence of numbers from the stream interpolative coded. |
int | readMinimalBinary(int b): Deprecated. Reads a binary encoded integer, given an upper bound |
int | readMinimalBinaryZero(int b): Deprecated. Reads a minimal binary encoded number, when the upper bound can be zero. |
BitIn | readReset(long startByteOffset, byte startBitOffset): Deprecated. Reads from the file a specific number of bytes and after this call, a sequence of read calls may follow. |
BitIn | readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset): Deprecated. Reads from the file a specific number of bytes and after this call, a sequence of read calls may follow. |
int | readSkewedGolomb(int b): Deprecated. Reads a skewed-Golomb encoded integer from the underlying stream. Consider a bucket-vector v = <b, 2b, 4b, ... |
int | readUnary(): Deprecated. Reads a unary encoded integer from the underlying stream |
void | skipBits(int len): Deprecated. Skip a number of bits in the current input stream |
void | skipBytes(long len): Deprecated. Skip a number of bytes while reading the bit file. |
int | writeBinary(int len, int x): Deprecated. Writes an integer in binary format to the stream. |
int | writeDelta(int x): Deprecated. Writes an integer x into the stream using delta encoding. |
void | writeFlush(): Deprecated. Flushes the OutputStream (empty method) |
int | writeGamma(int x): Deprecated. Writes an integer x into the stream using gamma encoding. |
int | writeGolomb(int x, int b): Deprecated. Writes an integer x into the stream using Golomb coding. |
protected int | writeInCurrent(int b, int len): Deprecated. Writes a number in the current byte we are using. |
int | writeInt(int x, int len): Deprecated. Writes an integer x into the underlying OutputStream. |
protected void | writeIntBuffer(int writeMe): Deprecated. Flushes the int currently being written into the buffer and, if necessary, flushes the buffer to the underlying OutputStream |
protected void | writeIntBufferToBit(int writeMe, int _bitOffset): Deprecated. Writes the current integer used into the buffer, taking into account the number of bits written. |
int | writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi): Deprecated. Writes a sequence of integers using interpolative coding. |
int | writeMinimalBinary(int x, int b): Deprecated. Writes an integer x using minimal binary encoding, given an upper bound. |
void | writeReset(): Deprecated. Set the write mode to true |
int | writeSkewedGolomb(int x, int b): Deprecated. Writes an integer x into the stream using skewed-Golomb coding. |
int | writeUnary(int x): Deprecated. Writes an integer x using unary encoding. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final org.apache.log4j.Logger logger
protected byte[] buffer
protected int bufferPointer
protected int bufferSize
protected static final int DEFAULT_SIZE
protected static final java.lang.String DEFAULT_FILE_MODE
protected long byteOffset
protected int readByteOffset
protected int bitOffset
protected int byteToWrite
protected boolean isWriteMode
protected RandomDataInput file
protected RandomDataOutput writeFile
protected byte[] inBuffer
protected int readBits
Constructor Detail |
---|
public BitFile(RandomDataInput data)
data
- a RandomDataInput instance containing the bit compressed data
public BitFile(java.io.File _file, java.lang.String access)
_file
- File to read/write
access
- String indicating the access permissions of the file
public BitFile(java.lang.String filename, java.lang.String access)
filename
- java.lang.String the name of the underlying file
access
- String indicating the access permissions of the file
public BitFile(java.lang.String filename)
filename
- java.lang.String the name of the underlying file
public BitFile(java.io.File _file)
_file
- java.io.File
protected BitFile()
Method Detail |
---|
protected void init()
public long getByteOffset()
getByteOffset
in interface BitIn
getByteOffset
in interface BitOut
public byte getBitOffset()
getBitOffset
in interface BitIn
getBitOffset
in interface BitOut
protected void writeIntBuffer(int writeMe) throws java.io.IOException
writeMe
- int to be written into the buffer
java.io.IOException
- if an I/O error occurs
protected int writeInCurrent(int b, int len) throws java.io.IOException
b
- the number to write
len
- the length of the number in bits
java.io.IOException
- if an I/O error occurs.
public int writeUnary(int x) throws java.io.IOException
writeUnary
in interface BitOut
x
- the number to write
java.io.IOException
- if an I/O error occurs.
public int writeDelta(int x) throws java.io.IOException
writeDelta
in interface BitOut
x
- the int number to write
java.io.IOException
- if an I/O error occurs.
public int writeGamma(int x) throws java.io.IOException
writeGamma
in interface BitOut
x
- the int number to write
java.io.IOException
- if an I/O error occurs.
public int writeInt(int x, int len) throws java.io.IOException
writeInt
in interface BitOut
x
- the int to write
len
- length of the int in bits
java.io.IOException
- if an I/O error occurs.
public void writeFlush()
public BitIn readReset(long startByteOffset, byte startBitOffset, long endByteOffset, byte endBitOffset)
readReset
in interface BitInSeekable
startByteOffset
- the starting byte to read from
startBitOffset
- the bit offset in the starting byte
endByteOffset
- the ending byte
endBitOffset
- the bit offset in the ending byte.
This bit is the last bit of this entry.
public BitIn readReset(long startByteOffset, byte startBitOffset) throws java.io.IOException
readReset
in interface BitInSeekable
startByteOffset
- the starting byte to read from
startBitOffset
- the bit offset in the starting byte
java.io.IOException
public int readGamma()
readGamma
in interface BitIn
public int readUnary()
readUnary
in interface BitIn
public int readDelta() throws java.io.IOException
readDelta
in interface BitIn
java.io.IOException
- if an I/O error occurs
protected void readIn()
java.io.IOException
- if we have reached the end of the file
public void align()
align
in interface BitIn
public int readBinary(int len)
readBinary
in interface BitIn
len
- is the number of binary bits to read
public void skipBits(int len)
skipBits
in interface BitIn
len
- The number of bits to skip
public void skipBytes(long len) throws java.io.IOException
skipBytes
in interface BitIn
len
- The number of bytes to skip
java.io.IOException
- if an I/O error occurs
public void close()
close
in interface java.io.Closeable
protected void writeIntBufferToBit(int writeMe, int _bitOffset)
writeMe
- int to write
_bitOffset
- number of bits written so far in the int
public void writeReset() throws java.io.IOException
java.io.IOException
public int writeBinary(int len, int x) throws java.io.IOException
writeBinary
in interface BitOut
len
- size in bits of the number.
x
- the integer to write.
java.io.IOException
- if an I/O error occurs.
public int writeMinimalBinary(int x, int b) throws java.io.IOException
writeMinimalBinary
in interface BitOut
x
- the number to write
b
- a strict bound for x
java.io.IOException
- if an I/O error occurs.
public int readMinimalBinary(int b) throws java.io.IOException
readMinimalBinary
in interface BitIn
b
- the upper bound
java.io.IOException
- if an I/O error occurs
public int writeGolomb(int x, int b) throws java.io.IOException
writeGolomb
in interface BitOut
x
- the number to write
b
- the parameter for Golomb coding
java.io.IOException
- if an I/O error occurs
public int readGolomb(int b) throws java.io.IOException
readGolomb
in interface BitIn
b
- the Golomb modulus
java.io.IOException
- if an I/O error occurs
public int writeSkewedGolomb(int x, int b) throws java.io.IOException
Consider a bucket-vector v = <b, 2b, 4b, ... , 2^i b, ...>.
An integer x is coded as unary(k+1), where k is the index such that
sum(i=0)(k) v_i < x <= sum(i=0)(k+1) v_i, so k = log(x/b + 1),
using sum_i = b(2^n - 1) (geometric progression),
and the remainder with log(v_k) bits in binary.
If lower = ceil(x/b) -> lower = 2^i * b -> i = log(ceil(x/b)) + 1,
the remainder x - sum_i 2^i*b - 1 = x - b(2^n - 1) - 1 is coded with floor(log(v_k)) bits.
This method is not failsafe, it doesn't check if the argument or the modulus is 0 or negative.
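
The bucket arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not BitFile's implementation: SkewedGolombBuckets, split and join are hypothetical names, and the sketch reads sum(i=0)(k) as the sum of the first k buckets, so that bucket k covers b(2^k - 1) < x <= b(2^(k+1) - 1). It finds k by direct comparison rather than with the closed form, whose rounding at bucket boundaries is not specified here, and it leaves the bit-level coding of the remainder out entirely.

```java
/** Illustrative bucket arithmetic for skewed Golomb coding; hypothetical helper, not Terrier code. */
final class SkewedGolombBuckets {
    /**
     * Splits x (x >= 1) for modulus b (b >= 1) into {k, remainder}, where
     * b * (2^k - 1) < x <= b * (2^(k+1) - 1) and remainder = x - b * (2^k - 1) - 1.
     */
    static int[] split(int x, int b) {
        int k = 0;
        long upper = b; // sum of buckets v_0..v_k, i.e. b * (2^(k+1) - 1)
        while (x > upper) {
            k++;
            upper = (long) b * ((1L << (k + 1)) - 1);
        }
        long lower = (long) b * ((1L << k) - 1); // sum of the buckets before v_k
        return new int[] { k, (int) (x - lower - 1) };
    }

    /** Inverse of split: rebuilds x from the bucket index and the remainder. */
    static int join(int k, int remainder, int b) {
        return (int) ((long) b * ((1L << k) - 1) + remainder + 1);
    }

    public static void main(String[] args) {
        int b = 3;
        for (int x : new int[] { 1, 3, 4, 9, 10 }) {
            int[] parts = split(x, b);
            System.out.println(x + " -> k=" + parts[0] + ", remainder=" + parts[1]);
        }
    }
}
```

Under this reading the remainder for bucket k always lies in the range 0 to 2^k * b - 1, which is the range that the binary part of the code has to cover.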
writeSkewedGolomb
in interface BitOut
x
- the number to write
b
- the parameter for Golomb coding
java.io.IOException
- if an I/O error occurs
public int writeInterpolativeCode(int[] data, int offset, int len, int lo, int hi) throws java.io.IOException
writeInterpolativeCode
in interface BitOut
data
- the vector containing the integer sequence.
offset
- the offset into data where the sequence starts.
len
- the number of integers to code.
lo
- a lower bound (must be smaller than or equal to the first integer in the sequence).
hi
- an upper bound (must be greater than or equal to the last integer in the sequence).
java.io.IOException
- if an I/O error occurs.
public int readSkewedGolomb(int b) throws java.io.IOException
Consider a bucket-vector v = <b, 2b, 4b, ... , 2^i b, ...>.
The sum of the first i elements of the vector goes
b, 3b, 7b, ..., (2^i - 1)*b.
readSkewedGolomb
in interface BitIn
java.io.IOException
- if an I/O error occurs
public void readInterpolativeCoding(int[] data, int offset, int len, int lo, int hi) throws java.io.IOException
readInterpolativeCoding
in interface BitIn
data
- the result vector
offset
- offset where to write in the vector
len
- the number of integers to decode.
lo
- a lower bound (the same one passed to writeInterpolativeCode)
hi
- an upper bound (the same one passed to writeInterpolativeCode)
java.io.IOException
- if an I/O error occurs
public int readMinimalBinaryZero(int b) throws java.io.IOException
readMinimalBinaryZero
in interface BitIn
b
- the upper bound
java.io.IOException
- if an I/O error occurs
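
As background for the minimal binary methods (writeMinimalBinary, readMinimalBinary and readMinimalBinaryZero), the sketch below shows the standard minimal (truncated) binary code for a value x with 0 <= x < b. The class MinimalBinarySketch is a hypothetical name, and the sketch assumes that the "strict bound" b means x < b and that 1 <= b < 2^30. It does not cover the zero-bound variant, whose convention is not described on this page.

```java
/** Illustrative minimal (truncated) binary code over '0'/'1' strings; not Terrier code. */
final class MinimalBinarySketch {
    /** Encodes x, with 0 <= x < b and 1 <= b < 2^30 (assumed preconditions). */
    static String encode(int x, int b) {
        int k = 31 - Integer.numberOfLeadingZeros(b); // floor(log2 b)
        int shortCodes = (1 << (k + 1)) - b;           // how many values get k-bit codewords
        return x < shortCodes ? toBits(x, k) : toBits(x + shortCodes, k + 1);
    }

    /** Decodes one codeword that starts at the beginning of bits. */
    static int decode(String bits, int b) {
        int k = 31 - Integer.numberOfLeadingZeros(b);
        int shortCodes = (1 << (k + 1)) - b;
        int value = 0;
        for (int i = 0; i < k; i++) {
            value = (value << 1) | (bits.charAt(i) - '0');
        }
        if (value < shortCodes) {
            return value;                               // k-bit codeword
        }
        value = (value << 1) | (bits.charAt(k) - '0');  // long codeword: one more bit
        return value - shortCodes;
    }

    /** Writes the low len bits of value, most significant bit first. */
    static String toBits(int value, int len) {
        StringBuilder bits = new StringBuilder();
        for (int i = len - 1; i >= 0; i--) {
            bits.append((value >>> i) & 1);
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        for (int x = 0; x < 5; x++) {
            System.out.print(encode(x, 5) + " "); // prints: 00 01 10 110 111
        }
    }
}
```

When b is a power of two every value gets exactly log2(b) bits, so the code reduces to plain fixed-width binary.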