org.terrier.structures
Class BasicLexiconEntry

java.lang.Object
  extended by org.terrier.structures.LexiconEntry
      extended by org.terrier.structures.BasicLexiconEntry
All Implemented Interfaces:
Serializable, org.apache.hadoop.io.Writable, BitFilePosition, BitIndexPointer, EntryStatistics, Pointer
Direct Known Subclasses:
BlockFieldLexiconEntry, BlockLexiconEntry, FieldLexiconEntry

public class BasicLexiconEntry
extends LexiconEntry
implements BitIndexPointer

Contains all the information about one entry in the Lexicon. Created to make thread-safe lookups in the Lexicon easier.

See Also:
Serialized Form

Nested Class Summary
static class BasicLexiconEntry.Factory
          Factory for creating LexiconEntry objects
 
Field Summary
 int n_t
          the number of documents that this entry occurs in
 byte startBitOffset
          the start bit offset of the entry in the inverted index
 long startOffset
          the start offset of the entry in the inverted index
 int termId
          the term id of this entry
 int TF
          the total number of occurrences of the term in the index
 
Fields inherited from interface org.terrier.structures.BitIndexPointer
BIT_MASK, FILE_SHIFT, MAX_FILE_ID
 
Constructor Summary
BasicLexiconEntry()
          Create an empty LexiconEntry
BasicLexiconEntry(int tid, int _n_t, int _TF)
          Create a lexicon entry with the following information.
BasicLexiconEntry(int tid, int _n_t, int _TF, byte fileId, BitFilePosition offset)
          Create a lexicon entry with the following information.
BasicLexiconEntry(int tid, int _n_t, int _TF, byte fileId, long _startOffset, byte _startBitOffset)
          Create a lexicon entry with the following information.
 
Method Summary
 void add(EntryStatistics le)
          increment the statistics of this lexicon entry by those of another
 int getDocumentFrequency()
          The number of documents that the entry (term) occurred in
 byte getFileNumber()
          Returns the file number: 0-32
 int getFrequency()
          The frequency (total number of occurrences) of the entry (term).
 int getNumberOfEntries()
          Returns number of "things" that this pointer refers to
 long getOffset()
          Return the offset
 byte getOffsetBits()
          Return the offset bits
 int getTermId()
          The id of the term
 String pointerToString()
          Returns a textual representation of the pointer alone
 void readFields(DataInput in)
          Reads the fields of this entry from the given DataInput
 void setBitIndexPointer(BitIndexPointer pointer)
          Update this pointer to reflect the same values as the specified pointer
 void setFileNumber(byte fileId)
          Set the file number
 void setNumberOfEntries(int n)
          Set the number of "things" that the pointer refers to
 void setOffset(BitFilePosition pos)
          Set the offset
 void setOffset(long bytes, byte bits)
          Set the offset in bytes and bits
 void setPointer(Pointer p)
          Sets the pointer within this object to that represented by the specified pointer
 void setStatistics(int _n_t, int _TF)
          Set the term statistics, in particular, the number of documents that this term appears in and the total number of occurrences of the term.
 void setTermId(int newTermId)
          Sets the ID for this term
 void subtract(EntryStatistics le)
          subtract the statistics of another lexicon entry from this one
 String toString()
          returns a string representation of this lexicon entry
 void write(DataOutput out)
          Writes the fields of this entry to the given DataOutput
 
Methods inherited from class org.terrier.structures.LexiconEntry
equals, getWritableEntryStatistics, hashCode
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

termId

public int termId
the term id of this entry


n_t

public int n_t
the number of documents that this entry occurs in


TF

public int TF
the total number of occurrences of the term in the index


startOffset

public long startOffset
the start offset of the entry in the inverted index


startBitOffset

public byte startBitOffset
the start bit offset of the entry in the inverted index

Constructor Detail

BasicLexiconEntry

public BasicLexiconEntry()
Create an empty LexiconEntry


BasicLexiconEntry

public BasicLexiconEntry(int tid,
                         int _n_t,
                         int _TF)
Create a lexicon entry with the following information.

Parameters:
tid - the term id
_n_t - the number of documents the term occurs in (document frequency)
_TF - the total number of occurrences of the term in the collection

BasicLexiconEntry

public BasicLexiconEntry(int tid,
                         int _n_t,
                         int _TF,
                         byte fileId,
                         long _startOffset,
                         byte _startBitOffset)
Create a lexicon entry with the following information.

Parameters:
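tid - the term id
_n_t - the number of documents the term occurs in (document frequency)
_TF - the total number of occurrences of the term in the collection
fileId - the file number
_startOffset - the start offset of the entry in the inverted index
_startBitOffset - the start bit offset of the entry in the inverted index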

BasicLexiconEntry

public BasicLexiconEntry(int tid,
                         int _n_t,
                         int _TF,
                         byte fileId,
                         BitFilePosition offset)
Create a lexicon entry with the following information.

Parameters:
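tid - the term id
_n_t - the number of documents the term occurs in (document frequency)
_TF - the total number of occurrences of the term in the collection
fileId - the file number
offset - the position (offset and bit offset) of the entry in the inverted index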
Method Detail

setStatistics

public void setStatistics(int _n_t,
                          int _TF)
Set the term statistics, in particular, the number of documents that this term appears in and the total number of occurrences of the term.

Specified by:
setStatistics in class LexiconEntry

add

public void add(EntryStatistics le)
increment the statistics of this lexicon entry by those of another

Specified by:
add in interface EntryStatistics

subtract

public void subtract(EntryStatistics le)
subtract the statistics of another lexicon entry from this one

Specified by:
subtract in interface EntryStatistics
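
A minimal sketch of what add and subtract plausibly do to the two term statistics, assuming both counters are combined field by field. TermStatsSketch is a hypothetical stand-in written for illustration; it is not a Terrier class and does not reproduce Terrier's implementation.

```java
// Hypothetical stand-in for the two term statistics; not a Terrier class.
final class TermStatsSketch {
    int n_t; // number of documents the term occurs in
    int TF;  // total number of occurrences of the term

    TermStatsSketch(int n_t, int TF) {
        this.n_t = n_t;
        this.TF = TF;
    }

    // Assumed semantics of add: sum both counters field by field.
    void add(TermStatsSketch other) {
        n_t += other.n_t;
        TF += other.TF;
    }

    // Assumed semantics of subtract: the inverse of add.
    void subtract(TermStatsSketch other) {
        n_t -= other.n_t;
        TF -= other.TF;
    }
}
```

Under this assumption, summing n_t gives a correct document frequency only when the two entries describe disjoint sets of documents, for example when the statistics of separately built indices are combined.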

toString

public String toString()
returns a string representation of this lexicon entry

Overrides:
toString in class LexiconEntry

getDocumentFrequency

public int getDocumentFrequency()
The number of documents that the entry (term) occurred in

Specified by:
getDocumentFrequency in interface EntryStatistics

getFrequency

public int getFrequency()
The frequency (total number of occurrences) of the entry (term).

Specified by:
getFrequency in interface EntryStatistics

getTermId

public int getTermId()
The id of the term

Specified by:
getTermId in interface EntryStatistics

getNumberOfEntries

public int getNumberOfEntries()
Returns number of "things" that this pointer refers to

Specified by:
getNumberOfEntries in interface Pointer
Overrides:
getNumberOfEntries in class LexiconEntry

getOffsetBits

public byte getOffsetBits()
Return the offset bits

Specified by:
getOffsetBits in interface BitFilePosition

getOffset

public long getOffset()
Return the offset

Specified by:
getOffset in interface BitFilePosition

getFileNumber

public byte getFileNumber()
Returns the file number: 0-32

Specified by:
getFileNumber in interface BitIndexPointer

setFileNumber

public void setFileNumber(byte fileId)
Set the file number

Specified by:
setFileNumber in interface BitIndexPointer

setTermId

public void setTermId(int newTermId)
Sets the ID for this term

Specified by:
setTermId in class LexiconEntry

setOffset

public void setOffset(long bytes,
                      byte bits)
Set the offset in bytes and bits

Specified by:
setOffset in interface BitFilePosition
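
The arithmetic behind a position given in bytes and bits can be sketched as follows. BitPositionSketch and its methods are hypothetical helpers, not part of Terrier, and the sketch assumes the bit part is in the range 0 to 7.

```java
// Hypothetical helper illustrating byte-plus-bit addressing; not part of Terrier.
final class BitPositionSketch {
    // Absolute bit index of a position given as whole bytes plus leftover bits.
    // Assumes 0 <= bits <= 7.
    static long toAbsoluteBits(long bytes, byte bits) {
        return bytes * 8L + bits;
    }

    // Whole-byte part of an absolute bit index.
    static long bytesOf(long absoluteBits) {
        return absoluteBits / 8L;
    }

    // Leftover bits (0 to 7) of an absolute bit index.
    static byte bitsOf(long absoluteBits) {
        return (byte) (absoluteBits % 8L);
    }
}
```

For instance, a position of 3 bytes and 5 bits corresponds to absolute bit 3 * 8 + 5 = 29.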

setBitIndexPointer

public void setBitIndexPointer(BitIndexPointer pointer)
Update this pointer to reflect the same values as the specified pointer

Specified by:
setBitIndexPointer in interface BitIndexPointer
Parameters:
pointer - pointer to use to set the offset, bit offset and file id parameters.

setOffset

public void setOffset(BitFilePosition pos)
Set the offset

Specified by:
setOffset in interface BitFilePosition

readFields

public void readFields(DataInput in)
                throws IOException
Reads the fields of this entry from the given DataInput.

Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException

write

public void write(DataOutput out)
           throws IOException
Writes the fields of this entry to the given DataOutput.

Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException
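
The write and readFields pair follows the usual Writable pattern: one method serializes the fields to a DataOutput and the other restores them from a DataInput in the same order. The sketch below illustrates that round trip using only java.io. EntryRecordSketch is a hypothetical stand-in that mirrors the five public fields listed above; the field order and encoding are assumptions, not Terrier's actual on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in mirroring the five public fields; not Terrier's code.
final class EntryRecordSketch {
    int termId;
    int n_t;
    int TF;
    long startOffset;
    byte startBitOffset;

    // Assumed field order; write and readFields only need to agree with each other.
    void write(DataOutput out) throws IOException {
        out.writeInt(termId);
        out.writeInt(n_t);
        out.writeInt(TF);
        out.writeLong(startOffset);
        out.writeByte(startBitOffset);
    }

    // Reads the fields back in exactly the order write emitted them.
    void readFields(DataInput in) throws IOException {
        termId = in.readInt();
        n_t = in.readInt();
        TF = in.readInt();
        startOffset = in.readLong();
        startBitOffset = in.readByte();
    }

    // Serializes an entry to an in-memory buffer and reads it into a fresh object.
    static EntryRecordSketch roundTrip(EntryRecordSketch e) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            e.write(new DataOutputStream(buf));
            EntryRecordSketch copy = new EntryRecordSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Keeping the two methods symmetric is the essential design point: any field added to write must be read at the same position in readFields, otherwise every later field is misaligned.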

setNumberOfEntries

public void setNumberOfEntries(int n)
Set the number of "things" that the pointer refers to

Specified by:
setNumberOfEntries in interface Pointer
Overrides:
setNumberOfEntries in class LexiconEntry

pointerToString

public String pointerToString()
Returns a textual representation of the pointer alone

Specified by:
pointerToString in interface Pointer
Overrides:
pointerToString in class LexiconEntry

setPointer

public void setPointer(Pointer p)
Sets the pointer within this object to that represented by the specified pointer

Specified by:
setPointer in interface Pointer
Overrides:
setPointer in class LexiconEntry
Parameters:
p - the other pointer, whose values are copied into the pointer in this object


Terrier 3.6. Copyright © 2004-2011 University of Glasgow