Terrier IR Platform 2.2.1
java.lang.Object
  uk.ac.gla.terrier.structures.Lexicon
    uk.ac.gla.terrier.structures.BlockLexicon
      uk.ac.gla.terrier.structures.UTFBlockLexicon
public class UTFBlockLexicon extends BlockLexicon
A lexicon class that saves the number of different blocks a term appears in, using UTF encoding of Strings. It is used only while a UTF block inverted index is being created. After that index has been created, the UTF block lexicon is transformed into a UTF lexicon.
| Field Summary | |
|---|---|
| static int | lexiconEntryLength - The size in bytes of an entry in the lexicon file. |
| Constructor Summary | |
|---|---|
| UTFBlockLexicon() | A default constructor. |
| UTFBlockLexicon(java.lang.String lexiconName) | Constructs an instance of UTFBlockLexicon and opens the corresponding file. |
| UTFBlockLexicon(java.lang.String path, java.lang.String prefix) | |
| Method Summary | |
|---|---|
| boolean | findTerm(int termId) - Finds the term given its term code. |
| boolean | findTerm(java.lang.String _term) - Performs a binary search in the lexicon in order to locate the given term. |
| int | getBlockFrequency() - Returns the block frequency for the given term. |
| static int | numberOfEntries(java.io.File f) - Returns the number of entries in the lexicon named by f. |
| static int | numberOfEntries(java.lang.String filename) - Returns the number of entries in the lexicon named by filename. |
| boolean | seekEntry(int i) - Seeks the i-th entry of the lexicon. |
| boolean | updateEntry(int i, int frequency, long endOffset, byte endBitOffset) - Deprecated. Block Lexicons are used during indexing, but not during retrieval. |
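
The summary above says that findTerm(java.lang.String) performs a binary search and that seekEntry(int) positions the lexicon at the i-th entry. The following is a minimal, self-contained sketch of how those two operations can work together over fixed-length entries. It does not use Terrier's code or on-disk format: the class name FixedLengthLexiconSketch, the 20-byte zero-padded UTF-8 term field, the entry layout, and the in-memory byte array are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class FixedLengthLexiconSketch {
    // Hypothetical layout, NOT Terrier's actual format:
    // 20 bytes of zero-padded UTF-8 term, 4-byte term id, 4-byte block frequency.
    public static final int TERM_BYTES = 20;
    public static final int ENTRY_LENGTH = TERM_BYTES + 4 + 4;

    private final ByteBuffer data;
    private final int numberOfEntries;
    private String term;
    private int termId;
    private int blockFrequency;

    public FixedLengthLexiconSketch(byte[] bytes) {
        this.data = ByteBuffer.wrap(bytes);
        this.numberOfEntries = bytes.length / ENTRY_LENGTH;
    }

    /** Serialises terms in sorted order; term ids are assigned in that order. */
    public static byte[] build(TreeMap<String, Integer> blockFrequencies) {
        ByteBuffer out = ByteBuffer.allocate(blockFrequencies.size() * ENTRY_LENGTH);
        int id = 0;
        for (Map.Entry<String, Integer> e : blockFrequencies.entrySet()) {
            byte[] utf8 = e.getKey().getBytes(StandardCharsets.UTF_8);
            if (utf8.length > TERM_BYTES) {
                throw new IllegalArgumentException("term too long: " + e.getKey());
            }
            out.put(Arrays.copyOf(utf8, TERM_BYTES)); // copyOf pads with zeros
            out.putInt(id++);
            out.putInt(e.getValue());
        }
        return out.array();
    }

    /** Loads the i-th entry; returns false if i is out of range. */
    public boolean seekEntry(int i) {
        if (i < 0 || i >= numberOfEntries) {
            return false;
        }
        int base = i * ENTRY_LENGTH; // fixed-length entries allow direct addressing
        byte[] raw = new byte[TERM_BYTES];
        for (int k = 0; k < TERM_BYTES; k++) {
            raw[k] = data.get(base + k);
        }
        int len = 0;
        while (len < TERM_BYTES && raw[len] != 0) {
            len++; // strip the zero padding
        }
        term = new String(raw, 0, len, StandardCharsets.UTF_8);
        termId = data.getInt(base + TERM_BYTES);
        blockFrequency = data.getInt(base + TERM_BYTES + 4);
        return true;
    }

    /** Binary search by term; on success the found entry is the current entry. */
    public boolean findTerm(String target) {
        int low = 0;
        int high = numberOfEntries - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            seekEntry(mid);
            int cmp = term.compareTo(target);
            if (cmp == 0) {
                return true;
            }
            if (cmp < 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return false;
    }

    public String getTerm() { return term; }
    public int getTermId() { return termId; }
    public int getBlockFrequency() { return blockFrequency; }

    public static void main(String[] args) {
        TreeMap<String, Integer> freqs = new TreeMap<>();
        freqs.put("retrieval", 7);
        freqs.put("index", 3);
        freqs.put("terrier", 12);
        FixedLengthLexiconSketch lex = new FixedLengthLexiconSketch(build(freqs));
        if (lex.findTerm("retrieval")) {
            System.out.println(lex.getBlockFrequency()); // prints 7
        }
    }
}
```

The sketch mirrors the calling pattern suggested by the method list: a successful findTerm leaves the lexicon positioned on an entry, and getters such as getBlockFrequency() then read from that current entry. Whether Terrier pads terms or orders entries in exactly this way is not stated in this page.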
| Methods inherited from class uk.ac.gla.terrier.structures.Lexicon |
|---|
| close, getEndBitOffset, getEndOffset, getIthLexiconEntry, getLexiconEntry, getLexiconEntry, getNt, getNumberOfLexiconEntries, getStartBitOffset, getStartOffset, getTerm, getTermId, getTF, iterator, print |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int lexiconEntryLength
| Constructor Detail |
|---|
public UTFBlockLexicon()
public UTFBlockLexicon(java.lang.String path,
java.lang.String prefix)
public UTFBlockLexicon(java.lang.String lexiconName)
Parameters:
lexiconName - the name of the lexicon file.
| Method Detail |
|---|
public boolean findTerm(int termId)
Overrides: findTerm in class BlockLexicon
Parameters:
termId - the term's id
public boolean findTerm(java.lang.String _term)
Overrides: findTerm in class BlockLexicon
Parameters:
_term - the term to search for.
public int getBlockFrequency()
Overrides: getBlockFrequency in class BlockLexicon
public boolean seekEntry(int i)
Overrides: seekEntry in class BlockLexicon
Parameters:
i - the index of the entry we are looking for.
public boolean updateEntry(int i,
int frequency,
long endOffset,
byte endBitOffset)
Overrides: updateEntry in class BlockLexicon
Parameters:
i - the i-th entry
frequency - the term's frequency
endOffset - the offset of the ending byte in the inverted file
endBitOffset - the offset in bits in the ending byte in the term's entry in the inverted file
public static int numberOfEntries(java.io.File f)
public static int numberOfEntries(java.lang.String filename)
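
This page does not say how numberOfEntries derives its result. Given that lexiconEntryLength is described as the size in bytes of one entry, one plausible approach is to divide the file length by the entry length. The sketch below shows that idea only; the class EntryCountSketch, the 28-byte ENTRY_LENGTH, and the helper writeTempLexicon are hypothetical and are not part of Terrier.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class EntryCountSketch {
    // Assumed entry size for illustration; not the value of lexiconEntryLength.
    public static final int ENTRY_LENGTH = 28;

    /** Counts entries as file length divided by the fixed entry length. */
    public static int numberOfEntries(File f) {
        return (int) (f.length() / ENTRY_LENGTH);
    }

    /** Convenience overload taking a file name. */
    public static int numberOfEntries(String filename) {
        return numberOfEntries(new File(filename));
    }

    /** Hypothetical helper: writes a temporary file holding the given number of empty entries. */
    public static File writeTempLexicon(int entries) {
        try {
            File tmp = File.createTempFile("lexicon", ".lex");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), new byte[entries * ENTRY_LENGTH]);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File lexiconFile = writeTempLexicon(3);
        System.out.println(numberOfEntries(lexiconFile));           // prints 3
        System.out.println(numberOfEntries(lexiconFile.getPath())); // prints 3
    }
}
```

Because both methods are static, a count of this kind can be obtained without opening the lexicon or reading any entry.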