Terrier IR Platform 2.2.1 |
java.lang.Object
  uk.ac.gla.terrier.structures.Lexicon
      uk.ac.gla.terrier.structures.UTFLexicon

public class UTFLexicon extends Lexicon
The class that implements the lexicon structure. Apart from the lexicon file, which contains the actual data about the terms, and takes its name from ApplicationSetup.LEXICON_FILENAME, another file is created and used, containing a mapping from the term's code to the offset of the term in the lexicon. The name of this file is given by ApplicationSetup.LEXICON_INDEX_FILENAME.
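The two-file layout described above can be illustrated with a small in-memory sketch. It does not reproduce Terrier's actual on-disk format: the class name `TwoFileLexiconSketch`, the record layout (a fixed-width UTF-8 term field followed by an int term id and an int frequency), the field widths, and the requirement that term ids run from 0 to n-1 are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: a hypothetical layout, not Terrier's real file format. */
final class TwoFileLexiconSketch {
    static final int TERM_BYTES = 20;                   // assumed fixed width of the term field
    static final int ENTRY_LENGTH = TERM_BYTES + 4 + 4; // term + term id + frequency

    final ByteBuffer lexicon;  // stands in for the lexicon file (entries sorted by term)
    final long[] idToOffset;   // stands in for the lexicon index file (term id -> byte offset)

    /** Assumes termIds is a permutation of 0..n-1 and terms fit in TERM_BYTES bytes. */
    TwoFileLexiconSketch(String[] sortedTerms, int[] termIds, int[] frequencies) {
        lexicon = ByteBuffer.allocate(sortedTerms.length * ENTRY_LENGTH);
        idToOffset = new long[sortedTerms.length];
        for (int i = 0; i < sortedTerms.length; i++) {
            long offset = (long) i * ENTRY_LENGTH;
            byte[] raw = sortedTerms[i].getBytes(StandardCharsets.UTF_8);
            lexicon.put(Arrays.copyOf(raw, TERM_BYTES)); // zero-pad to the fixed width
            lexicon.putInt(termIds[i]);
            lexicon.putInt(frequencies[i]);
            idToOffset[termIds[i]] = offset;             // record where this id's entry lives
        }
    }

    /** Resolves a term id to its term through the id-to-offset mapping. */
    String termForId(int termId) {
        byte[] raw = new byte[TERM_BYTES];
        ByteBuffer view = lexicon.duplicate();
        view.position((int) idToOffset[termId]);
        view.get(raw);
        int len = 0;
        while (len < TERM_BYTES && raw[len] != 0) len++; // strip the zero padding
        return new String(raw, 0, len, StandardCharsets.UTF_8);
    }

    /** Reads the frequency field of the entry belonging to a term id. */
    int frequencyForId(int termId) {
        return lexicon.getInt((int) idToOffset[termId] + TERM_BYTES + 4);
    }

    public static void main(String[] args) {
        TwoFileLexiconSketch lex = new TwoFileLexiconSketch(
                new String[] {"apple", "pear", "zebra"},
                new int[] {2, 0, 1},
                new int[] {7, 3, 1});
        System.out.println(lex.termForId(0) + " " + lex.frequencyForId(0)); // prints "pear 3"
    }
}
```

Because entries are sorted by term rather than by id, a lookup by id needs the separate mapping; a lookup by term can search the sorted entries directly.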
See Also:
ApplicationSetup.LEXICON_FILENAME, ApplicationSetup.LEXICON_INDEX_FILENAME

| Field Summary | |
|---|---|
| static int | lexiconEntryLength: The size in bytes of an entry in the lexicon file. |
| Constructor Summary | |
|---|---|
| UTFLexicon() | A default constructor. |
| UTFLexicon(java.lang.String lexiconName) | Constructs an instance of Lexicon and opens the corresponding file. |
| UTFLexicon(java.lang.String path, java.lang.String prefix) | |
| Method Summary | |
|---|---|
| boolean | findTerm(int _termId): Finds the term given its term code. |
| boolean | findTerm(java.lang.String _term): Performs a binary search in the lexicon in order to locate the given term. |
| LexiconEntry | getLexiconEntry(int termid): Returns a LexiconEntry describing all the information in the lexicon about the term denoted by termid. |
| LexiconEntry | getLexiconEntry(java.lang.String _term): Returns a LexiconEntry describing all the information in the lexicon about the term denoted by _term. |
| static int | numberOfEntries(java.io.File f) |
| static int | numberOfEntries(java.lang.String filename) |
| boolean | seekEntry(int i): Seeks the i-th entry of the lexicon. |
| boolean | updateEntry(int i, int frequency, long endOffset, byte endBitOffset): Deprecated. The Lexicon class is only used for reading the lexicon file, and not for writing any information. |
| Methods inherited from class uk.ac.gla.terrier.structures.Lexicon |
|---|
| close, getEndBitOffset, getEndOffset, getIthLexiconEntry, getNt, getNumberOfLexiconEntries, getStartBitOffset, getStartOffset, getTerm, getTermId, getTF, iterator, print |

| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int lexiconEntryLength
| Constructor Detail |
|---|
public UTFLexicon()
public UTFLexicon(java.lang.String path,
java.lang.String prefix)
public UTFLexicon(java.lang.String lexiconName)
Parameters:
lexiconName - the name of the lexicon file.

| Method Detail |
|---|
public boolean findTerm(int _termId)
Overrides:
findTerm in class Lexicon
Parameters:
_termId - the term's identifier
public boolean findTerm(java.lang.String _term)
Overrides:
findTerm in class Lexicon
Parameters:
_term - The term to search for.
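The method summary describes findTerm(String) as a binary search over the lexicon. The following is a minimal sketch of that technique, assuming the entries are sorted by `String.compareTo` order; the actual ordering and the way the real class reads entries from the file are not documented here. The class name `LexiconBinarySearchSketch` and the in-memory array are hypothetical stand-ins.

```java
/** Illustrative only: binary search over an in-memory, sorted term list. */
final class LexiconBinarySearchSketch {

    /** Returns the index of term in sortedTerms, or -1 if it is absent. */
    static int find(String[] sortedTerms, String term) {
        int low = 0;
        int high = sortedTerms.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;               // unsigned shift avoids int overflow
            // A file-backed lexicon would read the mid-th entry here instead.
            int cmp = term.compareTo(sortedTerms[mid]);
            if (cmp == 0) {
                return mid;                             // found
            } else if (cmp < 0) {
                high = mid - 1;                         // continue in the lower half
            } else {
                low = mid + 1;                          // continue in the upper half
            }
        }
        return -1;                                      // not present
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "pear", "zebra"};
        System.out.println(find(terms, "pear"));        // prints 1
        System.out.println(find(terms, "kiwi"));        // prints -1
    }
}
```

With fixed-length entries, each probe costs one seek and one read, so a lookup touches roughly log2(n) entries.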
public boolean seekEntry(int i)
Overrides:
seekEntry in class Lexicon
Parameters:
i - The index of the entry we are looking for.
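Since every entry occupies lexiconEntryLength bytes, seeking the i-th entry plausibly reduces to a multiplication. The sketch below shows that arithmetic on an in-memory buffer; the 8-byte record (an int term id plus an int frequency), the class name `SeekEntrySketch`, and the bounds-check behaviour are assumptions, not the documented behaviour of UTFLexicon.

```java
import java.nio.ByteBuffer;

/** Illustrative only: seeking into a buffer of fixed-length records. */
final class SeekEntrySketch {
    static final int ENTRY_LENGTH = 8; // assumed record size: int term id + int frequency

    /** Positions the buffer at the start of entry i; returns false if i is out of range. */
    static boolean seekEntry(ByteBuffer lexicon, int i) {
        long offset = (long) i * ENTRY_LENGTH;          // long arithmetic avoids overflow
        if (i < 0 || offset + ENTRY_LENGTH > lexicon.limit()) {
            return false;
        }
        lexicon.position((int) offset);
        return true;
    }

    /** Builds three sample records: (0, 7), (1, 3), (2, 1). */
    static ByteBuffer sampleLexicon() {
        ByteBuffer b = ByteBuffer.allocate(3 * ENTRY_LENGTH);
        b.putInt(0).putInt(7).putInt(1).putInt(3).putInt(2).putInt(1);
        b.rewind();                                     // back to the start for reading
        return b;
    }

    public static void main(String[] args) {
        ByteBuffer lex = sampleLexicon();
        if (seekEntry(lex, 1)) {
            int termId = lex.getInt();
            int frequency = lex.getInt();
            System.out.println(termId + " " + frequency); // prints "1 3"
        }
        System.out.println(seekEntry(lex, 3));          // prints false
    }
}
```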
public LexiconEntry getLexiconEntry(int termid)
Overrides:
getLexiconEntry in class Lexicon
Parameters:
termid - the termid of the term of interest
public LexiconEntry getLexiconEntry(java.lang.String _term)
Overrides:
getLexiconEntry in class Lexicon
Parameters:
_term - the String term that is of interest
public boolean updateEntry(int i,
int frequency,
long endOffset,
byte endBitOffset)
Overrides:
updateEntry in class Lexicon
Parameters:
i - the i-th entry
frequency - the term's frequency
endOffset - the offset of the ending byte in the inverted file
endBitOffset - the offset in bits in the ending byte in the term's entry in the inverted file
public static int numberOfEntries(java.io.File f)
public static int numberOfEntries(java.lang.String filename)