Package org.terrier.utility
Class TermCodes
java.lang.Object
    org.terrier.utility.TermCodes

public class TermCodes extends java.lang.Object
This class is used for assigning codes to terms as we index a document collection.
It reads two properties from the default properties file. The first is termcodes.initialcapacity, which specifies the initial capacity of the hash map that stores the term codes. The default value is 3000000.
The second is termcodes.garbagecollect, which enables or disables garbage collection when the method reset() is called. The default value is true.
Author:
Vassilis Plachouras
Constructor Summary
Constructor
TermCodes()
-
Method Summary
Modifier and Type, Method, Description
int getCode(java.lang.String term)
    Returns the code for a given term.
static void initialise()
    Initialises the properties from the property file.
void reset()
    Resets the hash map that contains the mapping from terms to term ids.
void setTermCode(java.lang.String term, int termCode)
    Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists.
Method Detail
-
initialise
public static void initialise()
Initialises the properties from the property file. The initial capacity of the hash map is set to the value of the property termcodes.initialcapacity, whose default value is 3000000. The property termcodes.garbagecollect, whose default value is true, enables or disables garbage collection when the method reset() is called.
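The property lookup described above can be sketched with the standard java.util.Properties class. This is only an illustration: TermCodeSettings and its parse method are hypothetical names, and Terrier's own property loading may work differently. Only the two property names and their defaults come from this page.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Terrier: it mirrors the two documented
// properties and their documented default values.
class TermCodeSettings {
    final int initialCapacity;
    final boolean garbageCollect;

    TermCodeSettings(Properties props) {
        // Fall back to the documented defaults when a property is absent.
        initialCapacity = Integer.parseInt(
                props.getProperty("termcodes.initialcapacity", "3000000"));
        garbageCollect = Boolean.parseBoolean(
                props.getProperty("termcodes.garbagecollect", "true"));
    }

    // Parses properties-file text, e.g. "termcodes.garbagecollect=false".
    static TermCodeSettings parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new TermCodeSettings(props);
    }
}
```

Parsing an empty string yields the defaults, while a file containing termcodes.initialcapacity=1000 would override only the capacity.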
-
getCode
public final int getCode(java.lang.String term)
Returns the code for a given term.
Parameters:
term - String the term for which the code will be returned.
Returns:
int the code for the given term
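To make the idea of a term-to-code mapping concrete, here is a minimal sketch. It assumes that a term seen for the first time receives the next free code, starting from 0; this page does not state that, so treat it as an assumption. SequentialTermCodes is a hypothetical class, not the Terrier implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of assigning integer codes to terms.
class SequentialTermCodes {
    private final Map<String, Integer> codes = new HashMap<>();
    private int counter = 0; // next code to hand out (assumed to start at 0)

    int getCode(String term) {
        Integer code = codes.get(term);
        if (code == null) {
            // Assumption: an unseen term is given the next sequential code.
            code = counter++;
            codes.put(term, code);
        }
        return code;
    }
}
```

Under these assumptions, repeated lookups of the same term always return the same code, which is the property an indexer relies on.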
-
reset
public void reset()
Resets the hash map that contains the mapping from terms to term ids. If the property termcodes.garbagecollect is true, it also performs garbage collection in order to free allocated memory. This method should be called after the creation of the lexicon.
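A reset of this kind might look like the following sketch. ResettableTermCodes is hypothetical, and restarting the counter at 0 is an assumption rather than documented behaviour. Note that System.gc() is only a request to the JVM, which is free to ignore it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of reset(); not the Terrier implementation.
class ResettableTermCodes {
    private final int initialCapacity;
    private final boolean garbageCollect;
    private Map<String, Integer> codes;
    private int counter = 0;

    ResettableTermCodes(int initialCapacity, boolean garbageCollect) {
        this.initialCapacity = initialCapacity;
        this.garbageCollect = garbageCollect;
        this.codes = new HashMap<>(initialCapacity);
    }

    int getCode(String term) {
        return codes.computeIfAbsent(term, t -> counter++);
    }

    int size() {
        return codes.size();
    }

    void reset() {
        // Drop the old map so that its entries become unreachable.
        codes = new HashMap<>(initialCapacity);
        counter = 0; // assumption: codes restart from 0 after a reset
        if (garbageCollect) {
            System.gc(); // a hint only; the JVM decides whether to collect
        }
    }
}
```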
-
setTermCode
public void setTermCode(java.lang.String term, int termCode)
Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists. NB: the class's internal counter variable probably needs to be taken into account in this method.
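The note about the counter can be illustrated as follows. This sketch shows one possible way to keep manually set codes from colliding with automatically assigned ones, by advancing the counter past any manually set code. It is an assumption about how the issue could be handled, not a description of what Terrier does. ManualTermCodes is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of combining manual and automatic codes.
class ManualTermCodes {
    private final Map<String, Integer> codes = new HashMap<>();
    private int counter = 0; // next automatically assigned code

    // Caller guarantees that neither the term nor the code is already in use.
    void setTermCode(String term, int termCode) {
        codes.put(term, termCode);
        // Assumed fix: move the counter past the manual code so that
        // later automatic assignments cannot reuse it.
        counter = Math.max(counter, termCode + 1);
    }

    int getCode(String term) {
        return codes.computeIfAbsent(term, t -> counter++);
    }
}
```

Without the counter update, a term set manually to code 0 would share that code with the first automatically coded term.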