org.terrier.utility
Class TermCodes

java.lang.Object
  extended by org.terrier.utility.TermCodes

public class TermCodes
extends java.lang.Object

This class is used for assigning codes to terms as we index a document collection.

It uses two properties from the default properties file. The first is termcodes.initialcapacity, which specifies the initial capacity of the hash map that holds the mapping from terms to codes. The default value is 3000000.

The second property is termcodes.garbagecollect, which enables or disables garbage collection when the method reset() is called. The default value is true.

Author:
Vassilis Plachouras

Constructor Summary
TermCodes()
           
 
Method Summary
static int getCode(java.lang.String term)
          Returns the code for a given term.
static void initialise()
          Initialises the properties from the property file.
static void reset()
          Resets the hashmap that contains the mapping from the terms to the term ids.
static void setTermCode(java.lang.String term, int termCode)
          Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TermCodes

public TermCodes()

Method Detail

initialise

public static void initialise()
Initialises the properties from the property file. The initial capacity of the hash map is set to the value of the property termcodes.initialcapacity, whose default value is 3000000. The second property, termcodes.garbagecollect, enables or disables garbage collection when the method reset() is called. Its default value is true.
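
As a rough illustration of the two properties described above, the following sketch reads them from a java.util.Properties object and falls back to the documented defaults. The class TermCodesConfigSketch and its fields are hypothetical names introduced for this example. It is not Terrier's own code, which obtains its settings through its own property-handling classes.

```java
import java.util.Properties;

// Hypothetical stand-in for illustration only; not part of Terrier.
class TermCodesConfigSketch {
    // Defaults as stated in the documentation of initialise().
    static final int DEFAULT_INITIAL_CAPACITY = 3000000;
    static final boolean DEFAULT_GARBAGE_COLLECT = true;

    final int initialCapacity;
    final boolean garbageCollect;

    TermCodesConfigSketch(Properties props) {
        // Use the configured value if present, otherwise the documented default.
        initialCapacity = Integer.parseInt(props.getProperty(
                "termcodes.initialcapacity",
                String.valueOf(DEFAULT_INITIAL_CAPACITY)));
        garbageCollect = Boolean.parseBoolean(props.getProperty(
                "termcodes.garbagecollect",
                String.valueOf(DEFAULT_GARBAGE_COLLECT)));
    }
}
```

With an empty Properties object the sketch yields 3000000 and true. Setting termcodes.initialcapacity to a smaller value may be reasonable for small collections, since it only sizes the map up front.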


getCode

public static final int getCode(java.lang.String term)
Returns the code for a given term.

Parameters:
term - String the term for which the code will be returned.
Returns:
int the code for the given term
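
The idea behind getCode can be sketched as a map lookup that assigns a fresh code the first time a term is seen. The class TermCodeAssignerSketch below is hypothetical, and the choice of consecutive codes starting at 0 is an assumption made for the example. The documentation above does not state which values Terrier actually assigns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of Terrier.
class TermCodeAssignerSketch {
    private static final Map<String, Integer> codes = new HashMap<>();
    // Assumption: codes are consecutive integers starting at 0.
    private static int counter = 0;

    // Returns the existing code for a term, or assigns the next free one.
    static int getCode(String term) {
        Integer existing = codes.get(term);
        if (existing != null) {
            return existing;
        }
        int code = counter++;
        codes.put(term, code);
        return code;
    }

    public static void main(String[] args) {
        System.out.println(getCode("information")); // prints 0
        System.out.println(getCode("retrieval"));   // prints 1
        System.out.println(getCode("information")); // prints 0 again
    }
}
```

The property that matters during indexing is that repeated calls with the same term return the same code, while distinct terms receive distinct codes.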

reset

public static void reset()
Resets the hashmap that contains the mapping from the terms to the term ids. If the property termcodes.garbagecollect is true, it then performs garbage collection in order to free allocated memory. This method should be called after the creation of the lexicon.
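
A minimal sketch of the reset behaviour described above follows. ResettableTermCodesSketch is a hypothetical class, and restarting the counter on reset is an assumption of this example, not something the documentation states. Note that System.gc() is only a request to the JVM, so memory is not guaranteed to be reclaimed immediately.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of Terrier.
class ResettableTermCodesSketch {
    private static final Map<String, Integer> codes = new HashMap<>();
    private static int counter = 0;
    // Mirrors the role of termcodes.garbagecollect (default true).
    static boolean garbageCollect = true;

    static int getCode(String term) {
        Integer existing = codes.get(term);
        if (existing != null) {
            return existing;
        }
        int code = counter++;
        codes.put(term, code);
        return code;
    }

    static int size() {
        return codes.size();
    }

    // Drops all term-to-code mappings and optionally requests a GC run.
    static void reset() {
        codes.clear();
        counter = 0; // assumption: code assignment starts over after a reset
        if (garbageCollect) {
            System.gc(); // a hint to the JVM, not a guarantee
        }
    }
}
```

In an indexing run the call would come once the lexicon has been written, since the in-memory mapping is no longer needed at that point.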


setTermCode

public static void setTermCode(java.lang.String term,
                               int termCode)
Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists. NB: the class's counter variable probably needs to be taken into account in this method.
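
The note about the counter can be made concrete with a sketch. If codes are handed out from a running counter, a manually set code could later collide with an automatically assigned one unless the counter is moved past it. The class ManualTermCodesSketch is hypothetical, and advancing the counter in setTermCode is one possible way to address the note, not a description of what Terrier does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of Terrier.
class ManualTermCodesSketch {
    private static final Map<String, Integer> codes = new HashMap<>();
    private static int counter = 0;

    static int getCode(String term) {
        Integer existing = codes.get(term);
        if (existing != null) {
            return existing;
        }
        int code = counter++;
        codes.put(term, code);
        return code;
    }

    // The caller is responsible for ensuring that neither the term nor
    // the code is already present.
    static void setTermCode(String term, int termCode) {
        codes.put(term, termCode);
        // Assumption: keep the counter ahead of every manually set code so
        // that later automatic assignments cannot reuse it.
        counter = Math.max(counter, termCode + 1);
    }
}
```

Without the last line of setTermCode, a later getCode call for a new term could return a code that was already set by hand.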



Terrier 3.5. Copyright © 2004-2011 University of Glasgow