Class TermCodes


  • public class TermCodes
    extends java.lang.Object

    This class is used for assigning codes to terms as we index a document collection.

    It reads two properties from the default properties file. The first, termcodes.initialcapacity, specifies the initial capacity of the underlying hash map. The default value is 3000000.

    The second property, termcodes.garbagecollect, enables or disables garbage collection when the method reset() is called. The default value is true.

    Author:
    Vassilis Plachouras
    • Constructor Summary

      Constructors 
      Constructor Description
      TermCodes()  
    • Method Summary

      Modifier and Type Method Description
      int getCode(java.lang.String term)
      Returns the code for a given term.
      static void initialise()
      Initialises the properties from the property file.
      void reset()
      Resets the hash map that contains the mapping from the terms to the term ids.
      void setTermCode(java.lang.String term, int termCode)
      Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • TermCodes

        public TermCodes()
    • Method Detail

      • initialise

        public static void initialise()
        Initialises the properties from the property file. The initial capacity of the hash map is set to the value of the property termcodes.initialcapacity, whose default value is 3000000. The property termcodes.garbagecollect enables or disables garbage collection when the method reset() is called; its default value is true.
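
The two properties and their documented defaults could be read roughly as follows. This is a minimal sketch using java.util.Properties; the class TermCodesSettingsSketch and its fields are hypothetical names used for illustration only, and the way the real class loads its properties file is not shown in this documentation.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of the documented API.
final class TermCodesSettingsSketch {
    final int initialCapacity;
    final boolean garbageCollect;

    TermCodesSettingsSketch(Properties props) {
        // Fall back to the documented defaults when a key is absent.
        initialCapacity = Integer.parseInt(
                props.getProperty("termcodes.initialcapacity", "3000000"));
        garbageCollect = Boolean.parseBoolean(
                props.getProperty("termcodes.garbagecollect", "true"));
    }
}
```

Passing an empty Properties object yields the defaults described above; setting either key overrides the corresponding value.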
      • getCode

        public final int getCode(java.lang.String term)
        Returns the code for a given term.
        Parameters:
        term - String the term for which the code will be returned.
        Returns:
        int the code for the given term
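
One way such a lookup might behave is sketched below. It assumes that a term seen for the first time receives the next unused code and that codes start at 0; neither detail is stated in this documentation, and TermCodeLookupSketch is a hypothetical stand-in rather than the documented class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the documented class.
final class TermCodeLookupSketch {
    private final Map<String, Integer> codes = new HashMap<>();
    private int counter = 0; // next unused code (assumed to start at 0)

    // Returns the existing code, or assigns the next free one to an unseen term.
    int getCode(String term) {
        Integer code = codes.get(term);
        if (code == null) {
            code = counter++;
            codes.put(term, code);
        }
        return code;
    }
}
```

Under these assumptions, repeated calls with the same term return the same code, and distinct terms receive distinct codes.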
      • reset

        public void reset()
        Resets the hash map that contains the mapping from the terms to the term ids. If the property termcodes.garbagecollect is true, it then performs garbage collection in order to free allocated memory. This method should be called after the creation of the lexicon.
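
The reset behaviour described above can be sketched as follows. TermCodeResetSketch, put and size are hypothetical names for illustration, and System.gc() is assumed here as the way garbage collection is requested. Whether the real method also restarts the code numbering is not stated in this documentation, so the sketch leaves that out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the documented class.
final class TermCodeResetSketch {
    private final boolean garbageCollect;
    private final Map<String, Integer> codes = new HashMap<>();

    TermCodeResetSketch(boolean garbageCollect) {
        this.garbageCollect = garbageCollect;
    }

    void put(String term, int code) {
        codes.put(term, code);
    }

    int size() {
        return codes.size();
    }

    // Drops every term-to-code mapping, then optionally asks the JVM to collect.
    void reset() {
        codes.clear();
        if (garbageCollect) {
            System.gc(); // only a request; the JVM may ignore it
        }
    }
}
```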
      • setTermCode

        public void setTermCode(java.lang.String term,
                                int termCode)
        Manually sets the code for a given term. Use this method only when you know that neither the term nor the term code already exists. NB: the counter variable probably needs to be considered in this method.
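
The NB about the counter can be illustrated with a small sketch. It assumes that automatic codes come from an incrementing counter, and it shows one possible way to keep that counter clear of manually assigned codes. ManualTermCodeSketch is hypothetical, and nothing here is confirmed to match the behaviour of the documented class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the documented class.
final class ManualTermCodeSketch {
    private final Map<String, Integer> codes = new HashMap<>();
    private int counter = 0; // next automatic code (assumption)

    // Caller guarantees that neither the term nor the code is already in use.
    void setTermCode(String term, int termCode) {
        codes.put(term, termCode);
        // Move the counter past the manual code so later automatic codes cannot reuse it.
        if (termCode >= counter) {
            counter = termCode + 1;
        }
    }

    // Returns the existing code, or assigns the next free one to an unseen term.
    int getCode(String term) {
        Integer code = codes.get(term);
        if (code == null) {
            code = counter++;
            codes.put(term, code);
        }
        return code;
    }
}
```

Without the counter adjustment, a later automatic assignment could hand out a code that was already set manually, which is presumably the concern behind the NB.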