Class Decorate

  • All Implemented Interfaces:
    PostFilter, Process

    public class Decorate
    extends java.lang.Object
    implements Process, PostFilter
    This class decorates a result set with metadata. The metadata can be highlighted, summarised with a query-biased summary, and escaped for display in another format. Controls:
    • summaries - comma or semicolon delimited list of the key names for which a query-biased summary should be created. e.g. summaries:snippet
    • emphasis - comma or semicolon delimited list of the key names in which occurrences of the query terms should be emboldened. e.g. emphasis:title;snippet
    • earlyDecorate - comma or semicolon delimited list of the key names that should be decorated early, e.g. to support another PostProcess using them.
    • escape - comma or semicolon delimited list of the key names that should be escaped e.g. escape:title;snippet;url. Currently, per-key type escaping is not supported. The default escape type is defined using the property decorate.escape.
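
The control values above are lists of key names separated by commas or semicolons. A minimal sketch of how such a value could be split into key names is shown here. The class `ControlValueSketch` and its method `parseKeys` are hypothetical names chosen for illustration and are not part of Terrier; the actual parsing in `Decorate` may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper, not part of Terrier: splits a control value into key names. */
final class ControlValueSketch {

    /** Splits on commas or semicolons, trims whitespace and drops empty entries. */
    static Set<String> parseKeys(String controlValue) {
        Set<String> keys = new LinkedHashSet<>();
        if (controlValue == null) {
            return keys;
        }
        for (String key : controlValue.split("[,;]")) {
            String trimmed = key.trim();
            if (!trimmed.isEmpty()) {
                keys.add(trimmed);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Value of a control such as emphasis:title;snippet
        System.out.println(parseKeys("title;snippet"));
        // Commas are treated the same way as semicolons
        System.out.println(parseKeys("title, snippet,url"));
    }
}
```

A set is used so that a key listed twice is only decorated once, and insertion order is kept to make the output predictable.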
    Properties:
    • decorate.escape - default escape type for metadata. Default is HTML. Possible escape types include XML, JAVASCRIPT, and URL. See utility.StringTools.ESCAPE
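
To illustrate what escaping selected keys with the default HTML type could look like, here is a small sketch. It does not use `StringTools.ESCAPE`; the class `EscapeSketch` and both of its methods are hypothetical stand-ins, and the set of characters escaped is an assumption rather than Terrier's actual behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch, not Terrier code: HTML-escapes the values of selected metadata keys. */
final class EscapeSketch {

    /** Replaces the characters &, <, > and " with HTML entities. */
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns a copy of the metadata in which only the keys named in escapeKeys are escaped. */
    static Map<String, String> escapeSelected(Map<String, String> metadata, Set<String> escapeKeys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            String value = e.getValue();
            result.put(e.getKey(), escapeKeys.contains(e.getKey()) ? escapeHtml(value) : value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "Tom & Jerry <episode 1>");
        metadata.put("docno", "D-001");
        // Corresponds to a control such as escape:title
        Set<String> escapeKeys = new HashSet<>(Arrays.asList("title"));
        System.out.println(escapeSelected(metadata, escapeKeys));
    }
}
```

As in the description above, a single escape type applies to every listed key; per-key escape types are not modelled.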
    Since:
    3.0
    Author:
    Craig Macdonald, Vassilis Plachouras, Ben He
    • Field Detail

      • CONTROL_VALUE_DELIMS

        protected static final java.lang.String[] CONTROL_VALUE_DELIMS
        Delimiters for breaking down the values of controls further.
      • metaCache

        protected LRUMap<java.lang.Integer,​java.lang.String[]> metaCache
        The cache used for the metadata. Implements a Least-Recently-Used policy for retaining the most recently accessed metadata.
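
A Least-Recently-Used cache keyed by docid can be sketched with the standard `java.util.LinkedHashMap` in access order. This is only an illustration of the policy; Terrier's `LRUMap` is a separate class whose implementation may differ, and `LruCacheSketch` and its capacity of 2 are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache sketch built on LinkedHashMap; not Terrier's LRUMap. */
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Evicts the least recently accessed entry once the capacity is exceeded. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        // docid -> metadata values, mirroring the shape of metaCache
        LruCacheSketch<Integer, String[]> cache = new LruCacheSketch<>(2);
        cache.put(1, new String[] {"Title one"});
        cache.put(2, new String[] {"Title two"});
        cache.get(1);                               // docid 1 becomes the most recently used
        cache.put(3, new String[] {"Title three"}); // evicts docid 2
        System.out.println(cache.keySet());         // prints [1, 3]
    }
}
```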
      • metaIndex

        protected MetaIndex metaIndex
        The meta index server. It is provided by the manager.
      • controlNonVisibleCharacters

        protected static final java.util.regex.Pattern controlNonVisibleCharacters
      • defaultEscape

        protected static final StringTools.ESCAPE defaultEscape
        The default escape type.
      • controlNonVisibleCharactersMatcher

        protected java.util.regex.Matcher controlNonVisibleCharactersMatcher
      • cleanQuery

        protected static final java.util.regex.Pattern cleanQuery
      • highlight

        protected java.util.regex.Pattern highlight
        highlighting pattern for the current query
      • qTerms

        protected java.lang.String[] qTerms
        query terms of the current query
      • keys

        protected gnu.trove.TObjectIntHashMap<java.lang.String> keys
      • summaryKeys

        protected java.util.Set<java.lang.String> summaryKeys
      • emphasisKeys

        protected java.util.Set<java.lang.String> emphasisKeys
      • earlyKeys

        protected java.util.Set<java.lang.String> earlyKeys
      • metaKeys

        protected java.lang.String[] metaKeys
    • Constructor Detail

      • Decorate

        public Decorate()
    • Method Detail

      • new_query

        public void new_query​(Manager m,
                              SearchRequest q,
                              ResultSet rs)
        Called before this PostFilter is applied to a resultset. Can be used to save information for the duration of the query.
        Specified by:
        new_query in interface PostFilter
        Parameters:
        m - The manager controlling this query
        q - The search request being processed
        rs - the resultset that is being iterated through
      • filter

        public byte filter​(Manager m,
                           SearchRequest q,
                           ResultSet rs,
                           int rank,
                           int docid)
        Called for each result in the resultset, used to filter out unwanted results.
        Specified by:
        filter in interface PostFilter
        Parameters:
        m - The manager controlling this query
        q - The search request being processed
        rs - the resultset that is being iterated through
        rank - the array index (rank) reached in the resultset
        docid - The docid of the result currently being processed.
      • process

        public void process​(Manager manager,
                            Request q)
        Performs decoration at the postprocess stage. Decoration only happens here if it is required by a later PostFilter or PostProcess.
        Specified by:
        process in interface Process
        Parameters:
        manager - The manager instance handling this search session.
        q - the current query being processed
      • getMetadata

        protected java.lang.String[] getMetadata​(java.lang.String[] metaKeys,
                                                 int docid)
      • getMetadata

        protected java.lang.String[][] getMetadata​(java.lang.String[] metaKeys,
                                                   int[] docids)
      • generateEmphasisPattern

        protected java.util.regex.Pattern generateEmphasisPattern​(java.lang.String[] _qTerms)
        Creates a regular expression pattern to highlight query terms in metadata.
        Parameters:
        _qTerms - query terms
        Returns:
        Pattern to apply
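
One way to build such a pattern is to join the quoted query terms into a case-insensitive alternation and wrap each match in bold tags. The sketch below is an assumption about the general technique, not the pattern that `generateEmphasisPattern` actually produces; `EmphasisSketch`, `emphasisPattern` and `embolden` are hypothetical names.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of query-term emphasis; not the pattern built by Terrier. */
final class EmphasisSketch {

    /** Builds a case-insensitive pattern matching any of the terms as whole words. */
    static Pattern emphasisPattern(String[] terms) {
        StringBuilder alternation = new StringBuilder();
        for (String term : terms) {
            if (term == null || term.isEmpty()) {
                continue;
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Pattern.quote prevents regex metacharacters in a term from being interpreted
            alternation.append(Pattern.quote(term));
        }
        if (alternation.length() == 0) {
            return Pattern.compile("(?!)"); // never matches when there are no usable terms
        }
        return Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    /** Wraps every match in bold tags, keeping the original casing of the text. */
    static String embolden(String text, Pattern pattern) {
        return pattern.matcher(text).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        Pattern p = emphasisPattern(new String[] {"information", "retrieval"});
        System.out.println(embolden("Terrier is an Information Retrieval platform", p));
    }
}
```

The word boundaries stop a term such as "retrieval" from matching inside a longer word. They assume that terms start and end with word characters, which may not hold for every query.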
      • checkControl

        protected boolean checkControl​(java.lang.String control_name,
                                       SearchRequest srq)
      • getInfo

        public java.lang.String getInfo()
        Returns the name of the post processor.
        Specified by:
        getInfo in interface Process
        Returns:
        the name of the post processor.