Package org.terrier.querying
Class Decorate
- java.lang.Object
  - org.terrier.querying.Decorate
- All Implemented Interfaces:
PostFilter, Process
public class Decorate extends java.lang.Object implements Process, PostFilter
This class decorates a result set with metadata. This metadata can be highlighted, can have a query-biased summary created, and can also be escaped for display in another format.
Controls:
- summaries - comma- or semicolon-delimited list of the key names for which a query-biased summary should be created, e.g. summaries:snippet
- emphasis - comma- or semicolon-delimited list of the key names in which occurrences of the query terms should be emboldened, e.g. emphasis:title;snippet
- earlyDecorate - comma- or semicolon-delimited list of the key names that should be decorated early, e.g. to support another PostProcess using them.
- escape - comma- or semicolon-delimited list of the key names that should be escaped, e.g. escape:title;snippet;url. Currently, per-key type escaping is not supported. The default escape type is defined using the property decorate.escape.
- decorate.escape - default escape type for metadata. Default is HTML. Possible escape types include XML, JAVASCRIPT, and URL. See utility.StringTools.ESCAPE.
- Since: 3.0
- Author: Craig Macdonald, Vassilis Plachouras, Ben He
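The control values described above are lists of key names separated by commas or semicolons. The following is a minimal sketch of how such a value could be split into key names. ControlValueSketch and parseKeys are hypothetical names used only for illustration and are not part of Terrier; the class itself relies on CONTROL_VALUE_DELIMS, whose exact contents are not shown on this page.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Terrier: splits a control value into key names.
class ControlValueSketch {

    // Assumption: key names are separated by commas or semicolons,
    // as stated in the class description.
    static Set<String> parseKeys(String controlValue) {
        Set<String> keys = new LinkedHashSet<>();
        if (controlValue == null) {
            return keys; // control not set: no keys
        }
        for (String part : controlValue.split("[;,]")) {
            String key = part.trim();
            if (!key.isEmpty()) {
                keys.add(key); // keep first-seen order, drop duplicates
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(parseKeys("title;snippet"));      // prints [title, snippet]
        System.out.println(parseKeys("title, snippet;url")); // prints [title, snippet, url]
    }
}
```

A set is used here because the corresponding fields (summaryKeys, emphasisKeys, earlyKeys) are declared as java.util.Set<java.lang.String>.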
Field Summary
Fields
Modifier and Type
Field
Description
protected static java.util.regex.Pattern
cleanQuery
protected static java.lang.String[]
CONTROL_VALUE_DELIMS
delimiters for breaking down the values of controls further
protected static java.util.regex.Pattern
controlNonVisibleCharacters
protected java.util.regex.Matcher
controlNonVisibleCharactersMatcher
protected static StringTools.ESCAPE
defaultEscape
what is the default escape sequence
protected java.util.Set<java.lang.String>
earlyKeys
protected java.util.Set<java.lang.String>
emphasisKeys
protected java.util.Map<java.lang.String,StringTools.ESCAPE>
escapeKeys
protected java.util.regex.Pattern
highlight
highlighting pattern for the current query
protected gnu.trove.TObjectIntHashMap<java.lang.String>
keys
protected LRUMap<java.lang.Integer,java.lang.String[]>
metaCache
The cache used for the meta data.
protected MetaIndex
metaIndex
The meta index server.
protected java.lang.String[]
metaKeys
protected java.lang.String[]
qTerms
query terms of the current query
protected Summariser
summariser
protected java.util.Set<java.lang.String>
summaryKeys
-
Fields inherited from interface org.terrier.querying.PostFilter
FILTER_ADJUSTED, FILTER_OK, FILTER_REMOVE
-
-
Constructor Summary
Constructors
Constructor
Description
Decorate()
-
Method Summary
Modifier and Type
Method
Description
protected boolean
checkControl(java.lang.String control_name, SearchRequest srq)
byte
filter(Manager m, SearchRequest q, ResultSet rs, int rank, int docid)
Called for each result in the resultset, used to filter out unwanted results.
protected java.util.regex.Pattern
generateEmphasisPattern(java.lang.String[] _qTerms)
Creates a regular expression pattern to highlight query terms in metadata.
java.lang.String
getInfo()
Returns the name of the post processor.
protected java.lang.String[]
getMetadata(java.lang.String[] metaKeys, int docid)
protected java.lang.String[][]
getMetadata(java.lang.String[] metaKeys, int[] docids)
void
new_query(Manager m, SearchRequest q, ResultSet rs)
Called before the processing of a resultset using this PostFilter is applied.
void
process(Manager manager, Request q)
Decoration at the postprocess stage.
-
-
-
Field Detail
-
CONTROL_VALUE_DELIMS
protected static final java.lang.String[] CONTROL_VALUE_DELIMS
delimiters for breaking down the values of controls further
-
metaCache
protected LRUMap<java.lang.Integer,java.lang.String[]> metaCache
The cache used for the meta data. Implements a Least-Recently-Used policy for retaining the most recently accessed metadata.
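To illustrate the Least-Recently-Used policy mentioned above, here is a minimal sketch built on java.util.LinkedHashMap in access order. It is a stand-in for illustration only: MetaCacheSketch is a hypothetical name, and the LRUMap class actually used by Decorate may behave differently in detail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an LRU metadata cache keyed by docid.
// This is not Terrier's LRUMap; it only demonstrates the eviction policy.
class MetaCacheSketch extends LinkedHashMap<Integer, String[]> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    MetaCacheSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, String[]> eldest) {
        // Evict the least recently accessed entry once the capacity is exceeded
        return size() > capacity;
    }

    public static void main(String[] args) {
        MetaCacheSketch cache = new MetaCacheSketch(2);
        cache.put(1, new String[] {"title of doc 1"});
        cache.put(2, new String[] {"title of doc 2"});
        cache.get(1);                                   // docid 1 is now the most recently used
        cache.put(3, new String[] {"title of doc 3"});  // evicts docid 2
        System.out.println(cache.keySet());             // prints [1, 3]
    }
}
```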
-
metaIndex
protected MetaIndex metaIndex
The meta index server. It is provided by the manager.
-
controlNonVisibleCharacters
protected static final java.util.regex.Pattern controlNonVisibleCharacters
-
defaultEscape
protected static final StringTools.ESCAPE defaultEscape
what is the default escape sequence
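As a rough illustration of how a default escape type might be resolved from the decorate.escape property and then applied to a metadata value, consider the sketch below. EscapeSketch, its nested EscapeType enum, and both methods are hypothetical; they only mirror the type names listed in the class description (HTML, XML, JAVASCRIPT, URL) and do not reproduce StringTools.ESCAPE or its escaping rules.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Hypothetical sketch, not Terrier code.
class EscapeSketch {

    // Assumption: mirrors the escape type names given in the class description.
    enum EscapeType { HTML, XML, JAVASCRIPT, URL }

    // Falls back to HTML when the property is unset, matching the documented default.
    // Unknown names cause valueOf to throw IllegalArgumentException.
    static EscapeType parse(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return EscapeType.HTML;
        }
        return EscapeType.valueOf(propertyValue.trim().toUpperCase(Locale.ROOT));
    }

    // Simplified escaping rules, chosen for illustration only.
    static String escape(EscapeType type, String s) {
        switch (type) {
            case URL:
                try {
                    return URLEncoder.encode(s, "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException("UTF-8 is always supported", e);
                }
            case HTML:
            case XML:
                return s.replace("&", "&amp;").replace("<", "&lt;")
                        .replace(">", "&gt;").replace("\"", "&quot;");
            default: // JAVASCRIPT
                return s.replace("\\", "\\\\").replace("'", "\\'").replace("\"", "\\\"");
        }
    }

    public static void main(String[] args) {
        EscapeType type = parse(null); // property not set
        System.out.println(type);                              // prints HTML
        System.out.println(escape(type, "<b>Tom & Jerry</b>")); // prints &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
        System.out.println(escape(parse("url"), "a b&c"));      // prints a+b%26c
    }
}
```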
-
controlNonVisibleCharactersMatcher
protected java.util.regex.Matcher controlNonVisibleCharactersMatcher
-
cleanQuery
protected static final java.util.regex.Pattern cleanQuery
-
highlight
protected java.util.regex.Pattern highlight
highlighting pattern for the current query
-
qTerms
protected java.lang.String[] qTerms
query terms of the current query
-
keys
protected gnu.trove.TObjectIntHashMap<java.lang.String> keys
-
summaryKeys
protected java.util.Set<java.lang.String> summaryKeys
-
emphasisKeys
protected java.util.Set<java.lang.String> emphasisKeys
-
escapeKeys
protected java.util.Map<java.lang.String,StringTools.ESCAPE> escapeKeys
-
earlyKeys
protected java.util.Set<java.lang.String> earlyKeys
-
summariser
protected Summariser summariser
-
metaKeys
protected java.lang.String[] metaKeys
-
-
Method Detail
-
new_query
public void new_query(Manager m, SearchRequest q, ResultSet rs)
Called before the processing of a resultset using this PostFilter is applied. Can be used to save information for the duration of the query.
- Specified by:
new_query in interface PostFilter
- Parameters:
m - The manager controlling this query
q - The search request being processed
rs - The resultset that is being iterated through
-
filter
public byte filter(Manager m, SearchRequest q, ResultSet rs, int rank, int docid)
Called for each result in the resultset, used to filter out unwanted results.
- Specified by:
filter in interface PostFilter
- Parameters:
m - The manager controlling this query
q - The search request being processed
rank - The array index (rank) reached in the resultset
docid - The docid of the result currently being processed
-
process
public void process(Manager manager, Request q)
Decoration at the postprocess stage. Decoration is applied only if it is required by a later PostFilter or PostProcess.
-
getMetadata
protected java.lang.String[] getMetadata(java.lang.String[] metaKeys, int docid)
-
getMetadata
protected java.lang.String[][] getMetadata(java.lang.String[] metaKeys, int[] docids)
-
generateEmphasisPattern
protected java.util.regex.Pattern generateEmphasisPattern(java.lang.String[] _qTerms)
Creates a regular expression pattern to highlight query terms in metadata.
- Parameters:
_qTerms - query terms
- Returns:
Pattern to apply
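One way such a highlighting pattern could be built is sketched below: the query terms are quoted, joined into a single alternation, and matched case-insensitively on word boundaries. This is an assumption about the general approach, not the pattern that generateEmphasisPattern actually produces; EmphasisSketch, buildPattern, and emphasise are hypothetical names, and the bold tags are an illustrative choice.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not Terrier code.
class EmphasisSketch {

    // Builds \b(term1|term2|...)\b with each term quoted so that
    // regex metacharacters in query terms are treated literally.
    static Pattern buildPattern(String[] terms) {
        StringBuilder alternatives = new StringBuilder();
        for (String term : terms) {
            if (term == null || term.isEmpty()) {
                continue; // skip unusable terms
            }
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(term));
        }
        if (alternatives.length() == 0) {
            throw new IllegalArgumentException("at least one non-empty query term is required");
        }
        return Pattern.compile("\\b(" + alternatives + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    // Wraps every match in bold tags, preserving the original casing of the text.
    static String emphasise(String text, Pattern pattern) {
        return pattern.matcher(text).replaceAll("<b>$1</b>");
    }

    public static void main(String[] args) {
        Pattern p = buildPattern(new String[] {"terrier", "ir"});
        // prints <b>Terrier</b> is an <b>IR</b> platform
        System.out.println(emphasise("Terrier is an IR platform", p));
    }
}
```

Compiling the pattern once per query and reusing it for every result is consistent with the highlight field being described as the pattern for the current query.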
-
checkControl
protected boolean checkControl(java.lang.String control_name, SearchRequest srq)
-
-