Package org.terrier.matching
Class TRECResultsMatching
- java.lang.Object
  - org.terrier.matching.TRECResultsMatching
- All Implemented Interfaces: Matching
public class TRECResultsMatching extends java.lang.Object implements Matching
A matching implementation that retrieves results from a TREC result file rather than from the current index. The result file must be compatible with trec_eval, i.e., each line should have the following format:
queryID Q0 docno rank score label
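As an illustration of the line format, the following is a minimal sketch of parsing one result line. It assumes the standard trec_eval column order (query id, the literal Q0, docno, rank, score, run label). The class TrecRunLine and its members are hypothetical names introduced here; they are not part of Terrier.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Terrier): holds the fields of one TREC run line. */
final class TrecRunLine {
    // Splits on runs of whitespace, so both tabs and repeated spaces are accepted.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    final String queryId;
    final String docno;
    final int rank;
    final double score;

    private TrecRunLine(String queryId, String docno, int rank, double score) {
        this.queryId = queryId;
        this.docno = docno;
        this.rank = rank;
        this.score = score;
    }

    /** Parses a line, assuming the order: queryID Q0 docno rank score label. */
    static TrecRunLine parse(String line) {
        String[] parts = WHITESPACE.split(line.trim());
        if (parts.length < 5) {
            throw new IllegalArgumentException("Expected at least 5 fields: " + line);
        }
        // parts[1] is the unused "Q0" column; parts[5], if present, is the run label.
        return new TrecRunLine(parts[0], parts[2],
                Integer.parseInt(parts[3]), Double.parseDouble(parts[4]));
    }

    public static void main(String[] args) {
        TrecRunLine r = parse("401 Q0 FT921-1234 0 12.75 myrun");
        System.out.println(r.queryId + " " + r.docno + " " + r.rank + " " + r.score);
    }
}
```

The sketch rejects lines with fewer than five fields instead of guessing at missing columns, which seems the safer choice for a file that is expected to be trec_eval compatible.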
Properties:
- matching.trecresults.file - the path to the TREC results file.
- matching.trecresults.format - the input format to parse document identifiers. Defaults to DOCNO. DOCNO assumes that docno is a reverse lookup key in the MetaIndex. If DOCID is specified, then the docnos are assumed to represent Terrier's docids, as generated by TRECDocidOutputFormat.
- matching.trecresults.scores - whether scores should be parsed. Defaults to true.
- matching.trecresults.length - the maximum number of documents per query. Defaults to 1000. Note that setting this property to 0 may slow down the retrieval process for large collections, as a result set of the size of the collection will be allocated in memory.
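The property names and defaults listed above can be summarised in a short sketch. It reads them from a plain java.util.Properties object as a stand-in for Terrier's own configuration mechanism, which is not described here; the class TrecResultsSettings and the example path are hypothetical.

```java
import java.util.Properties;

/** Hypothetical holder (not part of Terrier) for the four documented properties. */
final class TrecResultsSettings {
    final String file;          // matching.trecresults.file (no documented default)
    final String format;        // matching.trecresults.format, defaults to DOCNO
    final boolean parseScores;  // matching.trecresults.scores, defaults to true
    final int maxResults;       // matching.trecresults.length, defaults to 1000

    private TrecResultsSettings(String file, String format, boolean parseScores, int maxResults) {
        this.file = file;
        this.format = format;
        this.parseScores = parseScores;
        this.maxResults = maxResults;
    }

    /** Applies the documented defaults to any property that is not set. */
    static TrecResultsSettings from(Properties props) {
        return new TrecResultsSettings(
                props.getProperty("matching.trecresults.file"),
                props.getProperty("matching.trecresults.format", "DOCNO"),
                Boolean.parseBoolean(props.getProperty("matching.trecresults.scores", "true")),
                Integer.parseInt(props.getProperty("matching.trecresults.length", "1000")));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical path, used only for illustration.
        props.setProperty("matching.trecresults.file", "/path/to/run.res");
        TrecResultsSettings s = from(props);
        System.out.println(s.file + " " + s.format + " " + s.parseScores + " " + s.maxResults);
    }
}
```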
- Author:
- Craig Macdonald, Rodrygo Santos
-
-
Nested Class Summary
Nested Classes
- static class TRECResultsMatching.InputFormat - The result set input format.
-
Field Summary
Fields
- protected CollectionStatistics collStats - The underlying collections statistics.
- protected int docid - The current read document identifier.
- protected static java.lang.String DSMNS - The default namespace for document score modifiers.
- protected java.util.List<DocumentScoreModifier> dsms - The list of document score modifiers to be applied.
- protected java.lang.String filename - The TREC results filename.
- protected TRECResultsMatching.InputFormat format - The input format to use when parsing document identifiers.
- protected boolean found - Whether the current query was found in the results file.
- protected Index index - The underlying index.
- protected org.slf4j.Logger logger - This object's logger.
- protected int maxResults - The maximum number of results to read per query.
- protected boolean parseScores - Whether document scores should be parsed from the results file.
- protected java.lang.String qid - The current query id.
- protected java.io.BufferedReader reader - The TREC results file reader.
- protected boolean reset - Whether the current file has already been reset.
- protected ResultSet rs - The result set for a query.
- protected double score - The current read score.
- protected static java.util.regex.Pattern SPLIT_SPACE_PLUS
-
Constructor Summary
Constructors
- TRECResultsMatching(Index _index) - Constructs an instance of the TRECResultsMatching given an index.
- TRECResultsMatching(Index _index, java.lang.String _filename) - Constructs an instance of the TRECResultsMatching.
- TRECResultsMatching(Index _index, java.lang.String _filename, java.lang.String defDSMs) - Constructs an instance of the TRECResultsMatching.
-
Method Summary
Methods
- protected boolean checkValid()
- protected void finalize()
- CollectionStatistics getCollectionStatistics() - Returns collection statistics.
- protected int getDocid(java.lang.String docno)
- java.lang.String getInfo() - Return a human readable description of this Matching class
- protected void initDSMs(java.lang.String defDSMs)
- protected void initialise(int max) - Initialises the current result set by allocating memory for max results.
- ResultSet match(java.lang.String _qid, MatchingQueryTerms mqt) - Get a ResultSet for the given query terms.
- protected boolean read(java.lang.String _qid)
- protected void reopen()
- void setCollectionStatistics(CollectionStatistics _collStats) - Update the collection statistics being used by this matching instance
-
-
-
Field Detail
-
SPLIT_SPACE_PLUS
protected static final java.util.regex.Pattern SPLIT_SPACE_PLUS
-
index
protected Index index
The underlying index.
-
collStats
protected CollectionStatistics collStats
The underlying collections statistics.
-
DSMNS
protected static final java.lang.String DSMNS
The default namespace for document score modifiers.
- See Also: Constant Field Values
-
dsms
protected java.util.List<DocumentScoreModifier> dsms
The list of document score modifiers to be applied.
-
filename
protected java.lang.String filename
The TREC results filename.
-
reader
protected java.io.BufferedReader reader
The TREC results file reader.
-
format
protected TRECResultsMatching.InputFormat format
The input format to use when parsing document identifiers.
-
parseScores
protected final boolean parseScores
Whether document scores should be parsed from the results file.
-
maxResults
protected final int maxResults
The maximum number of results to read per query.
-
qid
protected java.lang.String qid
The current query id.
-
rs
protected ResultSet rs
The result set for a query.
-
docid
protected int docid
The current read document identifier.
-
score
protected double score
The current read score.
-
found
protected boolean found
Whether the current query was found in the results file.
-
reset
protected boolean reset
Whether the current file has already been reset.
-
logger
protected org.slf4j.Logger logger
This object's logger.
-
-
Constructor Detail
-
TRECResultsMatching
public TRECResultsMatching(Index _index) throws java.io.IOException
Constructs an instance of the TRECResultsMatching given an index.
- Parameters: _index
- Throws: java.io.IOException
-
TRECResultsMatching
public TRECResultsMatching(Index _index, java.lang.String _filename) throws java.io.IOException
Constructs an instance of the TRECResultsMatching.
- Parameters: _index, _filename
- Throws: java.io.IOException
-
TRECResultsMatching
public TRECResultsMatching(Index _index, java.lang.String _filename, java.lang.String defDSMs) throws java.io.IOException
Constructs an instance of the TRECResultsMatching.
- Parameters: _index, _filename, defDSMs
- Throws: java.io.IOException
-
-
Method Detail
-
reopen
protected void reopen() throws java.io.IOException
- Throws: java.io.IOException
-
initDSMs
protected void initDSMs(java.lang.String defDSMs)
-
getInfo
public java.lang.String getInfo()
Description copied from interface: Matching
Return a human readable description of this Matching class
-
getDocid
protected int getDocid(java.lang.String docno) throws java.io.IOException
- Throws: java.io.IOException
-
read
protected boolean read(java.lang.String _qid) throws java.io.IOException
- Throws: java.io.IOException
-
checkValid
protected boolean checkValid()
-
match
public ResultSet match(java.lang.String _qid, MatchingQueryTerms mqt) throws java.io.IOException
Description copied from interface: Matching
Get a ResultSet for the given query terms.
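This page does not describe how match locates a query in the results file. Purely as an illustration of the idea of serving results from a run file, the sketch below collects the docnos listed for one query id, capped at a maximum count. The class RunFileLookup, its method, and the treatment of 0 as "no cap" are assumptions made for this example and should not be read as Terrier's implementation.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch (not Terrier's code): looks up one query's documents in a run file. */
final class RunFileLookup {

    /**
     * Returns the docnos listed for qid, in file order.
     * Assumption: maxResults == 0 means "no cap"; otherwise at most maxResults are returned.
     * An empty list means the query was not found.
     */
    static List<String> docnosFor(BufferedReader reader, String qid, int maxResults) {
        List<String> docnos = new ArrayList<>();
        Iterator<String> lines = reader.lines().iterator();
        while (lines.hasNext()) {
            String line = lines.next().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\\s+");
            if (parts.length < 3 || !parts[0].equals(qid)) {
                continue; // malformed line, or a line for another query
            }
            if (maxResults > 0 && docnos.size() >= maxResults) {
                break; // cap reached
            }
            docnos.add(parts[2]); // third column is the docno
        }
        return docnos;
    }

    public static void main(String[] args) {
        String run = "1 Q0 d1 0 3.0 run\n1 Q0 d2 1 2.0 run\n2 Q0 d3 0 1.0 run\n";
        BufferedReader reader = new BufferedReader(new StringReader(run));
        System.out.println(docnosFor(reader, "1", 1000));
    }
}
```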
-
setCollectionStatistics
public void setCollectionStatistics(CollectionStatistics _collStats)
Description copied from interface: Matching
Update the collection statistics being used by this matching instance
- Specified by: setCollectionStatistics in interface Matching
- Parameters: _collStats - CollectionStatistics to use during matching
-
getCollectionStatistics
public CollectionStatistics getCollectionStatistics()
Returns collection statistics.
- Returns: collection statistics
-
initialise
protected void initialise(int max)
Initialises the current result set by allocating memory for max results.
- Parameters: max - The maximum number of results to be stored.
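The memory cost mentioned for matching.trecresults.length can be made concrete with a small sketch. It assumes that a limit of 0 leads to a buffer sized to the whole collection, as the property description suggests, and that results are held in parallel docid and score arrays. ResultBuffer is a hypothetical class, not Terrier's ResultSet.

```java
/** Hypothetical sketch (not Terrier's ResultSet): parallel arrays sized up front. */
final class ResultBuffer {
    final int[] docids;
    final double[] scores;

    /**
     * Allocates room for max results.
     * Assumption: max == 0 means "no limit", so the buffer is sized to the collection.
     */
    ResultBuffer(int max, int collectionSize) {
        int capacity = (max == 0) ? collectionSize : max;
        this.docids = new int[capacity];
        this.scores = new double[capacity];
    }

    int capacity() {
        return docids.length;
    }

    public static void main(String[] args) {
        // With the documented default of 1000, the buffer stays small.
        System.out.println(new ResultBuffer(1000, 500_000).capacity());
        // With 0, one slot per document in the collection is allocated.
        System.out.println(new ResultBuffer(0, 500_000).capacity());
    }
}
```

Under these assumptions, a limit of 0 on a collection of 500,000 documents allocates 500,000 slots per array for every query, against 1,000 with the default.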
-
finalize
protected void finalize() throws java.lang.Throwable
- Overrides: finalize in class java.lang.Object
- Throws: java.lang.Throwable
-
-