Package org.terrier.matching
Class TRECResultsMatching
- java.lang.Object
-
- org.terrier.matching.TRECResultsMatching
-
- All Implemented Interfaces:
Matching
public class TRECResultsMatching extends java.lang.Object implements Matching
A matching implementation that retrieves results from a TREC result file rather than the current index. Such a result file must be compatible with trec_eval, i.e., it should have the following format:
queryID Q0 docno rank score label
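As an illustration of the file format, the following is a minimal sketch of splitting one result line into its fields. It assumes the standard trec_eval column order (query id, the literal Q0, docno, rank, score, label) and a whitespace-based split. The class TrecRunLine and its members are hypothetical names for this example and are not part of Terrier; the actual value of SPLIT_SPACE_PLUS is not shown on this page, so the pattern below is an assumption.

```java
import java.util.regex.Pattern;

/** Hypothetical holder for one parsed line of a TREC result file (not a Terrier class). */
final class TrecRunLine {
    // Assumed splitter: one or more whitespace characters.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    final String queryId;
    final String docno;
    final int rank;
    final double score;
    final String label;

    private TrecRunLine(String queryId, String docno, int rank, double score, String label) {
        this.queryId = queryId;
        this.docno = docno;
        this.rank = rank;
        this.score = score;
        this.label = label;
    }

    /** Parses a line of the assumed form: queryID Q0 docno rank score label. */
    static TrecRunLine parse(String line) {
        String[] parts = WHITESPACE.split(line.trim());
        if (parts.length < 6) {
            throw new IllegalArgumentException(
                "Expected 6 fields but found " + parts.length + ": " + line);
        }
        // parts[1] is the constant "Q0" column, which carries no information.
        return new TrecRunLine(
            parts[0],
            parts[2],
            Integer.parseInt(parts[3]),
            Double.parseDouble(parts[4]),
            parts[5]);
    }
}
```

A caller that only needs document identifiers could skip the score field, which corresponds to what the matching.trecresults.scores property controls.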
Properties:
- matching.trecresults.file - the path to the TREC results file.
- matching.trecresults.format - the input format used to parse document identifiers. Defaults to DOCNO. DOCNO assumes that docno is a reverse lookup key in the MetaIndex. If DOCID is specified, then the docnos are assumed to represent Terrier's docids, as generated by TRECDocidOutputFormat.
- matching.trecresults.scores - whether scores should be parsed. Defaults to true.
- matching.trecresults.length - the maximum number of documents per query. Defaults to 1000. Note that setting this property to 0 may slow down the retrieval process for large collections, as a result set of the size of the collection will be allocated in memory.
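The defaults listed above can be sketched as follows. This example uses a plain java.util.Properties object as a stand-in for however Terrier actually supplies its configuration, which is an assumption. The class TrecResultsSettings and its capacity method are hypothetical and only illustrate the documented defaults and the effect of setting the length to 0.

```java
import java.util.Properties;

/** Hypothetical view of the matching.trecresults.* properties (not a Terrier class). */
final class TrecResultsSettings {
    final String file;
    final String format;
    final boolean parseScores;
    final int maxResults;

    TrecResultsSettings(Properties props) {
        // No default is documented for the file path, so it may be null here.
        file = props.getProperty("matching.trecresults.file");
        format = props.getProperty("matching.trecresults.format", "DOCNO");
        parseScores = Boolean.parseBoolean(
            props.getProperty("matching.trecresults.scores", "true"));
        maxResults = Integer.parseInt(
            props.getProperty("matching.trecresults.length", "1000"));
    }

    /**
     * Number of result slots to allocate per query. A length of 0 means
     * "as many as the collection holds", which is why it can be costly
     * for large collections.
     */
    int capacity(int numberOfDocuments) {
        return maxResults == 0 ? numberOfDocuments : maxResults;
    }
}
```

With only the file property set, this sketch yields the DOCNO format, score parsing enabled, and a capacity of 1000 regardless of collection size.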
- Author:
- Craig Macdonald, Rodrygo Santos
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
- static class TRECResultsMatching.InputFormat - The result set input format.
-
Field Summary
Fields (modifier and type, field, description):
- protected CollectionStatistics collStats - The underlying collections statistics.
- protected int docid - The current read document identifier.
- protected static java.lang.String DSMNS - The default namespace for document score modifiers.
- protected java.util.List<DocumentScoreModifier> dsms - The list of document score modifiers to be applied.
- protected java.lang.String filename - The TREC results filename.
- protected TRECResultsMatching.InputFormat format - The input format to use when parsing document identifiers.
- protected boolean found - Whether the current query was found in the results file.
- protected Index index - The underlying index.
- protected org.slf4j.Logger logger - This object's logger.
- protected int maxResults - The maximum number of results to read per query.
- protected boolean parseScores - Whether document scores should be parsed from the results file.
- protected java.lang.String qid - The current query id.
- protected java.io.BufferedReader reader - The TREC results file reader.
- protected boolean reset - Whether the current file has already been reset.
- protected ResultSet rs - The result set for a query.
- protected double score - The current read score.
- protected static java.util.regex.Pattern SPLIT_SPACE_PLUS
-
Constructor Summary
Constructors (constructor, description):
- TRECResultsMatching(Index _index) - Constructs an instance of the TRECResultsMatching given an index.
- TRECResultsMatching(Index _index, java.lang.String _filename) - Constructs an instance of the TRECResultsMatching.
- TRECResultsMatching(Index _index, java.lang.String _filename, java.lang.String defDSMs) - Constructs an instance of the TRECResultsMatching.
-
Method Summary
Methods (modifier and type, method, description):
- protected boolean checkValid()
- protected void finalize()
- CollectionStatistics getCollectionStatistics() - Returns collection statistics.
- protected int getDocid(java.lang.String docno)
- java.lang.String getInfo() - Return a human readable description of this Matching class
- protected void initDSMs(java.lang.String defDSMs)
- protected void initialise(int max) - Initialises the current result set by allocating memory for max results.
- ResultSet match(java.lang.String _qid, MatchingQueryTerms mqt) - Get a ResultSet for the given query terms.
- protected boolean read(java.lang.String _qid)
- protected void reopen()
- void setCollectionStatistics(CollectionStatistics _collStats) - Update the collection statistics being used by this matching instance
-
-
-
Field Detail
-
SPLIT_SPACE_PLUS
protected static final java.util.regex.Pattern SPLIT_SPACE_PLUS
-
index
protected Index index
The underlying index.
-
collStats
protected CollectionStatistics collStats
The underlying collections statistics.
-
DSMNS
protected static final java.lang.String DSMNS
The default namespace for document score modifiers.
- See Also:
- Constant Field Values
-
dsms
protected java.util.List<DocumentScoreModifier> dsms
The list of document score modifiers to be applied.
-
filename
protected java.lang.String filename
The TREC results filename.
-
reader
protected java.io.BufferedReader reader
The TREC results file reader.
-
format
protected TRECResultsMatching.InputFormat format
The input format to use when parsing document identifiers.
-
parseScores
protected final boolean parseScores
Whether document scores should be parsed from the results file.
-
maxResults
protected final int maxResults
The maximum number of results to read per query.
-
qid
protected java.lang.String qid
The current query id.
-
rs
protected ResultSet rs
The result set for a query.
-
docid
protected int docid
The current read document identifier.
-
score
protected double score
The current read score.
-
found
protected boolean found
Whether the current query was found in the results file.
-
reset
protected boolean reset
Whether the current file has already been reset.
-
logger
protected org.slf4j.Logger logger
This object's logger.
-
-
Constructor Detail
-
TRECResultsMatching
public TRECResultsMatching(Index _index) throws java.io.IOException
Constructs an instance of the TRECResultsMatching given an index.
- Parameters:
_index
- Throws:
java.io.IOException
-
TRECResultsMatching
public TRECResultsMatching(Index _index, java.lang.String _filename) throws java.io.IOException
Constructs an instance of the TRECResultsMatching.
- Parameters:
_index
_filename
- Throws:
java.io.IOException
-
TRECResultsMatching
public TRECResultsMatching(Index _index, java.lang.String _filename, java.lang.String defDSMs) throws java.io.IOException
Constructs an instance of the TRECResultsMatching.
- Parameters:
_index
_filename
defDSMs
- Throws:
java.io.IOException
-
-
Method Detail
-
reopen
protected void reopen() throws java.io.IOException
- Throws:
java.io.IOException
-
initDSMs
protected void initDSMs(java.lang.String defDSMs)
-
getInfo
public java.lang.String getInfo()
Description copied from interface: Matching
Return a human readable description of this Matching class
-
getDocid
protected int getDocid(java.lang.String docno) throws java.io.IOException
- Throws:
java.io.IOException
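The two input formats described for matching.trecresults.format suggest how a docno string could be turned into an internal docid. The following is a minimal sketch under stated assumptions, not the actual body of getDocid: a plain Map stands in for the MetaIndex reverse lookup, the class DocidResolver and its Format enum are hypothetical, and returning -1 for an unknown docno is a choice made for this example only.

```java
import java.util.Map;

/** Hypothetical docno-to-docid resolver (not a Terrier class). */
final class DocidResolver {
    /** Mirrors the two documented input formats. */
    enum Format { DOCNO, DOCID }

    private final Format format;
    // Stand-in for a reverse lookup from docno to docid.
    private final Map<String, Integer> docnoToDocid;

    DocidResolver(Format format, Map<String, Integer> docnoToDocid) {
        this.format = format;
        this.docnoToDocid = docnoToDocid;
    }

    /** Returns the docid for the given identifier, or -1 if it is unknown. */
    int resolve(String docno) {
        if (format == Format.DOCID) {
            // The file already contains internal docids, so just parse them.
            return Integer.parseInt(docno);
        }
        Integer docid = docnoToDocid.get(docno);
        return docid == null ? -1 : docid;
    }
}
```

The DOCID path avoids a lookup per result line, which may matter when replaying long result files.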
-
read
protected boolean read(java.lang.String _qid) throws java.io.IOException
- Throws:
java.io.IOException
-
checkValid
protected boolean checkValid()
-
match
public ResultSet match(java.lang.String _qid, MatchingQueryTerms mqt) throws java.io.IOException
Description copied from interface: Matching
Get a ResultSet for the given query terms.
-
setCollectionStatistics
public void setCollectionStatistics(CollectionStatistics _collStats)
Description copied from interface: Matching
Update the collection statistics being used by this matching instance
- Specified by:
setCollectionStatistics in interface Matching
- Parameters:
_collStats
- CollectionStatistics to use during matching
-
getCollectionStatistics
public CollectionStatistics getCollectionStatistics()
Returns collection statistics.
- Returns:
- collection statistics
-
initialise
protected void initialise(int max)
Initialises the current result set by allocating memory for max results.
- Parameters:
max
- The maximum number of results to be stored.
-
finalize
protected void finalize() throws java.lang.Throwable
- Overrides:
finalize in class java.lang.Object
- Throws:
java.lang.Throwable
-
-