org.terrier.matching
Class OldBasicMatching

java.lang.Object
  extended by org.terrier.matching.OldBasicMatching
All Implemented Interfaces:
Matching

public class OldBasicMatching
extends java.lang.Object
implements Matching

This is the original matching implementation of Terrier. It matches documents against a query by first assigning scores to documents for each query term and modifying these scores with the appropriate modifiers. Then, a series of document score modifiers is applied, if necessary.

Author:
Vassilis Plachouras, Craig Macdonald

Field Summary
protected  CollectionStatistics collectionStatistics
          The collection statistics.
protected  java.util.ArrayList<DocumentScoreModifier> documentModifiers
          Contains the document score modifiers to be applied for a query.
protected static java.lang.String dsmNamespace
          The default namespace for the document score modifiers that are specified in the properties file.
protected static boolean IGNORE_LOW_IDF_TERMS
          A property that enables ignoring terms with a low IDF.
protected  Index index
          The index used for retrieval.
protected  PostingIndex<BitIndexPointer> invertedIndex
          The inverted file.
protected  Lexicon<java.lang.String> lexicon
          The lexicon used.
protected static org.apache.log4j.Logger logger
          The logger for this class.
protected static boolean MATCH_EMPTY_QUERY
          A property that, when true, allows all documents to be matched by an empty query.
protected  int numberOfRetrievedDocuments
          The number of retrieved documents for a query.
protected  ResultSet resultSet
          The result set.
protected static int RETRIEVED_SET_SIZE
          The maximum number of documents in the final retrieved set.
 
Constructor Summary
protected OldBasicMatching()
           
  OldBasicMatching(Index _index)
          Constructs a matching instance for the given index: creates the CollectionResultSet and initialises the document and term modifier containers.
 
Method Summary
 void addDocumentScoreModifier(DocumentScoreModifier documentScoreModifier)
          Registers a document score modifier.
protected  void assignScores(int i, WeightingModel[] wModels, ResultSet rs, IterablePosting postings, LexiconEntry lEntry, double queryTermWeight)
          Assigns scores to the documents in the posting list of a query term.
 DocumentScoreModifier getDocumentScoreModifier(int i)
          Returns the i-th registered document score modifier.
 java.lang.String getInfo()
          Returns a descriptive string for the retrieval process performed.
 ResultSet getResultSet()
          Deprecated. match() now returns the ResultSet
protected  void initialise()
          Initialises the arrays prior to retrieval.
protected  void initialise(double[] scs)
          Initialises the arrays prior to retrieval, with the given scores.
protected  void initialiseDSMs()
           
 ResultSet match(java.lang.String queryNumber, MatchingQueryTerms queryTerms)
          Implements the matching of a query with the documents.
 void setCollectionStatistics(CollectionStatistics cs)
          Sets the collection statistics.
 void setModel(Model model)
          Deprecated.  
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

protected static final org.apache.log4j.Logger logger
The logger for this class.


dsmNamespace

protected static final java.lang.String dsmNamespace
The default namespace for the document score modifiers that are specified in the properties file.

See Also:
Constant Field Values

RETRIEVED_SET_SIZE

protected static int RETRIEVED_SET_SIZE
The maximum number of documents in the final retrieved set. It corresponds to the property matching.retrieved_set_size. The default value is 1000; setting the property to 0 returns all matched documents.
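
The cut-off semantics of this property can be sketched as follows. This is a minimal, self-contained illustration that uses java.util.Properties as a stand-in for Terrier's configuration; the class RetrievedSetSizeSketch and both of its helper methods are hypothetical and are not part of the Terrier API.

```java
import java.util.Properties;

// Hypothetical illustration; not part of Terrier.
class RetrievedSetSizeSketch {

    // Reads matching.retrieved_set_size, falling back to the documented default of 1000.
    static int retrievedSetSize(Properties props) {
        return Integer.parseInt(props.getProperty("matching.retrieved_set_size", "1000"));
    }

    // Size of the final retrieved set: a limit of 0 means "keep every matched document".
    static int finalSetSize(int retrievedSetSize, int matchedDocuments) {
        if (retrievedSetSize == 0) {
            return matchedDocuments;
        }
        return Math.min(retrievedSetSize, matchedDocuments);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Property not set: the default limit applies.
        System.out.println(finalSetSize(retrievedSetSize(props), 2500));
        // Property set to 0: all matched documents are kept.
        props.setProperty("matching.retrieved_set_size", "0");
        System.out.println(finalSetSize(retrievedSetSize(props), 2500));
    }
}
```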


IGNORE_LOW_IDF_TERMS

protected static boolean IGNORE_LOW_IDF_TERMS
A property that enables ignoring terms with a low IDF. The match method checks whether the frequency of a term in the collection is higher than the number of documents. If it is, then by default no scores are assigned to documents that contain this term. This default behaviour can be changed through the corresponding property ignore.low.idf.terms, whose default value is true.
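
The check described above can be illustrated with a small sketch. The class LowIdfCheckSketch and the method skipTerm are hypothetical names chosen for this example; the actual test inside match may be written differently.

```java
// Hypothetical illustration; not part of Terrier.
class LowIdfCheckSketch {

    // A term is skipped only when the flag is enabled and the term occurs
    // more often in the collection than there are documents.
    static boolean skipTerm(boolean ignoreLowIdfTerms,
                            long termFrequencyInCollection,
                            long numberOfDocuments) {
        return ignoreLowIdfTerms && termFrequencyInCollection > numberOfDocuments;
    }

    public static void main(String[] args) {
        long numberOfDocuments = 1000;
        // A very frequent term with the flag enabled: skipped.
        System.out.println(skipTerm(true, 4200, numberOfDocuments));
        // The same term with the flag disabled: scored as usual.
        System.out.println(skipTerm(false, 4200, numberOfDocuments));
        // A rarer term: scored regardless of the flag.
        System.out.println(skipTerm(true, 300, numberOfDocuments));
    }
}
```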


MATCH_EMPTY_QUERY

protected static boolean MATCH_EMPTY_QUERY
A property that, when true, allows all documents to be matched by an empty query. In this case the ordering of documents is arbitrary: more specifically, it is the ordering of documents in the document index. The corresponding property is match.empty.query and the default value is false.
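
A rough sketch of this behaviour follows. It assumes that document identifiers run from 0 to N-1 in document index order and that an empty query matches nothing when the flag is false; both are assumptions made for illustration, and EmptyQuerySketch is a hypothetical class.

```java
// Hypothetical illustration; not part of Terrier.
class EmptyQuerySketch {

    // Returns the document ids matched by an empty query.
    // Assumption: ids 0..N-1 reflect the order of the document index.
    static int[] matchEmptyQuery(boolean matchEmptyQuery, int numberOfDocuments) {
        if (!matchEmptyQuery) {
            return new int[0]; // assumed behaviour when the flag is off
        }
        int[] docids = new int[numberOfDocuments];
        for (int i = 0; i < numberOfDocuments; i++) {
            docids[i] = i; // index order, not a relevance ranking
        }
        return docids;
    }

    public static void main(String[] args) {
        System.out.println(matchEmptyQuery(false, 4).length);
        System.out.println(matchEmptyQuery(true, 4).length);
    }
}
```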


numberOfRetrievedDocuments

protected int numberOfRetrievedDocuments
The number of retrieved documents for a query.


index

protected Index index
The index used for retrieval.


lexicon

protected Lexicon<java.lang.String> lexicon
The lexicon used.


invertedIndex

protected PostingIndex<BitIndexPointer> invertedIndex
The inverted file.


collectionStatistics

protected CollectionStatistics collectionStatistics
The collection statistics.


resultSet

protected ResultSet resultSet
The result set.


documentModifiers

protected java.util.ArrayList<DocumentScoreModifier> documentModifiers
Contains the document score modifiers to be applied for a query.

Constructor Detail

OldBasicMatching

protected OldBasicMatching()

OldBasicMatching

public OldBasicMatching(Index _index)
Constructs a matching instance for the given index: creates the CollectionResultSet and initialises the document and term modifier containers.

Parameters:
_index - the object that encapsulates the basic data structures used for retrieval.
Method Detail

initialiseDSMs

protected void initialiseDSMs()

getResultSet

public ResultSet getResultSet()
Deprecated. match() now returns the ResultSet

Returns the result set.


initialise

protected void initialise()
Initialises the arrays prior to retrieval. Memory for the arrays is allocated only the first time the method is called.


initialise

protected void initialise(double[] scs)
Initialises the arrays prior to retrieval, with the given scores. Memory for the arrays is allocated only the first time the method is called.

Parameters:
scs - double[] the scores to initialise the result set with.
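
The allocate-once pattern shared by both initialise variants can be sketched as below. LazyInitialiseSketch, its score array, and the allocations counter are hypothetical; the real class keeps its arrays inside the result set, so this only illustrates the idea of reusing memory across queries.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of Terrier.
class LazyInitialiseSketch {
    private final int numberOfDocuments;
    private double[] scores;   // allocated on the first call only
    int allocations = 0;       // counts allocations, for demonstration

    LazyInitialiseSketch(int numberOfDocuments) {
        this.numberOfDocuments = numberOfDocuments;
    }

    private void ensureAllocated() {
        if (scores == null) {
            scores = new double[numberOfDocuments];
            allocations++;
        }
    }

    // Counterpart of initialise(): resets every score to zero.
    void initialise() {
        ensureAllocated();
        Arrays.fill(scores, 0.0d);
    }

    // Counterpart of initialise(double[] scs): starts from the given scores.
    // Assumes scs holds at least numberOfDocuments entries.
    void initialise(double[] scs) {
        ensureAllocated();
        System.arraycopy(scs, 0, scores, 0, numberOfDocuments);
    }

    double[] scores() {
        return scores;
    }

    public static void main(String[] args) {
        LazyInitialiseSketch matching = new LazyInitialiseSketch(3);
        matching.initialise();                              // allocates
        matching.initialise(new double[] {1.0, 2.0, 3.0});  // reuses the array
        System.out.println(matching.allocations);
    }
}
```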

addDocumentScoreModifier

public void addDocumentScoreModifier(DocumentScoreModifier documentScoreModifier)
Registers a document score modifier. If more than one modifier is registered, they are applied in the order in which they were registered.

Parameters:
documentScoreModifier - DocumentScoreModifier the score modifier to be applied.
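
Because modifiers are applied in registration order, the order can change the final scores. The sketch below shows this with a hypothetical ScoreModifier interface that stands in for DocumentScoreModifier; the real interface has a different signature, and ModifierOrderSketch is not part of Terrier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Terrier.
class ModifierOrderSketch {

    // Simplified stand-in for a document score modifier.
    interface ScoreModifier {
        void modify(double[] scores);
    }

    private final List<ScoreModifier> modifiers = new ArrayList<>();

    // Appends to the end of the list, so registration order is preserved.
    void addModifier(ScoreModifier modifier) {
        modifiers.add(modifier);
    }

    // Returns the i-th registered modifier.
    ScoreModifier getModifier(int i) {
        return modifiers.get(i);
    }

    // Applies every modifier in the order in which it was registered.
    void applyAll(double[] scores) {
        for (ScoreModifier modifier : modifiers) {
            modifier.modify(scores);
        }
    }

    public static void main(String[] args) {
        ModifierOrderSketch matching = new ModifierOrderSketch();
        // Registered first: add 1 to every score.
        matching.addModifier(sc -> { for (int i = 0; i < sc.length; i++) sc[i] += 1.0; });
        // Registered second: double every score.
        matching.addModifier(sc -> { for (int i = 0; i < sc.length; i++) sc[i] *= 2.0; });
        double[] scores = {1.0, 2.0};
        matching.applyAll(scores);
        System.out.println(Arrays.toString(scores)); // prints [4.0, 6.0]
    }
}
```

Registering the two modifiers in the opposite order would give 3.0 and 5.0 instead, which is why the registration order matters.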

getDocumentScoreModifier

public DocumentScoreModifier getDocumentScoreModifier(int i)
Returns the i-th registered document score modifier.

Returns:
the i-th registered document score modifier.

setModel

public void setModel(Model model)
Deprecated. 

Sets the weighting model used for retrieval.

Parameters:
model - the weighting model used for retrieval

setCollectionStatistics

public void setCollectionStatistics(CollectionStatistics cs)
Sets the collection statistics.

Specified by:
setCollectionStatistics in interface Matching
Parameters:
cs - CollectionStatistics to use during matching

getInfo

public java.lang.String getInfo()
Returns a descriptive string for the retrieval process performed.

Specified by:
getInfo in interface Matching

match

public ResultSet match(java.lang.String queryNumber,
                       MatchingQueryTerms queryTerms)
                throws java.io.IOException
Implements the matching of a query with the documents.

Specified by:
match in interface Matching
Parameters:
queryNumber - the identifier of the processed query.
queryTerms - the query terms to be processed.
Returns:
the result set for this query.
Throws:
java.io.IOException - if a problem occurs during matching

assignScores

protected void assignScores(int i,
                            WeightingModel[] wModels,
                            ResultSet rs,
                            IterablePosting postings,
                            LexiconEntry lEntry,
                            double queryTermWeight)
                     throws java.io.IOException
Assigns scores to the documents in the posting list of a query term.

Parameters:
i - the index of the query term being processed
wModels - weighting models to use for this term
rs - the result set to alter
postings - the posting list to process
lEntry - the lexicon entry holding the term's statistics
queryTermWeight - weight of the query term
Throws:
java.io.IOException
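
The general shape of per-term score assignment can be sketched as follows. This is an illustration under stated assumptions, not the method's actual body: parallel arrays stand in for the IterablePosting, the Scorer interface stands in for WeightingModel, and multiplying by queryTermWeight is assumed here because this page does not say how the weight is combined with the model score.

```java
// Hypothetical illustration; not part of Terrier.
class AssignScoresSketch {

    // Simplified stand-in for a weighting model: scores one posting.
    interface Scorer {
        double score(int termFrequency, int documentLength);
    }

    // Walks one term's postings and accumulates a score for each document.
    // docids, tfs and docLengths are parallel arrays, one entry per posting.
    static void assignScores(double[] scores, int[] docids, int[] tfs, int[] docLengths,
                             Scorer[] scorers, double queryTermWeight) {
        for (int p = 0; p < docids.length; p++) {
            double score = 0.0d;
            for (Scorer scorer : scorers) {
                score += scorer.score(tfs[p], docLengths[p]);
            }
            // Assumption: the query term weight scales the summed model scores.
            scores[docids[p]] += queryTermWeight * score;
        }
    }

    public static void main(String[] args) {
        double[] scores = new double[3];
        // Toy scorer that returns the raw term frequency.
        Scorer[] scorers = { (tf, len) -> tf };
        assignScores(scores, new int[] {0, 2}, new int[] {2, 1}, new int[] {10, 5}, scorers, 2.0);
        System.out.println(java.util.Arrays.toString(scores)); // prints [4.0, 0.0, 2.0]
    }
}
```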


Terrier 3.5. Copyright © 2004-2011 University of Glasgow