org.terrier.matching
Class PostingListManager

java.lang.Object
  extended by org.terrier.matching.PostingListManager
All Implemented Interfaces:
java.io.Closeable

public class PostingListManager
extends java.lang.Object
implements java.io.Closeable

The PostingListManager is responsible for opening the appropriate posting lists (IterablePosting) for a given MatchingQueryTerms object. It also knows how each Posting should be scored.

Plugins are also supported by PostingListManager. Each plugin class should implement the PostingListManagerPlugin interface, and be named explicitly in the matching.postinglist.manager.plugins property.

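Properties:
ignore.low.idf.terms - whether terms with a low IDF should be ignored; defaults to true.
matching.postinglist.manager.plugins - names the PostingListManagerPlugin classes to load.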
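
The two properties mentioned on this page, ignore.low.idf.terms and matching.postinglist.manager.plugins, could be read along the following lines. This is only a sketch: plain java.util.Properties stands in for Terrier's own configuration mechanism, the plugin class name com.example.MyPlugin is hypothetical, and treating the plugins value as a comma-separated list is an assumption rather than something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Stand-in sketch: reads the two properties from java.util.Properties. */
final class PlmPropertiesSketch {

    /** Reads ignore.low.idf.terms, falling back to the documented default of true. */
    static boolean ignoreLowIdfTerms(Properties props) {
        return Boolean.parseBoolean(props.getProperty("ignore.low.idf.terms", "true"));
    }

    /** Splits the plugin property on commas (an assumed format) into class names. */
    static List<String> pluginClassNames(Properties props) {
        List<String> names = new ArrayList<>();
        String raw = props.getProperty("matching.postinglist.manager.plugins", "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical plugin class name, for illustration only.
        props.setProperty("matching.postinglist.manager.plugins", "com.example.MyPlugin");
        System.out.println(ignoreLowIdfTerms(props));
        System.out.println(pluginClassNames(props));
    }
}
```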

Example Usage

The following code shows how term-at-a-time matching may be performed using the PostingListManager:
 
 // mqt and index are assumed to have been obtained elsewhere
 MatchingQueryTerms mqt;
 Index index;
 PostingListManager plm = new PostingListManager(index, index.getCollectionStatistics(), mqt);
 plm.prepare(false);
 for (int term = 0; term < plm.size(); term++)
 {
   IterablePosting ip = plm.getPosting(term);
   // process every posting of this term before moving to the next term
   while (ip.next() != IterablePosting.EOL)
   {
     double score = plm.score(term);
     int id = ip.getId();
   }
 }
 plm.close();
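
The usage example computes a score and a document id for each posting but leaves open what happens to them. A common pattern in term-at-a-time matching is to add each score to a per-document accumulator. The sketch below illustrates that idea with plain arrays standing in for posting lists; the class TaatSketch and its input layout are hypothetical and not part of Terrier.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in sketch of term-at-a-time accumulation; no Terrier classes are used. */
final class TaatSketch {

    /**
     * docIds[t][k] and scores[t][k] describe the k-th posting of term t.
     * Returns the summed score per document id.
     */
    static Map<Integer, Double> accumulate(int[][] docIds, double[][] scores) {
        Map<Integer, Double> accumulators = new HashMap<>();
        for (int term = 0; term < docIds.length; term++) {
            // Exhaust one term's posting list before moving to the next term.
            for (int k = 0; k < docIds[term].length; k++) {
                accumulators.merge(docIds[term][k], scores[term][k], Double::sum);
            }
        }
        return accumulators;
    }

    public static void main(String[] args) {
        int[][] docIds = { {1, 4}, {4, 7} };
        double[][] scores = { {0.5, 1.0}, {2.0, 0.25} };
        System.out.println(accumulate(docIds, scores));
    }
}
```

Document 4 appears in both lists here, so its accumulator receives a contribution from each term.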
 

Since:
3.5
Author:
Nicola Tonellotto and Craig Macdonald
See Also:
Matching

Nested Class Summary
static interface PostingListManager.PostingListManagerPlugin
          Interface for plugins to further alter the posting lists managed by the PostingListManager
 
Field Summary
protected  CollectionStatistics collectionStatistics
          statistics of the collection
protected static boolean IGNORE_LOW_IDF_TERMS
          Whether terms with a low IDF should be ignored.
protected  Index index
          underlying index
protected  InvertedIndex invertedIndex
          inverted index of the index
protected  Lexicon<java.lang.String> lexicon
          lexicon for the index
protected static org.apache.log4j.Logger logger
           
protected  int numTerms
          number of terms
protected  java.util.List<WeightingModel[]> termModels
          weighting models for each term
protected  java.util.List<IterablePosting> termPostings
          posting lists for each term
protected  java.util.List<EntryStatistics> termStatistics
          EntryStatistics for each term
protected  java.util.List<java.lang.String> termStrings
          String form for each term
 
Constructor Summary
protected PostingListManager(Index _index, CollectionStatistics cs)
          Create a posting list manager for the given index and statistics
  PostingListManager(Index _index, CollectionStatistics _cs, MatchingQueryTerms mqt)
          Create a posting list manager for the given index and statistics, and populated using the specified MatchingQueryTerms.
 
Method Summary
 void addSingleTerm(java.lang.String queryTerm, double weight, EntryStatistics entryStats, WeightingModel[] wmodels)
          Add a single term to those to be matched for this query.
 void addSingleTermAlternatives(java.lang.String[] terms, java.lang.String stringForm, double weight, EntryStatistics[] entryStats, WeightingModel[] wmodels)
          Adds a synonym group to the matching process.
 void addSingleTermAlternatives(java.lang.String[] terms, java.lang.String stringForm, double weight, EntryStatistics entryStats, WeightingModel[] wmodels)
          Adds a synonym group to the matching process.
 void close()
           
 int getNumTerms()
          Returns the number of posting lists (that are terms) for this query
 IterablePosting getPosting(int i)
          Returns the IterablePosting corresponding to the specified term
 EntryStatistics getStatistics(int i)
          Returns the EntryStatistics corresponding to the specified term
 java.lang.String getTerm(int i)
           
static EntryStatistics mergeStatistics(EntryStatistics[] entryStats)
          Knows how to merge several EntryStatistics for a single effective term
 void prepare(boolean firstMove)
          Counts the number of active terms.
 double score(int i)
          Returns the score using all weighting models for the current posting of the specified term
 int size()
          Returns the number of posting lists for this query
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

protected static final org.apache.log4j.Logger logger

IGNORE_LOW_IDF_TERMS

protected static boolean IGNORE_LOW_IDF_TERMS
Whether terms with a low IDF should be ignored. Controlled by the ignore.low.idf.terms property; defaults to true.


termPostings

protected final java.util.List<IterablePosting> termPostings
posting lists for each term


termModels

protected final java.util.List<WeightingModel[]> termModels
weighting models for each term


termStatistics

protected final java.util.List<EntryStatistics> termStatistics
EntryStatistics for each term


termStrings

protected final java.util.List<java.lang.String> termStrings
String form for each term


numTerms

protected int numTerms
number of terms


index

protected Index index
underlying index


lexicon

protected Lexicon<java.lang.String> lexicon
lexicon for the index


invertedIndex

protected InvertedIndex invertedIndex
inverted index of the index


collectionStatistics

protected CollectionStatistics collectionStatistics
statistics of the collection

Constructor Detail

PostingListManager

protected PostingListManager(Index _index,
                             CollectionStatistics cs)
                      throws java.io.IOException
Create a posting list manager for the given index and statistics

Throws:
java.io.IOException

PostingListManager

public PostingListManager(Index _index,
                          CollectionStatistics _cs,
                          MatchingQueryTerms mqt)
                   throws java.io.IOException
Create a posting list manager for the given index and statistics, and populated using the specified MatchingQueryTerms.

Parameters:
_index - index to obtain postings from
_cs - collection statistics to use
mqt - MatchingQueryTerms object calculated for the query
Throws:
java.io.IOException
Method Detail

addSingleTerm

public void addSingleTerm(java.lang.String queryTerm,
                          double weight,
                          EntryStatistics entryStats,
                          WeightingModel[] wmodels)
                   throws java.io.IOException
Add a single term to those to be matched for this query. Terms with more occurrences than the number of documents will be ignored if IGNORE_LOW_IDF_TERMS is enabled.

Parameters:
queryTerm - String form of the query term
weight - influence of this query term in scoring
entryStats - statistics to be used for this query term. If null, these will be obtained from the local Lexicon
wmodels - weighting models to be applied for this query term
Throws:
java.io.IOException
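
The filtering rule described for addSingleTerm can be illustrated in isolation. This is a minimal sketch, assuming that "occurrences" means the term's total frequency in the collection; the class and method names are hypothetical, and the real check lives inside PostingListManager.

```java
/** Stand-in sketch of the low-IDF filter; names are hypothetical. */
final class LowIdfFilterSketch {

    /**
     * Returns true when a term should be skipped: the filter is enabled and the
     * term occurs more often than there are documents in the collection.
     */
    static boolean shouldIgnore(long termFrequency, long numberOfDocuments, boolean ignoreLowIdfTerms) {
        return ignoreLowIdfTerms && termFrequency > numberOfDocuments;
    }

    public static void main(String[] args) {
        System.out.println(shouldIgnore(1_500L, 1_000L, true));  // prints "true"
        System.out.println(shouldIgnore(1_500L, 1_000L, false)); // prints "false"
        System.out.println(shouldIgnore(200L, 1_000L, true));    // prints "false"
    }
}
```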

mergeStatistics

public static EntryStatistics mergeStatistics(EntryStatistics[] entryStats)
Knows how to merge several EntryStatistics for a single effective term
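
This page does not say how the merge is performed. One plausible approach, sketched below with a hypothetical Stats class standing in for EntryStatistics, sums the collection frequency and the document frequency of the group's members. Summing document frequencies overcounts documents that contain more than one member, so the merged document frequency is best read as an upper bound.

```java
/** Stand-in sketch of merging per-term statistics for a synonym group. */
final class MergeStatsSketch {

    /** Hypothetical, simplified substitute for EntryStatistics. */
    static final class Stats {
        final int documentFrequency; // number of documents containing the term
        final long frequency;        // total occurrences in the collection

        Stats(int documentFrequency, long frequency) {
            this.documentFrequency = documentFrequency;
            this.frequency = frequency;
        }
    }

    /** Sums both counts over all members; an assumed merge rule. */
    static Stats merge(Stats[] members) {
        int df = 0;
        long tf = 0L;
        for (Stats s : members) {
            df += s.documentFrequency;
            tf += s.frequency;
        }
        return new Stats(df, tf);
    }

    public static void main(String[] args) {
        Stats merged = merge(new Stats[] { new Stats(10, 25L), new Stats(3, 4L) });
        System.out.println(merged.documentFrequency + " " + merged.frequency); // prints "13 29"
    }
}
```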


addSingleTermAlternatives

public void addSingleTermAlternatives(java.lang.String[] terms,
                                      java.lang.String stringForm,
                                      double weight,
                                      EntryStatistics[] entryStats,
                                      WeightingModel[] wmodels)
                               throws java.io.IOException
Adds a synonym group to the matching process. EntryStatistics for all terms in the group will be combined using mergeStatistics()

Parameters:
terms - String forms of the terms in the synonym group
weight - influence of this synonym group during retrieval
entryStats - statistics of the terms in the synonym group. If null, these will be obtained from the local Lexicon.
wmodels - WeightingModels for the synonym group (NOT one per member).
Throws:
java.io.IOException

addSingleTermAlternatives

public void addSingleTermAlternatives(java.lang.String[] terms,
                                      java.lang.String stringForm,
                                      double weight,
                                      EntryStatistics entryStats,
                                      WeightingModel[] wmodels)
                               throws java.io.IOException
Adds a synonym group to the matching process.

Parameters:
terms - String forms of the terms in the synonym group
weight - influence of this synonym group during retrieval
entryStats - statistics of the whole synonym group. If null, statistics for all terms in the group will be obtained from the local Lexicon and combined using mergeStatistics().
wmodels - WeightingModels for the synonym group (NOT one per member).
Throws:
java.io.IOException

prepare

public void prepare(boolean firstMove)
             throws java.io.IOException
Counts the number of active terms. If firstMove is true, each posting list is also advanced to its first posting.

Parameters:
firstMove - move all postings to the start?
Throws:
java.io.IOException

getStatistics

public EntryStatistics getStatistics(int i)
Returns the EntryStatistics corresponding to the specified term

Parameters:
i - term to obtain statistics for
Returns:
Statistics for this i-1th term

getPosting

public IterablePosting getPosting(int i)
Returns the IterablePosting corresponding to the specified term

Parameters:
i - term to obtain the posting list for
Returns:
Posting list for the term at index i

size

public int size()
Returns the number of posting lists for this query


getNumTerms

public int getNumTerms()
Returns the number of posting lists (that are terms) for this query


score

public double score(int i)
Returns the score using all weighting models for the current posting of the specified term

Parameters:
i - Which term to score
Returns:
score obtained from all weighting models for that term
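
How the individual model scores are combined is not spelled out here. The sketch below assumes they are summed, using a hypothetical Model interface in place of WeightingModel and a bare term frequency in place of a Posting.

```java
/** Stand-in sketch: combine several weighting models for one posting by summation. */
final class MultiModelScoreSketch {

    /** Hypothetical substitute for WeightingModel: maps a term frequency to a score. */
    interface Model {
        double score(int termFrequency);
    }

    /** Adds up the score each model assigns to the same posting (assumed rule). */
    static double score(Model[] models, int termFrequency) {
        double total = 0.0;
        for (Model m : models) {
            total += m.score(termFrequency);
        }
        return total;
    }

    public static void main(String[] args) {
        Model[] models = { tf -> tf * 0.5, tf -> 1.0 };
        System.out.println(score(models, 4)); // prints "3.0"
    }
}
```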

close

public void close()
           throws java.io.IOException
Specified by:
close in interface java.io.Closeable
Throws:
java.io.IOException

getTerm

public java.lang.String getTerm(int i)


Terrier 3.5. Copyright © 2004-2011 University of Glasgow