Terrier IR Platform
2.2.1

uk.ac.gla.terrier.structures
Class CollectionStatistics

java.lang.Object
  extended by uk.ac.gla.terrier.structures.CollectionStatistics

public class CollectionStatistics
extends java.lang.Object

This class provides basic statistics for the indexed collection of documents, such as the average length of documents, or the total number of documents in the collection.
After indexing, the statistics are saved in the PREFIX.log file, along with the names of the classes that should be used for the Lexicon, the DocumentIndex, the DirectIndex and the InvertedIndex. An index therefore records how it was built and how it should be opened again.

Version:
$Revision: 1.32 $
Author:
Gianni Amati, Vassilis Plachouras, Craig Macdonald

Constructor Summary
CollectionStatistics(int numDocs, int numTerms, long numTokens, long numPointers)
           
 
Method Summary
 double getAverageDocumentLength()
          Returns the documents' average length.
 int getNumberOfDocuments()
          Returns the total number of documents in the collection.
 long getNumberOfPointers()
          Returns the total number of pointers in the collection.
 long getNumberOfTokens()
          Returns the total number of tokens in the collection.
 int getNumberOfUniqueTerms()
          Returns the total number of unique terms in the lexicon.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CollectionStatistics

public CollectionStatistics(int numDocs,
                            int numTerms,
                            long numTokens,
                            long numPointers)

Parameters:
numDocs - the total number of documents in the collection
numTerms - the total number of unique terms in the lexicon
numTokens - the total number of tokens in the collection
numPointers - the total number of pointers in the collection

Method Detail

getAverageDocumentLength

public double getAverageDocumentLength()
Returns the documents' average length.

Returns:
the average length of the documents in the collection.
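
As an illustration of how the average length relates to the constructor arguments, the following is a minimal sketch. It assumes that the average is the total number of tokens divided by the number of documents, and that an empty collection yields 0. The class CollectionStatisticsSketch is a hypothetical stand-in that only mirrors the documented constructor and two of the getters so that the example runs without Terrier on the classpath; it is not the Terrier source.

```java
// Hypothetical stand-in for uk.ac.gla.terrier.structures.CollectionStatistics.
// It mirrors the documented constructor signature only; the real class may differ.
public class CollectionStatisticsSketch {
    private final int numDocs;
    private final int numTerms;
    private final long numTokens;
    private final long numPointers;

    public CollectionStatisticsSketch(int numDocs, int numTerms,
                                      long numTokens, long numPointers) {
        this.numDocs = numDocs;
        this.numTerms = numTerms;
        this.numTokens = numTokens;
        this.numPointers = numPointers;
    }

    // Assumption: average length = tokens / documents, and 0 for an empty collection.
    public double getAverageDocumentLength() {
        if (numDocs == 0) {
            return 0.0d;
        }
        return (double) numTokens / (double) numDocs;
    }

    public int getNumberOfDocuments() {
        return numDocs;
    }

    public static void main(String[] args) {
        // 4 documents, 10 unique terms, 50 tokens, 20 pointers (made-up values).
        CollectionStatisticsSketch stats =
                new CollectionStatisticsSketch(4, 10, 50L, 20L);
        System.out.println(stats.getAverageDocumentLength()); // prints 12.5
    }
}
```

With the real class, the same values would be passed to new CollectionStatistics(numDocs, numTerms, numTokens, numPointers) and read back through the getters listed in the method summary.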

getNumberOfDocuments

public int getNumberOfDocuments()
Returns the total number of documents in the collection.

Returns:
the total number of documents in the collection

getNumberOfPointers

public long getNumberOfPointers()
Returns the total number of pointers in the collection.

Returns:
the total number of pointers in the collection

getNumberOfTokens

public long getNumberOfTokens()
Returns the total number of tokens in the collection.

Returns:
the total number of tokens in the collection

getNumberOfUniqueTerms

public int getNumberOfUniqueTerms()
Returns the total number of unique terms in the lexicon.

Returns:
the total number of unique terms in the lexicon
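
To make the difference between the four counts concrete, here is a small sketch that derives them from a toy collection held in memory. It assumes that a token is one term occurrence, that a pointer is one distinct (term, document) pair, and that the unique terms are the distinct terms across all documents. The class ToyCollectionCounts and its count method are hypothetical and do not use any Terrier API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Terrier: counts the four statistics
// for a collection given as already-tokenised documents.
public class ToyCollectionCounts {

    // Returns {numDocs, numUniqueTerms, numTokens, numPointers}.
    public static long[] count(List<String[]> docs) {
        Set<String> vocabulary = new HashSet<>();
        long tokens = 0L;
        long pointers = 0L;
        for (String[] doc : docs) {
            Set<String> termsInDoc = new HashSet<>(Arrays.asList(doc));
            tokens += doc.length;          // every occurrence counts as a token
            pointers += termsInDoc.size(); // assumption: one pointer per (term, document) pair
            vocabulary.addAll(termsInDoc); // distinct terms over the whole collection
        }
        return new long[] { docs.size(), vocabulary.size(), tokens, pointers };
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
                new String[] { "to", "be", "or", "not", "to", "be" },
                new String[] { "to", "do", "is", "to", "be" });
        long[] c = count(docs);
        // prints: docs=2 terms=6 tokens=11 pointers=8
        System.out.println("docs=" + c[0] + " terms=" + c[1]
                + " tokens=" + c[2] + " pointers=" + c[3]);
    }
}
```

Under these assumptions the number of pointers can never exceed the number of tokens, and the two are equal only when no term repeats within a document.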


Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow