Terrier IR Platform 1.1.1 |
java.lang.Object
  uk.ac.gla.terrier.structures.CollectionStatistics
public class CollectionStatistics
This class provides basic statistics for the indexed
collection of documents, such as the average length of documents,
or the total number of documents in the collection.
After indexing, statistics are saved in the PREFIX.log file, along
with the classes that should be used for the Lexicon, the DocumentIndex,
the DirectIndex and the InvertedIndex. This means that an index knows
how it was built and how it should be opened again.
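The paragraph above describes saving the counts together with the index structure class names so that an index can be reopened later. A minimal sketch of that round trip follows. It is not the Terrier implementation: the class `StatsLogSketch`, its two-line file layout, and the sample class names are all assumptions made for illustration, and the real PREFIX.log format may differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: NOT the Terrier class and NOT its real file format.
class StatsLogSketch {
    final int docs;
    final long tokens;
    final int terms;
    final long pointers;
    final String[] classes; // names of the index structure classes

    StatsLogSketch(int docs, long tokens, int terms, long pointers, String[] classes) {
        this.docs = docs;
        this.tokens = tokens;
        this.terms = terms;
        this.pointers = pointers;
        this.classes = classes;
    }

    // Assumed layout: line 1 holds the four counts, line 2 the class names.
    void save(Path file) throws IOException {
        String counts = docs + " " + tokens + " " + terms + " " + pointers;
        String names = String.join(" ", classes);
        Files.write(file, Arrays.asList(counts, names), StandardCharsets.UTF_8);
    }

    // Reads back the two lines written by save().
    static StatsLogSketch load(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        String[] c = lines.get(0).trim().split("\\s+");
        String[] names = lines.get(1).trim().split("\\s+");
        return new StatsLogSketch(Integer.parseInt(c[0]), Long.parseLong(c[1]),
                Integer.parseInt(c[2]), Long.parseLong(c[3]), names);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("PREFIX", ".log");
        // Sample values and class names are made up for the demonstration.
        new StatsLogSketch(3, 120L, 45, 90L,
                new String[] {"LexiconA", "DocumentIndexB"}).save(file);
        StatsLogSketch reopened = load(file);
        System.out.println(reopened.docs + " documents, classes "
                + Arrays.toString(reopened.classes));
        Files.delete(file);
    }
}
```

Storing the class names next to the counts is what lets a reader of the file pick the right structure implementations without outside configuration.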
Constructor Summary | |
---|---|
CollectionStatistics() | |
CollectionStatistics(int numDocs, int numTerms, long numTokens, long numPointers) | |
CollectionStatistics(java.lang.String filename) | |
CollectionStatistics(java.lang.String Path, java.lang.String Prefix) | |
Method Summary | |
---|---|
static void | createCollectionStatistics(int docs, long tokens, int terms, long pointers, java.lang.String[] classes) |
static void | createCollectionStatistics(java.lang.String filename, int docs, long tokens, int terms, long pointers, java.lang.String[] classes): Given the collection statistics, it stores them in a file with a standard name. |
static void | createCollectionStatistics(java.lang.String Path, java.lang.String Prefix, int docs, long tokens, int terms, long pointers, java.lang.String[] classes) |
double | getAverageDocumentLength(): Returns the documents' average length. |
java.lang.String[] | getClasses(): Returns the classes line given in the log file. |
int | getNumberOfDocuments(): Returns the total number of documents in the collection. |
long | getNumberOfPointers(): Returns the total number of pointers in the collection. |
long | getNumberOfTokens(): Returns the total number of tokens in the collection. |
int | getNumberOfUniqueTerms(): Returns the total number of unique terms in the lexicon. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public CollectionStatistics(int numDocs, int numTerms, long numTokens, long numPointers)
public CollectionStatistics() throws java.io.IOException
Throws:
java.io.IOException
public CollectionStatistics(java.lang.String Path, java.lang.String Prefix) throws java.io.IOException
Throws:
java.io.IOException
public CollectionStatistics(java.lang.String filename) throws java.io.IOException
Throws:
java.io.IOException
Method Detail |
---|
public static void createCollectionStatistics(java.lang.String Path, java.lang.String Prefix, int docs, long tokens, int terms, long pointers, java.lang.String[] classes)
public static void createCollectionStatistics(int docs, long tokens, int terms, long pointers, java.lang.String[] classes)
public static void createCollectionStatistics(java.lang.String filename, int docs, long tokens, int terms, long pointers, java.lang.String[] classes)
Parameters:
docs - The number of documents in the collection
tokens - The number of tokens in the collection
terms - The number of terms in the collection
pointers - The number of pointers in the collection
public double getAverageDocumentLength()
public int getNumberOfDocuments()
public long getNumberOfPointers()
public long getNumberOfTokens()
public int getNumberOfUniqueTerms()
public java.lang.String[] getClasses()
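The accessors above expose raw counts plus one derived figure, the average document length. The sketch below shows how such a figure could be derived from the token and document counts. It assumes the average is total tokens divided by total documents, with 0.0 for an empty collection, which this page does not state. `AvgDocLengthSketch` is a hypothetical helper, not part of Terrier.

```java
// Hypothetical helper; not the Terrier class.
class AvgDocLengthSketch {

    // Assumed definition: total tokens / total documents,
    // returning 0.0 when the collection has no documents.
    static double averageDocumentLength(long numTokens, int numDocs) {
        if (numDocs == 0) {
            return 0.0;
        }
        return (double) numTokens / numDocs;
    }

    public static void main(String[] args) {
        // Sample counts are made up for the demonstration.
        long numTokens = 1000L;
        int numDocs = 4;
        System.out.println(averageDocumentLength(numTokens, numDocs));
    }
}
```

The cast to `double` before dividing avoids integer truncation, and the zero check avoids a division by zero on an empty index.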