public class TrecTerrier extends Object
TrecTerrier, indexing TREC collections with Terrier.
usage: java TrecTerrier [flags in any order]
-h --help print this message
-V --version print version information
-i --index index a collection
-id --inverted2direct generate a direct index for an existing index
-r --retrieve retrieve from an indexed collection
-e --evaluate evaluates the results in the directory
var/results using the qrels file specified
in the file etc/trec.qrels
If invoked with '-i', then both the direct and
inverted files are built, unless it is specified which
of the structures to build.
-d --direct creates the direct file
-v --inverted creates the inverted file, from an already existing direct file
If invoked with '-r', there are the following options.
-c value parameter value for term frequency normalisation.
If it is not specified, then the default value for each
weighting model is used, e.g. PL2 => c=1, BM25 => b=0.75
-q --queryexpand applies query expansion
--docids will only print docids in the .res file
If invoked with '-e', there is the following option.
-p --perquery reports the average precision for each query separately.
filename.res restrict evaluation to filename.res only.
If invoked with one of the following options, then the contents of the
corresponding data structure are shown in the standard output.
--printdocid prints the contents of the document index
--printlexicon prints the contents of the lexicon
--printinverted prints the contents of the inverted file
--printdirect prints the contents of the direct file
--printstats prints statistics about the indexed collection
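The usage rules above (flags in any order, '-c' needing a value, and '-c' and '-q' only being valid together with '-r') can be illustrated with a small parser. This is a minimal sketch and not the TrecTerrier source: the class `FlagSketch`, its `Status` enum and its `parse` method are hypothetical names, it handles only a few of the documented flags, and the enum constants merely echo the names of the ERROR_* fields listed below without claiming their numeric values.

```java
// Hypothetical sketch of order-independent flag parsing; NOT the TrecTerrier implementation.
final class FlagSketch {
    // Assumed stand-ins for the documented status constants (real values are not known here).
    enum Status { OK, NO_ARGUMENTS, NO_C_VALUE, GIVEN_C_NOT_RETRIEVING, EXPAND_NOT_RETRIEVE, UNKNOWN_OPTION }

    boolean indexing;
    boolean retrieving;
    boolean queryexpand;
    boolean cSpecified;
    double c;
    String unknownOption;

    Status parse(String[] args) {
        if (args.length == 0) {
            return Status.NO_ARGUMENTS;
        }
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            switch (arg) {
                case "-i": case "--index":
                    indexing = true;
                    break;
                case "-r": case "--retrieve":
                    retrieving = true;
                    break;
                case "-q": case "--queryexpand":
                    queryexpand = true;
                    break;
                case "-c":
                    // '-c' must be followed by a numeric value.
                    if (i + 1 >= args.length) {
                        return Status.NO_C_VALUE;
                    }
                    try {
                        c = Double.parseDouble(args[++i]);
                    } catch (NumberFormatException e) {
                        return Status.NO_C_VALUE;
                    }
                    cSpecified = true;
                    break;
                default:
                    unknownOption = arg;
                    return Status.UNKNOWN_OPTION;
            }
        }
        // Cross-flag checks run after the loop, so the order of flags does not matter.
        if (cSpecified && !retrieving) {
            return Status.GIVEN_C_NOT_RETRIEVING;
        }
        if (queryexpand && !retrieving) {
            return Status.EXPAND_NOT_RETRIEVE;
        }
        return Status.OK;
    }
}
```

Deferring the consistency checks until every argument has been read is one way to honour "flags in any order": in this sketch `-c 1.0 -r` and `-r -c 1.0` are both accepted, while `-i -c 2` is rejected.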
Modifier and Type | Field and Description |
---|---|
protected static int |
ARGUMENTS_OK |
protected double |
c
The value of the term frequency
normalisation parameter.
|
protected boolean |
direct
Specifies whether to build the direct file only.
|
protected boolean |
docids |
protected static int |
ERROR_CONFLICTING_ARGUMENTS |
protected static int |
ERROR_DIRECT_FILE_EXISTS |
protected static int |
ERROR_DIRECT_FILE_NOT_EXISTS |
protected static int |
ERROR_DIRECT_NOT_INDEXING |
protected static int |
ERROR_EXPAND_NOT_RETRIEVE |
protected static int |
ERROR_GIVEN_C_NOT_RETRIEVING |
protected static int |
ERROR_HADOOP_NOT_RETRIEVAL |
protected static int |
ERROR_HADOOP_ONLY_INDEX |
protected static int |
ERROR_INVERTED_NOT_INDEXING |
protected static int |
ERROR_LANGUAGEMODEL_NOT_RETRIEVE |
protected static int |
ERROR_NO_ARGUMENTS |
protected static int |
ERROR_NO_C_VALUE |
protected static int |
ERROR_PRINT_DIRECT_FILE_NOT_EXISTS |
protected static int |
ERROR_PRINT_DOCINDEX_FILE_NOT_EXISTS |
protected static int |
ERROR_PRINT_INVERTED_FILE_NOT_EXISTS |
protected static int |
ERROR_PRINT_LEXICON_FILE_NOT_EXISTS |
protected static int |
ERROR_PRINT_STATS_FILE_NOT_EXISTS |
protected static int |
ERROR_UNKNOWN_OPTION |
protected boolean |
evaluation
Specifies whether to perform trec_eval like evaluation.
|
protected boolean |
evaluation_per_query
Specifies whether to perform trec_eval like evaluation,
reporting only average precision for each query.
|
protected String |
evaluation_type
Specifies whether the evaluation is done for the adhoc or the
named-page finding retrieval task.
|
protected String |
evaluationFilename
The file to evaluate, if any
|
protected boolean |
hadoop
Specifies whether to use Hadoop indexing
|
protected boolean |
indexing
Specifies whether to index a collection
|
protected boolean |
inverted
Specifies whether to build the inverted file
from an already created direct file.
|
protected boolean |
inverted2direct |
protected boolean |
isParameterValueSpecified
Indicates whether there is a specified
value for the term frequency normalisation
parameter.
|
protected boolean |
printdirect
Specifies whether to print the direct file
|
protected boolean |
printdocid
Specifies whether to print the document index
|
protected boolean |
printHelp
Specifies whether a help message is printed
|
protected boolean |
printinverted
Specifies whether to print the inverted file
|
protected boolean |
printlexicon
Specifies whether to print the lexicon
|
protected boolean |
printmeta
Specifies whether to print the meta index
|
protected boolean |
printstats
Specifies whether to print the statistics file
|
protected boolean |
printVersion
Specifies whether a version message is printed
|
protected boolean |
queryexpand
Specifies whether to apply query expansion
|
protected boolean |
retrieving
Specifies whether to retrieve from an indexed collection
|
protected boolean |
singlePass
Specifies whether to build the inverted file
from scratch, using the single-pass method
|
protected String |
unknownOption
The unknown option
|
Constructor and Description |
---|
TrecTerrier() |
Modifier and Type | Method and Description |
---|---|
void |
applyOptions(int status)
Applies the options resulting from processing the command line arguments
|
static void |
main(String[] args)
The main method that starts the application
|
protected int |
processOptions(String[] args)
Processes the command line arguments and
sets the corresponding properties accordingly.
|
void |
run()
Calls the required classes from Terrier.
|
protected void |
usage()
Prints a help message that explains the
possible options.
|
protected void |
version()
Prints the version information about Terrier
|
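The signatures above suggest a simple control flow: `processOptions` turns the arguments into an int status, `applyOptions` acts on that status, and `run` does the work. The sketch below shows that flow under stated assumptions. `DriverSketch` is a hypothetical stand-in and not the real class, the constant values 0 and 1 are invented for illustration, and the way `applyOptions` reacts to an error status is a guess rather than documented behaviour.

```java
// Hypothetical stand-in mirroring the documented method names; NOT the TrecTerrier source.
class DriverSketch {
    static final int ARGUMENTS_OK = 0;        // assumed value
    static final int ERROR_NO_ARGUMENTS = 1;  // assumed value

    // Collects what would otherwise be printed, so the flow is easy to inspect.
    final StringBuilder log = new StringBuilder();
    boolean printHelp;

    // Reads the arguments and sets fields; returns a status code.
    // Unknown options are ignored here to keep the sketch short.
    int processOptions(String[] args) {
        if (args.length == 0) {
            return ERROR_NO_ARGUMENTS;
        }
        for (String arg : args) {
            if (arg.equals("-h") || arg.equals("--help")) {
                printHelp = true;
            }
        }
        return ARGUMENTS_OK;
    }

    // Acts on the status: run on success, otherwise report and show the help text (assumed behaviour).
    void applyOptions(int status) {
        if (status == ARGUMENTS_OK) {
            run();
        } else {
            log.append("error ").append(status).append('\n');
            usage();
        }
    }

    void run() {
        if (printHelp) {
            usage();
        } else {
            log.append("run\n");
        }
    }

    void usage() {
        log.append("usage\n");
    }

    public static void main(String[] args) {
        DriverSketch driver = new DriverSketch();
        driver.applyOptions(driver.processOptions(args));
        System.out.print(driver.log);
    }
}
```

Keeping parsing (`processOptions`) separate from acting (`applyOptions`) means the status code is the only thing passed between the two steps, which is consistent with the `int` return type and `int status` parameter documented above.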
protected String unknownOption
protected String evaluationFilename
protected boolean queryexpand
protected boolean printHelp
protected boolean printVersion
protected boolean indexing
protected boolean singlePass
protected boolean hadoop
protected boolean retrieving
protected boolean printdocid
protected boolean printlexicon
protected boolean printinverted
protected boolean printdirect
protected boolean printstats
protected boolean printmeta
protected boolean evaluation_per_query
protected String evaluation_type
protected boolean inverted
protected boolean direct
protected double c
protected boolean evaluation
protected boolean isParameterValueSpecified
protected boolean inverted2direct
protected boolean docids
protected static final int ARGUMENTS_OK
protected static final int ERROR_NO_ARGUMENTS
protected static final int ERROR_NO_C_VALUE
protected static final int ERROR_CONFLICTING_ARGUMENTS
protected static final int ERROR_DIRECT_FILE_EXISTS
protected static final int ERROR_DIRECT_FILE_NOT_EXISTS
protected static final int ERROR_PRINT_DOCINDEX_FILE_NOT_EXISTS
protected static final int ERROR_PRINT_LEXICON_FILE_NOT_EXISTS
protected static final int ERROR_PRINT_INVERTED_FILE_NOT_EXISTS
protected static final int ERROR_PRINT_STATS_FILE_NOT_EXISTS
protected static final int ERROR_PRINT_DIRECT_FILE_NOT_EXISTS
protected static final int ERROR_UNKNOWN_OPTION
protected static final int ERROR_DIRECT_NOT_INDEXING
protected static final int ERROR_INVERTED_NOT_INDEXING
protected static final int ERROR_EXPAND_NOT_RETRIEVE
protected static final int ERROR_GIVEN_C_NOT_RETRIEVING
protected static final int ERROR_LANGUAGEMODEL_NOT_RETRIEVE
protected static final int ERROR_HADOOP_NOT_RETRIEVAL
protected static final int ERROR_HADOOP_ONLY_INDEX
protected void version()
protected void usage()
public static void main(String[] args)
args
- the command line arguments
protected int processOptions(String[] args)
args
- the command line arguments.
Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow