Terrier allows the user to configure many different aspects of the framework, in order to be adaptable to the specific needs of different applications. Here, we describe the properties that are used while indexing or retrieving. A sample of how to set up the basic properties can be found in etc/terrier.properties.sample. This page contains many of the properties in Terrier, broken down by category: General, Indexing, Retrieval, Desktop Search and Miscellaneous.
Property | terrier.setup |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | Absolute directory path |
Default value | not specified |
Configures | Specifies where Terrier finds the terrier.properties file, which is usually in the etc/ directory. Analogous to the terrier.etc property. |
Property | terrier.home |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | Absolute directory path |
Default value | not specified |
Configures | ApplicationSetup.TERRIER_HOME. Where Terrier is installed. |
Property | terrier.etc |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | Absolute directory path |
Default value | TERRIER_HOME + "etc/" |
Configures | TERRIER_ETC. Where Terrier finds its terrier.properties file if -Dterrier.setup is not specified. |
Property | terrier.share |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.terms.Stopwords |
Possible values | Absolute directory path |
Default value | TERRIER_HOME + "share/" |
Configures | ApplicationSetup.TERRIER_SHARE. Where static distribution files are found, for instance the stopword files. |
Property | terrier.var |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.applications.desktop.filehandling.WindowsFileOpener, org.terrier.structures.Index |
Possible values | Absolute directory path |
Default value | TERRIER_HOME + "var/" |
Configures | TERRIER_VAR. Where Terrier puts files that it creates, e.g. indices and results files. |
Property | terrier.plugins |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | A comma-separated list of plugins. |
Default value | not specified |
Configures | The list of plugins to be preloaded. |
Property | log4j.config |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | A valid log4j configuration file |
Default value | terrier-log.xml |
Configures | ApplicationSetup.LOG4J_CONFIG. The configuration file used by log4j. |
Property | terrier.index.path |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.SimpleFileCollection, org.terrier.indexing.TRECCollection |
Possible values | Full path of a directory |
Default value | TERRIER_VAR + "index/" |
Configures | TERRIER_INDEX_PATH. The name of the directory in which the data structures created by Terrier are stored |
Property | terrier.index.prefix |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.Indexer |
Possible values | Filename prefix for all the indices |
Default value | "data" |
Configures | TERRIER_INDEX_PREFIX. Filename prefix for all the indices. |
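As an illustration of the two index-location properties above, a terrier.properties fragment might look like the following sketch. The directory and the prefix "wt2g" are hypothetical values chosen for this example, not defaults.

```properties
# Hypothetical directory in which the index data structures are stored
terrier.index.path=/data/terrier/index/
# Hypothetical filename prefix, used instead of the default "data"
terrier.index.prefix=wt2g
```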
Property | stopwords.filename |
Used in | org.terrier.terms.Stopwords |
Possible values | absolute path to file |
Default value | TERRIER_SHARE + "stopword-list.txt" |
Configures | The name of the file which contains a list of stopwords. |
Property | collection.spec |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.SimpleFileCollection, org.terrier.indexing.TRECCollection |
Possible values | Absolute filename |
Default value | TERRIER_ETC + value of "collection.spec" |
Configures | COLLECTION_SPEC. Where the indexing process should find its configuration for the Collection object. This is often a list of files or directories. |
Property | ignore.empty.documents |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.Indexer |
Possible values | true, false |
Default value | false |
Configures | IGNORE_EMPTY_DOCUMENTS. Whether empty documents have an entry in the document index. |
Property | ???.process |
Used in | org.terrier.utility.TagSet |
Possible values | Comma delimited list of tags to process |
Default value | not specified |
Configures | For many of the tokenisers, configures which tags should be processed. ??? can be TrecDocTags or TrecQueryTags, to configure the TREC Collection and Query parsers respectively. Setting ??? to FieldTags specifies the fields that should be stored in the index. |
Property | ???.skip |
Used in | org.terrier.utility.TagSet |
Possible values | Comma delimited list of tags to not process |
Default value | not specified |
Configures | For many of the tokenisers, configures which tags should be skipped completely. ??? can be TrecDocTags or TrecQueryTags, to configure the TREC Collection and Query parsers respectively. |
Property | ???.doctag |
Used in | org.terrier.utility.TagSet |
Possible values | Name of tag that marks the start of the document (trec only) |
Default value | not specified |
Configures | For some of the tokenisers, configures which tag marks the start of a document (or of a query). ??? can be TrecDocTags or TrecQueryTags, to configure the TREC Collection and Query parsers respectively. |
Property | ???.idtag |
Used in | org.terrier.utility.TagSet |
Possible values | Name of tag that contains the unique identifier (trec only) |
Default value | not specified |
Configures | For some of the tokenisers, configures which tag contains the document ID (or query ID). ??? can be TrecDocTags or TrecQueryTags, to configure the TREC Collection and Query parsers respectively. |
Property | ???.casesensitive |
Used in | org.terrier.utility.TagSet, org.terrier.indexing.TRECCollection |
Possible values | true or false |
Default value | true for TrecDocTags, false otherwise |
Configures | For some of the tokenisers, configures whether tag matching is case-sensitive. The default is true for TRECCollection (TrecDocTags), and false for FieldTags and TrecQueryTags (that is, for TRECFullTokenizer, which is used by the TREC query parser, TRECQuery). |
Property | ???.propertytags |
Used in | org.terrier.utility.TagSet, org.terrier.indexing.TRECCollection |
Possible values | Comma delimited list of tags to add as document properties |
Default value | not specified |
Configures | During indexing this enables document tags to be saved as document properties instead of being indexed. This is useful to store document properties in the meta index for use later, e.g. for display by the Terrier Web-based interface. |
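The ??? placeholder in the tag properties above is replaced by the name of the tag set being configured. The fragment below is a sketch for a TREC-style collection and topics file. The tag names (DOC, DOCNO, DOCHDR, TOP, NUM, TITLE, DESC, NARR) are assumptions based on the tags commonly found in TREC test collections, so adjust them to match your own data.

```properties
# Document tags: each document starts at <DOC>, its identifier is in <DOCNO>,
# and the contents of <DOCHDR> are skipped entirely
TrecDocTags.doctag=DOC
TrecDocTags.idtag=DOCNO
TrecDocTags.skip=DOCHDR
# Match tags regardless of case (the default for TrecDocTags is true)
TrecDocTags.casesensitive=false

# Query tags: each topic starts at <TOP>, its identifier is in <NUM>,
# only the title is used as query text
TrecQueryTags.doctag=TOP
TrecQueryTags.idtag=NUM
TrecQueryTags.process=TOP,NUM,TITLE
TrecQueryTags.skip=DESC,NARR
```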
Property | block.indexing |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.applications.TRECIndexing |
Possible values | true, false |
Default value | false |
Configures | ApplicationSetup.BLOCK_INDEXING. Sets whether block positions should be saved during indexing. This is required to do phrasal searches. Client code should examine this to determine whether to use the BasicIndexer or the BlockIndexer. |
Property | blocks.size |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.BlockIndexer |
Possible values | integer > 0 |
Default value | 1 |
Configures | ApplicationSetup.BLOCK_SIZE. The number of terms contained in the same block |
Property | blocks.max |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.BlockIndexer |
Possible values | integer >= 0 |
Default value | 100000 |
Configures | MAX_BLOCKS. The maximum number of blocks a document may contain. |
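To make the three block properties above concrete, here is a minimal sketch that turns on block indexing so that phrasal searches become possible. Apart from block.indexing, the values shown are simply the documented defaults written out explicitly.

```properties
# Record block (position) information during indexing; needed for phrasal search
block.indexing=true
# One term per block, i.e. each block is a single term position
blocks.size=1
# Upper bound on the number of blocks recorded for a single document
blocks.max=100000
```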
Property | lowercase |
Used in | org.terrier.indexing.HTMLDocument, org.terrier.indexing.TRECDocument, org.terrier.indexing.TRECFullTokenizer |
Possible values | true, false |
Default value | true |
Configures | Whether text is converted to lowercase before parsing |
Property | tokeniser |
Used in | org.terrier.indexing.tokenisation.Tokeniser |
Possible values | a classname implementing the Tokeniser interface |
Default value | EnglishTokeniser |
Configures | The Tokeniser implementation to be used when splitting text into tokens. This allows for corpora in different languages to be indexed by setting a Tokeniser implementation appropriate for each language. |
Property | indexing.max.tokens |
Used in | org.terrier.indexing.Indexer |
Possible values | integer >=0 |
Default value | 0 |
Configures | Sets a limit to the maximum number of tokens indexed for a document. The default value 0 means that there is no limit. |
Property | indexing.max.docs.per.builder |
Used in | org.terrier.indexing.Indexer |
Possible values | integer >=0 |
Default value | 18,000,000 |
Configures | Sets a limit to the maximum number of documents in one index during indexing. After this point, a new index will be created, and at the end, all the indices will be merged. Reasoning: During classical two-pass indexing, memory is constrained by the TermCodes table. If too many unique terms are indexed, then an OutOfMemoryError will occur. For the TREC GOV2 collection, 18 million documents is a good point to start a new index. The special value 0 means that there is no limit. This property also applies to single-pass indexing, although it can be safely set higher. It does not apply to MapReduce indexing. |
Property | termpipelines |
Used in | org.terrier.querying.Manager, org.terrier.indexing.Indexer |
Possible values | Comma delimited list of term pipeline entities to pass query terms through. Leave blank to use no term pipeline objects. |
Default value | Stopwords,PorterStemmer |
Configures | Defines which term pipeline entities to pass query terms through. |
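As a short sketch of the termpipelines property above, the fragment below shows the documented default alongside a commented-out alternative that leaves the value blank so that no term pipeline objects are applied.

```properties
# Default: remove stopwords, then apply Porter stemming
termpipelines=Stopwords,PorterStemmer

# Alternative: a blank value means terms pass through unchanged
#termpipelines=
```

Because the property is read by both org.terrier.indexing.Indexer and org.terrier.querying.Manager, it is usually sensible to keep the same value at indexing and retrieval time, so that query terms are normalised in the same way as the indexed terms.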
Property | invertedfile.processpointers |
Used in | org.terrier.structures.indexing.InvertedIndexBuilder, org.terrier.structures.indexing.BlockInvertedIndexBuilder |
Possible values | Integer value > 0 |
Default value | 20000000 |
Configures | Defines the number of pointers that should be processed at once when building the inverted index. The InvertedIndexBuilder first works out how many terms correspond to that many pointers, then scans the direct index looking for each of these terms, then writes them to the inverted index, and then repeats the scan for the next batch of terms. Increasing this speeds up inverted index building for large collections, but uses more memory. Decrease this if you encounter an OutOfMemoryError while building the inverted index. Note that for block indexing, the default is lower: 2,000,000 pointers. This option supersedes invertedfile.processterms. For the invertedfile.processterms strategy to be used, set invertedfile.processpointers to 0. |
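If inverted index building runs out of memory, the advice above amounts to lowering this one value. The figure below is a hypothetical choice (half of the documented default of 20000000) and is only meant to show the shape of the setting; a suitable value depends on the available heap.

```properties
# Hypothetical lower value: fewer pointers per scan of the direct index,
# so less memory is needed, at the cost of more scans
invertedfile.processpointers=10000000
```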
Property | lexicon.builder.merge.lex.max |
Used in | org.terrier.structures.indexing.LexiconBuilder, org.terrier.structures.indexing.BlockLexiconBuilder |
Possible values | integer values > 1 |
Default value | 16 |
Configures | The number of temporary lexicons to merge at once during lexicon building. Bigger is generally faster, but too many open file-handles causes slowness. 16 is a good trade-off. (See also the MERGE_FACTOR in GNU sort source code). |
Property | indexing.excel.maxfilesize.mb |
Used in | org.terrier.indexing.MSExcelDocument |
Possible values | size of a file in megabytes |
Default value | 0.5 |
Configures | The maximum file size of an Excel spreadsheet to be parsed. |
Property | indexing.simplefilecollection.extensionsparsers |
Used in | org.terrier.indexing.SimpleFileCollection |
Possible values | comma delimited list of file extensions and associated parsers to use for the corresponding files. |
Default value | txt:FileDocument,text:FileDocument,tex:FileDocument,bib:FileDocument, pdf:PDFDocument,html:HTMLDocument,htm:HTMLDocument,xhtml:HTMLDocument, xml:HTMLDocument,doc:MSWordDocument,ppt:MSPowerpointDocument,xls:MSExcelDocument |
Configures | The parsers to be used for processing files with the specified extensions. |
Property | indexing.simplefilecollection.defaultparser |
Used in | org.terrier.indexing.SimpleFileCollection |
Possible values | fully qualified class name |
Default value | not specified |
Configures | The parser to use by default for processing files with unknown extensions |
Property | trec.blacklist.docids |
Used in | org.terrier.indexing.TRECCollection |
Possible values | full path to filename |
Default value | not specified |
Configures | The name of a file that contains a black list of document identifiers to be ignored during indexing |
Property | trec.collection.class |
Used in | org.terrier.applications.TRECIndexing |
Possible values | a classname implementing Collection interface |
Default value | TRECCollection |
Configures | The Collection object to be used to parse the collection. This allows test collections that are similar but not identical to TREC to be parsed using Terrier's TREC tools. New in Terrier 1.1.0 is the ability to chain Collections. The Collection specified last is the inner-most one of the chain, and the first is the outer-most (i.e. instantiation proceeds right-to-left). The first collection to be instantiated (the inner-most) should have a default constructor (no arguments), while the other collections should take the inner collection as an argument in their constructor. E.g. trec.collection.class=RemoveSmallDocsCollection,TRECCollection. Instantiation is handled by the CollectionFactory class. |
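The chaining described above can be written out as follows. This sketch reuses the class names from the example in the description; the comments spell out the instantiation order that the right-to-left rule implies.

```properties
# Single collection (the default)
#trec.collection.class=TRECCollection

# Chained collections: TRECCollection is listed last, so it is the inner-most
# and is instantiated first; RemoveSmallDocsCollection then wraps it
trec.collection.class=RemoveSmallDocsCollection,TRECCollection
```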
Property | indexer.meta.forward.keys |
Used in | CompressingMetaIndexBuilder |
Possible values | comma delimited list of properties of a Document object that should be used as metadata. |
Default value | docno |
Configures | The document properties that should be recorded as document metadata. |
Property | indexer.meta.forward.keylens |
Used in | CompressingMetaIndexBuilder |
Possible values | comma delimited list of the lengths of the values corresponding to the keys to be used as document metadata. |
Default value | 20 |
Configures | How long values can be in the MetaIndex. |
Property | indexer.meta.reverse.keys |
Used in | CompressingMetaIndexBuilder |
Possible values | comma delimited list of the keys that can be used to uniquely identify documents. |
Default value | 20 |
Configures | The MetaIndex keys that can uniquely identify a document. E.g. docno,url. |
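The three meta index properties above are meant to be set together, with one length per forward key. The following is a sketch that assumes documents expose a "url" property in addition to "docno"; the key name url and the length 140 are illustrative assumptions rather than defaults.

```properties
# Store the docno and (hypothetically) the url of every document
indexer.meta.forward.keys=docno,url
# Maximum value length for each key, in the same order as the keys above
indexer.meta.forward.keylens=20,140
# Allow documents to be looked up by their docno
indexer.meta.reverse.keys=docno
```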
Property | max.term.length |
Used in | org.terrier.utility.ApplicationSetup, org.terrier.indexing.FileDocument, org.terrier.indexing.HTMLDocument, org.terrier.indexing.TRECDocument, org.terrier.indexing.TRECFullTokenizer, org.terrier.structures.Lexicon, org.terrier.structures.BlockLexicon, org.terrier.structures.BlockLexiconInputStream, org.terrier.structures.BlockLexiconOutputStream, org.terrier.structures.LexiconInputStream, org.terrier.structures.LexiconOutputStream |
Possible values | Integer value > 0 |
Default value | 20 |
Configures | MAX_TERM_LENGTH. The size in the lexicon reserved for a string, i.e. the max length of any term in the index. |
Property | memory.reserved |
Used in | org.terrier.indexing.BasicSinglePassIndexer |
Possible values | integer > 0, probably around 50 million |
Default value | 50000000 |
Configures | Free memory threshold that forces a run to be committed to disk in the single-pass indexer. Higher values mean less chance of an OutOfMemoryError occurring, but slower indexing speed as more runs will be generated. |
Property | memory.heap.usage |
Used in | org.terrier.indexing.BasicSinglePassIndexer |
Possible values | positive float, range 0.0f - 1.0f |
Default value | 0.70 |
Configures | Fraction of the maximum heap that may be allocated by the JVM before a run is committed. Smaller values mean more runs and hence slower indexing. Larger values mean more risk of OutOfMemoryError occurrences. |
Property | docs.check |
Used in | org.terrier.indexing.BasicSinglePassIndexer |
Possible values | positive integer > 0 |
Default value | 20 |
Configures | How often to check the amount of free memory. Lower values give more protection from OutOfMemoryError. |
Property | inverted2direct.processtokens |
Used in | org.terrier.structures.indexing.singlepass.Inverted2DirectIndexBuilder |
Possible values | positive long > 0 |
Default value | 100000000, 10000000 for blocks |
Configures | Total number of tokens to attempt to process in each iteration of building the direct index. Use a lower value if an OutOfMemoryError occurs. |
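The single-pass memory properties above all trade indexing speed against the risk of an OutOfMemoryError. The fragment below is a sketch of a more cautious configuration than the defaults; every value is a hypothetical choice that moves in the "safer but slower" direction described for each property, not a recommendation.

```properties
# Commit a run once free memory drops below this threshold (default 50000000)
memory.reserved=100000000
# Commit a run once this fraction of the maximum heap is in use (default 0.70)
memory.heap.usage=0.60
# Check memory more frequently than the default of 20
docs.check=10
# Process fewer tokens per iteration when building the direct index
inverted2direct.processtokens=50000000
```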
Property | terrier.index.retrievalLoadingProfile.default |
Used in | org.terrier.structures.Index |
Possible values | true, false |
Default value | true |
Configures | Index.RETRIEVAL_LOADING_PROFILE. Whether index structures should be preloaded for retrieval. |
Property | TaggedDocument.abstracts |
Used in | org.terrier.indexing.TaggedDocument |
Possible values | Comma delimited list of abstract names to save as document properties |
Default value | not specified |
Configures | The list of abstract names to save as document properties when indexing a TaggedDocument or one of its subclasses. |
Property | TaggedDocument.abstracts.tags |
Used in | org.terrier.indexing.TaggedDocument |
Possible values | Comma delimited list of tags from which to save abstracts |
Default value | not specified |
Configures | The names of tags to save text from. ELSE is a special tag name, which means anything not consumed by other tags. |
Property | TaggedDocument.abstracts.tags.casesensitive |
Used in | org.terrier.indexing.TaggedDocument |
Possible values | true or false |
Default value | false |
Configures | Configures if the tag matching is case-sensitive or not. |
Property | TaggedDocument.abstracts.lengths |
Used in | org.terrier.indexing.TaggedDocument |
Possible values | Comma delimited list of maximum lengths for each abstract |
Default value | Length 0 |
Configures | The max lengths of the abstracts. Defaults to empty. |
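The four TaggedDocument.abstracts properties above are parallel lists: the n-th abstract name takes its text from the n-th tag and is truncated to the n-th length. A sketch, assuming documents with a title tag, and using hypothetical abstract names and lengths:

```properties
# Two abstracts, hypothetically named "title" and "body"
TaggedDocument.abstracts=title,body
# "title" is taken from the title tag; "body" from everything else (ELSE)
TaggedDocument.abstracts.tags=title,ELSE
# Match tag names regardless of case (the default)
TaggedDocument.abstracts.tags.casesensitive=false
# Hypothetical maximum lengths for each abstract, in the same order
TaggedDocument.abstracts.lengths=120,500
```

For the abstracts to be retrievable later, their names would presumably also need to be listed in indexer.meta.forward.keys, with matching lengths in indexer.meta.forward.keylens.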
Property | FileDocument.abstract |
Used in | org.terrier.indexing.FileDocument |
Possible values | Name to call the abstract |
Default value | not specified |
Configures | The name of the abstract to save from the document. Note that only if this is set will an abstract be generated. Only a single abstract can be generated from a FileDocument. |
Property | FileDocument.abstract.length |
Used in | org.terrier.indexing.FileDocument |
Possible values | The maximum length for the abstract |
Default value | 0 |
Configures | The maximum length for the abstract. |
Property | ignore.low.idf.terms |
Used in | org.terrier.matching.Matching, org.terrier.matching.LMMatching |
Possible values | true, false |
Default value | true |
Configures | Ignores a term that has a low IDF, i.e. one that appears in many documents. You may wish to turn this off for small or focused collections. |
Property | interactive.output.format.length |
Used in | org.terrier.applications.InteractiveQuerying |
Possible values | integer number > 0 |
Default value | 1000 |
Configures | the maximum number of results to be displayed for Interactive querying |
Property | trec.model |
Used in | org.terrier.applications.TRECQuerying |
Possible values | Name of weighting models |
Default value | InL2 |
Configures | The weighting model to use during retrieval. |
Property | trec.results |
Used in | org.terrier.utility.ApplicationSetup, TrecTerrier |
Possible values | Absolute directory path |
Default value | TERRIER_VAR + value of "trec.results" |
Configures | TREC_RESULTS. Where TREC*Querying applications should store their results files and where evaluation files should be placed. |
Property | trec.results.file |
Used in | org.terrier.applications.TRECQuerying |
Possible values | A valid file name. |
Default value | not specified |
Configures | An arbitrary name for a TREC results file. |
Property | trec.querycounter.type |
Used in | org.terrier.applications.TRECQuerying |
Possible values | sequential, random |
Default value | sequential |
Configures | Whether to use sequential (auto-incremented) or randomly generated suffixes for run names. |
Property | trec.results.suffix |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | string |
Default value | .res |
Configures | ApplicationSetup.TREC_RESULTS_SUFFIX. The suffix to be used for result files. |
Property | trec.runtag |
Used in | org.terrier.applications.TRECQuerying, org.terrier.applications.TRECQueryingExpansion |
Possible values | string |
Default value | not specified |
Configures | An arbitrary runtag (6th field) for a TREC results file. |
Property | trec.topics |
Used in | org.terrier.applications.TRECQuerying |
Possible values | A valid topics file name |
Default value | not specified |
Configures | A single file containing the topics to be processed. |
Property | trec.topics.parser |
Used in | org.terrier.applications.TRECQuerying |
Possible values | A sub-class of org.terrier.structures.QuerySource |
Default value | TRECQuery |
Configures | The class to be used when parsing a topics file. |
Property | trec.encoding |
Used in | org.terrier.structures.TRECQuery, org.terrier.indexing.TRECCollection, org.terrier.indexing.TRECUTFCollection, org.terrier.terms.Stopwords |
Possible values | A valid encoding scheme. |
Default value | The system's default charset. |
Configures | The encoding to use for topics, documents, and stopwords files. |
Property | trec.qrels |
Used in | org.terrier.utility.ApplicationSetup |
Possible values | Absolute filename |
Default value | not specified |
Configures | A single file containing the qrels to evaluate with. |
Property | trec.output.format.length |
Used in | org.terrier.applications.TRECQuerying, org.terrier.applications.TRECQueryingExpansion, org.terrier.applications.TRECLMQuerying |
Possible values | integer number > 0 |
Default value | 1000 |
Configures | the maximum number of results to be displayed for TREC querying |
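Taken together, the TREC querying properties above describe a batch retrieval run. The fragment below is a minimal sketch of such a run; the file paths, the results file name and the runtag are hypothetical placeholders, while the model and result length are the documented defaults.

```properties
# Hypothetical locations of the topics and relevance assessments
trec.topics=/data/trec/topics.401-450
trec.qrels=/data/trec/qrels.trec8.small_web

# Weighting model and number of results per query (documented defaults)
trec.model=InL2
trec.output.format.length=1000

# Hypothetical results file name and runtag (6th field of each results line)
trec.results.file=myrun.res
trec.runtag=myrun
```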
Property | trec.querying.outputformat |
Used in | org.terrier.applications.TRECQuerying |
Possible values | A sub-class of TRECQuerying$OutputFormat |
Default value | TRECQuerying$TRECDocnoOutputFormat |
Configures | The class used to write the results file. |
Property | trec.querying.resultscache |
Used in | org.terrier.applications.TRECQuerying |
Possible values | A sub-class of TRECQuerying$QueryResultCache |
Default value | TRECQuerying$NullQueryResultCache |
Configures | The class used to cache the results. |
Property | trec.querying.dump.settings |
Used in | org.terrier.applications.TRECQuerying |
Possible values | true, false |
Default value | true |
Configures | Whether the settings used to generate a results file should be dumped to a .settings file in conjunction with the .res file. |
Property | trec.iteration |
Used in | org.terrier.applications.TRECQuerying, org.terrier.applications.TRECQueryingExpansion, org.terrier.applications.TRECLMQuerying |
Possible values | String |
Default value | Q |
Configures | The value of the iteration field (2nd field) in the standard TREC results format. |
Property | trec.manager |
Used in | org.terrier.applications.TRECQuerying, org.terrier.applications.TRECQueryingExpansion |
Possible values | String, Class name in org.terrier.querying |
Default value | Manager |
Configures | The Manager class to use during querying |
Property | trec.matching |
Used in | org.terrier.applications.TRECQuerying, org.terrier.applications.TRECQueryingExpansion |
Possible values | String, Class name in org.terrier.matching |
Default value | org.terrier.matching.taat.Full |
Configures | The Matching class to use during querying |
Property | matching.trecresults.file |
Used in | org.terrier.matching.TRECResultsMatching |
Possible values | A valid TREC results file |
Default value | not specified |
Configures | The TREC-formatted results file containing search results for each of the topics specified in the trec.topics property |
Property | matching.trecresults.format |
Used in | org.terrier.matching.TRECResultsMatching |
Possible values | DOCNO, DOCID |
Default value | DOCNO |
Configures | Whether the TREC-formatted results file contains DOCNOs or Terrier's internal (integer) docids |
Property | matching.trecresults.scores |
Used in | org.terrier.matching.TRECResultsMatching |
Possible values | true, false |
Default value | true |
Configures | Whether Terrier should use the relevance scores from the TREC-formatted results file |
Property | matching.trecresults.length |
Used in | org.terrier.matching.TRECResultsMatching |
Possible values | a non-negative integer |
Default value | 1000 |
Configures | The maximum number of results to be retrieved from a TREC results file for each query. If set to 0, all available results are retrieved (note that setting this property to 0 may slow down the retrieval process for large collections, as a result set of the size of the collection will be allocated in memory) |
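The matching.trecresults.* properties above only take effect when TRECResultsMatching is the Matching class in use. The following sketch assumes it is selected through trec.matching with its fully qualified class name, and uses a hypothetical path for the existing results file; the remaining values are the documented defaults.

```properties
# Assumption: select TRECResultsMatching as the Matching implementation
trec.matching=org.terrier.matching.TRECResultsMatching

# Hypothetical path to an existing TREC-formatted results file
matching.trecresults.file=/data/runs/baseline.res
# The file identifies documents by DOCNO rather than by internal docid
matching.trecresults.format=DOCNO
# Reuse the scores recorded in the file
matching.trecresults.scores=true
# Read at most 1000 results per query; 0 would read all available results
matching.trecresults.length=1000
```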
Property | parameter.free.expansion |
Used in | org.terrier.matching.models.queryexpansion.QueryExpansionModel |
Possible values | true or false |
Default value | true |
Configures | Whether we apply parameter-free query expansion or not. |
Property | rocchio.beta |
Used in | org.terrier.matching.models.queryexpansion.QueryExpansionModel |
Possible values | float |
Default value | 0.4 |
Configures | The parameter of Rocchio's automatic query expansion |
Property | trec.qe.model |
Used in | org.terrier.applications.TRECQuerying |
Possible values | Query expansion models |
Default value | Bo1 |
Configures | A name of a query expansion model |
Property | expansion.documents |
Used in | org.terrier.matching.models.queryexpansion.QueryExpansionModel |
Possible values | integer |
Default value | 3 |
Configures | The number of top-ranked documents to be considered in the pseudo relevance set |
Property | expansion.terms |
Used in | org.terrier.matching.models.queryexpansion.QueryExpansionModel |
Possible values | integer |
Default value | 10 |
Configures | The number of the highest weighted terms from the pseudo relevance set to be added to the original query. There can be overlap between the original query terms and the added terms from the pseudo relevance set |
Property | expansion.mindocuments |
Used in | org.terrier.querying.ExpansionTerms |
Possible values | integer |
Default value | 2 |
Configures | The minimum number of documents a term must exist in before it can be considered to be informative. Defaults to 2. For more information, see Giambattista Amati: Information Theoretic Approach to Information Extraction. FQAS 2006: 519-529 DOI 10.1007/11766254_44 |
Property | qe.feedback.selector |
Used in | org.terrier.querying.QueryExpansion |
Possible values | classname, or comma-delimited class names |
Default value | PseudoRelevanceFeedbackSelector |
Configures | Class(es) that select feedback documents for query expansion. All classes must implement FeedbackSelector. If more than one is specified, then a chain is assumed, with last being innermost in the chain. |
Property | qe.expansion.terms.class |
Used in | org.terrier.querying.QueryExpansion |
Possible values | classname, or comma-delimited class names |
Default value | DFRBagExpansionTerms |
Configures | Class(es) that select terms during query expansion. All classes must extend ExpansionTerms. If more than one is specified, then a chain is assumed, with last being innermost in the chain. |
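For reference, the query expansion properties above can be collected into a single fragment. This sketch simply writes out the documented defaults, which can serve as a starting point for tuning. It does not by itself enable query expansion, which is switched on elsewhere (for example through a control mapped to the QueryExpansion post process).

```properties
# Query expansion model
trec.qe.model=Bo1
# Use parameter-free expansion, so rocchio.beta is not the controlling parameter
parameter.free.expansion=true
rocchio.beta=0.4

# Pseudo relevance set: top 3 documents, 10 expansion terms,
# each term must occur in at least 2 of the feedback documents
expansion.documents=3
expansion.terms=10
expansion.mindocuments=2

# Feedback document and expansion term selection classes
qe.feedback.selector=PseudoRelevanceFeedbackSelector
qe.expansion.terms.class=DFRBagExpansionTerms
```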
Property | match.empty.query |
Used in | org.terrier.matching.Matching, org.terrier.matching.LMMatching |
Possible values | true, false |
Default value | true |
Configures | If true, return all documents for an empty query. Use this if you have post filter/processes to filter out the documents. E.g. link: site: etc |
Property | querying.allowed.controls |
Used in | org.terrier.querying.Manager |
Possible values | Comma delimited list of which controls are allowed to be specified on the query. For use in interactive querying. |
Default value | c, range |
Configures | Comma delimited list of which controls are allowed to be specified on the query. For use in interactive querying. "String:String" in the query are assumed to be fields unless the first string is an allowed control. An example value would be: c, range, link, site. |
Property | querying.default.controls |
Used in | org.terrier.querying.Manager |
Possible values | Comma delimited list of control names and values. Names and values are separated by colon. |
Default value | not specified |
Configures | Sets the default control values for the querying process. Controls are used to control the querying process, and may be used to set matching models, post filters, post processes, etc. An example value would be: c:10,site:gla.ac.uk |
Property | querying.postprocesses.order |
Used in | org.terrier.querying.Manager |
Possible values | Comma delimited list of all allowed post processes. |
Default value | not specified |
Configures | Specifies which post processes may be called, and the order in which they are called. The order matters because post processes often have inter-dependencies. An example value would be: QueryExpansion,Scope,Site |
Property | querying.postprocesses.controls |
Used in | org.terrier.querying.Manager |
Possible values | Comma and colon delimited list of control names and post process names. |
Default value | not specified |
Configures | Specifies which controls enable which post processes. An example value would be: site:Site,qe:QueryExpansion,scope:Scope |
Property | querying.postfilters.order |
Used in | org.terrier.querying.Manager |
Possible values | Comma delimited list of all allowed post filters. |
Default value | not specified |
Configures | Specifies which post filters may be called, and the order in which they are called. The order matters because post filters often have inter-dependencies. An example value would be: LinkFilter |
Property | querying.postfilters.controls |
Used in | org.terrier.querying.Manager |
Possible values | Comma and colon delimited list of control names and post filter names. |
Default value | not specified |
Configures | Specifies which controls enable which post filters. An example value would be: link:LinkFilter |
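The six querying.* properties above work as a set: a control must be allowed before it can appear in a query, and each control is mapped to the post process or post filter it enables. The sketch below combines the example values quoted in the individual descriptions into one fragment; it is an illustration of how they fit together, not a shipped configuration.

```properties
# Controls that a user may specify on the query
querying.allowed.controls=c,range,link,site
# Controls applied to every query unless overridden
querying.default.controls=c:10,site:gla.ac.uk

# Post processes: permitted classes in calling order, and the controls that enable them
querying.postprocesses.order=QueryExpansion,Scope,Site
querying.postprocesses.controls=site:Site,qe:QueryExpansion,scope:Scope

# Post filters: permitted classes in calling order, and the controls that enable them
querying.postfilters.order=LinkFilter
querying.postfilters.controls=link:LinkFilter
```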
Property | matching.dsms |
Used in | org.terrier.matching.Matching, org.terrier.matching.LMMatching |
Possible values | Comma delimited names of classes in org.terrier.matching.dsms, or other fully qualified models |
Default value | not specified |
Configures | Specifies the static org.terrier.matching.dsms.DocumentScoreModifiers that should be applied to all terms of all queries. |
Property | matching.retrieved_set_size |
Used in | org.terrier.matching.Matching, org.terrier.matching.LMMatching |
Possible values | integer values > 0 |
Default value | 1000 |
Configures | Maximum size of the result set. |
Property | desktop.file.associations |
Used in | org.terrier.applications.desktop.filehandling.AssociationFileOpener |
Possible values | absolute path to filename |
Default value | TERRIER_VAR/desktop.fileassoc |
Configures | The name of the file in which the associations between file types and applications are saved. If no absolute path is specified, it is assumed to be relative to TERRIER_HOME/var. |
Property | desktop.indexing.singlepass |
Used in | org.terrier.applications.desktop.DesktopTerrier |
Possible values | true, false |
Default value | false |
Configures | Whether single-pass indexing is used by Desktop Terrier. |
Property | desktop.directories.filelist |
Used in | org.terrier.applications.desktop.DesktopTerrier |
Possible values | absolute path to filename |
Default value | TERRIER_VAR\index\data.filelist |
Configures | the name of the file in which we list all files that have been indexed |
Property | trec.collection.pointers |
Used in | org.terrier.indexing.TRECCollection |
Possible values | full path to filename |
Default value | TERRIER_INDEX_PATH + "docpointers.col" |
Configures | The name of a file that saves pointers for each file to the original text in the collection files. |
Property | stopwords.intern.terms |
Used in | org.terrier.terms.Stopwords |
Possible values | true, false |
Default value | false |
Configures | Whether stopwords should be interned during indexing. |
Property | string.use_utf |
Used in | org.terrier.structures.TRECQuery, org.terrier.structures.LexiconMerger, org.terrier.indexing.BasicIndexer, org.terrier.indexing.BlockIndexer, org.terrier.indexing.SimpleXMLCollection |
Possible values | true, false |
Default value | false |
Configures | Whether UTF support should be enabled. |
Copyright © 2011 University of Glasgow | All Rights Reserved