Class TRECQuery
- java.lang.Object
-
- org.terrier.applications.batchquerying.TRECQuery
-
- All Implemented Interfaces:
java.util.Iterator<java.lang.String>, QuerySource
- Direct Known Subclasses:
SingleLineTRECQuery
public class TRECQuery extends java.lang.Object implements QuerySource
This class reads the queries from TREC topic files.
Properties:
- trecquery.ignore.desc.narr.name.tokens - whether the tokens DESCRIPTION and NARRATIVE in the desc and narr fields should be ignored. Defaults to true.
- tokeniser - name of the Tokeniser class used to tokenise topics. Defaults to EnglishTokeniser.
- trec.encoding - sets the encoding of the TREC topic files. Defaults to the system's default encoding.
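The three properties above and their documented defaults can be illustrated with a small, self-contained sketch. It uses plain `java.util.Properties` as a stand-in; how Terrier itself loads its configuration is not shown here, and the class name `TrecQueryPropertiesSketch`, its helper methods, and the override values are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Properties;

// Hypothetical stand-in: shows the property keys and documented defaults only.
// It does not use or reproduce Terrier's own configuration loading.
class TrecQueryPropertiesSketch {

    /** Builds example overrides; the chosen values are illustrative assumptions. */
    static Properties exampleOverrides() {
        Properties p = new Properties();
        p.setProperty("trecquery.ignore.desc.narr.name.tokens", "false");
        p.setProperty("trec.encoding", "UTF-8");
        // "tokeniser" is left unset so that the documented default applies.
        return p;
    }

    /** Documented default: true. */
    static boolean ignoreDescNarrTokens(Properties p) {
        return Boolean.parseBoolean(
            p.getProperty("trecquery.ignore.desc.narr.name.tokens", "true"));
    }

    /** Documented default: EnglishTokeniser. */
    static String tokeniser(Properties p) {
        return p.getProperty("tokeniser", "EnglishTokeniser");
    }

    /** Documented default: the system's default encoding. */
    static String encoding(Properties p) {
        return p.getProperty("trec.encoding", Charset.defaultCharset().name());
    }

    public static void main(String[] args) {
        Properties p = exampleOverrides();
        System.out.println("ignore tokens: " + ignoreDescNarrTokens(p));
        System.out.println("tokeniser:     " + tokeniser(p));
        System.out.println("encoding:      " + encoding(p));
    }
}
```

With the overrides in this sketch, only `tokeniser` falls back to its default; an empty `Properties` object yields the defaults for all three keys.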
- Author:
- Ben He & Craig Macdonald
-
-
Field Summary
Fields (modifier and type, field name, then description)
protected java.lang.String desiredEncoding
Encoding to be used to open all files.
protected static boolean IGNORE_DESC_NARR_NAME_TOKENS
Value of trecquery.ignore.desc.narr.name.tokens - whether the tokens DESCRIPTION and NARRATIVE in the desc and narr fields should be ignored. Defaults to true.
protected int index
The index of the queries.
protected static org.slf4j.Logger logger
The logger used for this class.
protected java.lang.String[] queries
The queries in the topic files.
protected java.lang.String[] query_ids
The query identifiers in the topic files.
protected TagSet tags
protected java.lang.String[] topicFiles
The topic files used in this object.
-
Constructor Summary
Constructors (constructor, then description)
TRECQuery()
Constructs an instance of TRECQuery that reads and stores all the queries from the files defined in the trec.topics property.
TRECQuery(java.lang.String queryfilename)
Constructs an instance of TRECQuery that reads and stores all the queries from the file with the specified filename.
TRECQuery(java.lang.String[] queryfilenames)
Constructs an instance of TRECQuery that reads and stores all the queries from the files with the specified filenames.
TRECQuery(java.lang.String[] queryfilenames, java.lang.String docTag, java.lang.String idTag, java.lang.String[] whitelist, java.lang.String[] blacklist)
-
Method Summary
Methods (modifier and type, method, then description)
protected void checkEncoding()
boolean extractQuery(java.lang.String[] queryfilenames, TagSet t, java.util.Vector<java.lang.String> vecStringQueries, java.util.Vector<java.lang.String> vecStringIds)
Extracts and stores all the queries from query files.
boolean extractQuery(java.lang.String queryfilename, TagSet t, java.util.Vector<java.lang.String> vecStringQueries, java.util.Vector<java.lang.String> vecStringIds)
Extracts and stores all the queries from a query file.
int getIndexOfCurrentQuery()
Returns the index of the last obtained query.
java.lang.String[] getInfo()
Returns the filenames of the topic files from which the queries were extracted.
int getNumberOfQueries()
Returns the number of queries read from the processed topic files.
java.lang.String getQuery(java.lang.String queryNo)
Returns the query for the given query number.
java.lang.String getQueryId()
Returns the query identifier of the last query fetched, or the first one if none has been fetched yet.
java.lang.String[] getQueryIds()
Returns the query ids.
boolean hasNext()
static void main(java.lang.String[] args)
java.lang.String next()
protected void performExtraction()
void remove()
void reset()
Resets the query source back to the first query.
java.lang.String[] toArray()
Returns the queries in an array of strings.
-
-
-
Field Detail
-
logger
protected static final org.slf4j.Logger logger
The logger used for this class
-
IGNORE_DESC_NARR_NAME_TOKENS
protected static final boolean IGNORE_DESC_NARR_NAME_TOKENS
Value of trecquery.ignore.desc.narr.name.tokens - whether the tokens DESCRIPTION and NARRATIVE in the desc and narr fields should be ignored. Defaults to true.
-
desiredEncoding
protected java.lang.String desiredEncoding
Encoding to be used to open all files.
-
topicFiles
protected java.lang.String[] topicFiles
The topic files used in this object
-
queries
protected java.lang.String[] queries
The queries in the topic files.
-
query_ids
protected java.lang.String[] query_ids
The query identifiers in the topic files.
-
index
protected int index
The index of the queries.
-
tags
protected TagSet tags
-
-
Constructor Detail
-
TRECQuery
public TRECQuery(java.lang.String[] queryfilenames, java.lang.String docTag, java.lang.String idTag, java.lang.String[] whitelist, java.lang.String[] blacklist)
-
TRECQuery
public TRECQuery()
Constructs an instance of TRECQuery that reads and stores all the queries from the files defined in the trec.topics property.
-
TRECQuery
public TRECQuery(java.lang.String queryfilename)
Constructs an instance of TRECQuery that reads and stores all the queries from the file with the specified filename.
- Parameters:
queryfilename - String the name of the file containing all the queries.
-
TRECQuery
public TRECQuery(java.lang.String[] queryfilenames)
Constructs an instance of TRECQuery that reads and stores all the queries from the files with the specified filenames.
- Parameters:
queryfilenames - String[] the names of the files containing all the queries.
-
-
Method Detail
-
extractQuery
public boolean extractQuery(java.lang.String[] queryfilenames, TagSet t, java.util.Vector<java.lang.String> vecStringQueries, java.util.Vector<java.lang.String> vecStringIds)
Extracts and stores all the queries from query files.
- Parameters:
queryfilenames - String[] the names of the files containing topics.
vecStringQueries - Vector a vector containing the queries as strings.
vecStringIds - Vector a vector containing the query identifiers as strings.
- Returns:
- boolean true if some queries were successfully extracted.
-
extractQuery
public boolean extractQuery(java.lang.String queryfilename, TagSet t, java.util.Vector<java.lang.String> vecStringQueries, java.util.Vector<java.lang.String> vecStringIds)
Extracts and stores all the queries from a query file.
- Parameters:
queryfilename - String the name of a file containing topics.
vecStringQueries - Vector a vector containing the queries as strings.
vecStringIds - Vector a vector containing the query identifiers as strings.
- Returns:
- boolean true if some queries were successfully extracted.
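The contract of the two extractQuery methods (fill one vector with queries and a parallel vector with identifiers, and return true if anything was extracted) can be sketched in isolation. The code below is not Terrier's implementation: it parses topic text from a string instead of a file, ignores the TagSet parameter and tokenisation entirely, and assumes the common `<top>`, `<num>`, `<title>` topic layout. The class name `TopicExtractionSketch` and the sample topics are hypothetical.

```java
import java.util.Vector;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the "extract into two parallel vectors" contract.
// Assumes topics look like: <top> <num> Number: ID <title> TEXT ... </top>
class TopicExtractionSketch {
    private static final Pattern TOP =
        Pattern.compile("<top>(.*?)</top>", Pattern.DOTALL);
    // The "Number:" label after <num> is treated as optional.
    private static final Pattern NUM =
        Pattern.compile("<num>\\s*(?:Number:)?\\s*(\\S+)");
    // The title runs until the next tag begins.
    private static final Pattern TITLE =
        Pattern.compile("<title>\\s*([^<]*)");

    /** Returns true if at least one (id, query) pair was extracted. */
    static boolean extract(String topicText,
                           Vector<String> vecStringQueries,
                           Vector<String> vecStringIds) {
        boolean found = false;
        Matcher top = TOP.matcher(topicText);
        while (top.find()) {
            String body = top.group(1);
            Matcher num = NUM.matcher(body);
            Matcher title = TITLE.matcher(body);
            if (num.find() && title.find()) {
                vecStringIds.add(num.group(1));
                vecStringQueries.add(title.group(1).trim());
                found = true;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample topics, for illustration only.
        String sample =
            "<top>\n<num> Number: 1\n<title> example query one\n"
          + "<desc> Description:\nsome description\n</top>\n"
          + "<top>\n<num> Number: 2\n<title> example query two\n</top>\n";
        Vector<String> queries = new Vector<>();
        Vector<String> ids = new Vector<>();
        boolean ok = extract(sample, queries, ids);
        System.out.println(ok + " " + ids + " " + queries);
    }
}
```

Keeping the identifiers and queries in two vectors of equal length mirrors the vecStringQueries and vecStringIds parameters documented above; position i in one corresponds to position i in the other.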
-
checkEncoding
protected void checkEncoding()
-
performExtraction
protected void performExtraction()
-
getIndexOfCurrentQuery
public int getIndexOfCurrentQuery()
Returns the index of the last obtained query.
- Returns:
- int the index of the last obtained query.
-
getNumberOfQueries
public int getNumberOfQueries()
Returns the number of queries read from the processed topic files.
- Returns:
- int the number of topics contained in the processed topic files.
-
getInfo
public java.lang.String[] getInfo()
Returns the filenames of the topic files from which the queries were extracted.
- Specified by:
getInfo in interface QuerySource
-
getQuery
public java.lang.String getQuery(java.lang.String queryNo)
Returns the query for the given query number.
- Parameters:
queryNo - String the number of a query.
- Returns:
- String the string representing the query.
-
hasNext
public boolean hasNext()
- Specified by:
hasNext in interface java.util.Iterator<java.lang.String>
-
next
public java.lang.String next()
- Specified by:
next in interface java.util.Iterator<java.lang.String>
-
getQueryId
public java.lang.String getQueryId()
Returns the query identifier of the last query fetched, or the first one if none has been fetched yet.
- Specified by:
getQueryId in interface QuerySource
- Returns:
- String the query number of a query.
-
getQueryIds
public java.lang.String[] getQueryIds()
Returns the query ids.
- Returns:
- String array containing the query ids.
- Since:
- 2.2
-
toArray
public java.lang.String[] toArray()
Returns the queries in an array of strings.
- Returns:
- String[] an array containing the strings that represent the queries.
-
reset
public void reset()
Resets the query source back to the first query.
- Specified by:
reset in interface QuerySource
-
remove
public void remove()
- Specified by:
remove in interface java.util.Iterator<java.lang.String>
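The iteration methods documented above (hasNext, next, getQueryId, getQuery, reset, remove) fit together as a cursor over two parallel arrays. The following is a minimal stand-in under stated assumptions, not the TRECQuery source: the class name `ArrayQuerySourceSketch` is hypothetical, the end-of-iteration behaviour of next() and the exception thrown by remove() follow the general `java.util.Iterator` contract and are assumptions about TRECQuery, and getQueryId() here requires at least one query.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an array-backed query source.
class ArrayQuerySourceSketch implements Iterator<String> {
    private final String[] queryIds;
    private final String[] queries;
    private int index = 0; // position of the next query to return

    ArrayQuerySourceSketch(String[] queryIds, String[] queries) {
        if (queryIds.length != queries.length) {
            throw new IllegalArgumentException("ids and queries must align");
        }
        this.queryIds = queryIds;
        this.queries = queries;
    }

    @Override
    public boolean hasNext() {
        return index < queries.length;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException(); // assumption, per Iterator contract
        }
        return queries[index++];
    }

    /** Id of the last query fetched, or of the first one if none was fetched. */
    String getQueryId() {
        return queryIds[index == 0 ? 0 : index - 1];
    }

    /** Linear lookup of a query by its identifier; null if unknown. */
    String getQuery(String queryNo) {
        for (int i = 0; i < queryIds.length; i++) {
            if (queryIds[i].equals(queryNo)) {
                return queries[i];
            }
        }
        return null;
    }

    /** Moves the cursor back to the first query. */
    void reset() {
        index = 0;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException(); // assumption for this sketch
    }

    public static void main(String[] args) {
        ArrayQuerySourceSketch source = new ArrayQuerySourceSketch(
            new String[] {"1", "2"},
            new String[] {"example query one", "example query two"});
        while (source.hasNext()) {
            String query = source.next();
            // getQueryId() is read after next(), so it names the query just fetched.
            System.out.println(source.getQueryId() + ": " + query);
        }
    }
}
```

Reading the identifier after calling next() is what makes the "last query fetched" wording work: the cursor has already advanced, so the identifier is taken from the position just before it.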
-
main
public static void main(java.lang.String[] args)
- Parameters:
args - the command-line arguments.
-
-