Desktop Terrier is an example application we have provided with Terrier for two purposes:
Importantly, Desktop Terrier is only a sample application to help users become familiar with the functionality that Terrier provides. We do not recommend using Desktop Terrier for large or complex indexing jobs. Instead, once you are comfortable with the Terrier functionality, indexing and batch retrieval should be performed using the command line. You have been warned.
The application window of Desktop Terrier features two main tabs: "Search" and "Index". In the following paragraphs we will explain how you can use the application to index and search documents on your computer.
Here we will explain how you can specify which documents you want Desktop Terrier to index.
Indexing is the process in which Terrier examines all the files in the folders you specified, reads the documents if it can, and creates an index. There are only two buttons on the "Index" tab. The "Select Folders..." button opens a dialog that allows you to select which folders should be indexed. The application examines these folders recursively and indexes all the supported document types. Based on the file extension, the application tries to find a corresponding parser. If no appropriate parser can be found, the file is ignored. At the moment Terrier supports parsing of plain text, PDF, MS Word, MS PowerPoint, MS Excel, HTML, XML, XHTML, TeX, and Bib documents. Importantly, Desktop Terrier uses SimpleFileCollection, hence each file counts as a single document. More complex formats like those used at TREC are not detected by default. We recommend using Terrier from the command line to process these types of collections.
Once you have selected the folders to index, click the "Create Index" button to start the indexing process. At the moment Terrier does not support incremental indexing. That means that every time you press the "Create Index" button, Terrier removes the old index and indexes all specified folders from scratch. The progress of the indexing is reported in the text field at the bottom of the window. After the indexing has finished, the application automatically switches to the "Search" tab.
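To get a rough idea of which files a folder would contribute before indexing it, you can list the files whose extensions suggest a supported type. The following is only a sketch and does not use Terrier itself: the function name list_candidates and the extension list are assumptions made for illustration, and the parsers that Desktop Terrier actually registers may differ.

```shell
#!/bin/sh
# Sketch only: list files under a folder, recursively, whose extension
# suggests one of the document types named above.
# The extension list is an assumption, not Terrier's own configuration.
list_candidates() {
    find "$1" -type f \( \
        -iname '*.txt'  -o -iname '*.pdf' -o -iname '*.doc'   -o \
        -iname '*.ppt'  -o -iname '*.xls' -o -iname '*.html'  -o \
        -iname '*.htm'  -o -iname '*.xml' -o -iname '*.xhtml' -o \
        -iname '*.tex'  -o -iname '*.bib' \) | sort
}

# Use the folder given as the first argument, or the current folder.
list_candidates "${1:-.}"
```

Files with any other extension are left out of the listing, which mirrors how files without a matching parser are ignored during indexing.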
You can now use the Search tab of Desktop Terrier to search for documents. Enter terms that you think your document may contain in the text box beside the Search button, and press Search. Documents Terrier thinks are relevant will be displayed in the list below. You can open a document by double clicking on that row in the table. The type of the document is shown in the second column.
On the Search tab, you can enter a query in the text field and press the "Search" button to obtain the retrieval results. The results are shown in the table below the search field, as a ranked list of documents. The table has four columns: the first contains the rank of a document, the second its file name, the third the full path to the document, and the fourth the score of the document.
To formulate a query, you can incorporate the query language of Terrier. For example:
By default, Desktop Terrier retrieves the documents that contain all the query terms. If there are no such documents, it returns the documents that contain at least one of the query terms.
In order to open one of the retrieved documents, you may double-click on its filename, i.e. the corresponding cell of the second column. Opening the retrieved files is a platform-dependent function. In Windows environments, the application uses the file associations used by the operating system, while in other environments, such as Linux or Mac OS X, the file associations need to be set by the user. In these cases, the associations are saved in a file with the default name desktop.fileassoc in the var directory of your installation.
If there is already an application associated with the file, this application will start and open the file you double-clicked on. If no application is associated, a dialog will appear to help you select an appropriate application.
This documentation is also available from the Help menu of Desktop Terrier.
Should you have trouble using Desktop Terrier, e.g. if the application is not running as expected, you can make use of the "--debug" option:
bin/desktop_terrier.sh --debug (Linux, Mac OS X)
bin\desktop_terrier.bat --debug (Windows)
If you use Desktop Terrier regularly, you may wish to have Terrier re-index your documents automatically at set times. You can do this by scheduling Terrier to run with the "--reindex" option:
bin/desktop_terrier.sh --reindex (Linux, Mac OS X)
bin\desktop_terrier.bat --reindex (Windows)
To schedule this command to run repeatedly, use the crontab utility on Unix. On Windows, use the Scheduled Tasks functionality, which can be found in the Control Panel.
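As an illustration of the Unix case, the small script below prints a crontab entry that would run the re-index command every night at 02:00. It is a sketch under stated assumptions: the function name reindex_cron_line, the TERRIER_HOME variable and the /opt/terrier path are hypothetical, and whether "--reindex" can run without a graphical display is not stated here, so check this on your system first.

```shell
#!/bin/sh
# Sketch only: print a crontab entry that re-indexes daily at 02:00.
# The five leading fields are minute, hour, day of month, month, weekday.
# The installation directory is passed in; /opt/terrier is a placeholder.
reindex_cron_line() {
    terrier_home=$1
    printf '0 2 * * * cd %s && bin/desktop_terrier.sh --reindex\n' "$terrier_home"
}

# TERRIER_HOME is an assumed variable naming your installation directory.
reindex_cron_line "${TERRIER_HOME:-/opt/terrier}"
```

You can paste the printed line into the editor opened by "crontab -e". Changing into the installation directory first keeps the relative bin/ path valid, since cron does not start jobs from that directory.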
You can configure the Desktop using many of the properties listed elsewhere in the Terrier documentation. These can be set in the etc/terrier.properties file. Moreover, it is possible to configure the Desktop using the following properties:
Properties:
Copyright © 2011 University of Glasgow | All Rights Reserved