Some features of Terrier
Posted by: carmen
Date: January 19, 2010 09:25PM

Hello:

I'm running some experiments with Terrier for my thesis, and I have a few questions about some functionality that I've been trying to use but can't seem to find. If anyone knows even one of the answers, I'd really appreciate it if you could tell me.

1. Does Terrier support incremental indexing? I read an old post (http://ir.dcs.gla.ac.uk/terrier/forum//read.php?3,654,721#msg-721) where someone was thinking of doing this, but the conclusions were never posted. In the Terrier FAQ I found a proposal for three scenarios, but does anyone have some example code for this, or at least know which classes to use to add new documents to an existing index?

2. Can Terrier produce a summary of a retrieved document, or even highlight some text in it? How?

3. The Terrier index is case insensitive. Is there a way to make it case sensitive? In terrier.properties I only found a way to set this for TREC tags, not for the index itself. My goal is to support both case-sensitive and case-insensitive search.

4. Which classes do I have to use to configure a search for more than one term?

5. Sometimes a user types only the first few characters of a word when searching, for example "h", "ho", or "hous" instead of "house". Is there a class in Terrier that supports this kind of search?

I'm working with SimpleFileCollection, but if Terrier has these functions for TRECCollections I would like to know how I can use them.

Thanks in advance,

Carmen

Re: Some features of Terrier
Posted by: rodrygo
Date: February 17, 2010 12:07PM

Hi Carmen,

As for your questions:

1) Currently, Terrier offers no support for incremental indexing. However, there is a simple workaround. Basically, you have to index the new content, and merge it with your existing index. Please have a look at StructureMerger: [ir.dcs.gla.ac.uk]

2) Nope, but you can implement one by looking up the terms in any given document using the DirectIndex: [ir.dcs.gla.ac.uk]

3) From the properties page (http://ir.dcs.gla.ac.uk/terrier/doc/properties.html): TrecDocTags.casesensitive

4) What do you mean by "search for more than one term"? By default, Terrier retrieves documents containing *at least one* of the query terms. If you want it to search for documents containing *all* query terms, please have a look at [ir.dcs.gla.ac.uk]

5) Nope. If you really want to do this, you can start by looking at neighbour terms in the lexicon, since these share the same prefix.
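The idea in 5) can be sketched without any Terrier classes. The example below is only an illustration, assuming the lexicon terms are available as a sorted set of strings; the class name PrefixLookup, the method termsWithPrefix, and the toy lexicon are hypothetical and are not part of the Terrier API. Because the set is sorted, all terms sharing a prefix sit next to each other, so it is enough to jump to the first term not smaller than the prefix and read forward until a term no longer starts with it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical helper, not a Terrier class: prefix lookup over a sorted term set.
class PrefixLookup {

    // Returns every term in the sorted lexicon that starts with the given prefix.
    static List<String> termsWithPrefix(NavigableSet<String> lexicon, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet starts at the first term >= prefix; in sorted order, all
        // terms sharing the prefix follow it as one contiguous block.
        for (String term : lexicon.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // we have moved past the block of matching terms
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Toy lexicon standing in for the terms of a real index (assumption).
        NavigableSet<String> lexicon = new TreeSet<>(Arrays.asList(
                "garden", "hat", "horse", "hour", "house", "hut", "ink"));
        System.out.println(termsWithPrefix(lexicon, "ho"));   // [horse, hour, house]
        System.out.println(termsWithPrefix(lexicon, "hous")); // [house]
    }
}
```

The matching terms could then be used as the terms of an ordinary multi-term query. How to read the terms out of a real Terrier lexicon depends on the Terrier version and is not shown here.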

Cheers,
Rodrygo

Re: Some features of Terrier
Posted by: carmen
Date: March 01, 2010 03:35PM

Thanks a lot for your answers, they have been very helpful!

Greetings,
Carmen
