Terrier Users :  Terrier Forum terrier.org
General discussion about using/developing applications using Terrier 
Which scoring function is a probability function in terrier, ie, the rating of documents is less than 1?
Posted by: kasadegh ()
Date: February 15, 2018 09:22AM

Which scoring function in Terrier is a probability function, i.e. one where the score of each document is less than 1?
I examined the output of the different models, and with every method the document scores are greater than 1. I am looking for a model that scores documents as probabilities.
Thanks

Re: Which scoring function is a probability function in terrier, ie, the rating of documents is less than 1?
Posted by: craigm ()
Date: February 15, 2018 10:09AM

Hi,

A strict probabilistic model would assume term independence, and would therefore multiply the per-term probabilities. Multiplying many small probabilities leads to increasing floating-point error, and eventually to underflow. For that reason, it's more conventional to add the logarithms of the probabilities.
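To illustrate the point about floating-point error, here is a minimal Python sketch. The per-term probabilities are made-up values chosen only to show the effect; they do not come from any Terrier weighting model.

```python
import math

# Hypothetical per-term probabilities (assumed values, for illustration only).
term_probs = [1e-5] * 80

# Multiplying the probabilities directly: the running product drops below
# the smallest representable double and collapses to exactly 0.0.
product = 1.0
for p in term_probs:
    product *= p

# Adding the logarithms instead: the score stays finite and preserves
# the ranking, because log is monotonic.
log_score = sum(math.log(p) for p in term_probs)

print(product)                   # 0.0
print(math.isfinite(log_score))  # True
```

The log score is negative and unbounded below, so it is a rank-equivalent stand-in for the probability rather than a value between 0 and 1.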

You should probably just transform the scores of whichever weighting model works best. See [theses.gla.ac.uk] page 107.
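As a rough sketch of what "transforming the score" could look like, the Python below maps one query's scores into the range 0 to 1 in two common ways. This is an assumption for illustration, not the transform described in the thesis referenced above; the function names and the example scores are hypothetical.

```python
import math

def min_max_normalise(scores):
    """Rescale one query's scores linearly so the best is 1.0 and the worst 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: no spread to rescale, so treat them all as 1.0.
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def softmax(scores):
    """Treat scores as unnormalised log-probabilities over the result list."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the top three documents of a single query.
scores = [12.4, 9.1, 3.7]
print(min_max_normalise(scores))
print(softmax(scores))
```

Both transforms depend on the scores in that query's own result list, so the resulting values are relative to the query and are not calibrated probabilities of relevance.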

Craig

Re: Which scoring function is a probability function in terrier, ie, the rating of documents is less than 1?
Posted by: kasadegh ()
Date: February 15, 2018 07:08PM

Thanks for your response, but I need a scoring function that lets me define a single threshold common to all queries, so that for any query I retrieve only the documents whose score is above that threshold. Which of the retrieval models in Terrier is suitable for this? Can I do this with a log transformation of the existing models in Terrier?

Thanks

Re: Which scoring function is a probability function in terrier, ie, the rating of documents is less than 1?
Posted by: craigm ()
Date: February 16, 2018 09:20AM

You can apply a log transform to any model. However, thresholding (i.e. selecting the cutoff at which to stop ranking) is difficult.

See

Avi Arampatzis, Jaap Kamps, Stephen Robertson: Where to stop reading a ranked list?: threshold optimization using truncated score distributions. SIGIR 2009: 524-531

I would advise rethinking your approach.

Craig
