Terrier Core / TR-242

Problem with query terms frequency (key frequency = 1) using BM25

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 3.5
    • Fix Version/s: 3.6
    • Component/s: .indexing, .matching, .querying
    • Labels:
      None

      Description

      I'm using the BM25 model for the retrieval step and have noticed some problems while parsing the queries.
      I have evaluated two sets of queries:
      1- set1: queries in which each query term occurs exactly once
      2- set2: the same queries, but with each term occurring more than once
      I got the same results with both sets. Am I missing something in the configuration of the Terrier properties, or is it a problem with the BM25 formula (the normalisation of the query terms)?
      I have found a similar issue here: http://terrier.org/forum//read.php?3,1222
      I set querying.normalise.weights to false, but nothing changed.
      In the BM25 formula, the key frequency is supposed to provide the query term frequency, so with set2 the key frequency should be > 1. However, when I tried to get the value of the query term frequency while computing the score, I noticed that the value returned is always equal to 1 (key frequency = 1).
      I also tried parsing the queries in the SingleLineTrecQueries format instead of the TRECQueries format, and again nothing changed between the two sets of queries (set1 and set2). Any idea how to get the query term frequency when a query's terms occur more than once in the query?
      Many thanks
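
      To make the expected behaviour concrete, here is a minimal sketch of how the key frequency would normally enter a BM25 score. It is not Terrier's code: the class name, the method names, the whitespace tokenisation and the parameter values (k1 = 1.2, b = 0.75, k3 = 8) are all assumptions chosen for illustration, and the formula is the usual textbook form of BM25 with a query-side saturation term.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; NOT Terrier's implementation.
// All names and parameter values below are assumptions.
final class KeyFrequencySketch {
    // Assumed BM25 parameters (commonly quoted values).
    static final double K1 = 1.2;
    static final double B = 0.75;
    static final double K3 = 8.0;

    /** Counts how often each whitespace-separated term occurs in the query. */
    static Map<String, Integer> keyFrequencies(String query) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : query.trim().toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Query-side BM25 factor: (k3 + 1) * kf / (k3 + kf). Equals 1 when kf is 1. */
    static double queryTermWeight(double keyFrequency) {
        return (K3 + 1.0) * keyFrequency / (K3 + keyFrequency);
    }

    /** Textbook BM25 score of one term for one document. */
    static double score(double tf, double docLength, double avgDocLength,
                        double numDocs, double docFreq, double keyFrequency) {
        double lengthNorm = K1 * ((1.0 - B) + B * docLength / avgDocLength);
        double idf = Math.log((numDocs - docFreq + 0.5) / (docFreq + 0.5));
        double docSide = (K1 + 1.0) * tf / (lengthNorm + tf);
        return idf * docSide * queryTermWeight(keyFrequency);
    }

    public static void main(String[] args) {
        // set1-style query: every term once; set2-style query: one term repeated.
        Map<String, Integer> set1 = keyFrequencies("information retrieval");
        Map<String, Integer> set2 = keyFrequencies("information information retrieval");
        System.out.println("set1 key frequencies: " + set1);
        System.out.println("set2 key frequencies: " + set2);

        // Hypothetical collection statistics, used only to compare the two scores.
        double s1 = score(3, 100, 120, 1000, 10, set1.get("information"));
        double s2 = score(3, 100, 120, 1000, 10, set2.get("information"));
        System.out.println("score with kf=1: " + s1);
        System.out.println("score with kf=2: " + s2);
    }
}
```

      Under these assumptions the query-side factor is exactly 1 when the key frequency is 1 and about 1.8 when it is 2, so set1 and set2 should produce different scores. Identical results for both sets are what this sketch would give if the key frequency reaching the weighting model were always 1, which matches the behaviour described above.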

              People

              • Assignee:
                craigm Craig Macdonald
                Reporter:
                bouhini Chahrazed Bouhini