  1. Terrier Core
  2. TR-217

CS query expansion model is incorrect

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5
    • Fix Version/s: 3.6
    • Component/s: .matching
    • Labels:
      None

      Description

      A forum thread (http://terrier.org/forum//read.php?3,2619) pointed out that the CS.java query expansion model does not faithfully reproduce the formulae from Amati's thesis.

          Activity

          craigm Craig Macdonald created issue -
          craigm Craig Macdonald added a comment -

          In the formula for CS, the factor *totalDocumentLength on the second-to-last line should be *withinDocumentFrequency.

          craigm Craig Macdonald added a comment -

          Revised function:

              return totalDocumentLength * D
                  + 0.5d
                  * Idf.log(
                      2
                      * Math.PI
                      * withinDocumentFrequency
                      * (1d - withinDocumentFrequency / totalDocumentLength));
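
          To make the effect of the correction concrete, here is a minimal, self-contained sketch that evaluates the revised expression next to the earlier form described in the first comment. It is not the committed BA.java: the class name, the method names and the log2 helper are hypothetical, log2 stands in for Idf.log on the assumption that it is a base-2 logarithm, and D is simply passed in because its definition is not shown in this issue.

```java
public final class BinomialApproximationSketch {

    /** Base-2 logarithm; an assumed stand-in for Idf.log. */
    public static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Revised expression from the comment above; D is supplied by the caller. */
    public static double revised(double withinDocumentFrequency,
                                 double totalDocumentLength,
                                 double D) {
        return totalDocumentLength * D
            + 0.5d
            * log2(2
                * Math.PI
                * withinDocumentFrequency
                * (1d - withinDocumentFrequency / totalDocumentLength));
    }

    /**
     * Earlier form, reconstructed from the first comment: totalDocumentLength
     * appears where the fix puts withinDocumentFrequency.
     */
    public static double original(double withinDocumentFrequency,
                                  double totalDocumentLength,
                                  double D) {
        return totalDocumentLength * D
            + 0.5d
            * log2(2
                * Math.PI
                * totalDocumentLength
                * (1d - withinDocumentFrequency / totalDocumentLength));
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not taken from the issue.
        double tf = 25d, len = 100d, D = 0.1d;
        System.out.println("revised    = " + revised(tf, len, D));
        System.out.println("original   = " + original(tf, len, D));
        System.out.println("difference = " + (original(tf, len, D) - revised(tf, len, D)));
    }
}
```

          Under these assumptions the two forms differ by 0.5 * log2(totalDocumentLength / withinDocumentFrequency), so the earlier form scores higher whenever the term frequency is smaller than the document length.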
          
          craigm Craig Macdonald added a comment -

          On Gianni's advice, the revised model is called BA.java, which stands for BinomialApproximation. CS stood for Chi Square, but the class contained no Chi Square calculation.

          craigm Craig Macdonald added a comment -

          I have committed the revised query expansion model to SVN, r3678. Thanks to all those involved!

          craigm Craig Macdonald made changes -
          Field        Original Value   New Value
          Status       Open [ 1 ]       Resolved [ 5 ]
          Resolution                    Fixed [ 1 ]

            People

            • Assignee:
              craigm Craig Macdonald
              Reporter:
              craigm Craig Macdonald
            • Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: