  1. Terrier Core
  2. TR-60

Remove PonteCroft language modelling

    Details

      Description

      The PonteCroft language modelling approach is supported in Terrier, but using it requires creating additional index structures. The model is seldom used, either by us or by the wider language modelling community. Terrier also supports Hiemstra's LM, and the common package provides Dirichlet LM.

      It is believed that the framework is operational at present. However, it does not have any unit tests.

      The purpose of this issue is to discuss whether this package is strategic enough to remain in Terrier long term, or whether it should be removed.

      There are three options relating to the framework:
       a. Remove it completely
       b. Move it to common package (where it may stagnate)
       c. Keep it.

      A prerequisite for options b and c is that we add some way of testing that it is functional.

      Please discuss.

            Activity

            ounis Iadh Ounis added a comment -

            I agree that the Ponte-Croft model is hardly used. We never really used it, but more importantly it is hardly used in recent language modelling papers. In fact, the Hiemstra model is much more effective, and is more suitable as a QL baseline. Therefore, I agree that the presence of the Ponte-Croft model in the Terrier core is not really needed.

            I'm however more inclined to move it from the core to a common package (where it can peacefully die --hummm, I meant stagnate), i.e. I vote for option (b) above. We never know: we might need it for something one day.

            I agree that we need unit testing for it though.

            craigm Craig Macdonald added a comment -

            Resolved. (Though the common version doesn't actually work.)


              People

              • Assignee:
                craigm Craig Macdonald
                Reporter:
                craigm Craig Macdonald
              • Watchers:
                1
