  1. Terrier Core
  2. TR-303

Make compression pluggable/selectable during indexing

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0
    • Component/s: None
    • Labels:
      None

      Description

      I want to make it easy to select which suite of compression classes to use during indexing.
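      One way to picture a selectable suite is a configuration property that names the class to load. The sketch below is illustrative only and is not the committed Terrier code: the `CompressionSuite` interface, the two suite classes, and the `example.compression.suite` property key are all hypothetical names invented for this example.

```java
import java.util.Properties;

public class SuiteSelection {
    /** Hypothetical stand-in for a suite of compression classes. */
    public interface CompressionSuite {
        String describe();
    }

    /** Hypothetical default suite. */
    public static class BitLevelSuite implements CompressionSuite {
        public String describe() { return "bit-level coding"; }
    }

    /** Hypothetical alternative suite. */
    public static class ByteAlignedSuite implements CompressionSuite {
        public String describe() { return "byte-aligned integer coding"; }
    }

    /**
     * Loads the suite named by the (hypothetical) property
     * "example.compression.suite", falling back to BitLevelSuite.
     */
    public static CompressionSuite load(Properties props) {
        String className = props.getProperty(
                "example.compression.suite", BitLevelSuite.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(CompressionSuite.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load suite " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("example.compression.suite",
                ByteAlignedSuite.class.getName());
        System.out.println(load(props).describe());
    }
}
```

      Loading by class name keeps the indexer unaware of which suites exist, so a new suite only needs to be on the classpath and named in the configuration.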

            Activity

            craigm Craig Macdonald added a comment -

            Matteo, Richard,

            Can you review the attached class? I have implemented it to specify the classes used to write and read a normal Terrier disk index.

            Thanks

            Craig

            craigm Craig Macdonald added a comment -

            It strikes me that the name of the index data structure could be a parameter to the Factory method, as this would allow different compression techniques for the inverted versus direct index.

            Richard, could you also comment on how this compression factory integrates with the in-memory index classes?
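
            The idea of passing the structure name to the factory method can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the attached class: `PerStructureFactory`, `CompressionChoice`, the codec labels, and the structure names "inverted" and "direct" are hypothetical and chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class PerStructureFactory {
    /** Hypothetical record of how one index structure is compressed. */
    public static final class CompressionChoice {
        public final String structureName;
        public final String codec;

        CompressionChoice(String structureName, String codec) {
            this.structureName = structureName;
            this.codec = codec;
        }
    }

    private final Map<String, String> codecByStructure = new HashMap<>();
    private final String defaultCodec;

    public PerStructureFactory(String defaultCodec) {
        this.defaultCodec = defaultCodec;
    }

    /** Registers a codec override for one named structure. */
    public PerStructureFactory use(String structureName, String codec) {
        codecByStructure.put(structureName, codec);
        return this;
    }

    /** Factory method keyed by structure name; unregistered names get the default. */
    public CompressionChoice forStructure(String structureName) {
        String codec = codecByStructure.getOrDefault(structureName, defaultCodec);
        return new CompressionChoice(structureName, codec);
    }

    public static void main(String[] args) {
        PerStructureFactory factory =
                new PerStructureFactory("bit-level").use("direct", "byte-aligned");
        System.out.println(factory.forStructure("inverted").codec);
        System.out.println(factory.forStructure("direct").codec);
    }
}
```

            Because the lookup is keyed by name, the inverted and direct structures can resolve to different codecs while sharing one factory.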

            catena.matteo Matteo Catena added a comment -

            The attached class seems ok to me.

            > It strikes me that the name of the index data structure could be a parameter to the Factory method, as this would allow different compression techniques for the inverted versus direct index.
            I'm not sure I got this. In my case I haven't worked with direct indexes, but IntegerCodingPostingIndex does take structureName as a constructor parameter. So maybe it can read properly written direct indexes without too many modifications.

            catena.matteo Matteo Catena added a comment -

            (In that case, of course, the direct and inverted indexes can be compressed differently.)

            craigm Craig Macdonald added a comment -

            Committed r3788


              People

              • Assignee:
                craigm Craig Macdonald
              • Reporter:
                craigm Craig Macdonald
              • Watchers:
                2
