[TR-303] Make compression pluggable/selectable during indexing Created: 24/Apr/14  Updated: 16/Jun/14  Resolved: 06/May/14

Status: Resolved
Project: Terrier Core
Component/s: None
Affects Version/s: None
Fix Version/s: 4.0

Type: New Feature
Priority: Major
Reporter: Craig Macdonald
Assignee: Craig Macdonald
Resolution: Fixed
Labels: None

Attachments: Text File CompressionFactory.java    
Issue Links:
Block
blocks TR-311 New integer compression techniques fo... Resolved
is blocked by TR-300 (Block)Inverted2Direct should not use... Resolved
Related
relates to TR-311 New integer compression techniques fo... Resolved

 Description   
I want to make it easy to select which suite of compression classes to use during indexing.

 Comments   
Comment by Craig Macdonald [ 25/Apr/14 ]

Matteo, Richard,

Can you review the attached class? I have implemented it to specify the classes used to write and read a normal Terrier disk index.

Thanks

Craig

Comment by Craig Macdonald [ 25/Apr/14 ]

It strikes me that the name of the index data structure could be a parameter to the Factory method, as this would allow different compression techniques for the inverted versus direct index.
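
A minimal sketch of that idea, to make the suggestion concrete. It is not the attached CompressionFactory.java: the class name, the property keys ("sketch.compression" and "sketch.compression.<structureName>") and the writer/reader names are all hypothetical, chosen only for illustration. The factory method takes the structure name and looks for a per-structure setting first, falling back to a global default.

```java
import java.util.Properties;

/** Hypothetical sketch; no name here is taken from Terrier or from the attached class. */
final class CompressionSelectionSketch {

    /** A "suite" pairs the writer and the reader used for one index structure. */
    static final class CompressionSuite {
        final String writerClass;
        final String readerClass;

        CompressionSuite(String writerClass, String readerClass) {
            this.writerClass = writerClass;
            this.readerClass = readerClass;
        }
    }

    /**
     * Picks a suite for the named structure (e.g. "inverted" or "direct").
     * A per-structure property wins; otherwise the global property is used;
     * otherwise the (assumed) default "bitwise" suite is returned.
     */
    static CompressionSuite forStructure(String structureName, Properties props) {
        String globalChoice = props.getProperty("sketch.compression", "bitwise");
        String choice = props.getProperty("sketch.compression." + structureName, globalChoice);
        switch (choice) {
            case "bitwise":
                return new CompressionSuite("BitwiseWriter", "BitwiseReader");
            case "integer":
                return new CompressionSuite("IntegerCodecWriter", "IntegerCodecReader");
            default:
                throw new IllegalArgumentException("Unknown compression suite: " + choice);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Only the inverted index is overridden; the direct index keeps the default.
        props.setProperty("sketch.compression.inverted", "integer");
        System.out.println("inverted: " + forStructure("inverted", props).writerClass);
        System.out.println("direct: " + forStructure("direct", props).writerClass);
    }
}
```

With this shape, the inverted and direct structures can be configured independently, while callers that set nothing still get a single default suite.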

Richard, I'd also like your comments on how this compression factory integrates with the in-memory index classes.

Comment by Matteo Catena [ 30/Apr/14 ]

The attached class seems ok to me.

> It strikes me that the name of the index data structure could be a parameter to the Factory method, as this would allow different compression techniques for the inverted versus direct index.
I'm not sure I follow. I haven't worked with direct indexes myself, but IntegerCodingPostingIndex does take structureName as a constructor parameter, so maybe it can read properly written direct indexes without too many modifications.

Comment by Matteo Catena [ 30/Apr/14 ]

(In that case, of course, the direct and inverted indexes can be compressed differently.)

Comment by Craig Macdonald [ 06/May/14 ]

Committed r3788
