Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.6
- Fix Version/s: 4.0
- Component/s: None
- Labels: None
Description
Attached are the files for enabling modern integer compression techniques for the inverted index in Terrier.
Files:
matteo_compression.jar: the code
matteo_compression_test.jar: unit testing
JavaFastPFOR_Terrier.jar: a MODIFIED version of the JavaFastPFOR library, which also contains Kamikaze. Add this to the build path.
Required modifications to the rest of the code:
1) In org.terrier.compression, make BitInBase public
2) In (tes) org.terrier.tests.ShakespeareEndToEndTest, always use PostingIndex and PostingIndexInputStream instead of InvertedIndex and InvertedIndexInputStream
3) Replace PostingTestUtils with the attached file (it contains some extra methods)
The main entry point for this library is probably the InvertedIndexRecompresser utility, which recompresses a classical inverted index file using the modern integer compression techniques specified via a configuration file. See the Javadoc for usage.
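For readers unfamiliar with the idea, here is a minimal sketch of what integer compression of a posting list involves. It is not the code in the attached jars: the class name DocidGapCodecSketch and its methods are hypothetical, and plain variable-byte coding of docid gaps stands in for the codecs provided by JavaFastPFOR and Kamikaze. It assumes docids are non-negative and strictly increasing.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration only; not part of the attached jars or of Terrier. */
final class DocidGapCodecSketch {

    /** Encodes increasing docids as gaps, writing each gap in variable-byte form. */
    static byte[] encode(int[] docids) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docid : docids) {
            int gap = docid - previous;   // small gaps need fewer bytes than raw docids
            previous = docid;
            while (gap >= 128) {
                out.write(gap & 0x7F);    // 7 payload bits, high bit clear: more bytes follow
                gap >>>= 7;
            }
            out.write(gap | 0x80);        // high bit set marks the last byte of this gap
        }
        return out.toByteArray();
    }

    /** Decodes count docids from data produced by encode. */
    static int[] decode(byte[] data, int count) {
        int[] docids = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            while ((data[pos] & 0x80) == 0) {          // continuation bytes
                gap |= (data[pos++] & 0x7F) << shift;
                shift += 7;
            }
            gap |= (data[pos++] & 0x7F) << shift;      // terminating byte
            previous += gap;                           // undo the gap encoding
            docids[i] = previous;
        }
        return docids;
    }
}
```

Calling DocidGapCodecSketch.encode on a docid array and then decode on the result, with the same count, returns the original docids. Block-based codecs such as the PFOR family follow the same gap-then-pack pattern but pack many gaps at once.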
Activity
Field | Original Value | New Value |
---|---|---|
Attachment | | PostingTestUtils.java [ 10388 ] |
Attachment | | matteo_compression_test.jar [ 10389 ] |
Description | Attached the file for enabling modern integer compression techniques for the inverted index in Terrier. Files: matteo_compression.jar: the code matteo_compression_test.jar: unit testing JavaFastPFOR_Terrier.jar: MODIFIED JavaFastPFOR library, contains also Kamikaze. Add this to the build path. Required modifications to the rest of the code: 1) In org.terrier.compression, make BitInBase public 2) In (tes) org.terrier.tests.ShakespeareEndToEndTest, use always PostingIndex and PostingIndexInputStream instead of InvertedIndex and InvertedIndexInputStream 3) Replace PostingTestUtils with the attached file (it contains some extra methods) | Attached, the files for enabling modern integer compression techniques for the inverted index in Terrier. Files: matteo_compression.jar: the code matteo_compression_test.jar: unit testing JavaFastPFOR_Terrier.jar: MODIFIED JavaFastPFOR library, contains also Kamikaze. Add this to the build path. Required modifications to the rest of the code: 1) In org.terrier.compression, make BitInBase public 2) In (tes) org.terrier.tests.ShakespeareEndToEndTest, use always PostingIndex and PostingIndexInputStream instead of InvertedIndex and InvertedIndexInputStream 3) Replace PostingTestUtils with the attached file (it contains some extra methods) The main entry point for this library may be the InvertedIndexRecompresser utility, which recompress a classical inverted index file using modern integer techinques specified via a configuration file. Read the javadoc documentation to learn about the usage. |
Fix Version/s | | 4.0 [ 10050 ] |
Attachment | | IntegerCodecCompressionConfiguration.java [ 10402 ] |
Attachment | | matteo_compression.jar [ 10406 ] |
Summary | New integer compression techniques for the inverted index | New integer compression techniques for the direct and inverted index structures |
Status | Open [ 1 ] | Resolved [ 5 ] |
Resolution | | Fixed [ 1 ] |