[TR-56] 2Way StructureMerger - produces too large termids Created: 07/Sep/09  Updated: 05/Mar/10  Resolved: 09/Sep/09

Status: Resolved
Project: Terrier Core
Component/s: .structures
Affects Version/s: 3.0
Fix Version/s: 3.0

Type: Bug Priority: Major
Reporter: Craig Macdonald Assignee: Craig Macdonald
Resolution: Fixed  
Labels: None

junit.framework.AssertionFailedError: Got too big a termid (3867) from direct index input stream, numTerms=2361
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkDirectIndex(ShakespeareEndToEndTest.java:219)
at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkIndex(ShakespeareEndToEndTest.java:285)
at uk.ac.gla.terrier.tests.BatchEndToEndTest.doTrecTerrierIndexingRunAndEvaluate(BatchEndToEndTest.java:157)
at uk.ac.gla.terrier.tests.BasicShakespeareEndToEndTest.testBasicClassical(BasicShakespeareEndToEndTest.java:19)

Comment by Craig Macdonald [ 09/Sep/09 ]

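The problem is that during the inverted merging phase, old-termid -> new-termid mappings are produced for use when the direct index is merged. However, the new termids do not necessarily have the same ordering as the old termids, so the postings from the second direct file need to be reordered by new termid when being written to the first direct file. Without that reordering, the direct index input stream yields termids larger than the number of terms, as in the assertion failure above.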
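
To illustrate the reordering described above, here is a minimal sketch in Java. It is not the code committed to trunk: the class name DirectPostingRemapper, the method remap, and the array-based representation of a document's postings are all hypothetical, chosen only for illustration. It assumes that a document's direct postings must be written in ascending termid order, which is why remapping alone is not enough.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper, not part of Terrier: remaps one document's direct postings. */
final class DirectPostingRemapper {

    /**
     * Translates old termids to new termids and re-sorts the postings so
     * that the new termids are ascending.
     *
     * @param oldTermIds termids of one document, as read from the second direct file
     * @param freqs      term frequencies, parallel to oldTermIds
     * @param oldToNew   mapping built during inverted merging: oldToNew[oldId] = newId
     * @return a two-row array: row 0 holds the new termids in ascending order,
     *         row 1 holds the matching frequencies
     */
    static int[][] remap(int[] oldTermIds, int[] freqs, int[] oldToNew) {
        final int n = oldTermIds.length;

        // Step 1: apply the old -> new mapping. The result may be out of order.
        final int[] newIds = new int[n];
        for (int i = 0; i < n; i++) {
            newIds[i] = oldToNew[oldTermIds[i]];
        }

        // Step 2: sort posting positions by new termid, so that termids and
        // frequencies can be permuted together.
        final Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingInt(i -> newIds[i]));

        // Step 3: emit the postings in ascending new-termid order.
        final int[] sortedIds = new int[n];
        final int[] sortedFreqs = new int[n];
        for (int k = 0; k < n; k++) {
            sortedIds[k] = newIds[order[k]];
            sortedFreqs[k] = freqs[order[k]];
        }
        return new int[][] { sortedIds, sortedFreqs };
    }
}
```

For example, with old termids {0, 1, 2}, frequencies {5, 1, 3}, and a mapping of 0 -> 2, 1 -> 0, 2 -> 1, the remapped termids come out as {2, 0, 1}. After re-sorting, the postings are written as termids {0, 1, 2} with frequencies {1, 3, 5}.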

Comment by Craig Macdonald [ 09/Sep/09 ]

Fix committed to trunk. Very easy once the penny has dropped!

Comment by Craig Macdonald [ 09/Sep/09 ]

Meant to add that this issue is checked by the end-to-end tests.
