[TR-56] 2Way StructureMerger - produces too large termids Created: 07/Sep/09  Updated: 05/Mar/10  Resolved: 09/Sep/09

Status: Resolved
Project: Terrier Core
Component/s: .structures
Affects Version/s: 3.0
Fix Version/s: 3.0

Type: Bug
Priority: Major
Reporter: Craig Macdonald
Assignee: Craig Macdonald
Resolution: Fixed  
Labels: None


 Description   
junit.framework.AssertionFailedError: Got too big a termid (3867) from direct index input stream, numTerms=2361
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkDirectIndex(ShakespeareEndToEndTest.java:219)
at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkIndex(ShakespeareEndToEndTest.java:285)
at uk.ac.gla.terrier.tests.BatchEndToEndTest.doTrecTerrierIndexingRunAndEvaluate(BatchEndToEndTest.java:157)
at uk.ac.gla.terrier.tests.BasicShakespeareEndToEndTest.testBasicClassical(BasicShakespeareEndToEndTest.java:19)


 Comments   
Comment by Craig Macdonald [ 09/Sep/09 ]

The problem is that the inverted index merging phase produces old-termid -> new-termid mappings, which are then used when the direct indices are merged. However, the new termids do not necessarily have the same ordering as the old termids, so the postings of each document from the second direct file need to be reordered by their new termids when they are written to the first direct file.
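To illustrate the reordering described above, here is a minimal sketch in Java. It is not the code committed to trunk: the class name DirectPostingRemap, the method remapAndSort, and the representation of a posting as an {termid, frequency} int pair are all assumptions made for illustration, as is the use of a plain array for the old-termid -> new-termid map. It only shows the idea of translating each termid and then re-sorting one document's postings by the new termid before writing them out.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Terrier: remaps one document's
// direct-index postings and restores ascending termid order.
final class DirectPostingRemap {

    /**
     * @param postings one document's postings as {oldTermId, frequency} pairs
     * @param oldToNew assumed mapping: oldToNew[oldTermId] == newTermId
     * @return new {newTermId, frequency} pairs, sorted by newTermId ascending
     */
    static int[][] remapAndSort(int[][] postings, int[] oldToNew) {
        int[][] out = new int[postings.length][];
        for (int i = 0; i < postings.length; i++) {
            // Translate the termid; keep the frequency unchanged.
            out[i] = new int[] { oldToNew[postings[i][0]], postings[i][1] };
        }
        // The mapping need not preserve order, so sort by the new termid.
        Arrays.sort(out, Comparator.comparingInt((int[] p) -> p[0]));
        return out;
    }

    public static void main(String[] args) {
        // Old termids 0, 1, 2 map to new termids 7, 3, 5 (order not preserved).
        int[] oldToNew = { 7, 3, 5 };
        int[][] postings = { { 0, 2 }, { 1, 5 }, { 2, 1 } };
        // prints [[3, 5], [5, 1], [7, 2]]
        System.out.println(Arrays.deepToString(remapAndSort(postings, oldToNew)));
    }
}
```

Without the sort step, the remapped postings in this example would be emitted in the order 7, 3, 5. That is the kind of out-of-order sequence that presumably led to the bad termids seen in the failing test, though the exact failure path in the merger is not spelled out in this issue.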

Comment by Craig Macdonald [ 09/Sep/09 ]

Fix committed to trunk. Very easy once the penny has dropped!

Comment by Craig Macdonald [ 09/Sep/09 ]

Meant to add that this issue is checked by the end-to-end tests.
