[TR-17] DOCNOs must be in lexicographical order when indexing TREC collections
Created: 17/Feb/09  Updated: 13/May/09  Resolved: 29/Apr/09

Status: Resolved
Project: Terrier Core
Component/s: None
Affects Version/s: 2.1
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Gianni Amati Assignee: Craig Macdonald
Resolution: Duplicate  
Labels: None

Issue Links:
Block
is blocked by TR-14 Refactor Lexicons: LexiconEntry shoul... Resolved
Duplicate
is duplicated by TR-42 Improved Index format and class changes Resolved

 Description   
This may be a bug when merging distributed indexes built from the same collection or from different collections. At a minimum, a warning should be issued when inconsistent DOCNO ordering is detected. Assuming that each sub-collection already has its DOCNOs in lexicographical order, it may be enough to check the order in which the two or more files are merged. This would also prevent a bug when only the Merge method class is used.
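
As a rough illustration of the check described above, the sketch below tests whether a list of DOCNOs is in strictly ascending lexicographical order, and whether one sorted sub-index can be appended after another without breaking that order. The class `DocnoOrderCheck` and its methods are hypothetical names made up for this example; they are not part of Terrier's API, and the real merge code would read DOCNOs from the document index rather than from in-memory lists.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Terrier class.
class DocnoOrderCheck {

    /**
     * Returns the position of the first DOCNO that is not strictly greater
     * than its predecessor, or -1 if the whole list is in ascending order.
     * String.compareTo compares lexicographically by UTF-16 code unit.
     */
    public static int firstOutOfOrder(List<String> docnos) {
        for (int i = 1; i < docnos.size(); i++) {
            if (docnos.get(i - 1).compareTo(docnos.get(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Assuming both lists are already sorted, appending 'second' after
     * 'first' keeps the global order only if the last DOCNO of 'first'
     * sorts strictly before the first DOCNO of 'second'.
     */
    public static boolean canAppend(List<String> first, List<String> second) {
        if (first.isEmpty() || second.isEmpty()) {
            return true;
        }
        String lastOfFirst = first.get(first.size() - 1);
        return lastOfFirst.compareTo(second.get(0)) < 0;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("AP-0001", "AP-0002");
        List<String> b = Arrays.asList("WSJ-0001", "WSJ-0002");
        // Merging in the wrong order is the case that should trigger a warning.
        if (!canAppend(b, a)) {
            System.err.println("WARNING: DOCNOs would not be in lexicographical order after merge");
        }
    }
}
```

With this boundary check, only the last DOCNO of one sub-index and the first DOCNO of the next need to be compared at merge time, provided each sub-index was itself verified with something like `firstOutOfOrder` when it was built.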

We suggest creating a file with <docno, docid> information.
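
One possible shape for such a file, purely as a sketch: a tab-separated text file with one `docno<TAB>docid` line per document, written in docid order and read back into a map for lookups. The class name `DocnoDocidFile`, the file layout, and the assumption that docids are assigned 0..n-1 in input order are all illustrative choices, not an existing Terrier format.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not a Terrier class.
class DocnoDocidFile {

    /** Writes one "docno<TAB>docid" line per document; docid is the list position. */
    public static void write(Path file, List<String> docnosInDocidOrder) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int docid = 0; docid < docnosInDocidOrder.size(); docid++) {
                out.write(docnosInDocidOrder.get(docid));
                out.write('\t');
                out.write(Integer.toString(docid));
                out.newLine();
            }
        }
    }

    /** Reads the file back into a docno -> docid map. */
    public static Map<String, Integer> read(Path file) throws IOException {
        Map<String, Integer> docnoToDocid = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (line.isEmpty()) {
                continue;
            }
            // Split on the last tab so that the docid is always the final field.
            int tab = line.lastIndexOf('\t');
            docnoToDocid.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1)));
        }
        return docnoToDocid;
    }
}
```

Because the mapping is stored explicitly, a docid lookup by DOCNO no longer depends on the DOCNOs being in lexicographical order, which is the constraint this issue is about.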
Generated at Sat Dec 16 16:57:08 GMT 2017 using JIRA 7.1.1#71004-sha1:d6b2c0d9b7051e9fb5e4eb8ce177ca56d91d7bd8.