[TR-41] Hadoop Indexing loads CompressedMetaIndex into memory during reduce phase Created: 08/May/09  Updated: 05/Mar/10  Resolved: 18/Jun/09

Status: Resolved
Project: Terrier Core
Component/s: None
Affects Version/s: 3.0
Fix Version/s: 3.0

Type: Improvement
Priority: Major
Reporter: Craig Macdonald
Assignee: Craig Macdonald
Resolution: Fixed  
Labels: None

Attachments: File TR-30.v1.patch    

The reduce phase uses the MetaIndex to merge the meta indices. If there are lots of map tasks, this can result in too many meta indices being loaded into memory at once.

The solution is to access the MetaIndex as a stream.
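
As a rough illustration of the streaming idea (not the committed patch), the sketch below merges several metadata sources by opening them one at a time and passing each entry straight to a sink, so no source is held in memory as a whole. The class StreamingMetaMerge, its methods, and the use of plain strings as metadata entries are all hypothetical; none of this is Terrier's MetaIndex API.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch only: these names are not part of Terrier. */
final class StreamingMetaMerge {

    private StreamingMetaMerge() {}

    /**
     * Chains several entry streams lazily. A source is opened (via its
     * Supplier) only once the previous one is exhausted, so at most one
     * source is being read at any time.
     */
    static Iterator<String> concat(List<Supplier<Iterator<String>>> sources) {
        return new Iterator<String>() {
            private final Iterator<Supplier<Iterator<String>>> pending = sources.iterator();
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Skip over exhausted (or empty) sources until an entry is found.
                while (!current.hasNext() && pending.hasNext()) {
                    current = pending.next().get();
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    /**
     * Streams every entry to the sink in source order and returns the number
     * of entries written. An entry's position in the output plays the role of
     * its docid in the merged result (an assumption of this sketch).
     */
    static int merge(List<Supplier<Iterator<String>>> sources, Consumer<String> sink) {
        int docid = 0;
        Iterator<String> entries = concat(sources);
        while (entries.hasNext()) {
            sink.accept(entries.next());
            docid++;
        }
        return docid;
    }
}
```

In a real job each Supplier would presumably open one map task's metadata file and the sink would write to the merged output; both are left abstract here.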

Comment by Craig Macdonald [ 18/Jun/09 ]

Committed to SVN. This also uses a MapReduce job to do the metaindex inversion, i.e. meta->docid.
