Terrier Core

Hadoop Indexing loads CompressedMetaIndex into memory during reduce phase

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.0
  • Component/s: None
  • Description:
    During the reduce phase, Hadoop indexing uses the MetaIndex to merge the meta indices produced by the map tasks. If there are many map tasks, too many meta indices can end up loaded into memory at once.

    The solution is to access the MetaIndex as a stream.
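The streaming approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patch: the class `StreamingMetaMerge`, the method `mergeAsStream`, and the one-record-per-line file layout are all assumptions made for the example, and none of them are Terrier's actual MetaIndex API. Each part file is opened in turn and copied record by record, so only one record is held in memory at a time, regardless of how many map tasks produced output.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch; the names and the file layout are not Terrier's. */
class StreamingMetaMerge {

    /**
     * Appends the records of each part file to {@code merged}, in the order
     * given, reading one record (one line, by assumption) at a time.
     *
     * @return the number of records written, i.e. the next free docid
     */
    static int mergeAsStream(List<Path> parts, Path merged) throws IOException {
        int docid = 0;
        try (BufferedWriter out = Files.newBufferedWriter(merged, StandardCharsets.UTF_8)) {
            for (Path part : parts) {
                // Only one part file is open at any moment; it is closed
                // before the next one is opened.
                try (BufferedReader in = Files.newBufferedReader(part, StandardCharsets.UTF_8)) {
                    String record;
                    while ((record = in.readLine()) != null) {
                        out.write(record);
                        out.newLine();
                        docid++;
                    }
                }
            }
        }
        return docid;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("meta-merge");
        Path a = Files.write(dir.resolve("part-0.meta"), List.of("doc-a", "doc-b"));
        Path b = Files.write(dir.resolve("part-1.meta"), List.of("doc-c"));
        Path merged = dir.resolve("merged.meta");

        int total = mergeAsStream(List.of(a, b), merged);
        System.out.println(total + " records: " + Files.readAllLines(merged));
    }
}
```

The contrast is with an approach that opens every part as a fully loaded in-memory structure before merging, where memory use grows with the number of map tasks.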
  1. TR-30.v1.patch
    (15 kB)
    Craig Macdonald
    08/May/09 11:01 AM

Activity

Craig Macdonald added a comment - 18/Jun/09 2:24 PM

Committed to SVN. The change also uses a MapReduce job to perform the metaindex inversion, i.e. meta->docid.
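To illustrate what the meta->docid inversion means, here is a single-process sketch. It is not the committed MapReduce job: the class `MetaInversionSketch` and the method `invert` are hypothetical names, and a `TreeMap` stands in for the sort that a MapReduce shuffle would perform across machines. The forward structure maps docid to a metadata value (the list index is taken as the docid); the inverted one maps the metadata value back to its docid.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of metaindex inversion; not Terrier's implementation. */
class MetaInversionSketch {

    /**
     * Inverts a docid -> meta list into a sorted meta -> docid map.
     * If a metadata value occurs more than once, the highest docid wins.
     */
    static Map<String, Integer> invert(List<String> docidToMeta) {
        Map<String, Integer> metaToDocid = new TreeMap<>();
        for (int docid = 0; docid < docidToMeta.size(); docid++) {
            // "Map" step: emit (meta value, docid). The TreeMap keeps the
            // keys sorted, standing in for the shuffle-and-sort phase.
            metaToDocid.put(docidToMeta.get(docid), docid);
        }
        return metaToDocid;
    }

    public static void main(String[] args) {
        System.out.println(invert(List.of("doc-b", "doc-a", "doc-c")));
    }
}
```

In a real MapReduce job the sorted output would be written out by the reducers rather than collected in one in-memory map.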

Dates

  • Created: 08/May/09 11:00 AM
  • Updated: 05/Mar/10 4:48 PM
  • Resolved: 18/Jun/09 2:24 PM