  1. Terrier Core
  2. TR-41

Hadoop Indexing loads CompressedMetaIndex into memory during reduce phase

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.0
    • Component/s: None
    • Labels:
      None

      Description

      The reduce phase uses the MetaIndex to merge the meta indices produced by the map tasks. If there are many map tasks, this can result in too many meta indices being loaded into memory at once.

      The solution is to access the MetaIndex as a stream.
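      The idea of stream access can be illustrated with a minimal sketch. This is not Terrier's implementation: the class `StreamingMetaMerge`, its methods, and the record layout (one `writeUTF` string per document) are hypothetical, and it assumes the partial files can simply be concatenated in map-task order. Only one partial file is open at a time and only one record is held in memory.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration; not part of the Terrier API.
final class StreamingMetaMerge {

    /** Writes a partial "meta" file: one UTF record per document (assumed layout). */
    static void writePart(Path file, String... records) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            for (String record : records) {
                out.writeUTF(record);
            }
        }
    }

    /**
     * Copies every record of each partial file into one merged file,
     * reading sequentially. Returns the number of records copied.
     */
    static int merge(List<Path> parts, Path merged) throws IOException {
        int copied = 0;
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(merged)))) {
            for (Path part : parts) {
                // Open one partial file at a time; it is closed before the next opens.
                try (DataInputStream in = new DataInputStream(
                        new BufferedInputStream(Files.newInputStream(part)))) {
                    while (true) {
                        String record;
                        try {
                            record = in.readUTF();
                        } catch (EOFException endOfPart) {
                            break; // no more records in this partial file
                        }
                        out.writeUTF(record);
                        copied++;
                    }
                }
            }
        }
        return copied;
    }
}
```

      Under these assumptions, memory use is bounded by the stream buffers rather than by the number of map tasks.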

          Activity

          craigm Craig Macdonald added a comment -

          Committed to SVN. The change also uses a MapReduce job to perform the metaindex inversion, i.e. meta->docid.
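          To make "meta->docid" concrete, here is a small in-memory sketch of what such an inversion computes. It is not the committed MapReduce job and uses no Hadoop classes: the class `MetaInversionSketch` and its method are hypothetical, the forward index is assumed to be a list whose position is the docid, and a sorted map stands in for the shuffle/sort step.

```java
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration; not part of the Terrier or Hadoop APIs.
final class MetaInversionSketch {

    /**
     * Inverts a forward docid->meta list into a sorted meta->docid map.
     * The loop plays the role of the map step (emit one pair per document);
     * the TreeMap plays the role of the sort before the reduce step.
     */
    static TreeMap<String, Integer> invert(List<String> metaByDocid) {
        TreeMap<String, Integer> docidByMeta = new TreeMap<>();
        for (int docid = 0; docid < metaByDocid.size(); docid++) {
            // If a meta value repeats, keep the first docid seen (an assumption).
            docidByMeta.putIfAbsent(metaByDocid.get(docid), docid);
        }
        return docidByMeta;
    }
}
```

          For example, inverting the list `["doc-b", "doc-a", "doc-c"]` maps `"doc-a"` to docid 1, so a lookup by meta value no longer needs a scan over every document.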


            People

            • Assignee:
              craigm Craig Macdonald
            • Reporter:
              craigm Craig Macdonald
