Terrier Core / TR-40

Enable Hadoop-mode Map Output Compression

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.0
    • Component/s: .indexing
    • Labels:
      None

      Description

      Hadoop supports compression of map outputs. An initial examination found that, for Terrier MapReduce indexing, the sequence files of map output that Hadoop moves to the reducers can be halved in size by applying gzip. This suggests that using Hadoop map output compression may be beneficial. See http://hadoop.apache.org/core/docs/r0.18.3/mapred_tutorial.html#Data+Compression for more details.

      In this issue I will report the space and efficiency changes observed when applying various compression settings.
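
      The kind of size comparison described above could be approximated outside Hadoop with a small standalone check. The following is only a sketch under stated assumptions: the class name GzipRatioSketch, the helper syntheticMapOutput, and the tab-separated "term, docid, frequency" record layout are hypothetical stand-ins, not Terrier's actual map output format, and the ratio it prints says nothing about the figures measured for this issue. It uses only java.util.zip from the JDK, not Hadoop's own codec classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipRatioSketch {

    // Hypothetical stand-in for map output: repetitive "term<TAB>docid<TAB>tf" records.
    // Real Terrier map output is binary and will compress differently.
    static byte[] syntheticMapOutput(int records) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("term").append(i % 500).append('\t')
              .append(i).append('\t')
              .append(1 + i % 7).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Returns the number of bytes produced by gzip-compressing the input in memory.
    static int gzipSize(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // try-with-resources closes the stream, which writes the gzip trailer,
        // before the buffer size is read below.
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] raw = syntheticMapOutput(100_000);
        int compressed = gzipSize(raw);
        System.out.printf("raw=%d bytes, gzip=%d bytes, ratio=%.2f%n",
                raw.length, compressed, (double) compressed / raw.length);
    }
}
```

      To measure real data, the synthetic buffer would be replaced with the bytes of an actual map output file; the Hadoop tutorial linked above describes how compression is enabled for a job itself.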

            People

            • Assignee:
              craigm Craig Macdonald
            • Reporter:
              richardm Richard McCreadie
            • Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: