Terrier IR Platform
2.2.1

uk.ac.gla.terrier.structures.indexing.singlepass.hadoop
Class ByMapPartitioner

java.lang.Object
  extended by uk.ac.gla.terrier.structures.indexing.singlepass.hadoop.ByMapPartitioner
All Implemented Interfaces:
org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>

public class ByMapPartitioner
extends java.lang.Object
implements org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>

Partitions the term posting lists emitted by the map function so that the resulting indexes are spread evenly across the reducers. The partitioner assigns an approximately equal number of maps to each reducer, on the assumption that map outputs are approximately equal in size.

Since:
2.2
Version:
$Revision: 1.2 $
Author:
Richard McCreadie and Craig Macdonald

Constructor Summary
ByMapPartitioner()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf job)
           
 int getPartition(MapEmittedTerm key, MapEmittedPostingList value, int numPartitions)
          Forces each Map output to get its own reduce step.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ByMapPartitioner

public ByMapPartitioner()

Method Detail

configure

public void configure(org.apache.hadoop.mapred.JobConf job)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable

getPartition

public int getPartition(MapEmittedTerm key,
                        MapEmittedPostingList value,
                        int numPartitions)
Forces each Map output to get its own reduce step.

Specified by:
getPartition in interface org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>
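
The class description says that maps are divided evenly among the reducers. The sketch below illustrates one way such a mapping could be computed. It is not the Terrier implementation: the class name ByMapPartitionSketch, the method partitionForMap, and the parameters mapNumber and numMaps are hypothetical, and the way the real class obtains the map number from a MapEmittedTerm or the map count from the JobConf is not shown on this page. The sketch avoids any Hadoop dependency so that it can be run on its own.

```java
// Hypothetical, self-contained sketch; not the actual ByMapPartitioner source.
public final class ByMapPartitionSketch {

    private ByMapPartitionSketch() {
        // static helper only
    }

    /**
     * Assigns a map task to a reducer so that consecutive maps share a
     * reducer and each reducer receives roughly the same number of maps.
     *
     * @param mapNumber     zero-based index of the map task (assumed input)
     * @param numMaps       total number of map tasks (assumed input)
     * @param numPartitions number of reducers, as passed to getPartition
     * @return a partition number in the range [0, numPartitions)
     */
    public static int partitionForMap(int mapNumber, int numMaps, int numPartitions) {
        if (numMaps <= 0 || numPartitions <= 0) {
            throw new IllegalArgumentException("numMaps and numPartitions must be positive");
        }
        if (mapNumber < 0 || mapNumber >= numMaps) {
            throw new IllegalArgumentException("mapNumber out of range: " + mapNumber);
        }
        // Ceiling division: the largest number of maps any one reducer handles.
        int mapsPerPartition = (numMaps + numPartitions - 1) / numPartitions;
        // Consecutive blocks of mapsPerPartition maps go to the same reducer.
        return mapNumber / mapsPerPartition;
    }

    public static void main(String[] args) {
        int numMaps = 10;
        int numPartitions = 3;
        for (int m = 0; m < numMaps; m++) {
            System.out.println("map " + m + " -> reducer "
                    + partitionForMap(m, numMaps, numPartitions));
        }
    }
}
```

Under this scheme, when there are at least as many reducers as maps, mapsPerPartition is 1 and every map is sent to its own reducer, which agrees with the getPartition summary above. With fewer reducers than maps, neighbouring maps are grouped, and the balance is only as good as the assumption that map outputs are of similar size.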


Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow