uk.ac.gla.terrier.structures.indexing.singlepass.hadoop
Class ByMapPartitioner
java.lang.Object
uk.ac.gla.terrier.structures.indexing.singlepass.hadoop.ByMapPartitioner
- All Implemented Interfaces:
- org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>
public class ByMapPartitioner
- extends java.lang.Object
- implements org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>
Partitions the term posting lists emitted by the map function,
such that the created indexes are partitioned evenly across
the reducers. This partitioner assigns an equal number of
maps to each reducer, assuming that map output sizes are approximately equal.
- Since:
- 2.2
- Version:
- $Revision: 1.2 $
- Author:
- Richard McCreadie and Craig Macdonald
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ByMapPartitioner
public ByMapPartitioner()
configure
public void configure(org.apache.hadoop.mapred.JobConf job)
- Specified by:
configure
in interface org.apache.hadoop.mapred.JobConfigurable
getPartition
public int getPartition(MapEmittedTerm key,
MapEmittedPostingList value,
int numPartitions)
- Forces each Map output to get its own reduce step
- Specified by:
getPartition
in interface org.apache.hadoop.mapred.Partitioner<MapEmittedTerm,MapEmittedPostingList>
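The class description says that maps are divided evenly among the reducers. The following is a minimal sketch of one way such an assignment could be computed. It is not taken from the Terrier source: the class name ByMapPartitionSketch, the method partitionFor, and the parameters mapNumber and numMaps are hypothetical, and the sketch assumes that the number of the originating map task is available for each emitted term and that the total number of maps is known (for example, read in configure).

```java
// Hypothetical sketch; not the Terrier implementation.
final class ByMapPartitionSketch {

    /**
     * Assigns a map task to a reducer by cutting the range of map numbers
     * [0, numMaps) into contiguous blocks of equal size.
     *
     * @param mapNumber     zero-based number of the map that emitted the term (assumed available)
     * @param numMaps       total number of map tasks in the job (assumed known)
     * @param numPartitions number of reducers, as passed to getPartition
     * @return a partition number in [0, numPartitions)
     */
    static int partitionFor(int mapNumber, int numMaps, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (mapNumber < 0 || mapNumber >= numMaps) {
            throw new IllegalArgumentException("mapNumber out of range");
        }
        // Maps per reducer, rounded up so that every map lands in a valid partition.
        int mapsPerPartition = (numMaps + numPartitions - 1) / numPartitions;
        return mapNumber / mapsPerPartition;
    }

    public static void main(String[] args) {
        int numMaps = 10;
        int numPartitions = 3;
        for (int m = 0; m < numMaps; m++) {
            System.out.println("map " + m + " -> reducer "
                    + partitionFor(m, numMaps, numPartitions));
        }
    }
}
```

Under this scheme the load is balanced only if each map emits a similar volume of postings, which matches the assumption stated in the class description. When the number of maps is not a multiple of the number of reducers, the last reducer receives fewer maps, and some reducers may receive none.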
Terrier Information Retrieval Platform 2.2.1. Copyright 2004-2008 University of Glasgow