org.terrier.structures.indexing.singlepass.hadoop
Class SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
java.lang.Object
org.terrier.structures.indexing.singlepass.hadoop.SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
- All Implemented Interfaces:
- org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>
- Enclosing class:
- SplitEmittedTerm
public static class SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
- extends java.lang.Object
- implements org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>
Partitions SplitEmittedTerms by term. This version assumes that most initial characters are lowercase a-z.
Terms starting with 0-9 go to the first partition; terms starting with any character higher than 'z' go to the last partition.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
public SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm()
getPartition
public int getPartition(SplitEmittedTerm term,
MapEmittedPostingList posting,
int numPartitions)
- Returns the partition for the specified term and posting list, given the specified
number of partitions.
- Specified by:
getPartition
in interface org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>
calculatePartition
public int calculatePartition(char _initialChar,
int numPartitions)
- Calculates the partition for a term, given the term's initial character.
- Parameters:
_initialChar
- - the first character in the term
numPartitions
- - number of partitions (reducers) configured
- Returns:
- the reduce partition number to allocate the split to.
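The mapping described above can be illustrated with a minimal sketch. This is not the Terrier source: the class name AlphaPartitionSketch, the even division of a-z across partitions, and the routing of every character below 'a' (not only 0-9) to the first partition are all assumptions made for illustration.

```java
/**
 * Hypothetical sketch of partitioning by a term's initial character.
 * Not the Terrier implementation; the bucket arithmetic is an assumption.
 */
final class AlphaPartitionSketch {
    /** Number of letters in the assumed lowercase range a-z. */
    private static final int ALPHABET_SIZE = 26;

    /**
     * Maps an initial character to a partition in [0, numPartitions).
     * Characters below 'a' (including 0-9) go to the first partition,
     * characters above 'z' go to the last, and a-z are spread evenly.
     */
    static int calculatePartition(char initialChar, int numPartitions) {
        if (numPartitions <= 1) {
            return 0; // a single reducer receives everything
        }
        if (initialChar < 'a') {
            return 0; // assumption: digits and other low characters share partition 0
        }
        if (initialChar > 'z') {
            return numPartitions - 1;
        }
        // Integer division keeps the result below numPartitions, because
        // (initialChar - 'a') is at most 25.
        return (initialChar - 'a') * numPartitions / ALPHABET_SIZE;
    }

    public static void main(String[] args) {
        int reducers = 4; // hypothetical reducer count
        for (char c : new char[] {'7', 'a', 'm', 'z', '\u00e9'}) {
            System.out.println(c + " -> " + calculatePartition(c, reducers));
        }
    }
}
```

In this sketch the partition number never decreases as the character value increases, so each partition covers a contiguous range of initial characters.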
configure
public void configure(org.apache.hadoop.mapred.JobConf jc)
- Specified by:
configure
in interface org.apache.hadoop.mapred.JobConfigurable
Terrier 3.5. Copyright © 2004-2011 University of Glasgow