org.terrier.structures.indexing.singlepass.hadoop
Class SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm

java.lang.Object
  extended by org.terrier.structures.indexing.singlepass.hadoop.SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
All Implemented Interfaces:
org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>
Enclosing class:
SplitEmittedTerm

public static class SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm
extends java.lang.Object
implements org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>

Partitions SplitEmittedTerms by term. This version assumes that most initial characters are lowercase a-z. Terms starting with 0-9 go to the first partition, and terms whose initial character is higher than 'z' go to the last partition.


Constructor Summary
SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm()
           
 
Method Summary
 int calculatePartition(char _initialChar, int numPartitions)
          Calculates the partition for a term, given its initial character.
 void configure(org.apache.hadoop.mapred.JobConf jc)
          
 int getPartition(SplitEmittedTerm term, MapEmittedPostingList posting, int numPartitions)
          Returns the partition for the specified term and posting list, given the specified number of partitions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm

public SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm()
Method Detail

getPartition

public int getPartition(SplitEmittedTerm term,
                        MapEmittedPostingList posting,
                        int numPartitions)
Returns the partition for the specified term and posting list, given the specified number of partitions.

Specified by:
getPartition in interface org.apache.hadoop.mapred.Partitioner<SplitEmittedTerm,MapEmittedPostingList>

calculatePartition

public int calculatePartition(char _initialChar,
                              int numPartitions)
Calculates the partition for a term, given its initial character.

Parameters:
_initialChar - the first character of the term
numPartitions - the number of partitions (reducers) configured
Returns:
the reduce partition number to allocate the split to.
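
The following is a minimal sketch of how a partitioning scheme of this kind could be written. It is not the Terrier source. The class name LowercaseAlphaPartitionSketch and the method partitionFor are hypothetical, and two details are assumptions that this page does not state: every character below 'a' (not only 0-9) is sent to the first partition, and the letters a-z are spread evenly across all partitions by integer arithmetic.

```java
// Illustrative sketch only; names and the a-z spreading rule are assumptions,
// not taken from the Terrier implementation.
class LowercaseAlphaPartitionSketch {

    /** Number of letters in the range 'a'..'z'. */
    private static final int ALPHABET_SIZE = 26;

    /**
     * Maps the initial character of a term to a partition index
     * in the range [0, numPartitions - 1].
     */
    static int partitionFor(char initialChar, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Assumption: anything sorting below 'a', including 0-9,
        // goes to the first partition.
        if (initialChar < 'a') {
            return 0;
        }
        // Characters higher than 'z' go to the last partition.
        if (initialChar > 'z') {
            return numPartitions - 1;
        }
        // Assumption: spread 'a'..'z' evenly over the partitions.
        return (initialChar - 'a') * numPartitions / ALPHABET_SIZE;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        for (char c : new char[] {'7', 'a', 'm', 'z', '\u00e9'}) {
            System.out.println(c + " -> " + partitionFor(c, numPartitions));
        }
    }
}
```

Because the mapping is monotonic in the initial character, reducer i only receives terms that sort at or before those of reducer i + 1 (for terms starting with a-z, 0-9, or a higher character), which is the usual reason to partition by initial character rather than by hash. If more than 26 partitions are configured, this sketch leaves some of them without any a-z terms.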

configure

public void configure(org.apache.hadoop.mapred.JobConf jc)

Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable


Terrier 3.5. Copyright © 2004-2011 University of Glasgow