java.lang.Object
  org.terrier.utility.io.HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,java.lang.Object,java.lang.Object>
    org.terrier.structures.indexing.singlepass.hadoop.Inv2DirectMultiReduce
public class Inv2DirectMultiReduce
This class inverts an inverted index into a direct index using a single MapReduce job. When the job completes, its counters can be used to validate that it ran correctly. For instance, "Map input records" should equal the number of terms in the index, and "Map output records" should equal the number of pointers.
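The counter check described above can be sketched as a small standalone helper. This is only an illustration: the class name `CounterCheck` and the method `validate` are hypothetical and not part of Terrier, and the counter values are assumed to have already been read from the finished job and from the index statistics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Terrier: compares job counters with index statistics.
class CounterCheck {

    /**
     * Returns one message per mismatch; an empty list means the counters
     * agree with the index statistics.
     */
    static List<String> validate(long mapInputRecords, long mapOutputRecords,
                                 long numberOfTerms, long numberOfPointers) {
        List<String> problems = new ArrayList<>();
        // Each map input record is one term's posting list.
        if (mapInputRecords != numberOfTerms) {
            problems.add("Map input records = " + mapInputRecords
                    + ", expected " + numberOfTerms + " (number of terms)");
        }
        // Each map output record is one posting (pointer) re-keyed by docid.
        if (mapOutputRecords != numberOfPointers) {
            problems.add("Map output records = " + mapOutputRecords
                    + ", expected " + numberOfPointers + " (number of pointers)");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Illustrative values only: the second counter deliberately disagrees.
        for (String problem : validate(1000L, 49990L, 1000L, 50000L)) {
            System.out.println(problem);
        }
    }
}
```

An empty result suggests that every term was read and every pointer was emitted; it does not by itself prove that the resulting direct index is well formed.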
Nested Class Summary | |
---|---|
static class |
Inv2DirectMultiReduce.ByDocidPartitioner<K>
Partitioner partitioning by docid |
static class |
Inv2DirectMultiReduce.ByDocidPartitionerPosting
Partitioner partitioning by docid |
static class |
Inv2DirectMultiReduce.Inv2DirectMultiReduceJob
This class contains the setup for the MapReduce job. |
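To illustrate what partitioning by docid can look like, here is a minimal sketch of a range partitioner. It is an assumption, not the actual logic of `ByDocidPartitioner`: the class `DocidRangePartitioner` and the method `partitionFor` are hypothetical, and the sketch assumes the total number of documents is known up front so that each reduce task receives one contiguous docid range.

```java
// Hypothetical sketch, not Terrier's ByDocidPartitioner: assigns contiguous docid ranges.
class DocidRangePartitioner {

    /**
     * Maps a docid in [0, numberOfDocs) to a partition in [0, numberOfPartitions).
     * Lower docids always go to lower-numbered partitions.
     */
    static int partitionFor(int docid, int numberOfDocs, int numberOfPartitions) {
        // Ceiling division; at least 1 so that an empty index cannot cause division by zero.
        int partitionSize = Math.max(1,
                (numberOfDocs + numberOfPartitions - 1) / numberOfPartitions);
        // Clamp so that an out-of-range docid still lands in the last partition.
        return Math.min(docid / partitionSize, numberOfPartitions - 1);
    }

    public static void main(String[] args) {
        int numberOfDocs = 10;
        int numberOfPartitions = 3;
        for (int docid = 0; docid < numberOfDocs; docid++) {
            System.out.println("docid " + docid + " -> partition "
                    + partitionFor(docid, numberOfDocs, numberOfPartitions));
        }
    }
}
```

With contiguous ranges, the output of reducer 0 followed by reducer 1 and so on is already ordered by docid, which is one plausible reason to partition this way when writing a direct index.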
Field Summary |
---|
Fields inherited from class org.terrier.utility.io.HadoopUtility.MapReduceBase |
---|
jc |
Constructor Summary | |
---|---|
Inv2DirectMultiReduce()
|
Method Summary | |
---|---|
protected void |
closeMap()
|
protected void |
closeReduce()
|
protected void |
configureMap()
|
protected void |
configureReduce()
|
static void |
invertStructure(Index index,
HadoopPlugin.JobFactory jf,
int numberOfReduceTasks)
Performs the inversion, from "inverted" structure to "direct" structure. |
static void |
main(java.lang.String[] args)
main |
void |
map(org.apache.hadoop.io.IntWritable termId,
Wrapper<IterablePosting> postingWrapper,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.VIntWritable,Posting> collector,
org.apache.hadoop.mapred.Reporter reporter)
Take an iterator of postings. |
void |
reduce(org.apache.hadoop.io.VIntWritable _targetDocid,
java.util.Iterator<Posting> documentPostings,
org.apache.hadoop.mapred.OutputCollector<java.lang.Object,java.lang.Object> collector,
org.apache.hadoop.mapred.Reporter reporter)
|
Methods inherited from class org.terrier.utility.io.HadoopUtility.MapReduceBase |
---|
close, configure |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Inv2DirectMultiReduce()
Method Detail |
---|
public static void main(java.lang.String[] args) throws java.lang.Exception
Parameters:
args -
Throws:
java.lang.Exception
public static void invertStructure(Index index, HadoopPlugin.JobFactory jf, int numberOfReduceTasks) throws java.lang.Exception
Parameters:
index - the index to perform the inversion on
jf - the MapReduce job factory
numberOfReduceTasks - the number of reduce tasks to use. More is better.
Throws:
java.lang.Exception
protected void configureMap() throws java.io.IOException
Specified by:
configureMap in class HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,java.lang.Object,java.lang.Object>
Throws:
java.io.IOException
public void map(org.apache.hadoop.io.IntWritable termId, Wrapper<IterablePosting> postingWrapper, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.VIntWritable,Posting> collector, org.apache.hadoop.mapred.Reporter reporter) throws java.io.IOException
Throws:
java.io.IOException
protected void closeMap() throws java.io.IOException
Specified by:
closeMap in class HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,java.lang.Object,java.lang.Object>
Throws:
java.io.IOException
protected void configureReduce() throws java.io.IOException
Specified by:
configureReduce in class HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,java.lang.Object,java.lang.Object>
Throws:
java.io.IOException
public void reduce(org.apache.hadoop.io.VIntWritable _targetDocid, java.util.Iterator<Posting> documentPostings, org.apache.hadoop.mapred.OutputCollector<java.lang.Object,java.lang.Object> collector, org.apache.hadoop.mapred.Reporter reporter) throws java.io.IOException
Throws:
java.io.IOException
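The inversion that the map and reduce methods perform together can be illustrated in memory, without Hadoop. The sketch below is not the Terrier implementation: the class `Inv2DirectSketch` and its `invert` method are hypothetical, a posting is simplified to an `int[]` of `{id, frequency}`, and a `TreeMap` stands in for the shuffle that groups map output by docid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration, not the Terrier implementation.
class Inv2DirectSketch {

    /**
     * Input:  termid -> postings of {docid, frequency}  (inverted index).
     * Output: docid  -> postings of {termid, frequency} (direct index).
     */
    static Map<Integer, List<int[]>> invert(Map<Integer, List<int[]>> inverted) {
        // "Map" step: re-key every posting of every term by its docid.
        // The TreeMap plays the role of the shuffle, grouping and ordering by docid.
        Map<Integer, List<int[]>> direct = new TreeMap<>();
        for (Map.Entry<Integer, List<int[]>> term : inverted.entrySet()) {
            int termId = term.getKey();
            for (int[] posting : term.getValue()) {
                int docid = posting[0];
                int frequency = posting[1];
                direct.computeIfAbsent(docid, d -> new ArrayList<>())
                      .add(new int[] {termId, frequency});
            }
        }
        // "Reduce" step: for each docid, order its postings by termid.
        for (List<int[]> documentPostings : direct.values()) {
            documentPostings.sort((a, b) -> Integer.compare(a[0], b[0]));
        }
        return direct;
    }

    public static void main(String[] args) {
        Map<Integer, List<int[]>> inverted = new HashMap<>();
        inverted.put(0, Arrays.asList(new int[] {0, 2}, new int[] {2, 1}));
        inverted.put(1, Arrays.asList(new int[] {0, 1}, new int[] {1, 3}));
        for (Map.Entry<Integer, List<int[]>> doc : invert(inverted).entrySet()) {
            StringBuilder line = new StringBuilder("docid " + doc.getKey() + ":");
            for (int[] posting : doc.getValue()) {
                line.append(" (termid ").append(posting[0])
                    .append(", tf ").append(posting[1]).append(")");
            }
            System.out.println(line);
        }
    }
}
```

The number of postings is the same before and after the inversion, which is why the number of pointers is a useful check on the job's output record count.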
protected void closeReduce() throws java.io.IOException
Specified by:
closeReduce in class HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,java.lang.Object,java.lang.Object>
Throws:
java.io.IOException