java.lang.Object
  org.terrier.utility.io.HadoopUtility.MapReduceBase<org.apache.hadoop.io.IntWritable,Wrapper<IterablePosting>,org.apache.hadoop.io.VIntWritable,Posting,Object,Object>
    org.terrier.structures.indexing.singlepass.hadoop.Inv2DirectMultiReduce
public class Inv2DirectMultiReduce
This class inverts an inverted index into a direct index using a single MapReduce job. Once the job completes, its counters can be used to validate that it ran correctly. For instance, "Map input records" should equal the number of terms in the index, and "Map output records" should equal the number of pointers.
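The inversion and the counter check described above can be illustrated without Hadoop. The following is a minimal in-memory sketch, not the class's actual implementation: `InvertSketch`, its `invert` method, and the `int[] {id, frequency}` posting layout are all hypothetical names chosen for illustration. It walks each term's posting list (the "map" side), regroups every pointer under its docid (the "reduce" side), and keeps two counters that mirror "Map input records" and "Map output records".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Terrier.
class InvertSketch {
    // Stand-ins for the Hadoop counters named in the class description.
    static long mapInputRecords;
    static long mapOutputRecords;

    /**
     * Inverted structure: termId -> list of {docid, tf}.
     * Direct structure:   docid  -> list of {termId, tf}.
     */
    static SortedMap<Integer, List<int[]>> invert(SortedMap<Integer, List<int[]>> inverted) {
        mapInputRecords = 0;
        mapOutputRecords = 0;
        SortedMap<Integer, List<int[]>> direct = new TreeMap<>();
        for (Map.Entry<Integer, List<int[]>> term : inverted.entrySet()) {
            mapInputRecords++; // one input record per term
            int termId = term.getKey();
            for (int[] posting : term.getValue()) {
                int docid = posting[0];
                int tf = posting[1];
                // "Emit" (docid, {termId, tf}) and group it by docid.
                direct.computeIfAbsent(docid, d -> new ArrayList<>()).add(new int[] {termId, tf});
                mapOutputRecords++; // one output record per pointer
            }
        }
        return direct;
    }

    public static void main(String[] args) {
        SortedMap<Integer, List<int[]>> inverted = new TreeMap<>();
        inverted.put(0, Arrays.asList(new int[] {0, 2}, new int[] {2, 1}));
        inverted.put(1, Arrays.asList(new int[] {0, 1}));
        SortedMap<Integer, List<int[]>> direct = invert(inverted);
        System.out.println("documents: " + direct.keySet());
        System.out.println("Map input records: " + mapInputRecords);
        System.out.println("Map output records: " + mapOutputRecords);
    }
}
```

Because the terms are visited in ascending termId order, each document's list in this sketch comes out sorted by termId. A distributed job cannot rely on that ordering in general.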
Nested Class Summary | |
---|---|
static class |
Inv2DirectMultiReduce.ByDocidPartitioner<K>
Partitioner partitioning by docid |
static class |
Inv2DirectMultiReduce.ByDocidPartitionerPosting
Partitioner partitioning by docid |
static class |
Inv2DirectMultiReduce.Inv2DirectMultiReduceJob
This class contains the setup for the MR job. |
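The page says only that the two partitioners partition "by docid". One plausible way to do that, sketched here as an assumption rather than as Terrier's actual logic, is range partitioning: each reduce task receives a contiguous block of docids, so the reducers' outputs can be concatenated in docid order. The class `DocidRangePartitioner` and its constructor parameters are hypothetical.

```java
// Hypothetical illustration only; not part of Terrier or Hadoop.
class DocidRangePartitioner {
    private final int partitionSize;
    private final int numPartitions;

    DocidRangePartitioner(int numberOfDocuments, int numPartitions) {
        if (numberOfDocuments <= 0 || numPartitions <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        this.numPartitions = numPartitions;
        // Ceiling division, so that every docid falls inside some partition.
        this.partitionSize = (numberOfDocuments + numPartitions - 1) / numPartitions;
    }

    /** Returns the index of the reduce task responsible for the given docid. */
    int getPartition(int docid) {
        // Clamp so that an out-of-range docid still maps to the last partition.
        return Math.min(docid / partitionSize, numPartitions - 1);
    }
}
```

With 10 documents and 3 partitions, this sketch uses a partition size of 4: docids 0 to 3 go to partition 0, docids 4 to 7 to partition 1, and docids 8 and 9 to partition 2.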
Field Summary | |
---|---|
protected org.apache.hadoop.mapred.JobConf |
jc
|
Constructor Summary | |
---|---|
Inv2DirectMultiReduce()
|
Method Summary | |
---|---|
void |
close()
Called at end of map or reduce task. |
protected void |
closeMap()
|
protected void |
closeReduce()
|
void |
configure(org.apache.hadoop.mapred.JobConf _jc)
|
protected void |
configureMap()
|
protected void |
configureReduce()
|
static void |
invertStructure(Index index,
HadoopPlugin.JobFactory jf,
int numberOfReduceTasks)
Performs the inversion, from "inverted" structure to "direct" structure. |
static void |
main(String[] args)
main |
void |
map(org.apache.hadoop.io.IntWritable termId,
Wrapper<IterablePosting> postingWrapper,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.VIntWritable,Posting> collector,
org.apache.hadoop.mapred.Reporter reporter)
Take an iterator of postings. |
void |
reduce(org.apache.hadoop.io.VIntWritable _targetDocid,
Iterator<Posting> documentPostings,
org.apache.hadoop.mapred.OutputCollector<Object,Object> collector,
org.apache.hadoop.mapred.Reporter reporter)
|
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.hadoop.mapred.JobConfigurable |
---|
configure |
Methods inherited from interface java.io.Closeable |
---|
close |
Field Detail |
---|
protected org.apache.hadoop.mapred.JobConf jc
Constructor Detail |
---|
public Inv2DirectMultiReduce()
Method Detail |
---|
public static void main(String[] args) throws Exception
Parameters:
args -
Throws:
Exception
public static void invertStructure(Index index, HadoopPlugin.JobFactory jf, int numberOfReduceTasks) throws Exception
Parameters:
index - the index to perform the inversion on
jf - MapReduce job factory
numberOfReduceTasks - as it says. More is better.
Throws:
Exception
protected void configureMap() throws IOException
IOException
public void map(org.apache.hadoop.io.IntWritable termId, Wrapper<IterablePosting> postingWrapper, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.VIntWritable,Posting> collector, org.apache.hadoop.mapred.Reporter reporter) throws IOException
IOException
protected void closeMap() throws IOException
IOException
protected void configureReduce() throws IOException
IOException
public void reduce(org.apache.hadoop.io.VIntWritable _targetDocid, Iterator<Posting> documentPostings, org.apache.hadoop.mapred.OutputCollector<Object,Object> collector, org.apache.hadoop.mapred.Reporter reporter) throws IOException
IOException
protected void closeReduce() throws IOException
IOException
public void configure(org.apache.hadoop.mapred.JobConf _jc)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
public void close() throws IOException
Specified by:
close in interface Closeable
Throws:
IOException