Package org.terrier.structures.indexing.singlepass.hadoop

Provides classes implementing Hadoop MapReduce indexing in Terrier.

Class Summary
BitPostingIndexInputFormat An InputFormat.
CollectionRecordReader<SPLITTYPE extends PositionAwareSplit<?>> An abstract RecordReader class which provides methods to read a collection within the Hadoop framework.
FileCollectionRecordReader Record Reader for Hadoop Indexing.
FileSplit An instance of org.apache.hadoop.mapred.FileSplit that provides a default constructor.
HadoopRunIteratorFactory Creates a new Iterator over runs which can be used within the Hadoop framework.
HadoopRunPostingIterator This class allows iteration over the postings within a run within the Hadoop framework.
HadoopRunsMerger This is the main merger class for Hadoop runs.
HadoopRunWriter RunWriter for the MapReduce indexer.
IDComparator Compares String objects.
Inv2DirectMultiReduce This class inverts an inverted index into a direct index, making use of a single MapReduce job.
Inv2DirectMultiReduce.ByDocidPartitioner<K> Partitioner that partitions by docid.
Inv2DirectMultiReduce.ByDocidPartitionerPosting Partitioner that partitions by docid.
Inv2DirectMultiReduce.Inv2DirectMultiReduceJob This class contains the setup for the MR job.
MapData Storage class for information about each Map.
MapEmittedPostingList Subclass of WritableByteArray.
MultiFileCollectionInputFormat Input Format Class for Hadoop Indexing.
MultiFileSplit An instance of org.apache.hadoop.mapred.MultiFileSplit that provides a default constructor.
PositionAwareSplit<T extends InputSplit> An InputSplit.
SplitAwareWrapper<T> Ironically a wrapper around a wrapper.
SplitEmittedTerm Represents a Term key used during MapReduce Indexing.
SplitEmittedTerm.SETPartitioner Partitions SplitEmittedTerms by split that they came from.
SplitEmittedTerm.SETPartitionerLowercaseAlphaTerm Partitions SplitEmittedTerms by term.
SplitEmittedTerm.SETRawComparatorTerm Sorts by term only.
SplitEmittedTerm.SETRawComparatorTermSplitFlush A comparator for comparing different split emitted terms.
WritableByteArray Represents a Writable Posting List.
 

Package org.terrier.structures.indexing.singlepass.hadoop Description

Provides classes implementing Hadoop MapReduce indexing in Terrier.
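
The class summary describes Inv2DirectMultiReduce as inverting an inverted index into a direct index. The sketch below illustrates only that data transformation, in memory and without Hadoop. It is not Terrier's implementation: the class name Inv2DirectSketch, the method invert and the nested-map representation of both indices are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration; not part of Terrier.
public class Inv2DirectSketch {

    /**
     * Turns an inverted index (term -> docid -> frequency) into a
     * direct index (docid -> term -> frequency).
     */
    static Map<Integer, Map<String, Integer>> invert(
            Map<String, Map<Integer, Integer>> inverted) {
        Map<Integer, Map<String, Integer>> direct = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> termEntry : inverted.entrySet()) {
            String term = termEntry.getKey();
            for (Map.Entry<Integer, Integer> posting : termEntry.getValue().entrySet()) {
                // Regroup each posting under its docid instead of its term.
                direct.computeIfAbsent(posting.getKey(), d -> new TreeMap<>())
                      .put(term, posting.getValue());
            }
        }
        return direct;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Integer>> inverted = new TreeMap<>();

        Map<Integer, Integer> hadoopPostings = new TreeMap<>();
        hadoopPostings.put(0, 2); // "hadoop" occurs twice in document 0
        hadoopPostings.put(1, 1); // and once in document 1
        inverted.put("hadoop", hadoopPostings);

        Map<Integer, Integer> indexPostings = new TreeMap<>();
        indexPostings.put(1, 3);  // "index" occurs three times in document 1
        inverted.put("index", indexPostings);

        // prints {0={hadoop=2}, 1={hadoop=1, index=3}}
        System.out.println(invert(inverted));
    }
}
```

In a MapReduce setting the regrouping step would presumably be carried by the shuffle, with postings routed to reducers by docid, which is the role the summary gives to the ByDocidPartitioner classes; the sketch does not model that.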



Terrier 3.5. Copyright © 2004-2011 University of Glasgow