public class HadoopUtility extends Object
JobFactory jf = HadoopUtility.getJobFactory("TerrierJob");
JobConf jc = jf.newJob();
HadoopUtility.makeTerrierJob(jc);
// populate jc
// if an index is needed in the MR job:
HadoopUtility.toHConfiguration(index, jc);
RunningJob rj = JobClient.runJob(jc);
HadoopUtility.finishTerrierJob(jc);
During an MR job, the configure method should call HadoopUtility.loadTerrierJob(jc);
To obtain an index: Index index = HadoopUtility.fromHConfiguration(jc);

Modifier and Type | Class and Description |
---|---|
static class | HadoopUtility.MapBase<K1,V1,K2,V2> - Abstract class that provides default configure and close methods for a Mapper. |
static class | HadoopUtility.MapReduceBase<K1,V1,K2,V2,K3,V3> - Handy base class for MapReduce jobs. |
static class | HadoopUtility.ReduceBase<K1,V1,K2,V2> - Abstract class that provides default configure and close methods for a Reducer. |

Modifier and Type | Field and Description |
---|---|
protected static String[] | checkSystemProperties |
protected static String | HADOOP_TMP_PATH |
protected static org.slf4j.Logger | logger |
protected static Random | random |

Constructor and Description |
---|
HadoopUtility() |

Modifier and Type | Method and Description |
---|---|
protected static void | deleteJobApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) |
protected static org.apache.hadoop.fs.Path | findCacheFileByFragment(org.apache.hadoop.mapred.JobConf jc, String name) |
protected static String[] | findJarFiles(String[] classPathLines) |
static void | finishTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) - Call this after the MapReduce job specified by jobConf has completed, to clean up any leftover files. |
static IndexOnDisk | fromHConfiguration(org.apache.hadoop.conf.Configuration c) - Get an Index saved to the specified Hadoop configuration by toHConfiguration(). |
static boolean | isMap(org.apache.hadoop.mapred.JobConf jc) - Utility method to detect if a task is a Map task or not. |
protected static void | loadApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) |
static void | loadTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) - When the current ApplicationSetup has been saved to the JobConf by makeTerrierJob(), use this method during the MR job to properly initialise Terrier. |
protected static org.apache.hadoop.fs.Path | makeTemporaryFile(org.apache.hadoop.mapred.JobConf jobConf, String filename) |
static void | makeTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) - Saves the current ApplicationSetup to the specified JobConf. |
protected static void | removeClassPathFromJob(org.apache.hadoop.mapred.JobConf jobConf) |
protected static void | saveApplicationSetupToJob(org.apache.hadoop.mapred.JobConf jobConf, boolean getFreshProperties) |
protected static void | saveClassPathToJob(org.apache.hadoop.mapred.JobConf jobConf) |
static boolean | setJobOutputCompression(org.apache.hadoop.mapred.JobConf conf) - Utility method to set JobOutputCompression if possible. |
static boolean | setMapOutputCompression(org.apache.hadoop.mapred.JobConf conf) - Utility method to set MapOutputCompression if possible. |
protected static boolean | startsWithAny(String source, String[] checks) - Returns true if source contains any of the Strings held in checks. |
static void | toHConfiguration(Index i, org.apache.hadoop.conf.Configuration c) - Puts the specified index onto the given Hadoop configuration. |
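
The summary for startsWithAny above says "contains", while the method name suggests a prefix test. The following is a minimal sketch of the prefix reading only, written as an assumption from the name; it is not taken from the Terrier source, and the class name StartsWithAnySketch and the sample property names are hypothetical.

```java
public class StartsWithAnySketch {

    /**
     * Assumed behaviour: returns true if source begins with at least
     * one of the strings in checks. This is an illustration, not the
     * Terrier implementation.
     */
    public static boolean startsWithAny(String source, String[] checks) {
        for (String check : checks) {
            if (source.startsWith(check)) {
                return true;
            }
        }
        // No prefix matched (this is also the result for an empty checks array).
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical prefixes, chosen only for the demonstration.
        String[] prefixes = {"terrier.", "hadoop."};
        System.out.println(startsWithAny("terrier.index.path", prefixes)); // prints "true"
        System.out.println(startsWithAny("java.class.path", prefixes));    // prints "false"
    }
}
```

Under the "contains" reading, the loop would call source.contains(check) instead; the documentation here does not settle which of the two the real method uses.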
protected static final org.slf4j.Logger logger
protected static final String HADOOP_TMP_PATH
protected static final String[] checkSystemProperties
protected static final Random random
public static final boolean isMap(org.apache.hadoop.mapred.JobConf jc)
public static boolean setMapOutputCompression(org.apache.hadoop.mapred.JobConf conf)
conf - JobConf of job.
public static boolean setJobOutputCompression(org.apache.hadoop.mapred.JobConf conf)
conf - JobConf of job.
public static void makeTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
public static void loadTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
public static void finishTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
protected static void removeClassPathFromJob(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
protected static void saveClassPathToJob(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
protected static org.apache.hadoop.fs.Path makeTemporaryFile(org.apache.hadoop.mapred.JobConf jobConf, String filename) throws IOException
IOException
protected static void deleteJobApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
protected static void saveApplicationSetupToJob(org.apache.hadoop.mapred.JobConf jobConf, boolean getFreshProperties) throws Exception
Exception
protected static org.apache.hadoop.fs.Path findCacheFileByFragment(org.apache.hadoop.mapred.JobConf jc, String name) throws IOException
IOException
protected static void loadApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
public static IndexOnDisk fromHConfiguration(org.apache.hadoop.conf.Configuration c)
public static void toHConfiguration(Index i, org.apache.hadoop.conf.Configuration c)
protected static boolean startsWithAny(String source, String[] checks)
source - String to check
checks - Strings to check for

Terrier Information Retrieval Platform 4.1. Copyright © 2004-2015, University of Glasgow