java.lang.Object org.terrier.utility.io.HadoopUtility
public class HadoopUtility
Utility class for the setting up and configuring of Terrier MapReduce jobs.
General scheme for a Hadoop Job
JobFactory jf = HadoopUtility.getJobFactory("TerrierJob");
JobConf jc = jf.newJob();
HadoopUtility.makeTerrierJob(jc);
// populate jc
// if an index is needed in the MR job:
HadoopUtility.toHConfiguration(index, jc);
RunningJob rj = JobClient.runJob(jc);
HadoopUtility.finishTerrierJob(jc);
During an MR job, the configure method should call HadoopUtility.loadTerrierJob(jc);
To obtain an index within the job, call Index index = HadoopUtility.fromHConfiguration(jc);
Nested Class Summary | |
---|---|
static class | HadoopUtility.MapReduceBase<K1,V1,K2,V2,K3,V3>: Handy base class for MapReduce jobs. |
Field Summary | |
---|---|
protected static java.lang.String[] | checkSystemProperties |
protected static org.apache.log4j.Logger | logger |
protected static java.util.Random | random |
Constructor Summary | |
---|---|
HadoopUtility() | |
Method Summary | |
---|---|
protected static void | deleteJobApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) |
protected static org.apache.hadoop.fs.Path | findCacheFileByFragment(org.apache.hadoop.mapred.JobConf jc, java.lang.String name) |
protected static java.lang.String[] | findJarFiles(java.lang.String[] classPathLines) |
static void | finishTerrierJob(org.apache.hadoop.mapred.JobConf jobConf): Call this after the MapReduce job specified by jobConf has completed, to clean up any leftover files. |
static Index | fromHConfiguration(org.apache.hadoop.conf.Configuration c): Get an Index saved to the specified Hadoop configuration by toHConfiguration(). |
static boolean | isMap(org.apache.hadoop.mapred.JobConf jc): Utility method to detect whether a task is a Map task. |
protected static void | loadApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) |
static void | loadTerrierJob(org.apache.hadoop.mapred.JobConf jobConf): Once the current ApplicationSetup has been saved to the JobConf by makeTerrierJob(), use this method during the MR job to properly initialise Terrier. |
protected static org.apache.hadoop.fs.Path | makeTemporaryFile(org.apache.hadoop.mapred.JobConf jobConf, java.lang.String filename) |
static void | makeTerrierJob(org.apache.hadoop.mapred.JobConf jobConf): Saves the current ApplicationSetup to the specified JobConf. |
protected static void | removeClassPathFromJob(org.apache.hadoop.mapred.JobConf jobConf) |
protected static void | saveApplicationSetupToJob(org.apache.hadoop.mapred.JobConf jobConf, boolean getFreshProperties) |
protected static void | saveClassPathToJob(org.apache.hadoop.mapred.JobConf jobConf) |
static boolean | setJobOutputCompression(org.apache.hadoop.mapred.JobConf conf): Utility method to set JobOutputCompression if possible. |
static boolean | setMapOutputCompression(org.apache.hadoop.mapred.JobConf conf): Utility method to set MapOutputCompression if possible. |
protected static boolean | startsWithAny(java.lang.String source, java.lang.String[] checks): Returns true if source starts with any of the Strings held in checks. |
static void | toHConfiguration(Index i, org.apache.hadoop.conf.Configuration c): Puts the specified index onto the given Hadoop configuration. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final org.apache.log4j.Logger logger
protected static final java.lang.String[] checkSystemProperties
protected static final java.util.Random random
Constructor Detail |
---|
public HadoopUtility()
Method Detail |
---|
public static final boolean isMap(org.apache.hadoop.mapred.JobConf jc)
public static boolean setMapOutputCompression(org.apache.hadoop.mapred.JobConf conf)
Parameters: conf - JobConf of job.
public static boolean setJobOutputCompression(org.apache.hadoop.mapred.JobConf conf)
Parameters: conf - JobConf of job.
public static void makeTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
public static void loadTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
public static void finishTerrierJob(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
protected static void removeClassPathFromJob(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
protected static void saveClassPathToJob(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
protected static java.lang.String[] findJarFiles(java.lang.String[] classPathLines)
protected static org.apache.hadoop.fs.Path makeTemporaryFile(org.apache.hadoop.mapred.JobConf jobConf, java.lang.String filename) throws java.io.IOException
Throws: java.io.IOException
protected static void deleteJobApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
protected static void saveApplicationSetupToJob(org.apache.hadoop.mapred.JobConf jobConf, boolean getFreshProperties) throws java.lang.Exception
Throws: java.lang.Exception
protected static org.apache.hadoop.fs.Path findCacheFileByFragment(org.apache.hadoop.mapred.JobConf jc, java.lang.String name) throws java.io.IOException
Throws: java.io.IOException
protected static void loadApplicationSetup(org.apache.hadoop.mapred.JobConf jobConf) throws java.io.IOException
Throws: java.io.IOException
public static Index fromHConfiguration(org.apache.hadoop.conf.Configuration c)
public static void toHConfiguration(Index i, org.apache.hadoop.conf.Configuration c)
protected static boolean startsWithAny(java.lang.String source, java.lang.String[] checks)
Parameters:
source - String to check
checks - Strings to check for
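The prefix check that the name startsWithAny suggests can be sketched in plain Java as follows. This is a minimal illustration, not Terrier's implementation: the class name StartsWithAnySketch, the case-sensitive comparison, the handling of null inputs, and the example property names are all assumptions made for the sketch, and the real method may differ (for example in case handling).

```java
/** Illustrative stand-in only; this is not Terrier's HadoopUtility. */
final class StartsWithAnySketch {

    /**
     * Returns true if source begins with at least one of the Strings in checks.
     * Assumptions: the comparison is case-sensitive, and a null source,
     * a null array, or null array entries never match.
     */
    static boolean startsWithAny(String source, String[] checks) {
        if (source == null || checks == null) {
            return false;
        }
        for (String prefix : checks) {
            if (prefix != null && source.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical prefixes, chosen only for illustration.
        String[] checks = {"terrier.", "mapred."};
        System.out.println(startsWithAny("terrier.index.path", checks)); // prints true
        System.out.println(startsWithAny("java.io.tmpdir", checks));     // prints false
    }
}
```

A check of this shape is a natural fit for filtering property names by prefix, which may be why it sits next to the checkSystemProperties field, although the page itself does not state how the two are related.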