Package org.terrier.utility.io
Class HadoopPlugin
- java.lang.Object
  - org.terrier.utility.io.HadoopPlugin
- All Implemented Interfaces:
ApplicationSetup.TerrierApplicationPlugin
public class HadoopPlugin extends java.lang.Object implements ApplicationSetup.TerrierApplicationPlugin
This class provides the main glue between Terrier and Hadoop. It has two main roles:
- Configure Terrier so that it can access Hadoop file systems.
- Provide a means to access the Hadoop map-reduce cluster, using Hadoop on Demand (HOD) if necessary.
Configuring Terrier to access HDFS
Terrier can access a Hadoop Distributed File System (HDFS), allowing collections and indices to be placed there. To do so, ensure that your Hadoop conf/ directory is on your CLASSPATH and that Terrier loads the Hadoop plugin. To load the plugin, set terrier.plugins=org.terrier.utility.io.HadoopPlugin in your terrier.properties file.
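As an illustration of the property setting described above, the following minimal sketch parses an example terrier.properties fragment with the standard java.util.Properties class and checks that the plugin is listed. The class name TerrierPluginConfigSketch and the helper pluginEnabled are hypothetical names introduced for this example and are not part of Terrier. Treating terrier.plugins as a comma-separated list is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class for illustration; not part of Terrier.
class TerrierPluginConfigSketch {

    // Example terrier.properties contents. Only the terrier.plugins line is
    // taken from the documentation; the comment line is illustrative.
    static final String EXAMPLE_PROPERTIES =
            "# Load the Hadoop plugin when Terrier starts\n"
          + "terrier.plugins=org.terrier.utility.io.HadoopPlugin\n";

    /**
     * Returns true if the given properties text names pluginClass in
     * terrier.plugins. Assumption: multiple plugins are comma-separated.
     */
    static boolean pluginEnabled(String propertiesText, String pluginClass) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail in practice.
            throw new UncheckedIOException(e);
        }
        String plugins = props.getProperty("terrier.plugins", "");
        for (String name : plugins.split(",")) {
            if (name.trim().equals(pluginClass)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(
            pluginEnabled(EXAMPLE_PROPERTIES, "org.terrier.utility.io.HadoopPlugin"));
    }
}
```

A check of this kind only confirms that the property is set. It does not verify that the Hadoop conf/ directory is on the CLASSPATH, which is the other precondition named above.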
- Since: 2.2
- Author: Craig Macdonald
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected org.apache.hadoop.conf.Configuration | config | configuration used by this plugin |
| protected org.apache.hadoop.fs.FileSystem | hadoopFS | distributed file system used by this plugin |
| protected static org.slf4j.Logger | logger | The logger used |
| protected static org.apache.hadoop.conf.Configuration | singletonConfiguration | main configuration object to use for Hadoop access |
| protected static HadoopPlugin | singletonHadoopPlugin | instance of this class - it is a singleton |
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| HadoopPlugin() | Constructs a new plugin |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| org.apache.hadoop.conf.Configuration | getConfiguration() | Returns the Hadoop configuration underlying this plugin instance |
| static org.apache.hadoop.fs.FileSystem | getDefaultFileSystem() | What is the default file system according to Hadoop |
| static java.lang.String | getDefaultFileSystemPrefix() | What is the String prefix of the default file system according to Hadoop |
| static java.net.URI | getDefaultFileSystemURI() | What is the URI of the default file system according to Hadoop |
| static org.apache.hadoop.conf.Configuration | getGlobalConfiguration() | Obtain the global Hadoop configuration in use by the plugin |
| void | initialise() | Initialises the Plugin, by connecting to the distributed file system |
| static void | setGlobalConfiguration(org.apache.hadoop.conf.Configuration _config) | Update the global Hadoop configuration in use by the plugin |
-
Field Detail
-
singletonHadoopPlugin
protected static HadoopPlugin singletonHadoopPlugin
instance of this class - it is a singleton
-
singletonConfiguration
protected static org.apache.hadoop.conf.Configuration singletonConfiguration
main configuration object to use for Hadoop access
-
logger
protected static final org.slf4j.Logger logger
The logger used
-
config
protected org.apache.hadoop.conf.Configuration config
configuration used by this plugin
-
hadoopFS
protected org.apache.hadoop.fs.FileSystem hadoopFS
distributed file system used by this plugin
-
Method Detail
-
setGlobalConfiguration
public static void setGlobalConfiguration(org.apache.hadoop.conf.Configuration _config)
Update the global Hadoop configuration in use by the plugin
-
getGlobalConfiguration
public static org.apache.hadoop.conf.Configuration getGlobalConfiguration()
Obtain the global Hadoop configuration in use by the plugin
-
getDefaultFileSystemPrefix
public static java.lang.String getDefaultFileSystemPrefix()
What is the String prefix of the default file system according to Hadoop
-
getDefaultFileSystemURI
public static java.net.URI getDefaultFileSystemURI()
What is the URI of the default file system according to Hadoop
-
getDefaultFileSystem
public static org.apache.hadoop.fs.FileSystem getDefaultFileSystem() throws java.io.IOException
What is the default file system according to Hadoop
- Throws:
java.io.IOException
-
initialise
public void initialise() throws java.lang.Exception
Initialises the Plugin, by connecting to the distributed file system
- Specified by:
initialise in interface ApplicationSetup.TerrierApplicationPlugin
- Throws:
java.lang.Exception
-
getConfiguration
public org.apache.hadoop.conf.Configuration getConfiguration()
Returns the Hadoop configuration underlying this plugin instance
-
-