Class HadoopPlugin

  • All Implemented Interfaces:
    ApplicationSetup.TerrierApplicationPlugin

    public class HadoopPlugin
    extends java.lang.Object
    implements ApplicationSetup.TerrierApplicationPlugin
    This class provides the main glue between Terrier and Hadoop. It has two main roles:
    1. Configure Terrier such that the Hadoop file systems can be accessed by Terrier.
    2. Provide a means to access the Hadoop map-reduce cluster, using Hadoop on Demand (HOD) if necessary.

    Configuring Terrier to access HDFS

    Terrier can access a Hadoop Distributed File System (HDFS), so collections and indices can be stored there. To enable this, ensure that your Hadoop conf/ directory is on your CLASSPATH and that Terrier loads the Hadoop plugin, by setting terrier.plugins=org.terrier.utility.io.HadoopPlugin in your terrier.properties file.
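
    The two prerequisites above can be sanity-checked before starting Terrier. The following is a minimal sketch using only the Java standard library, not part of Terrier itself: the class HadoopSetupCheck and its methods are hypothetical, it assumes terrier.plugins may hold a comma-separated list of class names, and the resource name core-site.xml is an assumption that depends on your Hadoop version.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical helper, for illustration only; it does not call Terrier or Hadoop.
class HadoopSetupCheck {
    static final String PLUGIN_CLASS = "org.terrier.utility.io.HadoopPlugin";

    /** True if terrier.plugins names the Hadoop plugin (assumes a comma-separated list). */
    static boolean pluginEnabled(Properties props) {
        String plugins = props.getProperty("terrier.plugins", "");
        return Arrays.stream(plugins.split(","))
                .map(String::trim)
                .anyMatch(PLUGIN_CLASS::equals);
    }

    /** True if the named resource can be found on the classpath of this class. */
    static boolean onClasspath(String resource) {
        return HadoopSetupCheck.class.getClassLoader().getResource(resource) != null;
    }

    public static void main(String[] args) throws IOException {
        // In practice, load your real terrier.properties file instead of this inline text.
        Properties props = new Properties();
        props.load(new StringReader("terrier.plugins=org.terrier.utility.io.HadoopPlugin\n"));
        System.out.println("plugin enabled: " + pluginEnabled(props));

        // Assumed file name; pass a different one as the first argument if needed.
        String confFile = args.length > 0 ? args[0] : "core-site.xml";
        System.out.println(confFile + " on classpath: " + onClasspath(confFile));
    }
}
```

    If the second check reports false, the Hadoop conf/ directory is probably missing from the CLASSPATH used to launch the JVM.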

    Since:
    2.2
    Author:
    Craig Macdonald
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected org.apache.hadoop.conf.Configuration config
      configuration used by this plugin
      protected org.apache.hadoop.fs.FileSystem hadoopFS
      distributed file system used by this plugin
      protected static org.slf4j.Logger logger
      The logger used
      protected static org.apache.hadoop.conf.Configuration singletonConfiguration
      main configuration object to use for Hadoop access
      protected static HadoopPlugin singletonHadoopPlugin
      The single instance of this class (it is a singleton)
    • Constructor Summary

      Constructors 
      Constructor Description
      HadoopPlugin()
      Constructs a new plugin
    • Method Summary

      Modifier and Type Method Description
      org.apache.hadoop.conf.Configuration getConfiguration()
      Returns the Hadoop configuration underlying this plugin instance
      static org.apache.hadoop.fs.FileSystem getDefaultFileSystem()
      Returns the default file system according to Hadoop
      static java.lang.String getDefaultFileSystemPrefix()
      Returns the String prefix of the default file system according to Hadoop
      static java.net.URI getDefaultFileSystemURI()
      Returns the URI of the default file system according to Hadoop
      static org.apache.hadoop.conf.Configuration getGlobalConfiguration()
      Obtain the global Hadoop configuration in use by the plugin
      void initialise()
      Initialises the Plugin, by connecting to the distributed file system
      static void setGlobalConfiguration(org.apache.hadoop.conf.Configuration _config)
      Update the global Hadoop configuration in use by the plugin
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • singletonHadoopPlugin

        protected static HadoopPlugin singletonHadoopPlugin
        The single instance of this class (it is a singleton)
      • singletonConfiguration

        protected static org.apache.hadoop.conf.Configuration singletonConfiguration
        main configuration object to use for Hadoop access
      • logger

        protected static final org.slf4j.Logger logger
        The logger used
      • config

        protected org.apache.hadoop.conf.Configuration config
        configuration used by this plugin
      • hadoopFS

        protected org.apache.hadoop.fs.FileSystem hadoopFS
        distributed file system used by this plugin
    • Constructor Detail

      • HadoopPlugin

        public HadoopPlugin()
        Constructs a new plugin
    • Method Detail

      • setGlobalConfiguration

        public static void setGlobalConfiguration(org.apache.hadoop.conf.Configuration _config)
        Update the global Hadoop configuration in use by the plugin
      • getGlobalConfiguration

        public static org.apache.hadoop.conf.Configuration getGlobalConfiguration()
        Obtain the global Hadoop configuration in use by the plugin
      • getDefaultFileSystemPrefix

        public static java.lang.String getDefaultFileSystemPrefix()
        Returns the String prefix of the default file system according to Hadoop
      • getDefaultFileSystemURI

        public static java.net.URI getDefaultFileSystemURI()
        Returns the URI of the default file system according to Hadoop
      • getDefaultFileSystem

        public static org.apache.hadoop.fs.FileSystem getDefaultFileSystem() throws java.io.IOException
        Returns the default file system according to Hadoop
        Throws:
        java.io.IOException
      • getConfiguration

        public org.apache.hadoop.conf.Configuration getConfiguration()
        Returns the Hadoop configuration underlying this plugin instance