  1. Terrier Core
  2. TR-201

log4j conflicts can occur for hadoop indexing


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5
    • Fix Version/s: 3.6
    • Component/s: None
    • Labels:


      I am trying to index ClueWeb09B with Terrier, but indexing fails, apparently because of a conflict in the log4j configuration:

      Setting JAVA_HOME to /usr
      INFO - JAAS Configuration already set up for Hadoop, not re-installing.
      INFO - Term-partitioned Mode, 26 reducers creating one inverted index.
      INFO - Copying terrier share/ directory (/home/bpiwowar/terrier-3.5/share) to shared storage area (hdfs://oops1/tmp/-1138700486-terrier.share)
      INFO - Copying classpath to job
      WARN - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
      WARN - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
      INFO - Allocating 100 files across 2 map tasks
      INFO - Running job: job_201204061503_0009
      INFO - map 0% reduce 0%
      INFO - Task Id : attempt_201204061503_0009_m_000001_0, Status : FAILED
      at org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:94)
      at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:337)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:272)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:416)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
      at org.apache.hadoop.mapred.Child.main(Child.java:264)


      Here is a discussion about the topic

      A first workaround is to uncomment the code in isLog4JConfigured (ApplicationSetup), but a better solution would be to rely on a Terrier-specific logger repository, i.e. following
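      As an illustration of the "Terrier-specific logger repository" idea, here is a minimal sketch against the log4j 1.2 API. It is not the patch attached to this issue. The class name TerrierLogging, the INFO level, and the console appender with its pattern are all assumptions made for the example. The sketch creates a private Hierarchy, so configuring it should leave alone the default repository, which the Hadoop task JVM is assumed to configure for itself.

```java
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Hierarchy;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.spi.LoggerRepository;
import org.apache.log4j.spi.RootLogger;

/** Hypothetical helper (not part of Terrier): a logger repository private to Terrier. */
public final class TerrierLogging {

    // A separate Hierarchy is a separate LoggerRepository. Loggers created
    // here are unknown to LogManager's default repository, and vice versa.
    private static final LoggerRepository REPOSITORY =
            new Hierarchy(new RootLogger(Level.INFO));

    static {
        // Attach an appender to the private root logger only; the default
        // root logger and its appenders are not modified.
        REPOSITORY.getRootLogger().addAppender(
                new ConsoleAppender(new PatternLayout("%p - %m%n")));
    }

    private TerrierLogging() {
        // static helper, not meant to be instantiated
    }

    /** Returns a logger that lives in the private repository. */
    public static Logger getLogger(Class<?> clazz) {
        return REPOSITORY.getLogger(clazz.getName());
    }

    public static void main(String[] args) {
        Logger log = getLogger(TerrierLogging.class);
        log.info("logged through the Terrier-specific repository");
    }
}
```

      In this sketch, Terrier classes would obtain their loggers through TerrierLogging.getLogger(...) instead of Logger.getLogger(...), so that no Terrier code reads or resets the default log4j configuration.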


          Issue Links


            bpiwowar Benjamin Piwowarski created issue -
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Summary         Indexing with hadoop fails   log4j conflicts can occur for hadoop indexing
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Fix Version/s                                3.6 [ 10060 ]
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Attachment                                   TR-201.patch [ 10395 ]
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Link                                         This issue duplicates TR-111 [ TR-111 ]
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Status          Open [ 1 ]                   Resolved [ 5 ]
            Resolution                                   Fixed [ 1 ]
            craigm Craig Macdonald made changes -
            Field           Original Value               New Value
            Link                                         This issue is duplicated by TR-232 [ TR-232 ]


              • Assignee:
                craigm Craig Macdonald
              • Reporter:
                bpiwowar Benjamin Piwowarski
              • Watchers:
                0


                • Created: