class not found exception in MR job invoked by tool-tasklet

  • class not found exception in MR job invoked by tool-tasklet

    I always get the following error when I run the tool-tasklet.

    java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.examples.WordCount$TokenizerMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1081)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:609)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)

    My configuration is as follows.
    <hdp:tool-tasklet id="wc-tasklet" scope="step" tool-class="com.abc.WordCountToolRunner" libs="cp/hadoop-examples-0.20.204.0.jar" jar="cp/tool.jar">
        <hdp:arg value="#{jobParameters['inputFile']}"/>
        <hdp:arg value="#{jobParameters['outputFile']}"/>
        property=value
    </hdp:tool-tasklet>

    I am wondering how we can add third-party libraries to the classpath of an MR job launched from tool-tasklet.
    Last edited by timodavid; Sep 1st, 2012, 01:11 PM.

  • #2
    That's because the examples are not on the classpath for the tool execution. You specify them under libs, but the tool-class itself (WordCountToolRunner) seems to need them. So you can either add them to the Spring classpath or, if you don't use them in your app, specify them (comma-separated) under the jar attribute.
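
    Taken literally, the second option might look like the sketch below, applied to the configuration from the first post. It only illustrates the suggestion and is not a tested setup: the comma-separated jar value is an assumption about how the attribute is parsed, and everything else is copied from the original configuration.

    Code:
        <!-- Sketch only: the examples jar is appended to the jar attribute,
             on the assumption that it accepts a comma-separated list. -->
        <hdp:tool-tasklet id="wc-tasklet" scope="step"
            tool-class="com.abc.WordCountToolRunner"
            libs="cp/hadoop-examples-0.20.204.0.jar"
            jar="cp/tool.jar,cp/hadoop-examples-0.20.204.0.jar">
            <hdp:arg value="#{jobParameters['inputFile']}"/>
            <hdp:arg value="#{jobParameters['outputFile']}"/>
        </hdp:tool-tasklet>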

    • #3
      Hi Costin,
      I tried adding the examples jar to the classpath, and also adding it to the jar attribute with a comma, but no luck: I could not resolve the error and it still says class not found.
      Any pointers on how this can be fixed? Or could you post a tool-runner example for a scenario where third-party libraries need to be added to the Hadoop job to make it work?

      • #4
        Here's an example from the test suite:

        Code:
            <hdp:job id="custom-jar-job"
                input-path="/ide-test/input/word/" output-path="/ide-test/runner/output-4/"
                mapper="test.SomeTool$CustomMapper"
                validate-paths="false"
                jar="some-tool.jar"
                libs="some-tool.jar"
                scope="prototype" />
        Make sure you're using the BUILD-SNAPSHOT instead of 1.0 M2.

        • #5
          Hi Costin, I tried the batch-wordcount example with the job configuration below. It didn't work and kept failing with the ClassNotFoundException for org.apache.hadoop.examples.WordCount$TokenizerMapper.

          <job id="wordcount-job" input-path="${wordcount.input.path:/user/gutenberg/input/word/}"
          output-path="${wordcount.output.path:/user/gutenberg/output/word/}"
          mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
          reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"
          validate-paths="false"/>

          I tried providing the libs attribute, but the hadoop schema could not be validated with it. The example worked once jar-by-class was added: jar-by-class="org.apache.hadoop.examples.WordCount.TokenizerMapper".
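
          For anyone landing here later, the job definition that worked would look roughly like the one below. It is the definition from this post with jar-by-class added and nothing else changed, so treat it as a sketch of what I described rather than a configuration verified against other versions.

          Code:
              <!-- Sketch: same job as above, plus jar-by-class so the job jar
                   is taken from wherever the mapper class is found. -->
              <job id="wordcount-job"
                  input-path="${wordcount.input.path:/user/gutenberg/input/word/}"
                  output-path="${wordcount.output.path:/user/gutenberg/output/word/}"
                  mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
                  reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"
                  jar-by-class="org.apache.hadoop.examples.WordCount.TokenizerMapper"
                  validate-paths="false"/>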
