
  • Map Reduce Example

    I posted this on the GitHub repo for the Spring Hadoop samples, but it occurred to me that this is probably the better medium. I've installed Hadoop 2.2.0 and I'm able to run the Hadoop examples just fine from the command line. I'm interested in using Spring, though, so I tried the Map Reduce sample.
    Whenever I run the job I get this output:

    09:24:53,938 INFO org.springframework.beans.factory.support.DefaultListableBeanFactory: 596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@...: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,setupScript,runner]; root of factory hierarchy
    09:24:54,096 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
    2014-01-18 09:24:54.573 java[4237:1703] Unable to load realm info from SCDynamicStore
    09:25:28,945 WARN org.apache.hadoop.util.NativeCodeLoader: 62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    09:25:29,488 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
    09:25:29,501 INFO org.apache.hadoop.fs.TrashPolicyDefault: 92 - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    09:25:30,707 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
    09:25:30,759 INFO org.springframework.data.hadoop.mapreduce.JobRunner: 192 - Starting job [wordcountJob]
    09:25:30,790 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
    09:25:31,055 WARN org.apache.hadoop.mapreduce.JobSubmitter: 258 - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
    09:25:31,111 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat: 287 - Total input paths to process : 1
    09:25:31,260 INFO org.apache.hadoop.mapreduce.JobSubmitter: 394 - number of splits:1
    09:25:31,271 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - user.name is deprecated. Instead, use mapreduce.job.user.name
    09:25:31,272 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
    09:25:31,275 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
    09:25:31,276 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
    09:25:31,276 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.job.name is deprecated. Instead, use mapreduce.job.name
    09:25:31,276 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
    09:25:31,277 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
    09:25:31,277 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
    09:25:31,277 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
    09:25:31,278 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    09:25:31,278 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
    09:25:31,279 INFO org.apache.hadoop.conf.Configuration.deprecation: 840 - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
    09:25:31,412 INFO org.apache.hadoop.mapreduce.JobSubmitter: 477 - Submitting tokens for job: job_1390012296433_0009
    09:25:31,641 INFO org.apache.hadoop.mapred.YARNRunner: 368 - Job jar is not present. Not adding any jar to the list of resources.
    09:25:31,705 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: 174 - Submitted application application_1390012296433_0009 to ResourceManager at localhost/127.0.0.1:8032
    09:25:31,748 INFO org.apache.hadoop.mapreduce.Job:1272 - The url to track the job: http://admins-macbook-pro.local:8088...12296433_0009/
    09:25:31,749 INFO org.apache.hadoop.mapreduce.Job:1317 - Running job: job_1390012296433_0009
    09:25:35,778 INFO org.apache.hadoop.mapreduce.Job:1338 - Job job_1390012296433_0009 running in uber mode : false
    09:25:35,780 INFO org.apache.hadoop.mapreduce.Job:1345 - map 0% reduce 0%
    09:25:35,796 INFO org.apache.hadoop.mapreduce.Job:1358 - Job job_1390012296433_0009 failed with state FAILED due to: Application application_1390012296433_0009 failed 2 times due to AM Container for appattempt_1390012296433_0009_000002 exited with exitCode: 127 due to: Exception from container-launch:
    org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

    .Failing this attempt.. Failing the application.
    09:25:35,850 INFO org.apache.hadoop.mapreduce.Job:1363 - Counters: 0
    09:25:35,858 INFO org.springframework.data.hadoop.mapreduce.JobRunner: 202 - Completed job [wordcountJob]
    09:25:35,876 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
    09:25:35,914 INFO org.springframework.beans.factory.support.DefaultListableBeanFactory: 444 - Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@...: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,setupScript,runner]; root of factory hierarchy
    Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'runner': Invocation of init method failed; nested exception is java.lang.IllegalStateException: Job [wordcountJob] failed to start; status=FAILED
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1488)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:626)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
    Caused by: java.lang.IllegalStateException: Job [wordcountJob] failed to start; status=FAILED
    at org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:223)
    at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
    at org.springframework.data.hadoop.mapreduce.JobExecutor.startJobs(JobExecutor.java:172)
    at org.springframework.data.hadoop.mapreduce.JobExecutor.startJobs(JobExecutor.java:164)
    at org.springframework.data.hadoop.mapreduce.JobRunner.call(JobRunner.java:52)
    at org.springframework.data.hadoop.mapreduce.JobRunner.afterPropertiesSet(JobRunner.java:44)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1547)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1485)
    ... 13 more

    Any thoughts on what the problem could be?
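For what it's worth, an exit code of 127 from a Unix shell conventionally means "command not found", so the `exited with exitCode: 127` above suggests the AM container's launch script could not locate a binary it needed (commonly `java` when JAVA_HOME isn't visible to the NodeManager; exporting it in hadoop-env.sh is worth checking). A minimal sketch of that convention, with a deliberately nonexistent command:

```shell
# Exit code 127 is the shell's "command not found" status -- the same code
# reported by the failed container launch. The command name here is made up.
code=0
sh -c 'no-such-command-xyz' 2>/dev/null || code=$?
echo "$code"   # prints 127
```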


  • #2
    The GitHub issue is here: https://github.com/spring-projects/s...mples/issues/4

    Any progress on this?



    • #3
      No luck getting this to run yet. I've tried several things based on your recommendation about permissions.

      1. Tried using different directories. I did notice that the input directory on HDFS is getting created by the groovy script, and that the data is being copied into it.

      2. I tried disabling permission checks since I'm only running this locally in pseudo-distributed mode.

      3. Tried the suggestions in this Stack Overflow post: http://stackoverflow.com/questions/2...ainer-exceptio

      I'm running Mac OS X Mavericks. Could this possibly be a Mac environment issue?

      I upped the logging as well, but haven't found anything more descriptive.

      I am able to run the examples that ship with Hadoop 2.2.0 just fine from the command line.
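For reference, the permission check mentioned in step 2 maps to an hdfs-site.xml property (assuming the Hadoop 2.x name here; it was dfs.permissions in 1.x). Disabling it is only reasonable for a local pseudo-distributed setup:

```xml
<!-- hdfs-site.xml: turn off HDFS permission checking (local testing only) -->
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
```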



      • #4
        Something I noticed...

        The sample app uses this configuration:

        <job id="wordcountJob"
            input-path="${wordcount.input.path}"
            output-path="${wordcount.output.path}"
            libs="file:${app.repo}/hadoop-examples-*.jar"
            mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
            reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"/>

        When I upped the logging to DEBUG, I noticed that no jar was being resolved for hadoop-examples-*.jar. I changed the config to this:

        <job id="wordcountJob"
            input-path="${wordcount.input.path}"
            output-path="${wordcount.output.path}"
            libs="file:${app.repo}/hadoop-mapreduce-examples-*.jar"
            mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
            reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"/>

        It looks like that is the jar name used by Hadoop 2.2.0.

        Unfortunately...same result.
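One way to confirm which wildcard actually resolves on disk (sketched with a throwaway directory standing in for the sample's ${app.repo} property; the path is hypothetical):

```shell
# Throwaway directory standing in for ${app.repo}.
APP_REPO=$(mktemp -d)
touch "$APP_REPO/hadoop-mapreduce-examples-2.2.0.jar"   # 2.2.0 jar name

# Old Hadoop 1.x-style pattern from the original config: matches nothing here.
ls "$APP_REPO"/hadoop-examples-*.jar 2>/dev/null || echo "no match"

# Updated pattern: resolves the jar.
ls "$APP_REPO"/hadoop-mapreduce-examples-*.jar
```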
