HBase/DistributedCache issue

  • HBase/DistributedCache issue

    DistributedCache is not working for me with HBase. The mapper reads from HDFS and the reducer writes to HBase. To set up the conf correctly for the reduce job, I am using the following bean. The initReducerJob() method invokes "TableMapReduceUtil.initTableReducerJob(table, reducer, job);"

    <bean id="setupConf4HBase" class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
    <property name="targetClass"><value>dimension.setup.InitializeMRJob</value></property>
    <property name="targetMethod"><value>initReducerJob</value></property>
    <property name="arguments">
    <ref local="dimension.calculator"/>
    </property>
    </bean>

    The reducer job fails to retrieve files (property files, jars) from the DistributedCache because the files are never deployed to it. I checked job.xml and there is no trace of these files.

    The path.separator is set up properly, though. Another non-HBase job defined in the same context file works well and can access files in the DC. For that non-HBase job, I can see in job.xml that mapred.cache.files and mapred.job.classpath.files are set up properly, but they are missing from the HBase job's job.xml.

    Any further suggestions or areas to look into?

  • #2
    Try to isolate the DC problem first. I assume you're submitting the job from Windows. Is HBase running on Windows or Linux, and what about the DC?
    Can you add a depends-on to your invoke bean to make sure that the DC is configured before initReducerJob() is called?
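
    For illustration, a minimal sketch of what the depends-on could look like. The bean id "distributedCacheSetup" is made up here; substitute the id of whatever bean in your context registers the files with the DC. The rest mirrors the bean from the first post, including its single argument reference.

    ```xml
    <!-- "distributedCacheSetup" is a hypothetical id: use the id of the bean
         that configures the DistributedCache in your own context file. -->
    <bean id="setupConf4HBase"
          class="org.springframework.beans.factory.config.MethodInvokingFactoryBean"
          depends-on="distributedCacheSetup">
      <property name="targetClass"><value>dimension.setup.InitializeMRJob</value></property>
      <property name="targetMethod"><value>initReducerJob</value></property>
      <property name="arguments">
        <ref local="dimension.calculator"/>
      </property>
    </bean>
    ```

    With depends-on, Spring initializes the named bean before this one, so the DC configuration should already be in place by the time initReducerJob() runs. Whether that ordering is the actual cause here is still something to confirm by checking job.xml again afterwards.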