Bug - HBaseConfigurationFactoryBean returns Wrong Type

  • Bug - HBaseConfigurationFactoryBean returns Wrong Type

    UPDATE

    After looking at the type hierarchy, I've realised this is down to the fact HBaseConfiguration extends Configuration. Sorry!

    Hi,

    I'm reposting this possible bug in the hope of further exposure.

    https://jira.springsource.org/browse/SHDP-38

    HbaseConfigurationFactoryBean appears to be returning an instance of org.apache.hadoop.conf.Configuration instead of org.apache.hadoop.hbase.HBaseConfiguration. Looking at the source, it appears that this is because both member variables are of the hadoop.conf type.

    This can be reproduced quite easily by using both <hdp:configuration /> and <hdp:hbase-configuration /> and specifying a bean with an @Autowired org.apache.hadoop.conf.Configuration property: context startup will fail due to the ambiguous autowire candidacy.
    Last edited by BinaryTweedDeej; Mar 2nd, 2012, 05:12 AM.
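
    For reference, a minimal context along the lines described above might look like the following sketch (a reconstruction for illustration, not the reporter's actual file). Both elements expose a bean assignable to org.apache.hadoop.conf.Configuration, which is why an @Autowired Configuration field sees two candidates:

    ```xml
    <!-- Sketch: two namespace elements, each producing a bean that is
         (or extends) org.apache.hadoop.conf.Configuration. Any bean with
         an @Autowired Configuration property then fails to start the
         context due to ambiguous autowire candidates. -->
    <hdp:configuration />
    <hdp:hbase-configuration />
    ```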

  • #2
    No problem - I've replied on the issue raised as well, in case the HBaseConfiguration class was what you were looking for.

    • #3
      I am not getting the correct files list when using hdp:hbase-configuration
      I create a Configuration using HBaseConfiguration.create() and this is what it contains:

      Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml, hbase-default.xml, hbase-site.xml

      when I use the hdp:hbase-configuration in my job which gets passed to the Reducer this is what it contains:

      Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-shaneebersole/mapred/local/localRunner/job_local_0001.xml

      I can't connect to hbase without the hbase-default.xml and hbase-site.xml files being read.

      • #4
        The namespace uses HBaseConfiguration.create() underneath (it also merges in the given Hadoop config, if any). Can you verify that the HBase configs are the same before submitting the job?
        My guess is that on the reducer side the wrong configuration is used (or something happens to it).
        What does your mapper/reducer job definition look like?
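
        The merge mentioned above can be pictured as a simple key/value overlay in which the later source wins. A toy sketch in plain Java (no Hadoop dependency; all names and values here are illustrative, not the project's actual merge code):

        ```java
        import java.util.LinkedHashMap;
        import java.util.Map;

        public class OverlayDemo {
            // Overlay 'override' on top of 'base', mimicking the idea of
            // merging a user-supplied Hadoop config into HBase defaults.
            static Map<String, String> merge(Map<String, String> base,
                                             Map<String, String> override) {
                Map<String, String> merged = new LinkedHashMap<>(base);
                merged.putAll(override); // later source wins on key clashes
                return merged;
            }

            public static void main(String[] args) {
                Map<String, String> hbaseDefaults = new LinkedHashMap<>();
                hbaseDefaults.put("hbase.zookeeper.quorum", "localhost");

                Map<String, String> hadoopConfig = new LinkedHashMap<>();
                hadoopConfig.put("fs.default.name", "hdfs://localhost:9000");
                hadoopConfig.put("hbase.zookeeper.quorum", "server1.example.com");

                Map<String, String> merged = merge(hbaseDefaults, hadoopConfig);
                System.out.println(merged.get("hbase.zookeeper.quorum"));
                System.out.println(merged.get("fs.default.name"));
            }
        }
        ```

        Dumping both configurations this way (before submission and inside the reducer) is one way to spot where a property goes missing.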

        • #5
          <configuration id="hadoop-configuration">
              fs.default.name=${hdfs.namenode:hdfs://localhost:9000}
          </configuration>

          <hbase-configuration id="hbase-configuration"
              configuration-ref="hadoop-configuration">
              hbase.zookeeper.quorum=server1.bericotechnologies.com
          </hbase-configuration>

          <job id="adventureworks-job"
              properties-location="classpath:conf/adventureworks.properties"
              configuration-ref="hbase-configuration"
              input-path="${input}"
              output-path="${output}"
              mapper="haruspex.etl.csv.CsvToCubeMapper"
              reducer="haruspex.etl.csv.CsvToCubeReducer"
              validate-paths="false"
          />


          In my reducer this works:

          Configuration conf = HBaseConfiguration.create();
          conf.set("hbase.zookeeper.quorum", "server1.bericotechnologies.com");
          hbase = new HBaseAdmin(conf);


          this doesn't:

          hbase = new HBaseAdmin(context.getConfiguration());

          ERROR:

          WARN 2012-06-07 13:58:29,101 [org.apache.hadoop.hbase.zookeeper.ZKConfig(ZKConfig.java)]: java.net.UnknownHostException: server1.bericotechnologies.com
          at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
          at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
          at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1202)
          at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
          at java.net.InetAddress.getAllByName(InetAddress.java:1083)
          at java.net.InetAddress.getAllByName(InetAddress.java:1019)
          at java.net.InetAddress.getByName(InetAddress.java:969)
          at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:206)
          at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:250)
          at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:113)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1209)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:511)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:502)
          at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:172)
          at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:92)
          at haruspex.etl.csv.CsvToCubeReducer.setup(CsvToCubeReducer.java:84)
          at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
          at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
          at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:256)

          ERROR 2012-06-07 13:58:29,101 [org.apache.hadoop.hbase.zookeeper.ZKConfig(ZKConfig.java)]: no valid quorum servers found in zoo.cfg
          WARN 2012-06-07 13:58:29,121 [org.apache.hadoop.mapred.LocalJobRunner(LocalJobRunner.java)]: job_local_0001
          org.apache.hadoop.hbase.ZooKeeperConnectionException: An error is preventing HBase from connecting to ZooKeeper
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1213)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:511)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:502)
          at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:172)
          at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:92)
          at haruspex.etl.csv.CsvToCubeReducer.setup(CsvToCubeReducer.java:84)
          at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
          at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
          at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:256)
          Caused by: java.io.IOException: Unable to determine ZooKeeper ensemble
          at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:92)
          at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:119)
          at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1209)
          ... 9 more

          • #6
            Something strange is happening. I've added another test on top of the existing ones (note we also have an HBase sample that we'll ship in the upcoming m2) and it passes just fine:
            Code:
                <hdp:hbase-configuration configuration-ref="hadoopConfiguration">
                    head=bucket
                </hdp:hbase-configuration>
            Code:
                @Resource(name = "hbaseConfiguration")
                Configuration config;
            
                @Autowired
                HbaseTemplate template;
            
                @Test
                public void testConfigProperties() throws Exception {
                    Assert.notNull(config);
                    assertEquals("bucket", config.get("head"));
                }
            Note that both HBase and Hadoop share the same object type - Configuration. The XML wiring looks fine, but I'm curious whether context.getConfiguration() returns the proper object or not. You could double-check by looking at the number of beans that match that type and verifying their names.

            • #7
              If the above (double checking the beans and everything else) doesn't work, can you try the latest snapshot [1] to see if it helps?

              http://www.springsource.org/spring-data/hadoop#maven

              Cheers,

              • #8
                I have an example, shipped in the release I am using, that (like your test) uses the template injected with the hbase-configuration and then injected into the test class. I haven't run it, but I assume it works (Spring being really good, et al.); I wonder, though, whether it has something to do with how the jobs are being loaded...

                I have checked to see if the correct Configuration is being used and the properties that I add in the hbase-configuration are available in the reducer.

                • #9
                  the snapshot doesn't like the script-tasklet:

                  Caused by: java.lang.IllegalStateException: Cannot create filesystem
                  at org.springframework.data.hadoop.fs.HdfsResourceLoader.<init>(HdfsResourceLoader.java:79)
                  at org.springframework.data.hadoop.fs.HdfsResourceLoader.<init>(HdfsResourceLoader.java:92)
                  at org.springframework.data.hadoop.fs.HdfsResourceLoader.<init>(HdfsResourceLoader.java:58)
                  at org.springframework.data.hadoop.scripting.HdfsScriptFactoryBean.detectHdfsRL(HdfsScriptFactoryBean.java:155)
                  at org.springframework.data.hadoop.scripting.HdfsScriptFactoryBean.postProcess(HdfsScriptFactoryBean.java:89)
                  at org.springframework.data.hadoop.scripting.Jsr223ScriptEvaluatorFactoryBean.afterPropertiesSet(Jsr223ScriptEvaluatorFactoryBean.java:77)
                  at org.springframework.data.hadoop.scripting.HdfsScriptFactoryBean.afterPropertiesSet(HdfsScriptFactoryBean.java:66)
                  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1514)
                  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1452)
                  ... 60 more
                  Caused by: java.lang.NullPointerException
                  at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:119)
                  at org.springframework.data.hadoop.fs.HdfsResourceLoader.<init>(HdfsResourceLoader.java:74)
                  ... 68 more

                  • #10
                    Try to see whether the classpath inside your cluster is the same as in your test.

                    • #11
                      Can you post your config? Thanks.

                      • #12
                        Hmm - it seems there is no Hadoop configuration defined (a null one is passed through). Again, can you post your config? Thanks.
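
                        A working setup would declare a Hadoop configuration before the script, so the script has a filesystem to bind to. A rough sketch (element names follow the hdp namespace used elsewhere in this thread; treat this as an illustration, not a verified configuration):

                        ```xml
                        <!-- Declare the Hadoop configuration first so the
                             script-tasklet can resolve a filesystem instead
                             of hitting a null configuration. -->
                        <hdp:configuration>
                            fs.default.name=hdfs://localhost:9000
                        </hdp:configuration>

                        <hdp:script-tasklet id="script-tasklet">
                            <hdp:script language="javascript">
                                // script body here
                            </hdp:script>
                        </hdp:script-tasklet>
                        ```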

                        • #13
                          Costin - I noticed that the HbaseTemplate isn't in the M1 release. If there is a fix for the script-tasklet problem, I would like to move to the snapshot. Do you have any insight into why it is erroring out?

                          • #14
                            I can only guess without looking at a sample configuration file - as I mentioned before, posting the part of your config with the script-tasklet definition and the relevant dependencies would help a lot.

                            • #15
                              It looks like your script was declared in a context without any Hadoop configuration, which was then wired to create other Hadoop components, resulting in exceptions. I've pushed a fix that issues warnings and does not bind the variables in this scenario.
                              It will be available in the next nightly build.
