Wednesday 2 October 2013

How to manually set up HBase for your Cloudera CDH4 cluster on CentOS | RHEL | Linux

Follow these steps


1. Install hbase on all the machines

sudo yum install hbase

2. Install hbase-master and zookeeper-server on your master machine

sudo yum install zookeeper-server
sudo yum install hbase-master

The zookeeper-server package automatically installs the base zookeeper package as well.
HBase needs ZooKeeper in order to start.


3. Install hbase-regionserver and zookeeper on all your slave machines

sudo yum install zookeeper
sudo yum install hbase-regionserver
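
Steps 1 to 3 can be collapsed into one small helper per node. This is only a sketch: the role labels "master" and "slave" and the function names are made up for this example, and the package lists are simply the ones named in the steps above.

```shell
#!/bin/sh
# Sketch: map a node role to the packages named in steps 1-3.
# "master" and "slave" are hypothetical role labels chosen for this example.
packages_for_role() {
    case "$1" in
        master) echo "hbase zookeeper-server hbase-master" ;;
        slave)  echo "hbase zookeeper hbase-regionserver" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Install everything a role needs in one yum call (needs sudo rights).
install_role() {
    pkgs=$(packages_for_role "$1") || return 1
    # Intentionally unquoted so that each package becomes its own argument.
    sudo yum install -y $pkgs
}
```

On the master you would then run install_role master, and on every slave install_role slave.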

4. Modifying the HBase Configuration


To enable pseudo-distributed mode, you must first make some configuration changes. Open /etc/hbase/conf/hbase-site.xml in your editor of choice, and insert the following XML properties between the <configuration> and </configuration> tags. Be sure to replace myhost with the hostname of your HDFS NameNode (as specified by fs.default.name or fs.defaultFS in your hadoop/conf/core-site.xml file); you may also need to change the port number from the default (8020).
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://myhost:8020/hbase</value>
</property>
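
If you would rather not hand-edit the hostname and port, the two properties can be generated from shell variables. A minimal sketch, assuming a hypothetical helper named render_hbase_props; it only prints the properties, and you still paste them between the <configuration> tags yourself.

```shell
#!/bin/sh
# Sketch: print the two properties above for a given NameNode.
# $1 = NameNode hostname, $2 = NameNode port (falls back to 8020).
# render_hbase_props is a hypothetical name used only in this example.
render_hbase_props() {
    nn_host=$1
    nn_port=${2:-8020}
    cat <<EOF
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://${nn_host}:${nn_port}/hbase</value>
</property>
EOF
}
```

For example, render_hbase_props myhost prints the same block as shown above, and render_hbase_props myhost 9000 changes only the port.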

5. Configuring for Distributed Operation


After you have decided which machines will run each process, you can edit the configuration so that the nodes may locate each other. In order to do so, you should make sure that the configuration files are synchronized across the cluster. Cloudera strongly recommends the use of a configuration management system to synchronize the configuration files, though you can use a simpler solution such as rsync to get started quickly.
The only configuration change necessary to move from pseudo-distributed operation to fully-distributed operation is the addition of the ZooKeeper Quorum address in hbase-site.xml. Insert the following XML property to configure the nodes with the address of the node where the ZooKeeper quorum peer is running:

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>mymasternode</value>
</property>
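
For the rsync route mentioned above, something like the following could push the configuration directory to every node. It is a sketch under assumptions: the host names slave1 and slave2 are placeholders, and the account running rsync is assumed to have SSH access and write permission on /etc/hbase/conf on each target. The function only prints the commands so you can review them first.

```shell
#!/bin/sh
# Hypothetical host list; replace with your own node names.
HBASE_NODES="slave1 slave2"
CONF_DIR="/etc/hbase/conf"

# Print one rsync command per node instead of running it directly.
# -a keeps permissions and timestamps, -z compresses the transfer.
print_sync_commands() {
    for node in $HBASE_NODES; do
        echo "rsync -az ${CONF_DIR}/ ${node}:${CONF_DIR}/"
    done
}
```

Once the printed commands look right, print_sync_commands | sh runs them.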

6. Creating the /hbase Directory in HDFS


Before starting the HBase Master, you need to create the /hbase directory in HDFS. The HBase Master runs as hbase:hbase, so it does not have the permissions required to create a top-level directory.
To create the /hbase directory in HDFS:
$ sudo -u hdfs hadoop fs -mkdir /hbase
$ sudo -u hdfs hadoop fs -chown hbase /hbase
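
To double-check the result, you can look at the owner column of the directory listing. The helper below is a sketch with a made-up name; it assumes the usual hadoop fs -ls layout, where the owner is the third column and the path is the last one.

```shell
#!/bin/sh
# Sketch: read "hadoop fs -ls /" output on stdin and succeed only if
# the /hbase entry exists and is owned by the hbase user.
# Assumes the owner is column 3 and the path is the last column.
hbase_dir_owner_ok() {
    awk '$NF == "/hbase" { found = 1; owner = $3 }
         END { if (found && owner == "hbase") exit 0; exit 1 }'
}
```

Usage would be along the lines of: sudo -u hdfs hadoop fs -ls / | hbase_dir_owner_ok && echo "/hbase is ready"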

7. Starting the ZooKeeper Server

  • To start ZooKeeper after a fresh install:
$ sudo service zookeeper-server init
$ sudo service zookeeper-server start
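
Before moving on, it helps to confirm that ZooKeeper really answers. ZooKeeper replies "imok" to the four-letter command "ruok" on its client port. The sketch below assumes nc is installed and that the client port is the default 2181; the function names are made up for this example.

```shell
#!/bin/sh
# Succeeds only if the given reply is ZooKeeper's healthy answer.
zk_reply_ok() {
    [ "$1" = "imok" ]
}

# Send "ruok" to host $1 on port $2 (falls back to 2181) and check the reply.
# Assumes nc (netcat) is available on this machine.
zk_ruok() {
    reply=$(printf 'ruok' | nc "$1" "${2:-2181}" 2>/dev/null)
    zk_reply_ok "$reply"
}
```

For example: zk_ruok mymasternode && echo "ZooKeeper is up"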

8. Starting zookeeper

$ sudo service zookeeper start


9. Starting the HBase Master

  • On Red Hat and SLES systems (using .rpm packages) you can now start the HBase Master by using the included service script:
$ sudo service hbase-master start

To start the Region Server:
$ sudo service hbase-regionserver start

10. Accessing HBase by using the HBase Shell

After you have started HBase, you can access the database by using the HBase Shell:
$ hbase shell
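
A quick way to confirm that the Master and Region Servers work together is a throwaway table. The sketch below writes a few standard HBase shell commands to a file; the table name smoke_test, the column family cf, and the helper name are arbitrary choices for this example.

```shell
#!/bin/sh
# Sketch: write a small HBase shell script to the path given as $1.
# It creates a table, writes one cell, reads it back, then cleans up.
write_smoke_script() {
    cat > "$1" <<'EOF'
create 'smoke_test', 'cf'
put 'smoke_test', 'row1', 'cf:a', 'value1'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'
exit
EOF
}
```

Then run it with: write_smoke_script /tmp/hbase_smoke.rb && hbase shell /tmp/hbase_smoke.rb
If the scan prints row1, the cluster is accepting reads and writes.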