How to Install Apache HBase on Ubuntu 13.10

First, download HBase and extract the tar.gz archive to a directory (hbase in my case). The official Quick Start guide is very clear, so you may want to read it before continuing here.

HBase configuration is relatively easy, but there are a few key points to consider. The first thing I did was configure HBase to manage its own ZooKeeper instance and set the correct Java path.

$ cd hbase/conf

$ nano hbase-env.sh

Uncomment the line

export HBASE_MANAGES_ZK=true

and set your Java path (adjust it to match your own installation)

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386

Save and exit with Ctrl+O, then Ctrl+X.

Next, add the following properties to hbase-site.xml, inside its <configuration> element.

$ nano hbase-site.xml

<property>
  <name>hbase.rootdir</name>
  <value>file:///home/USER/hbase</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/USER/zookeeper</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>Property from Zookeeper's config zoo.cfg.
  The port at which the clients will connect.
  </description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>

Save and exit again with Ctrl+O, then Ctrl+X.
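Since a typo in this file is easy to miss, a quick sanity check can help. The following is only a sketch using the JDK's built-in XML parser, not anything HBase ships with: the class name SiteConfigCheck and the method readProperties are names I made up for illustration, and the XML string in main is a shortened stand-in for the real file contents.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SiteConfigCheck {

    // Reads Hadoop-style <property><name>..</name><value>..</value></property>
    // entries from an XML string into a name -> value map.
    static Map<String, String> readProperties(String xml) {
        Map<String, String> props = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
        } catch (Exception e) {
            // Malformed XML or a property without <name>/<value> ends up here.
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the contents of hbase-site.xml.
        String xml = "<configuration>"
                + "<property><name>hbase.zookeeper.quorum</name><value>localhost</value></property>"
                + "<property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>"
                + "</configuration>";
        Map<String, String> props = readProperties(xml);
        // prints localhost:2181
        System.out.println(props.get("hbase.zookeeper.quorum") + ":"
                + props.get("hbase.zookeeper.property.clientPort"));
    }
}
```

To check the real file, you would read its contents into a String first and pass that to readProperties.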

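HBase is now configured. Start it with the bin/start-hbase.sh script in the hbase directory and it should be up and running.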

After that, I wanted to add records to my database from a Java program. With a little modification, I adapted some of my old code to write data to HBase. Before jumping into the code, here is a video showing how to add the HBase libraries to a Java project in the Eclipse IDE.
Video: https://www.youtube.com/watch?v=LDFg51utAHw

The sample program in the video extracts some data from an audio file and records it in the database. Before explaining the code, here are the imports I used.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

After importing the libraries, the first thing to do is create a Configuration object.

Configuration hConfig = HBaseConfiguration.create();

Then create an HTableDescriptor object to describe the table's structure.

HTableDescriptor hDesc = new HTableDescriptor("Audio");
hDesc.addFamily(new HColumnDescriptor("Time"));
hDesc.addFamily(new HColumnDescriptor("Location"));
hDesc.addFamily(new HColumnDescriptor("Data"));

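As you can see, unlike in an RDBMS, you can easily add multiple column families to a table; the individual columns inside each family do not have to be declared in advance. The next step is to create an admin instance and use it to create the table.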

HBaseAdmin hAdmin = new HBaseAdmin(hConfig);
if (!hAdmin.tableExists("Audio")) {
    hAdmin.createTable(hDesc);
}
HTable hTable = new HTable(hConfig, "Audio");
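To picture what the three families declared above mean, it can help to think of an HBase table as nested sorted maps: row key, then column family, then column qualifier, then value. The sketch below is only a toy in-memory model built on java.util.TreeMap, not HBase code. The class TableLayoutSketch, its methods, and the sample values are made up for illustration, and it ignores things real HBase adds, such as cell timestamps and byte-array keys.

```java
import java.util.Map;
import java.util.TreeMap;

public class TableLayoutSketch {

    // row key -> column family -> column qualifier -> value
    private final Map<String, Map<String, Map<Integer, String>>> rows = new TreeMap<>();

    // Stores a value, creating the row and family maps on first use.
    void put(String row, String family, int qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(qualifier, value);
    }

    // Returns the stored value, or null if the row, family, or qualifier is missing.
    String get(String row, String family, int qualifier) {
        Map<String, Map<Integer, String>> families = rows.get(row);
        if (families == null) {
            return null;
        }
        Map<Integer, String> columns = families.get(family);
        return columns == null ? null : columns.get(qualifier);
    }

    public static void main(String[] args) {
        TableLayoutSketch audio = new TableLayoutSketch();
        // Hypothetical values; qualifiers 0 and 1 are created on the fly.
        audio.put("Channel0", "Time", 0, "0.00");
        audio.put("Channel0", "Time", 1, "0.02");
        audio.put("Channel0", "Location", 0, "512");
        System.out.println(audio.get("Channel0", "Time", 1));
    }
}
```

The point of the model is that only the families (Time, Location, Data) are fixed when the table is created; the qualifiers under them appear as data is written.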

Now we have a table named Audio with the given column families. To add data to this table, we create a Put instance and then insert it into the table.

// One row, keyed "Channel0"; each fingerprint entry i becomes a column qualifier in every family.
Put p0 = new Put(Bytes.toBytes("Channel0"));
for (int i = 0; i < this.fingerPrint.get(0).getFingerPrint().size(); i++) {
    p0.add(Bytes.toBytes("Time"), Bytes.toBytes(i), Bytes.toBytes(this.fingerPrint.get(0).getFingerPrint(i).getTime()));
    p0.add(Bytes.toBytes("Location"), Bytes.toBytes(i), Bytes.toBytes(this.fingerPrint.get(0).getFingerPrint(i).getLocation()));
    p0.add(Bytes.toBytes("Data"), Bytes.toBytes(i), Bytes.toBytes(this.fingerPrint.get(0).getFingerPrint(i).getValue()));
}
// Send the whole row to the table in one call.
hTable.put(p0);
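The Bytes.toBytes calls above turn every row key, qualifier, and value into a byte array. To show roughly what that means without needing the HBase jars, here is a small sketch using only the JDK. It assumes, as far as I know, that Bytes.toBytes(int) produces 4 big-endian bytes and Bytes.toBytes(String) produces UTF-8 bytes; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteEncodingSketch {

    // 4-byte big-endian encoding of an int (ByteBuffer is big-endian by default).
    static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Reverse of intToBytes.
    static int bytesToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    // UTF-8 encoding of a string.
    static byte[] stringToBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 258 = 0x00000102, so this prints [0, 0, 1, 2]
        System.out.println(Arrays.toString(intToBytes(258)));
        // prints 258
        System.out.println(bytesToInt(intToBytes(258)));
        // prints 8 (one byte per ASCII character)
        System.out.println(stringToBytes("Channel0").length);
    }
}
```

One consequence of using an int loop index as the qualifier is that it is stored as 4 raw bytes rather than readable text, so you have to decode it the same way when reading the data back.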

HBase stores everything as byte arrays, so you need to convert your data to bytes in order to record it. Finally, after all this, you should close the connections.

hAdmin.close();
hTable.close();
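If an exception is thrown before these two lines, the close calls are skipped. On Java 7 or later, try-with-resources is one way around that. The sketch below uses a made-up FakeResource class as a stand-in, so it runs without HBase; I believe HBaseAdmin and HTable implement Closeable in the HBase versions of this era and could be used the same way, but treat that as an assumption to verify against your version.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderSketch {

    // Hypothetical stand-in for something like HBaseAdmin or HTable.
    static final class FakeResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        FakeResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    // Opens two resources, optionally fails while using them, and returns the event log.
    static List<String> run(boolean failDuringWork) {
        List<String> log = new ArrayList<>();
        try (FakeResource admin = new FakeResource("admin", log);
             FakeResource table = new FakeResource("table", log)) {
            log.add("working");
            if (failDuringWork) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time we get here.
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [working, closed table, closed admin, caught simulated failure]
        System.out.println(run(true));
    }
}
```

Resources are closed in reverse order of declaration, so the table is closed before the admin, whether or not the work in between fails.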

And that's it. But, as I mentioned, this code is only a sample; it is neither optimized nor designed for real applications. It just shows how things work. Even with an RDBMS, performance and security issues should be considered carefully. For a distributed database system, a wrong decision can cause serious trouble.

Lastly, while connecting to HBase from Java, you may get an error like this:

WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused
ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries

 

I did. In my case, the problem was the ZooKeeper client port property (hbase.zookeeper.property.clientPort) in hbase-site.xml. If you get this error at any stage, checking that property might save you some time.
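Since "Connection refused" simply means nothing accepted the connection on that host and port, one quick way to narrow things down is to test the port directly. This is a minimal sketch using plain java.net sockets, not an HBase or ZooKeeper API; the class PortCheck and the method isReachable are names I made up, and localhost and 2181 are taken from the configuration shown earlier.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unknown host all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port assumed from hbase.zookeeper.quorum and
        // hbase.zookeeper.property.clientPort in hbase-site.xml.
        boolean up = isReachable("localhost", 2181, 2000);
        System.out.println("ZooKeeper port reachable: " + up);
    }
}
```

If this reports false while HBase is supposed to be running, the client port in hbase-site.xml and the port your Java program connects to are the first things to compare.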