Installing Pseudo-Distributed HBase on Ubuntu

HBase run modes: Standalone and Distributed

Standalone mode: By default HBase runs in standalone mode. In standalone mode, HBase does not use HDFS. 

Distributed mode: Distributed mode can be subdivided into pseudo-distributed, where all daemons run on a single node, and fully-distributed, where the daemons are spread across all nodes in the cluster.

Hadoop version support matrix

                         HBase-0.92.x   HBase-0.94.x   HBase-0.95
Hadoop-0.20.205          S              X              X
Hadoop-0.22.x            S              X              X
Hadoop-1.0.0-1.0.2[a]    S              S              X
Hadoop-1.0.3+            S              S              S
Hadoop-1.1.x             NT             S              S
Hadoop-0.23.x            X              S              NT
Hadoop-2.x               X              S              S

[a] HBase requires hadoop 1.0.3 at a minimum; there is an issue where we cannot find KerberosUtil compiling against earlier versions of Hadoop.

Where

S = supported and tested,
X = not supported,
NT = it should run, but not tested enough.

Pseudo-Distributed Installation

This hbase-0.94.8 installation was done on the following versions of Linux, Java and Hadoop:

Ubuntu 13.04

Java 1.7.0_25

Hadoop 1.1.2

I use hduser as a dedicated Hadoop system user. Hadoop is installed in the /home/hduser/hadoop folder, and HBase will be installed in the /usr/lib/hbase folder.

  • Download the stable hbase-<version>.tar.gz release from the HBase download page.
  • Change into the directory where the archive was downloaded. By default this is the “Downloads” directory.
$ cd Downloads/
  • Extract the tar file.
$ tar -xvf hbase-0.94.8.tar.gz
  • Create the installation directory.
$ sudo mkdir /usr/lib/hbase
  • Move hbase-0.94.8 into it. The directory was created with sudo, so the move needs sudo as well.
$ sudo mv hbase-0.94.8 /usr/lib/hbase/hbase-0.94.8
  • Open hbase/conf/hbase-env.sh and modify these lines:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

export HBASE_REGIONSERVERS=/usr/lib/hbase/hbase-0.94.8/conf/regionservers

export HBASE_MANAGES_ZK=true
  • Set the HBASE_HOME path in the .bashrc file

Open the .bashrc file with this command:

$ gedit ~/.bashrc

Append the following two statements to the file:

export HBASE_HOME=/usr/lib/hbase/hbase-0.94.8

export PATH=$PATH:$HBASE_HOME/bin

 

  • Update hbase-site.xml in HBASE_HOME/conf folder with required properties.

    hbase-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hduser/hbase/zookeeper</value>
  </property>
</configuration>
  • Now check the Hadoop version support matrix. If your HBase version does not support your Hadoop version, you will get an exception. To fix this, copy hadoop-core-*.jar from your HADOOP_HOME and commons-collections-*.jar from HADOOP_HOME/lib into your HBASE_HOME/lib folder.
  • Extra steps

In /etc/hosts there are two entries: 127.0.0.1 and 127.0.1.1. Change the second entry, 127.0.1.1, to 127.0.0.1; otherwise HBase fails with the error org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
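
The /etc/hosts change can be previewed before touching the real file. The following is a minimal sketch, assuming a POSIX shell with sed; the function name preview_hosts_fix is hypothetical and not part of HBase or Ubuntu. It only prints the rewritten content and never modifies the file it reads.

```shell
# Print a hosts file with a leading 127.0.1.1 address rewritten to 127.0.0.1.
# The input file is only read, never modified.
# (preview_hosts_fix is a hypothetical helper name used for illustration.)
preview_hosts_fix() {
    # Match 127.0.1.1 only at the start of a line and only when it is
    # followed by whitespace, so other addresses are left alone.
    sed 's/^127\.0\.1\.1\([[:space:]]\)/127.0.0.1\1/' "$1"
}
```

Running preview_hosts_fix /etc/hosts shows the file as it would look after the change. If the output looks right, make the same edit in /etc/hosts with an editor started through sudo.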

  • To start HBase (start Hadoop first)
hduser@archana:~$ start-hbase.sh

localhost: starting zookeeper, logging to /usr/lib/hbase/hbase-0.94.8/bin/../logs/hbase-hduser-zookeeper-archana.out
 starting master, logging to /usr/lib/hbase/hbase-0.94.8/logs/hbase-hduser-master-archana.out
 localhost: starting regionserver, logging to /usr/lib/hbase/hbase-0.94.8/bin/../logs/hbase-hduser-regionserver-archana.out

The jps command lists all currently running Java processes:

hduser@archana:~$ jps

 4334 HQuorumPeer
 2882 SecondaryNameNode
 4867 Jps
 3207 TaskTracker
 2460 NameNode
 4671 HRegionServer
 4411 HMaster
 2977 JobTracker
 2668 DataNode
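
To avoid reading the jps listing by eye, the check can be scripted. This is a rough sketch, assuming a POSIX shell with awk; missing_daemons is a hypothetical helper name, and the daemon names passed to it are simply the ones shown in the listing above.

```shell
# Read `jps` output on standard input and print every expected daemon
# (given as arguments) that does not appear in it.
# Returns non-zero if at least one daemon is missing.
# (missing_daemons is a hypothetical helper name used for illustration.)
missing_daemons() {
    jps_output=$(cat)
    rc=0
    for daemon in "$@"; do
        # Compare the second column exactly, so that NameNode is not
        # satisfied by a SecondaryNameNode line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            rc=1
        fi
    done
    return $rc
}
```

For the pseudo-distributed setup described here, a call such as jps | missing_daemons HQuorumPeer HMaster HRegionServer NameNode DataNode should print nothing when everything is up.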

HBase Shell

hduser@archana:~$ hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.
 Type "exit<RETURN>" to leave the HBase Shell
 Version 0.94.8, r1485407, Wed May 22 20:53:13 UTC 2013

hbase(main):001:0> create 't1','c1'
  • To stop HBase
HBASE_PATH$ bin/stop-hbase.sh

stopping hbase...............

To use the web interfaces

http://localhost:60010 for master
http://localhost:60030 for region server

  • Reference :

http://hbase.apache.org/book/standalone_dist.html

http://hbase.apache.org/book/standalone_dist.html#confirm

Note: The information provided here is to the best of my knowledge and experience. If any modifications are needed, please help me with your valuable suggestions, which are always welcome. :)


Installing Apache HBase on Ubuntu for Standalone Mode

Standalone HBase

By default HBase runs in standalone mode. In standalone mode, HBase does not use HDFS; it uses the local file system instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM. ZooKeeper binds to a well-known port so clients can talk to HBase. HBase requires Java 6 or newer; without it, HBase will not start.

This hbase-0.94.8 installation was done on the following versions of Linux, Java and Hadoop:

Ubuntu 13.04

Java 1.7.0_25

Hadoop 1.1.2

I use hduser as a dedicated Hadoop system user. Hadoop is installed in the /home/hduser/hadoop folder, and HBase will be installed in the /usr/lib/hbase folder.

  • Download hbase-0.94.8.tar.gz from the HBase download page.
  • Change into the directory where the archive was downloaded. By default this is the “Downloads” directory.
$ cd Downloads/
  • Extract the tar file.
$ tar -xvf hbase-0.94.8.tar.gz
  • Create the installation directory.
$ sudo mkdir /usr/lib/hbase
  • Move hbase-0.94.8 into it. The directory was created with sudo, so the move needs sudo as well.
$ sudo mv hbase-0.94.8 /usr/lib/hbase/hbase-0.94.8
  • Configuring HBase with Java

Open hbase/conf/hbase-env.sh and set the path to the Java installation on your system:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
  • Set the HBASE_HOME path in the .bashrc file

Open the .bashrc file with this command:

hduser@system_name:~$ gedit ~/.bashrc

Append the following two statements to the file:

export HBASE_HOME=/usr/lib/hbase/hbase-0.94.8

export PATH=$PATH:$HBASE_HOME/bin
  •  At this point, you are ready to start HBase. Before starting it, you may want to edit conf/hbase-site.xml and set hbase.rootdir, the directory HBase writes to.
  •  By default, hbase.rootdir is set to /tmp/hbase-${user.name}, which means you will lose all your data whenever the server reboots.
  •  So set hbase.rootdir in hbase-site.xml to a directory where you want HBase to store its data, for example:
  •  hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/hduser/HBASE/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hduser/HBASE/zookeeper</value>
  </property>
</configuration>
  • Extra steps

In /etc/hosts there are two entries: 127.0.0.1 and 127.0.1.1. Change the second entry, 127.0.1.1, to 127.0.0.1; otherwise HBase fails with the error org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

  • To start HBase (in standalone mode there is no need to start Hadoop)
HBASE_PATH$ bin/start-hbase.sh

HBASE_PATH$ bin/hbase shell
  • To stop HBase
HBASE_PATH$ bin/stop-hbase.sh

stopping hbase...............
  • To use the web interfaces

http://localhost:60010 for master
http://localhost:60030 for region server

  • Reference :

http://archive.cloudera.com/cdh/3/hbase-0.90.1-cdh3u0/quickstart.html

http://archive.cloudera.com/cdh/3/hbase-0.90.1-cdh3u0/notsoquick.html


Creating a Servlet with Eclipse and Tomcat

Servlet Definition: What is a Servlet

Java Servlets are server-side Java program modules that implement the Servlet interface and process and answer client requests. They extend Web server functionality with minimal overhead, maintenance and support.

A servlet acts as an intermediary between the client and the server. Because servlet modules run on the server, they can receive and respond to requests made by the client.

Because a servlet is written in Java, it has all of Java's features, such as high portability, platform independence, security and Java database connectivity.

Environment Used

  • JDK 7
  • Eclipse IDE
  • Apache Tomcat 6.x
1.  Java Installation

See the Java Installation Tutorial for instructions on how to install JDK 7.

2.  Tomcat Installation

See the Apache Tomcat Installation Tutorial for instructions on how to install Apache Tomcat.

After the installation, test whether Tomcat is correctly installed by opening http://localhost:8080/ in a browser. This should open a Tomcat information page. Afterwards, stop Tomcat; Eclipse needs to start Tomcat itself for its deployments.

3.  Eclipse Installation

See the Eclipse Installation Tutorial for instructions on how to install Eclipse.

4.  Now we are ready to create a Dynamic Web Project.

Creating Dynamic Web Project

To create a Servlet we need to create a new ‘Dynamic Web Project’. Follow these steps:

  • File menu -> New -> Dynamic Web Project

  • Enter the project name as ‘HelloWorld’, make sure the Apache Tomcat v6.0 Target Runtime is selected with Dynamic web module version 2.5, and click the ‘Finish’ button.

Creating Servlet

  • Project -> New -> Servlet
  • Enter the Class name as HelloServlet
  • Click on ‘Finish‘

The complete source code for the class will now look like this:

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * Servlet implementation class HelloServlet
 */
public class HelloServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    /**
     * @see HttpServlet#HttpServlet()
     */
    public HelloServlet() {
        super();
    }

    /**
     * @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
     */
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Tell the client that the response body is HTML.
        response.setContentType("text/html");

        // Write the response body through the writer supplied by the response object.
        PrintWriter pw = response.getWriter();
        pw.println("<h1>Hello World</h1>");
    }
}

  • response.setContentType() sets the type of response we are sending back.
  • PrintWriter is used to write the response; we get the PrintWriter object from the response object.

Run the Servlet

  • Right-click ‘HelloServlet.java’ in the Dynamic Web Project -> Run As -> Run on Server. Select the existing ‘Tomcat v6.0 Server at localhost’ and click Finish.

http://localhost:8080/HelloWorld/HelloServlet


 

Install Kettle 4.4.0 on Ubuntu 13.04


Kettle is Pentaho’s ETL tool, which is also called Pentaho Data Integration (PDI).

  • This Kettle 4.4.0 installation was done on the following versions of Linux, Java and Hadoop:

Ubuntu 13.04

Java 1.7.0_25

Hadoop 1.1.2

  • Download the stable Kettle release.
  • Change into the directory where the archive was downloaded. By default this is the “Downloads” directory.

     cd ~/Downloads

  • Extract the tar file.

    tar -xzf pdi-ce-4.4.0-stable.tar.gz

  • Move data-integration to /bin/pdi-ce-4.4.0 (writing to /bin requires sudo).

    sudo mv data-integration/ /bin/pdi-ce-4.4.0

  • Create a symlink

    cd /bin

    sudo ln -s pdi-ce-4.4.0 data-integration

  • To run Spoon:

    cd /bin/data-integration

    ./spoon.sh

Apache Hadoop 1.1.2 is not compatible with the Apache Hadoop 0.20.x line, so PDI does not work with 1.1.2 out of the box. Follow these steps to make it compatible 🙂

  • Create a folder named “hadoop-112” in the hadoop-configurations directory [data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations].
  • Copy the contents of the “hadoop-20” folder into the “hadoop-112” folder.
  • Replace the following JARs in the client/ subfolder [data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hadoop-112/lib/client] with the versions from the Apache Hadoop 1.1.2 distribution:
  1.  commons-codec-<version>.jar
  2.  hadoop-core-<version>.jar
  • Add the following JAR from the Hadoop 1.1.2 distribution to the client/ subfolder as well:

       commons-configuration-<version>.jar

  • Change the property in plugins.properties [data-integration/plugins/pentaho-big-data-plugin/] to point to the new folder:

active.hadoop.configuration=hadoop-112

  • Start PDI

    ./spoon.sh
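
The plugins.properties edit from the steps above can also be scripted instead of done by hand. The sketch below assumes a POSIX shell with grep and sed; set_property is a hypothetical helper name, not something PDI provides, and it assumes the value contains no | or & characters.

```shell
# Set KEY=VALUE in a Java-style properties file: replace the existing
# KEY line if there is one, otherwise append a new line.
# (set_property is a hypothetical helper name used for illustration.)
# Assumption: VALUE contains no '|' or '&', which sed would treat specially.
# Note: dots in KEY act as regex wildcards, which is harmless for a key
# such as active.hadoop.configuration.
set_property() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Write to a temporary file first, then move it over the original.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

With the layout used in this post, the call would look like set_property /bin/data-integration/plugins/pentaho-big-data-plugin/plugins.properties active.hadoop.configuration hadoop-112, run with enough permissions to write to that file.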

    Reference:

http://funpdi.blogspot.in/2013/03/pentaho-data-integration-44-and-hadoop.html


Tomcat Installation on Ubuntu

This Apache Tomcat 6 installation was done on the following versions of Linux and Java:

Ubuntu 13.04

Java 1.7.0_25

  •  Download the Apache Tomcat tar file.
  •  Extract the tar file

hduser@archana:~$ tar xzf apache-tomcat-6.0.37.tar.gz

Or

Right-click apache-tomcat-6.0.37.tar.gz and choose the “Extract Here” option.

  •  Move apache-tomcat-6.0.37 to tomcat6

hduser@archana:~$ sudo mv apache-tomcat-6.0.37 /usr/local/tomcat6

  •  Now configure the .bashrc file

CATALINA_HOME is the path of the extracted Tomcat folder.

JAVA_HOME is the path where Java is installed.

hduser@archana:~$ gedit ~/.bashrc

Append the following two statements to the file:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

export CATALINA_HOME=/usr/local/tomcat6

Save and exit. You can make the changes take effect in the current shell by reloading the .bashrc file.

hduser@archana:~$ . ~/.bashrc

  • Once the paths are set, it is time to start Tomcat

Move to the bin folder

hduser@archana:~$ cd /usr/local/tomcat6/bin/

  •  Start tomcat server

hduser@archana:/usr/local/tomcat6/bin$ sh startup.sh

Using CATALINA_BASE:   /usr/local/tomcat6

Using CATALINA_HOME:   /usr/local/tomcat6

Using CATALINA_TMPDIR: /usr/local/tomcat6/temp

Using JRE_HOME:        /usr/lib/jvm/jdk1.7.0_25

Using CLASSPATH:       /usr/local/tomcat6/bin/bootstrap.jar

  • Stop tomcat server

hduser@archana:/usr/local/tomcat6/bin$ sh shutdown.sh

Using CATALINA_BASE:   /usr/local/tomcat6

Using CATALINA_HOME:   /usr/local/tomcat6

Using CATALINA_TMPDIR: /usr/local/tomcat6/temp

Using JRE_HOME:        /usr/lib/jvm/jdk1.7.0_25

Using CLASSPATH:       /usr/local/tomcat6/bin/bootstrap.jar

🙂

Hive Installation On Ubuntu

This hive-0.10.0 installation was done on the following versions of Linux, Java and Hadoop:

Ubuntu 13.04

Java 1.7.0_25

Hadoop 1.1.2

I use hduser as a dedicated Hadoop system user. Hadoop is installed in the /home/hduser/hadoop folder, and Hive will be installed in the /usr/lib/hive folder.

  • Download the stable Hive release from this link

http://mirror.tcpdiag.net/apache/hive/stable/

  • Change into the directory where the archive was downloaded. By default this is the “Downloads” directory.
$ cd ~/Downloads
  • Extract the tar file.

[Switch to the root user with the command: su]

# tar xzf hive-0.10.0.tar.gz
  • Create the installation directory
# mkdir /usr/lib/hive
  • Move hive-0.10.0 into it
# mv hive-0.10.0 /usr/lib/hive/hive-0.10.0

[Return from root to hduser with the command: exit (or su hduser)]

  • Set the HIVE_HOME path in the .bashrc file

Open the .bashrc file with this command:

hduser@system_name:~$ gedit ~/.bashrc

Append the following two statements to the file:

export HIVE_HOME=/usr/lib/hive/hive-0.10.0

export PATH=$PATH:$HIVE_HOME/bin
  •  Type hive on the command line to open the Hive shell.
$ hive

hive>
  • Now you can play with Hive 🙂

How to install MySQL on Ubuntu

This MySQL installation was done on the following version of Ubuntu:

Ubuntu 13.04

  • First, make sure your package lists are up to date and install the latest available software.

sudo apt-get update

sudo apt-get dist-upgrade

  • Install the MySQL server and client packages:

sudo apt-get install mysql-server mysql-client

The command above also installs the mysql-client package, which is needed to log in to MySQL from the server itself.

During the installation, MySQL will ask you to set a root password.

  • You can now access your MySQL server like this:

mysql -u root -p

 mysql>

  •  Have fun using MySQL Server 🙂
  • What are the MySQL server and the MySQL client?

The mysql-server package installs the MySQL database server, which you interact with through a MySQL client. You can use the client to send commands to any MySQL server, whether on a remote computer or on your own.

The MySQL server persists the data and provides a query interface (SQL) for it. The client's purpose is to let you use that query interface.

How to install Java in Ubuntu

This JDK 7 installation was done on the following version of Ubuntu:

Ubuntu 13.04

  •  Download the 32-bit or 64-bit Linux “compressed binary file”. It has a “.tar.gz” file extension: “[java-version]-i586.tar.gz” for 32-bit and “[java-version]-x64.tar.gz” for 64-bit.
  • Once the download is complete, extract the file using the following command

sudo tar -xvf jdk-7u25-linux-i586.tar.gz

The JDK 7 package is extracted into the jdk1.7.0_25 directory inside the current directory.

  • Now move the JDK 7 directory to /usr/lib/jvm

            sudo mkdir -p /usr/lib/jvm

            sudo mv jdk1.7.0_25/ /usr/lib/jvm/jdk1.7.0_25

  • Now run

sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.7.0_25/bin/java" 1

sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.7.0_25/bin/javac" 1

sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/lib/jvm/jdk1.7.0_25/bin/javaws" 1

  • Run

sudo update-alternatives --config java

You will see output similar to the one below. Choose the number for jdk1.7.0_25.

There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                             Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java    1071       auto mode
  1            /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java    1071       manual mode
* 2            /usr/lib/jvm/jdk1.7.0_25/bin/java                1          manual mode

Press enter to keep the current choice[*], or type selection number: 2

  • Repeat the previous step for the following commands:

sudo update-alternatives --config javac

sudo update-alternatives --config javaws

  • Check the version of your new JDK 7 installation:

java -version

java version "1.7.0_25"

Java(TM) SE Runtime Environment (build 1.7.0_25-b15)

Java HotSpot(TM) Server VM (build 23.25-b01, mixed mode)

  • Set JAVA_HOME in Ubuntu.

Open the .bashrc file:

gedit ~/.bashrc

Now add the following to the end of the file.

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

export PATH=$PATH:$JAVA_HOME/bin

NOTE: If /usr/lib/jvm/jdk1.7.0_25 does not match the actual JDK location in your environment, set JAVA_HOME to the directory where you installed Java on your machine.
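
Editing .bashrc by hand works, but repeating the step can leave duplicate export lines behind. As a small sketch, assuming a POSIX shell with grep, the hypothetical helper append_once below adds a line to a file only when that exact line is not already there. The JDK path is the one the JDK was moved to earlier in this post; adjust it if yours differs.

```shell
# Append LINE to FILE only if FILE does not already contain that exact line.
# (append_once is a hypothetical helper name used for illustration.)
append_once() {
    file=$1 line=$2
    # Make sure the file exists so grep does not fail on a missing file.
    touch "$file"
    # -x: whole-line match, -F: fixed string (no regex), -q: quiet.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (single quotes keep $PATH and $JAVA_HOME unexpanded):
#   append_once ~/.bashrc 'export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25'
#   append_once ~/.bashrc 'export PATH=$PATH:$JAVA_HOME/bin'
```

The same helper fits the HBASE_HOME, HIVE_HOME and CATALINA_HOME lines from the other posts. Open a new terminal or reload .bashrc afterwards so the variables take effect.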

Now, JDK 7 has been successfully installed on your Ubuntu 🙂

Installing Eclipse in Ubuntu

Before installing the Eclipse IDE, you need to check a few things.

First, check whether Java is installed by running the following command in your terminal:

hduser@archana:~$ java -version

If the command prints a Java version, Java is installed; otherwise you need to install Java first.

  • First download the Eclipse tar.gz package.
  • Use the command below to extract the tar.gz package.

hduser@archana:~$ tar xzf eclipse-jee-kepler-R-linux-gtk.tar.gz

 Or

right-click eclipse-jee-kepler-R-linux-gtk.tar.gz and choose the “Extract Here” option.

  • Move the extracted eclipse folder into /opt/.

hduser@archana:~$ sudo mv eclipse /opt/

  • Create a desktop file in /usr/share/applications (writing there requires sudo)

hduser@archana:~$ sudo gedit /usr/share/applications/eclipse.desktop

and copy the following into the eclipse.desktop file

[Desktop Entry]
Name=Eclipse
Type=Application
Exec=/opt/eclipse/eclipse
Terminal=false
Icon=/opt/eclipse/icon.xpm
Comment=Integrated Development Environment
NoDisplay=false
Categories=Development;IDE
Name[en]=eclipse.desktop

  •  Create a symlink in /usr/local/bin using

hduser@archana:~$ cd /usr/local/bin

hduser@archana:/usr/local/bin$ sudo ln -s /opt/eclipse/eclipse

  •   Now go to /usr/share/applications and find the eclipse.desktop file for launching Eclipse. You can drag this file to the launcher.
  • Start Eclipse