Namenode not starting

I was using Hadoop in pseudo-distributed mode and everything was working fine. But when I restarted my computer, the NameNode wouldn't start. The only way I could start the NameNode was by formatting it, and I ended up losing the data in HDFS.

  • Make the following changes to start the NameNode

In conf/hdfs-site.xml, you should have a property like the following:

<property>
    <name>dfs.name.dir</name>
    <value>/home/hduser/hadoop/data</value>
</property>

The “dfs.name.dir” property controls where Hadoop writes NameNode metadata. Pointing it at a directory other than /tmp ensures the NameNode data isn’t deleted when you reboot.

Format the NameNode after you change this property:

$ bin/hadoop namenode -format

$ bin/start-all.sh
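
To confirm the NameNode actually came up after the restart, check with jps (the Java process lister that ships with the JDK); NameNode should appear among the listed processes:

$ jps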


Reference:

http://hadoop.apache.org/docs/stable/hdfs-default.html


Pig Installation on Ubuntu

Execution Modes

Pig has two execution modes:

  • Local Mode – To run Pig in local mode, you need access to a single machine; all files are installed and run using your local host and file system. Specify local mode using the -x flag (pig -x local).
  • MapReduce Mode – To run Pig in MapReduce mode, you need access to a Hadoop cluster and an HDFS installation. MapReduce mode is the default; you can, but don’t need to, specify it using the -x flag (pig or pig -x mapreduce).

This pig-0.11.1 installation was done on the following versions of Linux and Hadoop, respectively:

UBUNTU 13.04

HADOOP 1.1.2

I have hduser as a dedicated Hadoop system user, and Hadoop is installed in the /home/hduser/hadoop folder. Now I am going to install Pig in the /usr/lib/pig folder.

  • Download Pig from here.
  • Go to the directory where the stable version was downloaded. By default it downloads to the “Downloads” directory.
$ cd Downloads/
  • Unzip the tar file.
$ tar -xvf pig-0.11.1.tar.gz
  • Create directory
$ sudo mkdir /usr/lib/pig
  • Move pig-0.11.1 to /usr/lib/pig
$ sudo mv pig-0.11.1 /usr/lib/pig/
  • Set the PIG_HOME path in bashrc file

To open the .bashrc file, use this command:

$ gedit ~/.bashrc

In the .bashrc file, append the below two statements:

export PIG_HOME=/usr/lib/pig/pig-0.11.1
export PATH=$PATH:$PIG_HOME/bin
  • Restart your terminal, or reload the file in place with [ source ~/.bashrc ]

Now let’s test the installation

On the command prompt type

$ pig -h

This shows the help for Pig and its various commands.

  • Starting Pig in local mode
$ pig -x local
grunt>
  • Starting Pig in MapReduce mode
$ pig -x mapreduce

                        or

 $ pig
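
Once you are at the grunt> prompt, a quick smoke test (adapted from the Pig getting-started guide referenced below): /etc/passwd is a handy colon-delimited file present on any Ubuntu machine, and in local mode it is read straight from the local filesystem.

grunt> A = load '/etc/passwd' using PigStorage(':');
grunt> B = foreach A generate $0 as id;
grunt> dump B;

Each line of output should be a tuple containing one username.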

Reference:

http://pig.apache.org/docs/r0.10.0/start.html

Note: The information provided here is to the best of my knowledge and experience. If any modifications are to be made, please help me with your valuable suggestions, which are always welcome…. :)

How to add RevolverMaps Widget to your Blog/Website

RevolverMaps

This widget displays all visitor locations as well as recent hits, with city, state, and country information, live and in real time. A click on the enlarge button opens the live statistics page.

Go to the site http://www.revolvermaps.com/ and click on “Get Standard Version”.

Customise the look of your globe by clicking on the round button and changing the Globe, Dimensions, Colors, and Advanced Settings to suit your tastes.


Copy the code from step number 5 [Copy The Code Your Site…]

  •  How to add a widget to WordPress.com?

Login to your WordPress account

  1. Go to ‘My Blog’ – ‘Dashboard’ – ‘Appearance’ – ‘Widgets’
  2. Drag the Element ‘Text – Arbitrary text or HTML’ to the sidebar
  3. Copy the code from the RevolverMaps setup page to the big textbox, optionally add a title
  4. Click on save, you’re done.
  • How to add a widget to a blogger.com (blogspot.com) layout?

Login to your Blogger-account

  1. Choose your blog on the dashboard, click on ‘Layout’. You get an overview of the page elements on your blog.
  2. Click on one of the ‘Add a Gadget’ links, a pop-up opens
  3. Under ‘Basics’ click on ‘HTML/JavaScript’
  4. Paste the code you get at revolvermaps.com into ‘Content’, optionally add a title
  5. Click on ‘SAVE’
  6. Drag the new page element representing the widget to a position of your choice
  7. Click on ‘PREVIEW’, check if the widget fits into your layout. You may have to experiment a little in order to find appropriate size settings for the widget.
  8. Click on ‘SAVE’, you’re done
  • How to add a widget to a website?

Copy the code from the RevolverMaps setup page into your web page html code.

 

Happy blogging 🙂

Install MongoDB on Ubuntu

MongoDB is an open-source document database and the leading NoSQL database, written in C++.

This example uses MongoDB 2.4.6 running on Ubuntu 13.04; both the MongoDB client and server consoles run on localhost, on the same machine.

  • Download MongoDB from here.
  • Go to the directory where MongoDB was downloaded. By default it downloads to the “Downloads” directory.
$ cd Downloads/
  • Unzip the tar file.
$ tar xzf mongodb-linux-i686-2.4.6.tgz
  • Move mongodb-linux-i686-2.4.6 to mongodb
$ sudo mkdir /usr/lib/mongodb

$ sudo mv mongodb-linux-i686-2.4.6 /usr/lib/mongodb/

  • Before you start mongod for the first time, you will need to create the data directory. By default, mongod writes data to the /data/db/ directory. To create this directory and set the appropriate permissions, use the following commands:
$ sudo mkdir -p /data/db

$ sudo chmod 777 /data/db
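
If you would rather keep the data somewhere other than /data/db, mongod also accepts a --dbpath option; a sketch with a hypothetical directory:

$ /usr/lib/mongodb/mongodb-linux-i686-2.4.6/bin/mongod --dbpath /home/hduser/mongo-data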
 

1st command prompt: Start the MongoDB server

$ cd /usr/lib/mongodb/mongodb-linux-i686-2.4.6/bin/

$ ./mongod


2nd command prompt: Start the client

$ cd /usr/lib/mongodb/mongodb-linux-i686-2.4.6/bin/

$ ./mongo

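At the client prompt, a quick sanity test (a minimal sketch using a throwaway collection name):

> db.test.insert({ id: 1, name: "archana" })

> db.test.find()

The find() call should print back the document you just inserted.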

Reference:

http://docs.mongodb.org/manual/tutorial/install-mongodb-on-os-x/

http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/

Note: The information provided here is to the best of my knowledge and experience. If any modifications are to be made, please help me with your valuable suggestions, which are always welcome…. :)

Mount and Unmount USB drive in Ubuntu

Mount the Drive

Step 1: Go to /media

$ cd /media

Step 2: Create the Mount Point

Now we need to create a mount point for the device; let’s say we want to call it “usb-drive”. You can call it whatever you want. Create the mount point:

$ sudo mkdir usb-drive

Step 3: Mount the Drive

We can now mount the drive. Let’s say the device is /dev/sdb1, the filesystem is FAT16 or FAT32, and we want to mount it at /media/usb-drive (having already created the mount point):

 $ sudo mount -t vfat /dev/sdb1 /media/usb-drive

Step 4: Check USB drive contents

 $ ls /media/usb-drive/

Unmount the Drive

$ sudo umount /dev/sdb1

            Or

$ sudo umount /media/usb-drive


NOTE

Error:

umount: /media/usb-drive: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))

Solution: This means that some process has a working directory or an open file handle underneath the mount point. The best thing to do is to close that file (or stop that process) before unmounting.
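
To find the offending process, either of the tools named in the error message will show what is holding the mount point:

$ lsof /media/usb-drive

$ fuser -vm /media/usb-drive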

Note: The information provided here is to the best of my knowledge and experience. If any modifications are to be made, please help me with your valuable suggestions, which are always welcome…. 🙂

Database Access with Apache Hadoop

The DBInputFormat and DBOutputFormat components provided in Hadoop 0.19 finally allow easy import and export of data between Hadoop and many relational databases, allowing relational data to be more easily incorporated into your data-processing pipeline.

To import and export data between Hadoop and MySQL, you need Hadoop and MySQL installed on your machine.

  • My System Configuration

UBUNTU 13.04

JAVA 1.7.0_25

HADOOP 1.1.2

MySQL

Download the mysql-connector-java-5.0.5.jar file, copy it to $HADOOP_HOME/lib, and restart the Hadoop daemons, as shown below.
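
For example, assuming the jar was downloaded to ~/Downloads and $HADOOP_HOME points at your Hadoop installation:

$ cp ~/Downloads/mysql-connector-java-5.0.5.jar $HADOOP_HOME/lib/

$ $HADOOP_HOME/bin/stop-all.sh

$ $HADOOP_HOME/bin/start-all.sh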

  • Database and table creation in MySQL
mysql> create database testDb;

mysql> use testDb;

mysql> create table studentinfo (  id integer ,  name varchar(32) );

mysql> insert into studentinfo values(1,'archana');

mysql> insert into studentinfo values(2,'XYZ');

mysql> insert into studentinfo values(3,'archana');
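
The job writes its word counts to a table named “output” (see Main.java below), and DBOutputFormat only inserts into an existing table, so create it up front as well:

mysql> create table output ( name varchar(32), count integer );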

  • Project Structure

The program contains the following Java files.

Main.java
Map.java
Reduce.java
DBInputWritable.java
DBOutputWritable.java

To access data in the database, we have to create classes that define the records we are going to fetch from and write back to the DB. In this project, the classes DBInputWritable.java and DBOutputWritable.java accomplish this.

DBInputWritable.java

package example;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.ResultSet;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class DBInputWritable implements Writable, DBWritable
{
   private int id;
   private String name;

   public void readFields(DataInput in) throws IOException { }   // unused: records come from the DB, not from HDFS serialization

   public void readFields(ResultSet rs) throws SQLException
   //Resultset object represents the data returned from a SQL statement
   {
     id = rs.getInt(1);
     name = rs.getString(2);
   }

   public void write(DataOutput out) throws IOException { }   // unused: records are written to the DB, not to HDFS

   public void write(PreparedStatement ps) throws SQLException
   {
     ps.setInt(1, id);
     ps.setString(2, name);
   }

   public int getId()
   {
     return id;
   }

   public String getName()
   {
     return name;
   }
}

This class “DBInputWritable” will be used in our Map class. Now let’s write our Mapper class.

Map.java

package example;

import java.io.IOException;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;

public class Map extends Mapper<LongWritable, DBInputWritable, Text, IntWritable>
{
   private IntWritable one = new IntWritable(1);

   // Emit (word, 1) for each whitespace-separated token in the name column
   protected void map(LongWritable id, DBInputWritable value, Context ctx)
   {
     try
     {
        String[] keys = value.getName().split(" ");

        for(String key : keys)
        {
           ctx.write(new Text(key),one);
        }
     } catch(IOException e)
     {
        e.printStackTrace();
     } catch(InterruptedException e)
     {
        e.printStackTrace();
     }
   }
}

DBOutputWritable.java

package example;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.ResultSet;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class DBOutputWritable implements Writable, DBWritable
{
   private String name;
   private int count;

   public DBOutputWritable(String name, int count)
   {
     this.name = name;
     this.count = count;
   }

   public void readFields(DataInput in) throws IOException { }   // unused: records come from the DB, not from HDFS serialization

   public void readFields(ResultSet rs) throws SQLException
   {
     name = rs.getString(1);
     count = rs.getInt(2);
   }

   public void write(DataOutput out) throws IOException { }   // unused: records are written to the DB, not to HDFS

   public void write(PreparedStatement ps) throws SQLException
   {
     ps.setString(1, name);
     ps.setInt(2, count);
   }
}

This class “DBOutputWritable” will be used in our Reduce class. Now let’s write our Reducer class.

Reduce.java

package example;

import java.io.IOException;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;

public class Reduce extends Reducer<Text, IntWritable, DBOutputWritable, NullWritable>
{
   // Sum the counts for each word and write one (name, count) row back to the DB
   protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
   {
     int sum = 0;

     for(IntWritable value : values)
     {
       sum += value.get();
     }

     try
     {
     ctx.write(new DBOutputWritable(key.toString(), sum), NullWritable.get());
     } catch(IOException e)
     {
       e.printStackTrace();
     } catch(InterruptedException e)
     {
       e.printStackTrace();
     }
   }
}

Main.java

package example;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;

public class Main
{
   public static void main(String[] args) throws Exception
   {
     Configuration conf = new Configuration();
     DBConfiguration.configureDB(conf,
     "com.mysql.jdbc.Driver",   // driver class
     "jdbc:mysql://localhost:3306/testDb", // db url
     "root",    // user name
     "hadoop123"); //password

     Job job = new Job(conf);
     job.setJarByClass(Main.class);
     job.setMapperClass(Map.class);
     job.setReducerClass(Reduce.class);
     job.setMapOutputKeyClass(Text.class);
     job.setMapOutputValueClass(IntWritable.class);
     job.setOutputKeyClass(DBOutputWritable.class);
     job.setOutputValueClass(NullWritable.class);
     job.setInputFormatClass(DBInputFormat.class);
     job.setOutputFormatClass(DBOutputFormat.class);

     DBInputFormat.setInput(
     job,
     DBInputWritable.class,
     "studentinfo",   //input table name
     null,
     null,
     new String[] { "id", "name" }  // table columns
     );

     DBOutputFormat.setOutput(
     job,
     "output",    // output table name
     new String[] { "name", "count" }   //table columns
     );

     System.exit(job.waitForCompletion(true) ? 0 : 1);
   }
}

Now you are ready to run the program. Package the code into a .jar file and run it.
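
A minimal sketch of compiling and packaging the classes, assuming the Hadoop core jar lives under /home/hduser/hadoop (adjust the paths to your layout):

$ mkdir classes

$ javac -classpath /home/hduser/hadoop/hadoop-core-1.1.2.jar -d classes example/*.java

$ jar -cvf /home/hduser/DbIpOp.jar -C classes .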

Execute jar file

$ hadoop jar /home/hduser/DbIpOp.jar

Result

mysql> select * from output;

+---------+-------+
| name    | count |
+---------+-------+
| archana |     2 |
| XYZ     |     1 |
+---------+-------+
2 rows in set (0.19 sec)

Reference

http://blog.cloudera.com/blog/2009/03/database-access-with-hadoop/

Sqoop: Exporting Data from HDFS to MySQL

Step 1: Install and start MySQL if you have not already done so

See the MySQL Installation Tutorial for instructions on how to install MySQL.

Step 2: Configure the MySQL Service and Connector

Download the mysql-connector-java-5.0.5.jar file and copy it to the $SQOOP_HOME/lib directory.
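
Assuming the jar was downloaded to ~/Downloads and $SQOOP_HOME points at your Sqoop installation, that is:

$ cp ~/Downloads/mysql-connector-java-5.0.5.jar $SQOOP_HOME/lib/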

Step 3: Sqoop Installation

See the Sqoop Installation Tutorial for instructions on how to install Sqoop.

  • Database and table creation in MySQL

First connect to MySQL

$ mysql -u root -p

Enter password:

Create the database ‘testDb’ and make it the current database.

mysql> create database testDb;

mysql> use testDb;

Create table ‘stud1’

mysql> create table stud1(id integer,name char(20)); 

mysql> exit; 
  • HDFS File ‘student’
$ hadoop dfs -cat /user/hduser/student

1,Archana 

2,XYZ 

Sqoop Export

$ sqoop export --connect jdbc:mysql://localhost/testDb --table stud1 -m 1 --export-dir /user/hduser/student

This example takes the files in /user/hduser/student and injects their contents into the “stud1” table in the “testDb” database. The target table must already exist in the database.

Note:

If you get this error:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user ''@'localhost' to database 'testDb' 

Solution

Grant all privileges on the testDb database to the user:

mysql> grant all privileges on testDb.* to ''@localhost ;

Table Contents in MySQL

mysql> use testDb; 

mysql> select * from stud1; 

+------+----------+ 
| id   | name     | 
+------+----------+ 
| 1    | Archana  | 
| 2    | XYZ      | 
+------+----------+ 
2 rows in set (0.00 sec) 

Reference:

http://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html

Note: The information provided here is to the best of my knowledge and experience. If any modifications are to be made, please help me with your valuable suggestions, which are always welcome…. 🙂