Sqoop:Exporting Data From HDFS to MySQL

Step 1: Install and start MySQL if you have not already done so

MySQL Installation Tutorial for instructions of how to install MySQL.

Step 2: Configure the MySQL Service and Connector

Downloadmysql-connector-java-5.0.5.jar file and copy it to $SQOOP_HOME/lib directory.

Step 3: Sqoop Installation

Sqoop Installation Tutorial for instructions of how to install Sqoop.

  • Database and table creation in MySQL

First connect to MySQL

$ mysql -u root -p

Enter password:

Create database ‘testDb’ and use ‘testDb’ database as a current database.

mysql> create database testDb;

mysql> use testDb;

Create table ‘stud1’

mysql> create table stud1(id integer,name char(20)); 

mysql> exit; 
  • HDFS File ‘student’
$hadoop dfs -cat /user/hduser/student 

1,Archana 

2,XYZ 

Sqoop Export

$sqoop export --connect jdbc:mysql://localhost/testDb --table stud1 -m 1 --export-dir /user/hduser/student

This example takes the files in /user/hduser/student and injects their contents in to the “stud1” table in the testDb” database. The target table must already exist in the database.

Note :

If you will get this

Error

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user ''@'localhost' to database 'testDb' 

Solution

Grant all privileges on testDb database to user:

mysql> grant all privileges on testDb.* to ''@localhost ;

Table Contents in MySQL

mysql> use testDb; 

mysql> select * from stud1; 

+------+----------+ 
| id   | name     | 
+------+----------+ 
| 1    | Archana  | 
| 2    | XYZ      | 
+------+----------+ 
2 rows in set (0.00 sec) 

Reference:

http://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html

Note: The information provided here is best of my knowledge and experiences if at all any modifications are to be made please help me with your valuable suggestions which are always welcome….  🙂

Advertisements

8 Responses to Sqoop:Exporting Data From HDFS to MySQL

  1. Rahul says:

    I am unable to import into hdfs from mysql. it is throwing this error thats hard to fathom.

    sqoop import –connect jdbc:mysql://localhost:3306/test –username root –pass word password –table empmaster –m 1 –target-dir /sql/empmaster
    Warning: /usr/lib/hbase does not exist! HBase imports will fail.
    Please set $HBASE_HOME to the root of your HBase installation.
    Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: $HADOOP_HOME is deprecated.

    14/09/08 08:23:45 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecu re. Consider using -P instead.
    14/09/08 08:23:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    14/09/08 08:23:45 INFO tool.CodeGenTool: Beginning code generation
    14/09/08 08:23:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `empmaster ` AS t LIMIT 1
    14/09/08 08:23:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `empmaster ` AS t LIMIT 1
    14/09/08 08:23:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
    Note: /tmp/sqoop-rahul/compile/d0c54d7bed898bc399c98e20e2eaf930/empmaster.java uses or overrid es a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    14/09/08 08:23:46 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-rahul/compile/d0c5 4d7bed898bc399c98e20e2eaf930/empmaster.jar
    14/09/08 08:23:46 WARN manager.MySQLManager: It looks like you are importing from mysql.
    14/09/08 08:23:46 WARN manager.MySQLManager: This transfer can be faster! Use the –direct
    14/09/08 08:23:46 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    14/09/08 08:23:46 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull ( mysql)
    14/09/08 08:23:46 INFO mapreduce.ImportJobBase: Beginning import of empmaster
    14/09/08 08:23:47 INFO mapred.JobClient: Running job: job_201409080821_0001
    14/09/08 08:23:48 INFO mapred.JobClient: map 0% reduce 0%
    14/09/08 08:23:55 INFO mapred.JobClient: Task Id : attempt_201409080821_0001_m_000000_0, Statu s : FAILED
    java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.Commun icationsException: Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190 )
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException : Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:193)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:162)
    … 9 more
    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failur e

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1121)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:355)
    at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2479)
    at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2516)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2301)
    at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:834)
    at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:47)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:29 4)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:187)
    … 10 more
    Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.(Socket.java:425)
    at java.net.Socket.(Socket.java:241)
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:305)
    … 26 more

    14/09/08 08:23:55 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:23:55 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:24:01 INFO mapred.JobClient: Task Id : attempt_201409080821_0001_m_000000_1, Statu s : FAILED
    java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.Commun icationsException: Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190 )
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException : Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:193)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:162)
    … 9 more
    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failur e

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1121)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:355)
    at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2479)
    at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2516)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2301)
    at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:834)
    at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:47)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:29 4)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:187)
    … 10 more
    Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.(Socket.java:425)
    at java.net.Socket.(Socket.java:241)
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:305)
    … 26 more

    14/09/08 08:24:01 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:24:01 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:24:05 INFO mapred.JobClient: Task Id : attempt_201409080821_0001_m_000000_2, Statu s : FAILED
    java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.Commun icationsException: Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190 )
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException : Communications link failure

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:193)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:162)
    … 9 more
    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failur e

    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not rec eived any packets from the server.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1121)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:355)
    at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2479)
    at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2516)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2301)
    at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:834)
    at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:47)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl .java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcce ssorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:29 4)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:187)
    … 10 more
    Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.(Socket.java:425)
    at java.net.Socket.(Socket.java:241)
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)
    at com.mysql.jdbc.MysqlIO.(MysqlIO.java:305)
    … 26 more

    14/09/08 08:24:05 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:24:05 WARN mapred.JobClient: Error reading task outputConnection refused
    14/09/08 08:24:11 INFO mapred.JobClient: Job complete: job_201409080821_0001
    14/09/08 08:24:11 INFO mapred.JobClient: Counters: 6
    14/09/08 08:24:11 INFO mapred.JobClient: Job Counters
    14/09/08 08:24:11 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=20349
    14/09/08 08:24:11 INFO mapred.JobClient: Total time spent by all reduces waiting after res erving slots (ms)=0
    14/09/08 08:24:11 INFO mapred.JobClient: Total time spent by all maps waiting after reserv ing slots (ms)=0
    14/09/08 08:24:11 INFO mapred.JobClient: Launched map tasks=4
    14/09/08 08:24:11 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
    14/09/08 08:24:11 INFO mapred.JobClient: Failed map tasks=1
    14/09/08 08:24:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 24.5751 seconds (0 byte s/sec)
    14/09/08 08:24:11 INFO mapreduce.ImportJobBase: Retrieved 0 records.
    14/09/08 08:24:11 ERROR tool.ImportTool: Error during import: Import job failed!

    Like

  2. Ashwen says:

    even am facing the same problem… if some one knows plz reply

    Like

    • Shiva says:

      sqoop import –connect jdbc:mysql://localhost:3306/test –username root –password password –table empmaster –m 1 –target-dir /sql/empmaster

      no gap in password arg

      Like

    • Shiva says:

      sqoop import –connect jdbc:mysql://localhost:3306/test –username root –pass word password –table empmaster –m 1 –target-dir /sql/empmaster

      remove the gap in password argument and try

      sqoop import –connect jdbc:mysql://localhost:3306/test –username root –password password –table empmaster –m 1 –target-dir /sql/empmaster

      Like

  3. Atul says:

    I am having requirement to export where 5 – 6 tales with relation to each other and need to select few fields with join and put to HDFS. Please explain how i can do it.

    Like

  4. akash madame says:

    start the multinode first and then import the data

    Like

  5. Avi says:

    Need help on below

    $hdfs dfs -cat file1.txt
    123,2322,asdf,2222
    343,222,”asdfs,sdsd”,3433
    123,1234,”abc””def”,2222

    $sqoop export \
    -m 1 \
    –connect jdbc:mysql://localhost/movielens \
    –username root \
    –password cloudera \
    –table tags \
    –export-dir /user/cloudera/file1.txt \
    –optionally-enclosed-by ‘”‘

    So this will export first 2 records but not the 3rd record.
    Could anyone please help me on the syntax to import the 3rd record as well.

    Thanks

    Like

  6. Avi says:

    Need your help for below task.

    Below is the file i want to load

    [cloudera@quickstart ~]$ hdfs dfs -cat file1.txt
    12,2321,asdfb,123123
    23,3434,rqwerwrwerwe,123212
    34,234234,wewerewrwer,2322322
    45,22222,srwerwerw,121212
    343,222,”wer,ere”,22222
    3333,2222,abc’def’ghi,4444
    76676,23423,asdfghj,12345
    9999,8888,”abc,sdf’d,wer”,32432
    123,2322,”abc””asds”,2222

    Below is the command i used to export

    sqoop export –connect jdbc:mysql://localhost/movielens -m 1 –username root –password cloudera –table tags –export-dir /user/cloudera/file1.txt –optionally-enclosed-by ”

    Now the last line of the file is not exported to table. Could you please help on this syntax?

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: