Changing Default metastore [derby] of Hive to MySQL

Hive comes configured to use Derby as the metastore. , but it can support only one active user at a time and is not certified for production use. The solution to supporting multiple users is to use a standalone database. MySQL is a popular choice for the standalone metastore.

  • Step 1: Install and start MySQL if you have not already done so

MySQL Installation Tutorial for instructions of how to install MySQL.

  • Step 2: Configure the MySQL Service and Connector

Download mysql-connector-java-5.0.5.jar file and copy it to $HIVE_HOME/lib directory.

  • Step 3. Create the Database and User

Create a metastore_db database in MySQL database using root user

$ mysql -u root -p

Enter password:

mysql> CREATE DATABASE metastore_db;

Create a User [hiveuser] in MySQL database using root user

mysql> CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'hivepassword';

mysql> GRANT all on *.* to 'hiveuser'@localhost identified by 'hivepassword';

[here the user hiveuser is same as ConnectionUserName in hive-site.xml file.]

mysql>  flush privileges;
  • Step 4: Install Hive if you have not already done so

Hive Installation Tutorial for instructions of how to install Hive.

  • Step 5: Configure the Metastore Service to Communicate with the MySQL Database

Create hive-site.xml file in $HIVE_HOME/conf directory and add the following configurations:

            <configuration>

            <property>

            <name>javax.jdo.option.ConnectionURL</name>

            <value>jdbc:mysql://localhost/metastore_db?createDatabaseIfNotExist=true</value>

            <description>metadata is stored in a MySQL server</description>

            </property>

            <property>

            <name>javax.jdo.option.ConnectionDriverName</name>

            <value>com.mysql.jdbc.Driver</value>

            <description>MySQL JDBC driver class</description>

            </property>

            <property>

            <name>javax.jdo.option.ConnectionUserName</name>

            <value>hiveuser</value>

            <description>user name for connecting to mysql server </description>

            </property>

            <property>

            <name>javax.jdo.option.ConnectionPassword</name>

            <value>hivepassword</value>

            <description>password for connecting to mysql server </description>

            </property>

            </configuration>
  • That’s all. Now hive will create the schema at the backend. Let us check:

Hive console:

hive> create table hivetesting(id string);

MySql console:

There are 2 ways to access metastore_db

1.      mysql -u root -p
        Enter password:
        mysql> use metastore_db;
        mysql> show tables ;

2.     mysql -u hiveuser -p metastore_db
       Enter password:
       mysql> show tables;

metastore

You can query the metastore schema in your MySQL database. Something like:

mysql> select * from TBLS;

On your MySQL database you will see the names of your Hive tables.

Add a new table and verify that the above query returns updated results.

  • Reference:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_18_4.html

https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-Introduction

9 thoughts on “Changing Default metastore [derby] of Hive to MySQL

  1. Amaresh says:

    Good doc..But I have one problem..
    I have installed mysql in 10.210.51.207 machine.
    Hive in 10.210.51.206 machine.

    I need to access meatadata_db in hive machine from 207 manhine.
    I am getting error: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    javax.jdo.option.ConnectionURL
    jdbc:mysql://localhost:3306/metastore_db
    JDBC connect string for a JDBC metastore

    javax.jdo.option.ConnectionDriverName
    com.mysql.jdbc.Driver
    Driver class name for a JDBC metastore

    javax.jdo.option.ConnectionUserName
    hive

    javax.jdo.option.ConnectionPassword
    hive

    datanucleus.autoCreateSchema
    false

    datanucleus.fixedDatastore
    true

    hive.metastore.uris
    thrift://10.210.51.206:9083
    IP address and port of the metastore host

    Like

  2. Alex says:

    Great Tutorial so far… but when i try to create a table in hive i get the following error:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    I tried to solve it in different ways, but now i haven’t any clue what to do….

    Like

    • abirami says:

      Hi
      Problem: Hive throws above exception when old version of MySQL is used as hive metastore.

      Solution: Set Latin1 as the charset for metastore
      mysql> alter database metastore character set latin1.

      Like

  3. Priyal says:

    Tutorial is great. It works Well but, I am unable to create new table through mysql command prompt or insert values into existing table through mysql prompt.

    Any help is Appreciated.

    Like

  4. Tutorial is good. But I am getting “Empty set (0.00 sec)” in mysql when I hit ‘select * from TBLS;”

    I have created 3 tables in hive using same commands as you mentioned.

    Any Idea what might be wrong here?

    Like

  5. Good Tutorial.
    I have question: In cluster environment where mysql installed (NameNode or all DataNodes) and hive installed in client machine (or) any other nodes in cluster?

    Thanks and Regards
    Hareesh A

    Like

Leave a comment