Setting Up a Database in a Hive Environment

In this post, we’re going to look at how to set up a database along with the tables in Hive.

I’ll use Docker to set up the Hive environment.

Step 1. Clone the following repo.

Step 2. Run docker-compose up -d in the directory that contains docker-compose.yml.

Step 3. Create a bash session in the hive-server container with docker exec -it hive-server bash.

Step 4. Connect to the Hive server with /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000.

Step 5. By default, the Hive server won’t ask for a username or password.

Step 6. Let’s create a database called example_db by running CREATE DATABASE example_db; (Beeline needs the trailing semicolon to execute a statement).

Step 7. Switch to the newly created database with USE example_db;

Step 8. Now let’s create a table called example_table. To do so, use the following command.

CREATE TABLE IF NOT EXISTS example_table (ColA int, ColB int, ColC double)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
TBLPROPERTIES ("skip.header.line.count"="1");
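To make the table definition more concrete, here's a small sketch of a data file that would fit it: three comma-separated columns, one row per line, and a single header line that Hive skips because of skip.header.line.count. The file name example_data.csv and the values are my own placeholders, not part of the original setup.

```shell
# Hypothetical sample data for example_table (file name and values are assumptions).
# The first line is a header; the table property skip.header.line.count=1
# tells Hive to ignore it when the table is read.
cat > example_data.csv <<'EOF'
ColA,ColB,ColC
1,10,0.5
2,20,1.5
3,30,2.5
EOF

# Preview the rows Hive would see once the header line is skipped.
tail -n +2 example_data.csv
```

The columns line up with ColA int, ColB int, and ColC double, so each field should parse cleanly once the file is loaded.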

Step 9. Load your data into example_table. First, make sure the data file has been copied into the Docker container.

# copy the data from the host to the Docker container (run this on the host, outside the container)
docker cp <file_path_in_host> <container_id>:<file_path_in_container>

-- load the data into the Hive table (run this in the Beeline session)
LOAD DATA LOCAL INPATH '<path_to_your_data_in_container>' OVERWRITE INTO TABLE example_table;
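If you'd rather not type Steps 6 to 9 interactively, one option is to put the statements into a script file and let Beeline run it with its -f option. The sketch below assumes the container is named hive-server (as in Step 3) and that the data file already sits at /tmp/example_data.csv inside that container; the file names setup_example.hql and example_data.csv are placeholders I made up.

```shell
# Write the Hive statements from Steps 6-9 into a script file.
# The quoted 'EOF' keeps '\n' literal instead of letting the shell expand it.
cat > setup_example.hql <<'EOF'
CREATE DATABASE IF NOT EXISTS example_db;
USE example_db;
CREATE TABLE IF NOT EXISTS example_table (ColA int, ColB int, ColC double)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
TBLPROPERTIES ("skip.header.line.count"="1");
LOAD DATA LOCAL INPATH '/tmp/example_data.csv' OVERWRITE INTO TABLE example_table;
EOF

# Only talk to Docker if a container named hive-server is actually running.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx 'hive-server'; then
  # Copy the script into the container and run it with Beeline.
  docker cp setup_example.hql hive-server:/tmp/setup_example.hql
  docker exec hive-server /opt/hive/bin/beeline \
    -u jdbc:hive2://localhost:10000 -f /tmp/setup_example.hql
else
  echo "hive-server container is not running; skipped the load step"
fi
```

I added IF NOT EXISTS to the CREATE DATABASE statement so the script can be rerun without failing. Note that with a HiveServer2 connection, LOCAL in LOAD DATA refers to the file system of the machine running the Hive server, which is why the path has to exist inside the hive-server container.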

Step 10. The data has now been loaded into the Hive table.
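A quick way to sanity-check the load is to compare the number of data rows in your file with the row count Hive reports. The sketch below is only an illustration: example_data.csv is a placeholder file name, count_data_rows is a helper I made up, and the Beeline command is left as a comment because it needs the running hive-server container.

```shell
# Hypothetical file name; point DATA_FILE at the CSV you loaded in Step 9.
DATA_FILE="${DATA_FILE:-example_data.csv}"

# Count the data rows in a CSV file, excluding the single header line
# that Hive skips because of skip.header.line.count=1.
count_data_rows() {
  tail -n +2 "$1" | grep -c ''
}

if [ -f "$DATA_FILE" ]; then
  echo "Expected rows in example_table: $(count_data_rows "$DATA_FILE")"
fi

# Compare with what Hive reports (run on the host while hive-server is up):
# docker exec hive-server /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000 \
#   -e 'SELECT COUNT(*) FROM example_db.example_table;'
```

If the two numbers differ, the usual suspects are a missing header line in the file or a delimiter other than a comma.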