How To Install Hive?
Prerequisites for Installing Hive
As mentioned earlier, Apache Hive runs on top of the Hadoop ecosystem, so Hadoop must be up and running with all of its daemons before Hive is installed.
The basic Hadoop daemons are as follows:
- NameNode
- DataNode
- ResourceManager
- NodeManager
To check the Hadoop version, use the command below:
Command: hadoop version (run it in the terminal; it prints the installed Hadoop version.)
To get the Hadoop cluster report, run the command below:
Command: hadoop dfsadmin -report (prints the whole cluster report if the cluster is running; on recent Hadoop releases the preferred form is hdfs dfsadmin -report.)
If Hadoop is not installed on your machine, please follow the Apache instructions to install Hadoop on your system.
Java should also be installed on your system already. To check the Java version, run java -version in the terminal.
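The prerequisite checks above can be combined into one small script. This is only a sketch: the helper name check_tool is made up for illustration, and jps (shipped with the JDK) is used here as one common way to see which Hadoop daemons are running.

```shell
# Report whether a required tool is on the PATH.
# Prints "<tool>: found" or "<tool>: MISSING"; returns non-zero when missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: MISSING"
    return 1
  fi
}

# Print versions only when the tools are present.
check_tool java   && java -version
check_tool hadoop && hadoop version

# jps lists running Java processes; on a working single-node setup it
# should include NameNode, DataNode, ResourceManager and NodeManager.
check_tool jps && jps
```

If any line reports MISSING, install or start that component before continuing with the Hive steps.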
Steps To Install Hive on Ubuntu
The steps to install Hive on Ubuntu are as follows:
Step 1: Download the Hive tar file. It can be downloaded directly from the terminal.
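The download command itself is not reproduced in this guide. A possible sketch follows, assuming version 2.1.0 (the version used in Step 2), the Apache archive site as the download location, and wget as the download tool; the function name download_hive is hypothetical.

```shell
# Assumed release and download location; adjust to the version you want.
HIVE_VERSION="2.1.0"
HIVE_TARBALL="apache-hive-${HIVE_VERSION}-bin.tar.gz"
HIVE_URL="https://archive.apache.org/dist/hive/hive-${HIVE_VERSION}/${HIVE_TARBALL}"

# Download into the current directory unless the file is already there.
download_hive() {
  if [ -f "$HIVE_TARBALL" ]; then
    echo "$HIVE_TARBALL already present, skipping download"
  else
    wget "$HIVE_URL"
  fi
}

echo "Download URL: $HIVE_URL"
```

Calling download_hive in the same terminal starts the download. The existence check avoids fetching the archive twice.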
Step 2: Extract the tar file. The downloaded Hive tar file can be extracted directly in the terminal with the command below.
Command: tar -xzf apache-hive-2.1.0-bin.tar.gz
I suggest verifying the extracted Hive directory with the ls command.
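Extraction and verification can be done in one go. The sketch below assumes the tar file sits in the current directory; the variable names are illustrative.

```shell
HIVE_TARBALL="apache-hive-2.1.0-bin.tar.gz"
# Strip the .tar.gz suffix to get the name of the extracted directory.
HIVE_DIR="${HIVE_TARBALL%.tar.gz}"

if [ -f "$HIVE_TARBALL" ]; then
  tar -xzf "$HIVE_TARBALL"
  # The listing should contain directories such as bin, conf and lib.
  ls "$HIVE_DIR"
else
  echo "$HIVE_TARBALL not found in $(pwd)"
fi
```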
Step 3: Edit the “.bashrc” file to update the environment variables for the user.
Command: gedit ~/.bashrc (any text editor will do; sudo is not required to edit your own .bashrc)
Add the following at the end of the file:
# Set HIVE_HOME
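The lines that belong under this comment are not reproduced here. A minimal sketch of what they typically look like, assuming the archive was extracted into the home directory (adjust the path if it was extracted elsewhere):

```shell
# Set HIVE_HOME
# Assumption: apache-hive-2.1.0-bin was extracted into the home directory.
export HIVE_HOME="$HOME/apache-hive-2.1.0-bin"
# Make the hive and schematool launchers available on the PATH.
export PATH="$PATH:$HIVE_HOME/bin"
```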
Run the command below so that the changes take effect in the current terminal.
Command: source ~/.bashrc
Step 5: Create the Hive directories in HDFS. The ‘warehouse’ directory is where Hive stores the data of its tables; the table metadata is kept separately in the metastore.
- hdfs dfs -mkdir -p /user/hive/warehouse
- hdfs dfs -mkdir /tmp
Step 6: Set read and write permissions on the Hive directories.
The commands below give write permission to the group:
- hdfs dfs -chmod g+w /user/hive/warehouse
- hdfs dfs -chmod g+w /tmp
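To confirm that the permissions were applied, the directory entries can be listed and their mode strings inspected. This is a sketch only; the helper has_group_write is made up for illustration and just checks the sixth character of an ls-style mode string such as drwxrwxr-x.

```shell
# Return success when an ls-style permission string (e.g. drwxrwxr-x)
# grants write access to the group (6th character is "w").
has_group_write() {
  case "$1" in
    ?????w*) return 0 ;;
    *)       return 1 ;;
  esac
}

for dir in /user/hive/warehouse /tmp; do
  # -d lists the directory entry itself; field 1 of that line is the mode.
  perms=$(hdfs dfs -ls -d "$dir" 2>/dev/null | awk -v d="$dir" '$NF == d {print $1}')
  if has_group_write "$perms"; then
    echo "$dir: group-writable ($perms)"
  else
    echo "$dir: not group-writable or not reachable ($perms)"
  fi
done
```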
Configuring Hive: An important part of installing Hive is configuring it to work with Hadoop. This is done in hive-env.sh, a file placed in the $HIVE_HOME/conf directory. To create it, change to the Hive conf folder and copy the template file hive-env.sh.template to hive-env.sh.
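The copy step might look like the sketch below. It assumes HIVE_HOME was exported in Step 3; the function name copy_hive_env_template is hypothetical, and the check for an existing file is an added safety measure rather than part of the original steps.

```shell
# Copy the shipped template to hive-env.sh inside the given conf directory,
# without overwriting an existing hive-env.sh.
copy_hive_env_template() {
  conf_dir="$1"
  if [ ! -f "$conf_dir/hive-env.sh.template" ]; then
    echo "No hive-env.sh.template in $conf_dir" >&2
    return 1
  fi
  if [ -f "$conf_dir/hive-env.sh" ]; then
    echo "$conf_dir/hive-env.sh already exists, leaving it as is"
  else
    cp "$conf_dir/hive-env.sh.template" "$conf_dir/hive-env.sh"
  fi
}

# Assumption: HIVE_HOME was exported in Step 3.
if [ -n "${HIVE_HOME:-}" ]; then
  copy_hive_env_template "$HIVE_HOME/conf" || echo "Template copy skipped"
fi
```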
Step 7: Set the Hadoop path in hive-env.sh
Edit the hive-env.sh file by appending a line that sets HADOOP_HOME to the Hadoop installation directory.
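The exact line depends on where Hadoop is installed, which this guide does not state. One way to append it without creating duplicates is sketched below; the function name set_hadoop_home and the example path in the usage note are assumptions.

```shell
# Append an "export HADOOP_HOME=..." line to the given hive-env.sh
# unless such a line is already present.
set_hadoop_home() {
  env_file="$1"
  hadoop_home="$2"
  if grep -q '^export HADOOP_HOME=' "$env_file" 2>/dev/null; then
    echo "HADOOP_HOME already set in $env_file"
  else
    echo "export HADOOP_HOME=$hadoop_home" >> "$env_file"
  fi
}
```

For example, set_hadoop_home "$HIVE_HOME/conf/hive-env.sh" /usr/local/hadoop would add the line for a Hadoop installation under /usr/local/hadoop (a hypothetical location).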
With this, the Hive installation is almost complete. What remains is to configure the metastore. It can be backed by an external database server, but by default the Apache Hive framework uses the Derby database. The command below, run from the $HIVE_HOME directory, initializes the Derby database.
Command: bin/schematool -initSchema -dbType derby
Step 8: Launch Hive.
Command: hive (type hive in the terminal; the Hive shell opens within a few seconds.)
Working with Hive: Now we will look at some basic operations in Hive. To see how many tables exist in the default database, run show tables; right after a fresh installation it lists nothing, which means there are no tables in the default database yet.
Before creating a table in Hive, it is important to select the required database; otherwise the table is created in the default database.
Important commands in Hive
1: show databases (shows all databases that have been created so far).
2: create database if not exists mydb (creates a database named ‘mydb’ if it does not already exist; if ‘mydb’ already exists, no error is raised).
3: use database (before running DDL commands against a particular database, select it with the “use” command; in our case we have already created ‘mydb’, so the command is use mydb).
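The three commands above can also be run non-interactively by saving them to a script and passing it to hive -f. The sketch below assumes hive is on the PATH; the script path /tmp/hive_db_demo.hql is an arbitrary choice.

```shell
# Write the statements to a script file so they can be replayed with hive -f.
cat > /tmp/hive_db_demo.hql <<'EOF'
show databases;
create database if not exists mydb;
use mydb;
show tables;
EOF

if command -v hive >/dev/null 2>&1; then
  hive -f /tmp/hive_db_demo.hql
else
  echo "hive not found on PATH; script saved to /tmp/hive_db_demo.hql"
fi
```

The same statements can be typed one by one at the hive> prompt; each must end with a semicolon.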
Important Hive DDL commands
CREATE, DROP, TRUNCATE, SHOW, DESCRIBE.
- CREATE: the CREATE statement is used to create a database or a table in Hive.
Example: hive> create database Company; (creates the database)
hive> create table employee (id int, name string, salary string); (creates the table employee in the database Company, provided use Company; has been run first; otherwise the table is created in the default database.)
- DESCRIBE provides information about the schema of the table.
hive> describe employee; (shows the column names and types of the employee table)
- TRUNCATE deletes all the data in the table but keeps the table definition.
hive> truncate table employee;
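Put together, the DDL commands listed above form a short end-to-end session. The sketch below saves them to a script and runs it with hive -f; it adds use Company; so the table lands in the intended database, and ends with a drop table statement, which is listed above but not shown in the examples. The script path is an arbitrary choice.

```shell
# One statement per line, in the order they would be typed at the hive> prompt.
cat > /tmp/hive_ddl_demo.hql <<'EOF'
create database if not exists Company;
use Company;
create table if not exists employee (id int, name string, salary string);
show tables;
describe employee;
truncate table employee;
drop table employee;
EOF

if command -v hive >/dev/null 2>&1; then
  hive -f /tmp/hive_ddl_demo.hql
else
  echo "hive not found on PATH; script saved to /tmp/hive_ddl_demo.hql"
fi
```

Remove the last line of the script if the employee table should be kept for further experiments.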
Hive can be installed on Windows as well, but as a best practice I prefer Ubuntu: it is closer to a production environment, and as your data grows it will be easier to manage.
This has been a guide to installing Hive. We have discussed the steps to install Hive, the important DDL commands, and more.