Introduction to Hive Database
Hive Databases is providing the facility to store and manage the huge records or datasets on top of a distributed Hadoop platform. It is having the capability to store the structure and semi-structured data. With the help of JDBC Driver, the end-user is able to connect with the Hive database and fetch the data/records with the support of SQL Query. In the Hive, the database framework is developed by Facebook to analyze structured data or semi-structured data. The database framework supports all the regular commands like create, drop, alter, use database.
We can also use the database layer with different Hadoop components like Hbase, Kafka, spark, different streaming tools, etc.
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database name
[COMMENT ‘database information | database comment’ ] [LOCATION ‘ HDFS Path’ ] [WITH DATABASE PROPERTIES (property name= property value, ...)];
How to Create Database in Hive?
We can create the database as similar to the SQL one. But the major thing is whether the tables are created in the internal table format or external table format. If all the tables are created in the internal table format then the database is also called an internal database. If all the tables are created in external table format then the database is also called an external database. In the database, the data is stored in a tabular manner. The default storage location of the hive database varies from the hive version. From HDP 3.0, we are using version 3.0 and more. The default database location was changed. The location for external hive database is “/warehouse/tablespace/external/hive/” and the location for manage database is “/warehouse/tablespace/managed/hive”.
In the older version of the hive, the default storage location of the hive database is “/apps/hive/warehouse/”.
Query to Create Database
create database organization;
When you need to access the hive DB. We need to establish the connection between the hive client and the hive server. To establish the connection we need to configure the hive JDBC / ODBC connection string in hive client and try to establish the connection in the hive server. With the help of Hive view, Hive CLI, beeline (interactive and non-interactive), or different third party client software we can able to connect with the database.
- JDBC and ODBC are two different ways to connect with Hive. Few applications like MSTR support ODCB connection only. We need to configure the MSTR “ini” file. Few applications like Zeppelin, squirrel client, support the JDBC connection.
- With the help of Knox gateway, we can able to connect with Database.
- Hive View is only available in Ambari 2.6 or less. From version 2.7 and above Hive view is not available.
How to Modify or Alter Database?
We have the facility to alter or modify the existing database. We can also change the existing database comments, database properties, etc of the database as well.
We can also alter the database as below
1. Alter in Database Properties
ALTER ( DATABASE | SCHEMA ) database name SET database properties ( property name = property value, ...);
2. Alter the Owner
ALTER (DATABASE| SCHEMA) database name SET OWNER [USER|ROLE ] user or role;
3. Alter the HDFS Location
ALTER (DATABASE|SCHEMA ) database name SET LOCATION HDFS Path;
Hive Query to Alter Database
ALTER DATABASE organization SET DBPROPERTIES ('Date' = '2020-03-31');
How to Drop Database in Hive?
In the hive, we can drop or delete the database form from the hive system. By default, there is a restriction on the drop database command. We cannot drop the exiting database with subject to the database is empty. If the database is empty then only we can drop the database. If we have the tables in the database then there is a restriction on the drop. There are two ways to drop the database having the tables in it.
- First Way: We need to drop all the tables that are present in the current database. Then we can drop the current hive database.
- Second Way: If we want to drop the hive database without dropping the tables from the current database. Then we can use the “CASCADE” keyword in the drop query. With the help of “CASCADE”, we can drop the hive database without touching the internal tables of the database.
Hive Drop Query with Empty Database
drop database organization;
In the above screenshot 1, we are able to see the hive database “organization” is empty. We can easily drop the database using a hive query.
Hive Drop Query with Table Contain
drop database company CASCADE;
As per the above screenshot, we can see the “company” DB is having two tables. (table1 : products and table2 : sales).
If we will directly use the drop command “drop database company;” to drop the company DB. Then the command will not work due to tables are present in it. Please refer screenshot 3
When we will use the “CASCADE” keyword in the query. Then we can able to drop the database. Please refer to screenshot 4.
We have seen the uncut concept of “Hive Database” with the proper example, explanation, syntax, SQL Query with different screenshots. This framework resides on top of Hadoop. It will summarize the huge datasets/records and do querying on it. Helpful for easy analysis on top of it. If the datasets will increase we can add the number of machines to it without impacting the performance.
This is a guide to Hive Database. Here we discuss the Introduction and how to Create a Hive Database along with practical examples and different subquery expressions. You can also go through our related articles to learn more –