Introduction to Hive Table
In the hive, the tables are consisting of columns and rows and store the related data in the table format within the same database. The table is storing the records or data in tabular format. The tables are broadly classified into two parts i.e.; external table and internal table.
The default storage location of the Table varies from the hive version. From HDP 3.0, we are using hive version 3.0 and more. The default Table location was changed from HDP 3.0 version / Hive version 3.0. The location for external hive Table is “/warehouse/tablespace/external/hive/” and the location for manage Table is “/warehouse/tablespace/managed/hive”.
In the older version of the hive, the default storage location of hive Table is “/apps/hive/warehouse/”.
Syntax of Hive Table
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [ database name ] table name
[( column name data type [ COMMENT column comment], ...)]
[ COMMENT table comment]
[ ROW FORMAT row format]
[ STORED AS file format]
How to create Table in Hive?
As per the requirement, we can create the tables. We can broadly classify our table requirement in two different ways;
- Hive internal table
- Hive external table
1. Internal Table
The internal table is also called a managed table and it is own by “hive” only. Whenever we are creating the table without specifying the keyword “external” then the tables will create in the default location.
If we will drop the internal or manage table then the table DDL, metadata information, and table data will be lost. The table data is available on HDFS it will also lose. We should be very careful while dropping any internal or manage the table.
DDL Code for Internal Table
create table emp.customer
(
idint,
first_name string,
last_name string,
gender string,
company_name string,
job_title string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
location "/emp/table1"
tblproperties ("skip.header.line.count"="1");
Output:
2. External Table
The best practice is to create an external table. Many organizations are following the same practice to create tables. It does not manage the data of the external table and the table is not creating in the warehouse directory. We can store the external table data anywhere on the HDFS level.
The external tables having the facility to recover the data i.e. if we will delete/drop the external table. Still no impact on the external table data present on the HDFS. It will only drop the metadata associated with the table.
If we will drop the internal or manage table then the table DDL, metadata information, and table data will be lost. The table data is available on HDFS it will also lose. We should be very careful while dropping any internal or manage the table.
DDL Code for External Table
create external table emp.sales
(
idint,
first_name string,
last_name string,
gender string,
email_id string,
city string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
location "/emp/sales"
tblproperties ("skip.header.line.count"="1");
Output:
How to modify/alter the Table?
Here we have the facility to alter or modify the existing attributes of the Table. With the help of “alter” functionality, we can change the column name, add the column, drop the column, change the column name, replace the column.
We can alter the below Table attributes
1. Alter/ rename the tablename
Syntax:
ALTER TABLE [current table name] RENAME TO [new table name]
Query to Alter Table Name :
ALTER TABLE customer RENAME TO cust;
Output:
Before alter
After alter
2. Alter/ add column in the table
Syntax:
ALTER TABLE [current table name] ADD COLUMNS (column spec[, col_spec ...])
Query to add Column :
ALTER TABLE cust ADD COLUMNS (dept STRING COMMENT 'Department');
Output:
Sample view of the table
We are adding a new column in the table “department = dept”
3. Alter/change the column name
Syntax:
ALTER TABLE [current table name] CHANGE [column name][new name][new type]
Query to change column name :
ALTER TABLE cust CHANGE first_name name string;
Output:
Sample view of the customer table.
Now we are changing the column name “first_name” to “name”
How to drop the Table?
Here we can drop or delete the table form the system. When you will drop/delete the table form the hive database, the table entry will delete it from hive metastore. If it is an internal table then the table and data will complete delete. If it is an external table then the table entry will delete it from metastore but the data is available on HDFS Level.
Drop Internal or External Table
Syntax:
DROP TABLE [IF EXISTS] table name;
Drop Query:
drop table cust;
Output:
Before drop query run
After drop query run on “cust” table.
Conclusion
We have seen the uncut concept of “Hive Table” with the proper example, explanation, syntax, SQL Query with different outputs. The table is useful to store the structure data. The table data is helpful for various analysis purposes like BI, reporting, helpful/easy in data slicing and dicing, etc. The internal table is managed and the external table is not managed by the hive. As per the requirement, we can choose which type of table we need to create.
Recommended Articles
This is a guide to Hive Table. Here we discuss the concept of “Hive Table” with the proper example, explanation, syntax, SQL Query. You may also have a look at the following articles to learn more –
2 Online Courses | 5 Hands-on Projects | 25+ Hours | Verifiable Certificate of Completion
4.5
View Course
Related Courses