EDUCBA

Dynamic Partitioning in Hive

By Arpit Anand

Introduction to Dynamic Partitioning in Hive

Partitioning is an important concept in Hive: it splits a table's data into separate directories according to the values of one or more partition columns. With dynamic partitioning, a single insert statement populates many partitions of a partitioned table at once, and Hive works out the partition of each row from the data itself. We do not need to create each partition explicitly beforehand. Because every distinct partition value gets its own sub-directory, a dynamic-partition insert can create a large number of sub-directories.

Syntax

To enable dynamic partitioning, we use the following Hive commands:

set hive.exec.dynamic.partition = true;

This enables dynamic partitioning for the current Hive session.

set hive.exec.dynamic.partition.mode = nonstrict;

This sets the mode to non-strict. In non-strict mode, all partition columns may be dynamic. In strict mode, at least one partition column must be given a static value.

Dynamic partitioning is sometimes called variable partitioning: the partitions are not defined before execution but are created at run time, one for each distinct value found in the partition columns of the data being inserted.

In a dynamic-partition insert, every row is read and routed to its partition by a MapReduce job. Early Hive releases (before 0.9.0) shipped with dynamic partitioning disabled by default to prevent partitions from being created accidentally; strict mode serves the same protective purpose.
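To make the two modes concrete, here is a minimal sketch. It assumes two hypothetical tables, sales_stage and sales_part, that are not part of this article's example, and it assumes dynamic partitioning is already enabled for the session.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE sales_stage (id INT, amount DOUBLE, country STRING, state STRING);

CREATE TABLE sales_part (id INT, amount DOUBLE)
PARTITIONED BY (country STRING, state STRING);

-- Accepted in strict mode: country is static,
-- only state is resolved from the last SELECT column.
INSERT INTO TABLE sales_part PARTITION (country = 'US', state)
SELECT id, amount, state
FROM sales_stage
WHERE country = 'US';

-- Needs non-strict mode: every partition column is dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT INTO TABLE sales_part PARTITION (country, state)
SELECT id, amount, country, state
FROM sales_stage;
```

In both statements the dynamic partition columns come last in the SELECT list, in the same order as in the PARTITION clause, because Hive matches them by position rather than by name.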

To use dynamic partitioning, we need to set some properties, either in the Hive session or in the Hive configuration XML file (hive-site.xml).

<property>
<name>hive.exec.dynamic.partition</name>
<value>true</value>
<description>Enables dynamic partitioning in Hive.</description>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
<description>Non-strict mode allows all partition columns to be dynamic; no static partition is required.</description>
</property>
<property>
<name>hive.exec.max.dynamic.partitions</name>
<value>1000</value>
<description>Maximum number of dynamic partitions that one statement may create in total.</description>
</property>
<property>
<name>hive.exec.max.dynamic.partitions.pernode</name>
<value>100</value>
<description>Maximum number of dynamic partitions that each mapper or reducer node may create.</description>
</property>

With these values, we tell Hive to create partitions dynamically from the values in the partition columns, within the configured limits. Compared to static partitioning, dynamic partitioning generally takes more time to load the data, and the data is usually loaded from a non-partitioned table. Dynamic partitioning works with both managed and external tables.
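The same four properties can also be set for a single session instead of in the configuration file. The following sketch simply reuses the property names and values from the XML above:

```sql
-- Session-level equivalents of the hive-site.xml properties.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=100;

-- SET with only a property name prints its current value.
SET hive.exec.dynamic.partition.mode;
```

Session settings last only until the session ends, which makes them a convenient way to raise a limit for one large load without changing the cluster-wide defaults.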

How Does Dynamic Partitioning Work?

Let us look at an example of how dynamic partitioning works:

  • First, create a non-partitioned table to hold the data; this can be a staging table.
  • We will use a student table, stud_demo, as our example:

Query:

Create table stud_demo (id int, name string, age int, institute string, course string)
row format delimited fields terminated by ',';

  • Load the data into the table from an external source, such as a text file:

LOAD DATA local inpath 'path name' into table stud_demo;

  • Now create the partitioned table into which we will insert the data using dynamic partitioning.

Query:

Create table student_part (id int, name string, age int, institute string)
Partitioned by (course string)
Row format delimited fields terminated by ',';

Note: Dynamic partitioning must be enabled, as described above, before data is inserted into this table.
  • Once the table is created, we can list its partitions with the following command. The list stays empty until data has been inserted:

SHOW PARTITIONS student_part;

  • Insert the data, naming the partition column in the PARTITION clause and selecting it as the last column:

Insert into table student_part partition(course)
Select id, name, age, institute, course from stud_demo;

  • With this query, Hive inserts the data and dynamically creates one partition of student_part for each distinct value of the course column.
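After the insert, the result can be checked with a couple of queries. This is a small sketch against the student_part table created above; the column alias students is only an illustrative name.

```sql
-- List the partitions created by the dynamic-partition insert.
-- Each line of output has the form course=<value>.
SHOW PARTITIONS student_part;

-- Count the rows that landed in each partition.
SELECT course, COUNT(*) AS students
FROM student_part
GROUP BY course;
```

Each partition is stored in its own sub-directory of the table location, named course=<value>, which is why the partition column does not appear in the data files themselves.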

Advantages of Dynamic Partition

  • Well suited to loading large files into tables.
  • Data is read row by row, and each row is routed to its partition.
  • Partitions are created only for the values that actually occur in the data.
  • Generally used to load data from a non-partitioned table, such as a staging table.
  • Useful when the set of partition values is not known in advance.
  • The data load is distributed horizontally across partitions.
  • Queries that filter on the partition column generally run faster, because Hive reads only the matching partitions.
  • The partition column values are determined at run time.
  • Works with both external and managed tables.
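The reduction in query time comes from partition pruning: a filter on the partition column lets Hive skip every other partition. The sketch below assumes that 'Hive' is one of the course values in the data, which is purely illustrative.

```sql
-- 'Hive' is a hypothetical course value, used only for illustration.
-- Only the directory of partition course=Hive needs to be scanned.
SELECT id, name, institute
FROM student_part
WHERE course = 'Hive';

-- EXPLAIN DEPENDENCY reports the tables and partitions a query would read,
-- which is one way to confirm that pruning takes place.
EXPLAIN DEPENDENCY
SELECT id, name, institute
FROM student_part
WHERE course = 'Hive';
```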

Disadvantages of Dynamic Partition

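  • Loading data generally takes longer than with static partitioning.
  • Dynamic partitions are created only by insert statements; ALTER TABLE ... ADD PARTITION always needs explicit partition values.
  • A large number of partitions creates overhead for the NameNode, which must track every directory and file.
  • Query processing can sometimes take longer, for example when the data ends up in many small partitions.
  • It can be a costly operation, since every row must be read to determine its partition.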
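For comparison, this is roughly what the static alternative looks like for a single partition of the same tables. The course values 'Hive' and 'Spark' are hypothetical and used only for illustration.

```sql
-- Static partitioning: the partition value is fixed in the statement,
-- so the partition column is not part of the SELECT list.
INSERT INTO TABLE student_part PARTITION (course = 'Hive')
SELECT id, name, age, institute
FROM stud_demo
WHERE course = 'Hive';

-- Creating an empty partition ahead of time also takes an explicit value.
ALTER TABLE student_part ADD IF NOT EXISTS PARTITION (course = 'Spark');
```

With static partitioning, one such statement is needed per partition value, which is the trade-off against the longer load time of a single dynamic-partition insert.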

Conclusion

In this article, we saw what dynamic partitioning in Hive is, how to enable it, and how to use it to load a partitioned table from a non-partitioned one. We also looked at its advantages and disadvantages compared with static partitioning.
