EDUCBA Logo

EDUCBA

MENUMENU
  • Explore
    • EDUCBA Pro
    • PRO Bundles
    • Featured Skills
    • New & Trending
    • Fresh Entries
    • Finance
    • Data Science
    • Programming and Dev
    • Excel
    • Marketing
    • HR
    • PDP
    • VFX and Design
    • Project Management
    • Exam Prep
    • All Courses
  • Blog
  • Enterprise
  • Free Courses
  • Log in
  • Sign Up
Home Data Science Data Science Tutorials Hive Tutorial Hive Inner Join
 

Hive Inner Join

Priya Pedamkar
Article byPriya Pedamkar

Hive Inner Join

Introduction to Hive Inner Join

We will get the appropriate records or data from the two or more different hive tables and get the resulted new hive table in hive inner join. But the selected columns in the join condition having the common value or same data type. In an inner join, we can consider two common columns (having the same datatype or same value) from the two different tables and use the join condition. We can join or combine the records of two tables and get the new join hive table. If we are writing the simply “join” in the join query its nothing but the “inner join” only.

 

 

Syntax:

Watch our Demo Courses and Videos

Valuation, Hadoop, Excel, Mobile Apps, Web Development & many more.

Hive_join_table:
table_reference [INNER] JOIN table_factor [join_condition] | table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_referencejoin_condition
| table_reference LEFT SEMI JOIN table_referencejoin_condition
| table_reference CROSS JOIN table_reference [join_condition]

How does Inner Join work in Hive?

In the hive, we are getting the combined result of two hive tables with subject to common or same column values of both the tables in the join condition.

How does Inner Join work in Hive

As per the above image, we have two hive tables “table1” and “table2”. Table1 having different columns with “customer_id”. Table2 having different columns with “product_id”. Here, “customer_id” and “product_id” having the same value (like 1,2,3.. etc.) or the same data type. In an inner join, the result will come with a combination of both the matching id of customer_id and product_id. If the values match both the table columns of “customer_id” and “product_id” then only the result/records will come or display. Otherwise, if the value is present in “customer_id” but the same id is not present in “product_id” then the resultant output will not come or vice a versa.

Hive Version 0.13.0

The hive version 0.13.0, it allows the “FROM” clause to join the multiple tables with comma-separated. Moreover, we can omit the “join” keyword in the query.

Code:

SELECT * FROM table1 t1, table2 t2 WHERE t1.customer_id = t2.product_id;

In the latest versions of the hive, we can use the “JOIN” keyword.

  • Note #1: In Hive, the query will convert the joins over multiple tables, and we want to run a single map/reduce job. Then it is mandatory that the same column should be used in the join clause.
  • Note #2: If we use the different and multiple columns in the same join clause, the query will execute with the multiple map / reduce jobs.
  • Note #3: In the hive, every map / reduce stage of the join query. The last table in the sequence and it’s streamed through the reducers whereas the others are buffered.

Explanation

We have two tables (table name: -sales and products) in the “company” database of the hive.

Below are the lists of fields/columns in the “sales” table:

  • ID (data type “int”)
  • First Name (data type “string”)
  • Last Name (data type “string”)
  • Gender (data type “string”)
  • Email_ID (data type “string”)
  • City (data type “string”)

Below are the lists of fields/columns in the “Products” table:

  • customer_id (data type “int”)
  • product_name (data type “string”)

1. DDL Code for “sales” table

Code:

create external table company.sales
(
id int,
first_name string,
last_name string,
gender string,
email_id string,
city string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
tblproperties ("skip.header.line.count"="1");

Output: We have 1000 records in the “sales” table (manually loaded the data).

Hive Inner Join - 2

2. DDL Code for “Products” table

Code:

create external table company.products
(
customerid int,
product_name string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
tblproperties ("skip.header.line.count"="1");

We have 1000 records in the “products” table (manually loaded the data).

Hive Inner Join - 3

Sample “sales” table view

Hive Inner Join - 4

Sample “products” table view

products

Example to Implement Hive Inner Join

In the hive, we have “company” DB. It contains two different tables,’ i.e. sales table and product table in the above two tables sales and products. We have seen two columns having common data types as well as common value. In sales tables – “id” is present and product tables – “customerid” is present. Both the column are referencing to each other. With the help of sales and product tables, we will get the information from those users who buy with products. Now we need complete records or data of those users who buy which products.

SQL Query with Using “JOIN Clause”

Below is the SQL Query

Code #1

SELECT * FROM sales s JOIN products p ON (s.id = p.customerid) limit 10;

Output:

SQL Query 1

Code #2

SELECT s.id, s.first_name, s.last_name, p.customerid, p.product_name FROM sales s JOIN products p ON (s.id = p.customerid) limit 10;

Output:

SQL Query 2

Note: In query 1 and query 2, we will get the same output. But in query 1, we have added “*” to get the complete fields or data. But in query 2, we have customized the query and keep the necessary or required fields.

SQL Query without Using “JOIN Clause”

Code:

select * from sales s1, products p1 where s1.id = p1.customerid  limit 10;

Output:

SQL Query without Using “JOIN Clause”

Note: We Are Using Hive Version 3.1

Advantages of using Hive Inner Join

  • The major advantage of hive join is to get the different tables data in a single query.
  • No need to add the same or common columns fields in the table.
  • Get the result faster.
  • Less data store in the indivisible table
  • In the hive, mapper and reduces are using to execute the join query. It will minimize the cost of processing and storing the data.
  • The map side joins help improve the job or query or application’s performance, and it will take less time to execute.

Conclusion

We have seen the uncut concept of “Hive Inner Join” with the proper example, explanation, syntax, and code. With the help of the “JOIN” condition, we can get the data or records from two or more different columns. No need to keep or add the same files in multiple tables. We need to keep the separate table and join with the necessary tables with common join vales or data types.

Recommended Articles

This is a guide to Hive Inner Join. Here we discuss how does inner join works, hive version 0.13.0 with explanation and examples to implement. You can also go through our other related articles to learn more –

  1. Indexes in Hive
  2. Bucketing in Hive
  3. Hive Cluster By
  4. Date Functions in Hive

Primary Sidebar

Footer

Follow us!
  • EDUCBA FacebookEDUCBA TwitterEDUCBA LinkedINEDUCBA Instagram
  • EDUCBA YoutubeEDUCBA CourseraEDUCBA Udemy
APPS
EDUCBA Android AppEDUCBA iOS App
Blog
  • Blog
  • Free Tutorials
  • About us
  • Contact us
  • Log in
Courses
  • Enterprise Solutions
  • Free Courses
  • Explore Programs
  • All Courses
  • All in One Bundles
  • Sign up
Email
  • [email protected]

ISO 10004:2018 & ISO 9001:2015 Certified

© 2025 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

By continuing above step, you agree to our Terms of Use and Privacy Policy.
*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA Login

Forgot Password?

🚀 Limited Time Offer! - 🎁 ENROLL NOW