HiveQL Group By

Article by Priya Pedamkar

Updated February 28, 2023

Introduction to HiveQL Group By

The HiveQL GROUP BY clause groups the rows of a Hive table by the values of the columns listed in the clause and returns one output row per group instead of one row per record. Grouping works at the column level, and it is usually combined with aggregate functions; a single select query can use several different aggregate functions at once.

Types of Aggregate Functions

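GROUP BY is usually paired with an aggregate function in the select statement, although Hive does not require one: a query that groups without aggregating simply returns one row per distinct group. Below are five commonly used aggregate functions that we can use in a select statement with group by.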

  • Maximum (MAX)
  • Minimum (MIN)
  • Count (COUNT)
  • Average (AVG)
  • Sum (SUM)
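As a minimal sketch, the five functions listed above can be combined in one select statement. It assumes the “emp_group_by” table described later in this article; the column aliases are illustrative names that do not appear in the original.

```sql
-- One row per department, with all five aggregates computed per group.
-- The aliases (employees, min_salary, ...) are illustrative only.
SELECT department,
       COUNT(*)    AS employees,
       MIN(salary) AS min_salary,
       MAX(salary) AS max_salary,
       AVG(salary) AS avg_salary,
       SUM(salary) AS total_salary
FROM emp_group_by
GROUP BY department;
```

Each aggregate is evaluated independently over the same group, so adding more functions does not change the number of rows returned.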

Syntax of HiveQL Group By:

SELECT [ALL | DISTINCT] select_expr1, select_expr2, ..., select_expr_n
FROM table_name
[WHERE where_condition] [GROUP BY column_list] [HAVING having_condition] [ORDER BY column_list] [LIMIT number];
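To make the clause order concrete, here is a hedged sketch that fills in every optional clause of the syntax above. It assumes the “emp_group_by” table used in the examples of this article; the location value 'Pune' and the threshold of 10 are hypothetical and only illustrate where each condition goes.

```sql
-- WHERE filters rows before grouping; HAVING filters groups after aggregation.
-- 'Pune' and 10 are made-up values for illustration.
SELECT department, COUNT(*) AS employees
FROM emp_group_by
WHERE location = 'Pune'
GROUP BY department
HAVING COUNT(*) >= 10
ORDER BY employees DESC
LIMIT 5;
```

The practical difference is that WHERE cannot reference an aggregate result, while HAVING exists for exactly that purpose.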

How Does a HiveQL Group By Query Work?

GROUP BY collects the rows that share the same values in the grouping columns, and each aggregate function in the select statement (MAX, MIN, COUNT, AVG, SUM) is then computed once per group. The query returns one row per group. Every column in the select list must either appear in the group by clause or be wrapped in an aggregate function; otherwise, Hive rejects the query. If the select statement has no aggregate function at all, the query still runs and returns the distinct combinations of the grouping columns.
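The following sketch illustrates the grouping rule in Hive, assuming the “emp_group_by” table from the examples in this article. The first query groups without any aggregate function; the second, left commented out, shows the kind of select list that is expected to fail.

```sql
-- Grouping without an aggregate: one row per distinct department.
SELECT department
FROM emp_group_by
GROUP BY department;

-- Not valid: salary is neither grouped nor aggregated, so Hive is expected
-- to reject this with an "Expression not in GROUP BY key" style error.
-- SELECT department, salary FROM emp_group_by GROUP BY department;
```

The exact error text may vary between Hive versions, so treat the message in the comment as an approximation.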

Examples to Implement HiveQL Group By

Below are the examples of HiveQL Group By:

Explanation:

We have a Hive table named “emp_group_by” in the “emp” database. Below is the list of fields/columns in the “emp_group_by” table.

  • Adhar Number
  • First Name
  • Last Name
  • Department
  • Salary
  • Location

The table holds 1000 employee records. We will look at different cases of “group by” with different aggregate functions, along with the SQL query and output for each.

DDL Code for “emp_group_by” Table

Code:

CREATE EXTERNAL TABLE emp_group_by
(
adhar_no INT,
first_name STRING,
last_name STRING,
department STRING,
salary FLOAT,
location STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
TBLPROPERTIES ("skip.header.line.count"="1");

Output:

We have 1000 records in the above table (manually loaded the data).

[Screenshot: HiveQL Group By Example 1]
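The article does not show how the data was loaded. One common way, sketched here under assumptions, is a LOAD DATA statement followed by a row count check. The file path is hypothetical; the file is assumed to be a comma-separated file with a header row, matching the DDL above.

```sql
-- Hypothetical path: replace with the real location of the CSV file.
LOAD DATA LOCAL INPATH '/tmp/emp_group_by.csv' INTO TABLE emp_group_by;

-- Sanity check: the article states that the table holds 1000 records.
SELECT COUNT(*) FROM emp_group_by;
```

Because the table sets "skip.header.line.count" to 1, the header line of the file should not be counted as a record.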

Sample “emp_group_by” Table View

Code:

select * from emp_group_by;

Output:

[Screenshot: HiveQL Group By Example 2]

1. Group By with Aggregate Function “MAX”

We have 1000 records in the “emp_group_by” table, and we need the maximum salary in each department. The SQL query below selects the “department” column and applies the “MAX” aggregate function to the “salary” column. The “group by” clause uses the “department” column, so the query returns one row per department with that department’s maximum salary.

Query:

select department,MAX(salary) from emp_group_by group by department;

Output:

[Screenshot: MAX Example 1]
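A small variation on the query above, given as a sketch: naming the aggregate column with an alias and sorting on it makes the highest-paying departments appear first. The alias max_salary is an illustrative name, not part of the original example.

```sql
-- Same grouping as above, with a named result column and a sort order.
SELECT department, MAX(salary) AS max_salary
FROM emp_group_by
GROUP BY department
ORDER BY max_salary DESC;
```

Without ORDER BY, the order of the grouped rows should not be relied on.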

2. Group By with Aggregate Function “MIN”

We need the minimum salary in each department of the “emp_group_by” table. The SQL query below selects the “department” column and applies the “MIN” aggregate function to the “salary” column. The “group by” clause uses the “department” column, so the query returns one row per department with that department’s minimum salary.

Query:

select department,MIN(salary) from emp_group_by group by department;

Output:

[Screenshot: MIN Example 2]
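Since MIN and MAX operate on the same groups, they can also appear in one expression. The sketch below computes the salary spread per department; the alias salary_range is an assumption made for illustration.

```sql
-- Difference between the highest and lowest salary in each department.
SELECT department,
       MIN(salary)               AS min_salary,
       MAX(salary) - MIN(salary) AS salary_range
FROM emp_group_by
GROUP BY department;
```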

3. Group By with Aggregate Function “COUNT”

We have 1000 records in the “emp_group_by” table, and we need the number of employees in each department. The SQL query below selects the “department” column and applies the “COUNT” aggregate function with “*” as its argument. The “group by” clause uses the “department” column, so the query returns one row per department with the total number of employees in that department.

Query:

select department,COUNT(*) from emp_group_by group by department;

Output:

[Screenshot: COUNT Example 3]
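COUNT has a few forms that are easy to confuse, so here is a brief sketch of three of them side by side. The aliases are illustrative; the columns come from the “emp_group_by” DDL.

```sql
SELECT department,
       COUNT(*)                 AS all_rows,          -- every row in the group
       COUNT(salary)            AS rows_with_salary,  -- rows where salary is not NULL
       COUNT(DISTINCT location) AS distinct_locations -- unique non-NULL locations
FROM emp_group_by
GROUP BY department;
```

If no salary value is NULL, the first two columns return the same number.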

4. Group By with Aggregate Function “AVG”

We need the average salary in each department of the “emp_group_by” table. The SQL query below selects the “department” column and applies the “AVG” aggregate function to the “salary” column. The “group by” clause uses the “department” column, so the query returns one row per department with the average salary paid in that department.

Query:

select department,AVG(salary) from emp_group_by group by department;

Output:

[Screenshot: HiveQL Group By Example 4]
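The group by clause accepts more than one column. As a hedged sketch, the query below averages salaries per department and location, and rounds the result to two decimals for readability; the alias avg_salary is illustrative.

```sql
-- One row per (department, location) pair instead of one row per department.
SELECT department, location, ROUND(AVG(salary), 2) AS avg_salary
FROM emp_group_by
GROUP BY department, location;
```

AVG skips NULL salaries, so a group's average is taken over the rows that actually have a value.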

5. Group By with Aggregate Function “SUM”

We have 1000 records in the “emp_group_by” table, and we need the total salary paid in each department. The SQL query below selects the “department” column and applies the “SUM” aggregate function to the “salary” column. The “group by” clause uses the “department” column, so the query returns one row per department with the total salary paid to the employees of that department.

Query:

select department,SUM(salary) from emp_group_by group by department;

Output:

[Screenshot: HiveQL Group By Example 5]
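To keep only some of the grouped rows, a HAVING clause can filter on the aggregate itself. This is a sketch under assumptions: the threshold of 1000000 is a made-up value, and total_salary is an illustrative alias.

```sql
-- Departments whose total salary exceeds a hypothetical threshold.
SELECT department, SUM(salary) AS total_salary
FROM emp_group_by
GROUP BY department
HAVING SUM(salary) > 1000000;
```

The aggregate expression is repeated in HAVING rather than referenced by its alias, which avoids depending on alias support in a particular Hive version.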

Conclusion

We have covered the concept of “HiveQL Group By” with syntax, explanation, and example queries. When we need the output of a Hive query in an aggregated form, we can use “group by” with different aggregate functions, and the result is returned as one row per group. A select statement is not limited to a single aggregate function: we can use multiple aggregate functions in one query, together with clauses like group by, having, and order by.
