HiveQL Group By

By Priya Pedamkar

Introduction to HiveQL Group By

HiveQL Group By groups the rows of a Hive table by the values of the columns named in the group by clause and returns one output row per group instead of one row per input record. Grouping is defined at the column level, and one or more aggregate functions can be applied in the same select query.

Types of Aggregate Functions

In HiveQL, group by is normally paired with aggregate functions in the select statement. Every selected column must either appear in the group by clause or be wrapped in an aggregate function. Below are five commonly used aggregate functions that we can use in a select statement with group by.

  • Maximum (MAX)
  • Minimum (MIN)
  • Count (COUNT)
  • Average (AVG)
  • Sum (SUM)

Syntax of HiveQL Group By:

SELECT [ALL | DISTINCT] select_expr1, select_expr2, ..., select_expr_n
FROM table_name
[WHERE where_condition]
[GROUP BY column_list]
[HAVING having_condition]
[ORDER BY column_list]
[LIMIT number];
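
To make the clause order concrete, here is a minimal sketch against the “emp_group_by” table used later in this article. The thresholds (30000, 10) and the row limit (5) are assumed values chosen only for illustration.

```sql
-- Sketch: departments with more than 10 employees earning above 30000,
-- largest head count first. The numbers 30000, 10 and 5 are assumptions.
SELECT department, COUNT(*) AS emp_count
FROM emp_group_by
WHERE salary > 30000
GROUP BY department
HAVING COUNT(*) > 10
ORDER BY emp_count DESC
LIMIT 5;
```

WHERE filters individual rows before they are grouped, while HAVING filters whole groups after the aggregate has been computed.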

How Does a HiveQL Group By Query Work?

Group by is typically used together with aggregate functions. When a select statement includes a group by clause, Hive collects the rows that share the same values in the grouping columns and applies each aggregate function in the query (MAX, MIN, COUNT, AVG, SUM) once per group, returning one row per group. Every column in the select list that is not inside an aggregate function must also appear in the group by clause; otherwise, Hive rejects the query.
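
As a rough illustration of which select lists Hive accepts with group by, consider the following sketch against the “emp_group_by” table described below. The rejected statement is left commented out so the block can be run as a script.

```sql
-- Valid: department is a grouping column and salary is aggregated.
SELECT department, MAX(salary) FROM emp_group_by GROUP BY department;

-- Valid: no aggregate function; each distinct department is returned once.
SELECT department FROM emp_group_by GROUP BY department;

-- Invalid: first_name is neither grouped nor aggregated, so Hive rejects it.
-- SELECT department, first_name FROM emp_group_by GROUP BY department;
```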

Examples to Implement HiveQL Group By

The examples below use a Hive table named “emp_group_by” in the “emp” database. The table has the following fields/columns:

  • Adhar Number
  • First Name
  • Last Name
  • Department
  • Salary
  • Location

The table holds employee data in 1000 records. The following examples apply “group by” with each aggregate function in turn and show the SQL query for each case.

DDL Code for “emp_group_by” Table

Code:

-- External table for the comma-separated employee file; the first line of the file is a header and is skipped
CREATE EXTERNAL TABLE emp_group_by
(
  adhar_no   INT,
  first_name STRING,
  last_name  STRING,
  department STRING,
  salary     FLOAT,
  location   STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
TBLPROPERTIES ("skip.header.line.count"="1");

Output:

The table above holds 1000 records, which were loaded manually.
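
The article does not show how the data was loaded. One possible way, sketched here under the assumption that the records sit in a local CSV file, is Hive's LOAD DATA statement. The path '/tmp/emp_group_by.csv' is hypothetical.

```sql
-- Hypothetical local path; replace it with the real location of the CSV file.
LOAD DATA LOCAL INPATH '/tmp/emp_group_by.csv' INTO TABLE emp_group_by;

-- Confirm the number of rows after loading.
SELECT COUNT(*) FROM emp_group_by;
```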

Sample “emp_group_by” Table View

Code:

select * from emp_group_by;

1. Group By with Aggregate Function “MAX”

The “emp_group_by” table has 1000 records, and we need the maximum salary in each department. The query below selects the “department” column, applies the “MAX” aggregate function to the “salary” column, and groups by “department”, so it returns one row per department with that department's highest salary.

Query:

select department,MAX(salary) from emp_group_by group by department;

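
Grouping is not limited to a single column. As a small sketch, assuming we also want the result broken down by the “location” column of the same table, both columns can be listed in the group by clause:

```sql
-- One row per (department, location) pair with the highest salary in that pair.
SELECT department, location, MAX(salary) AS max_salary
FROM emp_group_by
GROUP BY department, location;
```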

2. Group By with Aggregate Function “MIN”

We need the minimum salary in each department of the “emp_group_by” table, so we use the “MIN” aggregate function in the select query.

The query below selects the “department” column, applies “MIN” to the “salary” column, and groups by “department”, so it returns the lowest salary in each department.

Query:

select department,MIN(salary) from emp_group_by group by department;

3. Group By with Aggregate Function “COUNT”

The “emp_group_by” table has 1000 records, and we need the number of employees in each department. We use the “COUNT” aggregate function in the select query.

The query below selects the “department” column, applies “COUNT(*)” to count the rows, and groups by “department”, so it returns the total number of employees in each department.

Query:

select department,COUNT(*) from emp_group_by group by department;

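
COUNT has a few forms that behave differently. The sketch below, which uses only columns from the “emp_group_by” table, puts them side by side; the column aliases are names chosen for this illustration.

```sql
-- COUNT(*) counts every row in the group.
-- COUNT(column) counts only rows where the column is not NULL.
-- COUNT(DISTINCT column) counts the different non-NULL values.
SELECT department,
       COUNT(*)                 AS total_rows,
       COUNT(last_name)         AS rows_with_last_name,
       COUNT(DISTINCT location) AS distinct_locations
FROM emp_group_by
GROUP BY department;
```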

4. Group By with Aggregate Function “AVG”

We need the average salary in each department of the “emp_group_by” table, so we use the “AVG” aggregate function in the select query.

The query below selects the “department” column, applies “AVG” to the “salary” column, and groups by “department”, so it returns the average salary paid in each department.

Query:

select department,AVG(salary) from emp_group_by group by department;

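
AVG skips NULL salaries and usually returns a value with many decimal places. If a shorter figure is preferred, one option is to wrap the aggregate in ROUND; the choice of two decimal places here is an assumption.

```sql
-- Average salary per department, rounded to two decimal places.
SELECT department, ROUND(AVG(salary), 2) AS avg_salary
FROM emp_group_by
GROUP BY department;
```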

5. Group By with Aggregate Function “SUM”

The “emp_group_by” table has 1000 records, and we need the total salary paid by each department. We use the “SUM” aggregate function in the select query.

The query below selects the “department” column, applies “SUM” to the “salary” column, and groups by “department”, so it returns the total salary paid in each department.

Query:

select department,SUM(salary) from emp_group_by group by department;

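
Because “salary” is declared as FLOAT, the sum is computed in floating point and may show small rounding differences. If exact currency totals matter, one possible approach is to cast to DECIMAL before summing. The precision and scale (12,2) below are assumptions, not values taken from the article.

```sql
-- Total salary per department, summed as a fixed-point decimal.
SELECT department,
       SUM(CAST(salary AS DECIMAL(12,2))) AS total_salary
FROM emp_group_by
GROUP BY department;
```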

Conclusion

We have covered “HiveQL Group By” with its syntax, explanation, and example queries. When a Hive query needs to return aggregated output, use “group by” with the appropriate aggregate functions; the result contains one combined row per group. A select statement is not limited to a single aggregate function: several can be used in one query, together with clauses such as group by, having, and order by.
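
As a closing sketch, the five aggregate functions from this article can be combined in a single query on the “emp_group_by” table. The minimum group size of 5 in the having clause is an assumed value for illustration.

```sql
-- Per-department summary using all five aggregate functions at once.
SELECT department,
       COUNT(*)    AS emp_count,
       MIN(salary) AS min_salary,
       MAX(salary) AS max_salary,
       AVG(salary) AS avg_salary,
       SUM(salary) AS total_salary
FROM emp_group_by
GROUP BY department
HAVING COUNT(*) >= 5
ORDER BY total_salary DESC;
```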
