Introduction to MySQL RANK()
The RANK() function in MySQL will display the rank of a row. This rank of a row will be defined within its partition, and this rank will have gaps in-between. Consider there is a set of rows with different values like the cost of a commodity, marks of students, and so on. When ordered in ascending order, if we need to get the rank of a student with a certain score, it is easy to capture with the RANK() function.RANK() function has similarities with ROW_NUMBER() function and DESE_RANK() function.
Syntax
The RANK() function is used along with the ‘over’ clause, ‘partition by’ clause, and ‘order by’ clause. But the ‘partition by’ clause is optional. The RANK() function can be performed on small sets of the entire result set or the complete result set as is. Incase, small sets of the result set is to be considered, then the ‘partition by’ clause is a must. Note that the function is re-initialized as partition boundary is crossed.
The syntax for the RANK() function is as follows:
SELECT column_name
RANK() over (
PARTITION BY {expression}
ORDER BY {expression} [ASC|DESC]
) rank_column_name
from table_name;
- RANK() is used in the SELECT query, to return the desired ranks of rows as output from the table which is considered.
- RANK() over ( … ) rank_column_name – this part of the query will return the result of RANK() function and the output will hold the column as rank_column_name
- PARTITION BY {expression} – this part of the query will do the partition (if needed), according to the expression provided in the clause. This is an optional clause in RANK()
- ORDER BY {expression} [ASC|DESC] – this part of the query will order the result set in ascending or descending order. Ordering the results is mandatory to assign the ranks to rows. Once arranged in an order, only then can we identify whether there are ties in the values.
How RANK() works in MySQL?
So as discussed, the RANK() function returns ranks of rows under a specific partition which is neatly ordered in ascending or descending order. This function considers the same values in a column as ties and skips assigning the consecutive ranks instead assigns the same rank to the row with a tie. This makes the gaps or holes in the rankings.
Let us explore this in detail with an example.
With the above-given syntax, let us write a query with RANK function. (Without using the partition clause). For reference, we can consider the EMPLOYEE table.
The table holds data on sales by each employee per year. Now we can rank the rows and get the output.
Query:
select employee, year, sales,
RANK() over (
order by salesasc
)sales_rank
from sales;
Output:
Let’s understand the query and output.
The query says to display the column employee, year, sales, and RANK() with a RANK column titled as sales_rank. Also, there is an ORDER BY clause which specifies to order the data in ascending order of sales column. So first, data is sorted in ascending order of the number of sales, then the rank is assigned to each row and the data is displayed.
Now, consider the output. We can see that ranks are assigned with gaps. When there is a tie in the sales column, for example, Alan, Carl, and Esther have sales of 150 (though during different years), the same rank of ‘3’ is assigned to all the three rows. Also note that, after assigning rank ‘3’ thrice, the very next rank assigned is ‘6’ because, rows which could have the ranks as ‘4’ and ‘5’ already gets the rank of ‘3’, due to tie in the sales column. So ranks ‘4’ and ‘5’ are omitted and rank ‘6’ is assigned to the 6th row. Now, again rank ‘6’ repeats for two more times and thus ranks ‘7’ and ‘8’ are omitted due to tie in the sales column for values 200. Rank ‘9’ is assigned to the 9th row.
4.5 (5,268 ratings)
View Course
Thus we saw the gaps in the RANK() function is due to the ties in the value column.
Let us include the partition by clause also in the above query and see the result.
Query:
select employee, year, sales,
RANK() over (
partition by year
order by salesasc
)sales_rank
from sales;
Output:
The query can be explained as follows-
- Select and display employee, year, sales, and RANK() with a RANK column titled as sales_rank.
- Partition the data according to year- so the data should be displayed sales by each employee from the year 2010, 2011, and 2015 together.
- Now sort the content with the number of sales by each employee.
- So in the year 2010, the least sales done by Esther which is 100, then by Carl which is 140 and finally by Alan which is 150. Same way for other employees too.
Output can be explained as, for the year 2010, no ties in the sales column, and hence the ranks are in ascending order. But in the year 2011, there is a tie of value 150 between Carl and Esther which makes the ranks as ‘1’ for both, and the rank for Alan is ‘3’. The same case for the year 2015 also.
The important point to be noted here is that, when partition by clause is used, the rank is reinitialized at every partition.
We can rewrite the query with DENSE RANK() to understand the ranking without gaps. Here all the rows are provided with rank values in a series order with no gaps, even though there are ties in the sales column. And the row_number() function will simply display the row numbers without considering any tied values.
Query:
select employee, year, sales,
RANK() over (
order by salesasc
)sales_rank,
DENSE_RANK() over (
order by salesasc
)sales_dense_rank,
row_number() over (
order by salesasc
)sales_row_number
from sales;
Output:
This example is provided just to understand the difference between RANK() and DENSE_RANK() and row_number() functions.
Conclusion
So the RANK() function will assign and return the ranks of every row within a partition or full set of the result table that is ordered. This function shares the ranks of rows with the same values and thus leaves holes in the rank set.
Recommended Articles
This is a guide to MySQL RANK(). Here we discuss how RANK() works in MySQL along with the syntax, queries examples with explanation. You may also have a look at the following articles to learn more –