Updated May 17, 2023

Introduction to MySQL RANK()

The RANK() function in MySQL will display the rank of a row. The rank of a row is determined within its partition, and it may contain gaps between values. Consider a set of rows with different values like the cost of a commodity, marks of students, and so on. When ordered in ascending order, if we need to get a student’s rank with a certain score, it is easy to capture with the RANK() function. The RANK() function has similarities with ROW_NUMBER() function and DESE_RANK() function.

Syntax

The RANK() function uses the ‘over’ clause, ‘partition by’ clause, and ‘order by’ clause. But the ‘partition by’ clause is optional. The RANK() function can be performed on small sets of the entire result set or the complete result set as is. If you want to handle small sets of the result set, you must consider the ‘partition by’ clause. Please note that the function will be reinitialized each time the partition boundary is crossed.

The syntax for the RANK() function is as follows:

SELECT column_name
RANK() over (
PARTITION BY {expression}
ORDER BY {expression} [ASC|DESC]
) rank_column_name
from table_name;

SELECT – In the SELECT query; we utilize the RANK() function to obtain the desired ranks of rows as output from the table being considered.
RANK() over ( … ) rank_column_name – this part of the query will return the result of RANK() function and the output will hold the column as rank_column_name
PARTITION BY {expression} – this part of the query will do the partition (if needed) according to the expression provided in the clause. This is an optional clause in RANK()
ORDER BY {expression} [ASC|DESC] – this part of the query will order the result set in ascending or descending order. Ordering the results is mandatory to assign the ranks to rows. Once arranged in an order, we can only identify whether there are ties in the values.

How RANK() works in MySQL?

So as discussed, the RANK() function returns ranks of rows under a specific partition which is neatly ordered in ascending or descending order. This function considers the same values in a column as ties and skips assigning consecutive ranks instead of giving the same rank to the row with a tie. This makes the gaps or holes in the rankings.

Let us explore this in detail with an example.

With the above-given syntax, let us write a query with the RANK function. (Without using the partition clause). For reference, we can consider the EMPLOYEE table.

The table holds data on sales by each employee per year. Now we can rank the rows and get the output.

Query:

select employee, year, sales,
RANK() over (
order by salesasc
)sales_rank
from sales;

Output:

Let’s understand the query and output.

The query says to display the column employee, year, sales, and RANK() with a RANK column titled as sales_rank. Also, there is an ORDER BY clause which specifies to order the data in ascending order of sales column. So first, we sort the data in ascending order of the number of sales, assign a rank to each row, and display the data.

Now, consider the output. We can observe that there are gaps in the assigned ranks. In cases with a tie in the sales column, such as Alan, Carl, and Esther having sales of 150 (though in different years), the same rank of ‘3’ is assigned to all three rows. It’s important to note that after giving the rank ‘3’ to those three rows, the subsequent rank assigned is ‘6’. This occurs because the rows that could have been assigned ranks ‘4’ and ‘5’ have already been assigned the rank of ‘3’ due to the tie in the sales column. Consequently, ranks ‘4’ and ‘5’ are skipped, and the rank ‘6’ is assigned to the sixth row. Due to the tie in the sales column for values of 200, the ranks ‘7’ and ‘8’ are omitted, and the 9th row is assigned the rank ‘9’. This occurs because ‘6’ repeats two more times.

Thus we saw the gaps in the RANK() function is due to the ties in the value column.

Let us also include the partition by clause in the above query and see the result.

Query:

select employee, year, sales,
RANK() over (
partition by year
order by salesasc
)sales_rank
from sales;

Output:

We can explain the query as follows:

Select and display employee, year, sales, and RANK() with a RANK column titled as sales_rank.
Partition the data according to year- so the data should display sales by each employee from 2010, 2011, and 2015 together.
Now sort the content with the number of sales by each employee.
In 2010, Esther achieved the lowest sales of 100, followed by Carl with 140 and Alan with 150. Same way for other employees too.

Output can be explained as, for the year 2010, there are no ties in the sales column; hence, the ranks are in ascending order. But in 2011, there was a tie of value 150 between Carl and Esther, which makes the rank ‘1’ for both, and the rank for Alan is ‘3’. The same case for the year 2015 also.

Please note that the ‘partition by’ clause resets the rank at each partition, which is an important aspect to consider.

We can rewrite the query with DENSE RANK() to understand the ranking without gaps. The rows receive rank values in a sequential order without any gaps, even in cases where there are ties in the score column. And the row_number() function will display the row numbers without considering any tied values.

Query:

select employee, year, sales,
RANK() over (
order by salesasc
)sales_rank,
DENSE_RANK() over (
order by salesasc
)sales_dense_rank,
row_number() over (
order by salesasc
)sales_row_number
from sales;

Output:

We provide this example to help you understand the distinction between the RANK(), DENSE_RANK(), and ROW_NUMBER() functions.

Conclusion

So the RANK() function will assign and return the ranks of every row within a partition or full set of the ordered result table. This function shares the ranks of rows with the same values and thus leaves holes in the rank set.