Reproducing ROW_NUMBER() in MySQL

Recently I was working on a project involving MySQL and needed to partition and rank results in a SELECT statement. This is a pretty straightforward approach in SQL Server using ROW_NUMBER() or RANK(). In my particular case I needed functionality equivalent to ROW_NUMBER() with partitioning by a single column and sorting by another. Below is an alternative since earlier versions of MySQL do not have an equivalent ROW_NUMBER() function.

Data

I am interested in ranking the highest sale days for each company based on the table below.

company_id company_name sale_date total_amount
1 Company A 2018-03-01 500.00
2 Company B 2017-09-15 700.00
1 Company A 2017-07-02 900.00
2 Company B 2017-06-10 560.00
2 Company B 2018-03-03 1000.00
1 Company A 2018-03-03 100.00
2 Company B 2017-12-01 650.00

*company_sales table

SQL Server query

Using T-SQL for SQL Server my query would be this:

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    ROW_NUMBER<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">OVER</span><span class="token punctuation">(</span><span class="token keyword">PARTITION</span> <span class="token keyword">BY</span> company_id <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> total_amount <span class="token keyword">DESC</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank
<span class="token keyword">FROM</span>
    company_sales

and the results:

company_name sale_date total_amount day_rank
Company A 2017-07-02 900.00 1
Company A 2018-03-01 500.00 2
Company A 2018-03-03 100.00 3
Company B 2018-03-03 1000.00 1
Company B 2017-09-15 700.00 2
Company B 2017-12-01 650.00 3
Company B 2017-06-10 560.00 4

MySQL query

In order to perform the same query in MySQL, I need to do things a little differently as MySQL does not have directly equivalent functions to ROW_NUMBER() and RANK().

Conveniently, one feature that MySQL does have is the ability to assign and read variables inside a SELECT statement. Using this, I can write an equivalent statement in MySQL.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL discussion

The query for MySQL is noticeably longer and a little more complicated, but it gives us the exact same output. Let’s break it down a bit and step through each piece.

1. Initialize a ranking variable:

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>

2. Write the sub query

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

There are 3 important pieces to this sub SELECT

  • Decide the ranking logic. This is equivalent to PARTION BY...ORDER BY in SQL Server. For this query, I want to partition by company and rank by total_amount, largest to smallest. To accomplish this, I double sort, first by company_id and then by total_amount.
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
  • Next, assign the company_id for each row to a variable
<span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id 
  • Finally, above this assignment, place the logic for either incrementing the ranking, or resetting it. This states that if the previous value of @comp_id is the same, then increase the rank (the next best sale day for the same company), otherwise reset to 1.
<span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>

3. Put it all together and sort
Now wrap the subquery, extracting only the columns needed and double sort again to get the same output as SQL Server.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL simplified

The query for MySQL can be simplified if you are not concerned about an extra column in the output(company_id) or about having the results properly sorted. They will still be ranked correctly, just not sorted by the database.

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id <span class="token keyword">AS</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

produces output similar to below:

company_name sale_date total_amount day_rank company_id
Company A 2017-07-02 900 1 1
Company A 2018-03-01 500 2 1
Company A 2018-03-03 100 3 1
Company B 2018-03-03 1000 1 2
Company B 2017-09-15 700 2 2
Company B 2017-12-01 650 3 2
Company B 2017-06-10 560 4 2

Conclusion

By using this variable assignment and increment technique with proper sorting, we now have the ability to partition and assign row numbers to SELECT statements in MySQL. I hope this has been informative and adds yet another item to your MySQL toolbox!


Featured image by Nacho Capelo @ unsplash.com