Install mssql-tools (sqlcmd) on Amazon Linux AMI

The steps below were a result of figuring out how to install mssql-tools on aLinux instance. Along the way I learned a little about `yum` and repository priorities.

Problem

During the past couple of weeks I have set out to learn more about Docker and how I can incorporate it more into development processes to automate and standardize workflows. I have been using a small AWS Linux EC2 instance based on a Amazon Linux AMI. I wanted to use mssql-tools to connect to a SQL Server instance running inside a container. It did not appear that mssql-tools were installed on the machine at the time (In retrospect, it may just have been that I did not have the bin directory set up properly in my Path).

The steps below were a result of figuring out how to install mssql-tools on the Linux instance. Along the way I learned a little about yum and repository priorities. Let’s dig in…

sqlcmd Version

My first step when trying to use sqlcmd was to check the verion

sqlcmd | grep Version

It appeared to be not installed, which is what led me to figuring out how to install it.

As I mentioned before, I should have verified for sure that it was not installed and just not in my PATH yet. Lesson learned!

Uninstall previous version

Although I didn’t think there was a previous version to uninstall, I decided to continue to follow the full installation instructions from Microsoft, which included uninstalling of mssql-tools and unixODBC-utf16-devel.

sudo yum remove mssql-tools unixODBC-utf16-devel

Install current version

The next step for me was to install mssql-tools and unixODBC-devel.

Download the repository configuration file from Microsoft

sudo su
curl https://packages.microsoft.com/config/rhel/7/prod.repo > /etc/yum.repos.d/msprod.repo
exit

Install tools

sudo yum install mssql-tools unixODBC-devel

First snag!

Most interesting was the message:

2 packages excluded due to repository priority protections

Time to do some digging.

yum repository priorities

Given my very limited experience with yum, I did a quick search and found the documentation for yum priorities. The usage sections stated “Packages from repositories with a lower priority will never be used to upgrade packages that were installed from a repository with a higher priority.”. This seemed to be the first indication I was on the right path.

Given that the plugin appeared to be installed and active on this install of Linux, I located the location of the configuration file (/etc/yum/pluginconf.d/priorities.conf) and updated the following line from

[main]
enabled=1

to

[main]
enabled=0

Side note: While writing this post I came across some documentation from Amazon explains why this is on by default for Amazon Linux AMI’s and suggests there is an alternate way around this problem to allow for other repositories. See AWS documentation.

Side note #2: There is an old Centos thread related to yum-priorities that is also worth reading.

With the plugin now disabled, I was successfully able to install mssql-tools and unixODBC-devel and accepts the licensce terms.

Setup PATH

Lastly, I added the bin directory for sqlcmd to my bash PATH in the ~/.bash_profile file…

…and confirmed the install.

Automation

During my search, I came across a post by Kagarlickij Dmitriy. He provides a nice shell script to check your version, turn off yum priorities, and install mssql-tools. Github repository


Featured image by Cesar Carlevarino Aragon @ unsplash.com

First look at SQL Server on Docker

After finishing my previous quick experiment with getting Docker up and running, I immediately wanted to experiment with SQL Server portability using Docker. The overal long term goal is to have standard docker images that all developers on the team could use for development and to assist with the onbording process of new developers to our team.

Goal

After finishing my previous quick experiment with getting Docker up and running, I immediately wanted to experiment with SQL Server portability using Docker. The overal long term goal is to have standard docker images that all developers on the team could use for development and to assist with the onbording process of new developers to our team. By the end of this article, I will show how I used resources provided by Microsoft to successfully get SQL Server running in a Docker container and be able to connect to the instance from outside the container, and from a different machine. Here we go!

Microsoft container images

Thankfully, Microsoft has provided a big helping hand to get started with SQL Server on Docker. For Linux, the provide a standard SQL Server container image and instructions on how to get started. I followed this quick start guide as closely as possible for my setup.

It is so easy to get started on a new project or technology and skip right past the ‘prerequisites’ sections of a tutorial or documentation. I do it all the time, but I am trying to get in a better habit of taking the time to verify everything is in order before beginning. At the very least I learn a new command or location for where something is stored.

I initiallly used the same previous Amazon Liux AMI that I used when first setting up my first exposure to Docker, and while I know that the version of Docker is compliant with the container image from Microsoft, it never hurts to get in the habit of double-checking yourself.

$ docker version

I also wanted to check my Docker storage driver to make sure it is compatable (overlay2) with the image.

$ docker info

Next I checked for total RAM, which Microsoft states should be at least 2GB.

$free -m

Turns out this was still a t2.micro instance, so I had to change the instance size of my EC2 to get more RAM. I chose a t2.medium instance to get a bit more RAM to work with than just the minimum and pick an additional core.

Better! Back on track.

Also checked my available storage(8GB based on the t2.micro instance I had before).

$ df

I decided to bump that up to 30GB using the instructions from AWS for extending a volume size for Linux.

A quick check on 64-bit

$ uname -m

Pull and run the image

Finally, it was time to pull the container image from Microsoft and run it. It took a bit longer than I wanted to get to this point, but succesfully resized an EC2 instance and extended the size of my EBS volume as well!

$ sudo docker pull mcr.microsoft.com/mssql/server:2017-latest

$ sudo docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD={YourNewStrong!Passw0rd}' \
   -p 1433:1433 --name sql1 \
   -d mcr.microsoft.com/mssql/server:2017-latest

Then verified the container was OK

$ docker ps

All looks good.

Connecting to the database

In the interest of learning to use the interactive shell inside of a Docker container I used the example from Microsoft to run SQL Server commands to make sure everything was running OK inside the container.

$ sudo docker exec -it sql1 "bash"
[email protected]:/# /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P '{YourNewStrong!Passw0rd}'

This successfully produced a sqlcmd prompt:
1>

From this sqlcmd prompt I quickly ran through the steps from the Microsoft Quick Start Guide to create a database, table, and 2 records within that table.

Connect outside the container

I had finally arrived at the point I was most excited about. Connecting to the SQL Server running in the container from outside of the container.

sqlcmd -S {your server ip},1433 -U SA -P '{YourNewStrong!Passw0rd}'

In my case I wanted to test connecting using SSMS on my local machine to the EC2 instance on AWS running the container (making sure the security group allowed for inbound connections on the correct port from my machine. Success!

Conclusion

I’m really happy I was able to get this to work. It was amazing how straightforward it ended up being since the container image was provided by Microsoft. Some day I will do a full SQL Server install on Linux to take my journey a little deeper.

My next project with Docker will be backing up an existing database to a Docker container to provide a prepackaged image of a database that all developers can pull and test against.


Featured image by Tobias Fischer @ unsplash.com

Troubleshooting and turning on SQL Server Authentication mode

Anyone who has ever connected to a remote system before has invariably encountered authentication or connection issues. Unfortunatly, due to the security implications of connecting to remote systems, the error message returned is not always helpful.

Scenario

Anyone who has ever connected to a remote system before has invariably encountered authentication or connection issues. Unfortunatly, due to the security implications of connecting to remote systems, the error message returned is not always helpful.

For instance, I came across the message below when trying to connect to a remote SQL Server instance using SQL Server Management Studio (SSMS) and SQL Authentication.

Cannot connect to '{server}'.

Additional information:

Login failed for user '{username}' (Microsoft SQL Server, Error: 18456)

Not initially very helpful.

Troubleshooting

To Google we go!

Hang on. There is a better way to go about this! We have an error code, so a good starting point for detailed information is the SQL Server errors page.

The error page for MS SQL Server error 18456 gives some potential causes, but narrowing down the root cause required more detailed error information.

Server Logs

The SQL Server Log will give you more detailed information related to the error, specifically a more detailed message and a state code to accompany the error code. If you do not have access to the server you will need to request the information from your server administrator. However, if you do have access, there are 2 different ways to view the logs.

1.) On the server itself, the error logs will be located in a director similar to this path:

%Program-Files%\Microsoft SQL Server\MSSQL.1MSSQL\LOG\ERRORLOG

2.) In SSMS you can access the same logs using the Object Explorer to navigate to:

SSMS -> server node -> Management -> SQL Server Logs

Detailed Error

In this case, I quickly found the details of the error:

SQL Server error 18456, severity 14, state 58

Login failed for user {username}. Reason: An attempt to login using SQL authentication failed. Server is configured for Integrated authentication only. [CLIENT:{ip address} ] 2019-03-01 14:25:36.60 Logon Error: 18456, Severity: 14, State: 58.

OK. Here now we have an error state code: 58. Unfortunately the error documentation from Microsoft does not specifically reference a reason for this state. However, the detailed error message tells us that the authentication failed because the server is configured for Integrated (Windows) authentication only and we are trying to log in with SQL Server credentials.

Authentication Modes

We can confirm that the server is indeed NOT set up for mixed authentication in two ways:
1.) By querying the SQL Server property related to Integrated Security:

<span class="token keyword">SELECT</span> SERVERPROPERTY<span class="token punctuation">(</span><span class="token string">'IsIntegratedSecurityOnly'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  • If the return value is 1 then the server is configured for Windows Authentication only.
  • If the return value is 0 then the server is configured for Windows and SQL Server Authentication (mixed authentication). See documentation related to server properties for more details.

2.) In SSMS by going right clicking on the server node -> Properties -> Security.

If you are able to and need to change server authentication modes, refer to the authentication mode documentation for details and security issues pertaining to the steps.

Once the authentication mode has been changed, the authentication error should go away and a connection to the server can be established.

Summary

While troubleshooting error codes from any server or software can be challenging, taking the time to track down the proper documentation and detailed error logs can be well worth the time spent. The first few times may be quite time consuming to figure out, but writing down clean documentation and bookmarking resources will go a long way towards helping debug future errors.

This is a MUST BOOKMARK reference for MS SQL Server error logs:
https://docs.microsoft.com/en-us/sql/relational-databases/errors-events/database-engine-events-and-errors


Featured image by Robert Anasch @ unsplash.com

Linked Server: SQL Server to MySQL

In this example, we will see how to set up a link to a MySQL instance using SQL Server. By doing this, we will have access to write queries against a remote MySQL database any time data access is needed.

Using SSIS is a great way to perform ETL operations and bulk load or process data coming from multiple databases, especially if various database vendors are involved. However, setting up a full SSIS package just to do a quick query, import, or join is cumbersome and time consuming. Thankfully, setting up a Linked Server in SQL Server can be a straightforward task to keep productivity high in these types of situations.

The Goal

In this example, we will see how to set up a link to a MySQL instance using SQL Server. By doing this, we will have access to write queries against a remote MySQL database any time data access is needed. We will even be able to JOIN against tables in our SQL Server database to provide quick analysis across disparate database engines.

Installing the MySQL driver

The first step to setting up the linked server is to download and install the MySQL ODBC driver. For this, go to the MySQL ODBC driver download page and select the driver that meets your OS specifications.

Verifying the driver install

After installing the driver, check the ODBC data sources to make sure the driver was successfully installed. The easiest way to do this is to search for ‘ODBC’ from the Windows Start Menu and select either the 32-bit or 64-bit option depending on which driver was installed.

From the ODBC Data Source Administration window, select the ‘Drivers’ tab to verify the install. Take note of the driver name; this exact string will be used later when setting up the linked server.

Setting up the Linked Server

Now that the driver is installed, we are ready to set up the linked server. For this, open SQL Server Management Studio and connect to the local server. Expand Object Explorer tree for the connection to find the ‘Linked Servers’ section (Server Objects -> Linked Servers)

Right click on ‘Linked Servers’ and click ‘New Linked Server…’

This will bring up the ‘New Linked Server’ dialog.

Give the linked server a name (in my example ‘DATAWAREHOUSE’) . Since we are linking to a MySQL instance, select ‘Other data source’ and fill out as follows:

  • Provider: Microsoft OLE DB Provider for ODBC Drivers
  • Product name: MySQL

Now we need to fill out the Provider string for the connection. For this example, my provider string is below:

Driver<span class="token operator">=</span>{MySQL ODBC <span class="token number">5.3</span> ANSI Driver}<span class="token punctuation">;</span><span class="token keyword">DATABASE</span><span class="token operator">=</span>testapp<span class="token punctuation">;</span>PWD<span class="token operator">=</span>test<span class="token punctuation">;</span>UID<span class="token operator">=</span>jon<span class="token punctuation">;</span><span class="token keyword">OPTION</span><span class="token operator">=</span><span class="token number">134217728</span><span class="token punctuation">;</span>SERVER<span class="token operator">=</span>my_remote_server<span class="token punctuation">;</span>PORT<span class="token operator">=</span><span class="token number">3306</span><span class="token punctuation">)</span>

Notice the driver name is exactly the name of the driver from the ODBC driver installation. The MySQL ODBC driver has many options available for the provider string. We will only concentrate on the few needed for a basic connection, however, you should take a look at the full list here in order to make sure your connection does not need additional values.

One additional parameter for us connecting from SQL Server is the OPTION parameter. For connections from SQL Server, we set this to the recommended value of 134217728. Be sure to see the full list of recommended parameter options depending on your configuration.

One final step before creating the linked server is to change make sure that ‘RCP’ and ‘RCP Out’ are set to True on the ‘Server Options’ tab.

Click ‘OK’ to created the linked server. The server should now appear in the Object Explorer.

An alternative to creating a linked server is to script the creation.

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_addlinkedserver <span class="token variable">@server</span> <span class="token operator">=</span> N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@srvproduct</span><span class="token operator">=</span>N<span class="token string">'MySQL'</span><span class="token punctuation">,</span> <span class="token variable">@provider</span><span class="token operator">=</span>N<span class="token string">'MSDASQL'</span><span class="token punctuation">,</span> <span class="token variable">@provstr</span><span class="token operator">=</span>N<span class="token string">'Driver={MySQL ODBC 5.3 ANSI Driver};DATABASE=testapp;OPTION=134217728;SERVER=my_remote_server;PORT=3306)'</span>

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_addlinkedsrvlogin <span class="token variable">@rmtsrvname</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span><span class="token variable">@useself</span><span class="token operator">=</span>N<span class="token string">'False'</span><span class="token punctuation">,</span><span class="token variable">@locallogin</span><span class="token operator">=</span><span class="token boolean">NULL</span><span class="token punctuation">,</span><span class="token variable">@rmtuser</span><span class="token operator">=</span>N<span class="token string">'jon'</span><span class="token punctuation">,</span><span class="token variable">@rmtpassword</span><span class="token operator">=</span><span class="token string">'test'</span>

GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'collation compatible'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'data access'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'dist'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'pub'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'rpc'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'rpc out'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'sub'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'connect timeout'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'0'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'collation name'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span><span class="token boolean">null</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'lazy schema validation'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'query timeout'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'0'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'use remote collation'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'remote proc transaction promotion'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>

Notice that we set up the login information for the remote server using sp_addlinkedsrvlogin instead of placing it in the Provider string of sp_addlinkedserver.

How to query the linked server

Now that the linked server is successfully set up, we can execute queries against it:

<span class="token keyword">EXEC</span><span class="token punctuation">(</span>
    <span class="token string">'SELECT
    *
    FROM
    company_sales'</span>
<span class="token punctuation">)</span> at DATAWAREHOUSE

A second method is to use OPENQUERY in order to be able to reference the linked server in a FROM clause just as if it were a normal table. This allows for the use of JOIN between two completely separate databases.

<span class="token keyword">SELECT</span>
    <span class="token operator">*</span>
<span class="token keyword">FROM</span> 
    <span class="token keyword">OPENQUERY</span><span class="token punctuation">(</span>DATAWAREHOUSE<span class="token punctuation">,</span><span class="token string">'select * from company_sales'</span><span class="token punctuation">)</span> <span class="token number">c</span>
<span class="token keyword">JOIN</span>
    legacy_companies <span class="token keyword">ON</span> <span class="token number">c</span><span class="token punctuation">.</span>company_id <span class="token operator">=</span> legacy_companies<span class="token punctuation">.</span>company_id

Or to insert the results into a local table.

<span class="token keyword">SELECT</span>
    <span class="token operator">*</span>
<span class="token keyword">INTO</span> 
    sales
<span class="token keyword">FROM</span> 
    <span class="token keyword">OPENQUERY</span><span class="token punctuation">(</span>DATAWAREHOUSE<span class="token punctuation">,</span><span class="token string">'select * from company_sales'</span><span class="token punctuation">)</span> <span class="token number">c</span>

Conclusion

Using ODBC drivers and SQL Server Linked Servers provides us with a powerful way to access a remote database in a convenient manner within queries and stored procedures on our local database. It may not completely replace your need for SSIS, but does provide a faster solution for querying and pulling in information.

Reproducing ROW_NUMBER() in MySQL

Recently I was working on a project involving MySQL and needed to partition and rank results in a SELECT statement. This is a pretty straightforward approach in SQL Server using ROW_NUMBER() or RANK(). In my particular case I needed functionality equivalent to ROW_NUMBER() with partitioning by a single column and sorting by another. Below is an alternative since earlier versions of MySQL do not have an equivalent ROW_NUMBER() function.

Recently I was working on a project involving MySQL and needed to partition and rank results in a SELECT statement. This is a pretty straightforward approach in SQL Server using ROW_NUMBER() or RANK(). In my particular case I needed functionality equivalent to ROW_NUMBER() with partitioning by a single column and sorting by another. Below is an alternative since earlier versions of MySQL do not have an equivalent ROW_NUMBER() function.

Data

I am interested in ranking the highest sale days for each company based on the table below.

company_id company_name sale_date total_amount
1 Company A 2018-03-01 500.00
2 Company B 2017-09-15 700.00
1 Company A 2017-07-02 900.00
2 Company B 2017-06-10 560.00
2 Company B 2018-03-03 1000.00
1 Company A 2018-03-03 100.00
2 Company B 2017-12-01 650.00

*company_sales table

SQL Server query

Using T-SQL for SQL Server my query would be this:

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    ROW_NUMBER<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">OVER</span><span class="token punctuation">(</span><span class="token keyword">PARTITION</span> <span class="token keyword">BY</span> company_id <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> total_amount <span class="token keyword">DESC</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank
<span class="token keyword">FROM</span>
    company_sales

and the results:

company_name sale_date total_amount day_rank
Company A 2017-07-02 900.00 1
Company A 2018-03-01 500.00 2
Company A 2018-03-03 100.00 3
Company B 2018-03-03 1000.00 1
Company B 2017-09-15 700.00 2
Company B 2017-12-01 650.00 3
Company B 2017-06-10 560.00 4

MySQL query

In order to perform the same query in MySQL, I need to do things a little differently as MySQL does not have directly equivalent functions to ROW_NUMBER() and RANK().

Conveniently, one feature that MySQL does have is the ability to assign and read variables inside a SELECT statement. Using this, I can write an equivalent statement in MySQL.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL discussion

The query for MySQL is noticeably longer and a little more complicated, but it gives us the exact same output. Let’s break it down a bit and step through each piece.

1. Initialize a ranking variable:

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>

2. Write the sub query

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

There are 3 important pieces to this sub SELECT

  • Decide the ranking logic. This is equivalent to PARTION BY...ORDER BY in SQL Server. For this query, I want to partition by company and rank by total_amount, largest to smallest. To accomplish this, I double sort, first by company_id and then by total_amount.
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
  • Next, assign the company_id for each row to a variable
<span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id 
  • Finally, above this assignment, place the logic for either incrementing the ranking, or resetting it. This states that if the previous value of @comp_id is the same, then increase the rank (the next best sale day for the same company), otherwise reset to 1.
<span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>

3. Put it all together and sort
Now wrap the subquery, extracting only the columns needed and double sort again to get the same output as SQL Server.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL simplified

The query for MySQL can be simplified if you are not concerned about an extra column in the output(company_id) or about having the results properly sorted. They will still be ranked correctly, just not sorted by the database.

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id <span class="token keyword">AS</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

produces output similar to below:

company_name sale_date total_amount day_rank company_id
Company A 2017-07-02 900 1 1
Company A 2018-03-01 500 2 1
Company A 2018-03-03 100 3 1
Company B 2018-03-03 1000 1 2
Company B 2017-09-15 700 2 2
Company B 2017-12-01 650 3 2
Company B 2017-06-10 560 4 2

Conclusion

By using this variable assignment and increment technique with proper sorting, we now have the ability to partition and assign row numbers to SELECT statements in MySQL. I hope this has been informative and adds yet another item to your MySQL toolbox!


Featured image by Nacho Capelo @ unsplash.com