Linked Server: SQL Server to MySQL

In this example, we will see how to set up a link to a MySQL instance using SQL Server. By doing this, we will have access to write queries against a remote MySQL database any time data access is needed.

Using SSIS is a great way to perform ETL operations and bulk load or process data coming from multiple databases, especially if various database vendors are involved. However, setting up a full SSIS package just to do a quick query, import, or join is cumbersome and time consuming. Thankfully, setting up a Linked Server in SQL Server can be a straightforward task to keep productivity high in these types of situations.

The Goal

In this example, we will see how to set up a link to a MySQL instance using SQL Server. By doing this, we will have access to write queries against a remote MySQL database any time data access is needed. We will even be able to JOIN against tables in our SQL Server database to provide quick analysis across disparate database engines.

Installing the MySQL driver

The first step to setting up the linked server is to download and install the MySQL ODBC driver. For this, go to the MySQL ODBC driver download page and select the driver that meets your OS specifications.

Verifying the driver install

After installing the driver, check the ODBC data sources to make sure the driver was successfully installed. The easiest way to do this is to search for ‘ODBC’ from the Windows Start Menu and select either the 32-bit or 64-bit option depending on which driver was installed.

From the ODBC Data Source Administration window, select the ‘Drivers’ tab to verify the install. Take note of the driver name; this exact string will be used later when setting up the linked server.

Setting up the Linked Server

Now that the driver is installed, we are ready to set up the linked server. For this, open SQL Server Management Studio and connect to the local server. Expand Object Explorer tree for the connection to find the ‘Linked Servers’ section (Server Objects -> Linked Servers)

Right click on ‘Linked Servers’ and click ‘New Linked Server…’

This will bring up the ‘New Linked Server’ dialog.

Give the linked server a name (in my example ‘DATAWAREHOUSE’) . Since we are linking to a MySQL instance, select ‘Other data source’ and fill out as follows:

  • Provider: Microsoft OLE DB Provider for ODBC Drivers
  • Product name: MySQL

Now we need to fill out the Provider string for the connection. For this example, my provider string is below:

Driver<span class="token operator">=</span>{MySQL ODBC <span class="token number">5.3</span> ANSI Driver}<span class="token punctuation">;</span><span class="token keyword">DATABASE</span><span class="token operator">=</span>testapp<span class="token punctuation">;</span>PWD<span class="token operator">=</span>test<span class="token punctuation">;</span>UID<span class="token operator">=</span>jon<span class="token punctuation">;</span><span class="token keyword">OPTION</span><span class="token operator">=</span><span class="token number">134217728</span><span class="token punctuation">;</span>SERVER<span class="token operator">=</span>my_remote_server<span class="token punctuation">;</span>PORT<span class="token operator">=</span><span class="token number">3306</span><span class="token punctuation">)</span>

Notice the driver name is exactly the name of the driver from the ODBC driver installation. The MySQL ODBC driver has many options available for the provider string. We will only concentrate on the few needed for a basic connection, however, you should take a look at the full list here in order to make sure your connection does not need additional values.

One additional parameter for us connecting from SQL Server is the OPTION parameter. For connections from SQL Server, we set this to the recommended value of 134217728. Be sure to see the full list of recommended parameter options depending on your configuration.

One final step before creating the linked server is to change make sure that ‘RCP’ and ‘RCP Out’ are set to True on the ‘Server Options’ tab.

Click ‘OK’ to created the linked server. The server should now appear in the Object Explorer.

An alternative to creating a linked server is to script the creation.

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_addlinkedserver <span class="token variable">@server</span> <span class="token operator">=</span> N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@srvproduct</span><span class="token operator">=</span>N<span class="token string">'MySQL'</span><span class="token punctuation">,</span> <span class="token variable">@provider</span><span class="token operator">=</span>N<span class="token string">'MSDASQL'</span><span class="token punctuation">,</span> <span class="token variable">@provstr</span><span class="token operator">=</span>N<span class="token string">'Driver={MySQL ODBC 5.3 ANSI Driver};DATABASE=testapp;OPTION=134217728;SERVER=my_remote_server;PORT=3306)'</span>

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_addlinkedsrvlogin <span class="token variable">@rmtsrvname</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span><span class="token variable">@useself</span><span class="token operator">=</span>N<span class="token string">'False'</span><span class="token punctuation">,</span><span class="token variable">@locallogin</span><span class="token operator">=</span><span class="token boolean">NULL</span><span class="token punctuation">,</span><span class="token variable">@rmtuser</span><span class="token operator">=</span>N<span class="token string">'jon'</span><span class="token punctuation">,</span><span class="token variable">@rmtpassword</span><span class="token operator">=</span><span class="token string">'test'</span>

GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'collation compatible'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'data access'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'dist'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'pub'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'rpc'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'rpc out'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'sub'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'connect timeout'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'0'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'collation name'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span><span class="token boolean">null</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'lazy schema validation'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'false'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'query timeout'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'0'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'use remote collation'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>
GO

<span class="token keyword">EXEC</span> master<span class="token punctuation">.</span>dbo<span class="token punctuation">.</span>sp_serveroption <span class="token variable">@server</span><span class="token operator">=</span>N<span class="token string">'DATAWAREHOUSE'</span><span class="token punctuation">,</span> <span class="token variable">@optname</span><span class="token operator">=</span>N<span class="token string">'remote proc transaction promotion'</span><span class="token punctuation">,</span> <span class="token variable">@optvalue</span><span class="token operator">=</span>N<span class="token string">'true'</span>

Notice that we set up the login information for the remote server using sp_addlinkedsrvlogin instead of placing it in the Provider string of sp_addlinkedserver.

How to query the linked server

Now that the linked server is successfully set up, we can execute queries against it:

<span class="token keyword">EXEC</span><span class="token punctuation">(</span>
    <span class="token string">'SELECT
    *
    FROM
    company_sales'</span>
<span class="token punctuation">)</span> at DATAWAREHOUSE

A second method is to use OPENQUERY in order to be able to reference the linked server in a FROM clause just as if it were a normal table. This allows for the use of JOIN between two completely separate databases.

<span class="token keyword">SELECT</span>
    <span class="token operator">*</span>
<span class="token keyword">FROM</span> 
    <span class="token keyword">OPENQUERY</span><span class="token punctuation">(</span>DATAWAREHOUSE<span class="token punctuation">,</span><span class="token string">'select * from company_sales'</span><span class="token punctuation">)</span> <span class="token number">c</span>
<span class="token keyword">JOIN</span>
    legacy_companies <span class="token keyword">ON</span> <span class="token number">c</span><span class="token punctuation">.</span>company_id <span class="token operator">=</span> legacy_companies<span class="token punctuation">.</span>company_id

Or to insert the results into a local table.

<span class="token keyword">SELECT</span>
    <span class="token operator">*</span>
<span class="token keyword">INTO</span> 
    sales
<span class="token keyword">FROM</span> 
    <span class="token keyword">OPENQUERY</span><span class="token punctuation">(</span>DATAWAREHOUSE<span class="token punctuation">,</span><span class="token string">'select * from company_sales'</span><span class="token punctuation">)</span> <span class="token number">c</span>

Conclusion

Using ODBC drivers and SQL Server Linked Servers provides us with a powerful way to access a remote database in a convenient manner within queries and stored procedures on our local database. It may not completely replace your need for SSIS, but does provide a faster solution for querying and pulling in information.

Reproducing ROW_NUMBER() in MySQL

Recently I was working on a project involving MySQL and needed to partition and rank results in a SELECT statement. This is a pretty straightforward approach in SQL Server using ROW_NUMBER() or RANK(). In my particular case I needed functionality equivalent to ROW_NUMBER() with partitioning by a single column and sorting by another. Below is an alternative since earlier versions of MySQL do not have an equivalent ROW_NUMBER() function.

Recently I was working on a project involving MySQL and needed to partition and rank results in a SELECT statement. This is a pretty straightforward approach in SQL Server using ROW_NUMBER() or RANK(). In my particular case I needed functionality equivalent to ROW_NUMBER() with partitioning by a single column and sorting by another. Below is an alternative since earlier versions of MySQL do not have an equivalent ROW_NUMBER() function.

Data

I am interested in ranking the highest sale days for each company based on the table below.

company_id company_name sale_date total_amount
1 Company A 2018-03-01 500.00
2 Company B 2017-09-15 700.00
1 Company A 2017-07-02 900.00
2 Company B 2017-06-10 560.00
2 Company B 2018-03-03 1000.00
1 Company A 2018-03-03 100.00
2 Company B 2017-12-01 650.00

*company_sales table

SQL Server query

Using T-SQL for SQL Server my query would be this:

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    ROW_NUMBER<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">OVER</span><span class="token punctuation">(</span><span class="token keyword">PARTITION</span> <span class="token keyword">BY</span> company_id <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> total_amount <span class="token keyword">DESC</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank
<span class="token keyword">FROM</span>
    company_sales

and the results:

company_name sale_date total_amount day_rank
Company A 2017-07-02 900.00 1
Company A 2018-03-01 500.00 2
Company A 2018-03-03 100.00 3
Company B 2018-03-03 1000.00 1
Company B 2017-09-15 700.00 2
Company B 2017-12-01 650.00 3
Company B 2017-06-10 560.00 4

MySQL query

In order to perform the same query in MySQL, I need to do things a little differently as MySQL does not have directly equivalent functions to ROW_NUMBER() and RANK().

Conveniently, one feature that MySQL does have is the ability to assign and read variables inside a SELECT statement. Using this, I can write an equivalent statement in MySQL.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL discussion

The query for MySQL is noticeably longer and a little more complicated, but it gives us the exact same output. Let’s break it down a bit and step through each piece.

1. Initialize a ranking variable:

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>

2. Write the sub query

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

There are 3 important pieces to this sub SELECT

  • Decide the ranking logic. This is equivalent to PARTION BY...ORDER BY in SQL Server. For this query, I want to partition by company and rank by total_amount, largest to smallest. To accomplish this, I double sort, first by company_id and then by total_amount.
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
  • Next, assign the company_id for each row to a variable
<span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id 
  • Finally, above this assignment, place the logic for either incrementing the ranking, or resetting it. This states that if the previous value of @comp_id is the same, then increase the rank (the next best sale day for the same company), otherwise reset to 1.
<span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>

3. Put it all together and sort
Now wrap the subquery, extracting only the columns needed and double sort again to get the same output as SQL Server.

<span class="token keyword">SET</span> <span class="token variable">@rank</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    day_rank
<span class="token keyword">FROM</span> <span class="token punctuation">(</span>
    <span class="token keyword">SELECT</span>
        company_name<span class="token punctuation">,</span>
        sale_date<span class="token punctuation">,</span>
        total_amount<span class="token punctuation">,</span>
        <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
        <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id
    <span class="token keyword">FROM</span>
        company_sales
    <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>
<span class="token punctuation">)</span> rankings
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_name<span class="token punctuation">,</span> day_rank

MySQL simplified

The query for MySQL can be simplified if you are not concerned about an extra column in the output(company_id) or about having the results properly sorted. They will still be ranked correctly, just not sorted by the database.

<span class="token keyword">SELECT</span>
    company_name<span class="token punctuation">,</span>
    sale_date<span class="token punctuation">,</span>
    total_amount<span class="token punctuation">,</span>
    <span class="token variable">@rank</span> :<span class="token operator">=</span> <span class="token keyword">IF</span><span class="token punctuation">(</span> <span class="token variable">@comp_id</span> <span class="token operator">=</span> company_id<span class="token punctuation">,</span> <span class="token variable">@rank</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> day_rank<span class="token punctuation">,</span>
    <span class="token variable">@comp_id</span> :<span class="token operator">=</span> company_id <span class="token keyword">AS</span> company_id
<span class="token keyword">FROM</span>
    company_sales
<span class="token keyword">ORDER</span> <span class="token keyword">BY</span> company_id<span class="token punctuation">,</span> total_amount <span class="token keyword">DESC</span>

produces output similar to below:

company_name sale_date total_amount day_rank company_id
Company A 2017-07-02 900 1 1
Company A 2018-03-01 500 2 1
Company A 2018-03-03 100 3 1
Company B 2018-03-03 1000 1 2
Company B 2017-09-15 700 2 2
Company B 2017-12-01 650 3 2
Company B 2017-06-10 560 4 2

Conclusion

By using this variable assignment and increment technique with proper sorting, we now have the ability to partition and assign row numbers to SELECT statements in MySQL. I hope this has been informative and adds yet another item to your MySQL toolbox!


Featured image by Nacho Capelo @ unsplash.com