Thursday, 8 March 2018

Loading To Azure From On-Prem SSIS


Loading to Azure from On-Prem SSISIn my introduction to the topic of using SSIS in the cloud, I noted that SSIS is not currently available in a platform as a service (PaaS). Unlike SQL server databases and more recently SSAS, which are both available as both on-prem and PaaS services, Integration Services must be run on a user-installed instance of SQL Server. This limitation means that your SSIS package must run either in an on-premises server or in a cloud VM. In this post, I’ll address the former by loading to Azure from on-prem SSIS.

Loading to Azure from On-Prem SSIS

When loading on-premises data to an Azure relational database, the easiest solution will usually be to move the data using Integration Services running on an on-prem machine. As an aside, SSIS won’t necessarily be the best performing solution for loading Azure databases using on-prem data (especially when loading SQL Data Warehouse). I’m purposefully starting with the simplest design to ease into more complicated load scenarios later in this series.
In this scenario, moving on-prem data to an Azure relational database requires the following:
  • An Azure database. I’ll be demonstrating by using a SQL Database as a destination, but the method to load a SQL Data Warehouse from SSIS would be very similar.
  • On-prem data that can be moved to said database
  • An on-prem instance of SQL Server configured to execute SSIS packages

Set up the Connection Manager

There aren’t any special tools required to connect to an Azure database from SSIS. Use one of the native connection managers – OLEDB, ADO.NET, or ODBC – to make a connection to SQL Database or SQL Data Warehouse. To create the connection manager, you’ll need the full name of the Azure SQL Server, which can be found in the server properties blade of the Azure portal.
Azure SQL Database
You’ll need the login credentials for that server as well (unless you’re using Azure Active Directory). Also, you’ll have to add the public-facing IP address corresponding to the on-premises SSIS server to the Azure firewall inclusion list, if you haven’t already. You’ll get a helpful error message as shown below if you try to create an SSIS connection manager to the Azure database before the firewall rule is set up.
image
If this happens, you can use the Azure portal to configure the firewall rule to allow traffic from that server. Just keep in mind that the address to add in this rule is the internet-facing IP address (or range) used by the machine running SSIS, not the internal IP address of that machine.
image

Build the Load Logic

Once the connection manager to the Azure database has been created, you’ll build out the SSIS package just like any other on-prem load process. By this point, the Azure connection is just another connection, so you can use it as a data flow source or destination, or use the Execute SQL Task to run T-SQL against it. For this example, I’ll build a Data Flow task to load data from a locally-stored flat file containing aviation data into an Azure SQL Database table.
image
In the OLEDB destination above * (yes, I know that technically OLEDB is deprecated, but I’m going to keep driving it until the wheels fall off), I can select an existing table, or use the create table dialog box to create a new table just like an on-prem connection.
image
With the local flat file as the source and the Azure SQL Database table as the destination, I’ll run the package. Although it runs a bit slower than a fully on-prem load, the narrow 208,000-row set of data loads into the SQL Database in under 15 seconds.
image
In this example, I am loading to Azure from on-prem SSIS, using my instance of SQL Database as a destination. However, the data load could go either way. As noted, once the connection manager has been set up, SSIS treats an Azure database just like any other connection and can be used as a source, destination, or transformation. In fact, you could use on-premises SSIS to load data between two different Azure databases, but in that case you’d get better performance by running SSIS in an Azure VM (to be covered in the next post in this series).

Conclusion

Although SSIS is not yet available in a PaaS architecture, loading to Azure from on-prem SSIS is a good option for local-to-cloud load scenarios. As shown here, connecting to an Azure database is relatively simple, and works similarly on instances of both SQL Database and SQL Data Warehouse.
* Update 12/8/2017: In October of this year, Microsoft announced that they have undeprecated OLE DB, and it will continue to be a part of the data access stack.

No comments:

Post a Comment