SQL Server Integration Services (SSIS) is a popular and mature tool for performing data movement and cleanup operations. In the Microsoft ecosystem, SSIS development is one of the most common extract, load, and transform (ETL) tools in use today. SSIS is powerful and configurable, yet surprisingly easy to use.
Before we explore the machinery of SSIS, it’s important to understand some of the terms you’ll often hear used by SSIS professionals. Here are some of the key terms and phrases you’ll need to know to be successful with SSIS.
- Package: The SSIS package is the central component of the SSIS code, and it’s the canvas on which you will spend most of your development time. An SSIS package is a collection of one or more operations that are invoked together.
- Task: A task is a single operation within a package. There are dozens of different types of tasks available in SSIS.
- Component: A component is part of a data pipeline, representing either a source from which data is retrieved, a destination to which data is written, or a transformation that manipulates or reshapes the data.
- Execution: This is the act of invoking, or running, the logic in an SSIS package. Packages can be executed from within the SQL Server Data Tools (SSDT) development environment or directly on a properly configured instance of SQL Server.
- Deployment: A deployment occurs when the fully developed SSIS project is pushed from the development workstation to an instance of SQL Server, where it can then be executed either manually or through a scheduling tool such as SQL Server Agent. Deployment is usually more complex than copying code from one machine to another, although SQL Server does a good job of hiding that complexity for most deployments.
- Project: Source code in SSIS is arranged into functional units called projects. A project can contain one package or many packages. In most cases, when deploying SSIS code, the entire project is deployed to the server.
- Solution: A solution is a logical grouping of related projects.
- The SSIS runtime engine: This is the logic that allows a package to run. When you’re working with SSIS packages in SSDT, the packages will be executed using the SSIS runtime on your development machine. After the code is deployed to SQL Server, any execution run on that server will use the server’s SSIS runtime.
The SSIS Toolset
SSIS is licensed as part of SQL Server, and the internals of running an SSIS package can be installed as part of the normal SQL Server install. As an SSIS developer, however, you will need a couple of additional tools installed on your workstation:
- SSDT: SSDT is a lighter version of Microsoft Visual Studio, configured with the extensions to create SSIS projects. When building or testing SSIS packages, SSDT is the tool in which you will spend most of your time.
- SQL Server Management Studio (SSMS): Although not strictly required for building SSIS packages, at some point during the development cycle you’ll need SSMS to test and deploy the ETL processes you create in SSIS.
When setting up your SSIS development environment, it’s important to remember that you must install the SSDT version of Visual Studio to be able to work with SSIS projects. If you already have Visual Studio installed, you’ll need to download and install the SSDT components.
The SSIS Development Environment
If you’ve ever worked with Visual Studio, the SSIS development environment will look familiar. SSDT uses the same multiple-document interface (MDI) that enables you to open or close each SSIS package independently, open several packages at once, and conveniently dock them as needed.
Inside this development environment, you’ll be frequently using the following assets:
- Solution Explorer: This window allows easy browsing of the current solution and all the files contained within it. You can also add, delete, or rename files in the current solution.
- Properties Window: This context-aware window shows the properties for the currently selected item.
- SSIS Toolbox: Like the name implies, this is your toolbox of available operations in SSIS. This list is also context-aware and shows the tools for either the control flow or data flow (more on these later in the article) based on where you’re currently working.
- SSIS Package: This is the package you’re currently working on. Although a single package is shown in Figure 1, the MDI allows you to have more than one package open at the same time.
- Connection Manager Tray: This is the area where any currently available connections are shown. We’ll go into more details on the connection managers shortly.