Integration Services Tutorials: Tutorial: Creating A Basic Package Using A Wizard
Microsoft Integration Services makes it easy to create robust and complex solutions for
extracting, transforming, and loading data (ETL). The Integration Services tools let you
design, create, deploy, and manage packages that address everyday business
requirements. The step-by-step tutorials in the following list will help you learn about the
Business Intelligence Development Studio environment. The first two tutorials
demonstrate how to create packages in an Integration Services project and guide you
through use of the Integration Services tools so that you can work efficiently from the
very start. The third tutorial demonstrates how to use the Integration Services tools to
easily install packages and their dependencies to a different computer.
Microsoft Integration Services provides the SQL Server Import and Export Wizard for
building packages that perform data transfers.
• These packages can extract data from a source and load it into a destination, but
the package can perform only minimal data transformation in the transfer
process.
• In addition, the wizard is a quick way to create basic packages that can then be
enhanced in SSIS Designer.
• In this tutorial, you will learn how to use the SQL Server Import and Export
Wizard to create a basic package.
• The package that you create extracts data from an Excel workbook and loads it
into a table in the AdventureWorks2008R2 database.
• The table is defined as one of the steps in the wizard and then created
dynamically when you run the package.
• In subsequent lessons, the package will be expanded to include a data flow that
sorts the data, creates a new column, and populates the column with values.
• To generate the new values, you will learn how to use the new Integration
Services expression language together with the graphical expression builder to
write an expression that creates new values based on existing data columns.
• When you install the sample data that the tutorial uses, you also install the
completed versions of the packages for each lesson of the tutorial.
• By using the completed lesson 1 package, you can skip ahead and begin the
tutorial with lesson 2 if you like.
• If this is your first time working with packages, the SQL Server Import and Export
Wizard, or the new development environment, we recommend that you begin
with lesson 1.
Requirements
• This tutorial is intended for users who are familiar with fundamental database
operations but have limited exposure to the new features available in SQL
Server Integration Services.
To use this tutorial, your system must meet the following requirements:
• You must run the package that this tutorial creates in 32-bit mode.
o This sample uses the Microsoft Jet 4.0 OLE DB provider, for which there is no
64-bit version.
o The package fails if you run it in 64-bit mode.
o For more information about running packages in 32-bit mode on a 64-bit
computer, see 64-bit Considerations for Integration Services.
• SQL Server with the AdventureWorks2008R2 database must be installed on the
computer.
o To enhance security, the sample databases are not installed by default.
o To install the sample databases, see Considerations for Installing SQL Server
Samples and Sample Databases.
• You must have permission to create and drop tables in AdventureWorks2008R2.
• The sample data must be installed on the computer.
o The sample data is installed together with the samples.
o If you cannot find the sample data, return to the procedure above and
complete installation as described.
Note
When reviewing tutorials, we recommend that you add Next and Previous buttons to the
document viewer toolbar. For more information, see Adding Next and Previous Buttons
to Help.
• This tutorial assumes that you have not reconfigured SSIS Designer to use auto-
connect features between control flow elements or between data flow elements.
• If SSIS Designer uses auto-connect, an element may be connected automatically
when added to the design surface.
• Also, the auto-connect feature for control flow supports the use of Failure and
Completion as the default constraint, instead of Success.
• If SSIS Designer is not using Success as its default constraint, you should reset
this configuration while doing the tutorial.
• You configure the auto-connect features in the Business Intelligence
Designers section in the Options dialog box that is available from Options on
the Tools menu.
In this lesson, you will create a basic package by using the SQL Server Import and
Export Wizard. The package selects and extracts data from an Excel spreadsheet and
writes that data to the ProspectiveCustomers table in the
AdventureWorks2008R2 sample database.
• The table is defined in the wizard and created when you run the package.
• The SQL Server Import and Export Wizard will be run in Business Intelligence
Development Studio and you will launch the wizard from an Integration Services
project.
• After you complete the SQL Server Import and Export Wizard, the package is
added to the Integration Services project.
• You will open the package in SSIS Designer, the Integration Services graphical
tool for building complex packages, and verify that certain properties of the
package are configured correctly.
• Finally, you will test the package by running it in Business Intelligence
Development Studio.
Important
This tutorial requires the AdventureWorks2008R2 sample database. For more
information on installing and deploying AdventureWorks2008R2, see Considerations for
Installing SQL Server Samples and Sample Databases.
Lesson Tasks
This lesson contains the following tasks:
4. Verify that the Excel version box contains Microsoft Excel 97-2003 and
the First Row has column names check box is selected.
5. On the Choose Destination page, do the following steps:
1. In the Destination list, select SQL Server Native Client, and in the
Server name box, type localhost.
When you specify localhost as the server name, the connection manager
connects to the default instance of SQL Server on the local computer. To
use a remote default instance or a named instance of SQL Server, replace
localhost with the name of the server or server and named instance to
which you want to connect. To connect to a named instance, use the
format <server name>\<instance name>.
2. If the instance of the Database Engine that you specified supports
Windows Authentication, use the default Windows Authentication mode;
otherwise, click Use SQL Server Authentication and type a user name
in the User name box and a password in the Password box.
3. In the Database list, select AdventureWorks2008R2.
6. On the Specify Table Copy or Query page, click Write a query to specify the
data to transfer.
7. On the Provide a Source Query page, in the SQL statement box, type or copy
the following SQL statement:
SELECT * FROM [Customers$] WHERE NumberCarsOwned > 0
8. On the Select Source Tables and Views page, do the following steps:
1. In the Destination list, click [dbo].[Query], and then change the table
name, Query, to ProspectiveCustomers.
2. To edit column metadata and table options, click Edit Mappings.
9. On the Column Mappings page, do the following steps:
1. Verify that the Create destination table option is selected, select the
Drop and re-create destination table check box, and modify the
metadata of the destination columns. The following table lists the columns
and the metadata changes that you need to make:
Column name           Default type  Updated type  Default size  Updated size
FirstName             nvarchar      No change     255           50
MiddleInitial         nvarchar      nchar         255           1
LastName              nvarchar      No change     255           50
BirthDate             datetime      No change     N/A           N/A
MaritalStatus         nvarchar      nchar         255           1
Gender                nvarchar      nchar         255           1
EmailAddress          nvarchar      No change     255           50
YearlyIncome          float         money         N/A           N/A
TotalChildren         float         tinyint       N/A           N/A
NumberChildrenAtHome  float         tinyint       N/A           N/A
Education             nvarchar      No change     255           50
Occupation            nvarchar      No change     255           50
HouseOwnerFlag        float         bit           N/A           N/A
NumberCarsOwned       float         tinyint       N/A           N/A
AddressLine1          nvarchar      No change     255           60
AddressLine2          nvarchar      No change     255           60
City                  nvarchar      No change     255           30
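As a rough illustration of how the metadata changes listed in the table carry into the CREATE TABLE statement that the wizard generates, the following Python sketch builds a column definition from one table row. The wizard produces the real statement itself; the function name column_definition and the tuple layout are assumptions made only for this example.

```python
# Hypothetical helper, not part of Integration Services: it mirrors how a row
# of the metadata table corresponds to a column definition in CREATE TABLE.

def column_definition(name, default_type, updated_type=None, updated_size=None):
    """Return a bracketed column definition such as '[City] nvarchar(30)'."""
    # "No change" in the table is represented here as updated_type=None.
    sql_type = updated_type or default_type
    if updated_size is None:
        # Types such as datetime, money, tinyint, and bit take no size (N/A).
        return f"[{name}] {sql_type}"
    return f"[{name}] {sql_type}({updated_size})"


# A few rows taken from the table.
rows = [
    ("FirstName", "nvarchar", None, 50),
    ("MiddleInitial", "nvarchar", "nchar", 1),
    ("YearlyIncome", "float", "money", None),
    ("City", "nvarchar", None, 30),
]

for row in rows:
    print(column_definition(*row))
```

Each printed line has the same shape as a column line in the CREATE TABLE statement that you edit in Lesson 2.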
Important
The first time you run the package, the task named Drop table(s) SQL Task will fail. This
behavior is expected. The reason the task fails is that the package attempts to drop and
re-create the ProspectiveCustomers table; however, the first time that the package runs
the table does not exist and the DROP statement fails. This does not cause the package
to fail because the precedence constraint between the Drop table(s) SQL Task and the
Preparation SQL Task is set to Completion rather than Success.
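The effect of the Completion constraint described in this note can be sketched in Python. This is only an analogy: it uses an in-memory SQLite database in place of SQL Server, a one-column stand-in for the ProspectiveCustomers table, and hypothetical helper names (run_step, drop_and_recreate).

```python
import sqlite3

DROP_SQL = "DROP TABLE ProspectiveCustomers"
CREATE_SQL = "CREATE TABLE ProspectiveCustomers (FirstName TEXT)"  # stand-in schema


def run_step(conn, sql):
    """Run one statement; return True on success and False on failure."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError:
        return False


def drop_and_recreate(conn, constraint="Completion"):
    """Run the drop step, then decide whether the create step may run."""
    dropped = run_step(conn, DROP_SQL)
    if constraint == "Success" and not dropped:
        # A Success constraint would block the next task after a failure.
        return dropped, False
    # A Completion constraint lets the next task run either way.
    created = run_step(conn, CREATE_SQL)
    return dropped, created


conn = sqlite3.connect(":memory:")
print(drop_and_recreate(conn))  # first run: the table does not exist yet
print(drop_and_recreate(conn))  # second run: the table exists and is dropped
```

On the first run the drop step reports a failure but the create step still runs, which is the behavior the note describes. Passing constraint="Success" shows the alternative, where a failed drop would stop the table from being created.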
Data Flow
Also, the package should include the following two connection managers. One connects
to the Excel workbook file Customers.xls and the other one connects to the
AdventureWorks2008R2 database.
In Lesson 1: Creating the Basic Package, we used the SQL Server Import and Export
Wizard to get a quick start on a basic Integration Services package.
The package has limited functionality; it only extracts data from an Excel
workbook file and loads the data into the ProspectiveCustomers table of the
AdventureWorks2008R2 sample database.
Typically, a package also needs to manipulate and transform the data.
Integration Services provides a wealth of transformations that you can use to
copy, cleanse, modify, sort, and aggregate data.
If you need to transform data in ways that are not supported by the standard
transformations, you can easily write a script for the Script transformation or
code a custom transformation to address your needs.
In this lesson you will enhance the basic package to sort the data and add a new
column based on values from other columns to the dataset.
In this scenario, one column contains null values, which present problems when
concatenating values from existing columns.
To work around this problem and generate the value for the new column, you will
use a new Integration Services feature—expressions.
The Integration Services expression language includes functions, operators, and
type casts that you can use to build complex expressions.
You will use an expression to concatenate the values from three columns and
conditionally insert a space between columns, and then add the new value to the
new column.
Because a new column is added to the dataset, the ProspectiveCustomers table
and the OLE DB destination must be modified to include this column.
You will update both the SQL statement in the Execute SQL task that created the
ProspectiveCustomers table, and the OLE DB destination that writes data to the
table, to include this new column.
You will also map the new column in the dataset to the new column in the table.
In this lesson, you will copy and then enhance the basic package created in Lesson 1. If you have
not completed the previous lesson, you can also copy the completed package for Lesson
1 that is included with the tutorial.
Important
This tutorial requires the AdventureWorks2008R2 sample database. For more
information about how to install and deploy AdventureWorks2008R2, see Considerations
for Installing SQL Server Samples and Sample Databases.
Lesson Tasks
This lesson contains the following tasks:
• Step 1: Copying the Lesson 1 Basic Package
• Step 2: Updating the Execute SQL Task
• Step 3: Adding and Configuring the Sort Transformation
• Step 4: Adding and Configuring the Derived Column Transformation
• Step 5: Modifying the OLE DB Destination
• Step 6: Testing the Lesson 2 Basic Package
In this task, you will create a copy of the package that you created in Lesson 1, named
Basic Package Lesson 1.dtsx. If you did not complete Lesson 1, you can add the
completed Lesson 1 package that is included with the tutorial to the project, and then
copy it instead. You will use this new copy throughout the rest of Lesson 2.
In this task, you will update the SQL statement in the Execute SQL task named
Preparation SQL Task. The existing SQL statement was automatically generated from the
options you specified when you stepped through the SQL Server Import and Export
Wizard pages to create the lesson 1 package. This SQL statement creates the Query
table in the AdventureWorks2008R2 database when the package is run.
Later in this lesson, you will generate an additional column to the data that is extracted
from the Excel spreadsheet, and you need to include a definition of that column in the
SQL statement.
4. In the Enter SQL Query dialog box, add a comma at the end of the line, [Phone]
nvarchar (50), press Enter, and on the new line, type [FullName] nvarchar (103).
The completed SQL statement should look like this:
CREATE TABLE [AdventureWorks2008R2].[dbo].[Query] (
[FirstName] nvarchar(50),
[MiddleInitial] nchar(1),
[LastName] nvarchar(50),
[BirthDate] datetime,
[MaritalStatus] nchar(1),
[Gender] nchar(1) NOT NULL,
[EmailAddress] nvarchar(50),
[YearlyIncome] money,
[TotalChildren] tinyint,
[NumberChildrenAtHome] tinyint,
[Education] nvarchar(50),
[Occupation] nvarchar(50),
[HouseOwnerFlag] bit,
[NumberCarsOwned] tinyint,
[AddressLine1] nvarchar(60),
[AddressLine2] nvarchar(60),
[City] nvarchar(30),
[State] nchar(3),
[ZIP] float,
[Phone] nvarchar(50),
[FullName] nvarchar (103)
)
GO
5. Click OK.
6. Click Parse Query. The SQL statement should parse successfully.
7. Click OK.
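If you want a quick check outside SSIS Designer that an edited statement is still well formed, something like the following Python sketch can help. It is an approximation under stated assumptions: it runs a shortened copy of the statement against an in-memory SQLite database rather than SQL Server, the three-part table name is reduced to [Query] because SQLite does not use database.schema.table names, and the helper name created_columns is hypothetical. Parse Query in the Execute SQL task remains the authoritative check.

```python
import sqlite3

# Shortened version of the statement: only a few columns are kept, and the
# three-part SQL Server name is reduced to [Query] for SQLite.
CREATE_SQL = """
CREATE TABLE [Query] (
[FirstName] nvarchar(50),
[MiddleInitial] nchar(1),
[LastName] nvarchar(50),
[Phone] nvarchar(50),
[FullName] nvarchar (103)
)
"""


def created_columns(create_sql):
    """Run the statement in a scratch database and list the table's columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(create_sql)
        cursor = conn.execute("SELECT * FROM [Query]")
        # cursor.description holds one entry per column; the name comes first.
        return [column[0] for column in cursor.description]
    finally:
        conn.close()


print(created_columns(CREATE_SQL))
```

The returned list should end with FullName, which confirms that the new column definition was added after [Phone].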
In this task, you will add and configure a Sort transformation to your package. A Sort
transformation is a data flow component that sorts data, and optionally applies rules to
the comparison that the sort performs. The Sort transformation can also be used to
remove rows of data that have duplicate sort key values.
The Sort transformation will sort the data extracted from the Excel spreadsheet by state
and by city.
1. Open the Data Flow designer, either by double-clicking Data Flow Task or by
clicking the Data Flow tab.
2. Right-click the path (the green arrow) between Data Conversion and
Destination - Query and then click Delete.
3. In the Toolbox, expand Data Flow Transformations, and then drag Sort onto
the design surface of the Data Flow tab, below Data Conversion. If
Destination - Query is in the way, click it and drag it to a position lower on the
Data Flow design surface.
4. On the Data Flow design surface, click Sort in the Sort transformation, and
change the name to Sort by State and City.
5. Click Source - Query and drag its green arrow to Sort by State and City.
6. Double-click Sort by State and City to open the Sort Transformation Editor
dialog box.
7. In the Available Input Columns list, first select the check box to the left of the
State column, and then select the check box by the City column.
The columns now appear in the Input Column list. State has the sort order 1
and City has the sort order 2. This means that the dataset is sorted first by state
and then by city.
8. In the Input Column list, click the row that contains State. Click the
Comparison Flags box, select the Ignore case check box, and then click OK.
9. Click OK.
10. Right-click Sort by State and City and then click Properties.
11. In the Properties window, verify that the LocaleID property is set to English
(United States).
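The behavior configured in these steps can be approximated in Python as follows. This is a sketch of the idea, not of how the Sort transformation is implemented: casefold() stands in for the Ignore case comparison flag, the remove_duplicates parameter is a hypothetical stand-in for the transformation's option to remove rows with duplicate sort key values, the sample rows are made up, and State and City are assumed to contain no nulls.

```python
def sort_by_state_and_city(rows, remove_duplicates=False):
    """Sort rows by State (ignoring case), then by City."""

    def sort_key(row):
        # casefold() stands in for the Ignore case comparison flag on State.
        return (row["State"].casefold(), row["City"])

    ordered = sorted(rows, key=sort_key)  # sorted() is stable
    if not remove_duplicates:
        return ordered

    # Keep the first row seen for each sort key and drop the rest.
    unique, seen = [], set()
    for row in ordered:
        key = sort_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample rows for illustration only.
rows = [
    {"State": "wa", "City": "Seattle"},
    {"State": "CA", "City": "Fresno"},
    {"State": "WA", "City": "Bothell"},
    {"State": "WA", "City": "Seattle"},
]
for row in sort_by_state_and_city(rows):
    print(row["State"], row["City"])
```

Because State is compared without regard to case, "wa" and "WA" sort together, and City then decides the order within each state.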
In this task, you will add a Derived Column transformation to your package. A Derived
Column transformation is a data flow component that creates new data values by using
values in a dataset, constants, and variables, or by applying functions. You will use this
transformation to add a new column and then populate the column with the evaluation
results of an expression.
The user interface for the Derived Column transformation includes the expression
builder. This graphical tool makes it easy to quickly write complex expressions using
drag and drop operations, and provides templates for functions, type casts, and
operators as well as the input columns and variables.
In the Derived Column transformation, you will create an expression that concatenates
the values in the FirstName, MiddleInitial, and LastName columns in the dataset and
then writes the result to a new column. Because the middle initial may be null, the
expression will include special handling of this column. The new column, FullName, will
be added to the transformation output.
1. If not already open, open the Data Flow designer, either by double-clicking Data
Flow Task or by clicking the Data Flow tab.
2. In the Toolbox, expand Data Flow Transformations, and then drag a Derived
Column transformation onto the design surface of the Data Flow tab, below
Sort by State and City.
3. On the Data Flow design surface, click Derived Column in the Derived Column
transformation, and change the name to Add FullName Column.
4. Click Sort by State and City and drag its green arrow to Add FullName
Column.
5. Double-click Add FullName Column to open the Derived Column
Transformation Editor dialog box.
6. In the left pane, expand the Columns folder, click the FirstName column and
drag it to the Expression box.
7. In the Expression box, after [FirstName], type + " " +.
8. In the Columns folder, click the MiddleInitial column and drag it to the
Expression box.
9. Update [MiddleInitial] to (ISNULL([MiddleInitial]) ? "" : [MiddleInitial] + " ") +.
10. In the Columns folder, click the LastName column and drag it to the Expression
box.
11. Verify that the value in the Expression box is the following:
[FirstName] + " " + (ISNULL([MiddleInitial]) ? "" : [MiddleInitial] + " ") +
[LastName]
You may optionally remove the brackets that enclose column names in the
expression. The column names are regular identifiers, which do not need to be
enclosed in brackets. Names that contain invalid characters, such as spaces, must
be enclosed in brackets. If the expression has been typed incorrectly, the
expression text will appear in red.
12. In the Derived Column box for the row you just created, select <add as new
column>.
13. In the Derived Column Name box for the same row, type FullName.
14. If the Data Type box is not already set to Unicode string [DT_WSTR], select
Unicode string [DT_WSTR] in the Data Type list.
15. Set the value of the Length box to 103 (the sum of the lengths of the
FirstName, MiddleInitial, and LastName columns, plus two spaces).
16. Click OK.
17. In the Properties window, verify that the LocaleID property is set to English
(United States).
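To see why the expression treats MiddleInitial specially, and where the length 103 comes from, the logic can be mirrored in Python. This sketch makes two assumptions: Python's None stands in for a database null, and naive_full_name models a null operand turning the whole concatenation into null. The function names and sample values are hypothetical.

```python
def naive_full_name(first_name, middle_initial, last_name):
    """Plain concatenation, modeling a null that propagates to the result."""
    if None in (first_name, middle_initial, last_name):
        return None
    return first_name + " " + middle_initial + " " + last_name


def full_name(first_name, middle_initial, last_name):
    """Mirror of the tutorial expression with the conditional on MiddleInitial."""
    # (ISNULL([MiddleInitial]) ? "" : [MiddleInitial] + " ")
    middle = "" if middle_initial is None else middle_initial + " "
    return first_name + " " + middle + last_name


# Longest possible result:
# FirstName (50) + space (1) + MiddleInitial (1) + space (1) + LastName (50).
MAX_FULL_NAME_LENGTH = 50 + 1 + 1 + 1 + 50

print(naive_full_name("Ana", None, "Diaz"))  # the null middle initial wins
print(full_name("Ana", None, "Diaz"))        # one space between the names
print(full_name("Ana", "L", "Diaz"))         # initial followed by a space
print(MAX_FULL_NAME_LENGTH)
```

Like the tutorial expression, full_name guards only the middle initial; it assumes FirstName and LastName are never null.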
Earlier in lesson 2, you updated the SQL statement in the Execute SQL task,
Preparation SQL Task, to include a definition of the FullName column in the Query
table. In this task, you will modify the OLE DB destination, Destination - Query, to
support the FullName column.
You will also restore the column mappings in Destination - Query that are no longer
valid because you added a Sort transformation to the data flow. The Sort transformation
generates a new set of columns with different column identifiers, and you therefore need
to remap the input columns and destination columns in Destination - Query.
1. If not already open, open the Data Flow designer, either by double-clicking Data
Flow Task or by clicking the Data Flow tab.
2. Click the Derived Column transformation named Add FullName Column and
drag its green arrow to Destination - Query.
3. Double-click Destination - Query.
4. In the Restore Invalid Column Reference Editor dialog box, click Select All,
select the <Map using column name> option in the Column mapping option
for selected rows list, and then click Apply.
You can clear the Include downstream invalid column references check box.
In this package, there are no downstream data flow components and this option
has no effect.
5. Click OK.
6. Right-click Destination - Query and click Show Advanced Editor.
7. In the Advanced Editor dialog box, click the Input and Output Properties
tab, expand Destination Input, click External Columns, and then click Add
Column.
A new column named Column is added to the External Columns folder.
8. Click the new column.
9. In the right-hand pane, update the Name property to FullName, click the
DataType property and select Unicode string [DT_WSTR] from the list. Update
the Length property to 103.
10. Click the Column Mappings tab, and scroll down to the row with FullName in
the Destination Column list. Click <ignore> in the Input Column list of that
row, and then click FullName in the list.
11. Verify that all input and output columns that have the same names are mapped.
12. Click OK.
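The check in step 11 amounts to matching input columns to destination columns by name. A minimal Python sketch of that idea follows, assuming exact name matches; the function names and the shortened column lists are hypothetical, and the editor's own matching rules may differ.

```python
IGNORE = "<ignore>"


def map_by_column_name(input_columns, destination_columns):
    """Map each destination column to the input column with the same name."""
    available = set(input_columns)
    return {
        destination: (destination if destination in available else IGNORE)
        for destination in destination_columns
    }


def unmapped(mapping):
    """List the destination columns that no input column feeds."""
    return [name for name, source in mapping.items() if source == IGNORE]


# Shortened, illustrative column lists.
inputs = ["FirstName", "MiddleInitial", "LastName", "FullName"]
destinations = ["FirstName", "MiddleInitial", "LastName", "Phone", "FullName"]

mapping = map_by_column_name(inputs, destinations)
print(mapping)
print(unmapped(mapping))
```

Any destination column that still shows <ignore> after mapping is one to review before you click OK.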
• Added and configured a Sort transformation to sort the dataset by state and then
by city.
• Added a Derived Column transformation and configured it to use an expression to
generate values for a new column.
• Modified the OLE DB destination to write the new column, FullName, to the
ProspectiveCustomers table.
Important
The first time you run the package, the Drop table(s) SQL Task will fail. This behavior is
expected. The reason this happens is that the package attempts to drop and re-create
the ProspectiveCustomers table; however, the first time that the package runs the table
does not exist and the DROP statement fails.
Data Flow
Also, the package should include the following two connection managers. One connects
to the Customers.xls Excel workbook file and the other one connects to the
AdventureWorks2008R2 database.
1. On the Start menu, point to All Programs, point to Microsoft SQL Server, and
click SQL Server Management Studio.
2. In the Connect to Server dialog, select Database Engine in the Server type
list, provide the name of the server on which the
AdventureWorks2008R2 database is installed in the Server name box, and select
an authentication mode option. If you select SQL Server Authentication, provide a
user name and a password.
3. Click Connect. SQL Server Management Studio opens.
4. On the toolbar, click New Query.
5. Type or copy the following query in the query window.
SELECT * FROM AdventureWorks2008R2.dbo.ProspectiveCustomers
6. On the toolbar, click Execute. The Results pane shows the dataset, including the
new FullName column. You can verify that your expression formatted the column
value correctly depending on whether the middle initial is null.
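If the result set is large, the visual check in step 6 can be made systematic. The following Python sketch assumes that you have already fetched the rows into tuples of (FirstName, MiddleInitial, LastName, FullName), with None representing a null; how you fetch them is up to you and is not shown. The function names and sample rows are hypothetical.

```python
def expected_full_name(first_name, middle_initial, last_name):
    """Join the non-null name parts with single spaces."""
    parts = [first_name, middle_initial, last_name]
    return " ".join(part for part in parts if part is not None)


def badly_formatted(rows):
    """Return the rows whose FullName differs from the expected value."""
    return [row for row in rows if row[3] != expected_full_name(*row[:3])]


# Made-up rows: the second one has a doubled space, as it would if the
# expression did not handle a null middle initial.
sample_rows = [
    ("Ana", "L", "Diaz", "Ana L Diaz"),
    ("Ben", None, "Okoro", "Ben  Okoro"),
    ("Cam", None, "Reyes", "Cam Reyes"),
]

for row in badly_formatted(sample_rows):
    print(row)
```

An empty result means every FullName value has exactly one space between its parts, whether or not the middle initial is null.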