Slowly Changing Dimension transformation in SSIS
In this article, I create a simple ETL package that inserting and updating the records from Source database to data warehouse dimension database/tables using Slowly Changing Dimension transformation.I am going to provide you the steps and guidance needed to manage Slowly Changing Dimension with Slowly Changing Dimension Transformation in data flow task with an example.
The Slowly Changing Dimension transformation used to inserting and updating the records from Source database to data warehouse dimension database/tables.
We have one source SQL tables an "EMP". We want to insert and update the records into data warehouse dimension table "DimEmp"
SQL Scripts:-
Source Table:-
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Emp](
[EMPLOYEE_ID] [nchar](20) NOT NULL,
[FIRST_NAME] [nchar](20) NULL,
[LAST_NAME] [nchar](20) NULL,
[DEPARTMENT] [nchar](20) NULL,
[COUNTRY] [nchar](10) NULL,
[COMPANYNAME] [nchar](20) NULL,
CONSTRAINT [PK_Emp] PRIMARY KEY CLUSTERED
(
[EMPLOYEE_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_001', N'John', N'Abraham', N'Banking', N'USA', N'Ernst Handel
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_002', N'Michael', N'Clarke', N'Insurance', N'Canada
', N'Around the Horn
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_003', N'Roy', N'Thomas', N'Services', N'USA', N'Around the Horn Ltd ')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_004', N'Tom', N'Jose', N'Insurance', N'Germany
', N'Frankenversand
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_005', N'Jerry', N'Pinto', N'Insurance', N'UK
', N'Around the Horn
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_006', N'Philip', N'Mathew', N'Services', N'Canada
', N'Around the Horn
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_007', N'Cook', N'Adams', N'Services', N'USA', N'Ernst Handel
')
GO
INSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME]) VALUES (N'EMP_008', N'Ian', N'Botham', N'Insurance', N'Germany', N'Frankenversand
')
GO
Dimension Table:- SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DimEmp](
[EMPLOYEEID] [int] IDENTITY(1,1) NOT NULL,
[EMPLOYEE_CODE] [nchar](20) NOT NULL,
[FIRST_NAME] [nchar](20) NULL,
[LAST_NAME] [nchar](20) NULL,
[DEPARTMENT] [nchar](20) NULL,
[COUNTRY] [nchar](10) NULL,
[COMPANYNAME] [nchar](20) NULL,
[EFFECTIVEDATE] [date] NULL,
[EXPIRATIONDATE] [date] NULL,
[STATUSFLAG] [BIT] NULL,
CONSTRAINT [PK_DimEMPLOYEEID] PRIMARY KEY CLUSTERED
(
[EMPLOYEEID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Slowly Changing Dimension
1. From the Toolbox Double-click or drag the two "OLB DB SOURCE" component to the Data Flow Task.
2. We first need to select one of the connection managers we created earlier. From the OLE DB connection manager drop-down list, select the connection manager you created to the "Test" database. On my system, the name of the connection manager is "LocalHost.Test". Next, select the "dbo.Emp" table from the Name of the table or the view drop-down list. The OLE DB Source Editor should now look similar to the one shown in below image
3. From the Toolbox Double-click or drag the two "Slowly Changing Dimension" component to the Data Flow Task.
4. Right click and Edit the Slowly Changing Dimension component. Select one of the connection managers we created earlier. From the OLE DB connection manager drop-down list, select the connection manager you created to the "Test" database. On my system, the name of the connection manager is "LocalHost.Test".
5. Next, select the "dbo.DimEmp" table from the Name of the table or the view drop-down list.
6. Next, Map the input columns with dimension columns and select a Business Key column. The SCD Editor should now look similar to the one shown in below image
The Slowly Changing Dimension transformation supports below types of changes:
Fixed Attribute
Select this type when the value in a column should not change. Changes are treated as errors.
Changing Attribute
Select this type when changed values should overwrite existing values. This is a type 1 change.
Historical Attribute
Select this type when changes in column values are saved in new records. Previous values are saved in records marked as outdated. This is a Type 2 change.
7. Select the Dimension columns and change type as shown in below image
8. Click next. Mark the Check box to fail the transformation if it fined the changes in the fixed attribute. The SCD Editor should now look similar to the one shown in below image
9. Select the single column or start & end date columns for record historical attributes. Select start date and end date as shown in below image
10. The remaining components will be created automatically as show in below image
11. Execute the package. It will insert the 8 rows into dimension tables. Refer the below screen shots
12. Now, Inserting a row into source tableINSERT [dbo].[Emp] ([EMPLOYEE_ID], [FIRST_NAME], [LAST_NAME], [DEPARTMENT], [COUNTRY], [COMPANYNAME])
VALUES (N'EMP_009', N'Deep', N'Jai', N'DWH', N'USA', N'Microsoft')
13. Execute the package. It will insert the 1 row into dimension tables. Refer the below screen shots
14. Now, Updating the company name for EmployeeID 'EMP_003'
UPDATE [Emp] SET COMPANYNAME='Test Ltd' WHERE EMPLOYEE_ID='EMP_003'
15. The company is Changing Attribute. So it will overwrite existing values.
16. Now, Updating the Department name for EmployeeID 'EMP_003'UPDATE [Emp] SET DEPARTMENT='Test Dept' WHERE EMPLOYEE_ID='EMP_003'
17. The Department is Historical Attribute. So it will insert a new record and maintain the old record with Expiration Date as history.
18. We can capture the expired values as shown in below image
Conclusion
In this article I talked about Slowly Changing Dimension Transformation, which is available out of the box in SSIS toolkit and can be used easily and quickly configured for managing smaller slowly changing dimension.