SQL Server ETL-framework
Generate and execute SSIS packages
Jeroen Nijs
Senior DWH/BI Consultant
Eindhoven, The Netherlands
[email protected]
Requirements
• Generate SSIS packages (fixed patterns)
• Metadata
• Project Deployment Model
• Use SQL Server Agent to execute SSIS packages
• Control
• Logging
• Restart possibility
• Simple
• Understandable
Improvements
• Controlled parallel execution of SSIS packages
• Dependencies between SSIS packages
• Possibility to set execution order based on runtime
• Master package in Control domain
• Use of environment variables
Domains
• Logging
• SSIS Packages
• Control
SSIS packages
Types:
• Source to Staging area
• Staging area to Historical data layer (Persistent Staging Area)
• Historical data layer to Dimensions
• Historical data layer and Dimensions to Facts
• Star model to dedicated star model
SSIS packages
Metadata
• Source table / File
• Staging table
  • CreateStagingTable.sql
• MD5
  • MD5Hash_Formula.sql
• Historical data layer table
  • CreateHistoricTable.sql
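The MD5 item suggests that a hash per row is used to detect changes between the staging area and the historical data layer. The contents of MD5Hash_Formula.sql are not shown in the slides, so the following is only a minimal Python sketch of the general idea. The function names, the "|" delimiter and the "<NULL>" marker are assumptions made for illustration.

```python
import hashlib


def row_md5(row, columns, delimiter="|"):
    """Return an MD5 hex digest over the selected columns of one row.

    Hypothetical helper: the real formula lives in MD5Hash_Formula.sql,
    which is not part of the slides.
    """
    parts = []
    for col in columns:
        value = row.get(col)
        # Mark NULLs explicitly so that None and "" produce different hashes.
        parts.append("<NULL>" if value is None else str(value))
    return hashlib.md5(delimiter.join(parts).encode("utf-8")).hexdigest()


def changed_keys(staging_rows, historic_hashes, key, columns):
    """Return the keys whose hash is new or differs from the stored hash."""
    changed = []
    for row in staging_rows:
        digest = row_md5(row, columns)
        if historic_hashes.get(row[key]) != digest:
            changed.append(row[key])
    return changed
```

Comparing one hash per row avoids comparing every column separately; only rows returned by `changed_keys` would need a new version in the historical data layer.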
SSIS packages
Source to Staging
• SSIS package with DELTA-load
  • SRC2STG_MEDICAT_RECDEEL
• SSIS package without DELTA-load
  • SRC2STG_MEDICAT_MEDICIJN
• BIML Script
  • SRC2STG.biml
SSIS packages
Metadata
• SQL Server
• Database: Control
• Schema: Meta
SSIS packages
Staging to Historical Data Layer
• SSIS package
  • STG2HIS_MEDICAT_RECDEEL
• BIML Script
  • STG2HIS.biml
Logging
• What
• When
• Result
• Conditions
• Levels:
  • Job
  • Step
  • Details
• Number of records processed
• SQL Server
• Database: Control
• Schema: Log
Status  Description
E       Error
I       Info
N       Numbers
P       Parameter
R       Running
S       Successful
T       Table
W       Warning
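The status codes and the logging levels (Job, Step, Details) can be modelled as a small lookup. This is a hedged Python sketch, not the layout of the Log schema: the `LogEntry` fields and the `describe` helper are hypothetical, only the codes and level names come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Status(Enum):
    """One-letter status codes as listed in the status table."""
    ERROR = "E"
    INFO = "I"
    NUMBERS = "N"
    PARAMETER = "P"
    RUNNING = "R"
    SUCCESSFUL = "S"
    TABLE = "T"
    WARNING = "W"


@dataclass
class LogEntry:
    """Hypothetical log record; the real Log tables are not shown."""
    level: str          # "Job", "Step" or "Details"
    status: Status
    message: str
    logged_at: datetime


def describe(code):
    """Translate a one-letter status code into its description."""
    return Status(code).name.capitalize()
```

For example, `describe("S")` yields "Successful", and an unknown code raises a `ValueError`, which keeps invalid statuses out of the log.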
Control
• What
• Sequence
• Dependencies
• Conditions
• Levels:
  • Application
  • Package
  • Parameters
• SQL Server
• Database: Control
• Schema: Control
LoadStrategy
DELTA
DELTA_KEYS
FULL
KEYS
Control
Stored Procedures and Functions:
• Schema Control
• Add, Set, Delete and Get
• ControlMaintenance.sql
• LastStartDateOfSuccessfulTableLoad
• UpdateParameterTypeSelectDateFrom
• UpdateParameterTypeSelectDateFrom2 (dependent on other package)
Disable/Enable:
• Application
• ApplicationPackage
• PackageParameter
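The name LastStartDateOfSuccessfulTableLoad suggests a lookup of the most recent successful load, which a DELTA load could then use as its "select date from" parameter. The actual function is not shown in the slides; the Python sketch below only illustrates that assumed behaviour. The row layout (`table`, `status`, `start`) is hypothetical, and "S" is the Successful status code from the logging table.

```python
from datetime import datetime


def last_successful_start(log_rows, table_name):
    """Return the latest start time of a successful load, or None.

    Assumed behaviour only: log_rows is a list of dicts with the
    hypothetical keys "table", "status" and "start".
    """
    starts = [
        row["start"]
        for row in log_rows
        if row["table"] == table_name and row["status"] == "S"
    ]
    return max(starts) if starts else None
```

Returning `None` when no successful load exists lets the caller fall back to a full load instead of a delta.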
Control
Stored Procedures: Execute
• Control.GetEnvironmentReferenceID
• Control.GetEnabledApplicationPackages
• Control.ExecuteCatalogPackage
SQL Server Agent:
• Job
• Steps
• Parameters
Master Package
• Status of application
• Packages to execute
• Environment variable
• Parallel execution
• Number of parallel tasks (0..n)
• Dependencies between packages
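Controlled parallel execution with dependencies can be illustrated with a small scheduling sketch in Python. This is not the master package itself: `plan_batches` and its parameters are hypothetical, the meaning of 0 parallel tasks is not stated in the slides (so the sketch requires at least 1), and packages are grouped into batches for simplicity, whereas a real scheduler could start the next package as soon as a slot becomes free.

```python
def plan_batches(packages, depends_on, max_parallel):
    """Group packages into batches that can run in parallel.

    packages:     package names in their configured sequence
    depends_on:   dict mapping a package to the packages it must wait for
    max_parallel: maximum number of packages per batch (assumed >= 1)
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1 in this sketch")
    done = set()
    remaining = list(packages)
    batches = []
    while remaining:
        # A package is ready once all of its dependencies have finished.
        ready = [
            p for p in remaining
            if all(d in done for d in depends_on.get(p, ()))
        ]
        if not ready:
            raise ValueError(
                "circular or unsatisfiable dependency: " + ", ".join(remaining)
            )
        batch = ready[:max_parallel]
        batches.append(batch)
        done.update(batch)
        remaining = [p for p in remaining if p not in batch]
    return batches
```

With the package names from the earlier slides, the two SRC2STG packages could run side by side when `max_parallel` is 2, while STG2HIS_MEDICAT_RECDEEL waits for SRC2STG_MEDICAT_RECDEEL if that dependency is registered.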
Future improvements/extensions
• Generate staging tables
• Generate tables for Historical Data Layer
• Improve the generation of SSIS packages with BIML Script by using more metadata
• Documentation (including the sequence of execution)
• Reports: status Control domain, Logging
• Convert SQL into stored procedures (SP): determine the sequence of package execution based on runtime
• When the source is cleaned up, source records that no longer exist should not be regarded as deleted in the historical data layer.
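One possible reading of "sequence of package execution based on runtime" is to start the longest-running packages first, so that they do not hold up the end of a parallel run. The slides do not state the direction of the ordering, so the Python sketch below is an assumption; `order_by_runtime` and its input are hypothetical.

```python
def order_by_runtime(last_runtimes_seconds):
    """Return package names ordered by last runtime, longest first.

    last_runtimes_seconds: dict of package name -> last runtime in seconds.
    Ties are broken by name so that the order is deterministic.
    """
    return sorted(
        last_runtimes_seconds,
        key=lambda name: (-last_runtimes_seconds[name], name),
    )
```

The runtimes could come from the Logging domain, which already records when each package ran.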