Defensive ETL
description
Transcript of Defensive ETL
Defensive ETL
Tim MitchellArtis Consulting
The World’s Largest Communityof SQL Server Professionals
Session Objectives
• What is Defensive ETL?• Review threats and challenges• Discuss countermeasures in SSIS
The World’s Largest Community of SQL Server Professionals
Tim Mitchell
• Business Intelligence Consultant – Artis Consulting, Dallas TX
• Contributing author, MVP Deep Dives 2• Coauthor, SSIS Design Patterns• SQL Server MVP• TimMitchell.net | Twitter.com/Tim_Mitchell
The World’s Largest Community of SQL Server Professionals
Housekeeping
• Questions• Surveys
The World’s Largest Community of SQL Server Professionals
Common Data Challenges
• Accidental– Data corruption– Data loss– Transactional inconsistency
• Malicious– Data theft– SQL injection
The World’s Largest Community of SQL Server Professionals
Common Data Challenges
• System– Hardware/software failures unrelated to the
operation of the ETL– Unexpected downtime– Resource contention
The World’s Largest Community of SQL Server Professionals
ETL Challenges
• Data sources/destination challenges– Inconsistent types– Inconsistent lengths– Unreliable connections– Just plain bad data
The World’s Largest Community of SQL Server Professionals
ETL Challenges
• Transformation challenges– Business logic errors– Generated duplicates– “Lost” data– Time allotment overrun
The World’s Largest Community of SQL Server Professionals
What Is Defensive ETL?
Two universal truths:• I’m doing it right.• Everybody else is crazy.
The World’s Largest Community of SQL Server Professionals
What Is Defensive ETL?
Assumptions of a Defensive ETL professional:• Any input/output over which I do not have
complete control cannot be trusted.• Any process that relies on either hardware or
software cannot be trusted.• Murphy’s Law is inescapable
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Multifaceted approach:• Package• Sources/destinations• Transformations• Error/event handling• Restartability• Validation
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Package-Level Defenses
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Transactions• Package,
container, and/or task
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Checkpoints• Task or
container level restart
• Avoid use with transactions
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Logging
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
ETL Framework• Structured approach• Shared logging and
error handling• Restartability
The World’s Largest Community of SQL Server Professionals
ETL
Cont
rolle
r
Defensive ETL in SSIS
Event Handling
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Control Flow Defenses
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Precedence Constraints• Alternate paths
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Script Task• Wait/pause• Retry
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Data Flow Defenses
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Connections• Metadata is
configurable for some connection types
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Data Sources• Error or
Truncation• Column by
column
The World’s Largest Community of SQL Server Professionals
Defensive ETL in SSIS
Inline Cleansing• Conditional Split• Derived Column
transform
The World’s Largest Community of SQL Server Professionals
Defensive strategies
• Incoming data– Inline cleansing– Post-ETL cleansing– Third-party cleansing• DQS in SQL Server Denali
The World’s Largest Community of SQL Server Professionals
Defensive strategies
• Incoming data– Data type/length compensation– Loops/retry
The World’s Largest Community of SQL Server Professionals
Defensive strategies
• Transformations– Error/lookup outputs– Branching– Rowcount validations
The World’s Largest Community of SQL Server Professionals
Thanks for Attending
Visit www.sqlservercentral.com for free SQL ServereBooks, articles, videos, blogs, news, and more.
Please Don’t Forget to Turn in Your Evaluations
The World’s Largest Communityof SQL Server Professionals