Defensive ETL

28
Defensive ETL Tim Mitchell Artis Consulting The World’s Largest Community of SQL Server Professionals

description

The World’s Largest Community of SQL Server Professionals. Defensive ETL. Tim Mitchell Artis Consulting. Session Objectives. What is Defensive ETL? Review threats and challenges Discuss countermeasures in SSIS. Tim Mitchell. Business Intelligence Consultant – Artis Consulting, Dallas TX - PowerPoint PPT Presentation

Transcript of Defensive ETL

Page 1: Defensive ETL

Defensive ETL

Tim MitchellArtis Consulting

The World’s Largest Communityof SQL Server Professionals

Page 2: Defensive ETL

Session Objectives

• What is Defensive ETL?• Review threats and challenges• Discuss countermeasures in SSIS

The World’s Largest Community of SQL Server Professionals

Page 3: Defensive ETL

Tim Mitchell

• Business Intelligence Consultant – Artis Consulting, Dallas TX

• Contributing author, MVP Deep Dives 2• Coauthor, SSIS Design Patterns• SQL Server MVP• TimMitchell.net | Twitter.com/Tim_Mitchell

The World’s Largest Community of SQL Server Professionals

Page 4: Defensive ETL

Housekeeping

• Questions• Surveys

The World’s Largest Community of SQL Server Professionals

Page 5: Defensive ETL

Common Data Challenges

• Accidental– Data corruption– Data loss– Transactional inconsistency

• Malicious– Data theft– SQL injection

The World’s Largest Community of SQL Server Professionals

Page 6: Defensive ETL

Common Data Challenges

• System– Hardware/software failures unrelated to the

operation of the ETL– Unexpected downtime– Resource contention

The World’s Largest Community of SQL Server Professionals

Page 7: Defensive ETL

ETL Challenges

• Data sources/destination challenges– Inconsistent types– Inconsistent lengths– Unreliable connections– Just plain bad data

The World’s Largest Community of SQL Server Professionals

Page 8: Defensive ETL

ETL Challenges

• Transformation challenges– Business logic errors– Generated duplicates– “Lost” data– Time allotment overrun

The World’s Largest Community of SQL Server Professionals

Page 9: Defensive ETL

What Is Defensive ETL?

Two universal truths:• I’m doing it right.• Everybody else is crazy.

The World’s Largest Community of SQL Server Professionals

Page 10: Defensive ETL

What Is Defensive ETL?

Assumptions of a Defensive ETL professional:• Any input/output over which I do not have

complete control cannot be trusted.• Any process that relies on either hardware or

software cannot be trusted.• Murphy’s Law is inescapable

The World’s Largest Community of SQL Server Professionals

Page 11: Defensive ETL

Defensive ETL in SSIS

Multifaceted approach:• Package• Sources/destinations• Transformations• Error/event handling• Restartability• Validation

The World’s Largest Community of SQL Server Professionals

Page 12: Defensive ETL

Defensive ETL in SSIS

Package-Level Defenses

The World’s Largest Community of SQL Server Professionals

Page 13: Defensive ETL

Defensive ETL in SSIS

Transactions• Package,

container, and/or task

The World’s Largest Community of SQL Server Professionals

Page 14: Defensive ETL

Defensive ETL in SSIS

Checkpoints• Task or

container level restart

• Avoid use with transactions

The World’s Largest Community of SQL Server Professionals

Page 15: Defensive ETL

Defensive ETL in SSIS

Logging

The World’s Largest Community of SQL Server Professionals

Page 16: Defensive ETL

Defensive ETL in SSIS

ETL Framework• Structured approach• Shared logging and

error handling• Restartability

The World’s Largest Community of SQL Server Professionals

ETL

Cont

rolle

r

Page 17: Defensive ETL

Defensive ETL in SSIS

Event Handling

The World’s Largest Community of SQL Server Professionals

Page 18: Defensive ETL

Defensive ETL in SSIS

Control Flow Defenses

The World’s Largest Community of SQL Server Professionals

Page 19: Defensive ETL

Defensive ETL in SSIS

Precedence Constraints• Alternate paths

The World’s Largest Community of SQL Server Professionals

Page 20: Defensive ETL

Defensive ETL in SSIS

Script Task• Wait/pause• Retry

The World’s Largest Community of SQL Server Professionals

Page 21: Defensive ETL

Defensive ETL in SSIS

Data Flow Defenses

The World’s Largest Community of SQL Server Professionals

Page 22: Defensive ETL

Defensive ETL in SSIS

Connections• Metadata is

configurable for some connection types

The World’s Largest Community of SQL Server Professionals

Page 23: Defensive ETL

Defensive ETL in SSIS

Data Sources• Error or

Truncation• Column by

column

The World’s Largest Community of SQL Server Professionals

Page 24: Defensive ETL

Defensive ETL in SSIS

Inline Cleansing• Conditional Split• Derived Column

transform

The World’s Largest Community of SQL Server Professionals

Page 25: Defensive ETL

Defensive strategies

• Incoming data– Inline cleansing– Post-ETL cleansing– Third-party cleansing• DQS in SQL Server Denali

The World’s Largest Community of SQL Server Professionals

Page 26: Defensive ETL

Defensive strategies

• Incoming data– Data type/length compensation– Loops/retry

The World’s Largest Community of SQL Server Professionals

Page 27: Defensive ETL

Defensive strategies

• Transformations– Error/lookup outputs– Branching– Rowcount validations

The World’s Largest Community of SQL Server Professionals

Page 28: Defensive ETL

Thanks for Attending

Visit www.sqlservercentral.com for free SQL ServereBooks, articles, videos, blogs, news, and more.

Please Don’t Forget to Turn in Your Evaluations

The World’s Largest Communityof SQL Server Professionals