Post on 28-Dec-2021
Peter Aiken PhD
Data Driven Transformation & Innovation
Evolving your information Architecture
Copyright 2018 by Data Blueprint Slide # !1
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005
Peter Aiken, Ph.D.
• 33+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 10 books and dozens of articles
• Experienced w/ 500+ data management practices
• Multi-year immersions:
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …
PETER AIKEN WITH JUANITA BILLINGS
FOREWORD BY JOHN BOTTEGA
MONETIZING DATA MANAGEMENT
Unlocking the Value in Your Organization’s Most Important Asset.
The Case for the Chief Data Officer
Recasting the C-Suite to Leverage Your Most Valuable Asset
Peter Aiken and Michael Gorman
Data Driven Transformation & Innovation
• What a difference 8,000 years makes?
– Enron and lessons
– Architecture means different things to different professionals
– Definitions (includes architecture and engineering concepts)
• What is meant by use of a data architecture?
– Application of data assets towards organizational strategic objectives
– Assessed by the maturity of organizational data management practices
– Results in increased capabilities, dexterity, and self awareness
– Accomplished through use of data-centric development practices
• How does an organization achieve better use through its data architecture?
– Continuous re-development; the starting point isn't the beginning
– Data architecture components must typically be reengineered
– Using an iterative, incremental approach, typically focusing on one component at a time and following a formal transformation cycle
Enron
• Fortune named Enron "America's Most Innovative Company" for six consecutive years
• Suffered the largest Chapter 11 bankruptcy in history (up to that time)
• August 2001: $90.00 → $42.00 → $0.26
• Dynegy (several $ billion) attempted rescue
• Enron spends entire amount in 1 week
– Any person can write a check at Enron for
– Any amount of money for
– Any purchase at
– Any time ...
• Enron goes back to Dynegy for more $?
• Dynegy: What happened to the several $ billion I gave you last week?
• Enron:
http://en.wikipedia.org/wiki/Enron
CFO Necessary Prerequisites/Qualifications
• CPA
• CMA
• Masters of Accountancy
• Other recognized degrees/certifications
• These are necessary but insufficient prerequisites/qualifications
What is the world's oldest profession?
Augusta Ada King, Countess of Lovelace (1815-52)
• 8,000+ years
• formalize practices
• GAAP
It is appropriate that we (data professionals) acknowledge that we are currently not as mature a discipline as we would like to be, but it is not okay for our discipline to remain in its current state of maturity.
Confusion
• IT thinks data is a business problem
– "If they can connect to the server, then my job is done!"
• The business thinks IT is managing data adequately
– "Who else would be taking care of it?"
[Chart: Top Five CIO Concerns 2005-2011. Axis ticks: years 2005 to 2011; values 0.000 to 0.800. Categories: IT/Information Security/Privacy; Virtualization; Data center/IT efficiencies/Cloud; Social Media; Improving people/leadership; BI/analytics; Standardization/consolidation; IT workforce development; IT governance; Risk management; Mobile applications/technologies; Information Sharing; Implementing plans/initiatives/achieving results; Acquisition/project mgt; Process/system integration; Strategic planning]
Put simply, organizations:
• Have little idea what data they have
• Do not know where it is (and)
• Do not know what their knowledge workers do with it
What do we teach knowledge workers about data?
What percentage of them deal with it daily?
What do we teach IT professionals about data?
• 1 course
– How to build a new database
• What impressions do IT professionals get from this education?
– Data is a technical skill that is needed when developing new databases
• If we are migrating databases, we are not creating new databases and we don't need organizational data management knowledge, skills, and abilities (KSAs).
• If we are implementing a new software package, we are not creating a new database and therefore we do not need data management KSAs.
• If we are installing an enterprise resource package (ERP), we are not creating a new database and therefore we do not need data management KSAs.
Running Query
Optimized Query
• SQL Server
– 47,000,000,000,000 bytes
– Largest 34 billion records 3.5 TBs
• Informix
– 1,800,000,000 queries/day
– 65,000,000 tables / 517,000 databases
• Teradata
– 117 billion records
– 23 TBs for one table
• DB2
– 29,838,518,078 daily queries
Data Footprints
Repeat 100s, thousands, millions of times ...
Death by 1000 Cuts
Leverage is an Engineering Concept
• Using proper engineering techniques, a human can lift a bulk that weighs much more than the human
Data Leverage is an Engineering Concept
• Note: Reducing ROT increases data leverage
Organizational Data
Organizational Data Managers
Technologies
Process
People
Less Data ROT ->
Data Leverage is an Engineering Concept
• Permits organizations to better manage their data
– within the organization, and
– with organizational data exchange partners
– in support of the organizational mission
• Leverage obtained by implementation of data-centric
– Technologies
– Processes
– Human skill sets
– is increased by elimination of data ROT (redundant, obsolete, or trivial)
• The bigger the organization, the greater the potential leverage
• Treating data more asset-like simultaneously
1. lowers organizational IT costs and
2. increases organizational knowledge worker productivity
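As a rough illustration of what eliminating ROT might look like in practice, the sketch below labels records as redundant, obsolete, or trivial. The `Record` fields, the one-year idle threshold, and the "empty payload" test for triviality are all assumptions made for this example; the slides do not define how ROT is detected.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    key: str          # hypothetical business key
    payload: str      # hypothetical content of the record
    last_used: date   # hypothetical last-access date


def classify_rot(records, today, max_idle_days=365, min_payload_len=1):
    """Label each record 'redundant', 'obsolete', 'trivial', or 'keep'."""
    seen = set()
    labels = []
    for r in records:
        if (r.key, r.payload) in seen:
            labels.append("redundant")   # exact copy of an earlier record
        elif (today - r.last_used).days > max_idle_days:
            labels.append("obsolete")    # unused for longer than the threshold
        elif len(r.payload.strip()) < min_payload_len:
            labels.append("trivial")     # carries no usable content
        else:
            labels.append("keep")
        seen.add((r.key, r.payload))
    return labels
```

Records labeled anything other than `keep` are candidates for review, not automatic deletion; the thresholds would need to be set by the organization.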
Less ROT
Technologies
Process
People
Results
Increasing utility of organizational data
[Diagram labels: Individual IT Project (Requirements, Design, Implement), Requests, Results, and Organized, shared data, each shown three times]
Shared Data preceding completed software
• Over time the:
– Number of requests increases
– Utility of the results increases
– Data's contribution increases
– and is recognized!
Shared data structures cannot exist without programmatic development and evaluation
http://slummagazine.wordpress.com/2012/09/25/linden-labs-new-games-creatorverse-and-patterns/
The art and technique of designing and building, as distinguished from the skills associated with construction. The practice of architecture is employed to fulfill both practical and expressive requirements of civilized people and thus embraces both utilitarian and aesthetic ends. Although these two ends may be distinguished, they cannot be separated
– Encyclopedia Britannica definition dates to 1555 http://www.britannica.com/ - accessed 10/02
engineering
Architecture
Agreement isn't necessarily correctness!
Understanding
• Definition:
– 'Understanding an architecture'
– Documented and articulated as a digital blueprint illustrating the commonalities and interconnections among the architectural components
– Ideally the understanding is shared by systems and humans
4 Minute Architecture Lesson from Steve Jobs, Introducing iCloud
Architecture is about ...
• Things
– (components)
• The functions of the things
– (individually)
• How the things interact
– (as a system,
– towards a goal)
Three Architectural Concepts: Abstraction
• Visualization of Complexity
– A means of reducing complexity by handling different details at different levels
• Architects visualize and create complete schemes for constructing complex products:
– Buildings
– Mechanical wonders
– Extensive communications systems
– Complex computer systems
• How do we visualize information systems?
– Today's architecting is indeed driven by, and serves much the same purpose as, civil architecture – to create and build systems too complex to be treated by engineering analysis alone.
• Examples - referring to:
– A "house" rather than a combination of glass, wood, and nails
– The "Database Coordinator" instead of John Smith
– Logical versus physical models
Three Architectural Concepts: Decomposition
• Breaking the problem down into more manageable components
• Using decomposition the details don't go away completely;
• They are pushed to a different level so that you can think about them when you want to rather than all at the same time.
• Examples:
– Construction blueprint layers representing: electrical, plumbing, transit patterns
– Module hierarchy of functional designs
– Inheritance hierarchies in object-oriented design
– Nested data-structures
SystemProcess
Process2
Process1
Process3
Subprocess1.1
Subprocess1.2
Subprocess1.3
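The process hierarchy in the diagram can be written as a nested data structure, one of the decomposition examples listed on this slide. This is a minimal sketch; placing the three subprocesses under Process1 is an assumption based on their numbering, and `names_at_depth` is a hypothetical helper.

```python
# Nested dictionaries: each key is a component, each value holds its parts.
system = {
    "SystemProcess": {
        "Process1": {
            "Subprocess1.1": {},
            "Subprocess1.2": {},
            "Subprocess1.3": {},
        },
        "Process2": {},
        "Process3": {},
    }
}


def names_at_depth(tree, depth):
    """Return the component names visible at one level of detail (0 = top)."""
    if depth == 0:
        return list(tree)
    names = []
    for children in tree.values():
        names.extend(names_at_depth(children, depth - 1))
    return names
```

Asking for depth 1 shows only the three processes; the subprocess detail still exists at depth 2 but stays out of view until it is wanted.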
Three Architectural Concepts: Structure
• Framework for organizing and classifying components
• A fundamental and sometimes intangible notion covering the
– Recognition
– Observation
– Nature
– Stability of patterns
– Relationships of entities
• A structure defines what a system is made of. It is a configuration of items. It is a collection of inter-related components or services. [Wikipedia]
Architectural Benefits
IT-related
• Complexity Management
– Facilitate the scoping and coordination of programs and information systems projects
• Technical Resource oversight
– Identify and remove redundancy
• Knowledge management
– Manage and share knowledge modularity so it can be visualized across different levels
• IT visibility
– IT resources and systems are more aligned to business strategies and are better placed for responsiveness
Business-related
• Reduction in impact of staff turnover
– Capture knowledge from employees and consultants. Provide business solutions from third party organizations consistently so they can conform to the current model.
• Faster adaptability
– Facilitate knowledge acquisition necessary for changing systems and adopting new components.
• Operating procedures improvement
– Understand and model business processes. Review and reengineer processes.
• Decision making
– Represent an enterprise's layers and component's modularity to let the organization make business decisions in the context of the whole instead of a stand-alone part.
[Adapted from Shah & El Kourdi 2007]
• Analysis/model evaluation
• Risk evaluation
• Volume considerations
• Workload forecasting
• Tradeoff analysis
• ...
Architecture involves at least ...
Standard data
Data supply
Data literacy
Making a Better Data Governance Sandwich
This cannot happen without engineering and architecture!
Quality engineering/architecture work products do not happen accidentally!
Architecture Jargon
You cannot architect after implementation!
Good Architectural Foundation?
USS Midway & Pancakes
What is this?
• It is tall
• It has a clutch
• It was built in 1942
• It is still in regular use!
Architecture: here, whether you like it or not
deviantart.com
• All organizations have architectures
– Some are better understood and documented (and therefore more useful to the organization) than others
Typically Managed Architectures
• Process Architecture
– Arrangement of inputs -> transformations = value -> outputs
– Typical elements: Functions, activities, workflow, events, cycles, products, procedures
• Systems Architecture
– Applications, software components, interfaces, projects
• Business Architecture
– Goals, strategies, roles, organizational structure, location(s)
• Security Architecture
– Arrangement of security controls in relation to IT Architecture
• Technical Architecture/Tarchitecture
– Relation of software capabilities/technology stack
– Structure of the technology infrastructure of an enterprise, solution or system
– Typical elements: Networks, hardware, software platforms, standards/protocols
• Data/Information Architecture
– Arrangement of data assets supporting organizational strategy
– Typical elements: specifications expressed as entities, relationships, attributes, definitions, values, vocabularies
J.A. Zachman "A Framework for Information Systems Architecture " IBM Systems Journal: Volume 26, Number 3, Page 276 (1987)
Architectural Representations
Why Architecture?
• Would you build a house without an architecture sketch?
• Model is the sketch of the system to be built in a project.
• Would you like to have an estimate of how much your new house is going to cost?
• Your model gives you a very good idea of how demanding the implementation work is going to be!
• If you hired a set of constructors from all over the world to build your house, would you like them to have a common language?
• Model is the common language for the project team.
• Would you like to verify the proposals of the construction team before the work gets started?
• Models can be reviewed before thousands of hours of implementation work will be done.
• If it was a great house, would you like to build something rather similar again, in another place?
• It is possible to implement the system on various platforms using the same model.
• Would you drill into a wall of your house without a map of the plumbing and electric lines?
• Models document the system built in a project. This makes life easier for the support and maintenance!
Data Data
Data
Information
Fact Meaning
Request
A Model Defining 3 Important Concepts
[Built on definitions from Dan Appleton 1983]
Intelligence
Strategic Use
1. Each FACT combines with one or more MEANINGS.
2. Each specific FACT and MEANING combination is referred to as a DATUM.
3. An INFORMATION is one or more DATA that are returned in response to a specific REQUEST.
4. INFORMATION REUSE is enabled when one FACT is combined with more than one MEANING.
5. INTELLIGENCE is INFORMATION associated with its STRATEGIC USES.
6. DATA/INFORMATION must be formally arranged into an ARCHITECTURE.
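The definitions above can be sketched as simple types. This is an illustrative reading only: the class and function names, the sample values, and the idea that a REQUEST is a lookup by meaning are all assumptions, not part of the original model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    fact: str      # the raw value
    meaning: str   # what the value stands for


@dataclass(frozen=True)
class Intelligence:
    information: tuple   # the data returned for a request
    strategic_use: str   # why the organization wants that information


def respond(request, data):
    """INFORMATION: the data returned in response to a specific request.

    Assumption: a request is simply the meaning being asked for.
    """
    return [d for d in data if d.meaning == request]


# Hypothetical sample data: the FACT "42" is reused with two MEANINGS.
data = [
    Datum("42", "customer age in years"),
    Datum("42", "units in stock"),
    Datum("Richmond", "customer city"),
]

stock = respond("units in stock", data)
insight = Intelligence(tuple(stock), "decide when to reorder")
```

The second `Datum` shows information reuse: the same fact serves a different request because it is paired with a different meaning.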
Wisdom & knowledge are often used synonymously
Data
Data
Data Data
Data Architecture – Better Definition
• Common vocabulary expressing integrated requirements ensuring that data assets are stored, arranged, managed, and used in systems in support of organizational strategy
• A structure of data-based information assets supporting implementation of organizational strategy [Aiken 2010]
Copyright 2013 by Data Blueprint
An organization's data architecture ...
Software Package 1
Software Package 2
Software Package 3
Software Package 4
Software Package 5
Software Package 6
Data Architecture
... maps between and across software packages
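One way to picture a data architecture that maps between and across software packages is a shared vocabulary that every package's field names translate into. The sketch below assumes hypothetical field names for two of the packages and a hypothetical canonical vocabulary; none of these names come from the slides.

```python
# Hypothetical mappings: each package names the same concept differently.
PACKAGE_TO_CANONICAL = {
    "Software Package 1": {"cust_no": "customer_id", "nm": "customer_name"},
    "Software Package 2": {"CustomerKey": "customer_id", "FullName": "customer_name"},
}


def to_canonical(package, record):
    """Rename a package-specific record into the shared vocabulary."""
    mapping = PACKAGE_TO_CANONICAL[package]
    return {mapping[f]: v for f, v in record.items() if f in mapping}


def translate(source, target, record):
    """Move a record between packages by way of the shared vocabulary."""
    canonical = to_canonical(source, record)
    reverse = {c: f for f, c in PACKAGE_TO_CANONICAL[target].items()}
    return {reverse[c]: v for c, v in canonical.items() if c in reverse}
```

With a shared vocabulary, each package needs one mapping rather than a separate translation for every other package it exchanges data with.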
Traditional Engine
Prius Hybrid Engine
Existing System
Identify System Components & Their Arrangement
Understand individual component inputs, processes, and outputs, add component descriptions to metadata repository, repeat process for all components
Inputs: Processes: Outputs: (Business Rules)
Metadata Repository
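The step of recording each component's inputs, processes, and outputs in a metadata repository could look roughly like the sketch below. The class names, the `consumers_of` query, and the sample components are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentDescription:
    name: str
    inputs: list
    processes: list   # the business rules the component applies
    outputs: list


@dataclass
class MetadataRepository:
    components: dict = field(default_factory=dict)

    def add(self, description):
        """Record (or overwrite) the description of one component."""
        self.components[description.name] = description

    def consumers_of(self, item):
        """Names of components that take `item` as an input."""
        return sorted(n for n, c in self.components.items() if item in c.inputs)


# Hypothetical components, described one at a time and added to the repository.
repo = MetadataRepository()
repo.add(ComponentDescription("billing", ["orders"], ["apply tax"], ["invoices"]))
repo.add(ComponentDescription("shipping", ["invoices"], ["pick and pack"], ["shipments"]))
```

Once every component is described this way, questions such as "what depends on invoices?" become a repository lookup rather than a code-reading exercise.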
message bus
Develop message bus-based communication among components
message bus
Object-based component
Object-based component
Object-based component
Non-object-based component
Replace existing, understood components with more maintainable/better performing components
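A minimal in-process sketch of the idea: components talk only through a message bus, so an understood component can be swapped for a replacement without touching the others. The `MessageBus` class, the topic name, and the tax example are assumptions for illustration, not a description of any particular product.

```python
from collections import defaultdict


class MessageBus:
    """Tiny publish/subscribe bus; components never call each other directly."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def unsubscribe(self, topic, handler):
        self._subscribers[topic].remove(handler)

    def publish(self, topic, message):
        """Deliver the message to every subscriber and collect their replies."""
        return [handler(message) for handler in list(self._subscribers[topic])]


def legacy_tax(order):
    """Stand-in for an existing, non-object-based component."""
    return round(order["amount"] * 0.05, 2)


class TaxComponent:
    """Stand-in for an object-based replacement with the same behavior."""

    def __init__(self, rate):
        self.rate = rate

    def handle(self, order):
        return round(order["amount"] * self.rate, 2)


def replace_component(bus, topic, old_handler, new_handler):
    """Swap one handler for another; publishers are unaffected."""
    bus.unsubscribe(topic, old_handler)
    bus.subscribe(topic, new_handler)
```

Because publishers only know the topic, the swap in `replace_component` is invisible to them, which is what makes component-by-component replacement practical.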
System 2
System 3
System 4
System 5
System 6
System 1
Existing
Information Architecture Simplification
System 2
System 3
System 4
System 5
System 6
System 1
Existing
New
Transformations
Data Store
Generated Programs
System-to-System Program Transformation Knowledge
Transformations
Transformations
Transformations
Data Architecture Simplification
System 2
System 3
System 6
System 1
Existing
New
Transformations
Data Store
Generated Programs
System-to-System Program Transformation Knowledge
Transformations
Transformations
Transformations
Architecture Simplification
How are components expressed as architectures?
• Details are organized into larger components
• Larger components are organized into models
• Models are organized into architectures
[Diagram labels: components A, B, C, D shown in several arrangements]
[Diagram: Programs A through I grouped into Application domains 1, 2, and 3, with Data stores; callouts mark the "Focus of a software architecture engineering effort" and the "Focus of a database architecture engineering effort"]
Data Architecture Focus has Greater Potential Business Value
• Broader focus than either software architecture or database architecture
• Analysis scope is on the system wide use of data
• Problems caused by data exchange or interface problems
• Architectural goals more strategic than operational
As Is Information Requirements Assets
As Is Data Design Assets
As Is Data Implementation Assets
Existing
New
Data Architecture Component Reengineering
Reverse Engineering
Forward engineering
To Be Data Implementation Assets
To Be Design Assets
To Be Requirements Assets
Metadata
Archeology-based Transformations Solve a Puzzle
• Primary sources of guidance:
– The edge-pieces are easy to identify
– Distinct physical piece features exist, such as colors, patterns, pictures, etc.
• Steps for solving:
– Physically segregate all identified edge pieces (not always present in existing environment.)
– Create puzzle framework - connecting edge pieces using the puzzle picture
– Within frame, physically group remaining pieces by distinct physical features
– Solve a smaller section of the puzzle containing just a portion of the picture that is focused on similar physical features, such as a ball or a puppy in the picture. This is an effective approach because:
• The focus is on a common domain: one distinct aspect of the entire picture
• The analysis covers a smaller number of puzzle pieces, so the effort is proportionately smaller than attempting to solve the overall puzzle at once.
– As the components are assembled, combine them to solve the complete puzzle.
Design Patterns
• Why are the restrooms generally in the same place in each building?
• What about the electrical wiring?
• HVAC, plumbing, floor plans? ...
• Architecture design patterns (spoke and hub, hub of hubs, warehouse, cloud, MDM, portals, ...)
Each architectural analysis has a purpose
Improving Data Quality during System Migration
• Challenge
– Millions of NSN/SKUs maintained in a catalog
– Key and other data stored in clear text/comment fields
– Original suggestion was manual approach to text extraction
– Left the data structuring problem unsolved
• Solution
– Proprietary, improvable text extraction process
– Converted non-tabular data into tabular data
– Saved a minimum of $5 million
– Literally person centuries of work
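The extraction process used on the project was proprietary and is not described here. As a generic stand-in, the sketch below shows one common way to turn free-text comment fields into tabular columns with a regular expression; the `key: value; key = value` comment layout and the sample text are assumptions made for the example.

```python
import re

# Assumed layout: "key: value" or "key = value" pairs separated by semicolons.
PAIR = re.compile(r"(?P<key>[A-Za-z ]+?)\s*[:=]\s*(?P<value>[^;]+)")


def extract_fields(comment):
    """Turn one free-text comment into a dict of column name -> value."""
    return {
        m["key"].strip().lower(): m["value"].strip()
        for m in PAIR.finditer(comment)
    }


def to_rows(comments, columns):
    """Build table rows (lists) from comments; missing values become None."""
    rows = []
    for comment in comments:
        fields = extract_fields(comment)
        rows.append([fields.get(c) for c in columns])
    return rows
```

Comments that yield few or no fields can be routed to manual review, which keeps the automated pass improvable over successive runs.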
Week #   Unmatched Items (% Total)   Ignorable Items (% Total)   Items Matched (% Total)
1        31.47%                      1.34%                       N/A
2        21.22%                      6.97%                       N/A
3        20.66%                      7.49%                       N/A
4        32.48%                      11.99%                      55.53%
…        …                           …                           …
14       9.02%                       22.62%                      68.36%
15       9.06%                       22.62%                      68.33%
16       9.53%                       22.62%                      67.85%
17       9.5%                        22.62%                      67.88%
18       7.46%                       22.62%                      69.92%
Determining Diminishing Returns
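The table's later weeks can be checked for diminishing returns with a simple week-over-week comparison. The sketch below uses the "Items Matched" figures for weeks 14 to 18 from the table; the one-point threshold and three-week window are assumptions chosen for illustration, not values from the project.

```python
# "Items Matched (% Total)" for weeks 14-18, copied from the table.
matched = {14: 68.36, 15: 68.33, 16: 67.85, 17: 67.88, 18: 69.92}


def weekly_gain(series):
    """Change in matched percentage from each week to the next."""
    weeks = sorted(series)
    return {b: round(series[b] - series[a], 2) for a, b in zip(weeks, weeks[1:])}


def has_plateaued(series, threshold=1.0, window=3):
    """True when each of the last `window` weekly gains is below `threshold`."""
    gains = list(weekly_gain(series).values())
    return len(gains) >= window and all(g < threshold for g in gains[-window:])
```

Run on weeks 14 to 17 alone, the check reports a plateau; the week 18 gain of about two points breaks it, which is the kind of signal a team would weigh before stopping further cleansing.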
Before
After
Time needed to review all NSNs once over the life of the project:
  NSNs                                             2,000,000
  Average time to review & cleanse (in minutes)    5
  Total Time (in minutes)                          10,000,000
Time available per resource over a one year period of time:
  Work weeks in a year                             48
  Work days in a week                              5
  Work hours in a day                              7.5
  Work minutes in a day                            450
  Total Work minutes/year                          108,000
Person years required to cleanse each NSN once prior to migration:
  Minutes needed                                   10,000,000
  Minutes available person/year                    108,000
  Total Person-Years                               92.6
Resource Cost to cleanse NSN's prior to migration:
  Avg Salary for SME year (not including overhead)                   $60,000.00
  Projected Years Required to Cleanse/Total DLA Person Year Saved    93
  Total Cost to Cleanse/Total DLA Savings to Cleanse NSN's:          $5.5 million
Quantitative Benefits
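The arithmetic behind these figures can be reproduced directly. The sketch below uses only the inputs stated in the table; the variable names are mine.

```python
# Inputs as stated on the slide.
NSN_COUNT = 2_000_000
MINUTES_PER_NSN = 5
WORK_WEEKS_PER_YEAR = 48
WORK_DAYS_PER_WEEK = 5
WORK_HOURS_PER_DAY = 7.5
AVG_SME_SALARY = 60_000  # dollars per year, not including overhead

# Total effort to review every NSN once.
minutes_needed = NSN_COUNT * MINUTES_PER_NSN

# Capacity of one person over one year.
minutes_per_day = WORK_HOURS_PER_DAY * 60
minutes_per_person_year = WORK_WEEKS_PER_YEAR * WORK_DAYS_PER_WEEK * minutes_per_day

# Effort and cost of the manual approach.
person_years = minutes_needed / minutes_per_person_year
total_cost = person_years * AVG_SME_SALARY

print(f"{minutes_needed:,} minutes needed")
print(f"{minutes_per_person_year:,.0f} minutes available per person-year")
print(f"{person_years:.1f} person-years")
print(f"${total_cost:,.0f} to cleanse manually")
```

The unrounded cost comes to roughly $5.56 million, which the slide reports as $5.5 million; the 93 person-years is the 92.6 figure rounded up.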
10124 W. Broad Street, Suite C Glen Allen, Virginia 23060 804.521.4056