Peter Aiken - irmac.ca Overview.pdf · • TQM • TQdM • TDQM • ISO 9000 And focus on...
© Copyright 12/9/07 by Data Blueprint - all rights reserved!3 - datablueprint.com
Measuring Data Management Practice Maturity: A Community's Self-Assessment
Peter Aiken
• Full time in information technology since 1981
• IT engineering research and project background
• University teaching experience since 1979
• Seven books and dozens of articles
• Research Areas
– reengineering, data reverse engineering, software requirements engineering, information engineering, human-computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
• Director
– George Mason University/Hypermedia Laboratory (1989-1993)
• DoD Computer Scientist
– Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
• Visiting Scientist
– Software Engineering Institute/Carnegie Mellon University (2001-2002)
• Published Papers
– Communications of the ACM, IBM Systems Journal, InformationWEEK, Information & Management, Information Resources Management Journal, Hypermedia, Information Systems Management, Journal of Computer Information Systems, and IEEE Computer & Software
• DAMA International Advisor/Board Member (http://dama.org)
– 2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
– 2005 DAMA Community Award
• Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
• Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
• Founding Director, Data Blueprint (1999)
Dogs New Clothes
Two Brilliant Einstein Quotes
• "The significant problems we face cannot be solved at the same level of thinking we were at when we created them."
• "Everything should be made as simple as possible, but no simpler."
– Albert Einstein
Misunderstanding Data Management
IT Project Failure Rates
Recent IT project failure rate statistics can be summarized as follows:
– Carr (1994)
• 16% of IT projects completed on time, within budget, with full functionality
– OASIG Study (1995)
• 7 out of 10 IT projects "fail" in some respect
– The Chaos Report (1995)
• 75% blew their schedules by 30% or more
• 31% of projects will be canceled before they ever get completed
• 53% of projects will cost over 189% of their original estimates
• 16% of projects are completed on time and on budget
– KPMG Canada Survey (1997)
• 61% of IT projects were deemed to have failed
– Conference Board Survey (2001)
• Only 1 in 3 large IT project customers were "very satisfied"
– Robbins-Gioia Survey (2001)
• 51% of respondents viewed their large IT implementation project as unsuccessful
– McDonald's Innovate (2002)
• Automate fast food network from fry temperature to # of burgers sold: $180M USD write-off
– Ford Everest (2004)
• Replacing internal purchasing systems: $200 million over budget
– FBI (2005)
• Blew $170M USD on suspected terrorist database: "start over from scratch"
http://www.it-cortex.com/stat_failure_rate.htm (accessed 9/14/02)
New York Times 1/22/05 pA31
Why Data Projects Fail by Joseph R. Hudicka
• Assessed 1200 migration projects!
– Surveyed only experienced migration specialists who have done at least four migration projects
• The median project costs over 10 times the amount planned!
• Biggest Challenges: Bad Data; Missing Data; Duplicate Data
• The survey did not consider projects that were cancelled largely due to data migration difficulties
• "… problems are encountered rather than discovered"
Joseph R. Hudicka "Why ETL and Data Migration Projects Fail" Oracle Developers Technical Users Group Journal June 2005 pp. 29-31
Predicting Engineering Problem Characteristics
Systems compared: Legacy System #1: Payroll; Legacy System #2: Personnel; New System

Legacy system (UniSys)
Platform: UniSys
OS: OS
1998 Age: 21
Data Structure: DMS (Network)
Physical Records: 4,950,000
Logical Records: 250,000
Relationships: 62
Entities: 57
Attributes: 1,478

Legacy system (Amdahl)
Platform: Amdahl
OS: MVS
1998 Age: 15
Data Structure: VSAM/virtual database tables
Physical Records: 780,000
Logical Records: 60,000
Relationships: 64
Entities: 4/350
Attributes: 683

New System
Platform: WinTel
OS: Win'95
1998 Age: new
Data Structure: Client/Server RDBMS
Characteristics    Logical    Physical
Records:           250,000    600,000
Relationships:     1,034      1,020
Entities:          1,600      2,706
Attributes:        15,000     7,073
"Extreme" Data Engineering
• 2 person months = 40 person days
• 2,000 attributes mapped onto 15,000
• 2,000/40 person days = 50 attributes per person day, or 50 attributes/8 hours = 6.25 attributes/hour
and
• 15,000/40 person days = 375 attributes per person day, or 375 attributes/8 hours = 46.875 attributes/hour
• Locate, identify, understand, map, transform, document, QA at a rate of
• 52 attributes every 60 minutes, or .86 attributes/minute!
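The arithmetic on this slide can be reproduced with a short sketch. All inputs are the slide's own figures; the function name and constants are illustrative only. Note that adding the two hourly rates gives 53.125 attributes per hour (about 0.89 per minute), slightly above the 52 per hour and .86 per minute quoted on the slide, which appear to be rounded.

```python
# Worked version of the slide's arithmetic. Inputs are taken from the slide;
# the names below are illustrative, not part of the original deck.
PERSON_DAYS = 40             # "2 person months = 40 person days"
HOURS_PER_DAY = 8
SOURCE_ATTRIBUTES = 2_000    # legacy attributes to be mapped
TARGET_ATTRIBUTES = 15_000   # attributes in the new system


def attributes_per_hour(attribute_count: int) -> float:
    """Attributes that must be handled per working hour."""
    return attribute_count / PERSON_DAYS / HOURS_PER_DAY


source_rate = attributes_per_hour(SOURCE_ATTRIBUTES)
target_rate = attributes_per_hour(TARGET_ATTRIBUTES)
combined_rate = source_rate + target_rate
per_minute = combined_rate / 60

print(f"source:   {source_rate} attributes/hour")
print(f"target:   {target_rate} attributes/hour")
print(f"combined: {combined_rate} attributes/hour, {per_minute:.2f} attributes/minute")
```

Either way the conclusion of the slide stands: roughly one attribute per minute has to be located, understood, mapped, documented, and checked.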
Data Integration/Exchange Challenges
• "Customer" typically has had different meanings to different parts of the organization:
– Accounting -> organization that buys products or services
– Service -> client
– Sales -> prospect
• Assigning the same mission to the DoD ‘lines of business’, “Secure the building”, elicits very different results from each ‘line of business’:
– Army: Posts guards at all entrances and ensures no unauthorized access
– Navy: Turns out all the lights, locks up, and leaves
– Marines: Sends in a company to clear the building room-by-room; forms perimeter defense around the building
– Air Force: Signs three year lease with option to buy
[Second example courtesy of Burt Parker]
Typical System Evolution
Payroll Application (3rd GL)
Payroll Data (database)
R&D Applications (researcher supported, no documentation)
R&D Data (raw)
Mfg. Applications (contractor supported)
Mfg. Data (home grown database)
Finance Application (3rd GL, batch system, no source)
Finance Data (indexed)
Marketing Application (4th GL, query facilities, no reporting, very large)
Marketing Data (external database)
Personnel App. (20 years old, un-normalized data)
Personnel Data (database)
Building from the Top
Student System Data Model
Proposed Data Model
[Figure: MCVH Information Systems, 8/27/99. Diagram of hospital information systems, including an OPEN HUB, with legend categories: Clinical Systems, Billing/Registration Systems, Financial Systems, Decision Support, Personnel Systems, Departmental Systems (Stand Alone), Planned Systems, Associated Physicians, External Agencies.]
Sample Conversation (Developing Constraints)
• I'd like to build a building.
• What kind of building - do you want to sleep in it? Eat in it? Work in it?
• I'd like to sleep in it.
• Oh, you want to build a house?
• Yes, I'd like a house.
• How large a house do you have in mind?
• Well, my lot size is 100 feet by 300 feet.
• Then you want a house about 50 feet by 100 feet.
• Yes, that's about right.
• How many bedrooms do you need?
• Well, I have two children, so I'd like three bedrooms ...
A Model Specifying Relationships Among Important Terms
[Built on definition by Dan Appleton 1983]
[Figure: diagram relating Fact, Meaning, Data, Request, Information, Use, and Intelligence]
1. Each FACT combines with one or more MEANINGS.
2. Each specific FACT and MEANING combination is referred to as a DATUM.
3. An INFORMATION is one or more DATA that are returned in response to a specific REQUEST.
4. INFORMATION REUSE is enabled when one FACT is combined with more than one MEANING.
5. INTELLIGENCE is INFORMATION associated with its USES.
Wisdom & knowledge are often used synonymously
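The five numbered rules above can be read as a small data model. The following is a minimal sketch under that reading; the class names mirror the slide's terms, while the `answer` lookup, the sample facts, and the sample use are hypothetical and exist only to exercise the rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """A specific FACT and MEANING combination (rule 2)."""
    fact: str
    meaning: str


@dataclass(frozen=True)
class Information:
    """One or more DATA returned in response to a specific REQUEST (rule 3)."""
    request: str
    data: tuple


@dataclass(frozen=True)
class Intelligence:
    """INFORMATION associated with its USES (rule 5)."""
    information: Information
    uses: tuple


def answer(request: str, store: list, meaning: str) -> Information:
    """Hypothetical lookup: return every stored datum carrying the given meaning."""
    matches = tuple(d for d in store if d.meaning == meaning)
    if not matches:
        raise LookupError(f"no data for meaning {meaning!r}")
    return Information(request=request, data=matches)


# Rule 4: the same fact combined with more than one meaning enables reuse.
store = [
    Datum(fact="42", meaning="employee age in years"),
    Datum(fact="42", meaning="retirement eligibility input"),
]
info = answer("How old is the employee?", store, "employee age in years")
intel = Intelligence(information=info, uses=("benefits planning",))
print(info.data[0].fact, "->", intel.uses[0])
```

The design point is that a fact is stored once and gains value each time a further meaning or use is attached to it.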
Expanding Scope
Years 1950-1970: Database design; Database operation
1970-1990: Data requirements analysis; Data modeling
1990-2000: Enterprise data management coordination; Enterprise data integration; Data stewardship; Data use
2000-: Data Quality, Data Security, Data Compliance, Mashups (more)
Change Requests
Avoiding Unnecessary Work Using Business Rule Metadata
Person Job Class
Employee Position
BR1) Zero, one, or more EMPLOYEES can be associated with one PERSON
BR2) Zero, one, or more EMPLOYEES can be associated with one JOB CLASS
BR3) Zero, one, or more EMPLOYEES can be associated with one POSITION
BR4) One or more POSITIONS can be associated with one JOB CLASS
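One way to see how such rules become usable metadata is to store them as data rather than prose. The sketch below is illustrative only: the `BusinessRule` record, the `rules_touching` impact lookup, and the `satisfies` check are assumptions made for this example, not something the slide specifies. The four rule entries are transcribed from BR1 to BR4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    rule_id: str
    child: str         # the "many" side of the association
    min_children: int  # 0 for "zero, one, or more"; 1 for "one or more"
    parent: str        # the "one" side of the association


RULES = [
    BusinessRule("BR1", "EMPLOYEE", 0, "PERSON"),
    BusinessRule("BR2", "EMPLOYEE", 0, "JOB CLASS"),
    BusinessRule("BR3", "EMPLOYEE", 0, "POSITION"),
    BusinessRule("BR4", "POSITION", 1, "JOB CLASS"),
]


def rules_touching(entity: str) -> list:
    """Rule ids that mention the entity, i.e. the ones to re-check if it changes."""
    return [r.rule_id for r in RULES if entity in (r.child, r.parent)]


def satisfies(rule: BusinessRule, child_count: int) -> bool:
    """True when a parent with child_count children meets the rule's minimum."""
    return child_count >= rule.min_children


print(rules_touching("JOB CLASS"))
print(satisfies(RULES[3], 0))
```

With the rules held this way, a change request against one entity can be scoped by a lookup instead of by re-reading every specification.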
Data Management
"Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities"
Aiken, P., Allen, M. D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment" IEEE Computer (research feature April 2007)
Metadata Engineering
[Figure: Reverse Engineering and Forward engineering paths linking Existing (As Is) and New (To Be) assets with Metadata. Asset types: Information Requirements Assets, Data Design Assets, Data Implementation Assets.]
O1 Recreate Data Implementation
O2 Recreate Data Design
O3 Recreate Requirements
O4 Reconstitute Data Design
O5 Reconstitute Requirements
O6 Redesign Data
O7 Redevelop Requirements
O8 Redesign Data
O9 Reimplement Data
O-1/3 reconstitute original metadata
O-4/5 improve the current metadata
O-6/9 improve system data capabilities based on the improved metadata
SEI CMM Capability Maturity Model Levels
One concept for process improvement, others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
And focus on understanding current processes and determining where improvements can be made.
Initial (1): Our DM practices are ad hoc
Repeatable (2): We have DM experience and have the ability to implement disciplined processes
Defined (3): We have experience that we have standardized so that all in the organization can follow it
Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
Optimizing (5): We have a process for improving our DM capabilities
"Self" Improving
Out of control
Inconsistent
Unpredictable
Unsustainable
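The five levels above form an ordered scale, which can be sketched as an enumeration. The level names and self-descriptions come from the slide; the `level_for` helper and its round-down rule are assumptions made for illustration, since the slide does not say how a fractional score maps to a level.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMIZING = 5


# Self-descriptions as worded on the slide.
DESCRIPTIONS = {
    MaturityLevel.INITIAL: "Our DM practices are ad hoc",
    MaturityLevel.REPEATABLE: "We have DM experience and have the ability "
                              "to implement disciplined processes",
    MaturityLevel.DEFINED: "We have experience that we have standardized "
                           "so that all in the organization can follow it",
    MaturityLevel.MANAGED: "We manage our DM processes so that the whole "
                           "organization can follow our standard DM guidance",
    MaturityLevel.OPTIMIZING: "We have a process for improving our DM capabilities",
}


def level_for(score: float) -> MaturityLevel:
    """Map a score in [1, 5] to the highest level fully reached.

    Rounding down is an assumption of this sketch, not a rule from the slide.
    """
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    return MaturityLevel(int(score))


level = level_for(2.9)
print(level.name, "-", DESCRIPTIONS[level])
```

Using an ordered enumeration keeps comparisons such as `level >= MaturityLevel.DEFINED` readable when summarizing results.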
Key Finding: Process Frameworks are not Created Equal
With the exception of CMM and ITIL, use of process-efficiency frameworks does not predict higher on-budget project delivery…
[Chart: Percentage of Projects on Budget, By Process Framework Adoption]
…while the same pattern generally holds true for on-time performance
[Chart: Percentage of Projects on Time, By Process Framework Adoption]
Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey Results, Washington D.C.: Corporate Executive Board 2006, p. 23.
Organizational DM Functions and their Inter-relationships
[Figure: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations, and Data Asset Use, linked by flows labeled Organizational Strategies, Goals, Integrated Models, Standard Data, Application Models & Designs, Business Data, Business Value, Guidance, Direction, Feedback, and Implementation]
Data Program Coordination: Defining, coordinating, resourcing, implementing, and monitoring organizational data program strategies, policies, plans, etc. as a coherent set of activities.
Organizational Data Integration: Identifying, modeling, coordinating, organizing, distributing, and architecting data shared across business areas or organizational boundaries.
Data Stewardship: Ensuring that specific individuals are assigned the responsibility for the maintenance of specific data as organizational assets, and that those individuals are provided the requisite knowledge, skills, and abilities to accomplish these goals in conjunction with other data stewards in the organization.
Data Development: Specifying and designing appropriately architected data assets that are engineered to be capable of supporting organizational needs.
Data Support Operations: Initiation, operation, tuning, maintenance, backup/recovery, archiving and disposal of data assets in support of organizational activities.
[Same diagram, with these annotations:]
• Leverage data in organizational activities
• Data management processes and infrastructure
• Combining multiple assets to produce extra value
• Organizational-entity subject area data integration
• Provide reliable access to data
• Achieve sharing of data within a business area
Data Program Coordination Individual Responses
Data management process and infrastructure
[Chart: scores on a 1-5 scale, Result 1 through Result 5, for Development guidance, Data Administration, Support systems, Asset recovery capability, and Development training]
Data Management Practices Assessment
[Chart: scores on a 0-5 scale for Development guidance, Data Administration, Support systems, Asset recovery capability, and Development training. Series: Nokia, Industry Competition, All Respondents. Annotations: Challenge (three times), Client, Result 1 through Result 5.]
Data Management Practices Measurement (DMPA)
• Defined industry standard
• Collaboration with CMU's Software Engineering Institute (SEI)
• Attempt to determine data management's "state of the practice"
[Figure: grid with rows Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations and columns Initial (I), Repeatable (II), Defined (III), Managed (IV), Optimizing (V)]
Focus: Guidance and Facilitation
Focus: Implementation and Access
Organizations Surveyed
Results from a survey of more than 200 organizations
– Public Companies
– State Government Agencies
– Federal Government
– International Organizations
The challenge ahead
[Chart: average scores on a 0.00-5.00 scale for items 1 through 28]
The chart represents the average scores presented on the previous slide - interesting that none have apparently reached level-3
After more than a decade …
Question: How many software practices (surveyed) are above level 1 on the CMM?
Answer: By far most organizations (95%) surveyed are producing software using informal processes
Question: How many organizations have demonstrated at least some proficiency according to the DM3? (i.e., scored above level 1)
Answer: One in ten organizations has scored above level 1 in the DM3 according to our surveys
Service Orient or Be Doomed!
• Service Orient or Be Doomed!
– How Service Orientation Will Change Your Business (Hardcover) by Jason Bloomberg & Ronald Schmelzer
– I'm not quite sure what "doom" awaits by not service orienting, other than remaining mired in archaic, calcified and siloed processes - which a lot of businesses do anyway, and still manage to stay afloat. But that's the topic for another posting.
• Reviewer
Services
Integration Possibilities
• User Interface
• Business Process
• Application
• Data
AV Component
• Well defined components
• Self-contained
• No interdependencies
Analogy derived from D. Barry "Web Services" Intelligent Enterprise 10/10/03 pp. 26-47 - wiring diagram from sunflowerbroadband.com
Contractor Implemented Wiring
Concise Notes on Software Engineering
– Published in 1979
– 93 pages including appendices & references
– Out of print
– $1.99 at half.com
• Principles of Information Hiding (p. 32-33)
– Conceal complex data structures whenever possible
– Allow only selected service modules to know about the concealed data structures
– Bind together modules that know about concealed data structures
– Package such modules along with the data itself
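The information-hiding principles listed above can be illustrated with a small service module. This is a minimal sketch, not taken from the book: the `CustomerDirectory` class, its method names, and the sample customer are hypothetical. The point shown is that callers use the service methods only and never learn that the concealed structure is a dictionary keyed by id.

```python
class CustomerDirectory:
    """Service module packaged together with the data it conceals."""

    def __init__(self) -> None:
        # Concealed structure: only methods of this class touch it directly.
        self._records = {}

    def add(self, customer_id: int, name: str) -> None:
        """Register a customer; duplicate ids are rejected."""
        if customer_id in self._records:
            raise ValueError(f"customer {customer_id} already exists")
        self._records[customer_id] = {"name": name}

    def name_of(self, customer_id: int) -> str:
        """Return the stored name; raises KeyError for an unknown id."""
        return self._records[customer_id]["name"]

    def count(self) -> int:
        """Number of customers currently stored."""
        return len(self._records)


directory = CustomerDirectory()
directory.add(7, "Acme Ltd")
print(directory.name_of(7), directory.count())
```

Because the storage layout is private to the class, it could later be swapped for a database table or a remote service without changing any caller, which is the property the service-orientation slides rely on.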
The basketball and golfball slide
How Does SOA Fit In Existing Architectures?
Data Quality Specific SOA Requirements
SOA & Data & ???
New DM Realities – Revised DM Goals
• Focus Short Term on Measurable Goals
• Implement Instead of Planning
• (Practically) Any Technology Can Help
• Identifying The Pareto Subsets for Analyses
• Practice "Good Enough" Data Modeling
• The "Enterprise Model" Is Not Required
• Engineer Measurable Data Quality Improvements
Contact Information:
Peter Aiken, Ph.D.
Department of Information Systems
School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office: +1.804.883.759
cell: +1.804.382.5957
e-mail: [email protected]
http://peteraiken.net