Big Data Analytics
Translational Bioinformatics (TBI)
Sushil K. Meher, MCA (NIT, RKL), MBA (Hospital Management), M.Phil (CS), (Ph.D (eHealth))
Computer Facility, ALL INDIA INSTITUTE OF MEDICAL SCIENCES
NEW DELHI
Big Data Analytics in Health Care
The Cycles of Innovation
Innovation in Health Industry
“Keeping Afloat in a Sea of 'Big Data'” (ITBusinessEdge, 9/6/11)
“Why big data is a big deal” (InfoWorld, 9/1/11)
“The challenge–and opportunity–of big data” (McKinsey Quarterly, 5/11)
“Getting a Handle on Big Data with Hadoop” (Businessweek, 9/7/11)
“Ten reasons why Big Data will change the travel industry” (Tnooz, 8/15/11)
“The promise of Big Data in Health Care” (Intelligent Utility, 8/28/11)
Big Data Buzz
Our Journey To The Cloud/Big Data
OLTP: Online Transaction Processing (DBMSs)
OLAP: Online Analytical Processing (Data Warehousing)
RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
So What is Big Data?
Big Data refers to datasets that grow so large that they are difficult to capture, store, manage, share, analyze and visualize with typical database software tools.
How much is Big?
It is not a single number but a set of parameters.
“Big Data Is Less About Size, And More About Freedom” (TechCrunch)
“Findings: ‘Big Data’ Is More Extreme Than Volume” (Gartner)
“Big Data! It’s Real, It’s Real-time, and It’s Already Changing Your World” (IDC)
“Total data: ‘bigger’ than big data” (451 Group)
THE ERA OF BIG DATA IS HERE
Big Data Analytics
The Path to Advanced Health Care
Big Data in Healthcare
VOLUME, VELOCITY, VARIETY, VERACITY
[Figure: data sources feeding Big Data: social, blog, smart meter, health]
• In 2011 alone, 1.8 zettabytes of data were created globally. To put this into perspective, this volume equates to 200 billion 2-hour HD movies, which one person would need 47 million years to watch in their entirety.
• U.S. health care data alone reached 150 exabytes in 2011. Five exabytes (one exabyte is 10^18 bytes) of data would contain all the words ever spoken by human beings on earth. At this rate, big data for U.S. health care will soon reach zettabyte (10^21 bytes) scale and yottabytes (10^24 bytes) not long after.
“Translating Health Care Through Big Data, Strategies for leveraging big data in the health care industry” - Institute for Health Technology Transformation
The register.co.uk
Data Measurement Units
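The scale comparisons quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where each step up is a factor of 1000; the variable names and the derived per-movie size are illustrative and do not come from the slides.

```python
# Decimal (SI) data units in bytes. Assumption: SI prefixes, not binary (KiB, MiB, ...).
UNITS = {
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

# Figures quoted in the slides.
global_2011_bytes = 18 * 10**20               # 1.8 zettabytes created in 2011
us_health_2011_bytes = 150 * UNITS["exabyte"]  # 150 exabytes of U.S. health care data
movies = 200 * 10**9                           # 200 billion HD movies
hours_per_movie = 2

# Size per movie implied by the comparison (derived here, not stated in the source).
implied_movie_size_gb = global_2011_bytes / movies / UNITS["gigabyte"]

# Years of continuous, non-stop viewing needed to watch every movie.
HOURS_PER_YEAR = 24 * 365
watch_years = movies * hours_per_movie / HOURS_PER_YEAR

# How many times U.S. health care data must grow to reach one zettabyte.
growth_to_zettabyte = UNITS["zettabyte"] / us_health_2011_bytes

print(f"Implied movie size: {implied_movie_size_gb:.1f} GB")
print(f"Continuous viewing time: {watch_years / 1e6:.1f} million years")
print(f"Growth factor to reach 1 ZB: {growth_to_zettabyte:.2f}x")
```

Under these assumptions the comparison implies about 9 GB per movie and roughly 46 million years of uninterrupted viewing, which is the same order of magnitude as the 47 million years quoted in the slide.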
The Model Has Changed… The Model of Generating/Consuming Data has Changed
Old Model: a few hospitals generate data; all others consume data
New Model: all of us generate data, and all of us consume data
Healthcare is Positioned to Gain from Big Data
Innovate With Big Data Analytics
Big Data Analytics Accelerate Health Care 2.0 for Evidence-based Care Provider
[Figure: quality of care (LOW to HIGH) as more data is leveraged]
TRADITIONAL DATA LEVERAGED: Legacy System; Database; BI Reporting; Treatment Pathways on Summary Data
BIG DATA LEVERAGED: Big Data Analytics; In-Database Analytics; Treatment Pathways on All the Data; Delivering 10 Years Of Data In Seconds
External data sources: International Results; Drug Interaction Predictions; Individual Patient History; Social & Economic Factors; Geographical Factors; NIH Historical Data
Associative Rule Mining and User Clustering Improves Pathways
External Data Sources Enable Personalized Medicine
USE CASE
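Association rule mining, named in the use case above, looks for items that appear together in records more often than expected. A minimal sketch follows, assuming a handful of toy patient records invented purely for illustration (they are not clinical data), and computing the two standard measures, support and confidence, for a single rule. The function names are hypothetical helpers, not part of any library.

```python
# Toy records: each set lists items observed for one hypothetical patient.
# Invented for illustration only; these are not real clinical observations.
records = [
    {"fluorouracil", "thrombocytopenia"},
    {"fluorouracil", "thrombocytopenia", "nausea"},
    {"fluorouracil"},
    {"tamoxifen", "nausea"},
    {"tamoxifen"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A).

    Assumes the antecedent occurs in at least one record.
    """
    both = set(antecedent) | set(consequent)
    return support(both, records) / support(antecedent, records)


# Rule under test: {fluorouracil} -> {thrombocytopenia}
rule_support = support({"fluorouracil", "thrombocytopenia"}, records)
rule_confidence = confidence({"fluorouracil"}, {"thrombocytopenia"}, records)

print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

A production system would enumerate candidate itemsets with an algorithm such as Apriori or FP-Growth rather than scoring one hand-picked rule; the sketch only shows what the scores mean.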
Big Data Key Drivers
Population Health
Patient Experience
Per Capita Cost
• New Delivery Models
• Meaningful Use
• ICD-10 / SNOMED-CT
• Better Data = Improved Outcomes
• Shift from volume-based care to value-based care
• Fraud Detection
• Cost Savings
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets

- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time
Who’s Generating Big Data in Health Care
Where Does the Data Come From?
Clinical and HIM
• Structured
  ─ EHR
  ─ HIS
• Unstructured
  ─ Image based – PACS and radiology, EKGs, monitor data
  ─ Insurance card, patient photo, consent forms, orders
  ─ Paper based patient information
• Semi-Structured
  ─ DNA, RNA, protein
  ─ Genomics

Administrative
• Human Resources
  – HR Management Systems
  – Documents such as new hire paperwork, employee records, credentialing, etc.
• Legal
  – Documents include contracts and agreements, correspondence, compliance
• Finance
  – Statements
• Business Office
  – Back Office

Supply Chain and Revenue Cycle
• Supply Chain
  – Materials Management
  – Documents such as requisitions, purchase orders, invoices, packing slips, receiving paperwork
• Revenue Cycle
  – Pre-registration
  – Denials Management
  – Documents include EOBs, correspondence
Definition of Translational Bioinformatics (TBI)
• Development of storage, analytic, and visualization methods.
Bergman, 2010
Our Aim
Personalizing Health & Care (PHC)
1. Better understanding health, ageing & disease
2. Effective health promotion, prediction, screening and disease prevention
3. Early diagnosis (detection)
4. Innovative treatments & technologies
5. Advancing active & healthy ageing
6. Integrated, sustainable, citizen-centered care
7. Improving health information, data exploitation & knowledge translation
Approach for 4P
Basic Biomedical Research
Clinical Knowledge & Research
Population Health
Personal Health
Public Health
Translational Research
Text mining, BioPatch, CAMA, DzMap, CKD, PWAS, Drug reposition
Reverse translational research
Data Interaction Model for Translational Bioinformatics Research
Patient Profile: Age, sex, allergy, weight, height, blood type, body temperature, …etc.
Diagnosis/Problem: Current and/or chronic dz, malignancy, pregnancy, …etc.
Procedures: Surgery, transfusion, endoscopy, angiogram, PTCA, rehabilitation, …etc.
Medication: Fluorouracil vs Theophylline, Doxorubicin vs Methotrexate, …etc.
Lab/Exam: CBC, D/C, LFT, hCG, PT, APTT, INR, …etc.
Gene
e.g. Fluorouracil vs thrombocytopenia
e.g. Warfarin vs colonoscopy
e.g. Tamoxifen vs nausea
e.g. Valproic acid vs pregnancy
YC (Jack) Li et al., 2004
The Galaxy of Disease Map
Moving Forward
• Formulate new questions and become much more agile
• Make evidence based decisions
• Democratize your data
• Visualize invisible knowledge

• Big data is here – now
• Data breaches
• Intrusion of privacy
• Unfair use of Data
Big Data in Health Care
Big Data Technology
What Technology Do We Have For Big Data?
Hadoop NoSQL Databases Analytic Databases
Hadoop
• Low cost, reliable scale-out architecture
• Distributed computing
• Proven success in Fortune 500 companies
• Exploding interest
NoSQL Databases
• Huge horizontal scaling and high availability
• Highly optimized for retrieval and appending
• Types
  • Document stores
  • Key Value stores
  • Graph databases
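To make the three NoSQL types concrete, the sketch below models each one with plain Python data structures. This is an analogy only and not the API of any particular database; every key, field name, and record is hypothetical.

```python
# Key-value store: an opaque value looked up by a single key.
kv_store = {"patient:1001": b"serialized-record-bytes"}

# Document store: self-describing nested records whose fields may differ per document.
documents = [
    {"_id": 1001, "name": "A", "labs": [{"test": "CBC", "value": 4.5}]},
    {"_id": 1002, "name": "B", "allergies": ["penicillin"]},
]


def find(docs, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Graph database: nodes plus directed edges, here as adjacency lists.
graph = {
    "patient:1001": ["drug:warfarin"],
    "drug:warfarin": ["procedure:colonoscopy"],
    "procedure:colonoscopy": [],
}


def reachable(graph, start):
    """Collect every node reachable from start by following edges (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


print(kv_store["patient:1001"])
print(find(documents, _id=1002))
print(sorted(reachable(graph, "patient:1001")))
```

The key-value lookup needs the exact key, the document query can filter on fields, and the graph traversal follows relationships; those access patterns are the main practical difference between the three types.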
Analytic RDBMS
• Optimized for bulk-load and fast aggregate query workloads
• Types
  • Column-oriented
  • MPP
  • In-memory
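Why a column-oriented layout suits aggregate queries can be shown with a small sketch. The table contents and column names below are made up for illustration; real analytic databases add compression, vectorized execution, and parallelism on top of this basic idea.

```python
from collections import defaultdict

# Row-oriented layout: the fields of each record are stored together.
rows = [
    {"patient_id": 1, "ward": "A", "cost": 120.0},
    {"patient_id": 2, "ward": "B", "cost": 80.0},
    {"patient_id": 3, "ward": "A", "cost": 200.0},
]

# Column-oriented layout: the values of each column are stored together.
columns = {
    "patient_id": [r["patient_id"] for r in rows],
    "ward": [r["ward"] for r in rows],
    "cost": [r["cost"] for r in rows],
}

# With rows, an aggregate has to visit every record to pick out one field.
total_from_rows = sum(r["cost"] for r in rows)

# With columns, the same aggregate scans a single contiguous list.
total_from_columns = sum(columns["cost"])

# A grouped aggregate needs only the two columns involved.
cost_by_ward = defaultdict(float)
for ward, cost in zip(columns["ward"], columns["cost"]):
    cost_by_ward[ward] += cost

print(total_from_rows, total_from_columns, dict(cost_by_ward))
```

Both layouts give the same answer; the difference is how much data must be read to get it, which matters once a table has many columns and billions of rows.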
Major Hadoop Utilities
Apache Hive: SQL-like language and metadata repository
Apache Pig: High-level language for expressing data analysis programs
Apache HBase: The Hadoop database. Random, real-time read/write access
Sqoop: Integrating Hadoop with RDBMS
Oozie: Server-based workflow engine for Hadoop activities
Hue: Browser-based desktop interface for interacting with Hadoop
Flume: Distributed service for collecting and aggregating log and event data
Apache Whirr: Library for running Hadoop in the cloud
Apache Zookeeper: Highly reliable distributed coordination service
Thank you for your attention.
Q/A