A brief history of data processing
![Page 1: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/1.jpg)
A Brief History of Data Processing
@garyorenstein
Deckset Theme - Next White
(c) Gary Orenstein 1
![Page 2: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/2.jpg)
In The Beginning
![Page 3: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/3.jpg)
Computers and Data
![Page 4: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/4.jpg)
Accounting Transactions
![Page 5: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/5.jpg)
Financial Transactions
![Page 6: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/6.jpg)
Inventory Management
![Page 7: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/7.jpg)
Human Resources
![Page 8: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/8.jpg)
Enter the Database
![Page 9: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/9.jpg)
Enter the Database
Put stuff in. Take stuff out.
Reliably. Quickly.
![Page 10: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/10.jpg)
Now Let Me Ask A Question
![Page 11: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/11.jpg)
Just A Moment Please
![Page 12: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/12.jpg)
Let's Build A Bigger, Faster Database
![Page 13: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/13.jpg)
maybe this is not as easy as we thought
![Page 14: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/14.jpg)
We Need A Data Warehouse
![Page 15: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/15.jpg)
Database, Meet Data Warehouse
![Page 16: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/16.jpg)
Welcome to the ETL Gap
![Page 17: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/17.jpg)
And Never The Two Shall Meet
![Page 18: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/18.jpg)
Four Ways Your DBMS Is Holding You Back [1]
• ETL (Extract, Transform, Load)
• Analytic Latency
• Synchronization
• Copies of data
[1] Source: Gartner, "Hybrid Transaction/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation," published 28 January 2014
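
The first two items can be illustrated with a toy model. This is a minimal sketch, not anything from the talk: the two lists, the `etl_batch` function, and the order fields are all hypothetical stand-ins for an operational database, a warehouse copy, and a batch job between them.

```python
# Hypothetical operational database and warehouse, modeled as plain lists.
oltp_orders = []
warehouse_orders = []


def etl_batch():
    """Copy rows the warehouse has not seen yet, converting dollars to cents."""
    new_rows = oltp_orders[len(warehouse_orders):]            # extract
    transformed = [
        {"id": row["id"], "cents": round(row["amount"] * 100)}
        for row in new_rows
    ]                                                         # transform
    warehouse_orders.extend(transformed)                      # load


def warehouse_revenue_cents():
    """Analytic query that only ever sees the warehouse copy."""
    return sum(row["cents"] for row in warehouse_orders)


oltp_orders.append({"id": 1, "amount": 9.99})
etl_batch()
oltp_orders.append({"id": 2, "amount": 5.00})

stale = warehouse_revenue_cents()   # order 2 is not visible until the next batch
etl_batch()
fresh = warehouse_revenue_cents()
```

Between the two batch runs the analytic answer lags the transactional data, and the same rows exist in two places, which is the latency and copy problem the slide lists.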
![Page 19: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/19.jpg)
Why Did We Separate?
• Performance
• Performance
• Performance
• Governance
![Page 20: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/20.jpg)
Primary Performance Impediment
Disk Drives
![Page 21: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/21.jpg)
Scale-up for Databases and Data Warehouses
![Page 22: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/22.jpg)
Complex and Costly
![Page 23: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/23.jpg)
Quest for Scale-out
![Page 24: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/24.jpg)
Paths Across Databases and Data Warehouses
![Page 25: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/25.jpg)
NoSQL Wave And Hadoop Ecosystem
![Page 26: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/26.jpg)
NoSQL Theory
• Scale
• Performance
• Eventual Consistency
• No need for SQL
![Page 27: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/27.jpg)
NoSQL Reality
• Scale and performance?
• Stick to one thing at a time
• Consistency?
• Just wait
• Analytics?
• Thank goodness for SQL on NoSQL
![Page 28: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/28.jpg)
Hadoop Theory
• Just store it
• Who needs a schema
• Let's learn MapReduce
• Compute on disk, no problem
![Page 29: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/29.jpg)
Hadoop Reality
• Data lakes are deep and dark
• Unclear what is going on
• Hard to fill shoes
• MapReduce
• Hadoop ecosystem engineering
• Occasionally feels like the data strategy is upside down
![Page 30: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/30.jpg)
What is the one thing never intended for NoSQL and Hadoop?
SQL.
![Page 31: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/31.jpg)
Hadoop (HDFS) is a filesystem, not a database
![Page 32: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/32.jpg)
NoSQL is, well...
Just part of a complete solution
![Page 33: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/33.jpg)
Why did we pursue a split across data warehouse, NoSQL, and HDFS?
Performance, performance, performance, governance
![Page 34: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/34.jpg)
Idea
![Page 35: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/35.jpg)
Let's Use Memory
![Page 36: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/36.jpg)
Let's Use Memory
And understandably architect for persistence
![Page 37: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/37.jpg)
What About Flash
![Page 38: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/38.jpg)
The Right Solution Spans Memory, Flash, and Disk
![Page 39: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/39.jpg)
New Tech: Distributed Systems
![Page 40: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/40.jpg)
Old Tech: Relational Databases
Proudly serving SQL since 1970
![Page 41: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/41.jpg)
Do we really have to split databases and data warehouses?
![Page 42: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/42.jpg)
Merge
with in-memory solutions
![Page 43: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/43.jpg)
Do I need to worry about high costs?
![Page 44: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/44.jpg)
Distribute
scale across low-cost machines or cloud instances
![Page 45: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/45.jpg)
Do I need to give up SQL?
![Page 46: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/46.jpg)
Orchestrate
A Multi-Model Solution
• Full transactional SQL
• Inserts, updates, deletes
• JSON
• Geospatial
• Spark
![Page 47: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/47.jpg)
But not all of my data needs to be in-memory
Exactly
• Combine with a disk/flash based columnstore
• Keep real-time data in memory
• Keep historical data on disk
• Query both datastores through a single interface
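
A minimal sketch of the idea in the list above, assuming a simple time-based cutoff between the two tiers. The `TieredStore` class, its `hot_window` parameter, and the list standing in for a disk/flash columnstore are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Hypothetical two-tier store: recent rows in memory, older rows 'on disk'."""
    hot_window: int                                   # rows newer than now - hot_window stay hot
    memory_rows: list = field(default_factory=list)   # real-time tier
    disk_rows: list = field(default_factory=list)     # stand-in for a disk/flash columnstore

    def insert(self, ts, value):
        # New data always lands in the in-memory tier first.
        self.memory_rows.append((ts, value))

    def age_out(self, now):
        # Move rows older than the hot window to the historical tier.
        cutoff = now - self.hot_window
        still_hot = []
        for row in self.memory_rows:
            (still_hot if row[0] >= cutoff else self.disk_rows).append(row)
        self.memory_rows = still_hot

    def query(self, start, end):
        # Single interface: the caller does not need to know which tier holds a row.
        rows = [r for r in self.disk_rows + self.memory_rows if start <= r[0] <= end]
        return sorted(rows)


store = TieredStore(hot_window=10)
for ts in (1, 5, 95, 99):
    store.insert(ts, f"event-{ts}")
store.age_out(now=100)
all_rows = store.query(0, 100)
```

After `age_out`, the two oldest rows sit in the historical tier and the two newest stay in memory, yet `query` still returns all four in timestamp order.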
![Page 48: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/48.jpg)
What happens if a node goes down?
Replicate for availability
![Page 49: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/49.jpg)
What happens if I need to recover?
Persist logs to disk, take snapshots, make backups
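
One common way to combine logs and snapshots is sketched below. It is an illustration under stated assumptions, not the design of any specific database: the `DurableKV` class, the file names, and the JSON encoding are all hypothetical.

```python
import json
import os
import tempfile


class DurableKV:
    """Hypothetical in-memory key-value store made durable by a log plus snapshots."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")
        self.snap_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the latest snapshot, then replay log entries written after it.
        if os.path.exists(self.snap_path):
            with open(self.snap_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Persist the change to the log before applying it in memory.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full state to a temp file, swap it in, then start a fresh log.
        tmp_path = self.snap_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snap_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as d:
    db = DurableKV(d)
    db.put("a", 1)
    db.snapshot()
    db.put("b", 2)
    db.close()                      # simulate a restart
    recovered = DurableKV(d)
    recovered_data = dict(recovered.data)
    recovered.close()
```

Here `"a"` comes back from the snapshot and `"b"` from replaying the log. Backups, the third item on the slide, would be copies of these files kept elsewhere and are not modeled.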
![Page 50: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/50.jpg)
Explore the possibilities
• In-memory, distributed database
• Relational and multi-model
• Software for your data center or the cloud
• Real-time data pipelines and analytics
• New world of modern applications
![Page 51: A brief history of data processing](https://reader034.fdocuments.in/reader034/viewer/2022051520/58ed02621a28abf31c8b4703/html5/thumbnails/51.jpg)
Find Your Inner SQL
For more: @garyorenstein