Parallel and distributed databases R & G Chapter 22.
Date posted: 21-Dec-2015
![Page 1: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/1.jpg)
Parallel and distributed databases
R & G Chapter 22
![Page 2: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/2.jpg)
What is a distributed database?
![Page 3: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/3.jpg)
Why distribute a database
Scalability and performance
Resilience to failures
[Figure: throughput plotted against data size; failed components marked X]
![Page 4: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/4.jpg)
Why distribute a database
Data is already distributed
Or needs to be distributed
Data is in multiple systems
![Page 5: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/5.jpg)
Why not distribute a database
You must earn your complexity!
Communication needed: must build a complex infrastructure; unpredictable latencies must be masked
More types of failures: more components to fail; network failures; congestion, timeouts
More complex planning: communication cost plus I/O cost
May have to deal with heterogeneity: different types of systems; different schemas, possibly incompatible; different administrative domains
![Page 6: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/6.jpg)
Types of distributed databases
![Page 7: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/7.jpg)
The old days: mainframes
Definitely not distributed!
![Page 8: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/8.jpg)
Client-server
User interaction
Data processing
Network
![Page 9: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/9.jpg)
Parallel database
![Page 10: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/10.jpg)
Primary/secondary
![Page 11: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/11.jpg)
Multidatabase
![Page 12: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/12.jpg)
How do they work?
What is shared?
How to distribute the data?
How to process the data?
How to update the data?
![Page 13: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/13.jpg)
What is shared?
Memory
CPUs RAM Disk
Most modern DBMSs
![Page 14: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/14.jpg)
What is shared?
Disk
RAM
Oracle RAC
![Page 15: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/15.jpg)
What is shared?
Nothing
RAM
Search engines, Teradata
![Page 16: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/16.jpg)
How to distribute the data?
Server 1 Server 2 Server 3 Server 4
Bike     $86    6/2/07   636353
Chair    $10    6/5/07   662113
Couch    $570   6/1/07   424252
Car      $1123  6/1/07   256623
Lamp     $19    6/7/07   121113
Bike     $56    6/9/07   887734
Scooter  $18    6/11/07  252111
Hammer   $8000  6/11/07  116458
![Page 17: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/17.jpg)
How to distribute the data?
Hash partitioning: (key,value) is routed by Hash()
Range partitioning: (key,value) is routed by comparing the key with a split point X (<= X or > X)
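The two routing schemes can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the use of CRC32 as the hash, the four-server cluster size, and the split points are all hypothetical choices, not something the slides specify.

```python
import bisect
import zlib

NUM_SERVERS = 4  # assumed cluster size, matching the four servers drawn on the slides


def hash_partition(key: str, num_servers: int = NUM_SERVERS) -> int:
    """Route a key to a server by hashing it (CRC32 is an arbitrary but stable choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_servers


def range_partition(key: str, split_points: list) -> int:
    """Route a key to a server by comparing it with sorted split points.

    Keys <= split_points[0] go to server 0, keys > split_points[-1] go to the last server.
    """
    return bisect.bisect_left(split_points, key)


# Hypothetical split points that divide item names into four ranges.
SPLITS = ["F", "M", "S"]
placement = {name: range_partition(name, SPLITS) for name in ["Bike", "Hammer", "Scooter"]}
```

Hash partitioning tends to spread keys evenly but scatters neighbouring keys, while range partitioning keeps neighbouring keys together, which helps range queries but can skew the load.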
![Page 18: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/18.jpg)
How to distribute the data?
Server 1 Server 2 Server 3 Server 4
Bike, Chair, Couch, Car, Lamp, Bike, Scooter, Hammer
$86, $10, $570, $1123, $19, $56, $18, $8000
6/2/07, 6/5/07, 6/1/07, 6/1/07, 6/7/07, 6/9/07, 6/11/07, 6/11/07
636353, 662113, 424252, 256623, 121113, 887734, 252111, 116458
![Page 19: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/19.jpg)
Query processing
Intra-operator parallelism
Inter-operator parallelism
![Page 20: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/20.jpg)
Parallel scanning
[Figure: six filter operators run in parallel and feed one Result]
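A minimal sketch of a parallel scan, assuming each partition lives on its own server and threads stand in for those servers. The placement of rows on partitions and all function names are hypothetical; the item names and prices are taken from the example table.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed placement: each inner list stands for the rows stored on one server.
PARTITIONS = [
    [("Bike", 86), ("Chair", 10)],
    [("Couch", 570), ("Car", 1123)],
    [("Lamp", 19), ("Bike", 56)],
    [("Scooter", 18), ("Hammer", 8000)],
]


def scan_partition(rows, predicate):
    """Local filter: runs independently on one partition."""
    return [row for row in rows if predicate(row)]


def parallel_scan(partitions, predicate):
    """Apply the same filter to every partition concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_results = pool.map(lambda rows: scan_partition(rows, predicate), partitions)
        return [row for part in partial_results for row in part]


# Items cheaper than $50, collected from all four partitions.
cheap = parallel_scan(PARTITIONS, lambda row: row[1] < 50)
```

Because every filter touches only its own partition, no coordination is needed until the final merge.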
![Page 21: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/21.jpg)
Sorting
![Page 22: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/22.jpg)
Sorting
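The slides do not spell out a sorting algorithm. One common approach, sketched here as an assumption rather than as the method shown in the figures, is to redistribute values by range, sort each range locally, and concatenate the ranges in order. The function name and split points are hypothetical.

```python
import bisect


def parallel_sort(values, split_points):
    """Range-partition the values, sort each partition locally, then concatenate."""
    # Phase 1: redistribute, so that bucket i only holds values in the i-th range.
    buckets = [[] for _ in range(len(split_points) + 1)]
    for value in values:
        buckets[bisect.bisect_left(split_points, value)].append(value)
    # Phase 2: each bucket is sorted independently (on its own server in a real system).
    sorted_buckets = [sorted(bucket) for bucket in buckets]
    # Phase 3: ranges are already ordered relative to each other, so concatenation suffices.
    return [value for bucket in sorted_buckets for value in bucket]


prices = [86, 10, 570, 1123, 19, 56, 18, 8000]
ordered = parallel_sort(prices, split_points=[50, 500, 1000])
```

The choice of split points matters: badly chosen ranges leave one server with most of the data and most of the work.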
![Page 23: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/23.jpg)
Parallel hash join
Hash()
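A rough sketch of the idea, assuming both inputs are repartitioned with the same hash function so that matching rows land on the same server, where an ordinary local hash join runs. Function names, the `owners` table, and its contents are hypothetical; the item ids come from the example table.

```python
from collections import defaultdict


def partition_by_key(rows, key_index, num_servers):
    """Send each row to the server chosen by hashing its join key."""
    parts = [[] for _ in range(num_servers)]
    for row in rows:
        parts[hash(row[key_index]) % num_servers].append(row)
    return parts


def local_hash_join(left, right, left_key, right_key):
    """Classic build-and-probe hash join on one server."""
    table = defaultdict(list)
    for l_row in left:
        table[l_row[left_key]].append(l_row)
    return [l_row + r_row for r_row in right for l_row in table.get(r_row[right_key], [])]


def parallel_hash_join(left, right, left_key, right_key, num_servers=4):
    """Repartition both inputs with the same hash, then join partition by partition."""
    left_parts = partition_by_key(left, left_key, num_servers)
    right_parts = partition_by_key(right, right_key, num_servers)
    result = []
    for l_part, r_part in zip(left_parts, right_parts):  # each pair could run on its own server
        result.extend(local_hash_join(l_part, r_part, left_key, right_key))
    return result


items = [(636353, "Bike"), (662113, "Chair"), (424252, "Couch")]
owners = [(636353, "alice"), (424252, "bob"), (999999, "carol")]  # hypothetical second table
joined = parallel_hash_join(items, owners, left_key=0, right_key=0)
```

Rows with equal keys always hash to the same partition, so no matches are lost by joining the partitions separately.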
![Page 24: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/24.jpg)
Join
![Page 25: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/25.jpg)
Semi-join
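As an illustration of why a semi-join helps when the two tables sit at different sites, the sketch below ships only the join keys in one direction and only the matching rows back. The site names, function name, and the `owners` data are assumptions made for this example.

```python
def semi_join(site_a_rows, a_key, site_b_rows, b_key):
    """Join rows held at site A with rows held at site B, shipping as little as possible."""
    # Step 1 (site A): project the join column and ship only the distinct keys to site B.
    shipped_keys = {row[a_key] for row in site_a_rows}
    # Step 2 (site B): keep only rows that will find a match, and ship those back to A.
    reduced_b = [row for row in site_b_rows if row[b_key] in shipped_keys]
    # Step 3 (site A): ordinary local join against the reduced rows.
    joined = [a + b for a in site_a_rows for b in reduced_b if a[a_key] == b[b_key]]
    return joined, len(reduced_b)


items = [(636353, "Bike"), (662113, "Chair")]
owners = [(636353, "alice"), (111111, "bob"), (222222, "carol")]  # hypothetical remote table
joined, rows_shipped_back = semi_join(items, 0, owners, 0)
```

Here only one of the three remote rows crosses the network; the saving grows with the fraction of remote rows that have no match.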
![Page 26: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/26.jpg)
Inter-operator parallelism
![Page 27: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/27.jpg)
Updating distributed data
Synchronous: read-any-write-all
Reads are fast
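A toy model of read-any-write-all, assuming in-memory dictionaries stand in for replicas and an `available` flag stands in for network reachability. The class and attribute names are hypothetical.

```python
class ReadAnyWriteAll:
    """Every write goes to all replicas; a read can be served by any single replica."""

    def __init__(self, num_replicas):
        self.replicas = [{} for _ in range(num_replicas)]
        self.available = [True] * num_replicas  # simulated reachability

    def write(self, key, value):
        # A write must reach every copy, so one unreachable replica blocks it.
        if not all(self.available):
            raise RuntimeError("write blocked: a replica is unreachable")
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Any reachable copy is up to date, so the first one found is enough.
        for replica, up in zip(self.replicas, self.available):
            if up:
                return replica[key]
        raise RuntimeError("no replica reachable")


store = ReadAnyWriteAll(num_replicas=3)
store.write("Bike", 86)
store.available[0] = False   # one replica becomes unreachable
price = store.read("Bike")   # still served by another replica
```

The price of fast reads is that a single unreachable replica stops all writes.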
![Page 28: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/28.jpg)
Updating distributed data
Synchronous: voting
![Page 29: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/29.jpg)
Updating distributed data
Synchronous: voting
Writes tolerant to disconnection
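Voting can be sketched with version numbers and quorums. This example assumes the usual conditions that the read and write quorums overlap (R + W > N) and that any two write quorums overlap (2W > N); the class name, the versioning scheme, and the `available` flags are hypothetical.

```python
class QuorumStore:
    """Writes go to a write quorum, reads consult a read quorum and take the newest version."""

    def __init__(self, n, read_quorum, write_quorum):
        if read_quorum + write_quorum <= n or 2 * write_quorum <= n:
            raise ValueError("quorums must overlap: need R + W > N and 2W > N")
        self.replicas = [{} for _ in range(n)]   # key -> (version, value)
        self.available = [True] * n              # simulated reachability
        self.r, self.w = read_quorum, write_quorum

    def _reachable(self):
        return [rep for rep, up in zip(self.replicas, self.available) if up]

    def write(self, key, value):
        up = self._reachable()
        if len(up) < self.w:
            raise RuntimeError("write quorum not reachable")
        # The reachable set overlaps the previous write quorum, so it sees the latest version.
        version = 1 + max(rep.get(key, (0, None))[0] for rep in up)
        for rep in up[: self.w]:
            rep[key] = (version, value)

    def read(self, key):
        up = self._reachable()
        if len(up) < self.r:
            raise RuntimeError("read quorum not reachable")
        # At least one replica in the read quorum holds the newest version.
        newest = max((rep.get(key, (0, None)) for rep in up[: self.r]), key=lambda vv: vv[0])
        return newest[1]


store = QuorumStore(n=3, read_quorum=2, write_quorum=2)
store.write("x", "a")
store.available[0] = False        # replica 0 disconnects
store.write("x", "b")             # still succeeds with two replicas
store.available[0] = True
store.available[2] = False
latest = store.read("x")          # replica 0 is stale, but the quorum still returns "b"
```

Compared with read-any-write-all, writes survive a disconnected replica, but every read now has to contact several replicas.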
![Page 30: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/30.jpg)
Consistency of distributed data
Should provide ACID
![Page 31: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/31.jpg)
Primary/secondary
![Page 32: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/32.jpg)
Two-phase commit
PREPARE
PREPARED PREPARED
COMMIT
![Page 33: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/33.jpg)
Two-phase commit
PREPARE
PREPARED ABORT
ABORT
![Page 34: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/34.jpg)
Two-phase commit
PREPARE
PREPARED
ABORT
![Page 35: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/35.jpg)
Two-phase commit
PREPARE
PREPARED PREPARED
X
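The four message sequences above can be read as cases of one decision rule. The sketch below is an interpretation, not the slides' own code: it assumes a missing reply is treated as a timeout that forces ABORT, and that the final X stands for the coordinator failing before it sends a decision. `Vote.NO_REPLY`, the function names, and the "BLOCKED" label are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    PREPARED = "PREPARED"
    ABORT = "ABORT"
    NO_REPLY = "NO_REPLY"  # models a timeout; this name is an assumption of the sketch


def coordinator_decision(votes):
    """Phase 2: COMMIT only if every participant answered PREPARE with PREPARED."""
    if votes and all(vote is Vote.PREPARED for vote in votes):
        return "COMMIT"
    return "ABORT"


def participant_outcome(vote, decision):
    """What a participant may do given its own vote and the decision it received (or None)."""
    if vote is Vote.ABORT:
        return "ABORT"      # a participant that voted ABORT can abort on its own
    if decision is None:
        return "BLOCKED"    # voted PREPARED but heard nothing: must wait for the coordinator
    return decision


all_prepared = coordinator_decision([Vote.PREPARED, Vote.PREPARED])
one_abort = coordinator_decision([Vote.PREPARED, Vote.ABORT])
one_silent = coordinator_decision([Vote.PREPARED, Vote.NO_REPLY])
coordinator_lost = participant_outcome(Vote.PREPARED, decision=None)
```

The last case shows the protocol's known weakness: a prepared participant cannot decide on its own and holds its locks until the coordinator recovers.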
![Page 36: Parallel and distributed databases R & G Chapter 22.](https://reader035.fdocuments.in/reader035/viewer/2022081506/56649d555503460f94a31cd6/html5/thumbnails/36.jpg)
Conclusion
Parallelism and distribution are very useful: performance, fault tolerance, scale
But complex! Must rethink many aspects of the system; must earn the complexity