Distributed storage system
DISTRIBUTED STORAGE SYSTEM
Mr. Dương Công Lợi
Company: VNG-Corp
Tel: +84989510016
Email:[email protected]
CONTENTS
1. What is a distributed-computing system?
2. Principle of distributed database/storage system
3. Distributed storage system paradigm
4. UniversalDistributedStorage
1. WHAT IS A DISTRIBUTED-COMPUTING SYSTEM?
Distributed computing is the process of solving a computational problem using a distributed system.
A distributed system is a computing system in which a number of components on multiple computers cooperate by communicating over a network to achieve a common goal.
DISTRIBUTED DATABASE/STORAGE SYSTEM
In a distributed database system, the database is stored on several computers.
A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.
DISTRIBUTED SYSTEM ADVANTAGES
Advantages
Avoids bottlenecks and single points of failure
Better scalability
Better availability
Routing model
Client routing: the client sends each request directly to the appropriate server to read/write data
Server routing: a server forwards the client's request to the appropriate server and sends the result back to the client
* The two models above can be combined in one system
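The two routing models can be sketched with a small in-memory model. This is only an illustration under assumptions: the class Server, the helpers owner_of, client_routed and server_routed, and the toy placement rule are all hypothetical names that do not come from the slides.

```python
def owner_of(key, names):
    # Toy placement rule (assumption): sum of the key bytes modulo server count.
    return names[sum(key.encode("utf-8")) % len(names)]


class Server:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.cluster = {}      # name -> Server, filled in by build_cluster
        self.forwarded = 0     # how many requests this server passed on

    def handle(self, op, key, value=None):
        owner = owner_of(key, sorted(self.cluster))
        if owner != self.name:
            # Server routing: forward to the owner and relay its answer.
            self.forwarded += 1
            return self.cluster[owner].handle(op, key, value)
        if op == "write":
            self.data[key] = value
            return "ok"
        return self.data.get(key)


def build_cluster(names):
    cluster = {name: Server(name) for name in names}
    for server in cluster.values():
        server.cluster = cluster
    return cluster


def client_routed(cluster, op, key, value=None):
    # Client routing: the client computes the owner and contacts it directly.
    owner = owner_of(key, sorted(cluster))
    return cluster[owner].handle(op, key, value)


def server_routed(cluster, entry, op, key, value=None):
    # Server routing: the client contacts any server, which forwards if needed.
    return cluster[entry].handle(op, key, value)


cluster = build_cluster(["server1", "server2", "server3"])
client_routed(cluster, "write", "k1", "v1")
print(server_routed(cluster, "server2", "read", "k1"))
```

Client routing saves a network hop but requires every client to know the placement rule; server routing keeps clients simple at the cost of forwarding.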
DISTRIBUTED STORAGE SYSTEM
Store some data {1,2,3,4,6,7,8} on 1 server,
and store the same data across 3 distributed servers:
One server: 1,2,3,4,6,7,8
Three servers: 1,2,3 / 4,6 / 7,8
2. PRINCIPLE OF DISTRIBUTED DATABASE/STORAGE SYSTEM
Shard data by key and store each item on the appropriate server using a Distributed Hash Table (DHT)
The DHT must use consistent hashing:
Uniform distribution of generated hash values
Consistent
Jenkins and Murmur are good choices; MD5 and SHA are slower
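To illustrate why uniformity alone is not enough, the sketch below places keys with a plain "hash modulo number of servers" rule and counts how many keys change server when a fourth server is added. It is an assumption-laden illustration: MD5 from the Python standard library stands in for Jenkins or Murmur (which are not in the standard library), and the helper names h and modulo_owner are hypothetical.

```python
import hashlib


def h(key):
    # Stand-in hash (assumption): first 8 bytes of MD5 as an integer.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_owner(key, n_servers):
    # Naive placement: uniform, but NOT consistent when n_servers changes.
    return h(key) % n_servers


keys = [f"key-{i}" for i in range(10_000)]

# Uniformity: how many keys land on each of 3 servers.
counts = [0, 0, 0]
for k in keys:
    counts[modulo_owner(k, 3)] += 1

# Consistency: fraction of keys whose server changes when going from 3 to 4.
moved = sum(modulo_owner(k, 3) != modulo_owner(k, 4) for k in keys)
moved_fraction = moved / len(keys)

print("keys per server:", counts)
print("fraction moved after adding a server:", moved_fraction)
```

With modulo placement most keys move when the server count changes, which is the behaviour a consistent-hashing scheme is meant to avoid.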
CANONICAL PROBLEMS IN DISTRIBUTED SYSTEMS
Distributed data independence
Distributed transactions: ACID (Atomicity, Consistency, Isolation, Durability) requirement
Fault tolerance
Transparency
3. DISTRIBUTED STORAGE SYSTEM PARADIGM
Data Hashing/Addressing
Determine which server a piece of data is stored on
Data Replication
Store data on multiple server nodes for higher availability and fault tolerance
DISTRIBUTED STORAGE SYSTEM ARCHITECTURE
Data Hashing/Addressing
Use the DHT to address each server: hash the server name to a number and place it on a circle called the key space
Use the DHT to address each data key and find the server that stores it with successor(k)=ceiling(addressing(k))
successor(k): the server that stores k
[Figure: key-space circle starting at 0, with server1, server2 and server3 placed on it]
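The successor(k) rule can be sketched as a sorted list of server positions searched with bisect. This is a minimal sketch under assumptions, not the presenter's implementation: the class name Ring is hypothetical, and MD5 from the standard library stands in for the DHT hash function.

```python
import bisect
import hashlib


def addressing(name):
    # Map a server name or data key to a point in the key space.
    # Stand-in hash (assumption): first 8 bytes of MD5.
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    def __init__(self, servers):
        # Each server sits at addressing(server-name) on the circle.
        self._points = sorted((addressing(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._points]

    def successor(self, key):
        # First server position >= addressing(key): the "ceiling".
        i = bisect.bisect_left(self._positions, addressing(key))
        # Past the last position, wrap around to the start of the circle.
        return self._points[i % len(self._points)][1]


ring = Ring(["server1", "server2", "server3"])
for key in ("k1", "k2", "k3"):
    print(key, "->", ring.successor(key))
```

A useful property of this layout is that adding a server only takes over keys from its neighbour on the circle; every other key keeps its server.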
DISTRIBUTED STORAGE SYSTEM ARCHITECTURE
Addressing – Virtual node
Each server node is given several node IDs (virtual nodes) so that data is evenly distributed and load is balanced
Server1: n1, n4, n6
Server2: n2, n7
Server3: n3, n5
[Figure: key-space circle starting at 0, with the virtual nodes n1 to n7 of server1, server2 and server3 interleaved around it]
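The effect of virtual nodes can be sketched by giving every server several positions on the circle and counting keys per server. The details are assumptions made for illustration: the helper names, the "server#n<i>" naming of virtual nodes, the MD5 stand-in hash, and the use of an equal number of virtual nodes per server (the slide shows 3, 2 and 2).

```python
import bisect
import hashlib
from collections import Counter


def addressing(name):
    # Stand-in hash (assumption): first 8 bytes of MD5.
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(servers, vnodes):
    # Every server gets `vnodes` node IDs, each hashed to its own position.
    return sorted(
        (addressing(f"{server}#n{i}"), server)
        for server in servers
        for i in range(vnodes)
    )


def load_per_server(servers, vnodes, keys):
    ring = build_ring(servers, vnodes)
    positions = [pos for pos, _ in ring]
    load = Counter()
    for key in keys:
        i = bisect.bisect_left(positions, addressing(key))
        load[ring[i % len(ring)][1]] += 1   # wrap around past the last node
    return load


servers = ["server1", "server2", "server3"]
keys = [f"key-{i}" for i in range(5000)]
for vnodes in (1, 200):
    print(vnodes, "virtual node(s):", dict(load_per_server(servers, vnodes, keys)))
```

With many virtual nodes per server, each server's share of the key space is the sum of many small arcs, so the shares tend to even out.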
DISTRIBUTED STORAGE SYSTEM ARCHITECTURE
Data Replication
Data k1 is stored on server1 as master and on server2 as slave
[Figure: key-space circle starting at 0, with server1, server2, server3 and the data key k1]
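A minimal sketch of master/slave placement follows. The slides do not say how the slave is chosen; the sketch assumes it is the next distinct server clockwise after the master, which is one common choice. All function names, the MD5 stand-in hash and the in-memory stores are hypothetical.

```python
import bisect
import hashlib


def addressing(name):
    # Stand-in hash (assumption): first 8 bytes of MD5.
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(servers):
    return sorted((addressing(s), s) for s in servers)


def replicas(ring, key, count=2):
    # replicas[0] is the master (successor of the key); the rest are slaves,
    # taken as the next distinct servers clockwise (assumption).
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_left(positions, addressing(key))
    chosen = []
    for step in range(len(ring)):
        server = ring[(start + step) % len(ring)][1]
        if server not in chosen:
            chosen.append(server)
        if len(chosen) == count:
            break
    return chosen


SERVERS = ["server1", "server2", "server3"]
RING = build_ring(SERVERS)
stores = {s: {} for s in SERVERS}   # in-memory stand-in for each server


def put(key, value):
    for server in replicas(RING, key):
        stores[server][key] = value


def get(key, down=()):
    # Read from the master, falling back to the slave if the master is down.
    for server in replicas(RING, key):
        if server not in down:
            return stores[server].get(key)
    return None


put("k1", "v1")
master, slave = replicas(RING, "k1")
print("master:", master, "slave:", slave)
print("read with master down:", get("k1", down=(master,)))
```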
UNIVERSALDISTRIBUTEDSTORAGE
a distributed storage system
4. UNIVERSALDISTRIBUTEDSTORAGE
UniversalDistributedStorage is a distributed storage system developed for:
Distributed data independence
Distributed transactions (ACID)
Fault tolerance
Leader election (deciding when a server node joins or leaves)
Replication with multi-master replication
Transparency
UNIVERSALDISTRIBUTEDSTORAGE ARCHITECTURE
Overview
[Figure: architecture overview showing three nodes, each made up of a Business Layer, a Distributed Layer and a Storage Layer]
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
Data hashing/addressing
Uses the Murmur hash function
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
Leader election
Uses the Bully leader election algorithm
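As a reminder of how the Bully algorithm works in general, here is a simplified single-process simulation. It is not the UniversalDistributedStorage implementation: the function name bully_election, the message tuples and the use of integer node IDs are assumptions, and real deployments detect crashed nodes with timeouts rather than a known set of live nodes.

```python
def bully_election(initiator, all_ids, alive):
    """Simulate one Bully election started by a live node.

    all_ids and alive are sets of integer node IDs; returns (leader, messages).
    """
    messages = []
    started = set()
    leader = None
    pending = [initiator]
    while pending:
        node = pending.pop()
        if node in started:
            continue
        started.add(node)
        responders = []
        # Ask every node with a higher ID whether it is alive.
        for other in sorted(i for i in all_ids if i > node):
            messages.append(("ELECTION", node, other))
            if other in alive:
                messages.append(("OK", other, node))
                responders.append(other)
        if responders:
            # A higher node answered: it takes over and runs its own election.
            pending.extend(responders)
        else:
            # Nobody higher answered: this node wins and announces itself.
            leader = node
            for other in sorted(alive - {node}):
                messages.append(("COORDINATOR", node, other))
    return leader, messages


# Node 5 (the old leader) has crashed; node 2 notices and starts an election.
leader, log = bully_election(2, {1, 2, 3, 4, 5}, {1, 2, 3, 4})
print("new leader:", leader)
```

The node with the highest ID among the live nodes always ends up as leader, whichever live node starts the election.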
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
Multi-master replication
Problem of multi-master replication
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
Multi-master replication
Data is first stored on the main master (called the sub-leader); it is then posted to a queue to be synced to the other masters.
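The write path described above can be sketched with an in-memory queue. This is a hypothetical model for illustration only: the class names Master and ReplicaGroup, the choice of the first master as sub-leader, and the explicit sync() call are assumptions; a real system would drain the queue asynchronously over the network.

```python
from collections import deque


class Master:
    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicaGroup:
    def __init__(self, names):
        self.masters = [Master(name) for name in names]
        self.sub_leader = self.masters[0]   # main master (assumption: first one)
        self.queue = deque()                # pending updates for the other masters

    def write(self, key, value):
        # Step 1: store on the sub-leader.
        self.sub_leader.data[key] = value
        # Step 2: post the update to the sync queue.
        self.queue.append((key, value))

    def sync(self):
        # Drain the queue in FIFO order so later writes win on every master.
        applied = 0
        while self.queue:
            key, value = self.queue.popleft()
            for master in self.masters:
                if master is not self.sub_leader:
                    master.data[key] = value
            applied += 1
        return applied


group = ReplicaGroup(["master1", "master2", "master3"])
group.write("k1", "v1")
group.write("k1", "v2")
print("before sync:", [m.data for m in group.masters])
group.sync()
print("after sync: ", [m.data for m in group.masters])
```

Funnelling each write through one sub-leader and a FIFO queue gives every master the same order of updates, which sidesteps conflicting concurrent writes to the same key.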
UNIVERSALDISTRIBUTEDSTORAGE STATISTICS
System information:
3 machines, 8GB RAM, Core i5 3,220GHz
LAN/WAN network
7 physical servers on the 3 machines above
Concurrent write: 16500000 items in 3680s, rate ~ 4480 req/sec (measured at the client)
Concurrent read: 16500000 items in 1458s, rate ~ 11320 req/sec (measured at the client)
* This is not the limit of the system; the limit is at the clients (this test used 3 client threads)
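The reported rates follow directly from the item counts and durations, as the short check below shows. The per-thread figures are derived here for illustration, assuming the load was spread evenly over the 3 client threads; they are not stated in the slides.

```python
ITEMS = 16_500_000
CLIENT_THREADS = 3


def rate(items, seconds):
    # Average throughput in requests per second.
    return items / seconds


write_rate = rate(ITEMS, 3680)
read_rate = rate(ITEMS, 1458)

print(f"write: {write_rate:.0f} req/sec, {write_rate / CLIENT_THREADS:.0f} per client thread")
print(f"read:  {read_rate:.0f} req/sec, {read_rate / CLIENT_THREADS:.0f} per client thread")
```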
Q & A
Contact:
Duong Cong Loi
https://www.facebook.com/duongcong.loi