Strongly consistent metadata. - GitHub Pages · 2020. 9. 6. · Scaling HDFS to more than 1 million...


Page 1: Strongly consistent metadata. - GitHub Pages · 2020. 9. 6. · Scaling HDFS to more than 1 million operations per second with HopsFS M Ismail, S Niazi, M Ronström, S Haridi, J Dowling

/
/home
/home/A
/home/B
/usr
/usr/A
/usr/B
/bin
/bin/A
/bin/B
/lib
/lib/A
/lib/B

• Strongly consistent metadata.

• Atomic file system operations, such as move and create.

Metadata


• {operation} [flags] {path(s)}


create file /user/F4
create file /user/F4
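
One way to read the two identical requests above is as two clients racing to create the same path. The sketch below shows the outcome an atomic create should give in that case: exactly one request succeeds. It is only an illustration under stated assumptions: the `Inodes` schema, the parent id `5` for /user, and the `create_file` helper are hypothetical, and a SQLite primary key stands in for whatever mechanism the real system uses.

```python
import sqlite3


def create_file(conn, parent_id, name):
    """Try to create an inode; return True on success, False if it already exists."""
    try:
        # "with conn" commits on success and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO Inodes (PID, name) VALUES (?, ?)",
                (parent_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        # The (PID, name) pair is already taken: another create got there first.
        return False


conn = sqlite3.connect(":memory:")
# Hypothetical schema: a file is identified by its parent directory id and its name.
conn.execute(
    "CREATE TABLE Inodes ("
    "PID INTEGER NOT NULL, name TEXT NOT NULL, PRIMARY KEY (PID, name))"
)

USER_DIR_ID = 5  # assumed inode id of /user, chosen only for this example
first = create_file(conn, USER_DIR_ID, "F4")
second = create_file(conn, USER_DIR_ID, "F4")
print(first, second)
```

The second call fails without leaving a partial row behind, which is the behaviour the "atomic file system operations" bullet asks for.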


Select * from Inodes where PID = 1

drwxrwx--- /home/A
drwxrwx--- /home/B
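
The query above amounts to a directory listing: every inode row stores the id of its parent (PID), so listing a directory means selecting the rows with that parent id. A minimal sketch using SQLite follows. The column names other than `PID`, the inode ids, the index name, and the sample rows are assumptions made for illustration, not the actual schema.

```python
import sqlite3


def build_demo_namespace():
    """Create an in-memory Inodes table holding a tiny, made-up namespace."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Inodes (
            id    INTEGER PRIMARY KEY,  -- inode id (hypothetical values below)
            PID   INTEGER,              -- inode id of the parent directory
            name  TEXT,
            perms TEXT
        )
        """
    )
    # Index on the parent id so a listing does not have to scan the whole table.
    conn.execute("CREATE INDEX inodes_by_parent ON Inodes (PID, name)")
    conn.executemany(
        "INSERT INTO Inodes (id, PID, name, perms) VALUES (?, ?, ?, ?)",
        [
            (1, 0, "home", "drwxr-xr-x"),
            (2, 0, "usr", "drwxr-xr-x"),
            (3, 1, "A", "drwxrwx---"),
            (4, 1, "B", "drwxrwx---"),
            (5, 2, "A", "drwxr-xr-x"),
        ],
    )
    return conn


def list_children(conn, parent_id):
    """Return (perms, name) for every inode whose parent is parent_id."""
    cur = conn.execute(
        "SELECT perms, name FROM Inodes WHERE PID = ? ORDER BY name",
        (parent_id,),
    )
    return cur.fetchall()


conn = build_demo_namespace()
for perms, name in list_children(conn, 1):  # 1 is the assumed id of /home
    print(f"{perms} /home/{name}")
# prints:
# drwxrwx--- /home/A
# drwxrwx--- /home/B
```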


Select * from Inodes where PID = 1

WITH(INDEX(…))

drwxrwx--- /home/A
drwxrwx--- /home/B


Hash function

Partition number = PID % 4
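
The hash rule above can be tried out in a few lines. Because the partition is derived from the parent id, all children of one directory land in the same partition. The function and variable names below are hypothetical, and the sample inodes are made up; only the `PID % 4` rule comes from the slide.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # matches the "% 4" on the slide


def partition_for(parent_id, num_partitions=NUM_PARTITIONS):
    """Map a parent inode id (PID) to a partition number."""
    return parent_id % num_partitions


def group_by_partition(inodes):
    """Group (parent_id, name) pairs by the partition that would store them."""
    partitions = defaultdict(list)
    for parent_id, name in inodes:
        partitions[partition_for(parent_id)].append(name)
    return dict(partitions)


# Made-up inodes: (parent id, name)
sample = [(1, "A"), (1, "B"), (2, "C"), (6, "D"), (4, "E")]
print(group_by_partition(sample))
# prints: {1: ['A', 'B'], 2: ['C', 'D'], 0: ['E']}
```

Siblings `A` and `B` share partition 1, so a listing of their parent touches a single partition. Unrelated directories can also collide, as parent ids 2 and 6 do here.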


Select * from Inodes where PID = 2

WITH(INDEX(…))

drwxrwx--- /home/A
drwxrwx--- /home/B


Start Transaction on Node 2

Select * from Inodes where PID = 2

WITH(INDEX(name))

drwxrwx--- /home/A
drwxrwx--- /home/B
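
The step above starts the transaction on the node that already holds the rows for PID = 2, so the listing can be answered from that node alone. The toy model below sketches the idea. It assumes one partition per node and `PID % 4` placement; `ToyCluster` and its methods are hypothetical names and do not correspond to a real database API.

```python
from collections import defaultdict

NUM_NODES = 4  # assumption: one partition per database node


class ToyCluster:
    """In-memory stand-in for a partitioned inode table."""

    def __init__(self, num_nodes=NUM_NODES):
        self.num_nodes = num_nodes
        # One dict per node: parent id -> names of the children stored there.
        self.nodes = [defaultdict(list) for _ in range(num_nodes)]

    def node_for(self, parent_id):
        """Node that owns every row whose PID equals parent_id."""
        return parent_id % self.num_nodes

    def add_inode(self, parent_id, name):
        self.nodes[self.node_for(parent_id)][parent_id].append(name)

    def list_children(self, parent_id):
        """Pick the owning node first, then read only that node's rows."""
        node_id = self.node_for(parent_id)
        local_rows = self.nodes[node_id]
        return node_id, sorted(local_rows.get(parent_id, []))


cluster = ToyCluster()
for parent_id, name in [(2, "B"), (2, "A"), (3, "X"), (6, "C")]:
    cluster.add_inode(parent_id, name)

node_id, children = cluster.list_children(2)
print(f"transaction started on node {node_id}: {children}")
# prints: transaction started on node 2: ['A', 'B']
```

Choosing the node before issuing the query keeps the read local to one node. Starting on any other node would work in a real system too, but would add a network hop to reach the data.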


• Identical average operation latency (~3 ms) for a small number (50) of clients

• 10x lower latency for a large number (6500) of clients

• 37x more metadata


Block Storage Layer for Small and Large Files

Metadata Storage

Small Files Storage

Block Storage Layer for Large Files


Read More

Scaling Distributed Hierarchical File Systems Using NewSQL Databases

Salman Niazi. Ph.D. Thesis. KTH Royal Institute of Technology

Scaling Hierarchical File System Metadata Using NewSQL Databases

S Niazi, M Ismail, S Haridi, J Dowling, S Grohsschmiedt, M Ronström

15th USENIX Conference on File and Storage Technologies (FAST 17), 89-104

Scaling HDFS to more than 1 million operations per second with HopsFS

M Ismail, S Niazi, M Ronström, S Haridi, J Dowling

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid …

Size Matters: Improving the Performance of Small Files in Hadoop

S Niazi, M Ronström, S Haridi, J Dowling

Proceedings of the 19th International Middleware Conference, 26-39