RIMAC: Redundancy-based hierarchical I/O cache architecture for energy-efficient, high-performance storage systems
Xiaoyu Yao and Jun Wang
Computer Architecture and Storage System Laboratory (CASS)
University of Nebraska - Lincoln


2006-04-20 University of Nebraska-Lincoln 2

Big Picture

  • Current energy-efficient storage solutions (promising):
      • Saving energy at the cost of performance
      • Saving energy by using DRPM disks
  • New: RIMAC, a redundancy-based hierarchical I/O cache architecture
      • Making the storage cache aware of redundancy
      • Solving the performance problem with power-aware request transformation


Outline

  • Background & Motivation
      • Why? How does RIMAC differ?
  • RIMAC: Redundancy-based Hierarchical I/O Cache Architecture
  • Evaluation
  • Conclusion


Energy Issues of Internet Data Center

[Figure: data center diagram showing the Internet, Router, Switch, Web Servers, Application Servers, Database Servers, SAN, and Storage System. Annotations: "27% of Total Energy*" and "70%/Year".]

* From WP’02, http://www.max-t.com


Backend Storage System

  • High-performance SCSI disks
  • Small disk array as building block
      • RAID-1, mirrored disk array
      • RAID-5, parity disk array
  • Multi-level I/O cache
      • Large storage cache
      • Moderate RAID controller cache


Related Work

Name                    Conventional Disk   Disk Array       Storage Cache   Performance Penalty
MAID (SC ’02)           Yes                 RAID-0           Yes             Yes
PDC (ICS ’04)           Yes                 No               No              Yes
FS2 (SOSP ’05)          Yes                 No               No              No
DRPM (ISCA ’03)         No                  RAID-1, RAID-5   No
PA/PB (HPCA ’04)        No                  No               Yes
Hibernator (SOSP ’05)   No                  RAID-5           No


Motivations

  • Server workload characteristics
      • Dispersed idle periods
      • High performance vs. energy conservation
  • Long “passive spin-up” delay in conventional disks (10-15 seconds)
  • Exploiting existing infrastructure to consolidate the short idle periods
      • Internal redundancy in disk array
      • Multi-level I/O cache


RIMAC - Redundancy

  • Identifying sources of “passive spin-up”
      • Non-blocking read
      • Derivative read due to parity update
      • Dirty block flushing [Zhu et al., HPCA’04]
  • Exploiting inherent redundancy to address the untouched sources of “passive spin-up”
      • 1/N redundancy in RAID-5
      • Requests on standby disks are transformed to active disk accesses
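
The redundancy being exploited can be sketched in a few lines of Python. This is only an illustration of RAID-5 XOR parity, not the authors' C implementation: the helper names `xor_blocks` and `rebuild_block` and the 4-byte block contents are assumptions made for the example. It shows how a block that lives on a standby disk can be rebuilt from the other data blocks of its stripe plus the stripe parity, so the standby disk does not have to spin up.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) walks the blocks column by column (one byte from each block).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_block(surviving_data: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the one missing data block of a stripe from the others plus parity."""
    return xor_blocks(parity, *surviving_data)


# Hypothetical contents of one stripe with three data blocks.
block4 = b"\x01\x02\x03\x04"
block5 = b"\x10\x20\x30\x40"
block6 = b"\xaa\xbb\xcc\xdd"
parity2 = xor_blocks(block4, block5, block6)

# Block 6 sits on a standby disk: rebuild it without touching that disk.
rebuilt6 = rebuild_block([block4, block5], parity2)
```

The rebuilt block equals the original because XOR is its own inverse; the cost is reading N-1 blocks instead of one, which is why serving those blocks from cache matters.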


RIMAC - Cooperative Cache

  • Deploying parity exclusive cache
      • Storage cache: user data
      • RAID controller cache: parity
  • Leveraging redundancy exploitation in cache
      • High-performance power-aware request transformation in multi-level I/O cache
      • Larger effective storage cache size with new placement/replacement algorithm


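Sample Scenario – Transformable Read in Cache (TRC)

[Figure: RIMAC storage system (front-end, up-half, bottom-half) over a four-disk RAID-5 array with stripes (1, 2, 3, P1), (5, 6, P2, 4), (9, P3, 7, 8), and (P4, 10, 11, 12). The disk holding block 6 is in standby; the other three disks are idle/active. The front-end receives Read (addr=6, len=1). The storage cache holds blocks 4 and 5 and the parity cache holds P2 and P3; block 6 is produced by XOR and returned in the response.]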


Sample Scenario – Transformable Read on Disk (TRD)

[Figure: the same RIMAC storage system and four-disk RAID-5 array, with the disk holding block 6 in standby and the other three disks idle/active. The front-end receives Read (addr=6, len=1). The storage cache holds blocks 4 and 8 and the parity cache holds P2 and P3; block 6 is produced by XOR and returned in the response.]


Power-aware Request Transformation

[Figure: request paths through the storage cache, the parity cache, and the disks.]

  • Read: TRC, TRD
  • Write (PUPA): PU-DA-C, PU-CA-C, PU-DA-D, PU-CA-D
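
One way to read the two sample scenarios is as a small decision procedure for reads that target a standby disk. The sketch below is an interpretation, not the authors' algorithm: the `Stripe` type, the `classify_read` function, the returned labels other than TRC and TRD, and the block-to-disk numbering are all assumptions made for illustration. A read is labelled TRC when every other block of the stripe is already cached, and TRD when the uncached pieces can be fetched from disks that are not in standby.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Stripe:
    """One RAID-5 stripe (hypothetical representation)."""
    data: Dict[int, int]  # data block id -> id of the disk holding it
    parity: str           # parity block id, e.g. "P2"
    parity_disk: int      # id of the disk holding the parity block


def classify_read(block: int, stripe: Stripe, storage_cache: Set[int],
                  parity_cache: Set[str], standby_disks: Set[int]) -> str:
    """Classify a one-block read under the assumed policy."""
    if block in storage_cache:
        return "READ_HIT"
    if stripe.data[block] not in standby_disks:
        return "DISK_READ"  # target disk is spinning: ordinary read
    # Target disk is in standby: find where the uncached stripe pieces live.
    needed_disks = [disk for peer, disk in stripe.data.items()
                    if peer != block and peer not in storage_cache]
    if stripe.parity not in parity_cache:
        needed_disks.append(stripe.parity_disk)
    if not needed_disks:
        return "TRC"        # rebuilt entirely from the caches
    if all(disk not in standby_disks for disk in needed_disks):
        return "TRD"        # rebuilt with reads from non-standby disks
    return "SPIN_UP"        # a standby disk must be woken up


# Stripe (5, 6, P2, 4) from the sample scenarios; disk ids are assumed.
stripe2 = Stripe(data={4: 4, 5: 1, 6: 2}, parity="P2", parity_disk=3)
trc = classify_read(6, stripe2, {4, 5}, {"P2", "P3"}, standby_disks={2})
trd = classify_read(6, stripe2, {4, 8}, {"P2", "P3"}, standby_disks={2})
```

With the cache contents shown on the two scenario slides, the first call yields "TRC" and the second yields "TRD", because block 5 has to come from a disk that is still spinning.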


PUPA - Parity Update with Power-Aware

  • Direct Access: P2’ = 5’ XOR 5 XOR P2
  • Complementary Access: P2’ = 5’ XOR 6 XOR 4

[Figure: four-disk RAID-5 array with stripes (1, 2, 3, P1), (5, 6, P2, 4), (9, P3, 7, 8), and (P4, 10, 11, 12), receiving Write (addr=5, len=1).]
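
The two parity-update formulas on this slide are algebraically equivalent, which a short Python check makes concrete. This is a sketch under stated assumptions: the function names and the 4-byte block contents are hypothetical, and blocks are assumed to be equally sized. Direct access needs the old data block and the old parity; complementary access needs the other data blocks of the stripe instead.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def parity_direct(new_data: bytes, old_data: bytes, old_parity: bytes) -> bytes:
    """Direct access: P' = D' XOR D XOR P."""
    return xor_blocks(new_data, old_data, old_parity)


def parity_complementary(new_data: bytes, other_data: list) -> bytes:
    """Complementary access: P' = D' XOR (all other data blocks of the stripe)."""
    return xor_blocks(new_data, *other_data)


# Hypothetical contents of stripe (5, 6, P2, 4) before the write to block 5.
block4 = b"\x01\x02\x03\x04"
block5 = b"\x10\x20\x30\x40"
block6 = b"\xaa\xbb\xcc\xdd"
parity2 = xor_blocks(block4, block5, block6)

new_block5 = b"\x0f\x0e\x0d\x0c"
direct = parity_direct(new_block5, block5, parity2)
complementary = parity_complementary(new_block5, [block6, block4])
```

Both paths produce the same new parity, so a power-aware controller is free to pick whichever set of inputs is already cached or sits on disks that are not in standby.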


Cache Placement/Replacement Algorithms

  • Storage cache
      • LRU with N-1 constraints
      • Compatible with MQ, LIRS, and ARC algorithms
  • Parity cache
      • Parity stripe only
      • Second-chance replacement algorithm
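
For readers unfamiliar with second-chance replacement, the following Python sketch shows the textbook form of the algorithm applied to a small parity cache. It is not taken from the RIMAC code: the class name `SecondChanceCache`, its methods, and the stripe keys are assumptions for illustration. Entries are kept in FIFO order with a reference bit; an entry that was referenced since it was last considered gets its bit cleared and moves to the back of the queue instead of being evicted.

```python
from collections import OrderedDict


class SecondChanceCache:
    """Textbook second-chance (FIFO with reference bit) cache."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # key -> [value, referenced]

    def __contains__(self, key) -> bool:
        return key in self._entries

    def get(self, key):
        """Return the cached value (or None) and mark the entry as referenced."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        entry[1] = True
        return entry[0]

    def put(self, key, value) -> None:
        """Insert or update an entry, evicting by second chance when full."""
        if key in self._entries:
            self._entries[key][0] = value
            self._entries[key][1] = True
            return
        while len(self._entries) >= self.capacity:
            oldest_key, entry = next(iter(self._entries.items()))
            if entry[1]:
                entry[1] = False  # give it a second chance
                self._entries.move_to_end(oldest_key)
            else:
                del self._entries[oldest_key]
        self._entries[key] = [value, False]


# Usage with hypothetical parity stripes as keys.
cache = SecondChanceCache(capacity=2)
cache.put("P1", b"\x00")
cache.put("P2", b"\x01")
cache.get("P1")           # P1 is referenced, so it survives the next eviction
cache.put("P3", b"\x02")  # evicts P2, the oldest unreferenced entry
```

Compared with strict LRU, second chance needs only one bit per entry and no reordering on every hit, which suits a modest controller-side cache.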


Evaluation

  • Trace-driven simulation
      • DiskSim 2.0
      • 3-state disk power models (IBM 36Z15)
      • RIMAC front-end, bottom-half, and upper-half implementation with 5000 lines of C code
  • Workloads
      • Cello99 from HP: file server
      • TPC-D from HP: decision support
      • SPC-SE from SPC: search engine


System Performance

[Figure: normalized average response time vs. cache size (MB), LRU vs. RIMAC. Panels: (a) Cello99, cache sizes INF, 64, 32; (b) TPC-D, cache sizes INF, 128, 64; (c) SPC-SE, cache sizes INF, 256, 128. Annotations: 30%; Cello99 20-30%, TPC-D 2-6%, SPC-SE 5-14%.]

Larger cache does improve performance


Energy Consumption

[Figure: energy consumption vs. cache size (MB), LRU vs. RIMAC. Panels: (a) Cello99, cache sizes INF, 64, 32; (b) TPC-D, cache sizes INF, 128, 64; (c) SPC-SE, cache sizes INF, 256, 128. Annotations, listed under Cello99, TPC-D, SPC-SE: 14-15%, 33-34%, 15-16%.]

Larger cache may not save more energy


Effects of Read Policies

[Figure: ratio (%) of Read Hit, TRC, and TRD for LRU vs. RIMAC. Panels: (a) Cello99, 64 MB; (b) TPC-D, 128 MB; (c) SPC-SE, 256 MB. Annotated values: 33.8%, 12.8%, 49.5%, 10.1%, 4.1%, 6.9%.]


Effects of Power Aware Parity Update Policies

[Figure: ratio (%) of Write Hit, PU-DA-C, PU-CA-C, and PU-DA-D+PU-CA-D for BASE vs. RIMAC. Panels: (a) Cello99, 64 MB; (b) TPC-D, 128 MB; (c) Parity Hit Ratio for Cello99 and TPC-D, BASE vs. RIMAC. Annotated values: 83.7%, 13.8%.]


Anatomy of Energy Consumption

[Figure: energy consumption (J, ×10^4) of Disk1 (BASE), Disk1 (RIMAC), Disk4 (BASE), and Disk4 (RIMAC), broken down into Active, Idle, Seek, and Standby. Workload: Cello99, 64 MB.]


Conclusions

  • RIMAC: redundancy-based hierarchical I/O cache architecture with minimum overhead
  • Addresses an open problem, “passive spin-up” in energy-efficient server storage systems, by power-aware request transformation both in caches and on disks
  • Reduces energy cost by up to 33% and improves performance by up to 30%


Thank you!