The TickerTAIP Parallel RAID Architecture
![Page 1: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/1.jpg)
The TickerTAIP Parallel RAID Architecture
P. Cao, S. B. Lim, S. Venkatraman, J. Wilkes
HP Labs
![Page 2: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/2.jpg)
RAID Architectures
• Traditional RAID architectures have
  – A central RAID controller interfacing to the host and processing all I/O requests
  – Disk drives organized in strings
  – One disk controller per disk string (mostly SCSI)
![Page 3: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/3.jpg)
Limitations
• Capabilities of the RAID controller are crucial to the performance of RAID
  – Can become memory-bound
  – Presents a single point of failure
  – Can become a bottleneck
• Having a spare controller is an expensive proposition
![Page 4: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/4.jpg)
Our Solution
• Have a cooperating set of array controller nodes
• Major benefits are:
  – Fault-tolerance
  – Scalability
  – Smooth incremental growth
  – Flexibility: can mix and match components
![Page 5: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/5.jpg)
TickerTAIP
Host interconnects
Controller nodes
![Page 6: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/6.jpg)
TickerTAIP (I)
A TickerTAIP array consists of:
• Worker nodes connected with one or more local disks through a bus
• Originator nodes interfacing with host computer clients
• A high-performance small area network:
  – Mesh based switching network (Datamesh)
  – PCI backplanes for small networks
![Page 7: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/7.jpg)
TickerTAIP (II)
• Can combine or separate worker and originator nodes
• Parity calculations are done in decentralized fashion:
  – Bottleneck is memory bandwidth, not CPU speed
  – Cheaper than having faster paths to a dedicated parity engine
![Page 8: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/8.jpg)
Design Issues (I)
• Normal-mode reads are trivial to implement
• Normal-mode writes:
  – Three ways to calculate the new parity:
    • Full stripe: calculate parity from new data
    • Small stripe: requires at least four I/Os (read old data and old parity, write new data and new parity)
    • Large stripe: if we rewrite more than half a stripe, we compute the parity by reading the unmodified data blocks
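The three parity policies above can be sketched with bytewise XOR, assuming RAID 5 style single-parity stripes. The function names (`xor_blocks`, `full_stripe_parity`, `small_stripe_parity`, `large_stripe_parity`) are illustrative and do not come from the TickerTAIP paper.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(new_data: list[bytes]) -> bytes:
    # Every data block is overwritten, so parity depends only on the new data.
    return xor_blocks(*new_data)


def small_stripe_parity(old_parity: bytes,
                        old_data: list[bytes],
                        new_data: list[bytes]) -> bytes:
    # Read-modify-write: XOR out the old contents of the modified blocks
    # and XOR in their new contents. For one modified block this costs
    # two reads (old data, old parity) and two writes (new data, new parity).
    return xor_blocks(old_parity, *old_data, *new_data)


def large_stripe_parity(new_data: list[bytes],
                        unmodified_data: list[bytes]) -> bytes:
    # More than half the stripe is rewritten: read the (fewer) unmodified
    # blocks and recompute parity over the whole stripe.
    return xor_blocks(*new_data, *unmodified_data)
```

All three functions produce the same parity for the same final stripe contents; they differ only in which blocks have to be read to get there.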
![Page 9: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/9.jpg)
Design Issues (II)
• Parity can be calculated:
  – At originator node
  – Solely parity: at the parity node for the stripe
    • Must ship all involved blocks to the parity node
  – At parity: same as solely parity, but partial results for small-stripe writes are computed at the worker nodes and shipped to the parity node
    • Generates less traffic than solely parity
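A minimal sketch of the at-parity idea for a small-stripe write, assuming XOR parity: each worker computes a partial result (old block XOR new block) locally, and the parity node folds those partials into the old parity. The names `worker_partial` and `parity_node_apply` are hypothetical, and the message exchange between nodes is left out.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def worker_partial(old_block: bytes, new_block: bytes) -> bytes:
    # Runs on the worker that owns the data block (assumption of this sketch):
    # the partial result captures how this block's parity contribution changes.
    return xor(old_block, new_block)


def parity_node_apply(old_parity: bytes, partials: list[bytes]) -> bytes:
    # Runs on the parity node: fold each shipped partial into the old parity.
    parity = old_parity
    for partial in partials:
        parity = xor(parity, partial)
    return parity
```

In this sketch each worker ships one block-sized partial result, whereas a parity node that did all the work itself would need both the old and the new contents of every modified block.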
![Page 10: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/10.jpg)
Handling single failures (I)
• TickerTAIP must provide request atomicity
• Disk failures are treated as in standard RAID
• Worker failures:
  – Treated like disk failures
  – Detected by time-outs (assuming fail-silent nodes)
  – A distributed consensus algorithm reaches consensus among remaining nodes
![Page 11: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/11.jpg)
Handling single failures (II)
• Originator failures:
  – Worst case is failure of an originator/worker node during a write
  – TickerTAIP uses a two-phase commit protocol
  – Two options:
    • Late commit
    • Early commit
![Page 12: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/12.jpg)
Late commit/Early commit
• Late commit only commits after parity has been computed
  – Only the writes must be performed
• Early commit commits as soon as new data and old data have been replicated
  – Somewhat faster
  – Harder to implement
![Page 13: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/13.jpg)
Handling multiple failures
• Power failures during writes can corrupt the stripe being written:
  – Use a UPS to eliminate them
• Must guarantee that some specific requests will always be executed in a given order:
  – Cannot write data blocks before updating the i-nodes containing block addresses
  – Uses request sequencing to achieve partial write ordering
![Page 14: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/14.jpg)
Request sequencing (I)
• Each request
  – Is given a unique identifier
  – Can specify one or more requests on whose previous completion it depends (explicit dependencies)
• TickerTAIP adds enough implicit dependencies to prevent concurrent execution of overlapping requests
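The combination of explicit and implicit dependencies can be sketched as follows. This is a toy model, not the paper's sequencer: the `Request` and `Sequencer` classes, the block-range representation, and the rule that a dependency on a request that is not pending counts as satisfied are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    req_id: int                      # unique identifier
    start: int                       # first block address (inclusive)
    length: int                      # number of blocks
    depends_on: set[int] = field(default_factory=set)  # explicit dependencies


def overlaps(a: Request, b: Request) -> bool:
    """True if the two block ranges share at least one block."""
    return a.start < b.start + b.length and b.start < a.start + a.length


class Sequencer:
    def __init__(self) -> None:
        self.pending: dict[int, Request] = {}

    def submit(self, req: Request) -> None:
        # Add an implicit dependency on every earlier pending request
        # whose block range overlaps the new one.
        for other in self.pending.values():
            if overlaps(req, other):
                req.depends_on.add(other.req_id)
        self.pending[req.req_id] = req

    def runnable(self) -> list[int]:
        # A request may run once none of its dependencies is still pending.
        return sorted(r.req_id for r in self.pending.values()
                      if r.depends_on.isdisjoint(self.pending))

    def complete(self, req_id: int) -> None:
        del self.pending[req_id]
```

For example, if request 2 overlaps request 1, `submit` records the implicit dependency and request 2 only appears in `runnable()` after `complete(1)` has been called.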
![Page 15: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/15.jpg)
Request sequencing (II)
• Sequencing is performed by a centralized sequencer
  – Several distributed solutions were considered but not selected because of the complexity of the recovery protocols they would require
![Page 16: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/16.jpg)
Disk Scheduling
• Considered
  – First come first served (FCFS): implemented in the working prototype
  – Shortest seek time first (SSTF)
  – Shortest access time first (SATF): considers both seek time and rotation time
  – Batched nearest neighbor (BNN): runs SATF on all requests in queue
Not discussed in class in Fall 2005
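To illustrate the difference between the first two policies, here is a toy comparison of FCFS and SSTF that models requests as cylinder numbers and measures only total seek distance. SATF and BNN also need a rotational-position model, which this sketch omits; the function names and the cylinder-only model are assumptions, not details from the paper.

```python
def fcfs(head: int, queue: list[int]) -> list[int]:
    """Serve requests in arrival order."""
    return list(queue)


def sstf(head: int, queue: list[int]) -> list[int]:
    """Repeatedly serve the pending request closest to the current head position."""
    pending = list(queue)
    order = []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def total_seek(head: int, order: list[int]) -> int:
    """Total number of cylinders crossed when serving requests in this order."""
    distance = 0
    for cyl in order:
        distance += abs(cyl - head)
        head = cyl
    return distance


queue = [90, 10, 55, 60]
print(total_seek(50, fcfs(50, queue)))  # 170
print(total_seek(50, sstf(50, queue)))  # 120
```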
![Page 17: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/17.jpg)
Evaluation (I)
• Based upon
  – Working prototype
    • Used seven relatively slow Parsytec cards, each with its own disk drive
  – Event-driven simulator was used to test other configurations:
    • Results were always within 6% of prototype measurements
![Page 18: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/18.jpg)
Evaluation (II)
• Read performance:
  – 1MB/s links are enough unless the request sizes exceed 1MB
![Page 19: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/19.jpg)
Evaluation (III)
• Write performance:
  – Large stripe policy always results in a slight improvement
  – At-parity significantly better than at-originator, especially for link speeds below 10MB/s
  – Late commit protocol reduces throughput by at most 2% but can increase response time by up to 20%
  – Early commit protocol is not much better
![Page 20: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/20.jpg)
Evaluation (IV)
• TickerTAIP always outperforms a comparable centralized RAID architecture
• Best disk scheduling policy is Batched Nearest Neighbor (BNN)
![Page 21: The TickerTAIP Parallel RAID Architecture](https://reader035.fdocuments.in/reader035/viewer/2022062323/56815a74550346895dc7dac5/html5/thumbnails/21.jpg)
Conclusion
• Can use physical redundancy to eliminate single points of failure
• Can use eleven 5 MIPS processors instead of a single 50 MIPS processor
• Can use off-the-shelf processors for parity computations
• Disk drives remain the bottleneck for small request sizes