Data Storage CPTE 433 John Beckett. The Paradox “If I can go to a computer store and buy 1000...
Data Storage
CPTE 433
John Beckett
The Paradox
• “If I can go to a computer store and buy 1000 gigabytes for $50, why does it cost more in your server farm?”
• It isn’t about having storage
• It’s about managing data through its life-cycle
• The new measurement is price per gigabyte-month
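To make the gigabyte-month measurement concrete, here is a minimal sketch. The cost categories and every dollar figure are hypothetical placeholders, not numbers from the course; the only point is that purchase price is one term among several.

```python
def cost_per_gb_month(total_cost_dollars, usable_gb, months):
    """Spread the total cost of owning storage over capacity and time."""
    return total_cost_dollars / (usable_gb * months)

# Hypothetical: a bare 1000 GB drive bought for $50 and kept for 36 months.
bare_drive = cost_per_gb_month(50, 1000, 36)

# Hypothetical managed storage: two 1000 GB drives mirrored (1000 GB usable),
# plus made-up costs for backup, power, and administration over 36 months.
managed_total = 2 * 50 + 300 + 120 + 900   # drives + backup + power + admin
managed = cost_per_gb_month(managed_total, 1000, 36)

print(f"bare drive: ${bare_drive:.4f} per GB-month")
print(f"managed:    ${managed:.4f} per GB-month")
```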
Definitions
• Spindle, platters, heads
  – Physical arrangement of disk
  – Of little interest to us, except to help us understand how new technologies will impact us
• Drive controller
  – On the hard drive itself
  – Connected to…
• Host Bus Adapter
RAID
| RAID level | Method | Characteristics |
|---|---|---|
| 0 | Stripe data across multiple drives | Faster reads and writes; poor reliability |
| 1 | Mirrors copy of data across two drives | Faster reads; good reliability; failures tend to be catastrophic (JB & SA) |
| 5 | Distributed parity; any single disk may fail without loss | Faster reads; slower writes; more economical |
| 10 | Mirrored stripes; RAID 0 group mirrored onto another group | Faster reads; best reliability; most expensive |
Table 25.1
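The RAID 5 entry says any single disk may fail without loss. A minimal sketch of why parity makes that possible, assuming simple XOR parity over one stripe; the block contents and the helper name `xor_blocks` are illustrative, and a real array rotates the parity block across the drives rather than keeping it in one place.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, as if striped across three drives (illustrative contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)   # the parity block for this stripe

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Losing two blocks of the same stripe leaves too little information to rebuild either one, which is why this protects against a single failure only.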
Dilemma: Can you add hardware without subtracting from reliability? (Only by using very high-quality hardware)
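The dilemma can be put in numbers. This is a rough sketch that assumes component failures are independent and equally likely, which real hardware only approximates; the 3% annual failure rate is a made-up figure.

```python
def p_any_failure(p_single, n_components):
    """Probability that at least one of n independent components fails."""
    return 1 - (1 - p_single) ** n_components

# Hypothetical: each drive has a 3% chance of failing within a year.
for n in (1, 2, 4, 8):
    print(f"{n} drive(s): {p_any_failure(0.03, n):.4f} chance of a failure")
```

Every added component raises the chance that something fails, so reliability improves only if each component is very reliable or the design tolerates the failure.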
Where Is the Data?
• DAS – Directly Attached Storage (IBM: DASD), connected directly to the server
  – May be a RAID array
• NAS – Network-Attached Storage
  – Uses a protocol to transfer data
• SAN – Storage-Area Network
  – Separate network segment for storage, connecting servers and drives
A SAN is usually made out of NAS devices
Structure of a SAN
[Diagram labels: LAN; two SAN controllers (servers); SAN backbone; four NAS devices]
Managing Storage
• Think of storage as a community resource
  – If it’s personal, does it have any business on company equipment?
• Determine storage needs of the group
• Identify an architecture that will satisfy that need
• Plan an upgrade path for growth in the future
• Implement inventory and spares policy
Standardization
• Disk drives are as important to standardize as any other component
  – Spares issue
  – Warranty service procedure
  – Ability to use obsoleted drives
• Drive lifetime issue:
  – A drive motor may become unreliable after so many revolutions
The Storage SLA
• Availability
• Response time
• Reliability is increased by RAID > 0
  – …only if monitored and maintained
  – …only if RAID method is preserved
• Network is a part of the reliability picture
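One way to see why the network belongs in the SLA: when every link in a chain must be up, the availabilities multiply. A sketch under that assumption; the three availability figures are hypothetical.

```python
def series_availability(*availabilities):
    """Availability of a chain in which every link must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

HOURS_PER_YEAR = 24 * 365

# Hypothetical availabilities for the disk array, the network path, and the server.
overall = series_availability(0.999, 0.995, 0.999)
downtime_hours = (1 - overall) * HOURS_PER_YEAR

print(f"overall availability: {overall:.4f}")
print(f"expected downtime:    {downtime_hours:.1f} hours/year")
```

The chain is never more available than its weakest link, so a highly redundant array behind an unreliable network still misses the target.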
Backup and RAID
• RAID is not a backup strategy
• If more than n drives fail (n being the number of failures the RAID level tolerates), you lose data
• Controller failure can cause data loss
• One possibility: RAID mirror as a backup
  – Requires disconnecting other drive on failure
• How about: Spare drive, auto backup each night
  – Maybe including incremental backups
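A nightly incremental backup to a spare drive could look roughly like this. It is a sketch, not a backup tool: the function name, the use of modification time to detect changes, and the assumption that `backup_dir` lies outside `source_dir` are all choices made for illustration.

```python
import os
import shutil

def incremental_backup(source_dir, backup_dir, last_backup_time):
    """Copy files modified after last_backup_time; return their relative paths."""
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            if os.path.getmtime(src) > last_backup_time:
                rel = os.path.relpath(src, source_dir)
                dst = os.path.join(backup_dir, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)   # copy2 also preserves timestamps
                copied.append(rel)
    return sorted(copied)
```

A scheduler would call this each night with the timestamp of the previous run. Deleted files are not tracked here, which a real incremental scheme would also need to handle.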
Using a RAID mirror to speed up backup
• Break the RAID pair
• Back up
• Re-connect the RAID pair
Monitoring
• How full
  – Rate of change
• Broken drives
• How busy (especially network on NAS)
• Unused
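The first two monitoring items, how full and how fast that is changing, can be sketched as follows. `shutil.disk_usage` is part of the Python standard library; the linear projection and the sample figures are assumptions for illustration.

```python
import shutil

def percent_full(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def days_until_full(used_then, used_now, total, days_between):
    """Linear projection from two usage samples; None if usage is not growing."""
    growth_per_day = (used_now - used_then) / days_between
    if growth_per_day <= 0:
        return None
    return (total - used_now) / growth_per_day

print(f"{percent_full():.1f}% full")

# Hypothetical samples: 600 GB used a month ago, 650 GB now, 1000 GB total.
print(days_until_full(600, 650, 1000, 30), "days until full")
```

Tracking the rate of change rather than only the current level gives time to order and install more storage before the volume fills.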
SAN Caveats
• Benchmarks are problematical
• Useful versus physical storage size
• Product life-cycle issues
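The gap between useful and physical storage size is easy to estimate for the common RAID levels. A sketch that ignores filesystem overhead, hot spares, and vendor-specific reservations; the drive counts and sizes are hypothetical.

```python
def usable_capacity(level, drives, drive_gb):
    """Usable GB for a few RAID levels, ignoring filesystem and spare overhead."""
    if level == 0:
        return drives * drive_gb          # striping: no redundancy
    if level == 1:
        return drive_gb                   # every drive holds the same copy
    if level == 5:
        return (drives - 1) * drive_gb    # one drive's worth of parity
    if level == 10:
        return drives // 2 * drive_gb     # half the drives are mirrors
    raise ValueError(f"unsupported RAID level: {level}")

for level, drives in [(0, 4), (1, 2), (5, 4), (10, 4)]:
    print(f"RAID {level:>2} with {drives} x 1000 GB:",
          usable_capacity(level, drives, 1000), "GB usable")
```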
Pipeline Optimization
• Read – buffered and available immediately
• Write – buffered and done at leisure
  – Dangerous if drive fails before update is posted
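When an update must not sit in a write buffer, an application can ask for it to be pushed out. A minimal sketch using standard-library calls; the file name is arbitrary, and a drive's own internal cache may still hold the data after `os.fsync` returns, depending on hardware and configuration.

```python
import os
import tempfile

def durable_write(path, data):
    """Write bytes and ask the OS to push them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)          # lands in Python's own buffer first
        f.flush()              # hand the data to the operating system
        os.fsync(f.fileno())   # ask the OS to flush its cache to the device

path = os.path.join(tempfile.gettempdir(), "durable_demo.bin")
durable_write(path, b"posted update")
```

The price is speed: each forced flush waits on the device, which is exactly the delay that write buffering was meant to hide.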
Sync
• Early versions of an OS usually don’t sync properly if shut down during “quiet” time
  – Novell – unscheduled shutdown could be catastrophic
  – Windows – learned some lessons from others
• Is it safe to turn off power during operation?
  – A mainframe will be able to handle this
Performance
• Locate simultaneously-used data on different spindles to minimize head thrashing
  – The more complex your data, the harder this is to do
  – Restrict this technique to very heavily-used data
• Beware of compression
  – Assumes your data is organized a certain way
  – Assumes your CPU has spare time to spend
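Both compression assumptions can be checked on sample data before committing to it. A sketch using the standard `zlib` module; the two sample buffers are synthetic stand-ins for "well-organized" and "already dense" data.

```python
import os
import time
import zlib

def compression_report(data):
    """Return (compression ratio, seconds of CPU time spent compressing)."""
    start = time.process_time()
    compressed = zlib.compress(data)
    elapsed = time.process_time() - start
    return len(data) / len(compressed), elapsed

repetitive = b"customer_record;" * 65536    # highly regular data
random_like = os.urandom(len(repetitive))   # stands in for already-compressed data

for label, sample in [("repetitive", repetitive), ("random", random_like)]:
    ratio, seconds = compression_report(sample)
    print(f"{label:>10}: ratio {ratio:.2f}, CPU {seconds:.4f} s")
```

Data that is already compressed or encrypted gains almost nothing, yet still costs CPU time on every read and write.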
Disk Access Density
• I/O Operations per second per gigabyte of capacity
• How fast can you move the entire drive of data?
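Both questions reduce to simple arithmetic. A sketch with hypothetical drives that share the same IOPS and throughput but differ in capacity; the figures are illustrative, not measurements.

```python
def access_density(iops, capacity_gb):
    """I/O operations per second available per gigabyte stored."""
    return iops / capacity_gb

def hours_to_copy(capacity_gb, throughput_mb_per_s):
    """Time to read the whole drive sequentially, in hours (1 GB = 1000 MB)."""
    return capacity_gb * 1000 / throughput_mb_per_s / 3600

# Hypothetical drives: 150 IOPS and 100 MB/s each, at three capacities.
for capacity_gb in (250, 1000, 4000):
    print(f"{capacity_gb:>5} GB: "
          f"{access_density(150, capacity_gb):.3f} IOPS/GB, "
          f"{hours_to_copy(capacity_gb, 100):.1f} h to copy")
```

If capacity grows while IOPS and throughput stay flat, access density falls and the time to copy or rebuild a full drive rises in proportion.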
Fragmentation
• Don’t fill up your drives!
• That makes defragging slow
• Also slows online attempts at limiting fragments
Continuous Data Protection
• Send a log of all changes somewhere other than your disk drive
  – Tape
  – Over the network to another location
  – Another disk drive
• Back-out and forward recovery
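A change log that records both the old and the new value supports recovery in either direction. This is a toy sketch: the log lives in memory here, whereas the idea above is to send it to another device, and it assumes stored values are never `None` so that `None` can mean "key did not exist". All names are illustrative.

```python
def apply_change(store, log, key, new_value):
    """Apply a change and record both old and new values in the log."""
    log.append({"key": key, "old": store.get(key), "new": new_value})
    store[key] = new_value

def forward_recover(backup, log):
    """Rebuild current state by replaying the log on top of an old backup."""
    store = dict(backup)
    for entry in log:
        store[entry["key"]] = entry["new"]
    return store

def back_out(store, log, count):
    """Undo the most recent `count` changes using the logged old values."""
    store = dict(store)
    start = max(0, len(log) - count)
    for entry in reversed(log[start:]):
        if entry["old"] is None:
            store.pop(entry["key"], None)   # key did not exist before
        else:
            store[entry["key"]] = entry["old"]
    return store

backup = {"a": 1}
live, log = dict(backup), []
apply_change(live, log, "a", 2)
apply_change(live, log, "b", 3)

print(forward_recover(backup, log))   # state rebuilt from backup plus log
print(back_out(live, log, 1))         # most recent change undone
```

Forward recovery starts from the last backup and replays changes; back-out starts from the current state and reverses them, which helps when the most recent changes are the bad ones.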