CHAPTER Network Reliability: Fault Tolerance and Other Issues.

Date posted: 21-Dec-2015

Chapter Objectives

• Discuss network reliability issues
– Fault tolerance, tape backup, UPS, etc.
• Describe different levels of fault tolerance
– Levels 1, 2 and 3
– Examine the relevance of file allocation tables to fault tolerance

• Explain RAID technology

Chapter Modules

• An overview of network reliability
• Level 1 fault tolerance
• Level 2 and Level 3 fault tolerance
• Practical implementation examples
• RAID

END OF CHAPTER INTRODUCTION

MODULE

An Overview of Network Reliability

Importance of Fault Tolerance

• Mission critical applications are today run on networks in many organizations

• Important to provide built-in fault tolerance in networks to support mission critical applications

Fault Tolerance

• The ability to continue to function when a fault occurs

• Example
– A server with built-in fault tolerance can continue to operate even when one of its hard disks fails

Focus of Fault Tolerant Features

• Most fault tolerance features are centered on a server
– Disk storage in the server is the focal point of a number of fault tolerant features
– Mechanical components are more susceptible to failure than electronic components
– The hard disk is the component most vulnerable to failure in a server

• A number of fault tolerant features address the possible failure of hard disks

Fault Tolerance Implementation

• Software based
• Hardware based
• A combination of both

Server-Based Implementation of Fault Tolerance
• Level 1
• Level 2
• Level 3

Preview of Fault Tolerance

• Based on the premise of maintaining multiple copies of critical components

• Level 1– Duplicate FATs

• Level 2– Duplicate server hard disks

• Level 3– Duplicate servers

RAID Storage: The Practical Implementation

• Redundant Array of Independent Disks

• Data is stored in a RAID subsystem

• A largely hardware-based implementation

Other Features

• Uninterruptible Power Supply (UPS)
• Tape backup

Uninterruptible Power Supply (UPS)

• Ensures the uninterruptible supply of power to the server

• Batteries in the UPS will continue to provide power in the event of a power outage

UPS Implementations

• AS-400 example
– When the power goes down, the UPS takes over and systematically shuts down the system, preserving the data files
• Other implementations
– Power loss -> UPS takes over -> standby generator is activated by sensors
– The process is reversed when the power comes back
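The hand-over sequence on this slide can be sketched as a small state machine. This is only an illustration under assumed names (`PowerSource`, `next_source`, `simulate`); a real UPS and generator controller would also track battery charge, warm-up times and shutdown thresholds.

```python
from enum import Enum


class PowerSource(Enum):
    MAINS = "mains"
    UPS_BATTERY = "ups_battery"
    GENERATOR = "generator"


def next_source(mains_ok, generator_ready):
    """Pick the source that should feed the server for the next step."""
    if mains_ok:
        # Power is back: the process is reversed and mains takes over.
        return PowerSource.MAINS
    if generator_ready:
        # Sensors have started the standby generator.
        return PowerSource.GENERATOR
    # Mains is down and the generator is not ready: run on UPS batteries.
    return PowerSource.UPS_BATTERY


def simulate(events):
    """events is a list of (mains_ok, generator_ready) readings."""
    history = [PowerSource.MAINS]
    for mains_ok, generator_ready in events:
        history.append(next_source(mains_ok, generator_ready))
    return history


# Outage, generator comes up, mains returns.
timeline = simulate([(False, False), (False, True), (True, False)])
print([source.value for source in timeline])
```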

Tape Backup
• Used more as a precautionary measure than a fault tolerant measure
• Data on the server is periodically backed up on a tape
• If the disk storage fails on the server:
– A previously stored version of the data is loaded onto newly installed disk storage on the server
• Offers some degree of protection against the total loss of data

Network Operating System Support for Reliability

• Support for Levels 1 and 2 of fault tolerance is readily available in network operating systems

• Support is now also available for Level 3 fault tolerance

• RAID 0, 1 and 5 are commonly supported

END OF MODULE

MODULE

Level 1 Fault Tolerance

Level 1 Fault Tolerance

• A Software Based Solution

Support for Fault Tolerance

• Provided by the network OS
• Support for Levels 1 and 2 has been available in operating systems for some time
• Newer operating systems have support for Level 3 fault tolerance
– Support for RAID is also incorporated
• RAID may be considered an extension of Level 2 fault tolerance

Level 1 Fault Tolerance

• A backup copy of the File Allocation Table (FAT) is kept on the server disk

• The NOS uses the backup FAT should the original FAT become corrupted
– This ensures the continued operation of the server

• The problem should be rectified as soon as possible
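The fallback described above can be sketched as follows. This is a minimal model, not how any particular NOS does it: the `Volume` class, the CRC-based `checksum` helper and the `(track, sector)` entries are all assumptions made for illustration.

```python
import zlib


def checksum(table):
    """CRC over a canonical text form of the table (illustrative only)."""
    return zlib.crc32(repr(sorted(table.items())).encode())


class Volume:
    """Keeps a primary and a backup copy of the file table."""

    def __init__(self, fat):
        self.primary = dict(fat)
        self.backup = dict(fat)
        self.expected = checksum(fat)

    def active_fat(self):
        """Return the first copy that still matches the stored checksum."""
        if checksum(self.primary) == self.expected:
            return "primary", self.primary
        if checksum(self.backup) == self.expected:
            return "backup", self.backup
        raise IOError("both FAT copies are corrupted")


vol = Volume({"FILE_A": (2, 1)})      # hypothetical (track, sector) entry
vol.primary["FILE_A"] = (99, 99)      # simulate corruption of the original
which, table = vol.active_fat()
print(which, table["FILE_A"])
```

The server keeps running on the backup copy, but it is now one failure away from losing the volume, which is why the slide asks for prompt repair.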

File Allocation Table (FAT)

[Figure: FAT and backup FAT on the server disk. Sample entry: File A, size 34K, start at sector 1, track 2]

A Summary of File Allocation Table (FAT) Features

• Keeps track of files on the disk
• Uses pointers to point to the location of the files
– Tracks, sectors
• Stores file related information
– Size, date last modified, security information, etc.

• If a FAT is corrupted, none of the files on the disk can be retrieved
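A minimal sketch of why losing the table makes files unretrievable, assuming a simplified chain-of-clusters layout rather than the exact on-disk FAT format. The names `directory`, `fat`, `END_OF_CHAIN` and `file_clusters` are illustrative.

```python
END_OF_CHAIN = -1


def file_clusters(fat, start):
    """Follow the pointers from the first cluster to the end of the file."""
    chain = []
    cluster = start
    while cluster != END_OF_CHAIN:
        if cluster in chain or cluster not in fat:
            raise ValueError("corrupted table: cannot follow the chain")
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# Each table entry points at the next cluster of the same file.
fat = {2: 5, 5: 6, 6: END_OF_CHAIN}
# File-related information, including where the file starts.
directory = {"FILE_A": {"size": 34 * 1024, "start": 2}}

print(file_clusters(fat, directory["FILE_A"]["start"]))
```

The data blocks themselves are untouched by table corruption, but without the pointers there is no way to tell which blocks belong to which file or in what order.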

A Note on File Systems

• Newer file systems have been introduced following FAT16
• FAT32
– Windows 95/98/ME systems
– Windows 2000 OS
• NTFS
– Windows NT related filing technology
– Windows 2000
• HPFS
– OS/2 related filing technology
• Linux
– ext2

Newer File System Characteristics

• Support longer file names
• Support larger hard disks
• Abide by the Uniform Naming Convention (UNC)
• Provide better security
– Allows greater control to be exercised over access to directories, files, etc.

Format of Uniform Naming Convention

• \\computer_name\directory_name\file_name
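As an illustration of the format above, a UNC name can be taken apart with Python's standard `pathlib.PureWindowsPath`, which works on any platform. The helper `split_unc` and the sample names `fileserver`, `reports` and `budget.xls` are hypothetical.

```python
from pathlib import PureWindowsPath


def split_unc(path):
    """Split \\\\computer\\directory\\file into its three parts."""
    p = PureWindowsPath(path)
    # For a UNC name, .drive holds "\\computer\directory" (the shared directory).
    if not p.drive.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    computer, directory = p.drive[2:].split("\\", 1)
    return computer, directory, p.name


print(split_unc(r"\\fileserver\reports\budget.xls"))
```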

END OF MODULE

MODULE

Levels 2 and 3 of Fault Tolerance

Levels 2 and 3

• A predominantly hardware-based solution

• Software support in the OS is also required

Level 2 Fault Tolerance (FT)

• Implemented by installing a duplicate disk in the server

• The server data is duplicated on the second disk in real-time to provide fault tolerance

• The duplication process itself is automatic when a NOS that supports Level 2 FT is used

• In the event of a failure of the primary hard disk, the network will continue to operate using the secondary hard disk
– However, immediate action must be taken to replace the failed hard disk
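The behaviour described on this slide can be sketched as a toy mirrored volume. The `MirroredVolume` class is a hypothetical in-memory model (dictionaries stand in for disks), not a real driver.

```python
class MirroredVolume:
    """Two disks that receive every write; reads use any healthy disk."""

    def __init__(self):
        self.disks = [{}, {}]           # block number -> data, one dict per disk
        self.failed = [False, False]

    def write(self, block, data):
        # Duplication is automatic: every write goes to each healthy disk.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block):
        # Fall through to the secondary disk if the primary has failed.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both disks have failed")


volume = MirroredVolume()
volume.write(0, b"payroll")
volume.failed[0] = True                 # primary hard disk fails
print(volume.read(0))                   # still served from the secondary
```

Once one disk has failed, the volume has no redundancy left, which is why the failed disk must be replaced immediately.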

Level 2 FT Implementation

• Types of implementation
– Disk mirroring
– Disk duplexing
• Disk mirroring
– One controller supporting two drives
• Disk duplexing
– Two controllers and two drives
– Each drive has its own controller
– Better protection compared to disk mirroring
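A small sketch of why duplexing protects better than mirroring: with a single controller, the controller itself is a single point of failure. The layout dictionaries and the `volume_available` helper are illustrative names.

```python
def volume_available(failed, layout):
    """layout maps each drive to its controller; failed is a set of names."""
    reachable = [
        drive
        for drive, controller in layout.items()
        if drive not in failed and controller not in failed
    ]
    # Data stays available while at least one copy can still be reached.
    return len(reachable) > 0


mirroring = {"disk0": "ctrl0", "disk1": "ctrl0"}   # one shared controller
duplexing = {"disk0": "ctrl0", "disk1": "ctrl1"}   # one controller per drive

print(volume_available({"disk0"}, mirroring))   # a drive failure is survived
print(volume_available({"ctrl0"}, mirroring))   # a controller failure is not
print(volume_available({"ctrl0"}, duplexing))   # duplexing survives it
```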

Level 2 Fault Tolerance Implementation

[Figure: Mirroring (one controller serving two hard disks) and duplexing (two hard disks, each with its own controller)]

Level 3 Fault Tolerance

• Dual interconnected servers are used to support the network
– The second server is simply a mirror of the first server

• Data mirroring is done automatically by the NOS that supports Level 3 fault tolerance

Level 3 Fault Tolerance Implementation

[Figure: Main server and mirrored server connected by a high-speed link, both serving the workstations]

Actual Implementation of Fault Tolerance

• Level 1 is universally deployed
• Level 2 requires additional hardware
– Best deployed by using the RAID storage subsystem
• Level 3 requires considerably more hardware and software resources
– Largely used in networks that support mission critical applications

END OF MODULE

MODULE

RAID Storage Subsystem

RAID Storage

• Redundant Array of Independent Disks

• Data is striped across the different disks in a RAID storage subsystem

Purpose of RAID

• Provide fault tolerance• Offer better performance

RAID Basics

• Data is striped across multiple disks
– Data striping is the fundamental concept pursued by RAID

• Data can be recreated from the redundant disks

• MTBF (Mean Time Between Failures) of the subsystem as a whole is reduced: roughly the MTBF of a single disk divided by the number of disks in the subsystem
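The MTBF estimate on this slide (single-disk MTBF divided by the number of disks) can be worked through numerically. It is a rough approximation that assumes disks fail independently at a constant rate; the 500,000-hour figure below is a made-up example value, not a vendor specification.

```python
def array_mtbf_hours(disk_mtbf_hours, disk_count):
    """Approximate time until the first disk in the array fails."""
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return disk_mtbf_hours / disk_count


single_disk = 500_000                    # hypothetical MTBF in hours
for count in (1, 2, 5, 10):
    print(count, "disks:", array_mtbf_hours(single_disk, count), "hours")
```

More disks mean a disk failure somewhere in the array becomes more frequent, which is why redundancy (mirroring or parity) matters more as the array grows.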

RAID Storage Standards

• RAID 0 through RAID 5
• Popular RAID formats
– RAID 0, RAID 1, RAID 5
• Other formats
– RAID 10 and RAID 50

RAID 0

• Data is simply stored striped over multiple disks

• Does not offer fault tolerance
• Offers better performance
– Multiple heads access the data stored on the different drives for faster data access

RAID 0 Striping

                                                                                                        

Source: Adaptec

More on Striping

• Striping logically divides each hard disk into stripes

• The stripes are arranged interleaved in a rotating sequence among the various disks

• Data stored in the stripes forms a logical sequence of storage space composed alternately of stripes from each disk (drive)

• A stripe can be as small as a sector (512 bytes) or as large as several megabytes
– In general, a record falls entirely within one stripe
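The rotating, interleaved arrangement described above can be sketched as a mapping from a logical byte offset to a disk and a position on that disk. The function name `locate` and the round-robin order starting at disk 0 are assumptions for illustration.

```python
def locate(offset, stripe_size, disk_count):
    """Map a logical byte offset to (disk, stripe row on disk, offset in stripe)."""
    stripe = offset // stripe_size        # which logical stripe holds the byte
    disk = stripe % disk_count            # stripes rotate across the disks
    row = stripe // disk_count            # how far down that disk the stripe sits
    return disk, row, offset % stripe_size


# Sector-sized stripes (512 bytes) over three disks.
for offset in (0, 512, 1024, 1536, 1600):
    print(offset, "->", locate(offset, 512, 3))
```

Consecutive stripes land on different disks, so a large sequential transfer, or several independent requests, can keep all the drives busy at once.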

RAID 0 Data Access Performance

                                                                                                      

Source: Adaptec

Multiple I/O Access

• Most operating systems support concurrent disk I/O

• I/O load must be balanced on the disks for optimum performance

• Striping promotes load balancing and hence improves disk I/O performance

RAID 0 Configuration

• Large stripes for multiple users
• Small stripes for single users

Advantages and Disadvantages

• Fast access
• If one disk fails, the data on all the disks in the set becomes unusable

Windows Support for RAID 0

• Windows 2003 supports RAID 0
• 2 to 32 disks can be used in a set known as a striped volume

RAID 1

• Provides fault tolerance
• Basically implements disk mirroring

Implementation

• A single pair of mirrored disks is not striped

• Multiple pairs of mirrored disks can be striped to create striped volumes

RAID 1 in Operation

                                                                                                      

Source: Adaptec

RAID 1 Performance

• Read performance is improved because both disks can be simultaneously read for different records

• Write performance remains unchanged, as the same data needs to be written to both disks

Windows Support for RAID 1

• Supported in Windows 2000
– Ftdisk.sys is the driver used for supporting fault tolerance

RAID 5

• Provides fault tolerance using Parity

• Data and parity information are distributed over all the disks
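A minimal sketch of parity-based recovery, assuming the usual XOR parity: the parity block is the byte-wise XOR of the data blocks in a stripe, so any one missing block equals the XOR of all the others. The helper `xor_blocks` and the sample blocks are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks plus its parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding data[1] fails: rebuild it from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Only one block's worth of extra space is needed per stripe, which is what makes parity cheaper in capacity than mirroring.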

Read and Write Operation with RAID 5

                                                                                                   

Source: Adaptec

Read and Write Performance

• Read access can be overlapped
– Because data is spread over different drives
• Write operations can also be overlapped
– Because different data records store their parity information on different disks
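The point about parity living on different disks can be illustrated with a rotating placement rule. The particular rotation below (parity starts on the last disk and moves back one disk per stripe row) is just one possible scheme, assumed here for illustration; actual controllers may use a different order.

```python
def parity_disk(stripe_row, disk_count):
    """Disk that holds the parity block for the given stripe row."""
    return (disk_count - 1 - stripe_row) % disk_count


def data_disks(stripe_row, disk_count):
    """Disks that hold data blocks for the given stripe row."""
    skip = parity_disk(stripe_row, disk_count)
    return [disk for disk in range(disk_count) if disk != skip]


for row in range(5):
    print("row", row, "parity on disk", parity_disk(row, 4),
          "data on disks", data_disks(row, 4))
```

Because rows 0 and 1 keep their parity on different disks, writes to those two rows do not queue up behind a single parity drive.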

Windows Support

• Supported in Windows 2000
• Known as “stripe set with parity on basic disks”
• Requires at least 3 disks
• An additional 16 Mbytes of memory must be provided to support RAID 5

RAID 10

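• Offers the advantages of both RAID 0 and RAID 1
– Faster performance through multiple read access
– Fault tolerance through disk mirroring
• Also written RAID 1+0 (striping across mirrored pairs); often loosely called RAID 0+1, which strictly means mirroring across stripe sets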

RAID 50

• Combines the advantages of RAID 0 and RAID 5

Summary (Source: Adaptec)

• RAID 0 offers good read and write performance, but it does not provide fault tolerance

• RAID 1 offers fault tolerance, but it does not in general offer a performance advantage
– Multiple pairs may be created for a performance advantage in addition to providing fault tolerance

Summary (Continued)

• RAID 5 combines efficient, fault-tolerant data storage with good performance characteristics.

• However, write performance and performance during drive failure are slower than with RAID 1.

• Rebuild operations also require more time than with RAID 1 because parity information is also reconstructed.

• At least three drives are required for RAID 5 arrays.
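The storage efficiency behind this summary can be compared with a short calculation. It assumes equally sized disks, RAID 1 built from mirrored pairs, and RAID 5 giving up one disk's worth of space to parity; the function name `usable_capacity` and the 100 GB disk size are illustrative.

```python
def usable_capacity(level, disk_count, disk_size_gb):
    """Usable space in GB for RAID 0, 1 or 5 built from equal disks."""
    if level == 0:
        if disk_count < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disk_count * disk_size_gb            # no redundancy at all
    if level == 1:
        if disk_count < 2 or disk_count % 2:
            raise ValueError("RAID 1 needs disks in mirrored pairs")
        return (disk_count // 2) * disk_size_gb     # half the space holds copies
    if level == 5:
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_size_gb      # one disk's worth of parity
    raise ValueError("only RAID 0, 1 and 5 are covered in this sketch")


for level in (0, 1, 5):
    print("RAID", level, "with 4 x 100 GB:", usable_capacity(level, 4, 100), "GB")
```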

END OF MODULE

MODULE

An Assembly of Fault Tolerance and Backup Features

Fault Tolerant Components

• RAID storage subsystem
• Redundant power supplies
• Uninterruptible Power Supply (UPS)
• Tape backup device

Hardware Systems for Reliability

[Figure: Server with RAID storage, tape backup, UPS and a redundant power supply with surge protector, serving two clients]

Tape Backup Technology

• QIC
• Travan
• DAT
• 8mm
• Mammoth
• AIT technology
• Digital Linear Tape
• Super DLT
• ADR technology
• Linear Tape Open
• VXA technology
• Robotic applications

http://www.pctechguide.com/15tape2.htm

Web Research

• Obtain information on RAID 0, 1, 5 and 10
– Adaptec
– http://www.acnc.com/04_01_00.html#top
• Get information on different file systems, including the Linux and Unix file systems
• Visit the website of a UPS vendor to get additional information on UPS
– APC
– PC Power and Cooling
• Tape backup
– http://www.pctechguide.com/15tape.htm#QIC

Firewall and Protocols

Software Firewall Settings

• ICMP, etc.
• Check Zone Alarm Pro

END OF MODULE

END OF CHAPTER