UNIVERSITY INSTITUTE OF TECHNOLOGY,
THE UNIVERSITY OF BURDWAN
PARALLEL COMPUTING LAB
Presentation on Distributed Computing
DISTRIBUTED COMPUTING
Presented by: Alokeparna Choudhury (ME201310005)
Hossainara Begum (ME201310004)
CONTENTS
INTRODUCTION
CENTRALIZED VS. DISTRIBUTED COMPUTING
WHAT IS A DISTRIBUTED SYSTEM?
ORGANIZATION
ARCHITECTURE
TYPES OF DISTRIBUTED SYSTEM
COMMUNICATION MIDDLEWARE
MOTIVATION
HISTORY
GOAL
CHARACTERISTICS
EXAMPLES OF DISTRIBUTED COMPUTING
DISTRIBUTED COMPUTING USING MOBILE AGENTS
CONTD..
TYPICAL DISTRIBUTED COMPUTING
A TYPICAL INTRANET
INTERNET
JAVA RMI
TRANSPARENCY IN DISTRIBUTED SYSTEMS
CATEGORIES OF APPLICATIONS IN DISTRIBUTED COMPUTING
MONOLITHIC MAINFRAME APPLICATION VS. DISTRIBUTED APPLICATION
ADVANTAGES
DISADVANTAGES
ISSUES & CHALLENGES
CONCLUSION
REFERENCES
INTRODUCTION
Nowadays it is not only feasible but also easy to put together computing systems composed of large numbers of computers connected by a high-speed network.
They are usually called computer networks or distributed systems, in contrast to the earlier centralized (single-processor) systems.
CENTRALIZED VS. DISTRIBUTED COMPUTING
Early computing was performed on a single processor. Uniprocessor computing can be called centralized computing.
A Distributed system is a collection of independent computers, interconnected via a network, capable of collaborating on a task. Distributed computing is computing performed in a distributed system.
[Figure: centralized computing (terminals attached to a mainframe host) versus distributed computing (workstations connected by network links)]
WHAT IS A DISTRIBUTED SYSTEM?
Definition:
‘‘A system in which hardware and software components located on networked computers communicate and coordinate their actions only by passing messages.’’ (Coulouris)
‘‘A distributed system is a collection of independent computers that appears to its users as a single coherent system.’’ (Tanenbaum)
• A Distributed system consists of multiple autonomous computers that communicate through a computer network.
• Distributed computing utilizes a network of many computers, each accomplishing a portion of an overall task, to achieve a computational result much more quickly than with a single computer.
• Distributed computing is any computing that involves multiple computers remote from each other that each have a role in a computation problem or information processing.
• In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system.
• These networked computers may be in the same room, same campus, same country, or in different continents.
[Figure: a large-scale application over the Internet, with agents cooperating on distributed job requests, subscription, and resource management]
ORGANIZATION
Organizing the interaction between each computer is of prime importance. In order to be able to use the widest possible range and types of computers, the communication channel should not contain or use any information that may not be understood by certain machines.
Special care must also be taken that messages are delivered correctly and that invalid messages, which could otherwise bring down the system and perhaps the rest of the network, are rejected.
Another important factor is the ability to send software to another computer in a portable way so that it may execute and interact with the existing network. This may not always be possible when using differing hardware and resources, in which case other methods must be used such as cross-compiling or manually porting this software.
ARCHITECTURE
Distributed programming typically falls into one of several basic architectures: Client-server, 3-tier architecture, N-tier architecture, Distributed objects, loose coupling, or tight coupling.
Client-server — Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
3-tier architecture — Three tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment.
N-tier architecture — N-Tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
Tightly coupled (clustered) — refers typically to a set of highly integrated machines that run the same process in parallel, subdividing the task into parts that each machine completes individually, then combining the results into the final result.
Peer-to-peer — an architecture where no special machine or machines provide a service or manage the network resources. Instead, all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and servers.
Space based — refers to an infrastructure that creates the illusion (virtualization) of one single address-space. Data are transparently replicated according to application needs. Decoupling in time, space and reference is achieved.
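The client-server pattern listed above (a smart client contacts the server for data, then formats and displays it) can be sketched with plain TCP sockets. Everything below is an illustrative assumption, not from the slides: the echo protocol, the ephemeral port, and the single-request server thread stand in for a real service.

```python
# Minimal client-server sketch using Python's standard socket module.
import socket
import threading

def serve_once(server_sock):
    """Server: accept one client, read its request, send back a reply."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode()
        # The server holds the data and logic; the client only displays results.
        conn.sendall(f"echo:{request}".encode())

def main():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    threading.Thread(target=serve_once, args=(server,), daemon=True).start()

    # Client side: contact the server, send a request, display the reply.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024).decode()
    server.close()
    return reply

if __name__ == "__main__":
    print(main())   # prints "echo:hello"
```

In a real deployment the client and server run on different machines and the server loops over many connections; the single in-process exchange here only shows the request/reply shape of the architecture.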
TYPES OF DISTRIBUTED SYSTEM
Distributed Computing Systems
--cluster computing system
--grid computing system
Distributed Information Systems
--transaction processing system
--enterprise application integration
Distributed Pervasive Systems
--home system
--electronic health care system
--sensor networks
COMMUNICATION MIDDLEWARE
Several types of communication middleware exist. With remote procedure calls (RPC), an application component can send a request to another component by making a local procedure call, which results in the request being packaged as a message and sent to the callee. The result is then sent back and returned to the caller as the result of the procedure call. Techniques were later developed to allow calls to remote objects, leading to what is known as remote method invocation (RMI).
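The RPC flow just described (a local-looking call packaged as a message, shipped to the remote side, and the result returned as the call's return value) can be sketched with Python's standard xmlrpc modules. The `add` procedure, the in-process server thread, and the OS-chosen port are illustrative assumptions, not part of the slides.

```python
# RPC sketch using Python's standard xmlrpc modules.
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy
import threading

def main():
    # Server side: register a procedure under a name. Port 0 lets the OS pick
    # a free port; a real deployment would use a fixed, known endpoint.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(lambda a, b: a + b, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]

    # Client side: this reads like a local procedure call, but the arguments
    # and the result actually travel over HTTP as XML-RPC messages.
    proxy = ServerProxy(f"http://127.0.0.1:{port}")
    result = proxy.add(2, 3)

    server.shutdown()
    server.server_close()
    return result

if __name__ == "__main__":
    print(main())   # prints 5
```

Note how the tight coupling mentioned below is visible even in this sketch: the client can only issue the call while the server is up and must know exactly how to reach it.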
MIDDLEWARE (CONTD.)
An RMI is essentially the same as an RPC, except that it operates on objects instead of applications.
RPC and RMI have the disadvantage that the caller and callee both need to be up and running at the time of communication. In addition, they need to know exactly how to refer to each other.
This tight coupling is often experienced as a serious drawback, and has led to what is known as message-oriented middleware (MOM). In this case, applications simply send messages to logical contact points, often described by means of a subject. Applications can indicate their interest for a specific type of message, after which the communication middleware will take care that those messages are delivered to those applications.
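The publish/subscribe decoupling that MOM provides can be sketched in-process. The `Broker` class, the subject names, and the messages below are hypothetical stand-ins for a real broker, assuming only Python's standard library.

```python
# Message-oriented middleware sketch: senders and receivers are decoupled;
# the broker delivers messages by subject, and neither side needs to know
# the other or be addressed directly.
import queue
from collections import defaultdict

class Broker:
    """In-process stand-in for a MOM broker; real brokers persist and route."""
    def __init__(self):
        self.subscribers = defaultdict(list)   # subject -> list of inbox queues

    def subscribe(self, subject):
        """Register interest in a subject; return the subscriber's inbox."""
        q = queue.Queue()
        self.subscribers[subject].append(q)
        return q

    def publish(self, subject, message):
        # Deliver to every application that registered interest in the subject.
        for q in self.subscribers[subject]:
            q.put(message)

broker = Broker()
inbox = broker.subscribe("stock.updates")
broker.publish("stock.updates", {"symbol": "ACME", "price": 42})
broker.publish("weather", "sunny")        # no subscriber: silently dropped
print(inbox.get_nowait())                 # prints {'symbol': 'ACME', 'price': 42}
```

Unlike the RPC case, the publisher does not block on the subscriber being up: messages wait in the logical contact point until the interested application retrieves them.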
MOTIVATION
Inherently distributed applications
Performance/cost
Resource sharing
Flexibility and extensibility
Availability and fault tolerance
Scalability
Network connectivity is increasing.
A combination of cheap processors is often more cost-effective than one expensive fast system.
Potential increase in reliability.
HISTORY
1975 - 1995
Parallel computing was favored in the early years: primarily vector-based at first, with more thread-based parallelism introduced gradually.
The first distributed computing programs were a pair of programs called Creeper and Reaper, invented in the 1970s.
Ethernet was invented in the 1970s.
ARPANET e-mail was invented in the early 1970s and is probably the earliest example of a large-scale distributed application.
Massively parallel architectures started rising, and the message passing interface and other libraries were developed; bandwidth was a big problem.
The first Internet-based distributed computing project was started in 1988 by the DEC System Research Center.
Distributed.net, founded in 1997, is considered the first project to use the Internet to distribute data for calculation and collect the results.
1995 – TODAY
Cluster/grid architecture is increasingly dominant.
Special node machines are eschewed in favor of COTS technologies.
Web-wide cluster software; Google takes this to the extreme (thousands of nodes per cluster).
SETI@Home, started in May 1999, analyzes the radio signals collected by the Arecibo Radio Telescope in Puerto Rico.
GOAL
Making resources accessible: data sharing and device sharing.
Distribution transparency: access, location, migration, relocation, replication, concurrency, failure.
Communication: make human-to-human communication easier, e.g. electronic mail.
Flexibility: spread the workload over the available machines in the most cost-effective way, coordinate the use of shared resources, and solve large computational problems.
CHARACTERISTICS
Resource Sharing
Openness
Concurrency
Scalability
Fault Tolerance
Transparency
EXAMPLES OF DISTRIBUTED COMPUTING
Network of workstations (NOW) / PCs: a group of networked personal workstations or PCs connected to one or more server machines.
Distributed computing using mobile agents
The Internet (World Wide Web)
An Intranet: a network of computers and workstations within an organization, segregated from the Internet via a protective device (a firewall).
JAVA Remote Method Invocation (RMI)
DISTRIBUTED COMPUTING USING MOBILE AGENTS
Mobile agents can wander around a network, using free resources for their own computations.
TYPICAL DISTRIBUTED COMPUTING
[Figure: a typical distributed computing setup: desktop computers on an intranet, connected through an ISP, backbone, and satellite links to servers]
A TYPICAL INTRANET
[Figure: a typical intranet: desktop computers, a file server, a Web server, email servers, and print and other servers on a local area network, connected to the rest of the Internet through a router/firewall]
INTERNET
The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).
JAVA RMI
Embedded in the Java language: an object variant of remote procedure call.
Adds naming compared with RPC (Remote Procedure Call).
Restricted to Java environments.
TRANSPARENCY IN DISTRIBUTED SYSTEMS
Access transparency: enables local and remote resources to be accessed using identical operations.
Location transparency: enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).
Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.
Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.
Performance transparency: allows the system to be reconfigured to improve performance as loads vary.
Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.
CATEGORIES OF APPLICATIONS IN DISTRIBUTED COMPUTING
Science, Life Sciences, Cryptography, Internet, Financial, Mathematics, Language, Art, Puzzles/Games, Miscellaneous, Distributed Human Projects, Collaborative Knowledge Bases, Charity.
MONOLITHIC MAINFRAME APPLICATION VS DISTRIBUTED APPLICATION
The monolithic mainframe application architecture:
Separate, single-function applications, such as order-entry or billing.
Applications cannot share data or other resources.
Developers must create multiple instances of the same functionality (service).
The distributed application architecture:
Integrated applications.
Applications can share resources.
A single instance of functionality (service) can be reused.
ADVANTAGES OF DISTRIBUTED COMPUTING
Cost: better price/performance as long as everyday hardware is used for the component computers; better use of existing hardware.
Performance: by using the combined processing and storage capacity of many nodes, performance levels can be reached that are out of the scope of centralised machines.
Scalability: resources such as processing and storage capacity can be increased incrementally.
Inherent distribution: some applications, like the Web, are naturally distributed.
Reliability: by having redundant components, the impact of hardware and software faults on users can be reduced.
DISADVANTAGES OF DISTRIBUTED COMPUTING
Multiple points of failure: the failure of one or more participating computers, or of one or more network links, can cause trouble.
Security concerns: in a distributed system there are more opportunities for unauthorized attack.
Software: distributed software is harder to develop than conventional software; hence, it is more expensive.
ISSUES & CHALLENGES
Heterogeneity of components :-
Variety or differences in computer hardware, networks, operating systems, programming languages, and implementations by different developers.
All differences in representation must be dealt with for message exchange to work.
Example: the call for message exchange in UNIX differs from that in Windows.
Openness:-
The system can be extended and re-implemented in various ways. This cannot be achieved unless the specification and documentation are made available to software developers. The main challenge for designers is to tackle the complexity of distributed systems designed by different people.
Transparency:-
Aim: make certain aspects of distribution invisible to the application programmer, so they can focus on the design of their particular application.
Programmers need not be concerned with the locations of resources or the details of how they operate, whether replicated or migrated.
Failures can be presented to application programmers in the form of exceptions, which must be handled.
Security:-
Security for information resources in a distributed system has three components:
a. Confidentiality : protection against disclosure to unauthorized individuals.
b. Integrity : protection against alteration/corruption
c. Availability : protection against interference with the means to access the resources.
The challenge is to send sensitive information over the Internet in a secure manner and to identify a remote user or other agent correctly.
Scalability:
Distributed computing operates at many different scales, ranging from a small intranet to the Internet. A system is scalable if it remains effective when there is a significant increase in the number of resources and users. The challenges are:
a. controlling the cost of physical resources.
b. controlling the performance loss.
c. preventing software resources from running out.
d. avoiding performance bottlenecks.
Failure Handling:
Failures in a distributed system are partial: some components fail while others continue to function. That is why handling failures is difficult.
a. Detecting failures: some failures cannot be detected but may be suspected.
b. Masking failures: hiding failures is not guaranteed in the worst case.
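A common way to cope with failures that can only be suspected, not detected, is a bounded retry with backoff. The sketch below is illustrative: the flaky remote call is simulated in-process, and the attempt count and delays are arbitrary assumptions.

```python
# Retry sketch for partial failure: a remote call may fail while the rest of
# the system keeps working, so the caller wraps it with a bounded retry.
import time

def call_with_retry(operation, attempts=3, delay=0.01):
    """Try 'operation' up to 'attempts' times; re-raise the last failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:   # only retry failures suspected transient
            last_error = exc
            time.sleep(delay * (2 ** attempt))   # exponential backoff
    raise last_error

# Simulated flaky remote call: fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("node unreachable")
    return "ok"

print(call_with_retry(flaky))   # prints "ok" on the third attempt
```

Note the limits the slide points out: the retry only masks the failure if it eventually stops, and a node that is merely slow is indistinguishable from one that has failed.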
Concurrency:
Where applications or services process requests concurrently, operations may conflict with one another and produce inconsistent results.
Each resource must be designed to be safe in a concurrent environment.
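Making a resource safe in a concurrent environment usually comes down to guarding its state so that updates cannot interleave. A minimal sketch, assuming a simple counter as the shared resource (the class and thread counts are illustrative, not from the slides):

```python
# Concurrency sketch: a shared resource made safe with a lock so concurrent
# updates do not interleave and produce inconsistent results.
import threading

class SafeCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write of 'value' could interleave
        # across threads and lose updates.
        with self._lock:
            self.value += 1

counter = SafeCounter()
threads = [
    threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)   # prints 8000: no updates lost
```

In a distributed system the same discipline applies, but the lock becomes a distributed coordination mechanism (e.g. a lock service or transactions) rather than an in-process mutex.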
CONCLUSION
The concept of distributed computing is an efficient way to achieve optimization.
Distributed computing is everywhere: intranets, the Internet, and mobile ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi systems).
It deals with hardware and software systems that contain more than one processing or storage element and run concurrently.
The main motivating factor is resource sharing, such as files, printers, web pages, or database records.
Grid computing and cloud computing are forms of distributed computing.
REFERENCES
Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems: Principles and Paradigms, Pearson Prentice Hall, 2nd Edition, 2007.
George Coulouris, Jean Dollimore, and Tim Kindberg, Distributed Systems: Concepts and Design, Addison-Wesley / Pearson Education, 3rd Edition, 2001.
www.inderscience.com/ijcnds