User-Level Interprocess Communication for Shared Memory Multiprocessors
by
Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.
Introduction
RPC
  Helps in implementing distributed applications by eliminating the need to implement a communication mechanism.
  A decomposed system provides the advantages of failure isolation, extensibility, and modularity, so RPC is used even when the call stays on the same machine.
Introduction
RPC Costs
  Stub overhead
  Message buffer overhead (4 copies)
  Access validation
  Message transfer
  Scheduling
  Context switch
  Dispatch
Introduction
LRPC Costs
  Stub overhead
  Message buffer overhead (1 copy)
  Only necessary access validation
  Message transfer
  Only necessary scheduling
  Context switch is minimized by using domain caching
Introduction
IPC
  Main components (all work is done in the kernel)
    Processor reallocation (process context switch)
    Data transfer
    Thread management
  Problems
    Processor reallocation is expensive
    Parallel applications need user-level thread management
URPC
User-Level Remote Procedure Call
  Shared memory multiprocessors
  Processor reallocation - minimized
  Data transfer - user level (package called URPC)
  Thread management - user level (package called FastThreads)
User-level components
Processor Reallocation
Limit the frequency of processor reallocation
Why
  A process context switch is more expensive than a thread context switch
  Cost of invoking the kernel
-Client makes procedure call into server address space
-Invoke kernel
-Kernel reallocates processor to server address space
-Server finishes the job
-Invoke kernel
-Kernel reallocates processor to client address space
-Client resumes the work
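The kernel-mediated call sequence above can be written down as a small event trace. This is only an illustrative model: the event names and the `kernel_mediated_call` and `count_events` functions are invented for this sketch and do not come from any RPC implementation. It shows that one cross-address-space call invokes the kernel twice and reallocates the processor twice.

```python
from collections import Counter

def kernel_mediated_call():
    """Return the event trace of one cross-address-space call (hypothetical model)."""
    return [
        ("client", "call"),        # client makes procedure call into server address space
        ("kernel", "trap"),        # invoke kernel
        ("kernel", "reallocate"),  # processor moves to the server address space
        ("server", "execute"),     # server finishes the job
        ("kernel", "trap"),        # invoke kernel again
        ("kernel", "reallocate"),  # processor moves back to the client address space
        ("client", "resume"),      # client resumes the work
    ]

def count_events(trace):
    """Count how often each kind of event occurs in a trace."""
    return Counter(kind for _, kind in trace)

counts = count_events(kernel_mediated_call())
print(counts["trap"], counts["reallocate"])  # prints "2 2"
```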
Processor Reallocation
Limit the frequency of processor reallocation
How
  Optimistic reallocation policy, which assumes that
    The client has other work to do
    The server has, or will soon have, a processor to do the job
  On a uniprocessor, processor reallocation can be delayed
-Client makes procedure call into server address space
-Client does something else
-Server finishes the job
-Client resumes the work
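The four steps above can be simulated with Python generators standing in for user-level threads. This is a minimal sketch under stated assumptions, not the URPC package: `client_thread`, `simulate`, and the lockstep "one client step, then one server step" loop are all hypothetical, and a `deque` stands in for the shared message queue. The point it illustrates is that the calling thread blocks while the client's processor moves on to another thread, with no kernel involvement.

```python
from collections import deque

def client_thread(name, arg, log):
    """A user-level client thread: make one call, then use the reply."""
    reply = yield arg                      # only this thread blocks
    log.append(f"{name} resumes with {reply}")

def simulate(threads, server_proc, log):
    """Run client threads and a server in lockstep (assumed two processors)."""
    ready = deque((t, None) for t in threads)  # (thread, value to deliver)
    requests = deque()                         # stand-in for the shared queue
    while ready or requests:
        if ready:                              # client processor runs one thread
            thread, value = ready.popleft()
            try:
                arg = thread.send(value)
            except StopIteration:
                pass                           # thread finished
            else:
                # Queue the call and go on to the next ready thread
                # instead of asking the kernel for a reallocation.
                requests.append((thread, arg))
                log.append(f"request {arg} queued")
        if requests:                           # server processor serves one request
            thread, arg = requests.popleft()
            ready.append((thread, server_proc(arg)))

log = []
threads = [client_thread("t1", 2, log), client_thread("t2", 5, log)]
simulate(threads, lambda x: x * x, log)
print(log)
```

In this run the second thread issues its request before the first thread resumes, which is the "client does something else" step.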
Processor Reallocation
Problems
  Inappropriate situations
    Single-threaded clients, real-time applications, and high-latency I/O applications
    Solution: allow the client to force processor reallocation
  Underpowered address space
    No processor is available to handle the pending request from the client
    Solution: donation. An idle processor donates itself to the underpowered address space
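The donation rule can be sketched as a small policy function. Everything here is an assumption made for illustration: the `AddressSpace` record, the definition of "underpowered" as "has pending requests but no processor", and the tie-break that prefers the space with the most pending work are not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class AddressSpace:
    name: str
    processors: int   # processors currently allocated to this space
    pending: int      # queued requests waiting to be handled

def is_underpowered(space):
    """Assumed test: work is queued but no processor is there to run it."""
    return space.pending > 0 and space.processors == 0

def donate(idle_from, spaces):
    """Move one idle processor from idle_from to an underpowered space, if any."""
    candidates = [s for s in spaces if s is not idle_from and is_underpowered(s)]
    if not candidates or idle_from.processors == 0:
        return None
    target = max(candidates, key=lambda s: s.pending)  # assumed tie-break
    idle_from.processors -= 1
    target.processors += 1
    return target

client = AddressSpace("client", processors=2, pending=0)
server = AddressSpace("server", processors=0, pending=3)
target = donate(client, [client, server])
print(target.name, client.processors, server.processors)  # prints "server 1 1"
```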
Processor Reallocation
Problems
  Voluntary return of processor
    A processor working in the server may never return to the client because it stays busy with requests from other clients.
    Solution: enforce processor reallocation when necessary, for example when a high-priority job is waiting while a low-priority job is running or a processor is idling
Processor Reallocation
LRPC vs URPC
  Domain caching looks for an idle processor in the server context
  Optimistic reallocation assumes there will be an available processor in the server context and queues the request to be done later
  URPC needs two-level scheduling decisions, which include looking for an idle processor and for an underpowered address space, while LRPC does not.
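The contrast between the two policies can be put side by side in a short sketch. The function names and return strings are hypothetical, and the fallback to an ordinary context switch when no idle server processor exists is this sketch's assumption about LRPC rather than something stated on the slide.

```python
from collections import deque

def lrpc_dispatch(idle_server_processors):
    """Domain caching: use a processor already idling in the server context."""
    if idle_server_processors > 0:
        return "use idle server processor"
    return "context switch"   # assumed fallback on the calling processor

def urpc_dispatch(request, server_queue):
    """Optimistic reallocation: queue the request and assume a server
    processor is, or soon will be, available to pick it up."""
    server_queue.append(request)
    return "queued"

queue = deque()
print(lrpc_dispatch(1))           # prints "use idle server processor"
print(lrpc_dispatch(0))           # prints "context switch"
print(urpc_dispatch("add", queue))  # prints "queued"
```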
Data Transfer
  Use pair-wise shared memory to avoid the need for copying in the kernel.
  Both approaches (kernel copying and shared memory) give the same level of security, since data must pass through the stubs before it can be used
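A minimal sketch of one direction of such a channel follows, with stubs that marshal and validate the data. It rests on several assumptions: a `deque` stands in for memory mapped into both address spaces, the non-blocking `threading.Lock` is this sketch's choice of guard, and `Channel`, `client_stub_add`, and `server_stub_add` are hypothetical names. It illustrates why safety does not depend on the kernel: the server procedure only sees arguments that its stub has unpacked and checked.

```python
import struct
import threading
from collections import deque

class Channel:
    """One direction of a pair-wise channel (deque stands in for shared memory)."""
    def __init__(self):
        self.queue = deque()
        self.lock = threading.Lock()

    def try_send(self, payload):
        # Non-blocking acquire: if the lock is busy, report failure
        # so the caller can do other work instead of waiting.
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.queue.append(payload)
            return True
        finally:
            self.lock.release()

    def try_receive(self):
        if not self.lock.acquire(blocking=False):
            return None
        try:
            return self.queue.popleft() if self.queue else None
        finally:
            self.lock.release()

def client_stub_add(channel, a, b):
    """Marshal two 32-bit integers into the channel."""
    return channel.try_send(struct.pack("<ii", a, b))

def server_stub_add(channel):
    """Unmarshal and validate a request before the server procedure sees it."""
    payload = channel.try_receive()
    if payload is None:
        return None
    if len(payload) != struct.calcsize("<ii"):
        raise ValueError("malformed request")
    a, b = struct.unpack("<ii", payload)
    return a + b

channel = Channel()
client_stub_add(channel, 2, 3)
print(server_stub_add(channel))  # prints "5"
```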
Thread Management
Arguments
  Fine-grained parallel applications need high-performance thread management, which can only be achieved by implementing it at user level
  Communication and thread management can achieve very good performance when both are implemented at user level
Thread Management
  Kernel features such as time slicing degrade application performance
  Invoking a thread management operation in the kernel requires a kernel trap
  A thread management policy implemented in the kernel is unlikely to be efficient for all parallel applications
Thread Management
Threads block in order to
  Synchronize their activities within the same address space
  Wait for external events from a different address space
Communication implemented at kernel level results in synchronization at both user level and kernel level
URPCURPC
Performance
  Thread management is faster at user level
  Component breakdown
Performance
  Call latency and throughput are at their worst when S=0 (no processor allocated to the server)
Conclusion
  Move functionality from the kernel to user level wherever possible to improve performance
  To achieve high performance on multiprocessors, the system needs to be designed to support the multiprocessor's functionality