Lecture 6 Spr2015

The University of Texas at Arlington

Transcript of Lecture 6 Spr2015

  • The University of Texas at Arlington

    Lecture 6

  • Class Assignment for Thursday

    Read Chapter 5 to page 120, POSIX Threads

    Tuesday, February 10, meet in Lab, ERB Rm. 126

    Assignment 3 Due February 12

  • Threading and Parallel Programming Constraints

    In today's lecture we will see some of the problems of threads finishing in the wrong order, why some threads might not be finished before the program terminates, and why threads often need to communicate with one another.

    Chapter 4, Threading and Parallel Programming Constraints

  • Chapter 4, Threading and Parallel Programming Constraints

    Important Concepts:

    Synchronization

    Critical Sections

    Deadlock

    Synchronization Primitives

    Messages

    Flow Control Concepts

  • Multi-thread Concepts

    Multi-threading concepts are needed in order to obtain maximum performance from multi-core microprocessors.

    These concepts include:

    Creating, Terminating, Suspending, and Resuming Threads

    Thread Synchronization Methods

    Semaphores, Mutexes, Locks, and Critical Sections

  • Using Threads

    Benefits of using threads include:

    Increased performance

    Better resource utilization

    Efficient data sharing

    However, there are risks to using threads:

    Data races

    Deadlocks

    Code complexity

    Portability issues

    Testing and debugging difficulty

  • Waiting for Threads

    Looping on a condition is expensive:

    The thread is scheduled even when there is no work

    CPU time is stolen from threads performing work

    It is hard to find the right balance:

    Locking is probably too much or not enough

    Thread.Sleep is inflexible

    Better option: just wait for it, using Wait(L) signaling

  • Threading Concepts

    Using threads helps you enhance performance by allowing you to run two or more concurrent activities. Three concepts ensure that threads are used properly:

    Synchronization

    Mutual exclusion

    Critical Region

  • Synchronization

    Synchronization controls the relative order of thread execution and resolves conflicts among threads.

    Two types:

    Condition synchronization

    Mutual exclusion

  • OpenMP Example

    #pragma omp parallel num_threads(4)

  • Example Thread Operations

    handle = (HANDLE)_beginthreadex(0, 0, &mywork, 0, CREATE_SUSPENDED, 0);

    DWORD SuspendThread(HANDLE hThread);

    ResumeThread(handle);

  • Wait Command

    WaitForMultipleObjects(numThreads, hThread, TRUE, INFINITE);

  • What is Thread Synchronization?

    Two or more threads cooperating

    One thread waits for another to be in a known state before continuing

    Lack of synchronization leads to data corruption or lockups

    Synchronization means using methods and constructs to enforce the required behavior

  • Synchronization Cont.

    Condition synchronization allows a thread to wait until a specific condition is reached.

    The use of proper synchronization techniques ensures that only one thread is allowed access to a critical section at any one instant.

    The major challenge of threaded programming is to implement critical sections in such a way that multiple threads perform mutually exclusive operations on critical sections and do not use critical sections simultaneously.

  • Mutual Exclusion

    Mutual exclusion is the program logic used to ensure single-thread access to a critical region.

    One thread locks a critical section of code containing shared data while one or more other threads wait for access.

  • Critical Section

  • Example Commands for Working with Threads

  • Declaring a Critical Section

    CRITICAL_SECTION g_cs;


  • Critical Section

    EnterCriticalSection(&L2);

    processA(data1, data2);

    LeaveCriticalSection(&L2);


  • Ensuring Exclusive Access

    Threads often need to share data or objects.

    Locking:

    Waiting on a common object

    Little impact on code complexity

    Mutex:

    One thread at a time

    Can work across processes

    Semaphore:

    Limited number of threads at a time

    Can work across processes

  • Using Synchronization

    Synchronization is about making sure that threads take turns when they need to, typically to access some shared object.

    Depending on your specific application needs, you will find that different options make more sense than others.

    Windows simplifies this process since it has built-in support for suspending a thread at the scheduler level when necessary. In this manner, one thread can be put to sleep until a certain condition occurs in another thread. By letting one thread sleep instead of repeatedly checking whether another thread is done, performance is dramatically improved.

  • Using the Mutex

    The most common method of making sure that two threads take turns before accessing a given object is to use a shared lock. Since only one thread at a time can have the lock, other threads wait their turn.

    Similar to a lock is the Mutex object. Only one thread can lock the Mutex at a time, and that same thread must then release it. The key difference between a Mutex and a standard lock is that the Mutex works across processes for more advanced scenarios.

  • Example Mutex

    CreateMutex()  // create a new mutex

    ReleaseMutex() // unlock

  • Using the Semaphore

    A Semaphore is similar to a Mutex, except that it uses the concept of capacity rather than a simple lock. In other words, you create it with a specified capacity, and once that number of threads has locked it, subsequent access is blocked until a slot opens up. A Semaphore with a capacity of one is essentially a Mutex, with the exception that any thread can release it, not just the thread that acquired it. As with the Mutex, the Semaphore can be used across processes.

  • Synchronization Primitives

    Semaphores,

    Locks, and

    Condition variables

    Primitives are implemented with atomic operations and memory fences (barriers): processor-dependent operations that ensure threads see other threads' memory operations in a consistent order.

  • Implementing a Semaphore

    A Semaphore is a form of counter that allows multiple threads access to a resource by incrementing or decrementing the semaphore.

    A typical use is protecting a shared resource of which at most n instances are allowed to exist simultaneously. Use P to acquire a resource and V to release it.

  • Semaphore Notes

    The value of a semaphore is the number of units of the resource which are free. (If there is only one resource, a "binary semaphore" with values 0 or 1 is used.) The P operation busy-waits (or maybe sleeps) until a resource is available, whereupon it immediately claims one. V is the inverse; it simply makes a resource available again after the process has finished using it. Init is only used to initialize the semaphore before any requests are made. The P and V operations must be atomic, which means that no process may ever be preempted in the middle of one of those operations to run another operation on the same semaphore.

    http://en.wikipedia.org/wiki/Semaphore_(programming)

  • Implementing Locks

    Ensure that only a single thread can have access to a resource

    Acquire(): waits for the lock state to be unlocked and then sets the state to locked

    Release(): changes the lock state from locked to unlocked

  • Lock Implementation Cont.

    Use the lock inside a critical section with a single entry and a single exit:

    Critical section

    (operate on shared memory protected by the lock)

    End

    Lock Types:

    Mutex - can include a timer attribute for release

    Recursive - can be repeatedly acquired by the owning thread (used in recursive functions)

  • Lock Implementation Cont.

    Locking restricts access to an object to one thread

    Minimize locking/synchronization whenever possible

    Make objects thread-safe when appropriate

    Acquire locks late, release them early - the shorter the duration, the better

    Lock only when necessary

  • Lock Types - Cont.

    Read-Write Locks allow simultaneous read access to multiple threads but limit write access to only one thread. Use when multiple threads need to read shared data but do not need to perform a write operation on the data.

    Spin Locks: waiting threads spin, or poll, the state of the lock rather than getting blocked. Used mostly on multiprocessor systems, since on a uniprocessor the one processor is essentially blocked while spinning. Use when the hold time of the lock is short - less than the cost of blocking and waking up a thread.

  • Implementing a Mutex

    Behaves much like the lock statement

    Threads take turns acquiring and releasing it

    WaitOne()

    ReleaseMutex()

  • Condition Variables - Other

    Condition variables, messages, and flow control concepts will be discussed further as introduced in Chapter 5 and other examples.

    In general, condition variables are a method to implement a message regarding a specific condition that a thread is waiting on, while that thread holds a lock on a specific resource. To prevent occurrences of deadlocks, the following atomic operations on a condition variable can be used: Wait(L), Signal(L), and Broadcast(L).

  • Condition Variables

    Suppose a thread has a lock on a specific resource but cannot proceed until a particular condition occurs. For this case, the thread can release the lock but will need it returned when the condition occurs. The wait() method releases the lock and lets the next thread waiting on this resource use it. The condition the original thread was waiting on is passed via the condition variable to the new thread holding the lock. When the new thread is finished with the resource, it checks the condition variable and returns the resource to the original holder by use of the signal() or broadcast() commands. broadcast() enables all threads waiting for that resource to run.

  • Example (Condition Variable)

    Condition C;
    Lock L;
    bool LC = false;

    void producer() {
        while (1) {
            L->acquire();
            // start critical section
            while (LC == true) {
                C->wait(L);
            }
            // produce the next data
            LC = true;
            C->signal(L);
            // end critical section
            L->release();
        }
    }

  • Example Cont.

    void consumer() {
        while (1) {
            L->acquire();
            // start critical section
            while (LC == false) {
                C->wait(L);
            }
            // consume the next data
            LC = false;
            C->signal(L);  // wake the producer waiting on LC
            // end critical section
            L->release();
        }
    }

  • Message Passing

    A message is a special method of communication used to transfer information or a signal from one domain to another. For multi-threading environments, the domain is referred to as the boundary of a thread.

    Message passing (as in MPI, the Message Passing Interface, used in distributed computing, parallel processing, etc.) is a method to communicate between threads or processes.

  • Deadlock

    Deadlock occurs when a thread waits for a condition that never occurs. It commonly results from the competition between threads for system resources held by other threads.

  • Deadlock

    Example: Traffic Jam

  • Conditions for Deadlock

    A deadlock situation can arise only if all of the following conditions hold simultaneously in a system:[1]

    Mutual Exclusion: At least one resource must be non-shareable.[1] Only one process can use the resource at any given instant of time.

    Hold and Wait: A process is currently holding at least one resource and requesting additional resources which are being held by other processes.

    No Preemption: The operating system must not de-allocate resources once they have been allocated; they must be released by the holding process voluntarily.

    Circular Wait: A process must be waiting for a resource which is being held by another process, which in turn is waiting for the first process to release the resource.

    (Figure: hold and wait condition)

  • Deadlocks

    Deadlocks occur when locks are in use and threads wait on each other's nested locks.

    Avoid nested locks if possible.

    If multiple locks are needed, always obtain and release them in the same order.

  • Race Conditions

    Race conditions:

    Are the most common errors in concurrent programs.

    Occur because the programmer assumes a particular order of execution but does not guarantee that order through synchronization.

    A Data Race:

    Refers to a storage conflict situation.

    Occurs when two or more threads simultaneously access the same memory location while at least one thread is updating that location.

  • Race Conditions Cont.

    Race conditions may not be obvious

    Errors are most likely to occur unexpectedly

    Locks are the key to avoidance

  • Summary

    For synchronization, an understanding of atomic operations will help avoid deadlock and eliminate race conditions.

    Use a proper synchronization construct-based framework for threaded applications.

    Use higher-level synchronization constructs over primitive types.

  • Summary Cont.

    An application must not contain any possibility of a deadlock scenario.

    Threads can perform message passing using three different approaches: intra-process, inter-process, and process-process.

    It is important to understand the way threading features of third-party libraries are implemented. Different implementations may cause applications to fail in unexpected ways.