Multi-Thread Programming
Vincent Liu
MDE GPE-EA
Sun Microsystems Inc.
What's in this topic
Basics of multi-threaded programs
Solaris's thread model
Synchronization mechanisms
Lock contention
How to measure an MT program's scalability on Solaris
Some tools at our command on Solaris
Improving an MT program's performance
Agenda
Introduction to Multi-Thread
Basic Thread Programming
Thread Synchronization
Locking Problems
Advanced Multi-Thread programming
Solaris Thread libraries
Introduction to Multithreading
Process, Thread and LWP
Which Applications Benefit?
When Not to Use Threads
Thread Standard
What Is a Thread?
• An independent flow of control in a program
• A “virtual” central processing unit(CPU)
• Threads share the address space and process structure; each thread has its own stack, TCB, and KTCB
• A traditional process is a multithreaded process with only one thread
Process, Thread and LWP
Thread vs. Process
Thread: high degree of parallelism on multiprocessor systems
Kernel resources consumed (create, schedule, etc.)
  Thread: lightweight
  Process: heavyweight
Information sharing
  Thread: shares the process address space
  Process: IPC
Single-Level Threads Model
The default model in Solaris 9 and 10
All user threads bound to LWPs
Kernel-level scheduling: no more libthread.so scheduler
More expensive thread create/destroy, synchronization
More responsive scheduling, synchronization
Which Applications Benefit?
Threads can:
Simplify program structure
Improve throughput
Improve responsiveness
Minimize system resource usage
Simplify realtime applications
When Not to Use Threads
For compute-bound threads on a uniprocessor.
For threads that execute very short tasks
When there is nothing to run concurrently
When the multithreaded design would be more difficult to design and debug than a single-threaded one, without enough benefit in return.
Thread Interfaces
Two standards:
POSIX threads: Solaris 2.5+, HP-UX 10.30+, Digital Unix 4.0+, IRIX 6.2+, VMS, AS/400, AIX 4.1+
UI threads: Solaris 2.2+, UnixWare
Basic Thread Programming
Thread Life Cycle
Thread APIs
Thread Attribute
Thread Life Cycle
main() {
    pthread_create(&tid, &attributes, function, arg);
    …
}
void *function(void *arg) {
    …
    pthread_exit(status);
}
[Diagram: pthread_create(…, function, arg) starts function(arg), which runs until pthread_exit(status)]
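Simple Multi-Thread program
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *func(void *arg);    /* thread start routine, defined below */

int main(void) {
    pthread_t t1;
    void *status;
    /* pthread_t is an opaque type; the cast assumes it is an integer, as on Solaris */
    printf("main thread id is : %lu\n", (unsigned long)pthread_self());
    pthread_create(&t1, NULL, func, NULL);
    pthread_join(t1, &status);    /* status receives (void *)44 */
    pthread_exit(status);
}

void *func(void *arg) {
    sleep(10);
    pthread_exit((void *)44);
}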
POSIX Thread APIs
int pthread_create(&tid, &attr, func, arg);
void pthread_exit(status);
int pthread_join(tid, &status);
pthread_t pthread_self();
int pthread_equal(tid1, tid2);
int pthread_cancel(tid);
int sched_yield();
Waiting for a Thread to Exit
“Undetached” threads must be joined and may return a status value
“Detached” threads cannot be joined and cannot return a status value
[Diagram: t1 calls pthread_join(t2) and waits until t2 calls pthread_exit(status)]
Thread Attributes Usage
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_create(&tid, &attr, func, arg);
Thread Synchronization
Why use Synchronization
Synchronization Mechanism
Mutex
Semaphore
Condition Variable
Why use Synchronization
Unsynchronized shared data is a recipe for disaster
Thread 1                              Thread 2
temp = your.bankbalance;
dividend = temp * interestrate;
newbalance = dividend + temp;
                                      your.bankbalance += deposit;
your.bankbalance = newbalance;
Thread 2's deposit is lost: Thread 1 overwrites it with a value computed from the old balance.
All shared Data Must Be Locked
• Good programmers protect all usage of shared data with locks
Shared data
lock(M);
…….
unlock(M);
lock(M);
…..
unlock(M);
All shared Data Must Be Locked
Global variables
Process resources
Shared data structures
Static variables
Synchronization Variables
Used to protect shared resources
Mutex exclusion locks
Used to determine when a thread should run
Counting Semaphores
Condition variables
Mutex Sample
pthread_mutex_t lock;
pthread_mutex_init(&lock, NULL);
…..
Thread 1                          Thread 2
pthread_mutex_lock(&lock);        pthread_mutex_lock(&lock);
deposit(acct, x);                 draw(acct, y);
pthread_mutex_unlock(&lock);      pthread_mutex_unlock(&lock);
[Diagram: deposit and draw both operate on the shared user account]
Mutexes
A blocked thread is not guaranteed to get the mutex when it is released; another thread may acquire it first
Typical lock/unlock time: one cycle
Critical sections should be as short as possible
Non-critical sections are typically much longer
Semaphores
sem_wait: decrease the count if it is non-zero; otherwise wait until another thread executes sem_post
sem_post: increase the count by 1, and wake up a sleeper if one exists
sem_init: semaphore initialization
[Diagram: a semaphore with count 0 and sleepers t1 and t3]
Semaphore Execution Graph
[Diagram: timeline for threads t1 to t5. Events shown: sem_wait (s=0, waiting); sem_post (s=1, wake up t1, then s=0); sem_post (s=1); sem_wait (s=0); sem_wait (s=0, waiting)]
EINTR
A semaphore wait can be interrupted by a signal: sem_wait then returns -1 with errno set to EINTR, without decreasing the value
while (sem_wait(&s) == -1 && errno == EINTR)
    { <probably do nothing> }
do_thing;
Avoiding Deadlocks
Deadlocks can always be avoided
You can establish a lock hierarchy
Use a static analysis program (for example lock_lint) to scan your code for hierarchy violations
Use the trylock primitive
pthread_mutex_lock(&m2);
…
if (EBUSY == pthread_mutex_trylock(&m1)) {
    /* m1 is busy: back off, then take both locks in hierarchy order */
    pthread_mutex_unlock(&m2);
    pthread_mutex_lock(&m1);
    pthread_mutex_lock(&m2);
}
do_real_work();
Advanced Topic
• Thread Specific Data
• Unix Signals
• Advanced scheduling
• MT-Safe library
Why TSD is needed
Sometimes it is useful to have a global variable which is local to a thread
Thread 1                        Thread 2
err = read(..);                 err = ioctl(…);
if (err)                        if (err)
    printf("%d\n", errno);          printf("%d\n", errno);
TSD Usage
• Key creation: pthread_key_create(&key, destructor)
• Key deletion: pthread_key_delete(key)
• Modify TSD: pthread_setspecific(key, value)
• Access TSD: pthread_getspecific(key)
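TSD Sample
pthread_key_t key1;
void *foo(void *x);
void bar(void);

main() {
    static float a = 3.0, b = 4.0;    /* thread arguments are passed as pointers */
    pthread_key_create(&key1, destroyer);
    pthread_create(&t1, NULL, foo, &a);
    pthread_create(&t2, NULL, foo, &b);
}
void *foo(void *x) {
    pthread_setspecific(key1, x);    /* store this thread's own value under key1 */
    bar();
    return NULL;
}
void bar(void) {
    void *n = pthread_getspecific(key1);    /* returns the calling thread's value */
}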
TSD Destructors
When a thread exits, it first sets the value of each non-NULL TSD element to NULL, then calls the destructor with the previous value
If you delete a key, the destructor functions will not run; you must clean up the values yourself
Three uses of Signals
• Synchronous signals for error reporting: SIGFPE, SIGSEGV, SIGBUS, etc.
• Asynchronous signals for situation reporting
• Asynchronous signals for interruption: SIGKILL, SIGALRM, SIGSTOP, etc.
Traditional Signal Handling
[Diagram: signal USR1 is dispatched to the handler foo() while main() is running; steps numbered 1 to 6]
POSIX Signal Model
[Diagram labels: signal, thread signal mask, signal dispatch table, library's signal handler routines]
Dedicated Signal Handling Thread
A multithreaded program can create one or more threads dedicated to signal handling for the whole process by using sigwait
pthread_create(&p, &attr, handler, arg);
….
void *handler(void *arg) {
    while (1) {
        sigwait(&sigset, &sig);
        ……
    }
}
POSIX Signal API
pthread_sigmask( )
pthread_kill( )
sigwait
sigset or sigaction
HOL about Signal
1. Install a signal handler.
2. In an MT program, all threads share one handler.
3. Set up different signal masks to direct signal delivery.
Advanced Scheduling
Solaris kernel scheduling
RT, SYSTEM, TS
POSIX defines 3 scheduling classes
* SCHED_OTHER: time sharing
* SCHED_FIFO
* SCHED_RR
API for scheduling
pthread_attr_setschedpolicy: SCHED_OTHER, SCHED_FIFO, SCHED_RR
pthread_attr_setschedparam
pthread_attr_setscope: PTHREAD_SCOPE_PROCESS, PTHREAD_SCOPE_SYSTEM
pthread_attr_setinheritsched: PTHREAD_INHERIT_SCHED, PTHREAD_EXPLICIT_SCHED
priocntl
MT-Safe Library
Thread Safety Level
MT-Unsafe
MT-Safe
Alternative Call
Solaris Libraries
libpthread.so (POSIX threads): pthread.h
libthread.so (UI threads): synch.h, thread.h
libposix4.so (POSIX semaphores): posix4.h
Compiling (Solaris 10: no flag needed)
Choose semantics
POSIX:
    cc [flags] file -D_POSIX_C_SOURCE=199506L [-lposix4] -lpthread
Mixed usage:
    cc [flags] file -D_REENTRANT -D_POSIX_C_SOURCE=199506L [-lposix4] -lthread
UI:
    cc [flags] file -D_REENTRANT [-lposix4] -lthread
Some commands to observe
Use prstat(1) and ps(1) to monitor running processes and threads
Use mpstat(1) to monitor context switch rates and thread migrations
Use dispadmin(1M) to examine and change dispatch table parameters
Use priocntl(1) to change scheduling classes and priorities
Examining A Thread Structure
# mdb -k
R 21344 1 21343 21280 2234 0x42004000 ffffffff95549938 tcpPerfServer
ffffffff95549938::print proc_t
...
p_tlist = 0xffffffff8826bc20
ffffffff8826bc20::print kthread_t
Thread Semantics Added to pstack, truss
# pstack 909/2
909: dbwr -a dbwr -i 2 -s b0000000 -m /var/tmp/fbencAAAmxaqxb
----------------- lwp# 2 --------------------------------
ceab1809 lwp_park (0, afffde50, 0)
ceaabf93 cond_wait_queue (ce9f8378, ce9f83a0, afffde50, 0) + 3b
ceaac33f cond_wait_common (ce9f8378, ce9f83a0, afffde50) + 1df
ceaac686 _cond_reltimedwait (ce9f8378, ce9f83a0, afffdea0) + 36
ceaac6b4 cond_reltimedwait (ce9f8378, ce9f83a0, afffdea0) + 24
ce9e5902 __aio_waitn (82d1f08, 1000, afffdf2c, afffdf18, 1) + 529
ceaf2a84 aio_waitn64 (82d1f08, 1000, afffdf2c, afffdf18) + 24
08063065 flowoplib_aiowait (b4eb475c, c40f4d54) + 97
08061de1 flowop_start (b4eb475c) + 257
ceab15c0 _thr_setup (ce9a8400) + 50
ceab1780 _lwp_start (ce9a8400, 0, 0, afffdff8, ceab1780, ce9a8400)
truss -p 2975/3
Using dtrace to count thread creations per program
# dtrace -n 'thread_create:entry { @[execname]=count()}'
dtrace: description 'thread_create:entry ' matched 1 probe
^C
sh 1
sched 1
do1.6499 2
do1.6494 2
do1.6497 2
do1.6508 2
in.rshd 12
do1.6498 14
do1.6505 16
do1.6495 16
do1.6504 16
do1.6502 16
automountd 17
inetd 19
filebench 34
find 130
csh 177
Resources
Multithread programming on Sun.com
devnull.eng
Want to Scale? Multithread did that