.Net Multithreading and Parallelization

Multithreading and Parallelization Dmitri Nesteruk [email protected] | http://nesteruk.org/seminars

Transcript of .Net Multithreading and Parallelization

Page 1: .Net Multithreading and Parallelization

Multithreading and Parallelization

Dmitri Nesteruk | [email protected] | http://nesteruk.org/seminars

Page 2: .Net Multithreading and Parallelization

Agenda

Overview

Multithreading

PowerThreading (AsyncEnumerator)

Multi-core parallelization

Parallel Extensions to .NET Framework

Multi-computer parallelization

PureMPI.NET

Page 3: .Net Multithreading and Parallelization

Why now?

Manycore paradigm shift

CPU clock speeds have hit production challenges (not at the physical limit yet), slowing clock-speed growth

Processor features

Hyper-threading

SIMD

Page 4: .Net Multithreading and Parallelization

CPU Scope

Yesterday: 1x-core

Today: 2x-core the norm, 4x-core arriving

Tomorrow: 32x-core?

Past: more transistors per chip

Present: more cores per chip

Future: even more cores per chip; NUMA & other specialties

Page 5: .Net Multithreading and Parallelization

Machine Scope

Most clients are concerned with one-machine use

Clustering helps scale performance across machines

Clouds

Machine

Cluster

Cloud

Page 6: .Net Multithreading and Parallelization

Multithreading vs. Parallelization

Multithreading

Using threads/thread pool to perform async operations

Explicit (# of threads known)

Parallelization

Implicit parallelization

No explicit thread operation

Page 7: .Net Multithreading and Parallelization

Ways to Parallelize/Multithread

Managed

System.Threading, Parallel Extensions, libraries

Unmanaged

OpenMP, libraries

Specialized

GPGPU, FPGA

Page 8: .Net Multithreading and Parallelization

Managed

System.Threading

Libraries

Parallel Extensions (TPL + PLINQ)

PowerThreading

Languages/frameworks

Sing#, CCR

Remoting, WCF, MPI.NET, PureMPI.NET, etc.

Use over many machines

Page 9: .Net Multithreading and Parallelization

Unmanaged

OpenMP

– #pragma directives in C++ code

Intel multi-core libraries

Threading Building Blocks (low-level)

Integrated Performance Primitives

Math Kernel Library (also has MPI support)

MPI, PVM, etc.

Use over many machines

Page 10: .Net Multithreading and Parallelization

Specialized Examples (Intrinsic Parallelization)

GPU Computation (GPGPU)

Calculations on graphic card

Uses programmable pixel shaders

See, e.g., NVidia CUDA, GPGPU.org

FPGA

Hardware-specific solutions

E.g., in-socket accelerators

Requires HDL programming & custom hardware

Page 11: .Net Multithreading and Parallelization

Multithreading: a look at AsyncEnumerator

Part I

Page 12: .Net Multithreading and Parallelization

Multithreading

Goals

Do stuff concurrently

Preserve safety/consistency

Tools

Threads

ThreadPool

Synchronization objects

Framework async APIs

Page 13: .Net Multithreading and Parallelization

A Look at Delegates

Making a delegate for a function is easy

Given void a() { … }

– ThreadStart del = a;

Given void a(int n) { … }

– Action<int> del = a;

Given float a(int n, double m) {…}

– Func<int, double, float> del = a;

Otherwise, make your own!

Page 14: .Net Multithreading and Parallelization

Delegate Methods

Invoke()

Synchronous, blocks your thread

BeginInvoke

Executes in ThreadPool

Returns IAsyncResult

EndInvoke

Waits for completion

Takes the IAsyncResult from BeginInvoke
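A minimal sketch of the Begin/EndInvoke pair described above (the Compute method is illustrative; note that delegate BeginInvoke is a .NET Framework feature and is not supported on .NET Core and later):

```csharp
using System;

class Demo
{
    static float Compute(int n, double m) { return (float)(n * m); }

    static void Main()
    {
        Func<int, double, float> del = Compute;

        // BeginInvoke queues the call on the ThreadPool and returns at once
        IAsyncResult ar = del.BeginInvoke(2, 3.5, null, null);

        // ... do other work on this thread ...

        // EndInvoke blocks until the call completes and returns its result
        float result = del.EndInvoke(ar);
        Console.WriteLine(result);
    }
}
```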

Page 15: .Net Multithreading and Parallelization

Usage

Fire and forget

– del.BeginInvoke(null, null);

Fire, and wait until done

– IAsyncResult ar = del.BeginInvoke(null,null);…del.EndInvoke(ar);

Fire, and call a function when done

– del.BeginInvoke(firedWhenDone, null); // firedWhenDone is the callback parameter
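A sketch of the "call a function when done" pattern, using an inline AsyncCallback (the Work method is a placeholder; .NET Framework only):

```csharp
using System;
using System.Threading;

class Demo
{
    static void Work() { /* long-running operation */ }

    static void Main()
    {
        ThreadStart del = Work;

        // The callback runs on a ThreadPool thread once Work() completes
        del.BeginInvoke(ar =>
        {
            del.EndInvoke(ar); // always pair EndInvoke with BeginInvoke
            Console.WriteLine("done");
        }, null);

        Console.ReadLine(); // keep the process alive for the callback
    }
}
```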

Page 16: .Net Multithreading and Parallelization

WaitAny and WaitAll

To wait until either delegate completes

– WaitHandle.WaitAny(new WaitHandle[] { ar1.AsyncWaitHandle, ar2.AsyncWaitHandle }); // returns when either completes

To wait until all delegates complete

Use WaitHandle.WaitAll instead of WaitAny

WaitAll requires an MTA thread ([MTAThread]); on an STA thread, use Monitor.Wait & Pulse instead

Page 17: .Net Multithreading and Parallelization

Example

Execute a() and b() in parallel; wait on both

ThreadStart delA = a;

ThreadStart delB = b;

IAsyncResult arA = delA.BeginInvoke(null, null);

IAsyncResult arB = delB.BeginInvoke(null, null);

WaitHandle.WaitAll(new [] { arA.AsyncWaitHandle, arB.AsyncWaitHandle });

Page 18: .Net Multithreading and Parallelization

LINQ Example

Execute a() and b() in parallel; wait on both

WaitHandle.WaitAll(new ThreadStart[] { a, b } // make an array of delegates
    .Select(f => f.BeginInvoke(null, null)    // call each delegate
                  .AsyncWaitHandle)           // get a wait handle for each
    .ToArray());                              // convert from IEnumerable to an array

Page 19: .Net Multithreading and Parallelization

Asynchronous Programming Model (APM)

Basic goal

– IAsyncResult ar = del.BeginXXX(null, null);

…del.EndXXX(ar);

Supported by Framework classes, e.g.,

– FileStream

– WebRequest
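The same Begin/End pattern on a Framework class, sketched with FileStream ("data.bin" is a placeholder path; the final constructor argument requests overlapped, asynchronous I/O):

```csharp
using System;
using System.IO;

class Demo
{
    static void Main()
    {
        byte[] buffer = new byte[4096];
        using (FileStream fs = new FileStream("data.bin", FileMode.Open,
            FileAccess.Read, FileShare.Read, 4096, true))
        {
            // BeginRead starts the read and returns immediately
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);

            // ... other work ...

            // EndRead blocks until the read finishes, returning the byte count
            int read = fs.EndRead(ar);
            Console.WriteLine("Read {0} bytes", read);
        }
    }
}
```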

Page 20: .Net Multithreading and Parallelization

Difficulties

Async calls do not always succeed

Timeout

Exceptions

Cancellation

Results in too many functions/anonymous delegates

Async workflow code becomes difficult to read

Page 21: .Net Multithreading and Parallelization

PowerThreading

A free library from Wintellect (Jeffrey Richter)

Get it at wintellect.com

Also check out PowerCollections

Resource locks

ReaderWriterGate, SyncGate

Async. prog. model

AsyncEnumerator

Other features

I/O, state manager, NumaInformation :)

Page 22: .Net Multithreading and Parallelization

AsyncEnumerator

Simplifies APM programming

No need to manually manage IAsyncResult cookies

Fewer functions, cleaner code

Page 23: .Net Multithreading and Parallelization

Usage patterns

1 async op → process

X async ops → process all

X async ops → process each one as it completes

X async ops → process some, discard the rest

X async ops → process some until cancellation/timeout occurs, discard the rest

Page 24: .Net Multithreading and Parallelization

AsyncEnumerator Basics

Has three methods

Execute(IEnumerator<Int32>)

BeginExecute

EndExecute

Also exists as AsyncEnumerator<T> when a return value is required

Page 25: .Net Multithreading and Parallelization

Inside the Function

internal IEnumerator<Int32> GetFile(

AsyncEnumerator ae, string uri)

{

WebRequest wr = WebRequest.Create(uri);

wr.BeginGetResponse(ae.End(), null);

yield return 1;

WebResponse resp = wr.EndGetResponse(

ae.DequeueAsyncResult());

// use response

}

Page 26: .Net Multithreading and Parallelization

Signature

internal IEnumerator<Int32> GetFile(

AsyncEnumerator ae, string uri)

{

WebRequest wr = WebRequest.Create(uri);

wr.BeginGetResponse(ae.End(), null);

yield return 1;

WebResponse resp = wr.EndGetResponse(

ae.DequeueAsyncResult());

// use response

}

Function must return IEnumerator<Int32>

Function must accept AsyncEnumerator as one of the parameters (order unimportant)

Page 27: .Net Multithreading and Parallelization

Callback

internal IEnumerator<Int32> GetFile(

AsyncEnumerator ae, string uri)

{

WebRequest wr = WebRequest.Create(uri);

wr.BeginGetResponse(ae.End(), null);

yield return 1;

WebResponse resp = wr.EndGetResponse(

ae.DequeueAsyncResult());

// use response

}

Call the async BeginXXX() methods

Pass ae.End() as callback parameter

Page 28: .Net Multithreading and Parallelization

Yield

internal IEnumerator<Int32> GetFile(

AsyncEnumerator ae, string uri)

{

WebRequest wr = WebRequest.Create(uri);

wr.BeginGetResponse(ae.End(), null);

yield return 1;

WebResponse resp = wr.EndGetResponse(

ae.DequeueAsyncResult());

// use response

}

Now yield return the number of pending asynchronous operations

Page 29: .Net Multithreading and Parallelization

Wait & Process

internal IEnumerator<Int32> GetFile(

AsyncEnumerator ae, string uri)

{

WebRequest wr = WebRequest.Create(uri);

wr.BeginGetResponse(ae.End(), null);

yield return 1;

WebResponse resp = wr.EndGetResponse(

ae.DequeueAsyncResult());

// use response

}

Call the async EndXXX() methods

Pass ae.DequeueAsyncResult() as parameter

Page 30: .Net Multithreading and Parallelization

Usage

Init the enumerator

– var ae = new AsyncEnumerator();

Use it, passing itself as a parameter

– ae.Execute(GetFile(ae, "http://nesteruk.org"));

Page 31: .Net Multithreading and Parallelization

Exception Handling

Break out of function

– try {
    resp = wr.EndGetResponse(ae.DequeueAsyncResult());
  } catch (WebException e) {
    // process e
    yield break;
  }

Propagate a parameter

Page 32: .Net Multithreading and Parallelization

Discard Groups

Sometimes, you want to ignore the result of some calls

E.g., you already got the data elsewhere

To discard a group of calls

Use overloaded End(…) methods to specify

Group number

Cleanup delegate

Call DiscardGroup(…) with group number

Page 33: .Net Multithreading and Parallelization

Cancellation

External code can cancel the iterator

– ae.Cancel(…)

Or specify a timeout

– ae.SetCancelTimeout(…)

Check whether iterator is cancelled with

– ae.IsCanceled(…)

If it is, just call yield break

Page 34: .Net Multithreading and Parallelization

Parallel Extensions to .NET Framework: TPL and PLINQ

Part II

Page 35: .Net Multithreading and Parallelization

Parallelization

Algorithms vary in how well they parallelize

Some parallelize easily (e.g., matrix multiplication)

Some not so easily (e.g., matrix inversion)

Some not at all

Choose your algorithms with an eye to how you can parallelize them

Page 36: .Net Multithreading and Parallelization

Parallel Extensions to .NET Framework (PFX)

A library for parallelization

Consists of

Task Parallel Library

Parallel LINQ (PLINQ)

Currently in CTP stage

Maybe in .NET 4.0?

Page 37: .Net Multithreading and Parallelization

Task Parallel Library Features

System.Linq

Parallel LINQ

System.Threading

Implicit parallelism (Parallel.Xxx)

System.Threading.Collections

Thread-safe stack and queue

System.Threading.Tasks

Task manager, tasks, futures

Page 38: .Net Multithreading and Parallelization

System.Threading

Implicit parallelization: Parallel.For | ForEach

Aggregate exceptions: AggregateException

Other useful classes: LazyInit<T>, WriteOnce<T>, other goodies

Page 39: .Net Multithreading and Parallelization

Parallel.For

Parallelizes a for loop

Instead of

for (int i = 0; i < 10; ++i) { … }

We write

Parallel.For(0, 10, i => { … });
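A concrete sketch of the loop above (written against the CTP, where Parallel lives in System.Threading; each index is processed exactly once, so writing to a distinct array slot needs no locking):

```csharp
using System;
using System.Threading; // Parallel lived here in the CTP

class Demo
{
    static void Main()
    {
        double[] results = new double[10];

        // iterations run concurrently across cores
        Parallel.For(0, 10, i =>
        {
            results[i] = Math.Sqrt(i);
        });

        Console.WriteLine(results[9]);
    }
}
```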

Page 40: .Net Multithreading and Parallelization

Parallel.For Overloads

Step size

ParallelState for cancellation

Thread-local initialization

Thread-local finalization

References to a TaskManager

Task creation options

Page 41: .Net Multithreading and Parallelization

Parallel.ForEach

Same features as Parallel.For except

No counters or steps

Takes an IEnumerable<T>
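A sketch of Parallel.ForEach over a collection (the item list is illustrative; the body must be safe to run concurrently for different items):

```csharp
using System;
using System.Collections.Generic;
using System.Threading; // CTP namespace

class Demo
{
    static void Main()
    {
        IEnumerable<string> urls = new[] { "a", "b", "c" };

        // the source is partitioned among worker threads;
        // there is no counter or step, just one callback per item
        Parallel.ForEach(urls, url =>
        {
            Console.WriteLine("processing " + url);
        });
    }
}
```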

Page 42: .Net Multithreading and Parallelization

Cancellation

Parallel.For takes an Action<Int32> delegate

Can also take an Action<Int32, ParallelState>

ParallelState keeps track of the state of parallel execution

ParallelState.Stop() stops execution in all threads
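A sketch using the CTP-era ParallelState overload described above (Found is a hypothetical predicate; in the released .NET 4.0 API this type became ParallelLoopState):

```csharp
using System;
using System.Threading; // Parallel and ParallelState in the CTP

class Demo
{
    // hypothetical predicate standing in for a real search condition
    static bool Found(int i) { return i == 500; }

    static void Main()
    {
        Parallel.For(0, 1000, (int i, ParallelState state) =>
        {
            if (Found(i))
            {
                state.Stop(); // ask all threads to stop as soon as possible
                return;
            }
            // iterations already in flight may still finish after Stop()
        });
        Console.WriteLine("search finished");
    }
}
```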

Page 43: .Net Multithreading and Parallelization

Parallel.For Exceptions

The AggregateException class holds all exceptions thrown

Created even if only one thread throws

Used by both Parallel.Xxx and PLINQ

Original exceptions stored in InnerExceptions property.
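A sketch of catching the aggregate (one iteration throws; every exception raised by any iteration surfaces in InnerExceptions):

```csharp
using System;
using System.Threading; // CTP namespace

class Demo
{
    static void Main()
    {
        try
        {
            Parallel.For(0, 10, i =>
            {
                if (i == 3) throw new InvalidOperationException("bad: " + i);
            });
        }
        catch (AggregateException ae)
        {
            // walk the original exceptions from all threads
            foreach (Exception inner in ae.InnerExceptions)
                Console.WriteLine(inner.Message);
        }
    }
}
```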

Page 44: .Net Multithreading and Parallelization

LazyInit<T>

Lazy initialization of a single variable

Options

– AllowMultipleExecution: the init function may be called by many threads, but only one value is published

– EnsureSingleExecution: the init function is executed only once

– ThreadLocal: one init call & one value per thread
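A sketch against the CTP API as the slide describes it (the constructor shape and LazyInitMode names follow the option list above; in released .NET this evolved into Lazy<T>):

```csharp
using System;
using System.Threading; // LazyInit<T> lived here in the CTP

class Demo
{
    static int ExpensiveComputation() { return 42; }

    // value computed on first access; exactly one execution guaranteed
    static LazyInit<int> answer =
        new LazyInit<int>(() => ExpensiveComputation(),
                          LazyInitMode.EnsureSingleExecution);

    static void Main()
    {
        // first access runs the init function; later accesses return the cached value
        Console.WriteLine(answer.Value);
    }
}
```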

Page 45: .Net Multithreading and Parallelization

WriteOnce<T>

Single-assignment structure

Just like Nullable<T>:

HasValue

Value

Also has Try methods

TryGetValue

TrySetValue
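A sketch of single assignment with the members listed above (CTP API; only the first TrySetValue succeeds):

```csharp
using System;
using System.Threading; // WriteOnce<T> lived here in the CTP

class Demo
{
    static void Main()
    {
        WriteOnce<int> answer = new WriteOnce<int>();

        bool first  = answer.TrySetValue(42); // true: value is now set
        bool second = answer.TrySetValue(43); // false: already written, 42 is kept

        if (answer.HasValue)
            Console.WriteLine(answer.Value);
    }
}
```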

Page 46: .Net Multithreading and Parallelization

Futures

A future is the name of a value that will eventually be produced by a computation

Thus, we can decide what to do with the value before we know it

Page 47: .Net Multithreading and Parallelization

Futures of T

Future is a factory class

Future<T> is the actual future (and also has factory methods)

To make a future

– var f = Future.Create(() => g());

To use a future

Get f.Value

The Value accessor waits for (or performs) the computation
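A sketch of the pattern above (ExpensiveComputation is illustrative; in the released .NET 4.0 API Future<T> became Task<TResult>):

```csharp
using System;
using System.Threading.Tasks; // Future/Future<T> in the CTP

class Demo
{
    static int ExpensiveComputation() { return 6 * 7; }

    static void Main()
    {
        // create the future: the computation may start in the background
        Future<int> f = Future.Create(() => ExpensiveComputation());

        // ... decide what to do with the value before we know it ...

        // reading Value blocks until the result is available
        Console.WriteLine(f.Value);
    }
}
```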

Page 49: .Net Multithreading and Parallelization

Task

Just like a future, a task takes an Action<T>

– Task t = Task.Create(DoSomeWork);

Overloads exist :)

Fires off immediately. To wait on completion

– t.Wait();

Unlike the thread pool, the task manager uses as many threads as there are cores
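A sketch per the CTP API shown on the slide (DoSomeWork is illustrative; in released .NET, Task.Create became Task.Factory.StartNew):

```csharp
using System;
using System.Threading.Tasks;

class Demo
{
    static void DoSomeWork() { Console.WriteLine("working"); }

    static void Main()
    {
        // Task.Create queues the work on the task manager immediately
        Task t = Task.Create(delegate { DoSomeWork(); });

        // block this thread until the task completes
        t.Wait();
    }
}
```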

Page 50: .Net Multithreading and Parallelization

Parallel LINQ (PLINQ)

Parallel evaluation in

LINQ to Objects

LINQ to XML

Features

IParallelEnumerable<T>

ParallelEnumerable.AsParallel static method

Page 51: .Net Multithreading and Parallelization

Example

IEnumerable<T> data = ...;
var q = data.AsParallel()
            .Where(x => p(x))
            .OrderBy(x => k(x))
            .Select(x => f(x));

foreach (var e in q)
    a(e);

Page 52: .Net Multithreading and Parallelization

Interprocess communication with PureMPI.NET

Part III

Page 53: .Net Multithreading and Parallelization

Message Passing Interface

An API for general-purpose IPC

Works across cores & machines

C++ and Fortran

Some Intel libraries support it explicitly

http://www.mcs.anl.gov/research/projects/mpich2/

Page 54: .Net Multithreading and Parallelization

PureMPI.NET

A free library available at http://purempi.net

Uses WCF endpoints for communication

Uses MPI syntax

Features

A library DLL for WCF functionality

An EXE for easy deployment over network

Page 55: .Net Multithreading and Parallelization

How it works

Your computers run a service that connects them together

Your program exposes WCF endpoints

You use the MPI interfaces to communicate

Page 56: .Net Multithreading and Parallelization

Communicator & Rank

A communicator is a group of computers

In most scenarios, you would have one group: MPI_COMM_WORLD, obtained as the comm

Each process has a rank within the communicator (comm.Rank)

Useful for determining each process's role (e.g., sender vs. receiver)

Page 57: .Net Multithreading and Parallelization

Main

static void Main(string[] args)
{
    using (ProcessorGroup processors = new ProcessorGroup("MPIEnvironment", MpiProcess))
    {
        processors.Start();
        processors.WaitForCompletion();
    }
}

"MPIEnvironment" names a configuration section in app.config

MpiProcess is the method run on all machines

Start() starts each process; WaitForCompletion() waits on all of them

Page 58: .Net Multithreading and Parallelization

Sending & Receiving

Blocking or non-blocking methods

Send/Receive (blocking)

Begin|End Send/Receive (async)

Invoked on the comm

Page 59: .Net Multithreading and Parallelization

Send/Receive

static void MpiProcess(IDictionary<string, Comm> comms)

{

Comm comm = comms["MPI_COMM_WORLD"];

if (comm.Rank == 0)

{

string msg = comm.Receive<string>(1, string.Empty);

Console.WriteLine("Got " + msg);

}

else if (comm.Rank == 1)

{

comm.Send(0, string.Empty, "Hello");

}

}

Get a message from 1 (blocking)

Send a message to 0 (also blocking)

Get a default comm from dictionary

Page 60: .Net Multithreading and Parallelization

Extras

Can use async ops

Can send to all (Broadcast)

Can distribute work and then collect it (Gather/Scatter)

Page 61: .Net Multithreading and Parallelization

Thank You!