IndicThreads-Pune12-Accelerating Computation in HTML 5
7/30/2019 IndicThreads-Pune12-Accelerating Computation in HTML 5
Accelerating Computation in HTML 5
Ashish Shah
SAS R&D India
-
Outline
Multicore Computing
Problem statement
Demo
Introduction to OpenCL and WebCL
Conclusion
References
-
Multicore Computing
-
Problem statement
Layout algorithm for node-link graphs
[Diagram labels: Algorithm, Layout]
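The slides do not name the layout algorithm. As a rough illustration of why graph layout suits parallel hardware, here is a minimal sketch that assumes a force-directed scheme (pairwise repulsion plus spring attraction along edges). The graph, the constants, and the function names `forceOn` and `layoutStep` are all hypothetical.

```javascript
// Hypothetical node-link graph; names and constants are illustrative only.
const nodes = [
  { x: 0, y: 0 },
  { x: 1, y: 0 },
  { x: 0, y: 2 },
];
const edges = [[0, 1], [1, 2]];

// Net force on node i: repulsion from every other node plus spring
// attraction along its edges. It reads all positions and writes none, so
// each node's force can be computed independently of the others.
function forceOn(i, nodes, edges, repulsion = 1.0, stiffness = 0.1) {
  let fx = 0, fy = 0;
  for (let j = 0; j < nodes.length; j++) {
    if (j === i) continue;
    const dx = nodes[i].x - nodes[j].x;
    const dy = nodes[i].y - nodes[j].y;
    const d2 = dx * dx + dy * dy + 1e-9; // avoid division by zero
    fx += (repulsion * dx) / d2;
    fy += (repulsion * dy) / d2;
  }
  for (const [a, b] of edges) {
    if (a !== i && b !== i) continue;
    const other = a === i ? b : a;
    fx += stiffness * (nodes[other].x - nodes[i].x);
    fy += stiffness * (nodes[other].y - nodes[i].y);
  }
  return { fx, fy };
}

// One layout step: compute all forces first, then move every node.
function layoutStep(nodes, edges, dt = 0.1) {
  const forces = nodes.map((_, i) => forceOn(i, nodes, edges));
  return nodes.map((n, i) => ({
    x: n.x + dt * forces[i].fx,
    y: n.y + dt * forces[i].fy,
  }));
}

const next = layoutStep(nodes, edges);
console.log(next);
```

Because `forceOn` for one node never depends on the result for another node in the same step, the per-node work could in principle be mapped to one work-item each.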
-
DEMO
Demo 1 - Serial version
Demo 2 - Parallel version with multi-core CPU
Demo 3 - Parallel version with many-core GPU
-
Performance analysis
[Chart: time in ms versus number of particles]
-
Introduction to OpenCL
Open Computing Language; kernels are written in a C-like language
Framework for writing parallel algorithms
Targets heterogeneous platforms (CPUs, GPUs)
Initially developed by Apple
An open standard maintained by the Khronos Group
-
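Example of adding two vectors
Serial version
for (int i = 0; i < n; i++)
    c[i] = a[i] + b[i];
Using OpenCL
// element type float is assumed; the slide leaves the parameters untyped
__kernel void add(__global const float *a, __global const float *b, __global float *c)
{
    int i = get_global_id(0); // global work-item (thread) id in dimension 0
    c[i] = a[i] + b[i];
}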
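To make the relationship between the loop and the kernel concrete, here is a minimal host-side emulation in plain JavaScript. It is only a sketch: `launch1D` and `addKernel` are hypothetical names, and the loop runs the work-items one after another, whereas an OpenCL device would run them in parallel.

```javascript
// Kernel body: computes one output element for the given work-item id.
function addKernel(globalId, a, b, c) {
  c[globalId] = a[globalId] + b[globalId];
}

// Host-side emulation of a 1-D launch: calls the kernel once per work-item.
function launch1D(kernel, globalSize, ...args) {
  for (let id = 0; id < globalSize; id++) kernel(id, ...args);
}

const a = Float32Array.from([1, 2, 3, 4]);
const b = Float32Array.from([10, 20, 30, 40]);
const c = new Float32Array(a.length);
launch1D(addKernel, a.length, a, b, c);
console.log(Array.from(c)); // [11, 22, 33, 44]
```

The loop that appears in the serial version moves out of the kernel and into the launch: the kernel only describes what a single index does.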
-
OpenCL - Platform
[Diagram: a host (Intel CPU) with compute devices (Compute Device 1 = GPU 1, GPU 2); each compute device is made up of compute units (cores)]
-
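OpenCL - Execution Model
1. Kernel
2. Work-items
3. Work-group
4. ND-range
5. Program
6. Memory objects
7. Command queues
// element type float is assumed; the slide leaves the parameters untyped
__kernel void add(__global const float *a, __global const float *b, __global float *c)
{
    int i = get_global_id(0); // global work-item (thread) id in dimension 0
    c[i] = a[i] + b[i];
}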
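How work-items, work-groups, and the ND-range relate can be sketched by listing the ids each work-item would see in a one-dimensional range. This is a plain JavaScript emulation under the assumption of a zero global offset; `enumerateNDRange` is a hypothetical helper, not an OpenCL or WebCL call.

```javascript
// Enumerate the ids that each work-item would see in a 1-D ND-range.
// globalSize must be a multiple of localSize, as OpenCL 1.x requires.
function enumerateNDRange(globalSize, localSize) {
  if (globalSize % localSize !== 0) {
    throw new Error("globalSize must be a multiple of localSize");
  }
  const items = [];
  const numGroups = globalSize / localSize;
  for (let groupId = 0; groupId < numGroups; groupId++) {
    for (let localId = 0; localId < localSize; localId++) {
      // Same value get_global_id(0) would return with a zero global offset.
      const globalId = groupId * localSize + localId;
      items.push({ globalId, groupId, localId });
    }
  }
  return items;
}

const items = enumerateNDRange(8, 4);
console.log(items[5]); // { globalId: 5, groupId: 1, localId: 1 }
```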
-
Memory Model in OpenCL
Compute device: compute unit 0, compute unit 1, compute unit 2
Global memory (DRAM)
Global constant memory (DRAM)
Local memory/cache (one per compute unit)
Private registers (one set shown per compute unit)
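One common way to use this hierarchy is to stage a slice of global memory in local memory, work on it there, and write back a single result per work-group. The sketch below emulates that pattern serially in plain JavaScript; `groupedSum` is a hypothetical name, and the typed arrays merely stand in for the memory regions.

```javascript
// Sum an array the way a work-group reduction is often organized:
// each work-group copies its slice of "global" memory into a small "local"
// scratch array, reduces it there, and writes one partial sum back.
function groupedSum(globalData, localSize) {
  const numGroups = Math.ceil(globalData.length / localSize);
  const partialSums = new Float64Array(numGroups); // one slot per work-group
  for (let g = 0; g < numGroups; g++) {
    const local = new Float64Array(localSize); // stands in for local memory
    for (let l = 0; l < localSize; l++) {
      const gid = g * localSize + l;
      local[l] = gid < globalData.length ? globalData[gid] : 0; // pad the tail
    }
    let acc = 0; // stands in for a private register
    for (let l = 0; l < localSize; l++) acc += local[l];
    partialSums[g] = acc;
  }
  // Final pass over the much smaller array of partial sums.
  return partialSums.reduce((s, v) => s + v, 0);
}

const data = Float32Array.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
console.log(groupedSum(data, 4)); // 55
```

On a real device the benefit would come from local memory being faster to access than global memory; the emulation only shows the data movement.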
-
Programming model
1. Data parallel: a single function applied to multiple data elements
2. Task parallel: multiple functions executed concurrently, on the same or different data
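The two models can be contrasted with a small plain JavaScript sketch. The helper names `dataParallel` and `taskParallel` are hypothetical, and `Promise.all` here only expresses that the tasks are independent: on a single JavaScript thread they are interleaved, not run simultaneously.

```javascript
// Data parallel: one function applied to every element; each call is independent.
function dataParallel(fn, data) {
  return data.map(fn);
}

// Task parallel: different functions scheduled as separate tasks.
function taskParallel(tasks, data) {
  return Promise.all(
    tasks.map((task) => Promise.resolve().then(() => task(data)))
  );
}

const data = [3, 1, 4, 1, 5];
const squares = dataParallel((x) => x * x, data);
console.log(squares); // [9, 1, 16, 1, 25]

const sum = (xs) => xs.reduce((s, x) => s + x, 0);
const max = (xs) => Math.max(...xs);
const taskResults = taskParallel([sum, max], data);
taskResults.then(([s, m]) => console.log(s, m)); // 14 5
```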
-
OpenCL Stack
Applications (HTML, Java, .NET, C, C++); kernels are passed as string data
OpenCL API bindings (Java, C, .NET, WebCL)
OpenCL Framework: compiler, context and memory APIs
OpenCL Runtime: command queues, buffer objects, kernel execution
Device driver
OpenCL device (GPU/CPU hardware)
-
Essential Development Tasks
1. Parallelize code (kernel)
2. Initialize OpenCL environment
3. Initialize kernels and data
4. Execute kernel
5. Read back data to host
Parallelize code: the kernel is C code with restrictions
-
Essential Development Tasks
Initialize OpenCL environment:
Query compute device
Create context
Compile kernels
-
Essential Development Tasks
Initialize kernels and data:
Create memory objects
Map data structures to OpenCL-supported data structures
Initialize kernel parameters
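Mapping application data to an OpenCL-supported layout usually means flattening objects into a typed array. The sketch below assumes the application holds 2-D points and that the kernel expects an array of `float2`; the `points` data and the `packPoints` helper are hypothetical.

```javascript
// Hypothetical application data: an array of 2-D points.
const points = [{ x: 1.5, y: 2 }, { x: -3, y: 4 }];

// Pack into one contiguous Float32Array, two floats per point, which matches
// the memory layout of an OpenCL float2 array.
function packPoints(points) {
  const flat = new Float32Array(points.length * 2);
  points.forEach((p, i) => {
    flat[2 * i] = p.x;
    flat[2 * i + 1] = p.y;
  });
  return flat;
}

const flat = packPoints(points);
const bufSize = flat.byteLength; // size in bytes to request for the buffer
console.log(Array.from(flat), bufSize); // [1.5, 2, -3, 4] 16
```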
-
Essential Development Tasks
Execute kernel:
Specify the number of threads (work-items) to execute the task
Trigger execution of the kernel, synchronously or asynchronously
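Choosing the number of work-items often involves rounding the problem size up to a multiple of the work-group size, since OpenCL 1.x expects the global size to divide evenly into work-groups. A minimal sketch, with `roundUpGlobalSize` as a hypothetical helper and 64 as an arbitrary work-group size:

```javascript
// Round the problem size up to the next multiple of the work-group size.
function roundUpGlobalSize(problemSize, localSize) {
  return Math.ceil(problemSize / localSize) * localSize;
}

const localSize = 64; // hypothetical work-group size
const globalSize = roundUpGlobalSize(1000, localSize);
console.log(globalSize); // 1024
```

When the global size is padded this way, the kernel needs a bounds check (for example `if (i < n)`) so that the extra work-items do not touch memory past the end of the data.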
-
Essential Development Tasks
Read back data to host:
Map to application data structure
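After the read-back, the flat result typically has to be turned into whatever structure the application uses. The sketch below assumes the device returned packed `float2` values and that the application wants point objects; `unpackPoints` and the sample data are hypothetical.

```javascript
// Convert a packed Float32Array (x0, y0, x1, y1, ...) into point objects.
function unpackPoints(flat) {
  const points = [];
  for (let i = 0; i + 1 < flat.length; i += 2) {
    points.push({ x: flat[i], y: flat[i + 1] });
  }
  return points;
}

// Stand-in for data read back from the device.
const result = Float32Array.from([0.5, 1, 2, -4]);
const pts = unpackPoints(result);
console.log(pts); // [ { x: 0.5, y: 1 }, { x: 2, y: -4 } ]
```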
-
Introduction to WebCL
JavaScript bindings for OpenCL
First announced by Khronos in March 2011
API definition is under way
A prototype plugin is available for the Firefox browser only
-
Binding OpenCL to WebCL
[Diagram: host application (JavaScript, on the CPU) calls WebCL, which binds to the OpenCL framework, which drives the OpenCL-compliant device]
-
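Coding with WebCL
// Select a platform and create a context on its CPU device
platforms = WebCL.getPlatformIDs();
context = WebCL.createContextFromType([WebCL.CL_CONTEXT_PLATFORM, platforms[0]],
                                      WebCL.CL_DEVICE_TYPE_CPU);
devices = context.getContextInfo(WebCL.CL_CONTEXT_DEVICES);
// Create the program from the kernel source and get a handle to one kernel
program = context.createProgramWithSource(kernelSrc);
// Compile for the device (assumed call from the prototype API; not in the original listing)
program.buildProgram([devices[0]], "");
kernelfunction1 = program.createKernel("function1");
// Create a device buffer and a command queue, then upload the input data
buffparam = context.createBuffer(WebCL.CL_MEM_READ_WRITE, bufSize);
cmdQueue = context.createCommandQueue(devices[0], 0);
cmdQueue.enqueueWriteBuffer(buffparam, true, 0, bufSize, parameter, []);
kernelfunction1.setKernelArg(0, buffparam, WebCL.types.float2);
// Arguments: kernel, work dimensions, global offset, global work size, local work size, wait list
cmdQueue.enqueueNDRangeKernel(kernelfunction1, 1, [], totalWorkitems, totalWorkgroups, []);
cmdQueue.finish();
// Read results back to the host (xyz and xyzParam are not defined in this excerpt)
cmdQueue.enqueueReadBuffer(xyz, true, 0, bufSize, xyzParam, []);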
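The listing above refers to `kernelSrc` and a kernel named `function1` without showing them. The sketch below gives one possible shape for the kernel source, together with a plain JavaScript fallback for browsers without the plugin. The kernel body (doubling each `float2` element) and the name `function1Fallback` are purely illustrative assumptions, not the demo's actual code.

```javascript
// Hypothetical kernel source for "function1": doubles each float2 element.
// The kernel body is an illustration, not the demo's actual kernel.
const kernelSrc = [
  "__kernel void function1(__global float2 *p) {",
  "    int i = get_global_id(0);",
  "    p[i] = p[i] * 2.0f;",
  "}",
].join("\n");

// Plain JavaScript fallback performing the same computation on packed data
// (x0, y0, x1, y1, ...).
function function1Fallback(packed) {
  for (let i = 0; i < packed.length; i++) packed[i] *= 2;
  return packed;
}

// Feature test: the listing above relies on a global WebCL object.
const hasWebCL = typeof WebCL !== "undefined";
const parameter = Float32Array.from([1, 2, 3, 4]);
if (!hasWebCL) function1Fallback(parameter);
console.log(hasWebCL, Array.from(parameter));
```

Passing the kernel as a string matches how OpenCL programs are created from source at run time; the fallback keeps the page usable when no WebCL implementation is present.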
-
Applications of OpenCL
Database mining
Neural networks
Physics-based simulation, mechanics
Image processing
Speech processing
Weather forecasting and climate research
Bioinformatics
-
Conclusion
Significant performance gains from using OpenCL for computations in client-side environments like HTML5
Algorithms need to be parallelizable
Further optimizations can be achieved by exploiting the memory model
-
Software/Hardware used in demo application
Hardware
Intel(R) Core(TM)2 Quad core CPU Q8400 @ 2.66 GHz
Nvidia Quadro NVS 160M, 8 cores @ 580 MHz
Software
OpenCL runtime for CPU
http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/
OpenCL runtime for GPU
http://www.nvidia.com/object/quadro_nvs_notebook.html
WebCL plugin for Firefox
http://webcl.nokiaresearch.com/
-
References
http://www.macresearch.org/opencl
http://en.wikipedia.org/wiki/GPGPU
http://www.khronos.org/webcl/