Moscow-Bavarian Joint Advanced Student School, 19-29 March 2006, Moscow, Russia
Introduction to CBEA SDK
Veselin Dikov
Outline
● Getting started
● SPU Language Extensions
● SPE Library
● SDK Libraries
● Remote Procedure Calls
Getting started
● Two executable formats: ppu vs. spu
    Two different compilers: ppu-gcc and spu-gcc
    Check the format of an executable:
        # file <executable>
● Makefile headers
    Reside in the SDK root directory:
        make.header
        make.footer
        make.env
Getting started
● Makefile - ppu

    # Subdirectories ###############
    DIRS := spu

    # Target #######################
    PROGRAM_ppu := simple

    # Local Defines ################
    # imports the embedded simple_spu library;
    # allows consolidation of the spu program into the ppe binary
    IMPORTS := spu/lib_simple_spu.a \
               -lspe

    # make.footer ##################
    # make.footer is in the top of the SDK
    include ../make.footer

● Makefile - spu

    # Target #######################
    PROGRAMS_spu := simple_spu
    # created embedded library
    LIBRARY_embed := lib_simple_spu.a

    # Local Defines ################
    IMPORTS = $(SDKLIB_spu)/libc.a

    # make.footer ##################
    # make.footer is in the top of the SDK
    include ../../make.footer
SPU Language Extensions
● Provides SPU functionality
● Extended RISC instruction set
● Extended data types: vector
● C/C++ intrinsic routines
● Memory Flow Control (MFC)
SIMD Vectorization
● vector - reserved as a C/C++ keyword
● 128 bits in size
    16 bytes, or four 32-bit words, etc.
● A single instruction acts on the entire vector

    vector unsigned int vec = (vector unsigned int)(1, 2, 3, 4);
    vector unsigned int v_ones = (vector unsigned int)(1);
    vector unsigned int vdest = spu_add(vec, v_ones);
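The lane-wise behaviour of spu_add can be tried on an ordinary host without the SPU toolchain. The sketch below is an assumption-laden stand-in: it uses the GCC/Clang generic vector extension (vector_size attribute) rather than the SPU intrinsics, and the names v4u, host_add and add_ones_demo are hypothetical.

```c
#include <stddef.h>

/* Host-side stand-in for the SPU "vector unsigned int" type:
 * GCC/Clang generic vector extension, 4 lanes x 32 bits = 16 bytes.
 * This is NOT the SPU type, only an analogue for experimentation. */
typedef unsigned int v4u __attribute__((vector_size(16)));

/* Lane-wise addition, analogous in spirit to spu_add(a, b). */
static v4u host_add(v4u a, v4u b)
{
    return a + b;
}

/* Adds 1 to every lane of {1, 2, 3, 4} and copies the lanes to out[0..3]. */
static void add_ones_demo(unsigned int out[4])
{
    v4u vec = {1, 2, 3, 4};
    v4u v_ones = {1, 1, 1, 1};
    v4u vdest = host_add(vec, v_ones);

    for (size_t i = 0; i < 4; i++)
        out[i] = vdest[i];
}
```

Calling add_ones_demo(out) fills out with the four sums; one addition expression covers all four lanes, which is the point of the vector type.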
SIMD Vectorization
● Overridden C/C++ operators
    sizeof() (= 16 for any vector)
    operator=
    operator&
● Various intrinsic functions for vectors
    Arithmetic – spu_add, spu_mul, etc.
    Logical – spu_and, spu_nor, etc.
    Comparison – spu_cmpeq
    Scalar – spu_extract, spu_insert, spu_promote
    Etc.
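The scalar and comparison categories above can be imitated on a host compiler. This is a rough sketch under the assumption that GCC/Clang generic vectors are an acceptable analogue; host_extract, host_insert and host_cmpeq are hypothetical names, not SDK functions.

```c
/* Host-side analogue of "vector unsigned int" (GCC/Clang extension). */
typedef unsigned int v4u __attribute__((vector_size(16)));

/* In the spirit of spu_extract: read one scalar element of a vector. */
static unsigned int host_extract(v4u v, int lane)
{
    return v[lane & 3];
}

/* In the spirit of spu_insert: return a copy of v with one element replaced. */
static v4u host_insert(unsigned int s, v4u v, int lane)
{
    v[lane & 3] = s;
    return v;
}

/* In the spirit of spu_cmpeq: each result lane is all ones where the
 * inputs are equal and zero where they differ. */
static v4u host_cmpeq(v4u a, v4u b)
{
    return (v4u)(a == b);
}
```

The comparison yields a per-lane mask rather than a single boolean, so the result can feed later lane-wise logical operations.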
MFC routines
● Local Store (LS) vs. Effective Address (EA)
● Data transport via DMA
    Up to 16,384 bytes per DMA transfer
    Asynchronous
    Part of the instruction set, available as C/C++ intrinsics

    mfc_get(&data, addr + 16384*i, 16384, 20, 0, 0);
    /* 1st argument: LS address; 2nd argument: EA address */

● What is __attribute__ ((aligned (128)))?
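Because one transfer moves at most 16,384 bytes, a larger buffer has to be split into several transfers, as the addr + 16384*i expression suggests. The following is a minimal host-side sketch of that splitting; the callback type dma_get_fn and the memcpy-based host_get are hypothetical stand-ins for an mfc_get-style call, and the alignment and size rules of real DMA are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DMA_MAX_BYTES 16384u  /* per-transfer limit quoted on the slide */

/* Hypothetical stand-in for an mfc_get-style call: copy `size` bytes
 * from "effective address" ea into "local store" buffer ls. */
typedef void (*dma_get_fn)(void *ls, const void *ea, size_t size);

/* Host emulation of a transfer: a plain memcpy. */
static void host_get(void *ls, const void *ea, size_t size)
{
    memcpy(ls, ea, size);
}

/* Number of transfers needed to move `total` bytes. */
static size_t dma_chunk_count(size_t total)
{
    return (total + DMA_MAX_BYTES - 1) / DMA_MAX_BYTES;
}

/* Split one large copy into transfers of at most DMA_MAX_BYTES each.
 * Returns the number of transfers issued. */
static size_t dma_get_chunked(void *ls, const void *ea, size_t total,
                              dma_get_fn get)
{
    size_t issued = 0;

    assert(get != NULL);
    for (size_t off = 0; off < total; off += DMA_MAX_BYTES) {
        size_t left = total - off;
        size_t n = left < DMA_MAX_BYTES ? left : DMA_MAX_BYTES;

        get((char *)ls + off, (const char *)ea + off, n);
        issued++;
    }
    return issued;
}
```

On the SPU the callback would be replaced by the real intrinsic plus a tag wait; the loop structure is the part this sketch is meant to show.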
MFC example
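● ppu main file (manual alignment)

    int main() {
        …
        /* the array needs to be aligned on a 128-byte cache line */
        data = (int *) malloc(127 + DATASIZE*sizeof(int));
        /* cast through unsigned long so the pointer is not truncated on 64-bit builds */
        while (((unsigned long) data) & 0x7f) ++data;
        …
        spe_create_thread(gid, &sample_spu, data, NULL, -1, 0);
        …
    }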
MFC example
● ppu main file (using malloc_align)

    int main() {
        …
        /* the array needs to be aligned on a 128-byte cache line */
        /* second argument is log2 of the alignment: 7 = log2(128) */
        data = (int *) malloc_align(DATASIZE*sizeof(int), 7);
        …
        spe_create_thread(gid, &sample_spu, data, NULL, -1, 0);
        …
    }
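The 128-byte alignment requirement can be explored on any host. The sketch below assumes a C11 library providing aligned_alloc and is not the SDK's malloc_align; align_up and alloc_ints_aligned are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 128u  /* alignment required by the slides' examples */

/* Round a value up to the next multiple of `align` (a power of two). */
static uintptr_t align_up(uintptr_t value, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Allocate room for `count` ints on a 128-byte boundary.
 * The size is rounded up because C11 aligned_alloc expects a multiple
 * of the alignment. Returns NULL on failure; release with free(). */
static int *alloc_ints_aligned(size_t count)
{
    size_t bytes = (size_t)align_up(count * sizeof(int), CACHE_LINE);

    return (int *)aligned_alloc(CACHE_LINE, bytes);
}
```

Unlike the pointer-bumping loop, this keeps the pointer returned by the allocator, so the same pointer can later be passed to free().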
MFC example
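● spu main file

    int databuf[DATASIZE] __attribute__ ((aligned (128)));

    int main(void *arg) {
        int *data = &databuf[0];
        …
        /* start the DMA transfer into local store, tagged with tag group 20 */
        mfc_get(data, arg, DATASIZE*sizeof(int), 20, 0, 0);
        /* select tag group 20 and wait until all of its transfers complete */
        mfc_write_tag_mask(1<<20);
        mfc_read_tag_status_all();
        …
    }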
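The argument 1<<20 is a bit mask with one bit per DMA tag group, so the wait covers exactly the transfers issued with tag 20. A small host-side sketch of building such masks follows; tag_mask and tag_mask_pair are hypothetical helpers, and the limit of 32 tag groups is stated here as an assumption about the MFC.

```c
#include <assert.h>
#include <stdint.h>

#define MFC_TAG_COUNT 32u  /* assumed: tag-group IDs run from 0 to 31 */

/* Mask selecting a single tag group, e.g. tag 20 -> 1 << 20. */
static uint32_t tag_mask(unsigned tag)
{
    assert(tag < MFC_TAG_COUNT);
    return (uint32_t)1 << tag;
}

/* Mask selecting two tag groups at once, to wait on both together. */
static uint32_t tag_mask_pair(unsigned a, unsigned b)
{
    return tag_mask(a) | tag_mask(b);
}
```

Using distinct tags for independent buffers lets a program wait on one buffer while transfers for another are still in flight.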
SPE Library
● Provides PPU functionality
● Two sets of functions
    Thread management – POSIX-like
    MFC access functions – access to mailboxes
● Header: <libspe.h>
Thread management
● Threads are organized in thread groups
    Thread Group 0: Th0 Th1 Th2 ThN
    Thread Group 1: Th0 Th1 Th2 ThN
● Thread scheduling:
    SCHED_RR
    SCHED_FIFO
    SCHED_OTHER
● Priority:
    RR and FIFO – 1 to 99
    OTHER – 0 to 99
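The priority ranges above can be written down as a small check to run before creating a group. This is only a sketch: spe_priority_in_range is a hypothetical helper, and the SCHED_* constants are taken from the host's <sched.h>.

```c
#include <sched.h>
#include <stdbool.h>

/* Returns true if `priority` lies in the range the slide gives for
 * the scheduling policy: 1..99 for RR and FIFO, 0..99 for OTHER. */
static bool spe_priority_in_range(int policy, int priority)
{
    switch (policy) {
    case SCHED_RR:
    case SCHED_FIFO:
        return priority >= 1 && priority <= 99;
    case SCHED_OTHER:
        return priority >= 0 && priority <= 99;
    default:
        return false;  /* unknown policy */
    }
}
```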
Thread management
● Threads are organized in thread groups

    spe_gid_t gid;        // group handle
    speid_t speids[1];    // thread handle
    int status[1];        // exit status of the thread

    // Create an SPE group
    gid = spe_create_group(SCHED_OTHER, 0, 1);
    if (gid == NULL) exit(-1);   // failed

    // allocate the SPE task
    speids[0] = spe_create_thread(gid, &sample_spu, 0, 0, -1, 0);
    if (speids[0] == NULL) exit(-1);

    // wait for the single SPE to complete
    spe_wait(speids[0], &status[0], 0);
Thread management
● spe_get_affinity, spe_set_affinity
    8-bit mask specifying the SPE units on which the thread may be started
● spe_get_ls
    Gets access to the SPU's local store
    Dirty! Not supported by all operating systems
    Used for RPC communication
● spe_get_context, spe_set_context, spe_get_ps
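An 8-bit affinity mask can be built with ordinary bit operations. The sketch below assumes that bit i corresponds to SPE i, which the slides do not state; affinity_allow and affinity_allows are hypothetical helpers, not libspe functions.

```c
#include <assert.h>
#include <stdint.h>

#define SPE_COUNT 8u  /* one mask bit per SPE unit */

/* Return `mask` with the bit for SPE number `spe` set.
 * Assumption: bit i stands for SPE i. */
static uint8_t affinity_allow(uint8_t mask, unsigned spe)
{
    assert(spe < SPE_COUNT);
    return (uint8_t)(mask | (1u << spe));
}

/* Nonzero if the mask permits starting the thread on SPE `spe`. */
static int affinity_allows(uint8_t mask, unsigned spe)
{
    assert(spe < SPE_COUNT);
    return (mask >> spe) & 1u;
}
```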
Thread management
● Two ways of communicating with threads
    via signals
    via mailboxes
● POSIX-like signals
    spe_get_event – check for signals from a thread
    spe_kill – send a signal to a thread
    spe_write_signal – write a signal through the MFC
● MFC mailboxes
    spe_read_out_mbox, spe_write_in_mbox
    spe_stat_in_mbox, spe_stat_out_mbox, spe_stat_out_intr_mbox
SDK Libraries
● The SDK comes with various applied libraries

Library name                | Short description                                                                                        | PPE | SPE
C Library                   | standard C99 functionality; POSIX.1 functions                                                            | x   | x
Audio Resample Library      | audio resampling functionality for PPE and SPE                                                           | x   | x
Curves and Surfaces Library | quadratic and cubic Bezier curves; biquadric and bicubic Bezier surfaces; curved point-normal triangles  | x   | x
FFT Library                 | 1-D FFT and kernel functions for 2-D FFT                                                                 | x   | x
Game Math Library           | math routines applicable to game performance needs                                                       | x   | x
Image Library               | routines for processing images: convolutions and histograms                                              | x   | x
Large Matrix Library        | basic linear algebra routines on large vectors and matrices                                              |     | x
Math Library                | general purpose math routines tuned to exploit SIMD                                                      | x   | x
SDK Libraries
● The SDK comes with various applied libraries

Library name                 | Short description                                                                   | PPE | SPE
Matrix Library               | routines for operations on 4x4 matrices and quaternions                             | x   | x
Misc Library                 | set of general purpose routines that don't logically fit within any other library   | x   | x
Multi-Precision Math Library | operations on unsigned integer numbers with a large number of bits                  |     | x
Noise Library                | 1-, 2-, 3-, 4-D noise; lattice and non-lattice noise; turbulence                    | x   | x
Oscillator Libraries         | definition of sound sources                                                         | x   | x
Simulation Library           | functionality related to the Full-Simulator                                         | -   | -
Sync Library                 | synchronization primitives, such as atomic operations and mutexes                   | x   | x
Vector Library               | a set of general purpose routines that operate on vectors                           | x   | x
Example
● Solving y = A^-1 * x

    typedef union {
        vector float fv;
        vector unsigned int uiv;
        unsigned int ui[4];
        float f[4];
    } v128;
    typedef vector float floatvec;
    …
    v128 A[4] = {(floatvec)(0.9501,0.8913,0.8214,0.9218),
                 (floatvec)(0.2311,0.7621,0.4447,0.7382),
                 (floatvec)(0.6068,0.4565,0.6154,0.1763),
                 (floatvec)(0.4860,0.0185,0.7919,0.4057)};
    v128 invA[4];
    v128 x, y;
    x.fv = (floatvec)(1, 5, 6, 7);
    y.fv = (floatvec)(0);

    /* from the Matrix library */
    inverse_matrix4x4(&invA[0].fv, &A[0].fv);

    /* from the LargeMatrix library; solves y = A*x + y,
       here with A replaced by invA and y initially zero */
    madd_vector_matrix(4, 4, &invA[0].fv, 4, (float *)&x, (float *)&y);
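The update y = A*x + y can be written as a plain scalar loop for checking results on a host. This is a reference sketch only, assuming a row-major matrix layout; madd_ref is a hypothetical name and is not the SDK's madd_vector_matrix, whose argument conventions may differ.

```c
#include <stddef.h>

/* Reference version of y = A*x + y for a rows x cols matrix A stored
 * row-major (an assumption of this sketch), with x of length cols
 * and y of length rows. */
static void madd_ref(size_t rows, size_t cols,
                     const float *A, const float *x, float *y)
{
    for (size_t i = 0; i < rows; i++) {
        float acc = y[i];  /* start from the existing y[i] */

        for (size_t j = 0; j < cols; j++)
            acc += A[i * cols + j] * x[j];
        y[i] = acc;
    }
}
```

With y initialised to zero, as in the slide, the call reduces to a plain matrix-vector product.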
Remote Procedure Calls
● Function-Offload Model
    SPE threads act as services
    The PPE communicates with them through RPC calls
● Remote Procedure Calls
    stubs
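The role of a stub in the function-offload model can be illustrated in a single host process. The sketch below is purely conceptual: the request layout, the function IDs and all names (rpc_request, rpc_dispatch, do_add_stub, svc_add, svc_sub) are hypothetical, and no real PPE-to-SPE transport is involved.

```c
#include <stddef.h>

/* Hypothetical function IDs shared by caller and service. */
enum { FN_ADD = 0, FN_SUB = 1, FN_COUNT };

/* Hypothetical request record, as a stub might marshal its arguments. */
typedef struct {
    int fn;        /* which service function to run */
    int n;         /* number of array elements */
    const int *a;  /* first input array */
    const int *b;  /* second input array */
    int *out;      /* result array */
} rpc_request;

/* Service-side implementations (these would run on the SPE). */
static void svc_add(const rpc_request *r)
{
    for (int i = 0; i < r->n; i++)
        r->out[i] = r->a[i] + r->b[i];
}

static void svc_sub(const rpc_request *r)
{
    for (int i = 0; i < r->n; i++)
        r->out[i] = r->a[i] - r->b[i];
}

typedef void (*svc_fn)(const rpc_request *);
static const svc_fn services[FN_COUNT] = { svc_add, svc_sub };

/* Service side: look up the requested function and run it.
 * Returns 0 on success, -1 for a missing request or unknown ID. */
static int rpc_dispatch(const rpc_request *r)
{
    if (r == NULL || r->fn < 0 || r->fn >= FN_COUNT)
        return -1;
    services[r->fn](r);
    return 0;
}

/* Caller-side stub: looks like an ordinary function call, but only
 * packs the arguments and hands the request to the service. */
static int do_add_stub(int n, const int *a, const int *b, int *out)
{
    rpc_request r = { FN_ADD, n, a, b, out };

    return rpc_dispatch(&r);
}
```

In the SDK this plumbing is generated rather than written by hand; the sketch only shows why the caller never needs to know where the function body executes.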
Remote Procedure Calls
● Preparing stubs
    Interface Description Language (IDL)
    idl files

    interface add
    {
        import "../stub.h";
        const int ARRAY_SIZE = 1000;
        [sync] idl_id_t do_inv ([in] int array_size,
                                [in, size_is(array_size)] int array_a[],
                                [out, size_is(array_size)] int array_res[]);
        …
    }

    idl compiler
        # idl -p ppe_sample.c -s spe_sample.c sample.idl
Example
● .idl file

    interface add
    {
        import "../stub.h";
        const int ARRAY_SIZE = 1000;

        [sync] idl_id_t do_add ([in] int array_size,
                                [in, size_is(array_size)] int array_a[],
                                [in, size_is(array_size)] int array_b[],
                                [out, size_is(array_size)] int array_c[]);

        [sync] idl_id_t do_sub ([in] int array_size,
                                [in, size_is(array_size)] int array_a[],
                                [in, size_is(array_size)] int array_b[],
                                [out, size_is(array_size)] int array_c[]);
    }
Example
● spe do_add.c file (user file)

    #include "../stub.h"

    idl_id_t do_add (int array_size, int array_a[], int array_b[], int array_c[])
    {
        int i;
        for (i = 0; i < array_size; i++) {
            array_c[i] = array_a[i] + array_b[i];
        }
        return 0;
    }

    idl_id_t do_sub (int array_size, int array_a[], int array_b[], int array_c[])
    {
        ...
        return 0;
    }
Example
● ppe main.c file (user file)

    #include "../stub.h"

    int array_a[ARRAY_SIZE] __attribute__ ((aligned(128)));
    int array_b[ARRAY_SIZE] __attribute__ ((aligned(128)));
    int array_add[ARRAY_SIZE] __attribute__ ((aligned(128)));

    int main()
    {
        ...
        do_add (ARRAY_SIZE, array_a, array_b, array_add);
        ...
    }
Conclusion
● Powerful SDK
    vector data type
    Extensive matrix routines
    Extensive signal-processing routines
● Game-industry driven
    But applicable to scientific work as well
Thank you for your attention!