
Page 1

31/01/2020

Make Life Easier for Embedded Software Engineers Facing Complex Hardware Architectures

R. Leconte, E. Jenn, G. Bois, H. Guérard

IRT Saint Exupéry, Space Codesign, Thales AVS

Page 2

AGENDA

• SpaceStudio and deployment to hybrid HW/SW platform

• TOAST (Tool for OpenMP Annotations to Space design Translation) and its purpose

• OpenMP Offloading

• Example on a LineDetection algorithm

Page 3

• Workflow based on an existing tool, SpaceStudio, to expose parallelism

• Encapsulate logic inside SpaceModules

[Diagram: Sequential C code; SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA)]

Page 4

• Use OpenMP to expose parallelism

• Then TOAST generates SpaceModules

• Finally SpaceStudio deploys to the target platform

[Diagram: Sequential C code; OpenMP annotated C code (OpenMP Offloading); SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA)]

Page 5

• OpenMP allows computations to be offloaded to an accelerator (GPU, FPGA, etc.)

int a = 2;
int b = 12;
b = a + 2;
printf("b = %d\n", b);
// b = 4

[Diagram: Sequential C code; OpenMP annotated C code; SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA)]

Page 6

• OpenMP allows computations to be offloaded to an accelerator (GPU, FPGA, etc.)

int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
    b = a + 2;
}
printf("b = %d\n", b);
// b = 4

[Diagram: Sequential C code; OpenMP annotated C code; SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA)]
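The scalar example above extends to arrays. The following sketch is not taken from the slides: the function name `vector_add_offload` is an illustrative assumption. It shows `map(to: ...)` copying the inputs to the device and `map(from: ...)` copying the result back. If the file is compiled without OpenMP support, the pragma is ignored and the loop simply runs on the host.

```c
#include <assert.h>

/* Hypothetical helper (not from the slides): adds two arrays inside an
 * OpenMP target region. */
void vector_add_offload(const int *x, const int *y, int *out, int n)
{
    assert(n >= 0);

    /* map(to:) copies x and y to the device before the region starts;
     * map(from:) copies out back to the host when the region ends. */
    #pragma omp target map(to: x[0:n], y[0:n]) map(from: out[0:n])
    {
        for (int i = 0; i < n; ++i) {
            out[i] = x[i] + y[i];
        }
    }
}
```

Array sections are written as `name[first:length]`, so `x[0:n]` maps the first `n` elements.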

Page 7

• OpenMP allows computations to be offloaded to an accelerator (GPU, FPGA, etc.)

int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
    b = a + 2;
}
printf("b = %d\n", b);
// b = 4

[Diagram: Sequential C code; OpenMP annotated C code; SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA). The code is annotated with the labels CPU and FPGA.]

Page 8

• OpenMP allows computations to be offloaded to an accelerator (GPU, FPGA, etc.)

int a = 2;
int b = 12;
#pragma omp target map(from:b)
{
    b = a + 2;
}
printf("b = %d\n", b);
// b = 4

[Diagram: the annotated code is split into two files. Host.cpp (template code, host code, communications, host code) goes through compilation for the CPU. Target0.cpp (template code, communications, offloaded code, communications) goes through High Level Synthesis for the FPGA.]

[Diagram: Sequential C code; OpenMP annotated C code; SpaceModules; Parallel SW architecture on hybrid platform (CPU + FPGA)]

Page 9

• TOAST converts OpenMP annotated code into C/C++ code using SpaceStudio Communication APIs

[Diagram: OpenMP source code with offloading (host code surrounding a "#pragma omp target { }" block that holds the code to offload) is split into host source code and target source code. Host.cpp: template code, host code, communications, host code. Target0.cpp: template code, communications, offloaded code, communications.]
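To make the split concrete, here is a minimal single-file sketch of what a host side and a target side could look like for the `b = a + 2` example. Everything in it is an assumption made for illustration: `channel_t`, `channel_send`, `channel_receive`, `host_part` and `target0_part` are hypothetical names, the channel is a plain in-memory FIFO, and none of this is the SpaceStudio communication API or actual TOAST output.

```c
#include <assert.h>

/* Hypothetical stand-in for a link between two modules: a small in-memory
 * FIFO of ints. This is NOT the SpaceStudio API. */
#define CHANNEL_CAPACITY 8

typedef struct {
    int data[CHANNEL_CAPACITY];
    int head;   /* index of the next value to read */
    int count;  /* number of values currently stored */
} channel_t;

static void channel_send(channel_t *ch, int value)
{
    assert(ch->count < CHANNEL_CAPACITY);
    ch->data[(ch->head + ch->count) % CHANNEL_CAPACITY] = value;
    ch->count++;
}

static int channel_receive(channel_t *ch)
{
    assert(ch->count > 0);
    int value = ch->data[ch->head];
    ch->head = (ch->head + 1) % CHANNEL_CAPACITY;
    ch->count--;
    return value;
}

/* Target side: receive the input, run the offloaded statement, send the
 * result back. */
static void target0_part(channel_t *to_target, channel_t *to_host)
{
    int a = channel_receive(to_target);
    int b = a + 2;             /* body of the original target region */
    channel_send(to_host, b);  /* plays the role of map(from: b) */
}

/* Host side: the target region is replaced by communications. */
int host_part(void)
{
    channel_t to_target = {{0}, 0, 0};
    channel_t to_host = {{0}, 0, 0};
    int a = 2;
    int b = 12;

    channel_send(&to_target, a);         /* the offloaded code reads a */
    target0_part(&to_target, &to_host);  /* would run on the FPGA in a real system */
    b = channel_receive(&to_host);       /* b is overwritten by the result */
    return b;
}
```

In this sketch `host_part()` returns 4, matching the `// b = 4` comment of the original example. The direct call to `target0_part` stands in for two modules running concurrently.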

Page 10

TOAST Architecture

TOAST: Tool for OpenMP Annotations to Space design Translation

[Diagram: C code with offloading is processed with Clang LibTooling. An AST Matcher matches AST nodes and calls the match callback. The match callback produces host source replacements and target source elements. Applying the replacements gives the modified host code; template code is added and the templates are instantiated. The resulting project contains a host SpaceModule (host SpaceModule template + host code) and a target SpaceModule (target SpaceModule template + target code).]
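The "Apply Replacements" step can be illustrated independently of Clang. The sketch below is a simplified stand-in, not TOAST's implementation and not the Clang `Replacement` API: the `replacement_t` type and the `apply_replacements` function are hypothetical. It rewrites a source string from a list of (offset, length, new text) edits that are assumed to be sorted by increasing offset and non-overlapping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one textual edit. */
typedef struct {
    size_t offset;     /* start of the span to replace in the original text */
    size_t length;     /* number of original characters to drop */
    const char *text;  /* text to put in their place */
} replacement_t;

/* Returns a newly allocated string (the caller frees it), or NULL if the
 * allocation fails. Replacements must be sorted by offset and must not
 * overlap. */
char *apply_replacements(const char *source, const replacement_t *reps,
                         size_t n_reps)
{
    size_t src_len = strlen(source);
    size_t out_len = src_len;
    size_t prev_end = 0;

    /* First pass: validate the edits and compute the output size. */
    for (size_t i = 0; i < n_reps; ++i) {
        assert(reps[i].offset >= prev_end);
        assert(reps[i].offset + reps[i].length <= src_len);
        prev_end = reps[i].offset + reps[i].length;
        out_len = out_len - reps[i].length + strlen(reps[i].text);
    }

    char *out = malloc(out_len + 1);
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: copy untouched text and splice in the new text. */
    size_t src_pos = 0;
    size_t out_pos = 0;
    for (size_t i = 0; i < n_reps; ++i) {
        size_t keep = reps[i].offset - src_pos;
        size_t text_len = strlen(reps[i].text);
        memcpy(out + out_pos, source + src_pos, keep);
        out_pos += keep;
        memcpy(out + out_pos, reps[i].text, text_len);
        out_pos += text_len;
        src_pos = reps[i].offset + reps[i].length;
    }
    memcpy(out + out_pos, source + src_pos, src_len - src_pos);
    out_pos += src_len - src_pos;
    out[out_pos] = '\0';
    return out;
}
```

In a flow like the one in the diagram, the replaced span would presumably be the text of a `#pragma omp target` region and the new text the corresponding communication calls; that mapping is an assumption, since the slides do not show the generated code.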

Page 11

LineDetection Use Case

[Diagram: processing chain from input image to detected lines: Filtering, Edge Detection, Line Detection.]

[Diagram: line detection and test bench. The test bench sends images and receives and displays lines over the network. The Z-turn board receives images and sends lines; on the board, Filtering, Edge Detection and Line Detection are shown with the Cortex-A9, the FPGA and the communications between them.]

Page 12

LineDetection: One Accelerator

#pragma omp declare target
void sub_filter(int line_start, int line_end, int height, int width, int kernel_size, unsigned char* source_image, unsigned char* result_image) {
    // convolution accessing source_image and writing results in result_image...
}
#pragma omp end declare target
...
#pragma omp target map(always, to: source_image[0:IMAGE_SIZE]) \
                   map(from: result_image[0:IMAGE_SIZE])
{
    sub_filter(0, height, height, width, kernel_size, source_image, result_image);
}

Slower because of:

• Communications

• Low FPGA frequency

[Diagram: Host; Communications; Convolution on the Accelerator; Host]
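The slide elides the body of `sub_filter`. As an illustration only, the sketch below shows one possible row-range convolution with the same parameter list: a box (mean) filter that ignores neighbours falling outside the image. The function name `box_filter_rows` and the choice of a mean filter are assumptions, not the authors' code. To offload it as on the slide, the function would additionally be enclosed in `#pragma omp declare target` and `#pragma omp end declare target`.

```c
#include <assert.h>

/* Hypothetical example (not the authors' sub_filter): box filter applied to
 * rows [line_start, line_end) of a grayscale image stored row by row. */
void box_filter_rows(int line_start, int line_end, int height, int width,
                     int kernel_size, const unsigned char *source_image,
                     unsigned char *result_image)
{
    assert(kernel_size >= 1 && width > 0 && height > 0);
    assert(line_start >= 0 && line_end <= height);

    int half = kernel_size / 2;
    for (int y = line_start; y < line_end; ++y) {
        for (int x = 0; x < width; ++x) {
            int sum = 0;
            int count = 0;
            for (int ky = -half; ky <= half; ++ky) {
                for (int kx = -half; kx <= half; ++kx) {
                    int sy = y + ky;
                    int sx = x + kx;
                    /* Skip neighbours that fall outside the image. */
                    if (sy < 0 || sy >= height || sx < 0 || sx >= width) {
                        continue;
                    }
                    sum += source_image[sy * width + sx];
                    count++;
                }
            }
            /* count is at least 1 because the centre pixel is always valid. */
            result_image[y * width + x] = (unsigned char)(sum / count);
        }
    }
}
```

Rows near a slice boundary read up to `kernel_size / 2` rows outside `[line_start, line_end)`, which is why a sliced version has to transfer some extra rows around each slice.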

Page 13

LineDetection: Several Accelerators

#pragma omp target map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{
    sub_filter(0, height, height, width, kernel_size, source_image, result_image);
}

#pragma omp target data map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{
    #pragma omp target map(to: source_image[0: SLICE_SIZE + OVERLAP_SIZE]) \
                       map(from: result_image[0: SLICE_SIZE]) nowait
    { sub_filter(0, height / 2, height, width, kernel_size, source_image, result_image); }

    #pragma omp target map(to: source_image[SLICE_SIZE - OVERLAP_SIZE: SLICE_SIZE + 2*OVERLAP_SIZE]) \
                       map(from: result_image[SLICE_SIZE:SLICE_SIZE]) nowait
    { sub_filter(height / 2, height, height, width, kernel_size, source_image, result_image); }
}
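The array sections in the two `target` directives follow a pattern: each accelerator gets its own slice of the image plus the neighbouring rows that the convolution kernel reads. A sketch of that index arithmetic, generalised to `n_slices` equal slices, is given below. The `slice_t` type and the `compute_slice` function are hypothetical, and the halo of `kernel_size / 2` rows is an assumption about how `OVERLAP_SIZE` could be derived; the slides do not define it.

```c
#include <assert.h>

/* Hypothetical description of the rows handled by one accelerator. */
typedef struct {
    int out_first_row;  /* first row this slice writes */
    int out_row_count;  /* number of rows this slice writes */
    int in_first_row;   /* first row this slice reads (halo included) */
    int in_row_count;   /* number of rows this slice reads (halo included) */
} slice_t;

slice_t compute_slice(int index, int n_slices, int height, int kernel_size)
{
    assert(n_slices > 0 && index >= 0 && index < n_slices);
    assert(height % n_slices == 0);  /* simplifying assumption: equal slices */

    int rows = height / n_slices;
    int halo = kernel_size / 2;      /* extra rows read above and below */
    slice_t s;

    s.out_first_row = index * rows;
    s.out_row_count = rows;

    int in_first = s.out_first_row - halo;
    int in_end = s.out_first_row + rows + halo;  /* one past the last row read */
    if (in_first < 0) {
        in_first = 0;        /* no halo above the first slice */
    }
    if (in_end > height) {
        in_end = height;     /* no halo below the last slice */
    }
    s.in_first_row = in_first;
    s.in_row_count = in_end - in_first;
    return s;
}
```

Multiplying the row values by the image width gives the element offsets and lengths to use in `map(to: ...)` and `map(from: ...)` array sections.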

Page 14

LineDetection: Several Accelerators

#pragma omp target data map(always, to: source_image[0:IMAGE_SIZE]) map(from: result_image[0:IMAGE_SIZE])
{
    #pragma omp target map(to: source_image[0: SLICE_SIZE + OVERLAP_SIZE]) \
                       map(from: result_image[0: SLICE_SIZE]) nowait
    { sub_filter(0, height / 2, height, width, kernel_size, source_image, result_image); }

    #pragma omp target map(to: source_image[SLICE_SIZE - OVERLAP_SIZE: SLICE_SIZE + 2*OVERLAP_SIZE]) \
                       map(from: result_image[SLICE_SIZE:SLICE_SIZE]) nowait
    { sub_filter(height / 2, height, height, width, kernel_size, source_image, result_image); }
}

[Diagram: Host; Convolution on the Accelerator; Host]

Evaluation of the solution:

• Execution time

• Space occupation on the FPGA

Page 15

LineDetection: Several Accelerators

One accelerator: Host ~60 ms, Accelerator ~100 ms

Several accelerators: Host ~60 ms, Accelerators ~40 ms

Small code changes for architecture exploration

Possible additional work:

• HLS pragmas -> ~17 ms
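Reading the figures above as execution times of the convolution, with ~60 ms as the host-only reference (an interpretation; the slide only gives the approximate numbers), the relative gains are simple ratios. The helper `speedup` is a hypothetical name used for this illustration.

```c
#include <assert.h>

/* Ratio of a baseline execution time to a new execution time.
 * Values above 1 mean the new configuration is faster. */
double speedup(double baseline_ms, double new_ms)
{
    assert(baseline_ms > 0.0 && new_ms > 0.0);
    return baseline_ms / new_ms;
}
```

With the approximate slide values, `speedup(60.0, 100.0)` is 0.6 (the single accelerator is slower than the host), `speedup(60.0, 40.0)` is 1.5, and `speedup(60.0, 17.0)` is about 3.5.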

Page 16

• OpenMP allows computations to be offloaded to multiple accelerators

• Architecture exploration: helps find a good offloading pattern

[Diagram: Sequential C code; OpenMP annotated C code; Host, Convolution on 8 identical accelerators, Host; Parallel SW architecture on hybrid platform (CPU + FPGA)]
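Changing the number of accelerators can come down to changing a loop bound. The sketch below is an assumption-laden illustration, not code from the slides: `invert_rows` and `invert_on_accelerators` are hypothetical names, the kernel is a trivial pixel inversion rather than the convolution, and no `device` clause is used, so how each region is assigned to a physical accelerator is left open. Each iteration launches one `target` region with `nowait`, and `taskwait` blocks until all of them have finished. Compiled without OpenMP support, the pragmas are ignored and the slices are processed one after the other on the host.

```c
#include <assert.h>

#pragma omp declare target
/* Hypothetical kernel: inverts the pixels in rows [row_start, row_end). */
void invert_rows(int row_start, int row_end, int width,
                 const unsigned char *src, unsigned char *dst)
{
    for (int i = row_start * width; i < row_end * width; ++i) {
        dst[i] = (unsigned char)(255 - src[i]);
    }
}
#pragma omp end declare target

/* Splits the image into n_accel equal slices, one target region per slice. */
void invert_on_accelerators(int n_accel, int height, int width,
                            const unsigned char *src, unsigned char *dst)
{
    assert(n_accel > 0 && height % n_accel == 0);
    int rows = height / n_accel;

    for (int d = 0; d < n_accel; ++d) {
        int first = d * rows * width;  /* first element of this slice */
        int count = rows * width;      /* number of elements in this slice */

        /* Each region only transfers its own slice; nowait lets the host
         * launch the next region without waiting for this one. */
        #pragma omp target map(to: src[first:count]) map(from: dst[first:count]) nowait
        {
            invert_rows(d * rows, (d + 1) * rows, width, src, dst);
        }
    }

    /* Wait for every deferred target region before using dst. */
    #pragma omp taskwait
}
```

Calling `invert_on_accelerators(2, ...)` and `invert_on_accelerators(8, ...)` yields the same image; only the way the work is divided changes, which is the kind of small code change that architecture exploration relies on.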

Page 17

Architectures in SpaceStudio

Xilinx: Zynq

Intel: Cyclone


Page 21

Video: application running on Z-turn

Page 22

Conclusion

TOAST leverages the OpenMP Offloading standard to deploy software on FPGAs using SpaceStudio

• OpenMP is a well-established standard

Our workflow allows:

• Design space exploration

• Deployment on SoCs featuring an FPGA

• Support for multiple vendors (Xilinx, Intel)

Page 23

Thank you for your attention

© IRT AESE "Saint Exupéry" - All rights reserved. This document and all information contained herein is the sole property of IRT AESE "Saint Exupéry". No intellectual property rights are granted by the delivery of this document or the disclosure of its content. IRT AESE "Saint Exupéry" and its logo are registered trademarks.