
microlab, NTUA, GR

keywords

▪ processing platforms

▪ FPGA acceleration

▪ vision-based navigation

▪ …and more

Prof. Dimitrios Soudris, National Technical Univ. of Athens

[email protected]

NTUA, 2019

1) acceleration in space applications

▪ survey & benchmarking of processing platforms

▪ comments based on past experience

2) application: autonomous rover navigation (SPARTAN)

▪ project overview & results, future steps

3) application: spacecraft proximity operations (HIPNOS)

▪ project overview & results, future steps


Quick overview of our projects


ESA activities

SPARTAN/SEXTANT/COMPASS (2011-2016)

▪ Vision Based Navigation for Mars Rovers, with FPGA co-processor

▪ accelerated multiple odometry and 3D mapping algorithms

HIPNOS (2016-2017)

▪ avionics solutions for high-performance tasks based on SoC-FPGA

▪ demo with pose estimation algorithm for Active Debris Removal

QUEENS (2017-2018)

▪ tool-chain testing, benchmarking, hands-on demos of new EU FPGA

Radiation Testing of COTS FPGAs (2018)

▪ Dec’17 test at CERN (SPS!) for SEE; Napoli (INFN) and ESTEC for TID

Porting of algorithms on new embedded devices (2019)

▪ on Intel/Movidius Myriad2 DSP and on Nanoxplore NG-LARGE FPGA


group platforms & compare (form clouds of results)

▪ CPU: space+embed similar, 1-2 orders slower than desktop

▪ GPU: 1-2 orders better performance/Watt than any CPU (vs desktop GPU, mobiles just trade 1 order of perf. for Watt)

▪ multi-DSP: better performance/Watt than mobile-GPU

▪ FPGA: highest performance/Watt (by orders), almost highest performance


• FPGAs = best choice for HW acceleration, next to rad-hard CPU

▪ outperform mobile GPUs and DSP multi-cores

▪ allow for effective hardening even with COTS versions

• significant acceleration with limited power, e.g., P ≤ 10 Watts

▪ 10x at system level (HW/SW) vs latest rad-hard CPUs

▪ 100-1000x at function level vs conventional rad-hard CPUs

• necessary for future on-board processing

▪ for high rate & high resolution images (e.g., unprepared docking)

▪ for increased autonomy and faster exploration (e.g., rovers)

▪ offload OBC, do data fusion, other real-time payload processing

• European technology catching up (rad-hard BRAVE FPGAs)

• started examining Myriad2 for space → also very interesting


SPARTAN/SEXTANT/COMPASS (app = rover navigation)


SPARTAN/SEXTANT/COMPASS projects (ESA)

❑ NTUA (GR), GMV (ES), FORTH (GR), DUTH (GR)

▪ HW/SW co-design of rover navigation algorithms

▪ emulate Martian scenarios and space-grade devices

▪ project time to 150 MIPS CPU, use limited FPGA resources

▪ synthetic datasets of Mars, real images of Mars-like terrains

✓ “localization” in 1sec, “mapping” in 20sec

April 2011 SPARTAN

(SPAring Robotics Technologies for Autonomous Navigation)

July 2013

April 2012 SEXTANT

(Spartan EXTension Activity – Not Tendered)

May 2014

June 2014 COMPASS

(Code Optimisation Modification Partitioning)


[Figure: Martian rover whose CPU turns stereo camera images into a 3D map and the rover position (at every step); examples: MER Rovers (2003), Curiosity (2012), ESA ExoMars (2020)]


highly complex Computer Vision algorithms on low-processing-power (space-grade) CPUs

➢ huge execution time, not very practical to use

➢ MER rover: speed only 10 m/h with VO (124 without!)

➢ used only on sand terrains and slopes (high slippage)

future: faster + more accurate (more complex!)

▪ considering our own/proposed CV algorithms:

➢ 1 hour for 3D map on 150 MIPS CPU (budget = 20sec)

➢ 1 minute for 1 step on 150 MIPS CPU (budget = 1sec)

➢ looking for speed-up factors 10x to 1000x
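The lower end of that speed-up range can be checked by dividing each measured time by its budget. A minimal sketch of the arithmetic, using only the figures quoted on this slide (the function name is illustrative, not from the project):

```python
def required_speedup(measured_s, budget_s):
    """Factor by which execution time must shrink to meet the budget."""
    return measured_s / budget_s

# Times on the 150 MIPS CPU versus the time budgets, as quoted on the slide.
mapping = required_speedup(measured_s=3600.0, budget_s=20.0)    # 1 hour vs 20 sec
localization = required_speedup(measured_s=60.0, budget_s=1.0)  # 1 minute vs 1 sec

print(f"mapping needs at least {mapping:.0f}x")
print(f"localization needs at least {localization:.0f}x")
```

These are system-level factors; individual kernels have to be accelerated by more than this, because part of each pipeline stays on the CPU.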


mapping mode (3D reconstruction of scene)

▪ use “navigation” camera (high-definition stereo)

▪ 1m above ground, 20cm baseline (parallel), tilted 39°

▪ generate 3D map of 4m radius (120°) in front of rover

▪ error < 2cm, execution time per map < 20 sec


localization mode (6D pose of the rover)

▪ use “localization/hazard-avoidance” camera (stereo)

▪ 30cm above ground, 12cm baseline, tilted 31.55°, FoV 66°

▪ rover stops every ~6cm to acquire new image (1Hz rate)

▪ estimate x-y-z position and pitch-roll-yaw of rover

▪ error < 2m after 100m path (2%, attitude
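The step length, image rate and error bound above imply a traverse speed and a per-step error allowance. The sketch below works them out; the constant names are illustrative, and the even split of the error budget over all steps is a simplifying assumption, not a claim about how visual odometry drift actually accumulates.

```python
STEP_M = 0.06       # rover advances ~6 cm between images (from the slide)
RATE_HZ = 1.0       # one new image pair per second (from the slide)
PATH_M = 100.0      # reference path length (from the slide)
MAX_ERROR_M = 2.0   # allowed position error after the path (from the slide)

# Upper bound on traverse speed: ignores any stop other than image acquisition.
speed_m_per_h = STEP_M * RATE_HZ * 3600.0

# Number of odometry updates accumulated along the reference path.
steps = PATH_M / STEP_M

# Relative error bound (2 m over 100 m).
relative_error = MAX_ERROR_M / PATH_M

# Naive per-step allowance, assuming the budget is split evenly over all steps.
error_per_step_m = MAX_ERROR_M / steps

print(f"{speed_m_per_h:.0f} m/h over {steps:.0f} steps")
print(f"relative error {relative_error:.0%}, {error_per_step_m * 1000:.1f} mm per step")
```

Under these assumptions the 1 Hz budget corresponds to roughly 216 m/h, which can be compared with the MER figures quoted earlier in the deck.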


generic rover geometry, several CV algorithms, multiple platforms

▪ FPGA: Xilinx Virtex-6 XC6VLX240T

▪ CPU: Intel Core 2 Duo, executing C algorithms (time scaled to 150 MIPS) and calling FPGA accelerators


synthetic videos

▪ 3DROV simulator

▪ mix of sand, rocks, diffuse lighting

▪ loc.: 512x384px, map.: 1120x1120px

real videos

▪ Atacama, Chile

▪ Devon, Canada

▪ Thrace, Greece

[Figure: sample frames of the synthetic sequences s1, s2, s3 (3DROV) and the real sequences r1, r2, r3]


3D reconstruction (mapping)

▪ Disparity, Spacesweep

Visual Odometry (localization)

▪ Feature detection: SURF, Harris, FAST

▪ Feature description: SIFT, SURF, BRIEF

▪ Feature matching: distances L1, L2, χ², Hamming

▪ Filtering and egomotion: absolute orientation, LHM

[Figure: histogram of gradients; matching of histograms between image 1 and image 2]
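The four matching distances listed for feature matching can be written down compactly. The following is a minimal software sketch with a brute-force nearest-neighbour search; the function names are illustrative, the χ² form shown is one common variant for non-negative histograms, and none of this is taken from the project's FPGA code.

```python
import math

def l1(a, b):
    """Sum of absolute differences between two descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chi2(a, b):
    """Chi-square distance for non-negative histograms; empty bins are skipped."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def hamming(a, b):
    """Number of differing bits; a and b are binary descriptors packed into ints."""
    return bin(a ^ b).count("1")

def nearest(query, candidates, dist):
    """Index of the candidate closest to query under dist (brute force)."""
    return min(range(len(candidates)), key=lambda i: dist(query, candidates[i]))

query = [1.0, 0.0, 3.0]
candidates = [[0.0, 0.0, 0.0], [1.0, 1.0, 3.0]]
print(nearest(query, candidates, l1), nearest(query, candidates, chi2))
print(hamming(0b1011, 0b0010))
```

Hamming distance pairs naturally with binary descriptors such as BRIEF, while L1, L2 and χ² apply to histogram-style descriptors such as SIFT and SURF.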


Mapping Mode Functional Dependency

[Figure: functional dependency of the mapping mode in four layers (Layer 1 to Layer 4), from coarse-grain analysis to fine-grain analysis]

▪ functional phase: Demo mapping

▪ Imaging, 3D Reconstruction

▪ Debayer, Contrast, Rectify, disparity, mapgen, mapmerge

▪ Superimposition, Edge detection, Normalize, Absolute differences, Normalize ADs, Gaussian weight, Aggregation, Minimum disparity search, subpixel interpolation
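The core of the fine-grain disparity chain (absolute differences, aggregation, minimum disparity search, subpixel interpolation) can be illustrated on a single rectified scanline. This is a toy sketch under stated assumptions: it aggregates with a plain box window rather than the Gaussian weighting named in the diagram, it omits the edge-detection and normalization steps, and the function and parameter names are hypothetical.

```python
def scanline_disparity(left, right, max_d, half_win):
    """Disparity per pixel of one rectified scanline, left image as reference."""
    n = len(left)
    out = []
    for x in range(n):
        costs = []
        for d in range(max_d + 1):
            # Absolute differences, aggregated over a (2*half_win + 1) box window.
            c = 0.0
            for u in range(x - half_win, x + half_win + 1):
                ul = min(max(u, 0), n - 1)       # clamp indices at the borders
                ur = min(max(u - d, 0), n - 1)
                c += abs(left[ul] - right[ur])
            costs.append(c)
        # Minimum disparity search (first minimum wins on ties).
        best = min(range(max_d + 1), key=costs.__getitem__)
        # Subpixel interpolation: vertex of the parabola through three costs.
        offset = 0.0
        if 0 < best < max_d:
            c0, c1, c2 = costs[best - 1], costs[best], costs[best + 1]
            denom = c0 - 2.0 * c1 + c2
            if denom > 0:
                offset = 0.5 * (c0 - c2) / denom
        out.append(best + offset)
    return out

right = [0, 0, 10, 50, 90, 30, 5, 0, 0, 0, 0, 0]
left = [0, 0] + right[:-2]          # same signal shifted by 2 pixels
print(scanline_disparity(left, right, max_d=4, half_win=1)[5])
```

The triple loop over pixels, disparities and window positions is what makes this stage repetitive and computationally heavy, which is the kind of function the deck assigns to the FPGA.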

Component              Mult-Div          Add-sub-comp         Typical param.
debayer                6×W×H             4×W×H                W×H = 1120×1120
contrast               2×W×H             2×W×H                -
rectify                8×W×H             6×W×H                -
edge detection         2×k²×W×H          2×(k²+20)×W×H        k = 13
superimposition        0                 2×W×H                -
normalize              2×W×H             2×W×H                -
absolute differences   0                 6×D×W×H              D = 200
normalize ADs          2×D×W×H           2×D×W×H              -
aggregation            2×l²×D×W×H        2×l²×D×W×H           l = 19
min disparity search   0                 2×(D+1)×W×H          -
interpolation          W×H               13×W×H               -
map generation         3×W×H             5×W×H                -
map merge              9×W×H             6×W×H                -
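Plugging the typical parameters into the operation counts shows where the work concentrates. The sketch below transcribes the table into a dictionary and sums the two operation columns; the variable names are illustrative, and the table does not say whether the counts are per image or per stereo pair, so the total should be read as a relative indication only.

```python
W = H = 1120   # mapping image size (typical parameter from the table)
D = 200        # number of disparity levels
K = 13         # edge-detection kernel size k
L = 19         # aggregation window size l
P = W * H      # pixels per image

# component: (mult-div, add-sub-comp), transcribed from the table
OPS = {
    "debayer": (6 * P, 4 * P),
    "contrast": (2 * P, 2 * P),
    "rectify": (8 * P, 6 * P),
    "edge detection": (2 * K**2 * P, 2 * (K**2 + 20) * P),
    "superimposition": (0, 2 * P),
    "normalize": (2 * P, 2 * P),
    "absolute differences": (0, 6 * D * P),
    "normalize ADs": (2 * D * P, 2 * D * P),
    "aggregation": (2 * L**2 * D * P, 2 * L**2 * D * P),
    "min disparity search": (0, 2 * (D + 1) * P),
    "interpolation": (P, 13 * P),
    "map generation": (3 * P, 5 * P),
    "map merge": (9 * P, 6 * P),
}

total = sum(m + a for m, a in OPS.values())
heaviest = max(OPS, key=lambda name: sum(OPS[name]))
share = sum(OPS[heaviest]) / total

print(f"total: {total:.3e} operations")
print(f"heaviest: {heaviest} ({share:.1%} of all counted operations)")
```

With these parameters aggregation accounts for about 99% of the counted operations, which is consistent with moving the disparity chain to the FPGA.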


to FPGA: repetitive & computationally intensive functions

to CPU: high program complexity & lightweight functions


Target low-cost implementations

▪ especially w.r.t. memory: bottleneck for CV on FPGA

▪ resource reuse: decompose input data, process successively

Target sufficient speed-up (for ESA specs)

▪ pipelining on pixel-basis

▪ burst read of image, transform on-the-fly (1 datum/cycle)

▪ parallel memories & parallel processing elements

▪ parallel calculation of arithmetic formulas

Target configurability (tuning, adaptation)

▪ parametric VHDL: data size, accuracy, parallelization,
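The "decompose input data, process successively" idea can be shown in software: a window operation is applied to one horizontal stripe at a time, so that only the stripe plus a few border (halo) rows need to be held at once. This is an analogy in Python, not the project's VHDL; the box-sum filter, the function names and the stripe size are all assumptions chosen for illustration.

```python
def box_sum(img, k):
    """k x k window sum over the whole image, windows clipped at the borders."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    s += img[yy][xx]
            out[y][x] = s
    return out

def box_sum_striped(img, k, stripe_rows):
    """Same result, processing at most stripe_rows + 2*(k//2) rows at a time."""
    h, r = len(img), k // 2
    out = []
    for top in range(0, h, stripe_rows):
        bottom = min(h, top + stripe_rows)
        lo, hi = max(0, top - r), min(h, bottom + r)   # stripe plus halo rows
        part = box_sum(img[lo:hi], k)                  # reuse the same engine
        out.extend(part[top - lo:bottom - lo])         # keep stripe rows only
    return out

img = [[(3 * y + 5 * x) % 7 for x in range(9)] for y in range(10)]
print(box_sum_striped(img, 3, 4) == box_sum(img, 3))
```

The trade-off is the usual one: smaller stripes need less on-chip memory but re-read more halo rows, so the stripe size becomes one of the tuning parameters.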


[Figure: inputs X1, X2, X3, X4 feeding f(X)]


multiple accelerators on FPGA

▪ Disparity, Spacesweep

▪ SURF detector, SURF descriptor, SIFT descriptor, Harris, matching

➢ significant speedups 62x – 1111x

multiple CPU-FPGA pipelines

▪ 2 for mapping, 5 for localization

▪ speedup 16x – 444x, meet specs

3D map: accuracy 2cm at 4m depth, 120° FoV, 97% coverage

localization: accuracy 1.3m in 100m paths, and 5° in attitude


Localization at 1sec

▪ system speedup = 20x

Mapping at 8.4sec

▪ system speedup = 444x

accelerators’ speedup (xc6vlx240t@172MHz vs. 150 MIPS CPU)

▪ SpaceSweep: 637x

▪ Disparity: 120x

▪ Harris detector: 75x

▪ SURF detector: 56x

▪ SIFT descriptor: 100x

▪ SURF descriptor: 84x

▪ SIFT matching: 180x

▪ BRIEF matching: 100x

▪ Communication: 81 Mbps
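The gap between kernel speedups and system speedups is what Amdahl's law describes: the part left on the CPU bounds the overall gain. The sketch below applies the law to the mapping figures; it assumes the 444x system figure belongs to the pipeline built on the 637x SpaceSweep kernel and lumps everything accelerated into one kernel, which the slide does not state, so the resulting fraction is only indicative.

```python
def amdahl(p, s):
    """Overall speedup when a fraction p of the run time is accelerated s times."""
    return 1.0 / ((1.0 - p) + p / s)

def accelerated_fraction(overall, s):
    """Fraction p implied by Amdahl's law for a given overall speedup."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / s)

# Mapping: 444x at system level, SpaceSweep kernel at 637x (figures from the slide).
p_map = accelerated_fraction(444.0, 637.0)

# Software-only time implied by 8.4 s at 444x.
sw_time_s = 8.4 * 444.0

print(f"accelerated fraction p = {p_map:.4f}")
print(f"implied software-only time = {sw_time_s:.0f} s")
```

The implied software-only time of roughly 3700 s agrees with the "1 hour for 3D map" estimate given earlier in the deck.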


FPGAs enable more complex/accurate/robust CV algorithms

▪ respecting given time constraints during rover traverse

FPGA acceleration would also enable bigger range for rovers

▪ acceleration= 10x at system level (HW/SW) vs rad-hard CPUs

▪ 100-1000x at function level vs conventional rad-hard CPUs

but, need considerable effort to optimize the design, especially when targeting resource-constrained space-grade devices

▪ efficiency-driven & demanding projects → manual coding

▪ understand the algorithm (in-depth profiling/analysis)

▪ design parallel architectures, think parallel


HIPNOS (app = spacecraft proximity operations)


High Performance Avionics Solution for Advanced and Complex GNC Systems

ESA, program GSP, low TRL

▪ scenario: VBN for e.Deorbit mission

▪ focus: avionics, COTS accelerators

▪ task: study & select best platform

▪ goal: design new avionics architecture + CV algorithm (estimate pose)

➢ accelerate 10x faster vs conventional

HIPNOS: July 2016 to October 2017


Active Debris Removal missions

▪ chaser autonomously tracks/syncs with uncooperative target →VBN

▪ e.Deorbit (frozen): rendezvous with ENVISAT

very high computational needs!

▪ real-time processing of HD images

▪ 1 Mpix at 5-10 fps at critical stages

▪ 10x more than rovers, + hard limits

▪ increased accuracy, e.g., 1% error

▪ no markers → complex algorithms

▪ but, short LEO mission →COTS?

target: ENVISAT, 2.5 x 2.5 x 10 m³, 8 tons, 2°/s spin, LEO
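The "1 Mpix at 5-10 fps" requirement translates into a per-pixel cycle budget once a clock frequency is fixed. The sketch below assumes a 200 MHz fabric clock, based on the Fmax > 200 MHz reported in the resource figures of this deck, and takes 1 Mpix as 10^6 pixels; both are assumptions for illustration, as is the function name.

```python
PIXELS = 1_000_000   # "1 Mpix" frame, taken as 10**6 pixels (assumption)
CLOCK_HZ = 200e6     # assumed FPGA clock, based on the reported Fmax > 200 MHz

def cycles_per_pixel(fps, pixels=PIXELS, clock_hz=CLOCK_HZ):
    """Clock cycles available per input pixel at a given frame rate."""
    return clock_hz / (pixels * fps)

for fps in (5, 10):
    rate_mpix = PIXELS * fps / 1e6
    print(f"{fps} fps: {rate_mpix:.0f} Mpix/s, {cycles_per_pixel(fps):.0f} cycles per pixel")
```

A design that streams one datum per cycle can therefore afford on the order of 20 to 40 passes over each frame at this clock; anything heavier per pixel has to come from parallel processing elements.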


RESOURCES

• tested on biggest Zynq7000 FPGA (xc7z100-2 of MMP)

• 36% LUTs, 48% DSPs, 77% RAMBs, Fmax>200MHz

▪ most demanding is Renderer (94% logic of design)

• power≈4.5W (peak 9W) (CPU@667MHz, PS@200MHz)

• rough estimations for other FPGA devices

• xc7z045/xc7z030 (smaller): maybe feasible, requires much optimization, tolerable penalty in time/accuracy

• zu19eg (big upcoming RT): easy fit, utilization


FROM TRADE-OFF STUDY

• latest space-grade CPUs 10x faster than predecessors, still slow for high-performance VBN (e.g., 0.1x)

• by offering best perf/Watt vs all platforms, FPGAs can bridge the gap with reasonable power (


now using Myriad2 as SoC acceleration platform (new ESA pr.)

▪ board = EOT (from H2020)

▪ very small, lower power, very few I/Fs (+GPIOs!)

▪ now also being tested for radiation resilience (ESA)


contact

▪ Professor Dimitrios Soudris [email protected]

▪ Senior researcher George Lentaris [email protected]

▪ PhD student Kostantinos Maragos [email protected]

▪ PhD student Ioannis Stratakos [email protected]

▪ PhD student Ioannis Stamoulias [email protected]

▪ PhD student Vasileios Leon [email protected]

links

▪ http://www.microlab.ntua.gr/

▪ https://microlab.ntua.gr/academics/dimitrios-soudris/

▪ http://users.uoa.gr/~glentaris