Transcript of Group Discussion
Hong Man, 07/21/2010
UMD DIF with GNU Radio

[Figure from Will Plishker's presentation: workflow connecting GRC to The DIF Package (TDP). GRC produces an XML flowgraph (.grc), which becomes a Python flowgraph (.py) run by the GNU Radio engine (Python/C++). Step 1) convert or generate a .dif file from the flowgraph (complete); the DIF specification (.dif) is consumed by TDP. Step 2) DIF Lite executes static schedules from DIF (complete). Proposed steps: 3a) perform online scheduling; 3b) add an architecture specification (.arch?); 4) architecture-aware multiprocessor scheduling (assignment, ordering, invocation) over processors, memories, and interconnect, producing a schedule (.dif, .sched) against a platform-retargetable library with uniprocessor scheduling, targeting platforms such as GPUs, multiprocessors, Cell, and FPGA. A legend distinguishes existing/completed components from proposed ones.]
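As a rough illustration of step 1 (converting a flowgraph into a DIF specification), the sketch below parses a minimal GRC-style XML flowgraph and emits a simplified DIF-like graph listing. Both the XML layout and the emitted syntax are simplified assumptions for illustration, not the exact formats used by GRC or TDP.

```python
import xml.etree.ElementTree as ET

# A minimal GRC-style flowgraph (real .grc files carry many more
# parameters per block; this structure is an assumption).
GRC_XML = """
<flow_graph>
  <block><key>src</key><id>source0</id></block>
  <block><key>fir</key><id>fir0</id></block>
  <block><key>sink</key><id>sink0</id></block>
  <connection><source_block_id>source0</source_block_id>
              <sink_block_id>fir0</sink_block_id></connection>
  <connection><source_block_id>fir0</source_block_id>
              <sink_block_id>sink0</sink_block_id></connection>
</flow_graph>
"""

def grc_to_dif(xml_text, graph_name="flowgraph"):
    """Emit a simplified DIF-like specification from GRC-style XML."""
    root = ET.fromstring(xml_text)
    nodes = [b.findtext("id") for b in root.iter("block")]
    edges = [(c.findtext("source_block_id"), c.findtext("sink_block_id"))
             for c in root.iter("connection")]
    lines = [f"dif {graph_name} {{", "  topology {"]
    lines.append("    nodes = " + ", ".join(nodes) + ";")
    for i, (src, dst) in enumerate(edges, 1):
        lines.append(f"    edge e{i} ({src}, {dst});")
    lines += ["  }", "}"]
    return "\n".join(lines)

print(grc_to_dif(GRC_XML))
```

A real converter would also carry block parameters and port rates into the DIF topology; here only the node/edge skeleton survives the translation.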
SSP Interface with DIF

• Currently DIF extracts the dataflow model from the GRC of GNU Radio.
  – GRC is at the waveform level (component block diagram).
• To interact with DIF, we need to construct CL models at the waveform level.
  – Our current work is mostly at the radio-primitive level.
  – We need to start waveform-level CL modeling.
  – Open questions:
    • Mapping "things" and "paths" in CL models to "actors" in dataflow models.
    • Representing "data rates" ("tokens") in CL models.
    • "Processing delay" is missing in both models.
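One way to make these open questions concrete is a minimal data structure in which CL "things" become dataflow actors and "paths" become edges, with token rates attached to each edge endpoint. The class and field names below are hypothetical, not part of DIF or the CL tools; `processing_delay` is included precisely because it is the attribute missing from both models today.

```python
from dataclasses import dataclass, field

@dataclass
class Actor:
    # A CL-model "thing" mapped to a dataflow actor.
    # processing_delay is a hypothetical field (missing in both models).
    name: str
    processing_delay: float = 0.0

@dataclass
class Edge:
    # A CL-model "path" mapped to a dataflow edge; produce/consume are
    # token rates per firing of the source/sink actors.
    src: Actor
    dst: Actor
    produce: int = 1
    consume: int = 1

@dataclass
class DataflowModel:
    actors: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_path(self, src, dst, produce=1, consume=1):
        e = Edge(src, dst, produce, consume)
        self.edges.append(e)
        return e

# Usage: a two-actor waveform-level fragment.
fir = Actor("FIR1", processing_delay=1e-6)
dat = Actor("DAT")
m = DataflowModel(actors=[fir, dat])
m.add_path(fir, dat, produce=4, consume=1)
```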
Scheduling with Dataflow Models

• Scheduling based on dataflow models may achieve performance improvements with multi-rate processes (example from Will Plishker's presentation).
• SDR at the physical layer and MAC layer consists mostly of single-rate processes, and may not see significant performance improvement from dataflow-based scheduling.
• Multicore scheduling is an interesting topic.
  – Currently the assignments of "actors" to processors are done manually.
[Figure: multirate SDF graph for CD-to-DAT sample-rate conversion. Actors CD, FIR1, FIR2, FIR3, FIR4, DAT are connected by edges e1–e5 with production/consumption rates (1,1), (2,3), (4,7), (5,7), (4,1).]
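For a multirate chain like the CD-to-DAT example, a static schedule follows from the SDF balance equations: for every edge, (repetitions of source) × (production rate) = (repetitions of sink) × (consumption rate). A small sketch that solves these for a chain of actors (illustrative code, not TDP's scheduler):

```python
from fractions import Fraction
from math import lcm

def repetition_vector(rates):
    """rates[i] = (produce, consume) for the edge between actor i and i+1.
    Returns the minimal integer repetition vector for the chain."""
    # Fix the first actor's repetition count at 1, then propagate:
    # q[i] * produce = q[i+1] * consume  =>  q[i+1] = q[i] * produce / consume
    q = [Fraction(1)]
    for produce, consume in rates:
        q.append(q[-1] * produce / consume)
    # Scale to the smallest all-integer solution.
    scale = lcm(*(f.denominator for f in q))
    return [int(f * scale) for f in q]

# CD -> FIR1 -> FIR2 -> FIR3 -> FIR4 -> DAT with the edge rates above.
rates = [(1, 1), (2, 3), (4, 7), (5, 7), (4, 1)]
print(repetition_vector(rates))  # minimal firings of each actor per iteration
```

The resulting vector gives how many times each actor fires per schedule iteration; for single-rate graphs it degenerates to all ones, which is why the physical/MAC-layer processes above see little benefit.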
GPU and Multicore

• Our findings on CUDA:
  – Many specialized library functions optimized for GPUs.
  – Parallelization has to be implemented manually.
  – The UMD CUDA work (FIR and Turbo decoding) has not been connected to their dataflow work yet.
• Some considerations:
  – Extend our investigation to OpenCL.
  – Focus on CL modeling for multicore systems.
    • Automatically parallelize certain common DSP operations (e.g., FIR, FFT) from CL models.
      – Operation recognition and rule-based mapping.
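As a sketch of what "automatically parallelize common DSP operations" could mean for an FIR filter: split the input into independent chunks, each extended by taps−1 samples of overlap, so the chunks could be dispatched to separate cores or GPU blocks. This is a plain-Python illustration of the decomposition, not the UMD CUDA implementation.

```python
def fir(x, taps):
    """Reference direct-form FIR (valid outputs only)."""
    n = len(taps)
    return [sum(taps[k] * x[i + k] for k in range(n))
            for i in range(len(x) - n + 1)]

def fir_chunked(x, taps, n_chunks):
    """Compute the same FIR output by processing overlapping chunks.
    Each chunk is independent of the others, so the per-chunk calls
    could run in parallel on separate cores or GPU blocks."""
    n = len(taps)
    out_len = len(x) - n + 1
    chunk = -(-out_len // n_chunks)  # ceiling division
    out = []
    for start in range(0, out_len, chunk):
        stop = min(start + chunk, out_len)
        # Each chunk needs taps-1 extra input samples of overlap.
        out.extend(fir(x[start:stop + n - 1], taps))
    return out

x = list(range(20))
taps = [1, 2, 3]
assert fir_chunked(x, taps, 4) == fir(x, taps)
```

The rule-based mapping mentioned above would recognize the convolution pattern in a CL model and apply exactly this kind of overlap decomposition automatically.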
Next Step

• Beyond rehosting: optimal code generation.
  – c/c++ → (CL model) → SPIRAL
  – c/c++ → (CL model) → CUDA or OpenCL (GPU and multicore)
  – c/c++ → (CL model) → c/c++ using SSE intrinsics
• CL modeling tasks:
  – At both the primitive level and the waveform level.
  – CL modeling from AST.
  – DSP operation (or primitive) recognition.
  – Code segment extraction, validation, and transformation.
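The "CL modeling from AST" and "operation recognition" tasks could start from something as simple as walking a syntax tree for a multiply-accumulate loop, the core pattern of an FIR filter. Below is a toy rule-based recognizer over Python ASTs (the real targets are C/C++ ASTs, and the rule here is a hypothetical, deliberately crude one):

```python
import ast

def looks_like_mac_loop(source):
    """Toy rule: a for-loop whose body contains an augmented assignment
    (acc += ...) whose right-hand side is a multiplication. Real operation
    recognition would need far stronger structural checks."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.For):
            for stmt in ast.walk(node):
                if (isinstance(stmt, ast.AugAssign)
                        and isinstance(stmt.op, ast.Add)
                        and isinstance(stmt.value, ast.BinOp)
                        and isinstance(stmt.value.op, ast.Mult)):
                    return True
    return False

fir_src = """
def fir_tap(x, taps, i):
    acc = 0.0
    for k in range(len(taps)):
        acc += taps[k] * x[i + k]
    return acc
"""

copy_src = """
def copy(x):
    out = []
    for v in x:
        out.append(v)
    return out
"""

print(looks_like_mac_loop(fir_src), looks_like_mac_loop(copy_src))
```

Once a pattern is recognized, the rule-based mapping stage would rewrite the matched code segment against SPIRAL, CUDA/OpenCL, or SSE-intrinsic back ends, with validation that the transformed segment preserves the original semantics.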