Transcript of Group Discussion
Hong Man, 07/21/2010
UMD DIF with GNU Radio

[Figure from Will Plishker's presentation: workflow connecting GRC to The DIF Package (TDP). GRC produces an XML flowgraph (.grc), which becomes a Python flowgraph (.py) run by the GNU Radio engine (Python/C++). Step 1) convert or generate a .dif file from the flowgraph (complete); the DIF specification (.dif) is consumed by TDP. Step 2) DIF Lite executes static schedules from DIF (complete). Proposed steps: 3a) perform online scheduling; 3b) add an architecture specification (.arch?); 4) architecture-aware multiprocessor scheduling (assignment, ordering, invocation) over processors, memories, and interconnect, producing a schedule (.dif, .sched) against a platform-retargetable library with uniprocessor scheduling, targeting platforms such as GPUs, multiprocessors, Cell, and FPGA. A legend distinguishes existing/completed components from proposed ones.]
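As a rough illustration of step 1 (converting a flowgraph into a DIF specification), the sketch below parses a minimal GRC-style XML flowgraph and emits a simplified DIF-like graph listing. Both the XML layout and the emitted syntax are simplified assumptions for illustration, not the exact formats used by GRC or TDP.

```python
import xml.etree.ElementTree as ET

# A minimal GRC-style flowgraph (real .grc files carry many more
# parameters per block; this structure is an assumption).
GRC_XML = """
<flow_graph>
  <block><key>src</key><id>source0</id></block>
  <block><key>fir</key><id>fir0</id></block>
  <block><key>sink</key><id>sink0</id></block>
  <connection><source_block_id>source0</source_block_id>
              <sink_block_id>fir0</sink_block_id></connection>
  <connection><source_block_id>fir0</source_block_id>
              <sink_block_id>sink0</sink_block_id></connection>
</flow_graph>
"""

def grc_to_dif(xml_text, graph_name="flowgraph"):
    """Emit a simplified DIF-like specification from GRC-style XML."""
    root = ET.fromstring(xml_text)
    nodes = [b.findtext("id") for b in root.iter("block")]
    edges = [(c.findtext("source_block_id"), c.findtext("sink_block_id"))
             for c in root.iter("connection")]
    lines = [f"dif {graph_name} {{", "  topology {"]
    lines.append("    nodes = " + ", ".join(nodes) + ";")
    for i, (src, dst) in enumerate(edges, 1):
        lines.append(f"    edge e{i} ({src}, {dst});")
    lines += ["  }", "}"]
    return "\n".join(lines)

print(grc_to_dif(GRC_XML))
```

A real converter would also carry block parameters and port rates into the DIF topology; here only the node/edge skeleton survives the translation.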
SSP Interface with DIF

• Currently DIF extracts the dataflow model from the GRC of GNU Radio.
  – GRC is at the waveform level (component block diagram).
• To interact with DIF, we need to construct CL models at the waveform level.
  – Our current work is mostly at the radio-primitive level.
  – We need to start waveform-level CL modeling.
  – Open questions:
    • Mapping "things" and "paths" in CL models to "actors" in dataflow models.
    • Representing "data rates" ("tokens") in CL models.
    • "Processing delay" is missing in both models.
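One way to make these open questions concrete is a minimal data structure in which CL "things" become dataflow actors and "paths" become edges, with token rates attached to each edge endpoint. The class and field names below are hypothetical, not part of DIF or the CL tools; `processing_delay` is included precisely because it is the attribute missing from both models today.

```python
from dataclasses import dataclass, field

@dataclass
class Actor:
    # A CL-model "thing" mapped to a dataflow actor.
    # processing_delay is a hypothetical field (missing in both models).
    name: str
    processing_delay: float = 0.0

@dataclass
class Edge:
    # A CL-model "path" mapped to a dataflow edge; produce/consume are
    # token rates per firing of the source/sink actors.
    src: Actor
    dst: Actor
    produce: int = 1
    consume: int = 1

@dataclass
class DataflowModel:
    actors: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_path(self, src, dst, produce=1, consume=1):
        e = Edge(src, dst, produce, consume)
        self.edges.append(e)
        return e

# Usage: a two-actor waveform-level fragment.
fir = Actor("FIR1", processing_delay=1e-6)
dat = Actor("DAT")
m = DataflowModel(actors=[fir, dat])
m.add_path(fir, dat, produce=4, consume=1)
```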
Scheduling with Dataflow Models

• Scheduling based on dataflow models may achieve performance improvements with multi-rate processes (example from Will Plishker's presentation).
• SDR at the physical layer and MAC layer consists mostly of single-rate processes, and may not see significant performance improvement from dataflow-based scheduling.
• Multicore scheduling is an interesting topic.
  – Currently the assignments of "actors" to processors are done manually.
[Figure: multirate SDF graph for CD-to-DAT sample-rate conversion. Actors CD, FIR1, FIR2, FIR3, FIR4, DAT are connected by edges e1–e5 with production/consumption rates (1,1), (2,3), (4,7), (5,7), (4,1).]
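For a multirate chain like the CD-to-DAT example, a static schedule follows from the SDF balance equations: for every edge, (repetitions of source) × (production rate) = (repetitions of sink) × (consumption rate). A small sketch that solves these for a chain of actors (illustrative code, not TDP's scheduler):

```python
from fractions import Fraction
from math import lcm

def repetition_vector(rates):
    """rates[i] = (produce, consume) for the edge between actor i and i+1.
    Returns the minimal integer repetition vector for the chain."""
    # Fix the first actor's repetition count at 1, then propagate:
    # q[i] * produce = q[i+1] * consume  =>  q[i+1] = q[i] * produce / consume
    q = [Fraction(1)]
    for produce, consume in rates:
        q.append(q[-1] * produce / consume)
    # Scale to the smallest all-integer solution.
    scale = lcm(*(f.denominator for f in q))
    return [int(f * scale) for f in q]

# CD -> FIR1 -> FIR2 -> FIR3 -> FIR4 -> DAT with the edge rates above.
rates = [(1, 1), (2, 3), (4, 7), (5, 7), (4, 1)]
print(repetition_vector(rates))  # minimal firings of each actor per iteration
```

The resulting vector gives how many times each actor fires per schedule iteration; for single-rate graphs it degenerates to all ones, which is why the physical/MAC-layer processes above see little benefit.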
GPU and Multicore

• Our findings on CUDA:
  – Many specialized library functions optimized for GPUs.
  – Parallelization has to be implemented manually.
  – The UMD CUDA work (FIR and Turbo decoding) has not been connected to their dataflow work yet.
• Some considerations:
  – Extend our investigation to OpenCL.
  – Focus on CL modeling for multicore systems.
    • Automatically parallelize certain common DSP operations (e.g., FIR, FFT) from CL models.
      – Operation recognition and rule-based mapping.
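As a sketch of what "automatically parallelize common DSP operations" could mean for an FIR filter: split the input into independent chunks, each extended by taps−1 samples of overlap, so the chunks could be dispatched to separate cores or GPU blocks. This is a plain-Python illustration of the decomposition, not the UMD CUDA implementation.

```python
def fir(x, taps):
    """Reference direct-form FIR (valid outputs only)."""
    n = len(taps)
    return [sum(taps[k] * x[i + k] for k in range(n))
            for i in range(len(x) - n + 1)]

def fir_chunked(x, taps, n_chunks):
    """Compute the same FIR output by processing overlapping chunks.
    Each chunk is independent of the others, so the per-chunk calls
    could run in parallel on separate cores or GPU blocks."""
    n = len(taps)
    out_len = len(x) - n + 1
    chunk = -(-out_len // n_chunks)  # ceiling division
    out = []
    for start in range(0, out_len, chunk):
        stop = min(start + chunk, out_len)
        # Each chunk needs taps-1 extra input samples of overlap.
        out.extend(fir(x[start:stop + n - 1], taps))
    return out

x = list(range(20))
taps = [1, 2, 3]
assert fir_chunked(x, taps, 4) == fir(x, taps)
```

The rule-based mapping mentioned above would recognize the convolution pattern in a CL model and apply exactly this kind of overlap decomposition automatically.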
Next Step

• Beyond rehosting: optimal code generation.
  – c/c++ → (CL model) → SPIRAL
  – c/c++ → (CL model) → CUDA or OpenCL (GPU and multicore)
  – c/c++ → (CL model) → c/c++ using SSE intrinsics
• CL modeling tasks:
  – At both the primitive level and the waveform level.
  – CL modeling from AST.
  – DSP operation (or primitive) recognition.
  – Code segment extraction, validation, and transformation.
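The "CL modeling from AST" and "operation recognition" tasks could start from something as simple as walking a syntax tree for a multiply-accumulate loop, the core pattern of an FIR filter. Below is a toy rule-based recognizer over Python ASTs (the real targets are C/C++ ASTs, and the rule here is a hypothetical, deliberately crude one):

```python
import ast

def looks_like_mac_loop(source):
    """Toy rule: a for-loop whose body contains an augmented assignment
    (acc += ...) whose right-hand side is a multiplication. Real operation
    recognition would need far stronger structural checks."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.For):
            for stmt in ast.walk(node):
                if (isinstance(stmt, ast.AugAssign)
                        and isinstance(stmt.op, ast.Add)
                        and isinstance(stmt.value, ast.BinOp)
                        and isinstance(stmt.value.op, ast.Mult)):
                    return True
    return False

fir_src = """
def fir_tap(x, taps, i):
    acc = 0.0
    for k in range(len(taps)):
        acc += taps[k] * x[i + k]
    return acc
"""

copy_src = """
def copy(x):
    out = []
    for v in x:
        out.append(v)
    return out
"""

print(looks_like_mac_loop(fir_src), looks_like_mac_loop(copy_src))
```

Once a pattern is recognized, the rule-based mapping stage would rewrite the matched code segment against SPIRAL, CUDA/OpenCL, or SSE-intrinsic back ends, with validation that the transformed segment preserves the original semantics.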