Dan Werthimer & Aaron Parsons University of California, Berkeley
Correlators, Spectrometers, Beam Formers and VLBI using general purpose FPGA boards, tools & libraries
(how to build eight radio astronomy instruments in two years)
Dan Werthimer & Aaron Parsons
University of California, Berkeley
http://seti.berkeley.edu
Our research group is really 3 groups
• SETI (plus primordial black holes, HI mapping)
• Public Participation Distributed Computing
• CASPER – Center for Astronomy Signal Processing and Electronics Research
SETI Group
David Anderson, Bob Bankay, Court Cannick,
Jeff Cobb, Kevin Douglas, Josh Von Korff,
Eric Korpela, Matt Lebofsky, Dan Werthimer
UC Berkeley SETI Programs
Name        Time Scale      Search Type
SERENDIP    seconds         radio sky survey
SETI@home   ms - seconds    radio sky survey
Astropulse  ns - ms         radio sky survey
SEVENDIP    ns              visible targeted
SPOCK       1000 seconds    visible targeted
DYSON                       IR targeted
Public Participation Supercomputing Group
David Anderson, Rom Walton, SETI Group
• aka Distributed Computing
• aka “edge resource aggregation”
The SETI@home Client
SETI@home Statistics
                 TOTAL                                  RATE
Participants     5,464,550 (in 226 countries)           2,000 per day
Computer time    2.3 million years                      1,200 years per day
Computation      4 × 10^21 floating point operations    95 Tera-flops
BOINC: NSF
• Berkeley Open Infrastructure for Network Computing
– General-purpose distributed computing framework.
– Open source.
– Will make distributed computing accessible to those who need it. (Starting from scratch is hard!)
Projects
• Astronomy
– SETI@home (Berkeley)
– Astropulse (Berkeley)
– Einstein@home: gravitational pulsar search (Caltech,…)
– PlanetQuest (SETI Institute)
– Stardust@home (Berkeley, Univ. Washington,…)
• Earth science
– Climateprediction.net (Oxford)
• Biology/Medicine
– Folding@home, Predictor@home (Stanford, Scripps)
– FightAIDSathome: virtual drug discovery
• Physics
– LHC@home (CERN)
• Other
– Web indexing/search
– Internet Resource mapping (UC Berkeley)
Where's the computing power?
●2010: 1 billion Internet-connected PCs
●55% privately owned
● If 100M participate:
– 100 PetaFLOPs, 1 Exabyte (10^18) storage
[Figure: where the computing power is. Labels: your computers, academic, business, home PCs]
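As a quick sanity check on the "If 100M participate" figures, the totals can be divided back out per machine. This is only arithmetic on the numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
participants = 100e6          # "If 100M participate"
total_flops = 100e15          # 100 PetaFLOPS
total_storage_bytes = 1e18    # 1 Exabyte

# What each participating PC would have to contribute on average.
flops_per_pc = total_flops / participants
storage_per_pc_bytes = total_storage_bytes / participants

print(f"{flops_per_pc / 1e9:.0f} GFLOPS per PC")
print(f"{storage_per_pc_bytes / 1e9:.0f} GB per PC")
```

So the projection implies roughly 1 GFLOPS and 10 GB contributed per PC.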
CASPER:
Center for Radio Astronomy Signal Processing and Electronics Research
Henry Chen, Daniel Chapman, Pat Crescini, Christina DeJesus, Pierre Droz,
Kirsten Meder, Jeff Mock, Aaron Parsons, Andrew Siemion, Dan Werthimer
Radio Astronomy Lab: Don Backer, Paul Demorest, Matt Dexter,
Carl Heiles, David McMahon, Mel Wright, Lynn Urry
Berkeley Wireless Research Center: Bob Brodersen, Chen Chang, John Wawrzynek
SETI Institute: Dave DeBoer
CASPER Origins
• NSF proposal to build SETI spectrometer (2003)
(added one paragraph: BTW, this can be used for other astronomy instrumentation, potential spin offs are ….)
Reviewer’s comments (paraphrased):
~“SETI is bullshit, SETI will never find anything,
but these instruments are useful for the community, strongly recommend funding”
CASPER Real-time Signal Processing Instrumentation (NSF ATI)
• Low NRE, shared by the community
• Rapid development (8 instruments / 2 years)
• Open-source, collaborative
• Reusable, platform-independent gateware
• Modular, upgradeable hardware
• Industry standard communication protocols
• Low Cost
MOTIVATION
ATA, SKA, Focal Plane Arrays, SETI
need >> PetaOp/sec
Instruments take a long time to build, very high NRE
Allen Telescope Array
• 6.1-meter offset Gregorian (2.4-meter secondary)
• rim-supported, hydroformed dishes
ATA-42 Operational This Summer
The Radio Revolution
SKA: Square Kilometer Array
[Figure: SKA layout. Labels: inner core, station]
The Problem with the Current Hardware Development Model
• Takes 5 years
• Cost dominated by NRE because of custom boards, backplanes, protocols
• Antiquated by the time it’s released.
Solution:
• Modular Hardware
– Low number of board designs
– Can be upgraded piecemeal or all together
– Reusable
– Standard signal processing model which is consistent between upgrades.
Solution: use FPGAs
1 FPGA = 100 Pentiums, 1/500 the power per op
Computational Density Comparison
[Figure: computational density vs. release date (1995 to 2004). y-axis: (MOPS/MHz)*lambda^2, log scale. Series: Processor Peak, FPGA 32-bit int MAC]
Moore's Law for FPGAs
[Figure: FPGA maximum sustained performance vs. release date (1996 to 2002). y-axis: MOPS (32-bit MAC), log scale]
3X improvement per year!
FPGA = Field Programmable Gate Array
• reconfigurable computing - 1 minute
• 100 times faster than CPU, 5 times less power
• integer arithmetic, not good at floating point
• highly parallel (500 multipliers per chip)
• harder to program (Matlab Simulink)
• tools to abstract the hardware away
• signal processing libraries
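The "1/500 the power per op" figure quoted earlier appears to follow from combining the two ratios on this slide, if "5 times less power" is read as one fifth of the CPU's power draw. A small arithmetic sketch under that assumption (variable names are illustrative):

```python
speedup = 100.0         # FPGA throughput relative to a CPU (slide figure)
power_ratio = 1 / 5.0   # FPGA power relative to a CPU ("5 times less power")

# Energy per operation is power divided by throughput.
energy_per_op_ratio = power_ratio / speedup
print(f"FPGA energy per op is about 1/{1 / energy_per_op_ratio:.0f} of the CPU's")
```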
Compute Module Diagram
[Figure: compute module block diagram. Labels: 5 FPGAs (2VP70FF1704), each with FPGA fabric, MGT, and memory controller; 4GB DDR2 DRAM, 12.8GB/s (400DDR); 138 bits, 300MHz DDR, 41.4Gb/s; 100BT Ethernet; IB4X/CX4 20Gbps; four IB4X/CX4 40Gbps]
Platform-Independent, Parameterized Gateware
• What is Gateware?
– Design logic of FPGAs (between hardware and software)
• Need libraries for signal processing which don’t have to be rewritten every hardware generation.
• Matlab Simulink!
Biplex Pipelined FFT
• Uses 1/6 the resources of the Xilinx module.
FFT controls (Verilog and Simulink Libraries)
• Transform length
• Bandwidth
• Complex or Real
• Number of Polarizations
• Input bit width and output bit width
• Twiddle coefficient bit width
• Run-time programmable down-shifting
• Decimate option
Filter Response: PFB vs. FFT
[Figure: filter response, PFB vs. FFT]
Additional PFB controls
• Filter overlap
• Width of filter coefficients
• Window function for filter (Hamming, Hanning, etc.)
• Import filter coefficients for custom filter performance
Both FFT and PFB available as Verilog modules
(no proprietary software, but not as portable between chips/architectures).
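To make the PFB vs. FFT comparison concrete, here is a minimal NumPy sketch of a polyphase filter bank frame: the samples are weighted with a windowed-sinc prototype filter, folded, and then passed to an ordinary FFT. It is a floating-point illustration of the general technique, not the CASPER gateware; `pfb_frame`, `leakage_db`, and the 64-channel, 4-tap sizes are assumptions chosen for the example.

```python
import numpy as np

def pfb_frame(x, n_chan, n_taps, window=np.hamming):
    """One frame of PFB channel outputs from the first n_chan * n_taps samples of x."""
    n = n_chan * n_taps
    # Windowed-sinc prototype low-pass filter, about one channel wide.
    t = (np.arange(n) - (n - 1) / 2) / n_chan
    coeffs = np.sinc(t) * window(n)
    # Weight the samples, fold the n_taps segments on top of each other, then FFT.
    folded = (x[:n] * coeffs).reshape(n_taps, n_chan).sum(axis=0)
    return np.fft.fft(folded)

def leakage_db(spectrum, far_bin):
    """Power in a distant channel relative to the strongest channel, in dB."""
    power = np.abs(spectrum) ** 2
    return 10 * np.log10(power[far_bin] / power.max())

n_chan, n_taps = 64, 4
n = np.arange(n_chan * n_taps)
tone = np.exp(2j * np.pi * 10.5 * n / n_chan)  # halfway between channels 10 and 11

fft_leak = leakage_db(np.fft.fft(tone[:n_chan]), far_bin=30)
pfb_leak = leakage_db(pfb_frame(tone, n_chan, n_taps), far_bin=30)
print(f"leakage into channel 30: FFT {fft_leak:.1f} dB, PFB {pfb_leak:.1f} dB")
```

Swapping `window=np.hanning`, or changing `n_taps`, corresponds loosely to the window-function and filter-length controls listed above.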
Digital Down-Converter
• Selectable # of FIR taps
• On-the-fly programmable mix frequency
• Selectable FIR coeff
• Agile sub-band selection.
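The controls above map onto the three stages of a textbook digital down-converter: mix, low-pass filter, decimate. The following is a rough floating-point sketch of those stages in NumPy, not the CASPER library block; the function names, the 129-tap Hamming-windowed filter, and the 30 MHz test tone are all assumptions made for illustration.

```python
import numpy as np

def lowpass_taps(n_taps, cutoff):
    """Hamming-windowed sinc low-pass FIR; cutoff in cycles per sample (0 to 0.5)."""
    k = np.arange(n_taps) - (n_taps - 1) / 2
    taps = np.sinc(2 * cutoff * k) * np.hamming(n_taps)
    return taps / taps.sum()  # unity gain at DC

def ddc(x, mix_freq, sample_rate, taps, decimation):
    """Mix mix_freq down to 0 Hz, low-pass filter, keep every decimation-th sample."""
    n = np.arange(len(x))
    local_osc = np.exp(-2j * np.pi * mix_freq / sample_rate * n)
    filtered = np.convolve(x * local_osc, taps, mode="same")
    return filtered[::decimation]

sample_rate = 200e6                               # illustrative values
n = np.arange(4096)
x = np.cos(2 * np.pi * 30e6 / sample_rate * n)    # real 30 MHz test tone
taps = lowpass_taps(n_taps=129, cutoff=0.5 / 16)  # pass only the decimated band
y = ddc(x, mix_freq=30e6, sample_rate=sample_rate, taps=taps, decimation=16)
print(len(y), abs(y[len(y) // 2]))
```

Because the tone sits exactly at the mix frequency, the output away from the edges is close to a constant 0.5 (half the amplitude of the real input); changing `mix_freq` selects a different sub-band.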
X-Engine Correlation Architecture (Lynn Urry, Aaron Parsons)
X-Engine Architecture: applied to an arbitrary sized antenna array
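The slides do not spell out the X-engine internals, so the sketch below shows only the generic operation an X-engine performs in an FX correlator: for every frequency channel, multiply each antenna's spectrum by the conjugate of every other antenna's and accumulate over time. The array shapes and the name `x_engine` are assumptions for illustration, not the Urry/Parsons architecture.

```python
import numpy as np

def x_engine(spectra):
    """Cross-multiply and accumulate every antenna pair, channel by channel.

    spectra: complex array of shape (n_ant, n_time, n_chan), e.g. FFT or PFB output.
    Returns (ant_i, ant_j, vis) with vis of shape (n_baselines, n_chan).
    """
    n_ant = spectra.shape[0]
    ant_i, ant_j = np.triu_indices(n_ant)  # all pairs i <= j, autocorrelations included
    vis = (spectra[ant_i] * np.conj(spectra[ant_j])).sum(axis=1)  # integrate over time
    return ant_i, ant_j, vis

rng = np.random.default_rng(0)
spectra = rng.normal(size=(4, 16, 8)) + 1j * rng.normal(size=(4, 16, 8))
ant_i, ant_j, vis = x_engine(spectra)
print(vis.shape)  # 4 antennas give 4*5/2 = 10 baselines, 8 channels each
```

The number of products grows as N(N+1)/2 with antenna count N, which is why the slide stresses scaling to an arbitrary sized array.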
Hardware and Software Libraries
[Figure: library diagram with legend; labels include Applications]
Global Interconnects
• Commercial Infiniband switch from Mellanox, Voltaire, etc.
– Packet switched, non-blocking
– 24 ~ 144 ports (4X) per chassis
– Up to 10,000 ports in a system
– 200~1000 ns switch latency
– 400~1200 ns FPGA to FPGA latency
– 480Gbps ~ 2.88Tbps full duplex constant cross section bandwidth
– <$400 per port
[Figure: compute nodes #1 through #N connected by an Infiniband crossbar switch and an Ethernet switch]
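The cross-section bandwidth range quoted for the switch is consistent with the port counts if each 4X port is taken as 20 Gbps full duplex (10 Gbps each way). That per-port figure is an assumption here, chosen to match the "IB4X/CX4 20Gbps" label on the compute module diagram.

```python
# Assumption: one 4X InfiniBand port carries 10 Gbps each way, 20 Gbps full duplex.
port_full_duplex_gbps = 20

for ports in (24, 144):
    total_gbps = ports * port_full_duplex_gbps
    print(f"{ports} ports -> {total_gbps} Gbps ({total_gbps / 1000:.2f} Tbps)")
```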
[Figure: reconfigurable compute cluster. Labels: ADCs feeding polyphase filter banks (PFB) on FPGA DSP modules; commercial off-the-shelf multicast 10 Gbps (10GE or InfiniBand) switch; FPGA DSP modules and general-purpose CPUs running the correlator, beamformers/spectrometers, pulsar timer, ...]
Beowulf-cluster-like general-purpose architecture
Dynamic allocation of resources, need not be FPGA based
Targeted Applications
• Moderate to high-bandwidth problems
– For low bandwidths, just use CPUs
• Lower to mid-scale computation
– For very large applications (SKA), may be more cost effective to design ASICs
• Rapid Development
Applications
• VLBI Mark 5B data recorder - Haystack – 500 MHz
  VLBA and Beamforming - CfA, Bob Wilson, Jonathan Weintroub
• SETI – Arecibo (UCB)
  ATA (UCB, SETI Institute)
  JPL/UCB/SI DSN 20 GHz, 2pol (Preston, Gulkis, Levin, Jones)
• Correlators and Imagers:
  ATA (Mel Wright)
  Reionization Experiment (Backer/NRAO)
  CARMA Next Gen (Dave Hawkins, Caltech)
  SKA demonstrator South Africa (Justin Jonas)
  MWAR, LWA – MIT, NRL
128 Million Channel SETI Spectrometer
• 200 MHz Bandwidth, 2 Hz resolution
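One plausible reading of how the channel count relates to the other two figures: 2 Hz resolution over 200 MHz needs at least 100 million channels, and a power-of-two transform rounds that up. The sketch below assumes a power-of-two length and that "128 million" means 128 × 2^20; neither is stated on the slide.

```python
import math

bandwidth_hz = 200e6
target_resolution_hz = 2.0

min_channels = bandwidth_hz / target_resolution_hz      # 100 million
n_channels = 2 ** math.ceil(math.log2(min_channels))    # next power of two
resolution_hz = bandwidth_hz / n_channels

print(f"{n_channels:,} channels, {resolution_hz:.2f} Hz per channel")
```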
1 GHz bandwidth “Pocket Spectrometer”
• Using ATMEL ADCs at 2 Gsamples/sec
• Performing 4 real FFTs in 1 (complex) biplex pipelined FFT module.
• 2048 channels
• Uses just 1 ADC, 1 IBOB, and your laptop.
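The slide does not say how the four real streams are packed into the biplex FFT. One standard way to get real FFTs out of a complex one is to place one real stream in the real part and another in the imaginary part, then separate the results using conjugate symmetry. With two complex streams in a biplex core, that would give four real FFTs. A NumPy sketch of the two-for-one step (the function name is illustrative):

```python
import numpy as np

def two_real_ffts(a, b):
    """FFTs of two equal-length real sequences computed with one complex FFT."""
    z = np.fft.fft(a + 1j * b)
    z_rev = np.conj(np.roll(z[::-1], 1))  # conj(Z[-k]) for each bin k
    fft_a = (z + z_rev) / 2               # even (Hermitian) part belongs to a
    fft_b = (z - z_rev) / 2j              # odd part belongs to b
    return fft_a, fft_b

rng = np.random.default_rng(1)
a, b = rng.normal(size=2048), rng.normal(size=2048)
fft_a, fft_b = two_real_ffts(a, b)
print(np.allclose(fft_a, np.fft.fft(a)), np.allclose(fft_b, np.fft.fft(b)))
```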
Portable VLBI backend
• Interfaces to MARK 5B data recorder
• 500 MHz spectrum recorder.
• (This makes 4 instruments in 1 year!)
VLBI Mark 5B Front End
500 MHz BW, 32 channel filter bank
Multi-Purpose FPGA-Based Spectrometer – Low Bandwidth
[Figure: board block diagram. Labels: four 200 MHz ADCs (I and Q for Pol. 1 and Pol. 2), Xilinx Virtex-II 6000 FPGA, Xilinx Virtex-II 1000 FPGA, 256 MB DRAM, 200 aux. I/O, Compact PCI backplane, software]
SERENDIP V Spectrometer
SETI Applications
• JPL/UCB/SI DSN Sky Survey (20 GHz Bandwidth)
• Parkes Southern SERENDIP
• ALFA Sky Survey (300 MHz x 7 beams)
• SETI Italia (Bologna)
• SETI@home
Astronomy Applications
• GALFA Spectrometer – Arecibo Multibeam Hydrogen Survey
• Astronomy Signal Processor – ASP – Don Backer, Ingrid Stairs (pulsars)
• ATA4 Correlator F Engine
• Reionization Experiments (Backer (UCB), Chippendale/Ekers (ATNF))
• Antenna Holography, ATNF, China
• GMRT correlator
SERENDIP V
[Figure: block diagram. Labels: 100 MHz inputs for Pol. 1 and Pol. 2, polyphase filter bank, four servers with EDT cards, four GbE switches, groups of PCs]
Astronomy Signal Processor – Don Backer
GALFA Spectrometer
[Figure: GALFA spectrometer block diagram. Quadrature downconverter board labels: IF Pol. 1 and IF Pol. 2, sin and cos mixers at 100 MHz, LPFs, -50 to +50 MHz outputs. Multipurpose spectrometer board labels: biplex 256-point PFB; e^-it mixers, FIR LPFs, 12.5 MHz digital, decimate by 16, biplex 8192-point PFB; Stokes; cPCI backplane to CPU]
GALFA Lowpass Filter
[Figure: GALFA lowpass filter response]
Mars Orbiter mm Spectrometer
ASIC based spectrometer (Mars)
• 2W/ADC + 2W/ASIC = 4 Watts
• Use UCB’s “Chip in a Day” software
(compiles FPGA code into ASIC)
Use rad hard libraries from LBL
Moore's Law – Instruments using FPGAs: 2X per year (1,000,000 over 20 years)
Future Spectrometers
Year    Total bandwidth    Beams
2015    4 THz              400 (10 GHz each)
2020    128 THz            12,800
2025    4000 THz           40,000
2030    128,000 THz        1M
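The bandwidth column appears to follow the "2X per year" rule quoted above: doubling every year is a factor of 32 every five years and about a million over twenty. A short sketch that projects forward from the 2015 entry (the function name is illustrative, and the doubling rule is taken from the slide rather than independently verified):

```python
def projected_thz(year, start_year=2015, start_thz=4):
    """Total bandwidth in THz if it doubles every year from the 2015 figure."""
    return start_thz * 2 ** (year - start_year)

for year in (2015, 2020, 2025, 2030):
    print(year, f"{projected_thz(year):,} THz")

print(f"growth over 20 years: {2 ** 20:,}x")
```

The 2025 and 2030 projections (4,096 and 131,072 THz) match the table's rounded 4000 and 128,000 THz.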
Caveats
• Risky
• Simulink new, buggy, not open source
(Verilog, VHDL old)
just a bunch of clever students,
We’ve built the easy instruments so far
(not the hard ones), yet to demonstrate packetized
correlator and compute cluster
CASPER the Friendly...
• Group Helping Open-source Signal-processing Technology (GHOST?)
– Goal to help develop signal processing instrumentation and libraries for the community.
– Open source hardware, gateware, and software.
– Provide training and tutorials
– Not so much delivering turn-key instruments.
http://seti.berkeley.edu/casper