
Page 1: TigerSHARC CLU Closer look at the XCORRS M. Smith, University of Calgary, Canada smithmr@ucalgary.ca.

TigerSHARC CLU

Closer look at the XCORRS

M. Smith,

University of Calgary, Canada

smithmr@ucalgary.ca


Overview

Recap GPS correlation
Look at XCORRS instruction in detail
  This was part of Take home quiz for 5005
Additional information on the web
  Xcorrs.asm: assembly code discussed in class
  Xmain.cpp: demonstrates the use of the xcorrs.asm code
  XcorrsTest.cpp: demonstrates testing of all the functions being used
  Additional correlation presentations (not XCORRS) from Analog Devices developers
In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation. If my figures are not the same as in the manual, then they fixed the manual errors.


GPS Positioning Concepts

(1)

For now make 2 assumptions:
  We know the distance to each satellite
  We know where each satellite is

With this information from 2 satellites, you know you are on a “plane of intersection”.

Require 3 satellites for a 3-D position in this “ideal” scenario.
Require 4 satellites to account for local receiver clock drift.


Determining Time

Use the PRN code to determine time
Use time to determine distance to the satellite

distance = speed of light * time

(1)

Signal sent by satellite

Signal received by you

You know the signal sent

Perform correlations till you get a match
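The idea on this slide can be sketched in a few lines of Python: slide the known signal against the received one, keep the shift with the largest correlation, then turn that delay into a distance with distance = speed of light * time. This is only an illustration. The 16-sample code below is a made-up +/-1 sequence (not a real GPS PRN code), and `find_delay`, `delay_to_distance` and the sample rate argument are hypothetical names, not part of the course code.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def find_delay(received, reference):
    """Return the circular shift of `reference` that best matches `received`."""
    n = len(reference)
    best_shift, best_score = 0, float("-inf")
    for shift in range(n):
        # Correlate: multiply sample by sample and accumulate.
        score = sum(received[i] * reference[(i - shift) % n] for i in range(n))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def delay_to_distance(delay_samples, sample_rate_hz):
    """distance = speed of light * time, with time = samples / sample rate."""
    return SPEED_OF_LIGHT * delay_samples / sample_rate_hz


# Hypothetical +/-1 code (NOT a real GPS PRN sequence), delayed by 5 samples.
code = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
received = [code[(i - 5) % len(code)] for i in range(len(code))]
delay = find_delay(received, code)
```

The brute-force loop costs one full correlation per candidate shift, which is exactly the workload the rest of the lecture tries to speed up.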


The practice

Suppose we have the vector below: in-phase and out-of-phase data gathered over an antenna from a satellite, for example. Gain issues make it x16.

-16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc.

Question: if the original data from the satellite had this form
-1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, ...

How much is the satellite data delayed?
FOR THIS EXAMPLE: 0, 3, 6, 9, 12, etc. (the pattern repeats every 3 samples, so every multiple of 3 gives a match)
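A small Python check of this example, under stated assumptions: 15 samples are used, the correlation is circular, and the reference is conjugated (the usual textbook convention; the slides do not say which convention the hardware uses). `circular_correlation` is a hypothetical helper written for this sketch.

```python
def circular_correlation(received, reference):
    """Correlate `received` against every circular shift of `reference`."""
    n = len(reference)
    return [
        sum(received[i] * reference[(i - shift) % n].conjugate() for i in range(n))
        for shift in range(n)
    ]


pattern = [-1 - 1j, 1 + 1j, 1 + 1j]             # original satellite data
reference = pattern * 5                         # 15 samples
received = [16 * value for value in reference]  # same data with x16 gain

scores = circular_correlation(received, reference)
peak = max(score.real for score in scores)
peak_shifts = [shift for shift, score in enumerate(scores) if score.real == peak]
```

With these inputs the largest real part appears at every third shift, which matches the 0, 3, 6, 9, 12 answer on the slide.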


Tackle the issue with FIR

First: modify the correlation function to handle complex values. Ignore that issue at the moment.
  Cost per tap goes from 1 add + 1 multiplication + 2 memory fetches to 3 adds + 4 multiplications + 4 memory fetches

Imagine 1024 data points + 1024 PRN
  Need to do 1024 FIR, each of 1024 taps
  We know how to optimize to do 2 taps every cycle (one in X and one in Y)
  Cycle time is 1024 * 512 cycles = about 1 ms at 500 MHz

XCORRS can do 8 * 16 = 128 taps each cycle in each compute block: 128 times faster
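A quick check of the arithmetic on this slide, written as plain Python. The constant names are made up for the sketch. The comparison assumes the optimized FIR does 1 tap per cycle in each compute block, as stated above.

```python
DATA_POINTS = 1024      # number of FIR evaluations (one per shift)
TAPS_PER_FIR = 1024     # taps in each evaluation
CLOCK_HZ = 500e6        # 500 MHz

total_taps = DATA_POINTS * TAPS_PER_FIR
cycles_two_taps = total_taps // 2            # 2 taps per cycle: 1024 * 512 cycles
time_ms = cycles_two_taps / CLOCK_HZ * 1e3   # a little over 1 ms

xcorrs_taps_per_block = 8 * 16               # 8 taps for each of 16 correlations
speedup_per_block = xcorrs_taps_per_block / 1  # versus 1 tap per cycle per block
```

Here 8 * 16 works out to 128 taps per cycle per compute block, so the ratio against one tap per cycle per block is 128.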


Where does the CLU fit in?


XCORRS definition


THEORY
Mathematical definition

Uses registers
  TR -- accumulate
  D -- 8 data?
  C -- 1 coefficient?

And something called CUT, essentially a window operation
  fcut = 0 -- don’t use


2005 Lab. 4
Satellite data

Quad fetch brings in 8 complex values, 8 bits each
Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, ...


PRN code – 2 bit complex number

Seems strange to have two dummy bits
But actually makes sense

PRN: -1 - j, 1 + j, 1 + j, -1 - j, 1 + j, 1 + j, ...

+1, -1 are associated with the PSK; more in another lecture

Problem: BINARY means 1 and 0, so how do we represent 1 and -1?

-1 are stored as 1’s, +1 are stored as 0’s (DAMY)


PRN


PRN

0x3 value goes in as C15 and C16
0011 -- C15 = -1 - j, C16 = +1 + j
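The storage rule (a 1 bit means -1, a 0 bit means +1) can be sketched as a small Python decoder. Which of the two bits holds the real part and which the imaginary part is an assumption made here for illustration, as is the lowest-bits-first unpacking order; `decode_prn_pair` and `decode_prn_word` are hypothetical helpers, not TigerSHARC functions.

```python
def decode_prn_pair(bits):
    """Decode 2 bits into a complex +/-1 +/-1j value (bit set means -1)."""
    real = -1 if bits & 0b01 else 1   # assumed: low bit is the real part
    imag = -1 if bits & 0b10 else 1   # assumed: high bit is the imaginary part
    return complex(real, imag)


def decode_prn_word(word, count):
    """Unpack `count` coefficients from `word`, lowest 2 bits first (assumed order)."""
    return [decode_prn_pair((word >> (2 * k)) & 0b11) for k in range(count)]


# 0x3 is binary 0011: the low pair 11 and the next pair 00.
first, second = decode_prn_word(0x3, 2)
```

Under these assumptions the low pair 11 decodes to -1 - j and the next pair 00 to +1 + j, which is the C15, C16 pairing shown on the slide.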


Loading the THR registers


Standard XCORRS instruction

Lower 46 bits of THR1:0

R7:4

TR0, TR1, TR2 ……. TR15


TR15:0 = XCORRS(R7:4, THR3:0)

Doing 8 complex taps of 16 correlations at each cycle

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 8 taps
...
TR15 += D7 * C7 + D6 * C6 + ... 8 taps

64 taps each cycle, on both x and y compute blocks, if set up properly

128 taps each cycle; these are “complex taps”, compared to 2 real taps / cycle after lab. 3
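Reading the index pattern off the listing gives TRk += sum over i = 0..7 of D[i] * C[i + 15 - k], which needs C0 to C22 (23 two-bit coefficients, i.e. 46 bits). A minimal Python model of that listing follows. It is a sketch of the slide's arithmetic only: it uses a plain product as written on the slide, and it says nothing about how the hardware handles conjugation, saturation or register shifting. `xcorrs_model` is a hypothetical name.

```python
def xcorrs_model(tr, d, c):
    """Accumulate TRk += sum over i of D[i] * C[i + 15 - k], for k = 0..15.

    tr: 16 accumulators, d: 8 data values (D0..D7), c: 23 coefficients (C0..C22).
    """
    assert len(tr) == 16 and len(d) == 8 and len(c) == 23
    for k in range(16):
        for i in range(8):
            tr[k] += d[i] * c[i + 15 - k]
    return tr


d = [complex(i + 1, 0) for i in range(8)]   # D0..D7 = 1..8, arbitrary test data
c = [complex(1, 1)] * 23                    # all coefficients +1 + j
tr = xcorrs_model([0j] * 16, d, c)
products_in_listing = 16 * 8                # 16 accumulators, 8 products each
```

Setting a single coefficient to 1 and the rest to 0 is a handy way to confirm which data value lands in which accumulator.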


TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)

Because of offsets, sometimes we must only use “some of the taps”

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 8 taps
...
TR14 += D7 * C8 + D6 * C7   2 taps
TR15 += D7 * C7   1 tap


TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 7 taps
...
TR7 += D7 * C15   1 tap
TR8 += 0   0 taps
...
TR15 += 0   0 taps


TR15:0 = XCORRS(R7:4, THR3:0) (CUT +7?)

TR0 += 0   0 taps
TR1 += D0 * C14   1 tap
...
TR7 += D6 * C14 + D5 * C13 + ... 7 taps
TR8 += D7 * C14 + D6 * C13 + ... 8 taps
...
TR15 += D7 * C7 + D6 * C6 + ... 8 taps
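The tap counts in the CUT listings can be reproduced with a simple window on the coefficient index. The rule below is inferred from the listings on these slides, not taken from the manual, so treat it as an assumption: a negative CUT of -n drops products whose coefficient index is below n, and a positive CUT of +n drops products whose coefficient index is n or above. `xcorrs_cut_model` and `taps_used` are hypothetical names.

```python
def xcorrs_cut_model(tr, d, c, cut=0):
    """Model of TRk += sum of D[i] * C[i + 15 - k] with a CUT window (inferred rule)."""
    for k in range(16):
        for i in range(8):
            j = i + 15 - k                  # coefficient index for this product
            if cut < 0 and j < -cut:
                continue                    # negative CUT: drop coefficients below C[-cut]
            if cut > 0 and j >= cut:
                continue                    # positive CUT: drop C[cut] and above
            tr[k] += d[i] * c[j]
    return tr


def taps_used(cut):
    """Count the products kept in each accumulator by feeding in all ones."""
    return xcorrs_cut_model([0] * 16, [1] * 8, [1] * 23, cut)
```

In this model `taps_used(-7)` ends with 2 taps in TR14 and 1 tap in TR15, and `taps_used(-15)` runs from 8 taps in TR0 down to 0 taps from TR8 onward. The listing that starts with 0 taps in TR0 and reaches 8 taps from TR8 onward is reproduced with `cut=15`.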


TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 7 taps
...
TR7 += D7 * C15   1 tap
TR8 += 0   0 taps
...
TR15 += 0   0 taps


TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 8 taps
...
TR14 += D7 * C8 + D6 * C7   2 taps
TR15 += D7 * C7   1 tap


TR15:0 = XCORRS(R7:4, THR3:0)

TR0 += D7 * C22 + D6 * C21 + ... 8 taps
TR1 += D7 * C21 + D6 * C20 + ... 8 taps
...
TR15 += D7 * C7 + D6 * C6 + ... 8 taps

64 taps each cycle, on both x and y compute blocks, if set up properly

128 taps each cycle; these are “complex taps”, compared to 2 real taps / cycle after lab. 3


Problem at this point -- THR3:2 empty
Need to bring in more PRN values


TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)

TR0 += 0   0 taps
TR1 += D0 * C14   1 tap
...
TR7 += D6 * C14 + D5 * C13 + ... 7 taps
TR8 += D7 * C14 + D6 * C13 + ... 8 taps
...
TR15 += D7 * C7 + D6 * C6 + ... 8 taps


Final Result

Maximum correlation occurs every 3 shifts, which is what we expect
Is it the correct result?


Correlation – result expected

In step
-1 + 0j, 1 + 0j, 1 + 0j, ... 16 times
with
-1 - j, 1 + j, 1 + j, ... 16 times

-1 * -1 + 1 * 1 + 1 * 1 + ... = 48 = 0x30 -- Real component

Out of step
-1 + 0j, 1 + 0j, 1 + 0j, ... 16 times
with
1 + j, 1 + j, -1 - j, ... 16 times

-1 * 1 + 1 * 1 + 1 * -1 + ... = -16 = -0x10 = 0xFFF0
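The two expected values can be checked in a few lines of Python. This sketch multiplies data by PRN directly, as written on the slide; since the data here has no imaginary part, conjugating the PRN would change only the imaginary component, not the real one. `to_hex16` is a hypothetical helper that shows a value as a 16-bit two's complement hex string.

```python
def to_hex16(value):
    """16-bit two's complement representation as a hex string."""
    return f"0x{value & 0xFFFF:04X}"


data = [-1 + 0j, 1 + 0j, 1 + 0j] * 16            # 48 samples
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j] * 16
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j] * 16

# Real component of the accumulated products for each alignment.
in_step_real = int(sum(d * c for d, c in zip(data, prn_in_step)).real)
out_of_step_real = int(sum(d * c for d, c in zip(data, prn_out_of_step)).real)
```

Each in-step triple contributes 3 to the real part and each out-of-step triple contributes -1, so 16 repeats give 48 and -16.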


Final Result

1) Now have correlation values for 16 shifts in TR registers: store to external memory
   Repeat for all other necessary shifts, then find the maximum
2) Now make parallel in SISD mode
3) Now make parallel in SIMD

