Floating Point Representation · 2018-10-29 · EE 3610 Digital Systems Suketu Naik . Fixed Point...

EE 3610 Digital Systems Suketu Naik

Floating PointRepresentation

EE 3610: Digital Systems

Floating boat

Floating numbers

4Fixed Point Numbers

Generic Binary RepresentationExample: 1101012 1 x 25 + 1 x 24 + 0 x 23 + 1 x 22 + 0 x 21 + 1 x 20

= 32 + 16 + 4 + 1 = 5310Example: 1011.10112 1 x 23 + 0 x 22 + 1 x 21 + 1 x 20 1 x 2-1 + 0 x 2-2 + 1 x 2-3 + 1 x 2-4

= 11. 6875

N-bit fixed point, 2’s complement integer representation X = -bN-1 2N-1 + bN-2 2N-2 + … + b020

5Fixed Point Numbers‘Fixed Point’ number has a fixed number of digits after and before the decimal point

Qm.n notation m bits for integer portion n bits for fractional portionTotal number of bits N = m + n + 1, for signednumbersExample: 16-bit number (N=16) and Q2.13 format 2 bits for integer portion 13 bits for fractional portion 1 signed bit (MSB)

Examples:1110 in Integer Representation Q3.0-23 + 22 + 21 = -211.10 Fractional Q1.2 Representation-21 + 20 + 2-1 = -2 + 1 + 0.5 = -0.5

1.110 Fractional Q0.3 Representation-20 + 2-1 + 2-2 = -1 + 0.5 + 0.25 = -0.25

Difficult to use due to possible overflow In a 16-bit processor, the dynamic range is -32,768 to 32,767.Example: 200 × 350 = 70000: an overflow!

Fixed Point Numbers: Qm.n notation

7Fixed Point Numbers: Qm.n notationDynamic Range and Precision of 16-Bit Numbers for Different Q Formats

8Fixed Point Numbers: Adv. And Disadv. Disadvantage of fixed point: dynamic range is small

Advantages of fixed point: logic circuits simpler, less memory,less processor speed: smaller, faster, cheaper, lower power consumption

9Application of Fixed Point DSPADSP Blackfin: multi-format audio, video, voice and image

processing Example: Undersea pipline monitoring in Orman Natural

Gas Field Hydro-acoustics and non-contact sensors together with ADSP Blackfin and LabVIEW Embedded Module for sub sea state monitoring and analysis tasks

ADI Blackfin DSP Orman Natural Gas Field (Norway)

10Why Floating Point?

Floating point numbers are used to represent fractions and allow more precision

e.g. 35.261, 3.14159, 1.25

Arithmetic units for floating-point numbers are more complex than fixed-point numbers

IEEE 754 Floating-Point Formats

Single precision (32-bit) Double precision (64-bit) Extended precision (128-bit)

11Basics

N= F x BE

N = a floating point (real) number F = fraction B = base E = exponentBase can be 2, 10, 16 or other

Here we consider B=2

Fraction F and Exponent E can be represented in a number of ways

12Basics

Binary numbers, like decimal numbers, can be represented in scientific notation

Example: 923.5210 can be represented as 9.2352 x 102

Similarly, 101011.1012 = 43.62510 can be represented as 1.01011101 x 25

13Basics: Sign Field

43.62510 101011.1012 1.01011101 x 25

0 xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxx

Sign field is 1-bit long: 0 (positive) or 1 (negative)

N= (-1)S x (1 + F) x 2E

S = Sign fieldE = Exponent fieldF = Mantissa (Fractional) Field

14Basics: Exponent Field

43.62510 101011.1012 1.01011101 x 25

0 10000100 xxxx xxxx xxxx xxxx xxxx xxx

Exponent field is 8-bits long: range 0 to 255Add 127 to the exponent in scientific notationHere 5 + 127 = 132: 10000100 (If the exponent is less than 127 then it is negative)

15Basics: Mantissa (Fraction) Field

43.62510 101011.1012 1.01011101 x 25

0 10000100 01011101000000000000000

Mantissa field is 23-bits long: throw away 1 to the leftof binary point and pad it with zeros at the end1.01011101 01011101000000000000000

16Single Precision Representation

43.62510 101011.1012 1.01011101 x 25

Sign Exponent Fraction

0 10000100 01011101000000000000000

17Formats: Single PrecisionSingle Precision

Example: 13.4510 : 1101.01 1100 1100 1100 ... 2: 1.10101 1100 1100 ... x 23

(Note: 0.45 produces repeating binary fractions)Sign: positive: 0Exponent: 3 + 127 = 130: 10000010Mantissa (Fraction): 10101 1100 1100 1100 1100 11:

18Formats: Double PrecisionDouble Precision

Example: 13.4510 : 1101.01 1100 1100 1100 ... 21.10101 1100 1100 ... x 23

(Note: 0.45 produces repeating binary fractions)Sign: positive: 0Exponent: 3 + 1023 = 1026: 10000000010 Mantissa (Fraction): 10101 1100 1100 1100 1100 1100 1100 1100 1100 1100 1100 1100 110

19IEEE 754 Floating Point: Range

Type Size Exponent Mantissa Range

Single 32-bit 8-bit 23-bit 2x10+/-38

Double 64-bit 11-bit 52-bit 2x10+/-308

20Floating Point (FP) to DecimalExample: HEX 0xC0B40000, assume signal precision FP

Step 1: Convert HEX into Binary

HEX C 0 B 4 0 0 0

Binary 1100 0000 1011 0100 0000 0000 0000

21Floating Point (FP) to DecimalStep 2: Reorganize into packets of 1, 8, and 23 bits long

1 10000001 01101000000000000000000

HEX C 0 B 4 0 0 0

Binary 1100 0000 1011 0100 0000 0000 0000

22Floating Point (FP) to DecimalStep 3: Conversion

1 10000001 01101000000000000000000

Sign=1 (negative)Exponent= 10000001 = 1 x 27+ 1 x 20 = 129 => 129-127 = 2Mantissa= 1.01101

N= -1.01101 x 22

= -101.101= -(1 x 22 + 1 x 20 + 1 x 2-1 + 1 x 2-3)= -(4 + 1 + 0.5 + 0.125 ) = -5.625

23Floating Point (FP): VHDL

use ieee.float_pkg.all; variable x, y, z : float (5 downto -10); begin y := to_float (3.1415, y); -- Uses “y” for the sizing only. z := “0011101010101010”; –- 1/3 x := z + y;

24Application of Floating Point DSPADSP SHARC: most computationally intensive, real-time signal-processing applications Examples: High Definition Audio, Precision Motor/Tool Control, Precision Sensors

ADI SHARC DSP

25ReferencesFloating Point Converter http://babbage.cs.qc.cuny.edu/IEEE-754.old/Decimal.html

32-bit single precision floating point adderhttp://upcommons.upc.edu/pfc/bitstream/2099.1/15467/4/32BitFloatingPointAdder.pdf

ADI SHARC Processors http://www.analog.com/en/processors-dsp/sharc/products/index.html

Floating Point Representation · 2018-10-29 · EE 3610 Digital Systems Suketu Naik . Fixed Point...

Documents

Transcript of Floating Point Representation · 2018-10-29 · EE 3610 Digital Systems Suketu Naik . Fixed Point...

PERFORMANCE ANALYSIS OF A FIXED POINT STAR TRACKER · PDF fileperformance analysis of a fixed point star tracker ... performance analysis of a fixed point star tracker algorithm ...

Dual Fixed-Point Calculator

Fixed Point Iteration - KSU

Coping with Fixed Point

Floating Point to Fixed

Fixed Point Types Qn.m

Fixed Point Filtering imlementation

One Variable Fixed Point

Fixed Point Iteratio

Some fixed point and common fixed point theorems of integral

iTrans Fixed Point Transmitter

Fixed Point Routines - 00617b

Specification of Extended Fixed Point Routines€¦ · Specification of Extended Fixed Point Routines V 2.0.0 R 4.0 Rev 3 Document Title Specification of Extended Fixed Point Routines

The Brouwer fixed point theorem

Suzuki Type Unique Common Tripled Fixed Point Theorem for … · some fixed point theorems in partial metric spaces for a single map. For more works on fixed, common fixed point theorems

Bisection and fixed point method

Floating-Point to Fixed-Point Compilation and Embedded ...ece.ubc.ca › ~aamodt › papers › aamodt-masc.pdf · Floating-Point to Fixed-Point Compilation and Embedded Architectural

Common Carrier Fixed Point Microwavejirwinconsulting.com/FCC - DN Common Carrier Fixed Point Microwav… · Del Norte County Common Carrier Fixed Point Microwave Page 2 Revised May

Some Fixed Point Theorems in Menger Spaces and Applicationsshodhganga.inflibnet.ac.in/bitstream/10603/20254/1/some fixed point... · SOME FIXED POINT THEOREMS IN MENGER SPACES AND

Fixed Point Comp