Speech Recognition Using FPGA Technology


Carlos Asmat – David López Sanzò – Kanwen Wu

Speech Recognition Using FPGA Technology

By

Carlos Asmat 260148251

David López Sansò 260146414

Kanwen Wu 260045745

Presentation Date: Wednesday, June 6, 2007

Project Supervisor: Prof. Miguel Marin

Project Coordinator: Prof. Kenneth L. Fraser


Outline

1) Introduction

2) MATLAB™ Demonstration

3) Hardware Implementation

4) Hardware Demonstration

5) Final remarks



What is speech recognition?

● Convert analog sound into binary digits.

● Compare with the pre-stored word.

● Not to be confused with speaker recognition.



Speech Recognition Performance

● Priority: Accuracy and Reliability.

● Consumer products.




Objectives

● Hardware implementation of a simple speech recognition system.

● Single word identification.

● Cost efficiency, reliability, and simplicity are the major considerations.




Background Theory

● The sound identification is based on its frequency content.

● Two steps:

➔ Training

➔ Recognition




Background Theory

● A MATLAB™ implementation was devised to assess the project feasibility.

● Two files were produced:

➔ train.m

➔ recogniz.m




Background Theory

● Training:

➔ Input several versions of a sound.

➔ Translate them to the frequency domain by using the FFT.

➔ Average their amplitude in the frequency domain.

● This produces the sound's fingerprint.
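The averaging step can be sketched in Python as follows. This is a minimal illustration and not the project's `train.m`: the function name `average_fingerprint` and the toy three-bin spectra are assumptions made for the example, and each input is taken to be an already-computed list of FFT amplitudes.

```python
def average_fingerprint(spectra):
    """Average several amplitude spectra bin by bin.

    `spectra` is a list of equal-length lists, one per recorded
    version of the sound (hypothetical layout, chosen for illustration).
    """
    if not spectra:
        raise ValueError("at least one spectrum is required")
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must have the same length")
    count = len(spectra)
    # Sum each frequency bin across recordings, then divide by the count.
    return [sum(s[k] for s in spectra) / count for k in range(length)]


# Three toy recordings with three frequency bins each.
versions = [[2.0, 4.0, 6.0], [4.0, 4.0, 0.0], [6.0, 4.0, 3.0]]
fingerprint = average_fingerprint(versions)  # [4.0, 4.0, 3.0]
```

Averaging over several versions should make the fingerprint less sensitive to the variations between individual utterances of the same word.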




Background Theory

● Note on the FFT:

➔ Only half of it is used.

➔ Five 1024-point FFTs are performed per sound sample.

$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-\frac{2\pi i}{N} n k}, \qquad k = 0, \ldots, N-1$$
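A likely reason that only half of the FFT is needed is that the sound samples are real-valued, which makes the spectrum conjugate-symmetric. The sketch below checks this on a small example by evaluating the DFT sum directly. The 8-point signal and the names `dft`, `mirrored`, and `half_amplitudes` are assumptions made for illustration; the project itself uses 1024 points.

```python
import cmath


def dft(x):
    """Direct evaluation of X_k = sum_n x_n * exp(-2*pi*i*n*k/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


# A small real-valued test signal (8 points instead of 1024, for brevity).
signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.0, 0.25]
spectrum = dft(signal)
n_points = len(signal)

# For real input, bin N-k is the complex conjugate of bin k, so the
# amplitudes of the upper half mirror those of the lower half.
mirrored = all(
    abs(spectrum[n_points - k] - spectrum[k].conjugate()) < 1e-9
    for k in range(1, n_points)
)
half_amplitudes = [abs(value) for value in spectrum[: n_points // 2]]
```

Here `mirrored` evaluates to `True`, so keeping `half_amplitudes` loses no amplitude information apart from the single middle bin.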




Background Theory

● User inputs .wav files.

● Decimate and quantize the input sound files.

● Sound acquisition parameters:

➔ Sound samples are quantized down to 8 bits.

➔ The sampling frequency is 5 kHz.

➔ Around one second (1.024s) of sound is stored.




Background Theory

● Sound detection:

➔ Compute the average of a window.

➔ Compare it to the average of the next window.

➔ If the difference is significant then the sound is assumed to start at that point.
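The detection steps can be sketched as below. Several details are assumptions, since the slides do not state them: the level of a window is taken to be the mean of the absolute sample values, the windows do not overlap, and the `threshold` value is purely illustrative. The window length of 1024 samples is the one used in the presentation.

```python
def find_sound_start(samples, window=1024, threshold=10.0):
    """Return the index where the sound is assumed to start, or None.

    Compares the mean absolute level of each window with that of the
    next window; `threshold` is an illustrative value, not the project's.
    """
    def level(start):
        chunk = samples[start:start + window]
        return sum(abs(v) for v in chunk) / window

    start = 0
    while start + 2 * window <= len(samples):
        if abs(level(start + window) - level(start)) > threshold:
            return start + window  # the second window holds the sound
        start += window
    return None


# Two silent windows followed by one window of a +/-50 square wave.
samples = [0] * 2048 + [50, -50] * 512
onset = find_sound_start(samples)  # 2048
```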




Background Theory

● Sound detection (cont'd):

w1 = w2 = 1024 samples = 0.2048 s

L = 5120 samples = 1.024 s




Background Theory

● Store detected sound stream into a vector.

● Apply FFT to the above vector's first 1024 points and put it in 's'.

● Store 's' as the first row in the matrix 'x' and repeat with the following 1024 points until there are five rows in 'x'.
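The framing procedure can be sketched in Python as follows. The names `s` and `x` follow the slide; everything else is an assumption: the recursive radix-2 `fft` is a stand-in for MATLAB's built-in FFT, and storing amplitudes (rather than complex values) in each row is a guess based on the training description.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrum_matrix(sound, frame_len=1024, frames=5):
    """Build the matrix 'x': one FFT amplitude row per frame."""
    if len(sound) < frame_len * frames:
        raise ValueError("sound vector is too short")
    x = []
    for row in range(frames):
        frame = sound[row * frame_len:(row + 1) * frame_len]
        s = [abs(value) for value in fft(frame)]  # the row 's'
        x.append(s)
    return x


# Constant input: all the energy of every frame lands in bin 0.
x = spectrum_matrix([1.0] * 5120)
```

With this constant test input every row of `x` has the value 1024 in its first bin and zero elsewhere, which is a quick sanity check of the framing.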




Background Theory

● Sound recognition:

➔ Compute the fingerprint of a sound.

➔ Compute the distance between the sound's fingerprint and the reference fingerprint.

➔ If both are close enough, then the sound is assumed to match the reference sound.




Background Theory

● Note on the distance computation:

➔ The sound's fingerprint and the reference fingerprint are considered as 1024-dimensional vectors.

➔ The distance between them is computed using the Euclidean distance formula:

$$D = \sqrt{\sum_{i=0}^{1023} (a_i - b_i)^2}$$
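The Euclidean distance between two fingerprints, and the match decision built on it, can be sketched as follows. The function names, the four-element toy vectors, and the threshold values are assumptions made for illustration; the presentation does not state what threshold the project uses.

```python
import math


def euclidean_distance(a, b):
    """Distance between two fingerprints treated as vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def matches(fingerprint, reference, threshold):
    """True when the two fingerprints are close enough."""
    return euclidean_distance(fingerprint, reference) <= threshold


reference = [3.0, 0.0, 4.0, 1.0]
candidate = [0.0, 0.0, 0.0, 1.0]
distance = euclidean_distance(candidate, reference)  # sqrt(9 + 16) = 5.0
```

The same functions apply unchanged to 1024-element fingerprints.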




System Overview




Hardware Implementation

● Design approach

● A/D Conversion

● Word detector

● FFT

● Memory Management

● Distance Computation




Design Approach

● Quartus II

➔ VHDL process blocks

➔ Computer-Aided Design

● Datapath/Overall Controller

● Intermediate controllers




A/D Conversion

Source: http://www.societyofrobots.com/images/analogdigital.jpg


A/D – Overall Configuration

[Figure: the FPGA and the Wolfson codec, linked by the MCLK, BCLK, LRCLK, and ADCDAT signals and by the I2C bus, with MASTER and SLAVE labels.]



A/D Conversion

● Internal signals set by bus.

➔ De-mute.

➔ Boost mic.

➔ Change path.


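[Figure: codec signal path with LINEIN and MICIN inputs, MUTE and MUTEMIC controls, a MUX selected by INSEL, A/D and D/A converters, digital filters, the ADCDAT output, and LINEOUT.]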



I2C Bus

● RADDR → Base address = 0011010

● R/W → Read/Write = 0

● B[15-9] → Control Address = 0000100

● B[8-0] → Control Data = 000001101


Source: Wolfson WM8731 data sheets, p.43
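The four fields listed on this slide can be packed into the bytes that go on the bus, as in the sketch below. It assumes the usual arrangement for this kind of two-wire interface: one byte carrying the 7-bit device address and the R/W bit, followed by the 16-bit word B[15-0] sent most significant byte first. The function name `build_i2c_frame` is hypothetical, and the byte layout should be checked against the data sheet.

```python
def build_i2c_frame(device_addr, rw_bit, control_addr, control_data):
    """Pack the fields into the three bytes sent on the bus.

    Byte 0: 7-bit device address followed by the R/W bit.
    Bytes 1-2: 7-bit control address (B[15-9]) followed by the
    9-bit control data (B[8-0]), most significant byte first.
    """
    if not 0 <= device_addr < (1 << 7):
        raise ValueError("device address must fit in 7 bits")
    if rw_bit not in (0, 1):
        raise ValueError("R/W bit must be 0 or 1")
    if not 0 <= control_addr < (1 << 7):
        raise ValueError("control address must fit in 7 bits")
    if not 0 <= control_data < (1 << 9):
        raise ValueError("control data must fit in 9 bits")
    word = (control_addr << 9) | control_data
    return bytes([(device_addr << 1) | rw_bit, word >> 8, word & 0xFF])


# Values taken from the slide.
frame = build_i2c_frame(0b0011010, 0, 0b0000100, 0b000001101)
```

For the slide's values the three bytes are 0x34, 0x08, and 0x0D.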



I2C Bus

● B[8-0] → Control Data = 000001101

[Figure: the control data bits, with the 'INSEL', 'MUTE MIC', and 'MIC BOOST' bits labelled.]



I2C Bus – ACK Signal

● ACK signal goes from the Wolfson to the FPGA

➔ Opposite direction from rest of data

➔ Only one data line








Solution...







Tri-state buffer!

[Figure: LPM_BUSTRI block (instance inst) with ports data[], enabledt, enabletr, result[], and tridata[].]



A/D – ADCDAT Fetcher

● Clock module

● MSB available on 2nd rising BCLK edge





Quantization

● Codec output: two's complement

● Quantize 24 bits into 8.

Decimal number   Binary (2's comp.)   Quantized decimal   Quantized binary (2's comp.)
      3                011                    1                     01
      2                010                    1                     01
      1                001                    0                     00
      0                000                    0                     00
     -1                111                   -1                     11
     -2                110                   -1                     11
     -3                101                   -2                     10
     -4                100                   -2                     10
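The 3-bit to 2-bit example on this slide is consistent with simply dropping the least significant bits of the two's complement value. The sketch below assumes that is what the quantizer does and applies the same idea to the 24-bit to 8-bit case; the function name `quantize` is hypothetical.

```python
def quantize(value, in_bits=24, out_bits=8):
    """Keep the `out_bits` most significant bits of a signed sample.

    Python's >> on negative integers is an arithmetic shift, so the
    sign is preserved and the result rounds toward minus infinity.
    """
    lo, hi = -(1 << (in_bits - 1)), (1 << (in_bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError("value does not fit in the input width")
    return value >> (in_bits - out_bits)


# The 3-bit to 2-bit case from the slide: 3..-4 map to 1, 1, 0, 0, -1, -1, -2, -2.
table = {v: quantize(v, in_bits=3, out_bits=2) for v in range(3, -5, -1)}
```

In hardware this amounts to wiring only the upper eight bits of the codec output to the next stage.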



Downsampler

● Implementation

➔ Flip-flop

➔ Counters (and FSM)


[Figure: Downsampler block with DATA_IN @ 48 kHz, DATA_OUT @ 5 kHz, and a READY signal.]
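Because 48 kHz / 5 kHz = 9.6 is not an integer, a plain divide-by-N counter cannot produce the output rate exactly. The slides do not say how the counters handle this, so the sketch below shows one possible counter-based scheme as an assumption: an accumulator that keeps, on average, one sample in every 9.6. The function name `downsample` is hypothetical.

```python
def downsample(samples, rate_in=48_000, rate_out=5_000):
    """Keep roughly one input sample in every rate_in / rate_out.

    A counter accumulates rate_out per input sample and emits the
    current sample each time it reaches rate_in. No anti-aliasing
    filter is applied in this sketch.
    """
    if not 0 < rate_out <= rate_in:
        raise ValueError("rate_out must be positive and at most rate_in")
    kept = []
    counter = 0
    for sample in samples:
        counter += rate_out
        if counter >= rate_in:
            counter -= rate_in
            kept.append(sample)  # hypothetically, where READY would pulse
    return kept


one_second = downsample(range(48_000))  # 5000 samples are kept
first_outputs = downsample(range(96))
```

The spacing between kept samples alternates between 9 and 10 input samples, which averages out to 9.6.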



Word Detector

● Detects sharp transitions.


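[Figure: word detector datapath. DATA_IN feeds an Average block whose results are held in Register 1 and Register 2; an Absolute Difference block and a Comparator against THRESHOLD produce SOUND_STARTS. Bus widths of 8 and 9 bits are marked.]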



Fast Fourier Transform

● Altera IP MegaCore® 1024-point FFT module:

➔ Natural order streaming data input.

➔ Bit-reversed streaming data output.

➔ Low latency.

➔ Time Limited Version.


[Figure: FFT block (instance inst1). Inputs: clk, reset_n, inverse, sink_valid, sink_sop, sink_eop, sink_real[7..0], sink_imag[7..0], sink_error[1..0], source_ready. Outputs: sink_ready, source_error[1..0], source_sop, source_eop, source_valid, source_exp[5..0], source_real[7..0], source_imag[7..0].]



Memory Management

● Three memory modules:

➔ Flash (4 MB)

➔ SDRAM (8 MB)

➔ SRAM (512 kB)




Memory Management

● 512 kB SRAM memory module

[Figure: SRAM chip with a 16-bit Data I/O bus, an 18-bit Address bus, and the Chip Enable, Write Enable, Output Enable, High Byte Mask, and Low Byte Mask signals.]




Memory Management

● Memory structure:

[Figure: the SRAM as 2^18 blocks of 16 bits, each block holding two 8-bit values, so the 8-bit locations are numbered in pairs (0 1, 2 3, 4 5, 6 7, ...) for a total of 524 288.]
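The relation between the 8-bit locations and the 16-bit blocks can be sketched as an address split. This assumes that a 19-bit byte address maps to an 18-bit block address plus a one-bit byte select; which half of the block counts as the high or low byte is not stated here, and the function name `split_byte_address` is hypothetical.

```python
def split_byte_address(byte_addr):
    """Map a 19-bit byte address to (18-bit word address, byte select).

    Byte select 0 or 1 picks one half of the 16-bit word; which half
    is the "high" or "low" byte is an assumption of this sketch.
    """
    if not 0 <= byte_addr < (1 << 19):
        raise ValueError("address must fit in 19 bits")
    return byte_addr >> 1, byte_addr & 1


total_bytes = (1 << 18) * 2          # 2^18 words of two bytes each
last = split_byte_address(total_bytes - 1)  # (262143, 1)
```

The total of 524 288 bytes is exactly the 512 kB of the SRAM module.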




Memory Management

● Memory Controller:

[Figure: Memory Controller block. User side: ADDR (19 bits), DATA_IN (8 bits), DATA_OUT (8 bits), MODE, ENABLE. SRAM side: Address (18 bits), Data I/O (16 bits), Chip Enable, Write Enable, Output Enable, High Byte Mask, Low Byte Mask.]




Memory Management

● Batch Operations:

[Figure: Memory Batch Operator block. Labelled signals: START_ADDR (19), END_ADDR (19), DATA_IN (8), DATA_OUT (8), ADDR (19), MODE, ENABLE, CLK, DATA_READY, MEM_MODE, MEM_ENABLE.]
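
The signal names suggest a block that walks an address range and drives the memory controller once per step. The sketch below models that behaviour in software. The function names are hypothetical, and treating END_ADDR as inclusive and moving one byte per clock tick are assumptions not stated on the slide.

```python
def batch_read(read_byte, start_addr, end_addr):
    """Yield (addr, data) for every address from start_addr to end_addr.

    read_byte stands in for the memory controller's read path.
    Assumption: end_addr is inclusive, one byte per clock tick.
    """
    for addr in range(start_addr, end_addr + 1):
        # In hardware, DATA_READY would presumably flag each new byte here.
        yield addr, read_byte(addr)


def batch_write(write_byte, start_addr, end_addr, data):
    """Write successive bytes of data to start_addr..end_addr (inclusive)."""
    for addr, value in zip(range(start_addr, end_addr + 1), data):
        write_byte(addr, value & 0xFF)   # keep values within 8 bits
```

For a quick check, a plain dictionary can stand in for the memory: pass `mem.__setitem__` as `write_byte` and `mem.__getitem__` as `read_byte`.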




Distance Computation

● The distance computation module:

[Figure: Distance block. Signals: A (8), B (8), RST, ENABLE, CLK, DISTANCE (8).]




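Distance Computation

● The distance computation module (cont'd):

[Figure: internal structure. A (8) and B (8) feed a Square Difference stage, followed by an Accumulator (RST, CLK) and a Square Root stage that produces DISTANCE.]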
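
The three stages in the figure (square difference, accumulator, square root) match a Euclidean distance between two streams of 8-bit values. The sketch below follows that pipeline in software. The function name is hypothetical. Flooring the square root and saturating the result to 8 bits are assumptions, prompted only by the 8-bit DISTANCE port.

```python
from math import isqrt


def distance(a_samples, b_samples):
    """Euclidean distance between two equal-length streams of 8-bit values."""
    acc = 0                                # RST would clear the accumulator
    for a, b in zip(a_samples, b_samples):
        diff = (a & 0xFF) - (b & 0xFF)     # inputs A and B are 8 bits wide
        acc += diff * diff                 # square difference, then accumulate
    return min(isqrt(acc), 0xFF)           # floor sqrt, saturated (assumption)
```

For example, `distance([3, 0], [0, 4])` accumulates 9 + 16 = 25 and returns 5.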



Demonstration


[Figure: annotated board photo. Callouts: Sound Detection, I2C Done Signal, Threshold Settings, Assign Threshold, Send I2C Configuration, Current Average.]

Original image source: http://users.ece.gatech.edu/~hamblen/DE2/DE2.jpg



Final Remarks

● Deficiencies.

● Strengths.

● Potential Improvements.




Deficiencies

● Lack of accuracy.

● Lack of observability.

● Requires complex hardware

➔ FFT (Nios II)




Strengths

● Fast.

● Trainable.

● The system is not limited to speech.




Potential Improvements

● Recognize several words

● Improve accuracy

● Variable-length words

● Recognize sentences

➔ Requires a hidden Markov model (HMM) (very complex!)

