Phoning Home

Phoning Home: Controlling your household appliances through conversation. Sean Powers, Florida Institute of Technology. ECE 5525 Final, Dr. Veton Kepuska. Date: 07 December 2010.


Transcript of Phoning Home

Page 1: Phoning Home

Phoning Home

Sean Powers
Florida Institute of Technology

ECE 5525 Final: Dr. Veton Kepuska

Date: 07 December 2010

Controlling your household appliances through conversation.

Page 2: Phoning Home

Agenda

• Problem Statement
• How it works
• System Architecture
• Future Works
• Demonstration
• Questions

Page 3: Phoning Home

Problem Statement

Imagine you leave your house in a hurry because you are running late to work, have to pick up your kids from school, or for any other reason, and you forget to turn off the stove. You remember three blocks away and do not have time to turn around. You dial a phone number assigned to your house and ask your house to turn the stove off for you. Your house confirms the stove will be turned off, and you are relieved that you won’t come home to a potential fire. This is the essence of the Phoning Home system.

Page 4: Phoning Home

How it works

• Asterisk Server (VOIP)
• Phoning Home Services (WCF)
• Speech Recognition (CMU Sphinx)
• Client Services (WCF + MCU)

Phoning Home is broken into four main components: the Voice over IP (VOIP) server, which forwards the phone speech and uses text to speech (TTS) to speak to the user; the Phoning Home web services, which handle the communication for all Phoning Home households; the speech recognition server, which is responsible for recognizing the user’s phone speech; and the client services, which handle an individual household’s devices.
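The hand-off between the four components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not show the real interfaces, so `Household`, `handle_call`, the `recognize` callback, and the three-word command format are all hypothetical names and simplifications, and the actual system uses WCF services rather than in-process calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Household:
    """Hypothetical stand-in for one household's client services."""
    devices: Dict[str, bool] = field(default_factory=dict)

    def apply(self, device: str, turn_on: bool) -> str:
        # Record the new state and return the sentence to be spoken back.
        self.devices[device] = turn_on
        return f"{device} turned {'on' if turn_on else 'off'}"


def handle_call(audio: bytes, house_id: str,
                recognize: Callable[[bytes], str],
                households: Dict[str, Household]) -> str:
    """Hypothetical web-service entry point called by the VOIP server."""
    text = recognize(audio)              # speech recognition server
    words = text.split()                 # assumed form: "turn <device> on|off"
    device, state = " ".join(words[1:-1]), words[-1]
    household = households[house_id]     # route to the caller's household
    return household.apply(device, state == "on")  # VOIP server speaks this via TTS


households = {"house-1": Household()}
reply = handle_call(b"", "house-1", lambda audio: "turn stove off", households)
print(reply)
```

Passing the recognizer in as a callback keeps the sketch independent of any particular speech engine.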

Page 5: Phoning Home

Answering incoming calls

Asterisk Open Source Communication Project
Turns a computer into a voice communications server
Includes features:

• Ability to answer incoming calls
• Ability to generate outgoing calls
• Ability to play and generate tones
• Ability to integrate with web services
• Ability to record calls
• Ability to provide call details such as caller identification

Page 6: Phoning Home

Phone Server Application

Page 7: Phoning Home

Speech Recognition Application

Carnegie Mellon University (CMU) Sphinx-4 Speech Recognition Engine

The Sphinx-4 framework consists of three primary modules: the FrontEnd, the Decoder, and the Linguist. The FrontEnd takes input signals and parameterizes them into a sequence of features. The Linguist translates any type of standard language model along with information from the Dictionary and structural information from one or more sets of Acoustic models into a SearchGraph. The Decoder uses the features from the FrontEnd, and the SearchGraph from the Linguist to perform the actual decoding and produce the Results.

Page 8: Phoning Home

Sphinx 4

Page 9: Phoning Home

A Wireless Appliance Reducing Energy (AWARE)

Machine B

• Each device that can be controlled via the Phoning Home system is known as A Wireless Appliance Reducing Energy (AWARE). Every AWARE device is controlled wirelessly by the Phoning Home Master Control, an Atmel ATmega16 microcontroller connected via USB to the Client Services.

Page 10: Phoning Home

Phoning Home System Overview

Page 11: Phoning Home

Sequence Diagram

Page 12: Phoning Home

Issues

One issue I had to overcome when developing the demonstration was recognizing telephone speech. There are significant differences between microphone and telephone speech. From the Sphinx documentation:

“The issue with telephone audio is that it has limited range of frequencies. Unlike usual microphone recording that includes frequencies from 1 Hz to 8000 kHz, telephone audio is passed through frequency filters. As a result telephone audio contains frequencies from 200 Hz to 3500 Hz. That makes it impossible to recognize telephone audio with usual microphone acoustic model. You need to use specialized models to recognize it.”

I ended up using the 8 kHz VoxForge acoustic model.
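A short calculation suggests why an 8 kHz model suits telephone audio, assuming "8 kHz" refers to the sampling rate of the model's training audio. A signal sampled at a given rate can only represent frequencies up to half that rate (the Nyquist frequency), so 8 kHz sampling covers content up to 4 kHz, which includes the 200 Hz to 3500 Hz telephone band quoted above. The function names here are illustrative only.

```python
# Telephone pass band as quoted from the Sphinx documentation, in Hz.
TELEPHONE_BAND_HZ = (200, 3500)


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


def covers_band(sample_rate_hz: int, band=TELEPHONE_BAND_HZ) -> bool:
    """True if the sampling rate can represent the top of the band."""
    return band[1] <= nyquist_hz(sample_rate_hz)


print(nyquist_hz(8000), covers_band(8000))
```

A model trained on wider-band microphone audio expects energy above 4 kHz that the telephone channel never delivers, which is consistent with the mismatch the documentation describes.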

Page 13: Phoning Home

Future Works

Easy collection and storage of multiple utterances, which can be used to improve acoustic models

• Asterisk Server is capable of simultaneously handling multiple calls

• Server stores and catalogs utterances automatically

Database

Page 14: Phoning Home

Future Works

Although Phoning Home is designed to allow you to call your house to control your appliances, it would be very useful to combine it with Dr. Kepuska’s Wake-up-Word (WuW) technology so that you can control your appliances from inside the house as well.

In a commercial product, Phoning Home would need extensive security measures to confirm that the person calling the house actually has appropriate credentials to control its appliances. This could be as simple as a password or as complex as speaker recognition to confirm that the caller is a permitted user.
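The simple password option could look roughly like the following Python sketch. It is an assumption, not part of the presented system: the caller is imagined to key in a numeric PIN, the function names `hash_pin` and `verify_pin` are hypothetical, and the iteration count is a placeholder. Only a salted hash of the PIN is stored, and the comparison uses the standard library's constant-time `hmac.compare_digest`.

```python
import hashlib
import hmac
import os


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Derive a salted hash so the PIN itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, 100_000)


def verify_pin(entered: str, salt: bytes, stored_hash: bytes) -> bool:
    """Compare in constant time to avoid leaking how many bytes matched."""
    return hmac.compare_digest(hash_pin(entered, salt), stored_hash)


# Enrollment (hypothetical): done once when the household sets its PIN.
salt = os.urandom(16)
stored = hash_pin("4821", salt)

# At call time: check the digits the caller entered.
print(verify_pin("4821", salt, stored), verify_pin("0000", salt, stored))
```

A short numeric PIN is weak on its own, so a real deployment would presumably also limit the number of attempts per call.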

Page 15: Phoning Home

Demonstration

Virtual Phone Number: 1 (321) 710-5090

#JSGF V1.0;

/**
 * JSGF Digits Grammar for Phoning Home
 */

grammar digits;

public <command> = <polite> <startAction> room [number] <numbers> <devices> <endAction>;

<polite> = [please | kindly | could you | oh mighty computer | operator];

<startAction> = (turn | switch);

<devices> = (lights | lamps);

<endAction> = (on | off);

<numbers> = (oh | zero | one | two | three | four | five | six | seven | eight | nine);
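To illustrate which utterances the grammar above accepts, the following Python sketch mirrors its rules with a regular expression and extracts the room number, device, and action from a recognized sentence. It is not how Sphinx-4 processes JSGF; the function `parse_command` and the mapping of "oh" to 0 are assumptions made for this example.

```python
import re

# Word lists copied from the grammar rules.
POLITE = ["please", "kindly", "could you", "oh mighty computer", "operator"]
START_ACTION = ["turn", "switch"]
DEVICES = ["lights", "lamps"]
END_ACTION = ["on", "off"]
NUMBERS = {"oh": 0, "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
           "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}


def alt(words):
    """Build a non-capturing alternation such as (?:turn|switch)."""
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"


# <polite> <startAction> room [number] <numbers> <devices> <endAction>
COMMAND = re.compile(
    rf"^(?:{alt(POLITE)} )?"
    rf"{alt(START_ACTION)} room (?:number )?"
    rf"(?P<number>{alt(NUMBERS)}) "
    rf"(?P<device>{alt(DEVICES)}) "
    rf"(?P<action>{alt(END_ACTION)})$"
)


def parse_command(utterance: str):
    """Return (room, device, action), or None if the grammar rejects it."""
    normalized = " ".join(utterance.lower().split())
    match = COMMAND.match(normalized)
    if match is None:
        return None
    return NUMBERS[match["number"]], match["device"], match["action"]


print(parse_command("Please turn room number three lights on"))
print(parse_command("turn the stove off"))
```

The second call returns `None` because the demonstration grammar only covers lights and lamps in numbered rooms, so the stove scenario from the problem statement would need additional rules.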

Page 16: Phoning Home

Questions