Silent sound technology final report


Voice Conversion for Silent Sound Technology

Dept. of Electronics and Communication, HIT Nidasoshi

CHAPTER 1

INTRODUCTION

Silent Sound Technology is a solution for people who have lost their voice but wish to speak over a mobile phone. It transmits information without using the vocal cords: it detects every lip movement, internally converts the resulting electrical pulses into a sound signal, and sends that signal while ignoring all surrounding noise. The person at the other end of the call therefore receives the information as audio. The technology was developed at the Karlsruhe Institute of Technology (KIT), Germany, and is expected to become available in the next 5-10 years. Once launched, it is expected to have a considerable effect and to be widely used.

Silent speech technology enables speech communication to take place when an audible acoustic signal is unavailable. By acquiring sensor data from elements of the human speech production process (the articulators, their neural pathways, or the brain itself), it produces a digital representation of speech that can be synthesized directly, interpreted as data, or routed into a communication network.

Silent Sound Technology is a technology for mobile phones that also helps you communicate in noisy places, and it helps to reduce noise pollution to a great extent. Its uses are immense for people who are vocally challenged or have been rendered mute by an accident. In 2010, at CeBIT’s “Future Park”, a Silent Sound Technology concept was demonstrated that aims to notice every movement of the lips and transform it into sound. The technology will be helpful when a person loses their voice, when making silent calls without disturbing others, and when telling a PIN to a trusted friend or relative without anyone else overhearing. At the other end, the listener hears a clear voice.

Another important benefit is that the output can be translated into a language of the user’s choice. This translation works for languages such as English, French, and German, but it is difficult for languages such as Chinese, in which different tones can carry different meanings.

CHAPTER 2

NEED FOR SILENT SOUND TECHNOLOGY

Silent Sound Technology will put an end to embarrassing situations such as:

Talking on a cell phone in a crowd, where you are actually “not talking” because of all the disturbance and noise around you.

Receiving an urgent call during a meeting and apologetically rushing out of the room to answer it or to call the person back.

Humans are capable of producing and understanding whispered speech in quiet environments at remarkably low signal levels. Most people can understand a few unspoken words by lip reading. The idea of interpreting silent speech electronically or with a computer has been around for a long time, and was popularized in the 1968 Stanley Kubrick science fiction film “2001: A Space Odyssey”. A major focal point was the DARPA Advanced Speech Encoding Program (ASE) of the early 2000s, which funded research on low bit rate speech synthesis “with acceptable intelligibility, quality and aural speaker recognizability in acoustically harsh environments”.

Fig. 2.1 People talking in the same place without disturbing one another

CHAPTER 3

METHODS OF SILENT SOUND TECHNOLOGY

Silent Sound Technology can be implemented by two methods:

1. Electromyography

2. Image Processing (Ultrasound SSI)

3.1 ELECTROMYOGRAPHY:

Silent Sound Technology uses electromyography to monitor the tiny muscular movements that occur when we speak.

The monitored movements are converted into electrical signals with the help of transducers, and these signals can be turned into speech without a sound being uttered.

Electromyography [EMG] is a technique for evaluating and recording the electrical activity produced by skeletal muscles.

Electromyography is performed using an instrument called an “electromyograph”, which produces a record called an “electromyogram”.

An electromyograph detects the electrical potential generated by muscle cells when these cells are electrically or neurologically activated.

Electromyography sensors attached to the face record the electrical signals produced by the facial muscles, and these signals are compared with prerecorded signal patterns of spoken words.

When there is a match, the corresponding sound is transmitted to the other end of the line, and the person at the other end hears the spoken words.
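The comparison against prerecorded patterns described above can be illustrated with a small sketch. The report does not say which matching method is used, so normalized correlation is assumed here purely for illustration; the function names, the threshold of 0.8, and the template values are all hypothetical and are not real EMG data.

```python
import math


def normalize(signal):
    """Return a zero-mean, unit-length copy of the signal."""
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0:
        return centered  # a flat signal carries no pattern to compare
    return [x / norm for x in centered]


def similarity(a, b):
    """Normalized correlation of two equal-length signals (between -1 and 1)."""
    na, nb = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(na, nb))


def match_word(signal, templates, threshold=0.8):
    """Return the best-matching word, or None if no template is close enough."""
    best_word, best_score = None, threshold
    for word, template in templates.items():
        score = similarity(signal, template)
        if score >= best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical prerecorded patterns (made-up numbers for illustration only).
TEMPLATES = {
    "yes": [0.0, 0.4, 0.9, 0.4, 0.0, -0.2],
    "no": [0.0, -0.5, -0.8, 0.3, 0.7, 0.1],
}

measured = [0.05, 0.35, 0.95, 0.45, -0.05, -0.15]
word = match_word(measured, TEMPLATES)
```

Returning None below the threshold models the case in which no match is found and nothing is transmitted.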

3.1.1 ELECTROMYOGRAPHY [EMG] PROCEDURE:

There are two types of electromyography (EMG):

1. Surface Electromyography: It uses four different kinds of transducers:

Pressure sensor

Motion sensor

Vibration sensor

Electromagnetic sensor

2. Intramuscular Electromyography: A needle electrode, or a needle containing two fine-wire electrodes, is inserted through the skin into the muscle tissue. The insertion activity provides valuable information about the state of the muscle and its innervating nerve.

Fig. 3.1.1 Electromyography sensors attached to the face and body

3.1.2 Basic Mechanism of Electromyography Signal Generation:

The basic mechanism of electromyography is that muscle activity is analyzed by the electromyograph to generate an electromyogram.

The surface electromyography [EMG] sensors (pressure, vibration, motion, and electromagnetic sensors) are attached to the surface of the face and body.

In intramuscular electromyography [EMG], a needle electrode, or a needle containing two fine wires, is inserted through the skin into the muscle tissue. It provides valuable information about the state of the muscle and its innervating nerves.

Fig. 3.1.2 Electromyography [EMG] Signal Generation

When a muscle moves, it generates an electrical pulse, and transducers convert these pulses into electrical signals.

The electrical source is the muscle membrane potential of about -90 mV. Measured EMG potentials range from less than 50 µV up to 20 to 30 mV.

The electrical signals are amplified using a differential amplifier, because it has a common-mode rejection ratio [CMRR] of about 120 dB and so eliminates the noise content of the electrical signal.

The output of the differential amplifier is given to the EMG device, which consists of an analog-to-digital converter, a prerecorded database of spoken words, and an electrical-signal-to-sound converter.

The analog electrical signal is converted into a digital signal by sampling, quantizing, and encoding.

The digital signal is then compared with prerecorded signal patterns of spoken words.

When there is a match, the corresponding sound is transmitted to the other end of the line, and the person at the other end hears the spoken words.
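Two of the steps above involve simple arithmetic that a short sketch may make concrete: converting the 120 dB CMRR figure into a plain ratio, and quantizing and encoding already-sampled voltages. The bit depth, voltage range, and sample values below are assumptions chosen for illustration; the report does not specify them.

```python
def cmrr_ratio(cmrr_db):
    """Convert a CMRR in decibels into the voltage ratio of
    differential gain to common-mode gain."""
    return 10 ** (cmrr_db / 20)


def quantize(samples, bits, v_min, v_max):
    """Map sampled voltages onto integer codes for the given bit depth."""
    levels = 2 ** bits
    step = (v_max - v_min) / levels
    codes = []
    for v in samples:
        clipped = min(max(v, v_min), v_max)   # keep the input inside the range
        code = int((clipped - v_min) / step)  # index of the quantization step
        codes.append(min(code, levels - 1))   # the top of the range maps to the last code
    return codes


def encode(codes, bits):
    """Encode each integer code as a fixed-width binary word."""
    return [format(c, f"0{bits}b") for c in codes]


# 120 dB corresponds to a ratio of one million.
ratio = cmrr_ratio(120)

# Hypothetical amplified samples in volts, digitized by an assumed 3-bit converter.
samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
codes = quantize(samples, bits=3, v_min=-1.0, v_max=1.0)
words = encode(codes, bits=3)
```

A ratio of one million means that interference common to both amplifier inputs is amplified a million times less than the difference between them, which is why the differential stage suppresses noise while keeping the microvolt-level EMG signal.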

3.2 IMAGE PROCESSING (ULTRASOUND SSI)

Another way to obtain direct information on the vocal tract configuration is through imaging techniques. Ultrasound imaging is a non-invasive and clinically safe procedure that makes possible the real-time visualization of one of the most important articulators of the speech production system: the tongue.

ULTRASOUND

Ultrasound is an oscillating sound pressure wave with a frequency greater than the upper limit of the human hearing range. Ultrasound is thus not separated from ‘normal’ (audible) sound by any difference in physical properties, only by the fact that humans cannot hear it. Although this limit varies from person to person, it is approximately 20 kHz in healthy adults. Ultrasound devices operate at frequencies from 20 kHz up to several gigahertz. The range of sound signals is shown in Fig. 3.2.

Fig. 3.2 Range of Sound signals

3.2.1 SILENT SOUND INTERFACE (SSI) PROCEDURE:

The Silent Sound Interface consists of an ultrasound transducer (probe), a high-resolution optical camera, a lip reader, and a silent vocoder, as shown in Fig. 3.2.1.

The ultrasound transducer (probe) is placed beneath the chin, where it provides a partial view of the tongue surface.

Fig. 3.2.1 Silent Sound Interface using Image Processing (SSI)

The ultrasound device is coupled with a standard optical camera, as shown in Fig. 3.2.1, to capture tongue and lip movements. Ultrasound is chosen because it is non-invasive, clinically safe, and offers good resolution.

The captured images of lip and tongue movement are given to the lip reader.

The lip reader detects the lip and tongue movements by comparing previously stored images of spoken words with the present images of lip and tongue movement.

When the images of lip and tongue movement match, it generates a visual speech signal.

The generated visual speech signals are given to the silent vocoder. It consists of an HMM-based visuo-phonetic decoder, an audio-visual unit selection stage, concatenation of the selected units, and HNM-based prosodic adaptation.

The silent vocoder converts the visual speech signals into human spoken words (speech).
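The HMM-based decoding step mentioned above can be illustrated with a toy example. The report does not describe the decoder's internals, so this sketch assumes the standard Viterbi algorithm for finding the most probable hidden-state sequence; the two phoneme states, the two lip-shape classes, and every probability below are hypothetical values invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] holds (probability of the best path ending in s at time t,
    #                   state that preceded s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Start from the most probable final state and follow the back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy model: hidden phonemes "p" and "a", observed lip shapes.
STATES = ["p", "a"]
START = {"p": 0.6, "a": 0.4}
TRANS = {"p": {"p": 0.3, "a": 0.7}, "a": {"p": 0.4, "a": 0.6}}
EMIT = {
    "p": {"closed": 0.9, "open": 0.1},
    "a": {"closed": 0.2, "open": 0.8},
}

frames = ["closed", "open", "open"]
phonemes = viterbi(frames, STATES, START, TRANS, EMIT)
```

In this toy model a closed-lip frame followed by two open-lip frames is most plausibly explained by the phoneme "p" followed by "a" held over two frames. A real decoder would work on feature vectors extracted from the ultrasound and camera images rather than on two hand-labelled classes.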

CHAPTER 4

ADVANTAGES & DISADVANTAGES

4.1 ADVANTAGES

It helps people who have lost their voice due to illness or accident.

It transmits information without using the vocal cords, which helps people suffering from aphasia (a speech disorder).

We can make silent calls even when standing in a crowded place.

It allows people to make silent calls without bothering others.

It is well suited to noise cancellation, because surrounding noise is not transmitted.

It is very useful for sharing confidential information, such as a secret PIN, over a cell phone in a public place.

4.2 DISADVANTAGES

It will be like always talking to a robot: the listener cannot distinguish between speakers or perceive emotions.

From a security point of view, recognizing who you are talking to becomes complicated.

The technology will be very costly for the average user.

The technology works for languages such as English, French, and German, but languages like Chinese are difficult because different tones can carry different meanings.

CHAPTER 5

APPLICATIONS

Silent Sound Technology can be applied in the military for communicating secret or confidential matters.

As there is no medium for sound to travel through in space, this technology can be put to good use by astronauts.

It helps people who have lost their voice due to illness or accident, and so can be put to good use by people who cannot speak.

Since the electrical signals are universal, they can be translated into any language. A native speaker can have the message translated before it is sent to the other side. It can therefore be converted into a language of choice; the languages currently supported are German, English, and French.

CHAPTER 6

RESEARCH & FUTURE PROSPECTS

Silent Sound Technology gives a bright future to speech recognition technology: everything from simple voice commands to memoranda dictated over the phone becomes feasible in noisy public places.

In electromyography, instead of electrodes and sensors hanging all around the face, these electrodes and sensors will be incorporated into cell phones.

In image processing (ultrasound SSI), features such as lip reading based on image recognition and image processing will also be incorporated into cell phones rather than relying on computers.

Nanotechnology will be a notable step towards making the device handy.

CONCLUSION

Silent Sound Technology is one of the recent trends in the field of information technology and implements “TALKING WITHOUT TALKING”.

It will be an innovative and useful technology, and in the near future it will be used in our day-to-day life.

Silent Sound Technology aims to notice every movement of the lips and transform it into sound, which could help people who have lost their voice due to illness or accident, and allow people to make silent calls without bothering others, rather than making any sound.
