Interim Report Dana - University of...

The Edward S. Rogers Sr. Dept of Electrical and Computer Engineering University of Toronto

ECE496Y Design Project Course - Interim Report

Title: Virtual Theremin using IEEE1394

Project I.D. # 2002105

Prepared by: Jeremy Gillard – [email protected]

Supervisor: Prof. James MacLean

Section #: 5

Section Coordinator:

Phil Anderson

Date: Friday, January 10, 2003

Virtual Theremin Design Project Jeremy Gillard

Design Project – Interim Report ECE496

Design Project #1682002

Virtual Theremin using IEEE1394

Prepared by: Jeremy Gillard

[email protected]

Supervisor: Prof. James MacLean

Date: Friday January 10th 2003

Executive Summary

The project our design group has undertaken is the design and implementation of a

Virtual Theremin musical instrument that uses an i386-based architecture computer,

making use of vision tracking techniques. The musical performance of a Theremin

performer is tracked in real-time using an IEEE1394 based camera.

The main goals of the project are: to detect the performer’s hands in real-time, to apply

algorithms that will allow us to predict when and where the performer’s hands are, and to

emulate the sound a physical theremin instrument would make based on the different

hand positions. The project’s software will be developed in the C programming language

in a Linux environment.

The project is progressing as scheduled, with all milestones reached on time except

Dan’s, which is delayed by a week. Dan is creating a prototype tracking

algorithm for following the movement of skin tones with the camera based on a given set

of skin samples. The delay is due to changes in image formatting, specifically relating to

a change in color space. Nick has acquired image streams from the camera in the Linux

environment and I have completed a program for demonstrating audio functionality in the

Linux environment using interactive keyboard commands and audio output similar to that

of the theremin instrument.

The current problems I am encountering involve audio output channels for the soundcard.

Currently in the audio demo, sound is output from only one speaker. I would like this

to be changed so that sound is played from both speakers. This requires changing the way

sound data is given to the audio device.


Table of Contents

Section 1: Introduction
    1.1 Background
    1.2 Rationale
    1.3 Report Focus
    1.4 Literature Review
        1.4.1 DFS’s C Page 2001-2002
        1.4.2 A Pthreads Tutorial
        1.4.3 Debian
        1.4.4 Sine Wave Modulation Synthesis for Programmers
Section 2: Program Review
    2.1 Accomplishments at Present
    2.2 Sound Support in Debian Linux
        2.2.1 Downloading and Installing Debian Linux
        2.2.2 Obtaining and Configuring the Proper Debian Linux Kernel
        2.2.3 Testing the Debian Linux Installation
    2.3 Linux Audio Streams
    2.4 Linux Audio Output Functions
    2.5 Linux Audio Output Interactive Test Program
    2.6 Future Milestones to be Met
Section 3: Changes to Program
Section 4: Revised Timetable
References
Appendix A: Original Timetable


Section 1: Introduction

The purpose of our project is to create a Virtual Theremin musical instrument using

computer vision techniques under a Linux-based environment. The project is broken

down into three main components: image acquisition, image processing and tracking,

and sound generation. Each of these components will be developed as a module for the

main program. I am responsible for the sound generation module of the project while Dan

and Nick are working on hand detection, tracking and image processing.

1.1 Background

The Theremin is a musical instrument based on the theory of beat frequencies. When you

play a note that is slightly out of tune relative to a reference note, there is a

recognizable pulse until both notes are brought to the same frequency. As these notes get

further apart in frequency, more beats/sec are generated. The Theremin uses the

difference between two pitches created by oscillators to produce a sound which falls in

the auditory range. This sound is amplified giving us the unique sounds produced by the

Theremin [1]. The Theremin is played by altering the capacitance of antennae which

affect the oscillators and thus the pitch. One antenna controls the volume with another

controlling the pitch.
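As a rough illustration of the beat-frequency idea described above, the audible pitch can be sketched as the absolute difference between the two oscillator frequencies. The function name and the oscillator values in the usage note are hypothetical and are not taken from [1] or from any particular instrument.

```c
/* Audible (beat) frequency in Hz produced by mixing two oscillators:
 * the absolute difference of their frequencies. */
double beat_frequency_hz(double fixed_osc_hz, double variable_osc_hz)
{
    double diff = fixed_osc_hz - variable_osc_hz;
    return diff < 0.0 ? -diff : diff;
}
```

For example, assumed oscillators at 170000 Hz and 169560 Hz would give a 440 Hz tone. Moving the hand changes the variable oscillator and therefore the difference.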

1.2 Rationale

In the past when performing with a Theremin instrument, the performer had to move his

or her hands in and out of several electro-magnetic fields in order to create music. The

positions of the performer’s hands, with reference to the Theremin, generate music. Since


our design group would like to explore the various possibilities of real-time object

tracking in images, we decided that tracking the hands of a Theremin performer would be

an excellent way to begin.

Our Virtual Theremin will function by tracking a performer’s hands in real time and then,

depending on the locations of the hands, create the sound a physical Theremin would

produce. Our Virtual Theremin will not use physical antennae. Instead we will

track the performer’s hands using a FireWire (IEEE1394) video camera and treat the hands as if

they were actually causing changes in the true Theremin.

Our project will make use of several computer vision techniques used for tracking, digital

signal processing and sound emulation. We will create a programmable Theremin device

that will be able to emit the sounds of a physical Theremin.

1.3 Report Focus

This report will explain my progress on developing the sound generation component for

the Virtual Theremin project. Details from setting up the Linux software environment, to

developing a demonstration program for sound output under Linux using the OSS API

will be explained while making reference to problems encountered and overcome.


1.4 Literature Review

1.4.1 DFS’s C Page 2001-2002

The DFS’s C page webpage was created by DF Stermole [2]. It contains basic

information about Linux systems, as well as information about programming in C. The

pertinent information from this source comes in the form of a sample C program to allow

a user to obtain input from the keyboard using a single key press under a Linux

environment.

1.4.2 A Pthreads Tutorial

The A Pthreads Tutorial was created by Andrae Muys [3]. The website contains

information on how to create multi-threaded applications under a Unix-type environment.

Programming concepts are explained by providing short, easy-to-understand sample

programs. Topics covered include benefits of concurrency, creating threads, mutexes and

synchronization and examples of classical concurrency problems.

1.4.3 Debian

The Debian website contains information about the Debian operating system [4].

Documentation on the setup of the operating system is available, as well as information

on packaging and usage of the system. It is an invaluable resource for learning about

getting started with your Debian Linux system.


1.4.4 Sine Wave Modulation Synthesis for Programmers

The Sine Wave Modulation Synthesis for Programmers was created by Ian Miller [5]. It

is a valuable resource for determining how sine waves can be used to synthesize sounds.

Equations for creating an appropriate sine wave at a particular frequency based on

the sampling rate are presented, as well as more advanced information on sine wave

manipulation.

Section 2: Program Review

2.1 Accomplishments at present

I am responsible for programming the sound generation module of the Virtual Theremin

project. At the time of this report, I have completed all of the milestones prescribed in [6]

up to this date. Specifically, I have managed to get the soundcard to work under a stable

Linux environment. I have determined how to input and output audio streams using the

OSS API via a C program for the sound card. I have developed my own set of audio

output functions that create sounds similar to that of a Theremin instrument and I am

working on developing and testing an interactive program in C to demonstrate audio

output functionality based on user keyboard input.

2.2 Sound support in Debian Linux

The first milestone that I had to reach from [6] involved getting the soundcard to work

under Debian Linux. This milestone ended up having multiple parts: downloading and

installing a Debian Linux system, getting an up-to-date kernel that contained both USB

and sound support, configuring the kernel correctly so that USB support and sound

support were enabled, and testing the installation to make sure everything was

functioning correctly.

2.2.1 Downloading and installing Debian Linux

I decided that to avoid unneeded problems relating to the soundcard on the design

computer system, I would first install the Debian Linux operating system and get it

running correctly on my own home computer. Then I would know the exact steps needed

to install Debian Linux on the design system, including soundcard setup, with as

few problems as possible. This information was relayed to Nick Dargus who installed

Debian Linux on the design system. This would also give me a working environment for

programming the sound system for the Virtual Theremin without having to work

exclusively on the design system.

The first step was to create a Linux partition on my home

computer where I could install the new operating system without having to worry about

interfering with my current system setup. Since my computer contains two hard-drives, I

designated the second drive to contain the Linux partition. This was particularly effective

as I could boot off of the second drive without having to install a boot manager on the

first drive to select between the Linux and Windows operating systems.

Next, I needed to install the operating system. Dan DeAraujo told me that Debian Linux

allows you to do a network install from the University of Toronto mirror site. This means

that a CD is not required for the installation. This appealed to me because I am on the


university network through my residence network. This meant the download would be

fast.

There are multiple versions of the Debian Linux distributions that you can install and by

reading through [4] I determined that I would need the Debian GNU/Linux 3.0 install.

This installation was chosen because it contained the needed drivers for my network

adapter to allow for the installation via the network, and was said to be stable.

The installation proceeded without difficulty. An interesting aspect of the Debian Linux

distribution is that it allows for easy addition of software to the OS by using packages. By

installing a package, the software is configured automatically for your Debian Linux

setup. This is a particularly useful feature. Once the basics of Debian had been installed,

the setup asked me if I would like to install some basics to the OS that are commonly

used. This was great as it allowed me to install some needed features such as the GCC C

compiler and X-windows, which provides a graphical display similar to that of a Windows

system.

Once the OS had been installed, I encountered some problems immediately. Firstly, my

mouse is a USB mouse, and USB support was either not installed or not functioning,

because I had no control over the mouse. Also, either sound support was missing or

the soundcard was not functioning. This was determined by attempting to run a

mixer program with no success. I decided that I had better read up some more on Linux

installations. From [4] I determined that an updated kernel was needed, as the one that


had been installed was older and did not contain the proper USB support and sound

support.

2.2.2 Obtaining and configuring the proper Debian Linux kernel

From reading through [4], I had determined that I could use the program DSELECT to

obtain the new kernel. Since Linux is an open source development, the kernel source files

are given freely. The user is required to configure the files correctly using a provided

setup program, and then to compile the kernel for installation.

I decided to grab the newest kernel version provided assuming that it would function

correctly and have both the required USB support and sound support. This ended up

being a mistake as I will explain later. I configured the kernel using the provided setup

program using my knowledge of basic computer systems to choose the correct options for

my required setup. Following the directions I had to manually compile the source files

and set up the system to recognize and use the new kernel. LILO, the Linux boot manager

on my second hard drive also had to be modified to recognize the new kernel setup. This

way of updating the kernel was inefficient, but I found out a more efficient way as I will

explain later.

At first the new kernel seemed to work perfectly. My mouse was functioning correctly,

and I was confident that the audio drivers would work as well. A problem now occurred

when I tried to access the internet. I had not had any problems with this previously in the

older installation but now I could not connect to the internet. There was a problem with

my network device.


I tried many things to see if I could correct the problem and scoured the internet to

find people with similar problems. I could not find anything on the subject, so I thought I

must have done something wrong in the installation. I proceeded to completely reinstall

the Linux OS. But once again, when the new kernel was installed the same problem with

my network device not functioning correctly occurred. Having access to the internet was

a necessity for my programming environment, as it is a great information resource.

Having had to do multiple kernel installations, I decided to try and find an easier way to

set up the new kernel. Sure enough by looking at [4] I found that Debian provided a

utility for creating packages. This way I could create a package of the configured kernel

and it would install automatically without me having to worry about it.

My problems with installing a new kernel were solved once I finally decided that I should

try an older and maybe more stable Linux kernel and hope that it had the functionality I

required. Once this older kernel version was installed and configured, I checked my

internet to see if the network device was functioning correctly and it was. I made a note

of this information to relay to Nick Dargus so that he would not have these problems

when installing Debian Linux on the design system.


2.2.3 Testing the Debian Linux installation

Testing the new installation to see if all the needed functionality was present did not

prove too difficult. Using the Debian packaging utility DSELECT, I got a mixer utility

and a sound player to see if I could get audio to output from the system.

The mixer was used to make sure that the volume output levels were set correctly in the

soundcard. The fact that no errors were present when the mixer loaded was a good

indication that the soundcard was functioning correctly. I proceeded to play an audio file

and was glad to hear the tune coming out of the speakers indicating the soundcard was

indeed installed correctly.

The installed Debian Linux X-windows desktop


2.3 Linux audio streams

The second milestone that I had to reach from [6] involved getting audio streams to

function correctly in a C program. An audio stream is either an input, like recording

sound using a microphone, or an output such as playing an audio file.

At the time [6] was written, it had not been decided as to what method of audio output

would be implemented. The methods to be implemented were either audio output by

discrete time samples, or by making use of MIDI. I decided to go with using the discrete

time samples for its efficiency, and because to emulate a virtual Theremin, I can make

use of a sine wave [5] which will be explained later.

It had been determined previously in [6] that a good way to accomplish these goals would

be to make use of the OSS API. The OSS API is an open source audio API that comes

standard with the Linux distribution [7]. When I first attempted to create a C file using

the OSS API, the file I created would not compile. GCC could not find the intended header

file for the OSS API. This was distressing but after searching through the include

directories in my Linux installation I was able to successfully find the intended header

file and manually set the location of this file in the C source file. This allowed the C

source program to compile without problems.

In this basic step of trying to understand how the audio driver functions, I attempted to

create an extremely basic function that would just output some sounds from the speakers.

By reading through [7] I determined that it is beneficial to have a buffer for feeding the


audio device. In Linux, devices are set up as files and to access them is the same as to

access a file. So if I wanted to output to the audio device, I would need to write the audio

buffer to the appropriate file located in the /dev directory in Linux. Following [7] I

created a basic program that opened the appropriate device with default settings, wrote

some data to the device and then exited. This process ended up being successful with

some noise coming from the speakers when executed.
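The open, write and exit sequence described above can be sketched roughly as follows. This is a minimal sketch rather than the program from the report: the function name is hypothetical, and /dev/dsp is assumed to be the device file, since the report only says that it lives in the /dev directory.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens device_path for writing, writes len bytes from buf, then closes
 * the device. Returns the number of bytes written, or -1 on error. */
ssize_t play_raw_buffer(const char *device_path,
                        const unsigned char *buf, size_t len)
{
    int fd = open(device_path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < len) {
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t n = write(fd, buf + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    close(fd);
    return (ssize_t)done;
}
```

A call such as play_raw_buffer("/dev/dsp", samples, sizeof samples) would send the buffer to the sound card using whatever default settings the driver applies, which is why arbitrary data is heard as noise.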

The problems that occurred at this stage had to do with understanding what data needed

to be written to the device for coherent predictable audio and what settings for the device

needed to be set to be able to accomplish having coherent predictable audio.

2.4 Linux audio output functions

The third milestone that I had to reach from [6] involved creating audio output functions

that will give coherent predictable audio that is similar to that of a Theremin.

From [8] I knew that the sounds emitted by the Theremin closely resemble those of a

basic sine wave. I decided that this would be where I would start. I would need to put

appropriate samples into my audio buffer to write to the audio device.

It was at this time that I decided that it would be a good idea to follow [7] to change the

default audio output settings of the soundcard to something more appropriate. This way I

would know the proper size of each sample, as well as the sampling rate. Sampling is

converting an analogue signal into digital form by taking discrete samples at specific


points in time of the wave form. The sampling rate is the number of samples taken of the

waveform per second [9]. I decided to keep the basic 8-bit sample size but to alter the

sampling rate to be 44100 samples per second.
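One way the device settings could be changed is through the OSS ioctl requests declared in <sys/soundcard.h>. The sketch below assumes the requests SNDCTL_DSP_SETFMT, SNDCTL_DSP_CHANNELS and SNDCTL_DSP_SPEED are the ones used; the report does not list them, and the function name and the channel count parameter are hypothetical.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Opens an OSS playback device and requests unsigned 8-bit samples at the
 * given rate. Returns the open file descriptor, or -1 if the device cannot
 * be opened or refuses the sample format or channel count. */
int open_dsp_u8(const char *device_path, int channels, int sample_rate)
{
    int fd = open(device_path, O_WRONLY);
    if (fd < 0)
        return -1;

    int format = AFMT_U8;   /* unsigned 8-bit samples */
    int ch = channels;
    int rate = sample_rate;

    /* The driver writes back the value it actually selected. Format and
     * channel count are checked against the request; the rate may be
     * rounded by the driver and is accepted as returned. */
    if (ioctl(fd, SNDCTL_DSP_SETFMT, &format) == -1 || format != AFMT_U8 ||
        ioctl(fd, SNDCTL_DSP_CHANNELS, &ch) == -1 || ch != channels ||
        ioctl(fd, SNDCTL_DSP_SPEED, &rate) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the settings from the report, a call might look like open_dsp_u8("/dev/dsp", 1, 44100), where the device path and the single channel are assumptions.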

With the aid of [5], I created a function that would write a sine wave of a specified

frequency and amplitude to the audio buffer. This basic task was fraught with problems.

If a constant frequency sine wave was written to the audio device, a constant tone should

have been emanating from the speaker. Instead, I was getting a single tone that was

choppy instead of constant. After thinking about this problem I determined that the wave

was not ending at the end of a cycle. To counteract this problem I used the sampling rate

to determine where 1 cycle of the wave at a specific frequency completed, and wrote this

to the audio buffer instead. When the audio buffer became full, its data was then written

to the soundcard but there was another problem. The sound was still choppy. I

determined that this must be a problem with the individual sound samples and the

amplitude of those samples. At first I had just set the amplitude of the sine wave to 127

and neglected the fact that sine waves cycle between negative and positive. The sine

wave that I was using was cycling between 127 and -127 giving the incorrect output. To

correct this problem, an offset of 128 was needed so that my sine wave would cycle

between 0 and 255. This corrected the problem and I now had a constant tone emanating

from the speaker at my specified frequency.

Sampling of a sine wave [9].

As a test I wrote a small C program that would cycle up and down through frequency

changes to see if a constant tonal change would occur. Sure enough, the program

demonstrated that this approach to producing a tone similar to that of a Theremin would

be successful.

2.5 Linux audio output interactive test program

The last milestone that needed to be reached from [6] was to create an audio output

interactive demonstration test program. I wanted a program that would play a constant

tone from the soundcard, but would allow the user, via the keyboard, to alter both the

frequency and volume of the tone being generated. This presented me with two large

problems.

The first problem was that I needed to get characters from the keyboard. C provides a

getchar function, but the problem with this is that you need to hit the enter key

after each input. I did not want this functionality. I wanted the program to react with just

a single key press. I had no idea how to fix this problem, so I proceeded to search around

the internet to see if other users have had this same problem. I found numerous references

to this problem, with one solution being to change keyboard input so that automatic

keyboard buffering was disabled [2]. I used the functions from [2] to accomplish this task

and to turn off echoing of characters to STDOUT. Another function that was given would

return the keyboard input buffer to its previous state. The problem of single key presses

for input was solved.
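A common way to get this behaviour on Linux is the POSIX termios interface. The sketch below assumes that interface; the report does not say which calls the functions from [2] use, and the function names here are hypothetical.

```c
#include <termios.h>
#include <unistd.h>

static struct termios saved_mode;   /* settings to restore on exit */

/* Turns off line buffering and echo on the terminal behind fd.
 * Returns 0 on success, -1 if fd is not a terminal or the call fails. */
int keyboard_raw_on(int fd)
{
    if (tcgetattr(fd, &saved_mode) == -1)
        return -1;

    struct termios raw = saved_mode;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    raw.c_cc[VMIN] = 1;    /* read() returns after a single byte */
    raw.c_cc[VTIME] = 0;   /* no read timeout */
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Restores the settings saved by keyboard_raw_on(). */
int keyboard_raw_off(int fd)
{
    return tcsetattr(fd, TCSANOW, &saved_mode);
}

/* Reads one key press; returns the character, or -1 on error or EOF. */
int read_key(int fd)
{
    unsigned char c;
    return read(fd, &c, 1) == 1 ? (int)c : -1;
}
```

A program would typically call keyboard_raw_on(STDIN_FILENO) at start-up, loop on read_key(STDIN_FILENO), and call keyboard_raw_off(STDIN_FILENO) before exiting so the shell is left in its normal mode.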

The next problem in developing the audio output demonstration program was that I

needed to continually fill up the audio buffer and write the data to the soundcard. This

would not be possible if the program was waiting for input from the keyboard. I decided

that the way to get around this problem was to use threads, specifically the Pthreads

library. This was a good choice as I had some experience using threads from my

operating systems course. I could create a thread that would grab a frequency value from

a global variable and use this value for the sine wave that would be written to the audio

buffer. In the main program I would just need to update the global variable with a new

frequency when a specified key was pressed. This worked well.
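The arrangement described above, with one thread generating audio from a shared frequency value while the main program updates it, might be sketched like this. The names are hypothetical, and the mutex around the shared variable is an addition for safety that the report does not mention; the report describes a plain global variable.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* State shared between the keyboard (main) thread and the audio thread. */
static pthread_mutex_t tone_lock = PTHREAD_MUTEX_INITIALIZER;
static double tone_freq_hz = 440.0;   /* hypothetical starting pitch */
static int tone_running = 1;

void tone_set_frequency(double freq_hz)
{
    pthread_mutex_lock(&tone_lock);
    tone_freq_hz = freq_hz;
    pthread_mutex_unlock(&tone_lock);
}

double tone_get_frequency(void)
{
    pthread_mutex_lock(&tone_lock);
    double freq_hz = tone_freq_hz;
    pthread_mutex_unlock(&tone_lock);
    return freq_hz;
}

void tone_stop(void)
{
    pthread_mutex_lock(&tone_lock);
    tone_running = 0;
    pthread_mutex_unlock(&tone_lock);
}

/* Audio thread: takes a snapshot of the shared state on every pass. */
void *audio_thread(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&tone_lock);
        int running = tone_running;
        double freq_hz = tone_freq_hz;
        pthread_mutex_unlock(&tone_lock);
        if (!running)
            break;
        /* A real program would fill a buffer with a sine wave at freq_hz
         * and write it to the sound card here; that blocking write would
         * pace the loop. sched_yield() only stands in for it. */
        (void)freq_hz;
        sched_yield();
    }
    return NULL;
}
```

The main program would start the thread with pthread_create, call tone_set_frequency whenever a key is pressed, and finish with tone_stop followed by pthread_join.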

The next problem was that a Theremin instrument can also alter the volume of the tone. I

also needed to change the amplitude of the sine wave. This was easily accomplished by

creating a global variable for the amplitude and allowing a user to change this value

between 0 and 127. This would effectively change the volume of the output sine wave

being generated by the thread. So I now had a completed demonstration program that

would allow you to change the frequency and volume coming from the speaker. This is

precisely what the Theremin allows you to do.
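The keyboard handling for pitch and volume could look roughly like the sketch below. The key bindings, step sizes and lower frequency limit are hypothetical, since the report does not give them; only the amplitude range of 0 to 127 comes from the text.

```c
/* Tone parameters adjusted from the keyboard. */
struct tone_params {
    double freq_hz;
    int amplitude;   /* 0..127, as in the report */
};

#define FREQ_STEP_HZ 10.0   /* assumed step per key press */
#define FREQ_MIN_HZ  20.0   /* assumed lower limit */
#define AMP_STEP     8      /* assumed step per key press */
#define AMP_MAX      127

/* Applies one key press to the tone parameters.
 * Returns 1 if the key was recognised, 0 otherwise. */
int apply_key(struct tone_params *p, int key)
{
    switch (key) {
    case 'w': p->freq_hz += FREQ_STEP_HZ; break;   /* pitch up */
    case 's': p->freq_hz -= FREQ_STEP_HZ; break;   /* pitch down */
    case 'e': p->amplitude += AMP_STEP;   break;   /* volume up */
    case 'd': p->amplitude -= AMP_STEP;   break;   /* volume down */
    default:  return 0;
    }

    /* Keep both values inside their allowed ranges. */
    if (p->freq_hz < FREQ_MIN_HZ)
        p->freq_hz = FREQ_MIN_HZ;
    if (p->amplitude < 0)
        p->amplitude = 0;
    if (p->amplitude > AMP_MAX)
        p->amplitude = AMP_MAX;
    return 1;
}
```

Clamping the amplitude to 127 matters because, with the offset of 128, any larger value would push samples outside the unsigned 8-bit range.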


2.6 Future milestones to be met

The future milestones that need to be met are to create a simulation, based on given hand

position data, to determine what tones and amplitude the system should output. Also, I

need to create a basic user interface to tie each design group member’s modules together

and allow editing of certain parameters of the system. This will allow ease of use of the

Virtual Theremin system.

Section 3: Changes to program

Since our Virtual Theremin project divides up nicely into sections that are independent

from one another, I have not needed to alter my milestones for any problems encountered

by other group members. Currently I am on schedule and do not anticipate having to

make many changes to stay on schedule. An updated list of the group milestones has been

included in the next section. For changes to other group members’

milestones, please consult either Nick Dargus’s or Dan DeAraujo’s interim report.


Section 4: Revised Timetable

Description                                                         Estimated Completion Date  Responsibility

Proposal Draft                                                      27-Sep-02                  Group
Initial setup of Linux and Pyro webcam on main machine              05-Oct-02                  Nick
Get sound card working under Debian Linux                           18-Oct-02                  Jer
Proposal Final                                                      18-Oct-02                  Group
Use Octave for basic simulations                                    18-Oct-02                  Dan
Acquire test images under Linux                                     20-Oct-02                  Nick
Determine how to input and output audio streams                     08-Nov-02                  Jer
Prototype tracking algorithm testing                                15-Nov-02                  Dan
Development of audio output functions in C                          20-Nov-02                  Jer
Acquire full stream of video feed under Linux                       01-Dec-02                  Nick
Interim Report                                                      10-Jan-03                  Group
Completion of Prototype tracking algorithm                          17-Jan-03                  Dan
Development and testing of interactive audio output functions in C  20-Jan-03                  Jer
Documentation                                                       01-Feb-03                  Nick
Theremin Simulation algorithm for hand locations                    01-Feb-03                  Jer
Implement Basic algorithm Optimizations                             05-Feb-03                  Dan
Final tracking algorithm testing                                    15-Feb-03                  Dan & Nick
User Interface                                                      22-Feb-03                  Jer
Completion of Final tracking algorithm                              22-Feb-03                  Dan
Final Project Testing                                               02-Mar-03                  Group

Please see Appendix A for the original timetable. A graphical representation of this timetable can be found on the next page.


Graphical Schedule (Gantt Chart)


References

[1] Tranter, Jeff. (2002, Sept. 26). The Linux Sound HOWTO. [Online]. Available: http://www.tldp.org/HOWTO/Sound-HOWTO/index.html

[2] Stermole, DF. (2003, Jan. 4). DFS's C Page 2001-2002. [Online]. Available: http://www.macdonald.egate.net/CompSci/index.html

[3] Muys, Andrae. (2003, Jan. 5). A Pthreads Tutorial. [Online]. Available: http://www.cs.nmsu.edu/~jcook/Tools/pthreads/pthreads.html

[4] (2002, Sept. 22). Debian. [Online]. Available: http://www.debian.org

[5] Wilson, Ian. (2003, Jan. 5). Sine Wave Modulation Synthesis for Programmers. [Online]. Available: http://www.geocities.com/SiliconValley/Campus/8645/synth.html

[6] Dargus, DeAraujo & Gillard. Virtual Theremin using IEEE1394 - Technical Proposal. Toronto: University of Toronto, 2002.

[7] Tranter, Jeff. (2002, Sept. 26). Open Sound System Programmer’s Guide. (1.11) [Online]. Available: http://www.4front-tech.com/pguide/oss.pdf

[8] Sexton, Robert. (2002, Sept. 22). Take a Look at Theremins. [Online]. Available: http://www.ccsi.com/~bobs/theremin.html

[9] Hawksley, John. (2003, Jan. 5). Sampling. [Online]. Available: http://www.armory.com/~greebo/sampling.html


Appendix A: Original Timetable

Description                                                         Estimated Completion Date  Responsibility

Proposal Draft                                                      27-Sep-02                  Group
Initial setup of Linux and Pyro webcam on main machine              05-Oct-02                  Nick
Get sound card working under Debian Linux                           18-Oct-02                  Jer
Proposal Final                                                      18-Oct-02                  Group
Use Octave for basic simulations                                    18-Oct-02                  Dan
Acquire test images under Linux                                     20-Oct-02                  Nick
Design RGB to HSI module                                            04-Nov-02                  Nick
Determine how to input and output audio streams                     08-Nov-02                  Jer
Prototype tracking algorithm testing                                15-Nov-02                  Dan
Test RGB2HSI module                                                 16-Nov-02                  Nick
Development of audio output functions in C                          20-Nov-02                  Jer
Acquire full stream of video feed under Linux                       01-Dec-02                  Nick
Completion of Prototype tracking algorithm                          01-Dec-02                  Dan
Interim Report                                                      10-Jan-03                  Group
Test RGB2HSI module against video feed                              20-Jan-03                  Nick
Development and testing of interactive audio output functions in C  20-Jan-03                  Jer
Implement Basic algorithm Optimizations                             31-Jan-03                  Dan
Documentation                                                       01-Feb-03                  Nick
Theremin Simulation algorithm for hand locations                    01-Feb-03                  Jer
Final tracking algorithm testing                                    15-Feb-03                  Dan
User Interface                                                      22-Feb-03                  Jer
Completion of Final tracking algorithm                              22-Feb-03                  Dan
Final Project Testing                                               02-Mar-03                  Group

A graphical representation of this timetable can be found on the next page.


Graphical Schedule (Gantt Chart)
