The Edward S. Rogers Sr. Dept of Electrical and Computer Engineering University of Toronto
ECE496Y Design Project Course - Interim Report
Title: Virtual Theremin using IEEE1394
Project I.D. # 2002105
Prepared by: Jeremy Gillard – [email protected]
Supervisor: Prof. James MacLean
Section #: 5
Section Coordinator: Phil Anderson
Date: Friday, January 10, 2003
Virtual Theremin Design Project Jeremy Gillard
Design Project – Interim Report ECE496
Design Project #1682002
Executive Summary
Our design group has undertaken the design and implementation of a Virtual Theremin
musical instrument that runs on an i386-based computer and uses vision tracking
techniques. The musical performance of a Theremin performer is tracked in real time
using an IEEE1394-based camera.
The main goals of the project are: to detect the performer’s hands in real-time, to apply
algorithms that will allow us to predict when and where the performer’s hands are, and to
emulate the sound a physical theremin instrument would make based on the different
hand positions. The project’s software will be developed in the C programming language
in a Linux environment.
The project is progressing as scheduled: all milestones have been reached, with the
exception of Dan’s, which is delayed by a week. Dan is creating a prototype tracking
algorithm for following the movement of skin tones with the camera based on a given set
of skin samples. The delay is due to changes in image formatting, specifically relating to
a change in color space. Nick has acquired image streams from the camera in the Linux
environment and I have completed a program for demonstrating audio functionality in the
Linux environment using interactive keyboard commands and audio output similar to that
of the theremin instrument.
The problem I am currently encountering involves the audio output channels of the
soundcard. In the audio demo, sound is output from only one speaker. I would like sound
to play from both speakers, which requires changing the way sound data is given to the
audio device.
Table of Contents
Section 1: Introduction
  1.1 Background
  1.2 Rationale
  1.3 Report Focus
  1.4 Literature Review
    1.4.1 DFS’s C Page 2001-2002
    1.4.2 A Pthreads Tutorial
    1.4.3 Debian
    1.4.4 Sine Wave Modulation Synthesis for Programmers
Section 2: Program Review
  2.1 Accomplishments at Present
  2.2 Sound Support in Debian Linux
    2.2.1 Downloading and Installing Debian Linux
    2.2.2 Obtaining and Configuring the Proper Debian Linux Kernel
    2.2.3 Testing the Debian Linux Installation
  2.3 Linux Audio Streams
  2.4 Linux Audio Output Functions
  2.5 Linux Audio Output Interactive Test Program
  2.6 Future Milestones to be Met
Section 3: Changes to Program
Section 4: Revised Timetable
References
Appendix A: Original Timetable
Section 1: Introduction
The purpose of our project is to create a Virtual Theremin musical instrument using
computer vision techniques under a Linux-based environment. The project is broken
down into three main components: image acquisition, image processing and tracking,
and sound generation. Each of these components will be developed as a module for the
main program. I am responsible for the sound generation module of the project while Dan
and Nick are working on hand detection, tracking and image processing.
1.1 Background
The Theremin is a musical instrument based on the theory of beat frequencies. When a
note is played slightly out of tune with a reference note, a recognizable pulse is heard
until both notes are brought to the same frequency. As the notes move further apart in
frequency, more beats per second are generated. The Theremin uses the difference between
two pitches created by oscillators to produce a sound which falls in the auditory range.
This sound is amplified, giving the unique sounds produced by the Theremin [1]. The
Theremin is played by altering the capacitance of its antennae, which affects the
oscillators. One antenna controls the volume and the other controls the pitch.
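The difference-frequency idea described above can be written out with a standard trigonometric identity. The symbols f_1 and f_2 (the two oscillator frequencies) and t (time) are introduced here for illustration only and do not come from [1].

```latex
\sin(2\pi f_1 t)\,\sin(2\pi f_2 t)
  = \tfrac{1}{2}\Bigl[\cos\bigl(2\pi (f_1 - f_2)\,t\bigr)
                    - \cos\bigl(2\pi (f_1 + f_2)\,t\bigr)\Bigr]
```

Mixing (multiplying) the two oscillator signals therefore yields one component at the difference frequency |f_1 - f_2| and one at the sum f_1 + f_2. As a purely hypothetical example, oscillators at 170 000 Hz and 169 560 Hz would give a difference of 440 Hz, which is audible, while the sum near 340 kHz is far above the auditory range.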
1.2 Rationale
In the past when performing with a Theremin instrument, the performer had to move his
or her hands in and out of several electro-magnetic fields in order to create music. The
positions of the performer’s hands, with reference to the Theremin, generate music. Since
our design group would like to explore the various possibilities of real-time object
tracking in images, we decided that tracking the hands of a Theremin performer would be
an excellent way to begin.
Our Virtual Theremin will function by tracking a performer’s hands in real time and then,
depending on the locations of the hands, create the sound a physical Theremin would
produce. Our Virtual Theremin will not use physical antennae. Instead, we will
track the performer’s hands using a FireWire (IEEE1394) based video camera and treat the
hands as if they were actually causing changes in a true Theremin.
Our project will make use of several computer vision techniques used for tracking, digital
signal processing and sound emulation. We will create a programmable Theremin device
that will be able to emit the sounds of a physical Theremin.
1.3 Report Focus
This report will explain my progress on developing the sound generation component for
the Virtual Theremin project. It covers everything from setting up the Linux software
environment to developing a demonstration program for sound output under Linux using the
OSS API, with reference to the problems encountered and overcome.
1.4 Literature Review
1.4.1 DFS’s C Page 2001-2002
The DFS’s C Page website was created by DF Stermole [2]. It contains basic
information about Linux systems, as well as information about programming in C. The
pertinent information from this source comes in the form of a sample C program to allow
a user to obtain input from the keyboard using a single key press under a Linux
environment.
1.4.2 A Pthreads Tutorial
A Pthreads Tutorial was created by Andrae Muys [3]. The website contains
information on how to create multi-threaded applications under a Unix-type environment.
Programming concepts are explained through short, easy-to-understand sample
programs. Topics covered include the benefits of concurrency, creating threads, mutexes
and synchronization, and examples of classical concurrency problems.
1.4.3 Debian
The Debian website contains information about the Debian operating system [4].
Documentation on the setup of the operating system is available, as well as information
on packaging and usage of the system. It is an invaluable resource for getting started
with a Debian Linux system.
1.4.4 Sine Wave Modulation Synthesis for Programmers
Sine Wave Modulation Synthesis for Programmers was created by Ian Miller [5]. It
is a valuable resource for determining how sine waves can be used to synthesize sounds.
Equations for creating a sine wave at a particular frequency for a given sampling rate
are presented, as well as more advanced information on sine wave manipulation.
Section 2: Program Review
2.1 Accomplishments at present
I am responsible for programming the sound generation module of the Virtual Theremin
project. At the time of this report, I have completed all of the milestones prescribed in [6]
up to this date. Specifically, I have managed to get the soundcard to work under a stable
Linux environment. I have determined how to input and output audio streams using the
OSS API via a C program for the sound card. I have developed my own set of audio
output functions that create sounds similar to that of a Theremin instrument, and I am
working on developing and testing an interactive program in C to demonstrate audio
output functionality based on user keyboard input.
2.2 Sound support in Debian Linux
The first milestone that I had to reach from [6] involved getting the soundcard to work
under Debian Linux. This milestone ended up having multiple parts: downloading and
installing a Debian Linux system, getting an up-to-date kernel that contained both USB
and sound support, configuring the kernel correctly so that USB support and sound
support were enabled, and testing the installation to make sure everything was
functioning correctly.
2.2.1 Downloading and installing Debian Linux
To avoid unnecessary soundcard problems on the design computer system, I decided to
first install Debian Linux and get it running correctly on my own home computer. I would
then know the exact steps needed to install Debian Linux on the design system, including
the soundcard setup, with as few problems as possible. This information was relayed to
Nick Dargus, who installed
Debian Linux on the design system. This would also give me a working environment for
programming the sound system for the Virtual Theremin without having to work
exclusively on the design system.
The first step was to create a Linux partition on my home
computer where I could install the new operating system without having to worry about
interfering with my current system setup. Since my computer contains two hard-drives, I
designated the second drive to contain the Linux partition. This was particularly effective
as I could boot off of the second drive without having to install a boot manager on the
first drive to select between the Linux and Windows operating systems.
Next, I needed to install the operating system. Dan DeAraujo told me that Debian Linux
allows you to do a network install from the University of Toronto mirror site. This means
that a CD is not required for the installation. This appealed to me because I am on the
university network through my residence network. This meant the download would be
fast.
There are multiple versions of the Debian Linux distribution that you can install, and by
reading through [4] I determined that I would need the Debian GNU/Linux 3.0 install.
This installation was chosen because it contained the needed drivers for my network
adapter to allow for the installation via the network, and was said to be stable.
The installation proceeded without problems. An interesting aspect of the Debian Linux
distribution is that it allows for easy addition of software to the OS by using packages. By
installing a package, the software is configured automatically for your Debian Linux
setup. This is a particularly useful feature. Once the basics of Debian had been installed,
the setup asked me if I would like to install some commonly used basics to the OS.
This was helpful, as it allowed me to install needed features such as the GCC C compiler
and X-windows, a graphical display similar to that of a Windows system.
Once the OS had been installed, I encountered some problems immediately. Firstly, my
mouse is a USB mouse, and USB support was either not installed or not functioning,
because I did not have any control with my mouse. In addition, either sound support was
missing or the soundcard was not functioning. This was determined by attempting to run a
mixer program with no success. I decided that I had better read up some more on Linux
installations. From [4] I determined that an updated kernel was needed, as the one that
had been installed was older and did not contain the proper USB support and sound
support.
2.2.2 Obtaining and configuring the proper Debian Linux kernel
From reading through [4], I had determined that I could use the program DSELECT to
obtain the new kernel. Since Linux is open source, the kernel source files
are given freely. The user is required to configure the files correctly using a provided
setup program, and then to compile the kernel for installation.
I decided to grab the newest kernel version provided assuming that it would function
correctly and have both the required USB support and sound support. This ended up
being a mistake as I will explain later. I configured the kernel using the provided setup
program using my knowledge of basic computer systems to choose the correct options for
my required setup. Following the directions, I had to manually compile the source files
and set up the system to recognize and use the new kernel. LILO, the Linux boot manager
on my second hard drive, also had to be modified to recognize the new kernel setup. This
way of updating the kernel was inefficient, but I found a more efficient way, as I will
explain later.
At first the new kernel seemed to work perfectly. My mouse was functioning correctly,
and I was confident that the audio drivers would work as well. A problem now occurred
when I tried to access the internet. I had not had any problems with this previously in the
older installation but now I could not connect to the internet. There was a problem with
my network device.
I tried many things to see if I could correct the problem and scoured the internet to
find people with similar problems. I could not find anything on the subject, so I thought I
must have done something wrong in the installation. I proceeded to completely reinstall
the Linux OS. But once again, when the new kernel was installed the same problem with
my network device not functioning correctly occurred. Having access to the internet was
a necessity for my programming environment, as it is a great information resource.
Having to do multiple kernel installations, I decided to try to find an easier way to
set up the new kernel. Sure enough by looking at [4] I found that Debian provided a
utility for creating packages. This way I could create a package of the configured kernel
and it would install automatically without me having to worry about it.
My problems with installing a new kernel were solved once I finally decided that I should
try an older and maybe more stable Linux kernel and hope that it had the functionality I
required. Once this older kernel version was installed and configured, I checked my
internet connection to see if the network device was functioning correctly, and it was. I made a note
of this information to relay to Nick Dargus so that he would not have these problems
when installing Debian Linux on the design system.
2.2.3 Testing the Debian Linux installation
Testing the new installation to see if all the needed functionality was present did not
prove too difficult. Using the Debian packaging utility DSELECT, I got a mixer utility
and a sound player to see if I could get audio to output from the system.
The mixer was used to make sure that the volume output levels were set correctly in the
soundcard. The fact that no errors were present when the mixer loaded was a good
indication that the soundcard was functioning correctly. I proceeded to play an audio file
and was glad to hear the tune coming out of the speakers indicating the soundcard was
indeed installed correctly.
Figure: The installed Debian Linux X-windows desktop.
2.3 Linux audio streams
The second milestone that I had to reach from [6] involved getting audio streams to
function correctly in a C program. An audio stream is either an input, like recording
sound using a microphone, or an output such as playing an audio file.
At the time [6] was written, the method of audio output had not been decided. The
candidate methods were output by discrete time samples or by MIDI. I decided to use
discrete time samples for their efficiency and because a Theremin can be emulated with
a sine wave [5], as explained later.
It had been determined previously in [6] that a good way to accomplish these goals would
be to make use of the OSS API. The OSS API is an open source audio API that comes
standard with the Linux distribution [7]. When I first attempted to create a C file using
the OSS API, the file I created would not compile. GCC could not find the intended header
file for the OSS API. This was distressing, but after searching through the include
directories in my Linux installation I was able to find the intended header
file and manually set the location of this file in the C source file. This allowed the C
source program to compile without problems.
In this basic step of trying to understand how the audio driver functions, I attempted to
create an extremely basic function that would just output some sounds from the speakers.
By reading through [7] I determined that it is beneficial to have a buffer for feeding the
audio device. In Linux, devices are set up as files and to access them is the same as to
access a file. So if I wanted to output to the audio device, I would need to write the audio
buffer to the appropriate file located in the /dev directory in Linux. Following [7] I
created a basic program that opened the appropriate device with default settings, wrote
some data to the device and then exited. This process ended up being successful with
some noise coming from the speakers when executed.
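The device-as-file idea described above can be sketched in C as follows. This is a minimal sketch and not the program from this report: the helper name `write_all` is hypothetical, and `/dev/dsp` in the usage note is the conventional OSS playback device, assumed here because the report does not name the file that was opened.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write a whole buffer to an open file descriptor.
 * write() may accept fewer bytes than requested, so loop until the
 * buffer is drained. Returns 0 on success, -1 on error. */
int write_all(int fd, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With an audio device, this would be used as `int fd = open("/dev/dsp", O_WRONLY);` followed by `write_all(fd, buffer, sizeof buffer);` and `close(fd);`. Because the helper only needs a file descriptor, the same code can be exercised against an ordinary file or a pipe.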
The problems that occurred at this stage had to do with understanding what data needed
to be written to the device for coherent, predictable audio, and what device settings
were needed to achieve it.
2.4 Linux audio output functions
The third milestone that I had to reach from [6] involved creating audio output functions
that will give coherent predictable audio that is similar to that of a Theremin.
From [8] I knew that the sounds emitted by the Theremin closely resemble those of a
basic sine wave. I decided that this would be where I would start. I would need to put
appropriate samples into my audio buffer to write to the audio device.
It was at this time that I decided that it would be a good idea to follow [7] to change the
default audio output settings of the soundcard to something more appropriate. This way I
would know the proper size of each sample, as well as the sampling rate. Sampling is
converting an analogue signal into digital form by taking discrete samples at specific
points in time of the wave form. The sampling rate is the number of samples taken of the
waveform per second [9]. I decided to keep the basic 8-bit sample size but to alter the
sampling rate to be 44100 Hz.
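Changing the device settings as described above might look like the following sketch. It is not the code from this report: the function name `configure_dsp` is hypothetical, and it assumes the OSS header is reachable as `<sys/soundcard.h>` (the report notes that the header location had to be set by hand on the author's system). The ioctl requests `SNDCTL_DSP_SETFMT`, `SNDCTL_DSP_CHANNELS` and `SNDCTL_DSP_SPEED` and the format constant `AFMT_U8` are the names that header defines.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>

/* Hypothetical helper: request unsigned 8-bit samples with the given
 * channel count and sampling rate on an already opened OSS device.
 * The driver may substitute the nearest supported value, so each
 * request is checked. On success *rate_hz holds the rate the driver
 * actually selected. Returns 0 on success, -1 on failure. */
int configure_dsp(int fd, int channels, int *rate_hz)
{
    int format = AFMT_U8;
    int ch = channels;

    assert(rate_hz != NULL);
    assert(channels == 1 || channels == 2);

    if (ioctl(fd, SNDCTL_DSP_SETFMT, &format) == -1 || format != AFMT_U8)
        return -1;
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &ch) == -1 || ch != channels)
        return -1;
    if (ioctl(fd, SNDCTL_DSP_SPEED, rate_hz) == -1)
        return -1;
    return 0;
}
```

A caller would set `int rate = 44100;` and call `configure_dsp(fd, 1, &rate)` after opening the device and before writing any samples. Passing 2 for the channel count requests stereo, in which case the samples written to the device are expected to alternate between the left and right channels.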
With the aid of [5], I created a function that would write a sine wave of a specified
frequency and amplitude to the audio buffer. This basic task was fraught with problems.
If a constant-frequency sine wave was written to the audio device, a constant tone should
have been emanating from the speaker. Instead, I was getting a single tone that was
choppy rather than constant. After thinking about this problem I determined that the wave
was not ending at the end of a cycle. To counteract this, I used the sampling rate to
determine where one cycle of the wave at a specific frequency completed (the sampling
rate divided by the frequency gives the number of samples per cycle), and wrote this
to the audio buffer instead. When the audio buffer became full, its data was written
to the soundcard, but the sound was still choppy. I determined that this must be a
problem with the individual sound samples and the amplitude of those samples. At first I
had just set the amplitude of the sine wave to 127 and neglected the fact that sine waves
cycle between negative and positive values. The sine wave that I was using was cycling
between -127 and 127, giving incorrect output because each sample is stored as a value
from 0 to 255. To correct this, an offset of 128 was needed so that my sine wave would
stay within the range of 0 to 255. This corrected the problem and I now had a constant
tone emanating from the speaker at my specified frequency.
Figure: Sampling of a sine wave [9].
As a test I wrote a small C program that would cycle up and down through frequency
changes to see if a constant tonal change would occur. Sure enough, the program
demonstrated that this approach to producing a tone similar to that of a Theremin would
be successful.
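A sweep of the kind used in that test could be driven by a helper like the one below. This is only a sketch: the function name `sweep_frequency`, the triangular shape of the sweep and the frequency limits in the usage note are assumptions, since the report does not give the details of the test program.

```c
#include <assert.h>

/* Hypothetical helper: frequency for step `tick` of a triangular sweep
 * that rises from lo_hz to hi_hz over `steps` ticks, falls back to
 * lo_hz over the next `steps` ticks, and then repeats. */
double sweep_frequency(unsigned tick, double lo_hz, double hi_hz,
                       unsigned steps)
{
    unsigned period, pos;

    assert(steps > 0 && hi_hz >= lo_hz);

    period = 2 * steps;
    pos = tick % period;
    if (pos > steps)
        pos = period - pos;     /* falling half of the sweep */
    return lo_hz + (hi_hz - lo_hz) * pos / steps;
}
```

Calling this once per audio buffer with an increasing `tick`, for example `sweep_frequency(tick, 200.0, 1000.0, 8)`, gives a tone that glides up and then back down.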
2.5 Linux audio output interactive test program
The last milestone that needed to be reached from [6] was to create an audio output
interactive demonstration test program. I wanted a program that would play a constant
tone from the soundcard, but would allow the user, via the keyboard, to alter both the
frequency and volume of the tone being generated. This presented me with two large
problems.
The first problem was that I needed to get characters from the keyboard. C provides the
getchar function, but the problem with it is that the user needs to hit the enter key
after each input. I did not want this behaviour. I wanted the program to react to just
a single key press. I had no idea how to fix this problem, so I proceeded to search
the internet to see if other users had had the same problem. I found numerous references
to this problem, with one solution being to change keyboard input so that automatic
keyboard buffering was disabled [2]. I used the functions from [2] to accomplish this task
and to turn off echoing of characters to stdout. Another function given in [2] would
return the keyboard input buffer to its previous state. The problem of single key presses
for input was solved.
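One common way to obtain single key presses on Linux is through the POSIX termios interface, sketched below. These are not the functions from [2], which are not reproduced in this report; the names `keys_unbuffered` and `keys_restore` are hypothetical, and the sketch assumes that disabling canonical mode and echo is the kind of change [2] makes.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: switch the terminal on fd to unbuffered,
 * non-echoing input, saving the previous settings in *saved.
 * Returns 0 on success, -1 if fd is not a terminal or a call fails. */
int keys_unbuffered(int fd, struct termios *saved)
{
    struct termios raw;

    assert(saved != NULL);
    if (tcgetattr(fd, saved) == -1)
        return -1;

    raw = *saved;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO); /* no line buffering, no echo */
    raw.c_cc[VMIN] = 1;     /* read() returns after a single byte */
    raw.c_cc[VTIME] = 0;    /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: restore the settings saved by keys_unbuffered(). */
int keys_restore(int fd, const struct termios *saved)
{
    assert(saved != NULL);
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would call `keys_unbuffered(STDIN_FILENO, &saved)` at start-up, read keys one at a time with `getchar()` or `read()`, and call `keys_restore(STDIN_FILENO, &saved)` before exiting so that the shell is left in its normal state.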
The next problem in developing the audio output demonstration program was that I
needed to continually fill up the audio buffer and write the data to the soundcard. This
would not be possible if the program was waiting for input from the keyboard. I decided
that the way to get around this problem was to use threads, specifically the Pthreads
library. This was a good choice as I had some experience using threads from my
operating systems course. I could create a thread that would grab a frequency value from
a global variable and use this value for the sine wave that would be written to the audio
buffer. In the main program I would just need to update the global variable with a new
frequency when a specified key was pressed. This worked well.
The next problem was that a Theremin instrument can also alter the volume of the tone. I
also needed to change the amplitude of the sine wave. This was easily accomplished by
creating a global variable for the amplitude and allowing a user to change this value
between 0 and 127. This would effectively change the volume of the outputted sine wave
being generated by the thread. So I now had a completed demonstration program that
would allow you to change the frequency and volume coming from the speaker. This is
precisely what the Theremin allows you to do.
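The shared frequency and amplitude values described above can be sketched as follows. The report uses plain global variables; this sketch instead groups them in a structure guarded by a Pthreads mutex, which is an added assumption and not something the report states. All names (`tone_params`, `tone_set`, `audio_thread` and so on) are hypothetical, and the audio output itself is left as a comment.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical state shared by the keyboard loop and the audio thread. */
typedef struct {
    pthread_mutex_t lock;
    double frequency_hz;
    int amplitude;          /* 0..127, the range given in the report */
    int running;            /* cleared to ask the audio thread to exit */
} tone_params;

static int clamp_amplitude(int a)
{
    return a < 0 ? 0 : (a > 127 ? 127 : a);
}

void tone_init(tone_params *p, double frequency_hz, int amplitude)
{
    pthread_mutex_init(&p->lock, NULL);
    p->frequency_hz = frequency_hz;
    p->amplitude = clamp_amplitude(amplitude);
    p->running = 1;
}

/* Called from the keyboard loop when a key changes pitch or volume. */
void tone_set(tone_params *p, double frequency_hz, int amplitude)
{
    pthread_mutex_lock(&p->lock);
    p->frequency_hz = frequency_hz;
    p->amplitude = clamp_amplitude(amplitude);
    pthread_mutex_unlock(&p->lock);
}

/* Called from the keyboard loop to end playback. */
void tone_stop(tone_params *p)
{
    pthread_mutex_lock(&p->lock);
    p->running = 0;
    pthread_mutex_unlock(&p->lock);
}

/* Copy the current settings out; returns the running flag. */
int tone_snapshot(tone_params *p, double *frequency_hz, int *amplitude)
{
    int running;

    pthread_mutex_lock(&p->lock);
    *frequency_hz = p->frequency_hz;
    *amplitude = p->amplitude;
    running = p->running;
    pthread_mutex_unlock(&p->lock);
    return running;
}

/* Thread entry point: one pass per audio buffer until asked to stop. */
void *audio_thread(void *arg)
{
    tone_params *p = arg;
    double frequency_hz;
    int amplitude;

    while (tone_snapshot(p, &frequency_hz, &amplitude)) {
        /* Fill one buffer from frequency_hz and amplitude and write it
         * to the audio device here. In a real program the blocking
         * device write paces this loop; sched_yield() stands in for it. */
        sched_yield();
    }
    return NULL;
}
```

The main program would call `tone_init`, start the thread with `pthread_create(&tid, NULL, audio_thread, &params)`, call `tone_set` on each key press, and finish with `tone_stop` followed by `pthread_join(tid, NULL)`. Taking a snapshot once per buffer means a key press takes effect at the next buffer boundary.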
2.6 Future milestones to be met
The future milestones that need to be met are to create a simulation, based on given hand
position data, to determine what tones and amplitude the system should output. Also, I
need to create a basic user interface to tie each design group member’s modules together
and allow editing of certain parameters of the system. This will allow ease of use of the
Virtual Theremin system.
Section 3: Changes to program
Since our Virtual Theremin project divides up nicely into sections that are independent
from one another, I have not needed to alter my milestones for any problems encountered
by other group members. Currently I am on schedule and do not anticipate having to
make many changes to stay on schedule. An updated list of the group milestones has been
included in the next section. To account for the changes to other group members’
milestones, please consult either Nick Dargus’s or Dan DeAraujo’s interim report.
Section 4: Revised Timetable
Description | Estimated Completion Date | Responsibility
Proposal Draft | 27-Sep-02 | Group
Initial setup of Linux and Pyro webcam on main machine | 05-Oct-02 | Nick
Get sound card working under Debian Linux | 18-Oct-02 | Jer
Proposal Final | 18-Oct-02 | Group
Use Octave for basic simulations | 18-Oct-02 | Dan
Acquire test images under Linux | 20-Oct-02 | Nick
Determine how to input and output audio streams | 08-Nov-02 | Jer
Prototype tracking algorithm testing | 15-Nov-02 | Dan
Development of audio output functions in C | 20-Nov-02 | Jer
Acquire full stream of video feed under Linux | 01-Dec-02 | Nick
Interim Report | 10-Jan-03 | Group
Completion of Prototype tracking algorithm | 17-Jan-03 | Dan
Development and testing of interactive audio output functions in C | 20-Jan-03 | Jer
Documentation | 01-Feb-03 | Nick
Theremin Simulation algorithm for hand locations | 01-Feb-03 | Jer
Implement Basic algorithm Optimizations | 05-Feb-03 | Dan
Final tracking algorithm testing | 15-Feb-03 | Dan & Nick
User Interface | 22-Feb-03 | Jer
Completion of Final tracking algorithm | 22-Feb-03 | Dan
Final Project Testing | 02-Mar-03 | Group
Please see Appendix A for the original timetable. A graphical representation of this timetable can be found on the next page.
References
[1] Tranter, Jeff. (2002, Sept. 26). The Linux Sound HOWTO. [Online]. Available: http://www.tldp.org/HOWTO/Sound-HOWTO/index.html
[2] Stermole, DF. (2003, Jan. 4). DFS's C Page 2001-2002. [Online]. Available: http://www.macdonald.egate.net/CompSci/index.html
[3] Muys, Andrae. (2003, Jan. 5). A Pthreads Tutorial. [Online]. Available: http://www.cs.nmsu.edu/~jcook/Tools/pthreads/pthreads.html
[4] (2002, Sept. 22). Debian. [Online]. Available: http://www.debian.org
[5] Wilson, Ian. (2003, Jan. 5). Sine Wave Modulation Synthesis for Programmers. [Online]. Available: http://www.geocities.com/SiliconValley/Campus/8645/synth.html
[6] Dargus, DeAraujo & Gillard. Virtual Theremin using IEEE1394 - Technical Proposal. Toronto: University of Toronto, 2002.
[7] Tranter, Jeff. (2002, Sept. 26). Open Sound System Programmer’s Guide. (1.11) [Online]. Available: http://www.4front-tech.com/pguide/oss.pdf
[8] Sexton, Robert. (2002, Sept. 22). Take a Look at Theremins. [Online]. Available: http://www.ccsi.com/~bobs/theremin.html
[9] Hawksley, John. (2003, Jan. 5). Sampling. [Online]. Available: http://www.armory.com/~greebo/sampling.html
Appendix A: Original Timetable
Description | Estimated Completion Date | Responsibility
Proposal Draft | 27-Sep-02 | Group
Initial setup of Linux and Pyro webcam on main machine | 05-Oct-02 | Nick
Get sound card working under Debian Linux | 18-Oct-02 | Jer
Proposal Final | 18-Oct-02 | Group
Use Octave for basic simulations | 18-Oct-02 | Dan
Acquire test images under Linux | 20-Oct-02 | Nick
Design RGB to HSI module | 04-Nov-02 | Nick
Determine how to input and output audio streams | 08-Nov-02 | Jer
Prototype tracking algorithm testing | 15-Nov-02 | Dan
Test RGB2HSI module | 16-Nov-02 | Nick
Development of audio output functions in C | 20-Nov-02 | Jer
Acquire full stream of video feed under Linux | 01-Dec-02 | Nick
Completion of Prototype tracking algorithm | 01-Dec-02 | Dan
Interim Report | 10-Jan-03 | Group
Test RGB2HSI module against video feed | 20-Jan-03 | Nick
Development and testing of interactive audio output functions in C | 20-Jan-03 | Jer
Implement Basic algorithm Optimizations | 31-Jan-03 | Dan
Documentation | 01-Feb-03 | Nick
Theremin Simulation algorithm for hand locations | 01-Feb-03 | Jer
Final tracking algorithm testing | 15-Feb-03 | Dan
User Interface | 22-Feb-03 | Jer
Completion of Final tracking algorithm | 22-Feb-03 | Dan
Final Project Testing | 02-Mar-03 | Group
A graphical representation of this timetable can be found on the next page.