
Musical Interaction at a Distance: Distributed Immersive Performance

E. Chew, R. Zimmermann, A.A. Sawchuk, C. Kyriakakis, C. Papadopoulos, A.R.J. François, G. Kim, A. Rizzo and A. Volk

University of Southern California Viterbi School of Engineering, Integrated Media Systems Center, Los Angeles, CA 90089

{echew, rzimmerm, sawchuk, ckyriak, chrisp, afrancoi, arizzo, avolk}@usc.edu, [email protected]

Abstract

The Distributed Immersive Performance (DIP) project explores one of the most challenging goals of networked media technology: creating a seamless environment for remote and synchronous musical collaboration. A number of research groups have presented one-time demonstrations of distributed performance with varying degrees of success since the 1970s. None, as far as we know, has focused on capture and recording of musical experience and thorough analysis of realistic musical interaction in an environment constrained by network latency and reduced physical presence. In this paper we present a comprehensive framework for the capture, recording and replay of high-resolution video, audio and MIDI streams in an interactive environment for collaborative music performance, and user-based experiments to determine the effects of latency in aural response on performers’ satisfaction with the ease of creating a tight ensemble, crafting a musical interpretation and adapting to the conditions. The experiments mark the beginning of our efforts to study comprehensively the effects of musical interaction over the Internet in a realistic performance setting. The users and evaluators of the system are the Tosheff Piano Duo, Vely Stoyanova and Ilia Tosheff, a professional piano duo who have won awards and concertized internationally. We present preliminary results from our first two sets of experiments: two players in the same room at separate keyboards with direct visual contact and delayed aural response from the partner (between 0 ms and 150 ms), and the same experiments with the players swapping parts. User responses are reported for experiments using Poulenc’s Sonata for Piano Four-Hands, whose movements are the Prelude (score-recommended tempo of 132 bpm), Rustique (46 bpm) and Final (160 bpm). For the fast movements (one and three), the players experienced the highest difficulty in creating a tight ensemble at 50 ms and above. In the fast but not-so-rapid first movement, the players almost always rated the difficulty of creating a musical interpretation higher than the ensemble (synchronization) difficulties. In general, the users judged that, with practice, they could adapt to delays below 50 ms.

1. Introduction

This paper presents a real-time and multi-site distributed interactive and collaborative environment called Distributed Immersive Performance (DIP) [1], one of the key initiatives at USC’s Integrated Media Systems Center (IMSC). The paper proposes a framework for the capture, recording and replay of digital music, audio and video signals so as to enable comprehensive evaluation and study of musical interaction in a networked musical performance. We present here the initial experiments and preliminary results of the first two sets of networked performance experiments involving a professional piano duo.

At IMSC, our goal is to develop the technologies of immersive and other integrated media systems through research, engineering, education and industrial collaborations [2]. Our vision of immersive technology is the creation of a complete audio and visual environment that places people in a virtual space where they can communicate naturally even though they are in different physical locations. The Distributed Immersive Performance (DIP) environment presents an ideal testbed for multimedia creation, archiving, representation and transmission of electronic experiences. As a system, it facilitates new forms of creativity by enabling remote and synchronous musical collaborations. Musical collaboration demands a level of fidelity and immediacy of response that makes it an ideal testbed for pushing the limits of both human perception and technological innovation. The performance of such a system can be measured and quantified through the capture, replay and analysis of musician interaction and the digital music signals they create.

One of the main challenges of synchronous collaboration over the network is the effect of audio and video latency and reduced physical presence on ensemble synchrony as well as musical interpretation. We aim to create a comprehensive framework for capturing, documenting and analyzing the effects of audio and video latency and a virtual shared environment on synchronous musical collaboration. Musician satisfaction with the interactive environment is documented through a questionnaire, and their ensemble synchrony will be quantified computationally in future analyses. High-definition (HD) video, audio and MIDI streams are captured live and archived with common time stamps.

The users (and evaluators) of this musical collaboration environment are a professional piano duo, the Tosheff Piano Duo [3], award-winning expert musicians who have been performing together since 1997. By engaging a professional piano duo for the experiments, we can discount learning effects in the analyses. In the first set of experiments, the duo performed each of the three movements of Poulenc’s Sonata for Piano Four-hands with zero visual delay (in the same room, seated across from each other) and varying degrees of controlled audio delay. In the second set of experiments, the duo swapped parts to test for symmetry in the effects of sensory delay on the performers. We present some preliminary findings from these initial experiments, and propose future extensions of our current work.

Other research groups have experimented with distributed collaborative performance environments. One of the latest initiatives takes the form of a Berlin-Paris network concert to take place at the eighth International Cultural Heritage Informatics Meeting (ICHIM) in August 2004 [4]. Previous experiments include a network jam session between Stanford’s SoundWIRE Group and McGill University on 13 June 2002 [5], and live demonstrations (a distributed trio) and presentations documented by Eve Schooler at [6]. As a departure from the previous one-time demonstration/performances and a step towards rigorous study of performance in the time-delayed environment, the Stanford SoundWIRE group has conducted several experiments to quantify the effects of collaboration over the Internet by analyzing the ensemble accuracy of two persons clapping a short but interlocking rhythmic pattern [7, 8]. As far as we know, our experiments, in which professional musicians perform complex composed pieces, are the first such evaluation on a realistic scale.

The remainder of the paper is structured as follows: we first present some previous incarnations of the Distributed Immersive Performance environment and our prior experiences with collaborative performances. Then, we focus on the present system, DIP v.2.0, including the synchronous capture and archival of HD video, audio and MIDI data streams, the evaluation procedure and some preliminary results. Finally, we describe future experiments involving Distributed Immersive Performance.

1.1. Background: Snapshots of Prior Experiments

The present set of Distributed Immersive Performance experiments grew out of our previous work in this arena, depicted in the timeline in Figure 1. DIP has been planned as a two-way and multi-way extension of a previous IMSC research project called Remote Media Immersion (RMI) [9]. The goals of the RMI project are the capture, one-way long-distance streaming, and reproduction of very high fidelity high-definition (HD) video and multichannel audio over shared networks such as the Internet and Internet2. Figure 1 shows various RMI milestone demonstrations for one-way streaming that were performed between June 2002 and January 2004, each furthering RMI’s goals for high fidelity transfer of high bandwidth data streams.

The first DIP experiment involving synchronous collaboration, with remotely located musicians and two-way audio (one location with immersive audio), took place between EEB and PHE (two buildings in USC’s Viterbi School of Engineering) in December of 2002. The musicians, Wilson Hsieh (viola) and Elaine Chew (piano), played Piazzolla’s Le Grand Tango and excerpts from Hindemith’s Sonata Op. 11 No. 4. Audio delays between 20 ms and 300 ms were introduced artificially. The violist found that the use of 10.2-channel immersive audio to simulate the acoustics of a concert hall greatly ameliorated the discomfort of playing with delay.

The first musical experiment involving both video (MPEG-2) and audio (10.2-channel immersive audio), a master class conducted by LA Philharmonic cellist and USC Thornton School of Music faculty Ron Leonard for students at the New World Symphony in Miami Beach, took place in January of 2003. Leonard reported improved presence with immersive audio, saying that he felt the “student was really there.”

The second DIP experiment involving synchronous musical collaboration, labeled DIP v.1.0, took place in June 2003 between PHE 106 and Ramo Hall 106, two rooms located approximately 350 m apart on the USC campus. This experiment featured two-way interactive video and 10.2-channel immersive audio at the audience site (PHE 106), and a mixture of both active (musicians) and passive (audience) participants. The musicians were keyboard faculty at the Thornton School of Music, Dennis Thurmond (accordion) and Elaine Chew (piano). Photos from the experiments are shown in Figure 2.


Figure 3 shows the technical setup for DIP v.1.0. The video was NTSC resolution, 31 Mb/s DV, decompressed in software on the client PC. The total one-way video latency was approximately 115 ms, with 110 ms due to DV camera compression and decompression, plus < 5 ms network delay. The audio was uncompressed, with 16 or more channels at 1 Mb/s each, and a total one-way latency of < 15 ms, with < 10 ms due to audio processing, plus < 5 ms network delay.
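The latency budget is simple additive arithmetic over the pipeline components. As a minimal sketch of the numbers quoted above (the dictionary breakdown and helper name are ours, for illustration only):

def one_way_latency_ms(components):
    """Sum the per-component one-way latencies, in milliseconds."""
    return sum(components.values())

# Component values as quoted in the text (network delay bounded by 5 ms).
video_path = {"DV compression + decompression": 110.0, "network": 5.0}
audio_path = {"audio processing": 10.0, "network": 5.0}

print(f"video: ~{one_way_latency_ms(video_path):.0f} ms one-way")  # ~115 ms
print(f"audio: <{one_way_latency_ms(audio_path):.0f} ms one-way")  # < 15 ms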

Figure 1. Timeline of Remote Media Immersion (RMI) and Distributed Immersive Performance (DIP) experiments. This paper focuses on the experiments outlined in the box.

Figure 2. Distributed Immersive Performance v.1.0 with USC Thornton School of Music keyboard faculty, Dennis Thurmond, in PHE 106 (left), and Elaine Chew in Ramo Hall 106 (right), located approximately 350 m apart on the USC campus.

Because the audience was co-located with one of the musicians (Thurmond), the performance was constrained so that the end result was as synchronous as possible at the audience site. This arrangement resulted in performances that were more musically restrained than the performers would have preferred, and the degree of delay in the video capture and delivery made it unusable as a source of cues for synchronization.

[Figure 1 timeline:
Jun 2002 – Remote Media Immersion (RMI) initial demonstration.
Oct 2002 – Internet2 meeting: large-room RMI demonstration.
Dec 2002 – Distributed Immersive Performance (DIP) experiment: distributed duet.
Jan 2003 – Recording from streams.
Jan 2003 – DIP experiment: remote master class with the New World Symphony.
Jun 2003 – DIP experiment v.1.0: duet with audience.
Jan 2004 – Two-way live HD streaming; LA, Hawaii, Miami experiments.
Feb-Apr 2004 – DIP experiment v.2.0: two-way baseline user studies (planning). A: players perform under delayed conditions for the first time; B: player 1 and player 2 swap parts (symmetry test); C: players practise to compensate for delay; D: players instructed to perform to a target audience site.]


Figure 3. Distributed Immersive Performance v.1.0 technical setup.

Figure 4. Distributed Immersive Performance v.2.0 physical setup.

2. Distributed Immersive Performance v.2.0

The goal of DIP v.2.0 is to create a framework that will enable comprehensive evaluation and analysis of the technical barriers and psychophysical effects of latency and fidelity on music and other forms of human interaction between two interconnected sites.

2.1 Technical and Physical Setup

In the DIP v.2.0 experiments, two musicians are seated in the same room across from each other so that the visual delay is zero. Each musician plays on a Yamaha MIDI keyboard with 88 weighted-action keys and hears his/her own playing with no added delay. MIDI and audio output from the keyboards are sent to the High-performance Data Recording Architecture (HYDRA) database, and the audio stream is processed through a delay box and re-routed to the headphones of the other player. Figure 4 shows the physical setup of DIP v.2.0 and Figure 5 the technical setup. All data streams (high-resolution video, audio and MIDI) are recorded and archived in HYDRA with common time stamps. The HYDRA system is described in greater detail in the following section.
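To make the role of the delay box concrete, the following is a minimal offline sketch in Python using NumPy (the function name and parameters are ours and do not describe the actual hardware): it delays the partner’s audio feed by a configurable number of milliseconds, which is what each player’s headphone mix undergoes.

import numpy as np

def delay_line(signal, delay_ms, sample_rate=48000):
    """Return `signal` delayed by `delay_ms`, padding the front with silence.
    An offline stand-in for the hardware delay box: each player hears the
    partner's audio `delay_ms` later than it was produced."""
    delay_samples = int(round(delay_ms * sample_rate / 1000.0))
    return np.concatenate(
        [np.zeros(delay_samples, dtype=signal.dtype), signal])

# Example: delay one second of a 440 Hz tone by 50 ms.
t = np.arange(48000) / 48000.0
partner_audio = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)
heard = delay_line(partner_audio, delay_ms=50)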


Figure 5. Distributed Immersive Performance v.2.0 technical setup. Thin and thick lines denote low- and high-bandwidth signals, respectively.

2.2 Streaming and Recording Technology

The recording, archiving and playback of performances is an essential component of the DIP v.2.0 system. This requires a multi-channel, multi-modal recording system that can store a distributed performance event in real time, as it occurs. Such a system must be capable of playing back the event with user-defined delay offsets between the various streams that constitute the performance. For example, an audience member should be able to specify the reproduction of the event from different perspectives (the viewpoint of performer one, performer two, or the audience). The challenge is to provide real-time digital storage and playback of the many, dynamically synchronized streams of video and audio data from scalable, distributed servers. The servers require that resources are allocated and maintained such that: (a) other streams are not affected (recording or playback); (b) resources such as disk bandwidth and memory are used efficiently; (c) recording is seamless with no hiccups or lost data; and (d) synchronization between multiple, related streams is maintained.
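One way to realize such user-defined playback offsets, assuming every archived record carries the common capture timestamp described earlier, is to merge the per-stream records through a priority queue keyed on timestamp plus offset. The sketch below is illustrative only; the function and argument names are ours, not the HYDRA interface.

import heapq

def merge_streams_with_offsets(streams, offsets_ms):
    """Yield (playout_time_ms, stream_id, payload) in playout order.
    `streams` maps stream_id -> records sorted as (timestamp_ms, payload);
    `offsets_ms` shifts each stream by a user-defined amount."""
    iters = {sid: iter(records) for sid, records in streams.items()}
    heap = []
    for sid, it in iters.items():
        first = next(it, None)
        if first is not None:
            ts, payload = first
            heapq.heappush(heap, (ts + offsets_ms.get(sid, 0), sid, payload))
    while heap:
        playout, sid, payload = heapq.heappop(heap)
        yield playout, sid, payload
        nxt = next(iters[sid], None)
        if nxt is not None:
            ts, payload = nxt
            heapq.heappush(heap, (ts + offsets_ms.get(sid, 0), sid, payload))

For example, replaying the event from performer two’s perspective amounts to giving performer one’s streams the audio delay that performer two actually experienced.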

Figure 6 illustrates how the DIP v.2.0 experiments were recorded with our High-performance Data Recording Architecture (HYDRA) [10, 11]. Extending our earlier Yima streaming media system [12], HYDRA includes real-time stream recording. Support for a number of different media types and multiple concurrent streams was required. In the experiments, two digital video streams were acquired from JVC JY-HD10U cameras via FireWire (IEEE 1394) in HDV format (1280x720 pixels at 30 frames per second). The resulting MPEG transport streams were time-stamped and transmitted in discrete packets to the storage backend at approximately 20 Mb/s over an IP network. A suitable protocol for multimedia data traffic is the Real-time Transport Protocol (RTP) on top of the User Datagram Protocol (UDP). The frontend-backend dialog, which includes control commands such as record, pause, resume, and stop, was handled via the Real-time Streaming Protocol (RTSP). For monitoring purposes, both camera streams were rendered with a software decoder on the HYDRA front-end machine.
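The essence of this transport, time-stamping each packet at capture and shipping it to the backend, can be sketched over a plain UDP socket. This is a simplified stand-in for RTP; the 12-byte header layout, address and port below are ours, for illustration.

import socket
import struct
import time

BACKEND = ("127.0.0.1", 5004)  # stand-in for a recording gateway address

def send_timestamped(sock, payload: bytes, stream_id: int):
    """Prefix the payload with a stream id and a capture timestamp in
    microseconds, then transmit the packet over UDP."""
    ts_us = time.time_ns() // 1000
    header = struct.pack("!IQ", stream_id, ts_us)  # 4-byte id, 8-byte time
    sock.sendto(header + payload, BACKEND)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_timestamped(sock, b"\x00" * 1316, stream_id=1)  # one MPEG-TS-sized chunk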

Additionally, two uncompressed PCM audio channels were acquired (16 bits per sample at a 48 kHz sampling rate). Again, all incoming data was time-stamped and then forwarded to the storage backend. The bandwidth required was approximately 1.5 Mb/s. Finally, the MIDI data produced by the electronic pianos were time-stamped and forwarded to the backend, where they were stored as discrete events in a database. With these capabilities, HYDRA had the functionality necessary to support the DIP experiments. The features of this architecture are summarized below.
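These bandwidth figures follow directly from the stream parameters; a back-of-the-envelope check (the aggregate ignores the comparatively negligible MIDI events):

channels, bits_per_sample, sample_rate = 2, 16, 48_000
audio_mbps = channels * bits_per_sample * sample_rate / 1e6
print(f"uncompressed PCM audio: {audio_mbps:.3f} Mb/s")  # 1.536, i.e. ~1.5 Mb/s

hdv_mbps = 20  # per camera, as quoted above
total_mbps = 2 * hdv_mbps + audio_mbps  # two cameras plus the audio pair
print(f"approximate aggregate recording load: {total_mbps:.1f} Mb/s")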


• Multi-node, multi-disk cluster architecture to provide scalability [10]-[12].
• Multiple recording gateways to avoid bottlenecks due to single-point recording.
• Random data placement for the following operations: block-to-storage-node assignment, block placement within the surface of a single disk and, optionally, packet-to-block assignment. These harness the average transfer rate of multi-zone disk drives and improve scalability.
• Unified model for disk scheduling: deadline-driven data reading and writing (fixed block sizes reduce the complexity of the file system); see the sketch following this list.
• Unified memory management with a shared buffer pool for both reading and writing [13].
• Statistical admission control to accommodate variable bit rate (VBR) streams and multi-zone disk drives [14].
• Selective-retransmission-based error control and bandwidth smoothing through feedback [15]-[17].
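To illustrate the deadline-driven, unified read/write scheduling named above, here is a toy sketch (ours, not HYDRA’s implementation): playback reads and recording writes enter a single queue and are serviced in earliest-deadline-first order.

import heapq

class DeadlineDiskScheduler:
    """Toy unified scheduler: every request, read or write, carries a
    deadline, and the earliest deadline is serviced first."""

    def __init__(self):
        self._queue = []
        self._seq = 0  # tie-breaker so equal deadlines stay FIFO

    def submit(self, deadline_ms, op, block_id):
        heapq.heappush(self._queue, (deadline_ms, self._seq, op, block_id))
        self._seq += 1

    def next_request(self):
        if not self._queue:
            return None
        deadline, _, op, block_id = heapq.heappop(self._queue)
        return deadline, op, block_id

sched = DeadlineDiskScheduler()
sched.submit(40, "write", block_id=7)  # incoming recorded block
sched.submit(25, "read", block_id=3)   # block due for playback sooner
print(sched.next_request())            # (25, 'read', 3): earliest deadline wins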

Figure 6. Data Stream Recorder Architecture. Multiple source and rendering devices are interconnected via an IP infrastructure. The recorder functions as a data repository that receives and plays back many streams concurrently. (Note: playback streams are not shown, to simplify the diagram.)

3. User Experiments and Evaluation Methodology

Our goal is to enable and to perform comprehensive evaluation of the effects of remote collaboration over the Internet on musical ensemble, interpretation and other factors pertaining to human interaction. The HYDRA architecture enables the capture, storage and replay of multiple data streams. This section describes our efforts in designing user-centered experiments to determine the effects of network latency and physical presence on synchronous musical collaboration. Section 3.1 describes the first two sets of experiments, designed to capture the effects of delayed audio response between players; Section 3.2 documents the players’ answers to the questions posed; and Section 3.3 summarizes the preliminary results for these first sets of experiments.

3.1 User Experiments

In May 2004, we conducted two sets of experiments covering all three movements of Poulenc’s Sonata for Piano Four-hands. Our musical collaborators are the Tosheff Piano Duo [3], an award-winning professional piano duo consisting of Vely Stoyanova and Ilia Tosheff. They began their partnership as a piano duo in 1997 and have won numerous awards and concertized internationally. The physical and technical setup is as described in Section 2.1, and the high-performance data recording architecture in Section 2.2.


Experiment A: The players are asked to perform a movement of Poulenc’s Sonata for Piano Four-hands (Vely Stoyanova playing the Prima part and Ilia Tosheff playing the Seconda part) under audio delay conditions unknown to them. They are given approximately 30 seconds to calibrate to the given conditions by playing in unison the Seconda part of the first section of the first movement (a rhythmic pattern repeated numerous times). They are then asked to perform the movement as best they can under the conditions. The audio delay varied from 0 ms to 150 ms. All interactions and performance nuances are recorded as HD video, audio and MIDI streams with common time stamps. At the end of the movement, the players complete a questionnaire and are asked to self-report their observations in a short debriefing session. The questionnaire is given at the end of this section.

Experiment B: The same as Experiment A, except that the players swap parts: Vely Stoyanova plays the Seconda part and Ilia Tosheff plays the Prima part.

The questionnaire the players had to answer at the end of each take consisted of the following questions (to be answered on a scale of 1 to 7, with 1 being the easiest and 7 the most difficult):

(a) How would you rate the ease of ensemble playing?
(b) How would you rate the ease of creating a musical interpretation?
(c) How would you rate the ease of adapting to this condition with practice?

3.2 Preliminary Results

The users’ responses to the three questions for Experiment A are documented in Sections 3.2.1 through 3.2.3. We present the results of the symmetry experiment (B) for the slow movement, Rustique, in Section 3.2.4.

3.2.1 User Response for First Movement: Prelude (prescribed tempo = 132 bpm)

In the following charts, each pair of columns represents Vely Stoyanova’s and Ilia Tosheff’s answers, respectively, for each audio delay value indicated on the horizontal axis (in ms).

[Three bar charts: the players’ ratings for Q(a) ensemble, Q(b) musical interpretation and Q(c) adaptability, each on the 1-7 scale, for audio delays of 0, 10, 20, 30, 40, 50, 75, 100 and 150 ms.]


3.2.2 User Response for Second Movement: Rustique (prescribed tempo = 46 bpm)

[Three bar charts: ratings for Q(a), Q(b) and Q(c) on the 1-7 scale for audio delays of 0, 10, 20, 30, 40, 50, 75, 100 and 150 ms.]

3.2.3 User Response for Third Movement: Final (prescribed tempo = 160 bpm)

[Three bar charts: ratings for Q(a), Q(b) and Q(c) on the 1-7 scale for audio delays of 0, 10, 20, 30, 40, 50, 75, 100 and 150 ms.]


3.2.4 User Responses for Symmetric Experiment (B): Rustique

In the following charts, each group of four columns represents Vely Stoyanova’s (playing Prima, playing Seconda) and Ilia Tosheff’s (playing Prima, playing Seconda) answers, respectively, for a subset of the audio delay levels indicated on the horizontal axis (in ms).

[Three bar charts: ratings for Q(a), Q(b) and Q(c) on the 1-7 scale for audio delays of 0, 40, 50, 75 and 100 ms.]

3.3 Summary of Preliminary Results

We first summarize the user responses for Experiment A. For the fast and rhythmic first movement, the Prelude (the prescribed tempo in the score is 132 beats per minute), ensemble issues dominate at audio delays of 50 ms and above. Musicality issues become problematic earlier, and both players agree that adaptation is possible for audio delays below 50 ms.

For the slow movement, Rustique (the prescribed tempo in the score is 46 bpm), the ensemble, musicality and adaptation difficulties increase monotonically with the delay value for both players (with only one exception). As before, the difficulty of presenting a musical interpretation rises above moderate levels at about 75 ms, and both players appear confident that adaptation is possible below 75 ms.

For the third and fastest movement, the Final (the prescribed tempo in the score is 160 bpm), ensemble difficulty reached moderate levels even at audio delays of 10 ms. The answers imply that the players found this rapidly paced movement harder to perform satisfactorily with delayed audio than the other movements. Both players rated the ensemble difficulties as comparable to the musicality challenges at all audio delay levels. The players are less optimistic about being able to adapt to delays for this movement, but remain hopeful that practice can overcome some of these difficulties where the delay is under 50 ms.

Next, we summarize the users’ responses for Experiment B (where the players swap parts) for the second movement, Rustique. These answers are compared to their corresponding responses in Experiment A. In general, both players found the unfamiliar part (represented by the two inner columns in each quadruplet) more difficult (in terms of ensemble, musicality as well as adaptability) at the highest delay value, 150 ms. At 75 ms, both players agree that the Seconda part is more difficult than the Prima (again, on all counts).


4. Conclusions and Future Experiments

We have presented a Distributed Immersive Performance environment for synchronous musical collaboration, together with a high-performance data recording architecture for recording and replaying performances for the analysis and evaluation of musical interaction in a networked environment. These initial experiments mark the beginning of a series of experiments to study and understand the effects of network delays and a virtual environment on musical ensemble, interpretation and adaptability. Future studies will incorporate detailed analyses of the users’ comments as well as quantitative measures of musical synchronization derived by computational means. Upcoming experiments will allow the musicians to practice to adapt to network delays, and the musicians will be instructed to perform for specific target sites.

Acknowledgements

First and foremost, we wish to thank our musical collaborators, Vely Stoyanova and Ilia Tosheff, the Tosheff Piano Duo. We also acknowledge the help of the numerous students who assisted in the project, including Hui-Yun Frances Kao, Dwipal Desai, Kanika Malhotra, Will Meyer, Moses Pawar and Shiva Sundaram. We also thank Seth Scafani and Allan Weber for providing technical support throughout the project. Last but not least, the DIP experiments were made possible by NSF grant MRI-0321377 and Cooperative Agreement No. EEC-9529152. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the National Science Foundation.

References

[1] A.A. Sawchuk, E. Chew, R. Zimmermann, C. Papadopoulos and C. Kyriakakis, “From Remote Media Immersion to Distributed Immersive Performance.” In Proceedings of the ACM SIGMM 2003 Workshop on Experiential Telepresence (ETP 2003), Berkeley, California, November 7, 2003. In conjunction with ACM Multimedia 2003.
[2] D. McLeod, U. Neumann, C.L. Nikias and A.A. Sawchuk, “Integrated Media Systems,” IEEE Signal Processing Magazine, vol. 16, no. 1, pp. 33-76, January 1999.
[3] The Tosheff Piano Duo – http://www.tosheffpianoduo.com.
[4] ICHIM 04 – Eighth International Cultural Heritage Informatics Meeting, Berlin, August 30 – September 2, 2004 – http://www.ichim.org/jahia/Jahia/lang/en/.
[5] Stanford’s SoundWIRE Group – http://ccrma.stanford.edu/groups/soundwire/.
[6] Eve Schooler’s Musical Distractions – http://www.async.caltech.edu/~schooler/music.html.
[7] N. Schuett, “The Effects of Latency on Ensemble Performance.” Masters Thesis, Stanford CCRMA, May 2002.
[8] C. Chafe, M. Gurevich, G. Leslie and S. Tyan, “Effect of Time Delay on Ensemble Accuracy.”
[9] R. Zimmermann, C. Kyriakakis, C. Shahabi, C. Papadopoulos, A.A. Sawchuk and U. Neumann, “The Remote Media Immersion System,” IEEE MultiMedia, vol. 11, no. 2, pp. 48-57, April-June 2004.
[10] R. Zimmermann, K. Fu and W.-S. Ku, “Design of a Large Scale Data Stream Recorder.” In Proceedings of the 5th International Conference on Enterprise Information Systems (ICEIS 2003), Angers, France, April 23-26, 2003.
[11] R. Zimmermann, K. Fu and D.A. Desai, “HYDRA: High-performance Data Recording Architecture for Streaming Media.” In Video Data Management and Information Retrieval, ed. Sagarmay Deb, University of Southern Queensland, Toowoomba, QLD 4350, Australia. Idea Group Inc., 2004.
[12] C. Shahabi, R. Zimmermann, K. Fu and S.-Y. D. Yao, “Yima: A Second Generation Continuous Media Server,” IEEE Computer, vol. 35, pp. 56-64, June 2002.
[13] K. Fu and R. Zimmermann, “Memory Management for Large Scale Data Stream Recorders.” In Proceedings of the 6th International Conference on Enterprise Information Systems (ICEIS 2004), Porto, Portugal, April 14-17, 2004.
[14] R. Zimmermann and K. Fu, “Comprehensive Statistical Admission Control for Streaming Media Servers.” In Proceedings of the 11th ACM International Multimedia Conference (ACM Multimedia 2003), Berkeley, California, November 2-8, 2003.
[15] R. Zimmermann, K. Fu, N. Nahata and C. Shahabi, “Retransmission-Based Error Control in a Many-to-Many Client-Server Environment.” In Proceedings of the SPIE Conference on Multimedia Computing and Networking 2003 (MMCN 2003), Santa Clara, California, pp. 34-44, January 29-31, 2003.
[16] A.E. Dashti, S.H. Kim, C. Shahabi and R. Zimmermann, eds., Streaming Media Server Design. Prentice Hall IMSC Press Multimedia Series, March 2003. ISBN 0-130-67038-3.
[17] C. Papadopoulos and G.M. Parulkar, “Retransmission-based Error Control for Continuous Media Applications.” In Proceedings of the 6th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV 1996), Zushi, Japan, April 23-26, 1996.