Learning to play the flute with an anthropomorphic robot

Learning to Play the Flute with an Anthropomorphic Robot

Solis, J.*†, Bergamasco, M.*, Isoda, S.†, Chida, K.†, and Takanishi, A.†

*Perceptual Robotics Laboratory, Scuola Superiore Sant’Anna. Pisa, Italy
[email protected], [email protected]

†Department of Mechanical Engineering, Waseda University. Tokyo, Japan
[email protected]

Abstract

For more than ten years, research on the anthropomorphic flutist robot at Waseda University has focused on reproducing, as faithfully as possible, the physiology of the human organs involved in flute playing, in order to clarify this mechanism from an engineering point of view. This research is motivated by the need to develop useful robots for practical use in human living environments. As a result, the newest anthropomorphic flutist robot, WF-4 (Waseda Flutist No. 4), with 24 DOF, has been developed; it not only improves expressiveness but also reduces the dimensions of all its mechanisms to nearly human size. In this paper, the flutist robot is used as a tool to help a human teacher improve the sound quality of beginner flute players. The robot is used not only to reproduce human flute playing but also to evaluate the pupil’s performance and to provide useful verbal and graphical feedback so that learners’ performances improve. While the robot transfers the basics of the skill to the students, the teacher encourages them to produce a sound similar to the robot’s (student psychology). An experiment was designed to compare the added value of using the flutist robot to teach beginner students against the conventional way of teaching. Students’ performances were analyzed with different evaluation methods. The results showed that pupils performed better when the robot was used.

1 Introduction

From birth and throughout our lives, we learn new skills mainly by observing others. Learning new motor skills by observing and then reproducing the behavior of conspecifics is an example of social learning. It might be described as an imitative act (Bruce, 1996). Imitation is one of the most important mechanisms whereby knowledge may be transferred and skills acquired between agents (both biological and artificial). It requires two or more agents sharing a context that allows one agent to imitate another.

Imitation is contrasted with mimicry: imitation is more than the mere ability to reproduce others’ actions; it is the ability to replicate them and, by doing so, to learn new skills simply by observing those performed by others. Learning by imitation is thought to play a key role in the development of social skills in primates and humans (Meltzoff, 1996). One of the most prominent theories on the development of imitative abilities was presented by Piaget (1951), who proposed that imitative ability develops in stages: imitating behaviors that can be compared within the same sensory modality, imitating acts that require cross-modal equivalences, and finally, deferred imitation.

Neurophysiologists have discovered a remarkable system, called mirror neurons, that establishes imitation as a central part of human learning (Rizzolatti, 1996). Meltzoff, et al (2002) suggested that neurons in the premotor area fire not only in preparation for upcoming movements but also when we observe someone else carry out that action. Imitation is not the only requirement for learning new skills; feedback is also a critical constituent of learning. Some motor behavior researchers have identified feedback as the most important variable controlling performance and learning (Bilodeau, 1961). Sage (1984) stated that feedback increases the rate of improvement on new tasks, enhances performance on overlearned tasks, and makes tasks more interesting. Feedback serves three paramount functions in the learning process:

• Reinforces the learner: Positive reinforcement gives the learner a feeling of satisfaction with his/her performance. More importantly, it instills a desire to repeat the performance in the same manner.

• Informs the learner: Specific information regarding execution is crucial to both the current performance and task repeatability.

• Motivates the learner: Feedback, when initiated in a constructive manner, provides incentive and motivates students to achieve higher performance levels.

Teaching a robot to perform various human-like tasks has become a topic of growing interest. Imitation learning is one of the most promising ways to accelerate behavior acquisition in complex robots (Schaal, 1999). Research on imitation learning in robots has been one of the most interesting cognitive approaches to modeling how humans learn to acquire various kinds of behaviors (Asada, et al 2000). The development of humanoid robots has focused on making robots more “human”, not only to provide natural means of human-robot interaction but also as a mechanism for bootstrapping more complex behaviors. Several researchers have shown that humanoid robots can achieve highly flexible mobility at a practical level and can therefore be used for many applications. So far, however, the potential of humanoid robots has been limited to imitating or recognizing human skills in order to interact naturally with humans.

Copyright©2004 All Rights Reserved

Proceedings ICMC 2004

Given the high dexterity of humanoid robots, it is feasible to imagine human-like robots that can transfer human abilities to unskilled people and improve them. Repeated demonstration of the task, together with proper feedback to correct the learner’s performance (visual, auditory, etc.), will be an important aim, not only for better understanding the parameters that may drive or break down the learning process but also as an important tool to complement the traditional way of teaching.

In this paper, we present an anthropomorphic robot that is not only capable of playing the flute as a human does but can also help beginner students improve the sound quality of their performances through demonstration. Furthermore, after recording and processing a student’s performance, the robot can provide graphical and verbal feedback to improve the student’s execution. This paper is organized as follows: the second section briefly describes the newest version of the flutist robot. The third section describes the experimental design used to verify students’ performance when a robot is used for teaching. The remaining sections present and discuss the results of this experiment.

2 Anthropomorphic Flutist Robot

In 1992, the “Humanoid Project” was started to develop anthropomorphic robots that co-exist and interact with humans. Such a robot must be able to carry out human-like activities as people do, so that communication between humans and robots is natural. It is therefore necessary for the robot to simulate some of the basic functions of the human body. Since 1990, Waseda University has been developing the flutist robot in order to clarify the mechanism involved in human flute playing from an engineering perspective. Moreover, music performance seems particularly promising, since music is a universal communication medium, at least within a given cultural context (Gabrielsson, 1999).

The previous version of the flutist robot (WF-3RIX) reproduced, as realistically as possible, the function of every human organ involved in flute playing (Takanishi, et al 1998). The robot simulated a respiratory system, a playing attitude mechanism, fingers, a mouth, a throat, a tongue, etc.

The newest version of the anthropomorphic flutist robot, WF-4, has not only improved the expressiveness of the robot’s flute performance but has also improved the mechanical design, reducing the dimensions of each simulated organ (Takanishi, et al 2003). This robot includes the following mechanisms (Figure 1): the lip (5 DOF), the neck (4 DOF), the fingers (12 DOF), the tongue (1 DOF), the lung (1 DOF) and the vibrator mechanism (1 DOF). The neck has been introduced as a relative positioning mechanism for the lip. The design of the WF-4 focused on replicating the human shape; therefore, each mechanism has been dimensioned close to human size (approximately 1 ~ 1.2 times).

Figure 1. The newest version of the flutist robot, WF-4 (Waseda Flutist No. 4).

Fig. 2 shows a schematic diagram of the WF-4 robot system. The system is composed of the human organ mechanisms, a personal computer (PC) to control the robot, a PC sequencer to generate the MIDI signals and a MIDI tone generator module as an accompaniment system. The PC sequencer outputs the accompaniment data for the MIDI sound source and the timing clock that manages the entire music information. The PC controller receives the output of the PC sequencer (music data) and uses it to control the robot’s hardware (robot data). The “robot data” contains all the position control parameters necessary for the robot’s performance.

Figure 2. Musical system performance

3 Experimental Setup

The tests reported in this experiment focused on what happens when a robot is used to help a teacher improve the sound quality of pupils. Sixteen beginner students (aged 12 to 20) took part in the experiment. The students were divided into two groups: group A was taught only by the human teacher, and group B was taught by a teacher assisted by the flutist robot. Each group had two sessions on different days. Each session was divided into two phases, practice and evaluation, using different scores (Figure 3). On the first day of each group, the teacher introduced the basics of flute playing and gave each pupil all the necessary advice about technique (fingering, body posture, etc.) so that the students’ level became as homogeneous as possible. Then, on the second day of each group, their performances were analyzed and compared (evaluation session only).

The teaching procedure used for group B is shown in Fig. 4. At each step of the evaluation session, the robot helped the student by demonstrating and providing feedback, while the teacher motivated the student. For group A, the human teacher only gave advice to improve the student’s execution, without demonstrating the correct sound of the notes; meanwhile, the robot only recorded the performance so that it could later be analyzed and compared against group B. In both cases, once the student had repeated Etude III four times, neither the human teacher nor the robot provided any further feedback. Both kinds of feedback were given after the student finished the performance (off-line mode), based on the analysis of the evaluation methods described by Solis, et al (2004): the sound quality evaluation function, the entropy index, the power spectrum analysis and the Symmetric Dot-Pattern (SDP).

Figure 3. The first two etudes shown above were used during the practice session. The last etude, taken from the 4th movement of Beethoven’s 9th Symphony, was used for the evaluation session.

Learners’ performances were recorded using the M-Audio USB Audiosport Duo. The operator did not control the recording start/stop commands; these signals were controlled by the PC sequencer. Recording started after the metronome signal, also generated by the PC sequencer, had produced four pulses (Figure 6). This procedure was used to synchronize the MIDI and sound data so that the properties of the sound could be analyzed. After the system analyzed the execution, the results were displayed as graphical feedback to motivate the learner (Figure 7). This graphical display mainly compares the differences between the student’s and the robot’s performances. These differences are expected to become smaller after some practice.

Figure 4. The teaching procedure used for group B.

Figure 5. The flutist robot assisted the human teacher in improving the sound quality of beginner flute players.

Figure 6. In order to synchronize the MIDI and sound data, the signals generated by the PC sequencer were used to control the recording media.

After the sound data was recorded, the sound files were pre-processed in off-line mode by Fast Fourier Transform (FFT) analysis with a Hanning window, performed every 10 ms. A previous experiment by Solis, et al (2004) proposed different ways of analyzing students’ data to identify the main characteristics of beginner players; in it, the pupils were asked to play a short melody while a professional flutist gave them useful verbal feedback to improve their performances (off-line mode). In that experiment, four kinds of analyses were performed: the sound quality evaluation function, based on the experimental results of Ando (1970), who simulated flute playing with a mechanical blowing apparatus in order to analyze the acoustical properties of the flute sound (eq. 1); the power spectrum, which has been used to detect patterns in complicated waveforms; the spectral entropy, an additive cost function that has been used as a useful criterion for analyzing and comparing the distribution of a signal (Pickover, 1990); and the Symmetric Dot-Pattern (SDP), which is obtained by converting the sound wave into a collection of dots that form the six-fold symmetry of a snowflake, in order to represent in some way the internal structure of the music (Pickover, 1990).

$$F = \frac{(M - H) + (\mathit{Le} - \mathit{Lo})}{V} \qquad (1)$$

where V: Average Volume [dB]; M-H: Harmonic - Semi-Harmonic Level Average [dB]; Le-Lo: Even-Odd Harmonics Level Average [dB].
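As a minimal sketch of how eq. (1) could be evaluated, the Python function below assumes that the evaluation function is the sum of the two harmonic-level differences (M-H and Le-Lo) divided by the average volume V. The function name, argument names and the numeric values are hypothetical and serve only as an illustration.

```python
def sound_quality(m_h_db: float, le_lo_db: float, volume_db: float) -> float:
    """Evaluate eq. (1), assumed here to be F = ((M - H) + (Le - Lo)) / V.

    m_h_db    -- harmonic minus semi-harmonic level average (M-H) [dB]
    le_lo_db  -- even minus odd harmonics level average (Le-Lo) [dB]
    volume_db -- average volume (V) [dB]
    """
    if volume_db == 0:
        raise ValueError("average volume must be non-zero")
    return (m_h_db + le_lo_db) / volume_db


# Hypothetical values, chosen only to exercise the function.
score = sound_quality(m_h_db=9.0, le_lo_db=5.0, volume_db=70.0)
print(round(score, 3))
```

Under this assumed form, a richer harmonic structure raises the score, while the volume acts only as a normalizing term.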

The verbal feedback provided by the robot was based on the analysis of the harmonic structure (M-H and Le-Lo). The mean levels of these differences were calculated, and the verbal feedback, based on the results of the statistical analysis performed in the previous experiment, was provided according to the step of the evaluation session, as shown in Table I. After the verbal feedback was provided, a graphical window was shown to give more details about how the student could improve the performance (Figure 8).

Figure 7. The student and robot performances are compared.

Figure 8. Further information is provided to the student.

TABLE I. VERBAL FEEDBACK PROVIDED BY THE ROBOT, BASED ON THE HARMONICS STRUCTURE

         Harmonics Structure
Step     M-H [dB]    Le-Lo [dB]    Verbal Feedback
1        -           -             No advice will be given
2        -           -             Correct your body posture
3        < 8         -             Blow constantly
3        8 ~ 10      -             No advice will be given
3        > 10        -             Cover a little more the mouth hole
4        -           < 4           Modify your blowing angle
4        -           4 ~ 7         Correct the flute mounting
4        -           > 7           Correct the head joint mounting
5 ~ 8    -           -             No advice will be given
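The rules of Table I can be read as a simple lookup on the step number and the measured harmonic levels. The sketch below is one hypothetical way to encode them in Python; the function name is invented for illustration, and treating the band limits (8, 10, 4 and 7 dB) as belonging to the middle band is an assumption, since the table does not say how the boundaries are handled.

```python
from typing import Optional

NO_ADVICE = "No advice will be given"


def verbal_feedback(step: int,
                    m_h: Optional[float] = None,
                    le_lo: Optional[float] = None) -> str:
    """Return the Table I advice for a step of the evaluation session.

    m_h and le_lo are the M-H and Le-Lo level averages in dB.
    Boundary values are assigned to the middle band (an assumption).
    """
    if not 1 <= step <= 8:
        raise ValueError("step must be between 1 and 8")
    if step == 2:
        return "Correct your body posture"
    if step == 3:
        if m_h is None:
            raise ValueError("step 3 needs the M-H level")
        if m_h < 8:
            return "Blow constantly"
        if m_h <= 10:
            return NO_ADVICE
        return "Cover a little more the mouth hole"
    if step == 4:
        if le_lo is None:
            raise ValueError("step 4 needs the Le-Lo level")
        if le_lo < 4:
            return "Modify your blowing angle"
        if le_lo <= 7:
            return "Correct the flute mounting"
        return "Correct the head joint mounting"
    # Steps 1 and 5 to 8 never produce advice.
    return NO_ADVICE


print(verbal_feedback(3, m_h=6.5))
print(verbal_feedback(4, le_lo=8.2))
```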

4 Results

In this paper, we compared the results obtained from both groups through a statistical analysis of variance (ANOVA) of the different performance indexes, in order to investigate the added value of using the flutist robot as a novel teaching tool to enhance the performance of beginner players. Moreover, we wanted to verify whether the verbal feedback provided by the robot through the proposed method (Solis, et al 2004) is useful.
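To illustrate the kind of comparison an ANOVA makes, the sketch below computes the F statistic of a one-way ANOVA in plain Python. It is a simplification: the paper does not state the exact ANOVA design, the function name is hypothetical, and the sample values are invented for illustration and are not data from the experiment. Turning F into a p-value additionally requires the F distribution, which is left out here.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups and n > k")
    means = [fmean(g) for g in groups]
    grand_mean = fmean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented scores for two groups, for illustration only.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
print(one_way_anova_f([group_a, group_b]))
```

A large F means that the difference between group means is large relative to the spread inside each group.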

4.1 Sound Quality Evaluation Function

Fig. 9 presents the mean value of the sound quality evaluation function, obtained from all the students at each step. Neither group showed a significant improvement in sound quality (p = 0.781). Although the students showed a significant improvement in the harmonic average level (p = 0.041), the volume level remained almost constant, which limited the increase in the value of the evaluation function. A statistically significant difference was detected between the two groups (p < 0.001), with group B showing the better response.

Figure 9. Mean of the sound quality evaluation function by step.

4.2 Power Spectrum Analysis

The mean power spectra of the students’ performances in both groups, obtained at the last step of the evaluation session, are presented and compared with the robot’s performance in Fig. 10. The frequency distribution of group B almost reached the amplitudes of the frequencies detected in the robot’s sound. When the amplitudes of the fundamental frequencies of the notes are similar to those of their harmonics, the sound produced has better quality (it stimulates our senses). Group A, although it improved from the first to the last step, could not reach the amplitudes found in the robot’s performance.

Figure 10. The power spectra obtained at the last step of the evaluation session from both groups are compared with the robot’s performance. The different peaks correspond to the fundamental frequencies of the etude notes and their harmonics.
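A mean power spectrum of this kind can be sketched as follows: the recording is cut into frames taken every 10 ms, each frame is weighted by a Hanning window, transformed, and the squared magnitudes are averaged over frames. To keep the example dependency-free, a naive discrete Fourier transform stands in for the FFT (it gives the same values, only more slowly). The frame length, sample rate and test tone are assumptions for illustration; only the 10 ms step and the window type come from the text.

```python
import cmath
import math


def hann(n):
    """Symmetric Hanning window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Power of the windowed frame at bins 0 .. n/2 (naive DFT)."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mean_power_spectrum(signal, sample_rate, frame_len, hop_s=0.010):
    """Average the power spectra of frames taken every hop_s seconds."""
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    hop = max(1, round(sample_rate * hop_s))
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [power_spectrum(signal[i:i + frame_len]) for i in starts]
    if not spectra:
        raise ValueError("signal is shorter than one frame")
    return [sum(column) / len(spectra) for column in zip(*spectra)]


# Assumed test tone: 1000 Hz sampled at 8000 Hz, 64-sample frames.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(400)]
spectrum = mean_power_spectrum(tone, sample_rate=8000, frame_len=64)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin * 8000 / 64)  # frequency of the strongest bin in Hz
```

Bin k corresponds to the frequency k * sample_rate / frame_len, so the peaks of a flute note would appear at its fundamental and at multiples of it.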

4.3 Symmetric-Dot Pattern

Fig. 11 shows the resulting averaged image of the Symmetric Dot-Pattern for both groups at the first and last steps of the evaluation session. In both groups, the dark areas of the SDP decreased from the first to the last step, although the patterns are still far from having the same shape as that of the robot’s performance. It was quite difficult for students to improve their breath control, which affected the production of good sound; this has been related to the dark areas in the averaged SDP shape produced by the students compared with that of the robot.

Figure 11. Resulting averaged images of the SDP for group A (top), group B (middle) and the robot (bottom) at the first (left) and last (right) steps.
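For readers unfamiliar with the Symmetric Dot-Pattern, the sketch below follows the commonly described polar mapping: each normalized sample sets the radius of a dot, a later sample (a fixed lag away) sets its angular offset, and the dot is mirrored and repeated on six arms to give the snowflake-like symmetry. The lag, the angular gain and the function name are assumptions for illustration; the paper does not state the parameters that were used.

```python
import math


def symmetric_dot_pattern(samples, lag=1, gain_deg=50.0, arms=6):
    """Map a waveform to (x, y) dots with arms-fold mirror symmetry.

    lag and gain_deg are assumed values, not taken from the paper.
    """
    if lag < 1 or len(samples) <= lag:
        raise ValueError("need lag >= 1 and more samples than the lag")
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("signal is constant")
    # Normalize the waveform to the range 0..1.
    norm = [(s - lo) / (hi - lo) for s in samples]
    dots = []
    for i in range(len(norm) - lag):
        radius = norm[i]
        swing = math.radians(gain_deg) * norm[i + lag]
        for arm in range(arms):
            base = 2 * math.pi * arm / arms
            # One dot on each side of the arm axis gives the mirror symmetry.
            for angle in (base + swing, base - swing):
                dots.append((radius * math.cos(angle),
                             radius * math.sin(angle)))
    return dots


wave = [math.sin(0.3 * i) for i in range(50)]
dots = symmetric_dot_pattern(wave)
print(len(dots))
```

Plotting the returned points as a scatter plot gives the pattern; averaging such images over several performances would correspond to the averaged images discussed above.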

4.4 Spectral Entropy

The mean of the entropy index from the first to the last step of the evaluation session is shown in Fig. 12. Both groups showed a significant improvement in the value of the entropy index (p = 0.027), but no statistically significant difference was found between the groups (p = 0.562) (Table III). As in the power spectrum analysis, the students distributed the amplitudes of the produced frequencies better with practice.

Figure 12. Average entropy index of the students by step.
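One common way to obtain an entropy index from a spectrum is to normalize the power spectrum so that it sums to one and take its Shannon entropy. The sketch below assumes this definition, which may differ in detail from the index used in the experiment; the function name and the example spectra are hypothetical.

```python
import math


def spectral_entropy(power):
    """Shannon entropy (in bits) of a normalized power spectrum."""
    if any(p < 0 for p in power):
        raise ValueError("power values must be non-negative")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    entropy = 0.0
    for p in power:
        if p > 0:
            share = p / total
            entropy -= share * math.log2(share)
    return entropy


# Energy in a single bin versus energy spread evenly over four bins.
print(spectral_entropy([5.0, 0.0, 0.0, 0.0]))
print(spectral_entropy([1.0, 1.0, 1.0, 1.0]))
```

With this definition the index grows as the energy is spread more evenly across frequencies, which matches the reading that students distributed the amplitudes of the produced frequencies better with practice.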

4.5 Verbal Feedback

Each time the robot gave verbal feedback to a student to correct the performance, the professional flutist determined whether or not the advice could help improve the pupil’s performance (Figure 13). The robot provided useful advice 25% of the time and partially correct advice 55% of the time, when further considerations had to be given to the student. The robot was thus able to help the pupils, although it failed completely 20% of the time.

Figure 13. Rate of correct and incorrect advice provided by the flutist robot to students for improving their performances.

5 Discussion and Conclusions

In this experiment, the anthropomorphic flutist robot was used to improve the sound quality of beginner flute players. Comparing both groups, it was demonstrated that students’ executions were better when the robot assisted the human teacher, as their performances became richer in harmonics in the frequency distribution (power spectrum and entropy). The Symmetric Dot-Pattern offered a qualitative way of evaluating the player’s performance, although in this case it was quite hard to find differences between the groups.

Although the sound quality evaluation function detected differences between the groups, it could not correctly evaluate the players’ improvement, because the pupils had difficulty increasing their volume during the performance, which affected the result of the evaluation function. A better way of evaluating the sound quality of beginner players is therefore needed.

The verbal feedback provided by the robot, based on the analysis of the harmonic structure, was good, although more special cases need to be considered. The robot helps the teacher transfer the basic skills needed to improve the pupils’ sound quality, while the teacher encourages them to reach the sound quality produced by the robot.

6 Acknowledgments

Part of this research was done at the Humanoid Robotics Institute (HRI), Waseda University. We appreciate the help of the WABOT-HOUSE staff in Gifu Prefecture in carrying out this experiment. The authors would like to thank SOLID WORKS Japan, CHUKOH Chemical Ind. Ltd. and MURAMATSU Inc., and Mr. Kunimitsu Wakamatsu for his valuable help and advice in performing the flute teaching experiment with beginner players.

References

Ando, Y. (1970). Drive conditions of the flute and their influence upon harmonic structure of generated tone. Journal of the Acoustical Society of Japan (in Japanese), 297-305.

Asada, M., MacDorman, K.F., Ishiguro, H., Kuniyoshi, Y. (2000). Cognitive developmental robotics as a new paradigm for the design of humanoid robots. In Proc. of the 1st IEEE-RAS International Conference on Humanoid Robotics.

Bilodeau, E.A., Bilodeau, I.M. (1961). Motor Skills Learning. Annual Review of Psychology, 12, 243-280.

Bruce, Moore (1996). The evolution of imitative learning. Social Learning in Animals: The Roots of Culture, 245, 265.

Gabrielsson, A. (1999). The Performance of Music. In The Psychology of Music. San Diego: Academic Press, 501-602.

Meltzoff, A. (1996). Towards a developmental cognitive science. The implication of cross-modal matching and imitation for the development of representation and memory in infancy. In The development and neural basis of higher cognitive functions. Annals of the N.Y. Acad. of Sci., 608.

Meltzoff, A., Prinz, W. (2002). The Imitative Mind: Development, Evolution, and Brain Bases. Cambridge GB: Cambridge University Press, 247-266.

Okuma, I., Chida, K., Isoda, S., Saisu, Y., Wakamatsu, K., Takanishi, A. (2003). Time-scale performance control of new anthropomorphic flutist robot. In Proc. of the 21st Annual Conference of the Robotics Society of Japan (in Japanese).

Piaget, J. (1951). Play, dreams, and imitation in childhood. English translation from the French original version by C. Gattegno and F.M. Hodgson, W. Heinemann Ltd.

Pickover, C.A. (1990). Computers, pattern, chaos and beauty. New York: St. Martin’s Press, 21-35.

Rizzolatti, G., Fadiga, L., Gallese, V., Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 131-141.

Sage, G.H. (1984). Motor Learning and Control - A Neuropsychological Approach. Wm. C. Brown pub., Dubuque, IA.

Schaal, S. (1999). Is imitation learning the route to humanoid robots? Trends in Cognitive Science.

Solis, J., Bergamasco, M., Isoda, S., Chida, K., Saisu, Y., Takanishi, A. (2004). Evaluating sound quality of beginner players by an anthropomorphic flutist robot (WF-4). In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems.

Takanishi, A., Maeda, M. (1998). Development of Anthropomorphic Flutist Robot WF-3RIV. In Proc. of the 1998 International Computer Music Conference, 328-331.
