Stop Looking and Start Listening

stop looking for music and start listening to it: auditory display in music collection interfaces. Becky Stewart, [email protected], Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary, University of London


Page 1: Stop Looking and Start Listening

stop looking for music and start listening to it:

auditory display in music collection interfaces

Becky Stewart
[email protected]

Centre for Digital Music
School of Electronic Engineering and Computer Science
Queen Mary, University of London

Page 8: Stop Looking and Start Listening

In this talk we will ...

• Review how we search and browse for information

• Look at current commercially-available interfaces

• Discuss why listening should be integrated

• Look at solutions presented by academia

• Review recent research from C4DM

• Wrap up and conclude

Page 9: Stop Looking and Start Listening

how do we find information?

Page 10: Stop Looking and Start Listening

let’s start with something easy...

Page 11: Stop Looking and Start Listening
Page 12: Stop Looking and Start Listening

Familiar interface

Summarizes information

Users seldom scroll down, almost never go to next page

Page 13: Stop Looking and Start Listening

how about better browsing?

Page 14: Stop Looking and Start Listening
Page 15: Stop Looking and Start Listening

Easy to traverse information

Relationships between items can be inferred

Encourages browsing

Page 16: Stop Looking and Start Listening

what about something other than text?

Page 17: Stop Looking and Start Listening
Page 18: Stop Looking and Start Listening

Users seldom go on to next page of results

Broad overview, but can zoom in on specific result

All other information beyond image is suppressed, but recallable

Page 19: Stop Looking and Start Listening

what about time-based media?

Page 20: Stop Looking and Start Listening
Page 21: Stop Looking and Start Listening

Less helpful than the image search results

Difficult to navigate results

Have to go to web page to view any portion of the video

Restricting results to music or audio only is not an option

Page 22: Stop Looking and Start Listening

so what about music interfaces? how do we find music?

Page 23: Stop Looking and Start Listening
Page 24: Stop Looking and Start Listening
Page 25: Stop Looking and Start Listening
Page 26: Stop Looking and Start Listening
Page 27: Stop Looking and Start Listening
Page 28: Stop Looking and Start Listening
Page 29: Stop Looking and Start Listening
Page 32: Stop Looking and Start Listening

commercial interfaces use a combination of text fields and seed songs/artists

results are lists of text, perhaps enhanced with images, general knowledge and hyperlinks

songs are played back one at a time and only if explicitly requested by user

Page 33: Stop Looking and Start Listening

why should audio be integrated?

Page 34: Stop Looking and Start Listening

Bjork / Björk

• textual metadata can be malformed or wrong

• an empty text field is less than inspiring

• text can be a barrier to discovery

• previous knowledge is needed

• difficult to move into the long tail; users will stay in the head

Ò. Celma and P. Cano. From hits to niches? Or how popular artists can bias music recommendation and discovery. In Proc. of 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, Nevada, USA, August 2008.

Page 35: Stop Looking and Start Listening

listening makes a difference

• users make different judgements about playlists when metadata is missing

L. Barrington, R. Oda, and G. Lanckriet. Smarter than Genius: human evaluation of music recommender systems. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 357–362, Kobe, Japan, October 2009.

Page 36: Stop Looking and Start Listening

listening is faster

• when search results are compiled into a single audio stream instead of a list of results, users find what they are looking for more quickly

S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.

• listeners can find music without a GUI faster than with an iPod, and be just as happy with their selection

A. Andric, P.-L. Xech, and A. Fantasia. Music mood wheel: Improving browsing experience on digital content through an audio interface. In Proc. of 2nd Int. Conf. on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS’06), 2006.

Page 37: Stop Looking and Start Listening

listening is effective

• users can understand and navigate a collection of music as effectively without a GUI as with one

• they are slower, but don’t make significantly more mistakes

S. Pauws, D. Bouwhuis, and B. Eggen. Programming and enjoying music with your eyes closed. In CHI ’00: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, pages 376–383. ACM, 2000. doi: 10.1145/332040.332460.

Page 38: Stop Looking and Start Listening

how can interfaces use more listening?

Page 39: Stop Looking and Start Listening

not by being VoiceOver

Page 41: Stop Looking and Start Listening

maps

Page 42: Stop Looking and Start Listening

mused

• passive listening

G. Coleman. Mused: navigating the personal sample library. In Proc. of ICMC: Int. Computer Music Conf., Copenhagen, Denmark, August 2007.

• youtube http://www.youtube.com/watch?v=DuuESpj558Y&feature=related

Page 44: Stop Looking and Start Listening

sonic browser

• hugely influential interface

• introduced aurally exploring a map of sounds

• direct sonification

M. Fernström and E. Brazil. Sonic browsing: an auditory tool for multimedia asset management. In Proc. of ICAD ’01: Int. Conf. on Auditory Display, pages 132–135, Espoo, Finland, August 2001.

M. Fernström and C. McNamara. After direct manipulation - direct sonification. In Proc. of ICAD ’98: Int. Conf. on Auditory Display, 1998.

Page 45: Stop Looking and Start Listening

soundtorch

• 3D version of sonic browser

S. Heise, M. Hlatky, and J. Loviscach. SoundTorch: Quick browsing in large audio collections. In Proc. of AES 125th Conv., San Francisco, CA, October 2008.

S. Heise, M. Hlatky, and J. Loviscach. Aurally and visually enhanced audio search with SoundTorch. In CHI ’09: Proc. of the 27th Int. Conf. Extended Abstracts on Human Factors in Computing Systems, pages 3241–3246, Boston, MA, USA, April 2009. doi: 10.1145/1520340.1520465.

• youtube http://www.youtube.com/watch?v=eiwj7Td7Pec

Page 47: Stop Looking and Start Listening

neptune

• based on Islands of Music

P. Knees, M. Schedl, T. Pohle, and G. Widmer. An innovative three-dimensional user interface for exploring music collections enriched with meta-information from the web. In MULTIMEDIA ’06: Proc. of the 14th annual ACM int. conf. on Multimedia, pages 17–24, Santa Barbara, CA, USA, 2006. doi: 10.1145/1180639.1180652.

Page 49: Stop Looking and Start Listening

sonixplorer

• extension of neptune

• landscape can be marked up by user

• introduced focus

• youtube http://www.youtube.com/watch?v=mIfWg2Eex74

D. Lübbers. Sonixplorer: Combining visualization and auralization for content-based exploration of music collections. In Proc. of ISMIR’05: 6th Int. Society for Music Information Retrieval Conf., pages 590–593, London, UK, 2005.

D. Lübbers and M. Jarke. Adaptive multimodal exploration of music collections. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 195–200, Kobe, Japan, 2009.

Page 53: Stop Looking and Start Listening

what’s the problem?

• too much information thrown at the user

• does not translate well to mobile devices

• rendering spatial audio

• reliance on screens

Page 54: Stop Looking and Start Listening

my research

Page 55: Stop Looking and Start Listening

virtual ambisonics

Page 56: Stop Looking and Start Listening
Page 57: Stop Looking and Start Listening

Number of convolutions increases with each sound source
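
A minimal sketch of why the count grows, assuming direct binaural rendering filters every source with its own left-ear and right-ear impulse response (an HRIR pair). The function names and the toy filter values are hypothetical and only illustrate the bookkeeping; this is not the implementation used in the talk.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_direct_binaural(sources):
    """Mix sources to two ears.

    Each source is (signal, hrir_left, hrir_right); the HRIRs are assumed
    to be chosen per source according to its direction.
    Returns (left, right, number_of_convolutions).
    """
    length = max(len(sig) + max(len(hl), len(hr)) - 1 for sig, hl, hr in sources)
    left = [0.0] * length
    right = [0.0] * length
    convolutions = 0
    for sig, hl, hr in sources:
        for ear, hrir in ((left, hl), (right, hr)):
            for n, value in enumerate(convolve(sig, hrir)):
                ear[n] += value
            convolutions += 1  # one convolution per ear, per source
    return left, right, convolutions


# Three toy sources: two convolutions each, six in total.
toy = [([1.0, 0.0], [1.0, 0.5], [0.5, 0.25])] * 3
left, right, count = render_direct_binaural(toy)
print(count)
```

Under this assumption the cost is two convolutions per source, so ten concurrent songs would need twenty.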

Page 58: Stop Looking and Start Listening
Page 60: Stop Looking and Start Listening

Number of convolutions independent of the number of sound sources

Can do more efficient things in B-format domain
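
One example of an operation that is cheap in the B-format domain is rotating the whole sound field, for instance to follow the listener's head. The sketch below assumes horizontal-only first-order B-format (channels W, X, Y) with the traditional 1/sqrt(2) weighting on W; the slides do not say which operations or conventions were used, so the function names and conventions here are illustrative assumptions.

```python
import math


def encode_bformat(sample, azimuth):
    """Encode one mono sample at an azimuth (radians, anticlockwise from front)."""
    return (sample / math.sqrt(2.0),
            sample * math.cos(azimuth),
            sample * math.sin(azimuth))


def mix_sources(sources):
    """Sum any number of (sample, azimuth) sources into a single (W, X, Y) frame."""
    w = x = y = 0.0
    for sample, azimuth in sources:
        sw, sx, sy = encode_bformat(sample, azimuth)
        w, x, y = w + sw, x + sx, y + sy
    return w, x, y


def rotate_bformat(frame, angle):
    """Rotate the whole sound field anticlockwise by angle (radians)."""
    w, x, y = frame
    c, s = math.cos(angle), math.sin(angle)
    return w, c * x - s * y, s * x + c * y


# Rotating the mixed field matches re-encoding every source at a new azimuth.
sources = [(1.0, 0.0), (0.5, math.pi / 3)]
rotated = rotate_bformat(mix_sources(sources), math.pi / 2)
reencoded = mix_sources([(s, a + math.pi / 2) for s, a in sources])
print(rotated, reencoded)
```

The rotation touches only the three mixed channels, so its cost does not depend on how many sources were encoded.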

Page 61: Stop Looking and Start Listening
Page 66: Stop Looking and Start Listening

Can still do more efficient things in B-format domain

Only 3 convolutions and only need to store 3 filters

When compared to direct binaural, there are measurable differences.

But listeners can’t tell the difference.

So use the more efficient implementation.
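
The slides do not spell out where the count of three comes from. One way it can arise, sketched here as an assumption, is with horizontal first-order B-format (W, X, Y) and a left-right symmetric head and virtual loudspeaker layout: the right-ear filters then equal the left-ear filters for W and X and are negated for Y, so both ears share the same three convolutions. The filter values below are toy numbers, not measured responses.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def decode_binaural(w, x, y, filt_w, filt_x, filt_y):
    """Decode B-format channels to two ears with three convolutions.

    Assumes the three channels have equal length and the three filters
    have equal length, and that the head model is left-right symmetric.
    """
    cw = convolve(w, filt_w)
    cx = convolve(x, filt_x)
    cy = convolve(y, filt_y)
    left = [a + b + c for a, b, c in zip(cw, cx, cy)]
    right = [a + b - c for a, b, c in zip(cw, cx, cy)]  # Y flips sign for the right ear
    return left, right


# A source straight ahead has no Y component, so both ears receive the same signal.
left, right = decode_binaural([0.7, 0.0], [1.0, 0.0], [0.0, 0.0],
                              [1.0, 0.5], [0.5, 0.25], [0.3, 0.1])
print(left == right)
```

Because the sources are already mixed into W, X and Y before this step, the three convolutions are needed once per output block regardless of how many songs are playing.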

Page 67: Stop Looking and Start Listening

build an interface which uses virtual ambisonics

task: browse a collection to select a single song

Page 68: Stop Looking and Start Listening

map paradigm without any visuals
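
As an illustration of the map paradigm, the sketch below turns a listener position on a 2D map into a direction and a gain for each nearby song. The slides do not describe the actual mapping; the linear gain falloff, the audible radius and all names are hypothetical assumptions.

```python
import math


def audible_songs(listener, heading, songs, radius):
    """Return [(name, azimuth, gain)] for songs within radius of the listener.

    listener: (x, y) position on the map.
    heading: direction the listener faces, radians anticlockwise from the +x axis.
    songs: dict mapping song name to its (x, y) map position.
    azimuth: radians relative to the heading, positive to the listener's left.
    gain: 1.0 at the listener's position, falling linearly to 0.0 at the radius.
    """
    result = []
    for name, (sx, sy) in songs.items():
        dx, dy = sx - listener[0], sy - listener[1]
        distance = math.hypot(dx, dy)
        if distance > radius:
            continue  # too far away to be heard
        azimuth = math.atan2(dy, dx) - heading
        azimuth = math.atan2(math.sin(azimuth), math.cos(azimuth))  # wrap to [-pi, pi]
        gain = 1.0 - distance / radius
        result.append((name, azimuth, gain))
    return result


collection = {"a": (1.0, 0.0), "b": (0.0, 2.0), "c": (5.0, 5.0)}
for name, azimuth, gain in audible_songs((0.0, 0.0), 0.0, collection, 4.0):
    print(name, round(math.degrees(azimuth)), gain)
```

Each returned azimuth could then be passed to a spatial renderer, and moving the listener changes which songs are heard and from where.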

Page 69: Stop Looking and Start Listening
Page 70: Stop Looking and Start Listening
Page 71: Stop Looking and Start Listening
Page 72: Stop Looking and Start Listening

evaluation

• user study with 12 users

• most liked the idea

• but the implementation needed improvement

• confusion as to how to navigate through the space

• some people were averse to concurrent playback

Page 73: Stop Looking and Start Listening

add visuals and improve physical controller, but keep dependence on audio

Page 74: Stop Looking and Start Listening

cyclic playback

• inspired by

S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.

• hear everything within 20 seconds

• user can control concurrent playback
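
A rough sketch of how such a cycle could be scheduled, assuming every song gets an equally long excerpt, start times are spread evenly over the cycle, and an excerpt running past the end of the cycle wraps around to its beginning. This is neither Ali and Aarabi's algorithm nor the implementation shown in the talk; the function and parameter names are hypothetical.

```python
def cyclic_schedule(n_songs, cycle_seconds=20.0, concurrent=1):
    """Return one (start_time, duration) pair per song, in seconds.

    Every song sounds once per cycle, and at any moment exactly
    `concurrent` songs are playing (counting wrap-around).
    """
    if not 1 <= concurrent <= n_songs:
        raise ValueError("concurrent must be between 1 and n_songs")
    step = cycle_seconds / n_songs   # gap between consecutive start times
    duration = step * concurrent     # longer excerpts overlap their neighbours
    return [(i * step, duration) for i in range(n_songs)]


print(cyclic_schedule(4))                # one song at a time
print(cyclic_schedule(4, concurrent=2))  # two songs overlapping
```

Raising `concurrent` lengthens each excerpt without lengthening the cycle, which is one way a user-controlled concurrency setting could trade clarity for more listening time per song.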

Page 75: Stop Looking and Start Listening
Page 76: Stop Looking and Start Listening
Page 77: Stop Looking and Start Listening
Page 78: Stop Looking and Start Listening

evaluation

• no formal evaluation, but demonstrated to a variety of individuals and small groups (approximately 40 people)

• improved interaction with physical controller

• perhaps too many controls, much steeper learning curve

• much room for improvement

Page 79: Stop Looking and Start Listening

art installation

Page 80: Stop Looking and Start Listening

Michela Magas

Page 81: Stop Looking and Start Listening

public installation

• shown in Information Aesthetics at SIGGRAPH 2009

• approximately 1000 people passed through the exhibit

• children, students, artists, designers, technologists

• quick to bring smiles - it was fun, people even brought back friends to experience it

• easy to learn how to use

Page 86: Stop Looking and Start Listening

conclusions drawn from research

• context is key when shaping interaction

• users will approach an interface with previous knowledge, need to build on and incorporate that knowledge

• audio can’t be subtle

• can’t rely on complex information being universally conveyed through audio alone

• can (and should) be fun

• maps aren’t great, there must be something better

Page 87: Stop Looking and Start Listening

why haven’t these ideas caught on?

• solutions use non-scalable algorithms that are impractical for commercial applications (a problem not limited to interfaces within MIR)

• portability across devices

• many of them just don’t work that well

• most have very simple acoustic models

• too much information thrown at user, or information is not organized in an accessible way

Page 88: Stop Looking and Start Listening

flickr:matsber

flickr:jlcwalker

Page 89: Stop Looking and Start Listening

one more time

Page 92: Stop Looking and Start Listening

search engines are tuned for the type of information being sought

but they break down when presenting time-based media

in our case, music

Page 95: Stop Looking and Start Listening

direct manipulation to direct sonification

listen to the music first, then get more information if so desired

this is done by using auditory displays

Page 99: Stop Looking and Start Listening

a lot of focus on map-based paradigms, but it may be time to move on

concurrent presentation of audio is a good idea

but spatialization should not be used to represent complex relationships

music is complex

Page 103: Stop Looking and Start Listening

incorporating listening improves music search and discovery

so it should continue

we haven’t figured out how to do it perfectly

need to turn fun toys into useful tools

Page 104: Stop Looking and Start Listening

thank you

these slides can be found at http://www.slideshare.net/beckystewart/presentations