Page 1: Design and evaluation of a probabilistic music projection ... › ~rod › publications › VadBolMur15.pdf · A detailed overview of music visualisation approaches and the MusicGalaxy

DESIGN AND EVALUATION OF A PROBABILISTIC MUSIC PROJECTION INTERFACE

Beatrix Vad,¹ Daniel Boland,¹ John Williamson,¹ Roderick Murray-Smith,¹ Peter Berg Steffensen²

¹School of Computing Science, University of Glasgow, Glasgow, United Kingdom
²Syntonetic A/S, Copenhagen, Denmark

[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT

We describe the design and evaluation of a probabilistic interface for music exploration and casual playlist generation. Predicted subjective features, such as mood and genre, inferred from low-level audio features create a 34-dimensional feature space. We use a nonlinear dimensionality reduction algorithm to create 2D music maps of tracks, and augment these with visualisations of probabilistic mappings of selected features and their uncertainty.

We evaluated the system in a longitudinal trial in users’ homes over several weeks. Users said they had fun with the interface and liked the casual nature of the playlist generation. Users preferred to generate playlists from a local neighbourhood of the map, rather than from a trajectory, using neighbourhood selection more than three times more often than path selection. Probabilistic highlighting of subjective features led to more focused exploration in mouse activity logs, and 6 of 8 users said they preferred the probabilistic highlighting mode.

1. INTRODUCTION

To perform information retrieval on music, we typically rely on either metadata or on ‘intelligent’ signal processing of the content. These approaches create huge feature vectors, and as the feature space expands it becomes harder to interact with. A projection-based interface can provide an overview of the collection as a whole, while showing detailed information about individual items in context. Our aim is to build an interactive music exploration tool which offers interaction at a range of levels of engagement, and which can foster directed exploration of music spaces, casual selection and serendipitous playback. It should provide a consistent, understandable and salient layout of music in which users can learn music locations, select music and generate playlists. It should promote (re-)discovery of music and accommodate widely varying collections.

© Beatrix Vad,¹ Daniel Boland,¹ John Williamson,¹ Roderick Murray-Smith,¹ Peter Berg Steffensen². Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Beatrix Vad,¹ Daniel Boland,¹ John Williamson,¹ Roderick Murray-Smith,¹ Peter Berg Steffensen². “Design and evaluation of a probabilistic music projection interface”, 16th International Society for Music Information Retrieval Conference, 2015.

To address these goals we built and evaluated a system to interact with 2D music maps, based on dimensionally-reduced inferred subjective aspects such as mood and genre. This is achieved using a flexible pipeline of acoustic feature extraction, nonlinear dimensionality reduction and probabilistic feature mapping. The features are generated by the commercial Moodagent Profiling Service 1 for each song, computed automatically from low-level acoustic features by a machine-learning system which learns feature ratings from a small training set of human subjective classifications. These inferred features are uncertain. Subgenres of e.g. electronic music are hard for expert humans to distinguish, and even more so for an algorithm using low-level features [24]. This motivates representing the uncertainty of features in the interaction.

It is not straightforward to evaluate systems based on interacting with such high-dimensional data. This is not a pure visualisation task. Promoting understanding is secondary to offering a compelling user experience, where the user has a sense of control. How do we evaluate projections, especially if the user’s success criterion is just to play something ‘good enough’ with minimal effort? We evaluated our system to answer:

1. Can a single interface enable casual, implicit and focused interaction for music retrieval?

2. Which interface features better enable people to navigate and explore large music collections?

3. Can users create viable mental models of a high-dimensional music space via a 2D map?

2. BACKGROUND

2.1 Arranging music collections on fixed dimensions

A music retrieval interface based on a 2D scatter plot, with one axis ranging from slow to fast and the other from dark to bright on the timbre dimension, is presented in [10]. The authors show this visualisation reduces the time to select suitable tracks compared to a traditional list view. [11] presents a 2D display of music based on the established arousal-valence (AV) diagram of emotions [20], with AV judgements obtained from user ratings. An online exploration tool, musicovery.com [6], enables users to select a mood

1 http://www.moodagent.com/


Figure 1. (a) An audio collection, described by a large set of features automatically extracted from the content. (b) Visualisation of this high-dimensional dataset in two dimensions using dimensionality reduction. (c) Probabilistic models showing the distribution of specific features in the low-dimensional space. (d) Combining dimensionality reduction with these models to build an interactive exploration interface.

in the AV space and starts a radio stream based on the input. These use two predefined dimensions that are easy to interpret; however, they do not allow a broader interpretation of musical characteristics based on richer feature sets. [13] finds that music listening is often based upon mood. The investigation of musical preferences in [9] shows most private collections consist of a wide range of styles and approaches to categorisation.

2.2 Music visualisations via dimensionality reduction

“Islands of Music” [17] visualises music collections using a landscape metaphor. They use rhythmic patterns in a set of frequency bands to create a Self-Organizing Map (SOM), a map of music for users to explore. Similarly, [16] introduce the SOM-based PlaySOM and PocketSOM interfaces. Features are again based on rhythm and 2D embedding. An interesting visualisation feature is the use of “gradient fields” to illustrate the distribution of features over the map. Playlist generation is enabled with a rectangular marquee and path selection. Elevations are based on the density of songs in the locality, so clustered songs form islands with mountains. A collection of 359 pieces was used to evaluate the system and song similarities were subjectively evaluated. An immersive 3D environment for music exploration, again using a SOM, is described in [14]. An addition to previous approaches is an integrated feedback loop that allows users to reposition songs, alter the terrain and position landmarks. The users’ sense of similarity is modelled and the map gradually adapted. Both the SOM landscape and acoustic clues improved search times per song.

SongWords [2] is an interactive tabletop application to browse music based on lyrics. It combines a SOM with a zoomable user interface. The app is evaluated in a user study with personal music collections of ca. 1000 items. One reported issue was that only the item positions described the map’s distribution of characteristics. Users had to infer the structure of the space from individual items. “Rush 2” explores interaction styles from manual to automatic [1]. They use similarity measures to create playlists automatically by selecting a seed song.

A detailed overview of music visualisation approaches and the MusicGalaxy system is provided in [23]. This work introduces adaptive methods for music visualisation, allowing users to adjust weightings in the projection. It also explores the use of a lens so that users could zoom into parts of the music space. Most notably, it receives a significant amount of user evaluation. The lack of such evaluations in the field of MIR has been noted in [21], which calls for a user-centred approach to MIR. The work in this paper thus includes an ‘in-the-wild’ longitudinal evaluation, bringing HCI methodology to bear in MIR.

2.3 Interaction with music visualisations

Path drawings on a music visualisation, enabling high-level control over songs and the progression of created playlists, can be found in [26]. Casual interaction has recently started receiving attention from the HCI community [18], outlining how interactions can occur at varying levels of user engagement. A radio-like interface that adapts to user engagement is introduced in [3,4]. It allows users to interact with a stream of music at varying levels of control, from casual mood-setting to engaged music interaction. Music visualisations can also span engagement – from broad selections in an overview to specific zoomed-in selections.

3. PROBABILISTIC MUSIC INTERFACE

As shown in Figure 1, the interface builds on features derived from raw acoustic characteristics and transforms these into a mood-based visualisation, where nearby songs will have a similar subjective “feeling”. Our feature extraction service provides over thirty predicted subjective features for each song, including its mood, genre, style, vocals, instrument, beat, tempo, energy and other attributes. The features associated with moods chosen for later highlighting in the visualisation include Happy, Angry, Sad and Tender. These were identified as relevant moods from social tags in [12]. Erotic, Fear and Tempo (not strictly a mood) were also included. The features were investigated in [5].

Given our large number of features, we need dimensionality reduction to compress the data from |F| dimensions to |D| dimensions. The goal of this step is to preserve subjective similarities between songs and maintain coherent


structure in the dataset. For interaction, we reduce down to 2D. We tried our system with a number of dimensionality reduction techniques, including PCA and SOM. We chose the t-distributed stochastic neighbour embedding (t-SNE, [25]) model for non-linear dimensionality reduction, to generate a map that combines a global overview of clusters of similar songs while locally minimising false positives.
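The reduction step above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the 34-dimensional feature matrix here is random stand-in data, and the scikit-learn hyperparameters are assumptions.

```python
# Sketch: reduce a 34-D per-song feature matrix to 2-D map coordinates
# with t-SNE, as in the paper's pipeline. Data and settings are illustrative.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.random((200, 34))  # 200 songs x 34 predicted subjective features

tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
coords = tsne.fit_transform(features)  # (200, 2) positions on the music map
print(coords.shape)
```

Nearby points in `coords` correspond to songs with similar feature profiles, which is the property the interface relies on for neighbourhood selection.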

To provide additional information about the composition of the low-dimensional space, we developed probabilistic models to visualise high-dimensional features in the low-dimensional space. This probabilistic back-projection gives users insight into the structure of the layout, but also into the uncertainties associated with the classifications. On top of the pipeline (Figure 1), we built an efficient, scalable web-based UI which can handle music collections upwards of 20,000 songs. The tracks can be seen as random variables drawn from a probability distribution with respect to a specific feature. The distribution parameters can be estimated and used for prediction, allowing smoothed interpolation of features as shown in Figure 2. We used Gaussian Process (GP) priors [19], a powerful nonparametric Bayesian regression method. We applied a squared exponential covariance function on the 2D (x, y) coordinates, predicting the mood features P_f over the map. The GP can also infer the uncertainty σ_f² of the predicted feature relevance for each point [22].

Figure 2. Gaussian Process predictions of features. Orange denotes the “happy” feature distribution and blue denotes “tender”. The greyscale surface shows the uncertainty; lighter is more certain and darker is less certain.
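A minimal numpy sketch of such a GP back-projection follows. It uses the squared exponential covariance named in the text, but the hyperparameters (length scale, noise level), the song positions and the synthetic "happy" scores are illustrative assumptions, not the authors' implementation.

```python
# Sketch: GP regression over 2-D map coordinates, predicting a mood
# feature mean P_f and its variance sigma_f^2 over a grid of map cells.
import numpy as np

def sq_exp(A, B, length=1.0):
    # squared exponential covariance: k(a, b) = exp(-|a - b|^2 / (2 l^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_predict(X, y, Xstar, length=1.0, noise=0.1):
    K = sq_exp(X, X, length) + noise**2 * np.eye(len(X))
    Ks = sq_exp(Xstar, X, length)
    Kss = sq_exp(Xstar, Xstar, length)
    mean = Ks @ np.linalg.solve(K, y)                 # predicted P_f
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    var = np.clip(np.diag(cov), 0.0, None)            # predictive sigma_f^2
    return mean, var

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (50, 2))             # song positions on the map
y = np.exp(-((X - 0.3) ** 2).sum(1))        # synthetic "happy" scores
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 20),
                            np.linspace(-1, 1, 20)), -1).reshape(-1, 2)
mean, var = gp_predict(X, y, grid)
print(mean.shape, var.shape)
```

The variance grows in empty regions of the map, which is exactly the uncertainty the interface renders as the darker surface in Figure 2.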

3.1 Interface design

To present the inferred subjective results to the users, the GP mean and standard deviation are evaluated over a 200×200 grid covering the 2D music space. A continuously coloured background highlighting is created, where areas of high feature scores stand out above areas with higher uncertainty or lower scores. To highlight areas with high prediction scores and low uncertainty, a nonlinear transform is used: α_f = P_f² − σ_f², for each mood feature f with standard deviation σ_f and predicted feature value P_f. The clusters in the music space can be emphasised as in the upper part of Figure 3 by colouring areas with the colour associated with the highest score, i.e. argmax_f(α_f) – a winner-takes-all view. This not only divides the space into discrete mood areas but also shows nuanced gradients of mood influences within those areas. However, once a user starts to dynamically explore a specific area of the space, the system transitions to implicit background highlighting, such that the background distribution of the mood with the highest value near the cursor is blended in dynamically, as in the lower plots of Figure 3, giving the user more subtle insights into the nature of the space.
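The highlighting transform is simple enough to state directly in code. The sketch below applies α_f = P_f² − σ_f² per mood and takes the winner per grid cell; the mood list and the random GP outputs are stand-ins for illustration.

```python
# Sketch: winner-takes-all mood colouring. alpha_f = P_f^2 - sigma_f^2
# boosts cells with high predicted value and low uncertainty; argmax over
# the mood axis picks the dominant mood per cell of the 200x200 grid.
import numpy as np

moods = ["happy", "angry", "sad", "tender", "erotic", "fear"]
rng = np.random.default_rng(2)
P = rng.random((6, 200, 200))            # GP mean per mood over the grid
sigma = 0.3 * rng.random((6, 200, 200))  # GP std. dev. per mood

alpha = P**2 - sigma**2                  # per-mood highlight strength
winner = alpha.argmax(axis=0)            # index of dominant mood per cell
print(winner.shape)
```

An uncertain but high-scoring mood can thus lose a cell to a slightly lower-scoring but confident one, which is the intended effect of subtracting the variance.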

Tracks are represented as circles in a scatter plot, where size can convey additional information, e.g. the popularity of a song, without disturbing the shape paradigm. To support visual clustering, colour highlights the highest-scoring mood feature of each song, and transparency conveys the feature score. However, the number of diverging, bright colours available for categorisation is limited. Murch [15] states that a “refocus” is needed to perceive different pure colours, so matched pairs of bright and desaturated colours are chosen for the highly correlated mood pairs tender/sad, happy/erotic and angry/fear.

Figure 3. Top: The interactive web interface in its ‘winner takes all’ overview colouring. A path playlist selection as well as a neighbourhood selection is visible in the mood space. Bottom: Background highlighting for the features angry, tender and erotic. Compared with the overview colouring, the subtle fluctuations of features are apparent.

3.2 Interaction with the interface

As the visualisation can handle very large numbers of items, a semantic zoom was integrated, where the size of each element is fixed. This coalesces items on zoom out and disentangles items on zoom in.

Further insight into the nature of the space is given by the adaptive area exploration tool, which visualises the local item density. In contrast to previous work, we do not use a fixed selection area but one based on the k nearest neighbours to the mouse cursor. Points are highlighted as the mouse is moved, creating a dynamically expanding and collapsing highlight, responding to the structure of the


space. The k-NN visualisation adapts to zoom level; when zoomed out, k is large; when zoomed in, we support focused interaction with a smaller k.

Focus and context: To make the music map exploration more concrete, a hover box is displayed with information about the nearest item to the cursor, including artist name, title, and album art (see Figure 3). It shows a mini bar chart of the song’s mood features. As this is fixed on screen, users can explore and observe changes in the mood chart, giving them insight into the high-dimensional space.

3.3 Playlist generation

Neighbourhood selection is a quick and casual interaction metaphor for creating a playlist from the k nearest neighbours. Songs are ranked according to their distance from the query point. This enables the directed selection of clusters in the space, even if the cluster is asymmetric. By adjusting the zoom level (and thus k), k-NN selection can include all in-cluster items while omitting items separated from the perceived cluster. This feature could be enhanced by adding an ε-environment similar to the density-based clustering algorithm DBSCAN [7]. Fast rendering and nearest-neighbour search were implemented using quadtree spatial indexing [8].
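A sketch of zoom-adaptive neighbourhood selection follows. The paper uses a quadtree index; a brute-force numpy distance query stands in here, and the `k_base` value and the mapping from zoom level to k are illustrative assumptions.

```python
# Sketch: playlist = k nearest songs to the cursor, ranked by distance,
# with k adapted to zoom (zoomed out -> larger k, zoomed in -> smaller k).
import numpy as np

def neighbourhood_playlist(coords, cursor, zoom, k_base=10):
    k = max(1, int(k_base / zoom))            # illustrative zoom-to-k rule
    d = np.linalg.norm(coords - cursor, axis=1)
    return np.argsort(d)[:k]                  # song indices, nearest first

rng = np.random.default_rng(3)
coords = rng.uniform(0, 1, (500, 2))          # 2-D map positions of 500 songs
playlist = neighbourhood_playlist(coords, np.array([0.5, 0.5]), zoom=0.5)
print(len(playlist))  # -> 20: zoomed out to 0.5 doubles k_base
```

Because selection follows the k nearest points rather than a fixed radius, the selected region stretches to match asymmetric clusters, as described above.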

Path selection enables space-spanning selections. Drawing a path creates a playlist which ‘sticks’ to nearby items along the way. The local density of items is controlled by modulating velocity, so faster trajectory sections stick to fewer songs than slow ones. This ‘dynamic attachment’ offers control over the composition of playlists without visual clutter. For example, a user can create a playlist starting in the happy area, then gradually migrating towards tender.
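One plausible reading of this 'dynamic attachment' can be sketched as follows. The exact attachment rule is not specified in the text, so the radius formula, its constants and the stand-in data are all assumptions; the sketch only shows the stated idea that faster path sections pick up fewer songs.

```python
# Sketch: attach songs near sampled path points, shrinking the attachment
# radius where the cursor moved fast, so fast strokes collect fewer songs.
import numpy as np

def path_playlist(coords, path, base_radius=0.1):
    # speed between consecutive path samples; faster -> smaller radius
    speed = np.linalg.norm(np.diff(path, axis=0), axis=1)
    radii = base_radius / (1.0 + 10.0 * speed)   # illustrative attenuation
    picked = []
    for p, r in zip(path[1:], radii):
        d = np.linalg.norm(coords - p, axis=1)
        for i in np.flatnonzero(d < r):
            if i not in picked:                  # keep path order, no repeats
                picked.append(int(i))
    return picked

rng = np.random.default_rng(4)
coords = rng.uniform(0, 1, (300, 2))             # song positions
path = np.stack([np.linspace(0.1, 0.9, 30), np.full(30, 0.5)], axis=1)
print(len(path_playlist(coords, path)))
```

Songs enter the playlist in path order, so a stroke from the happy region towards the tender region yields the gradual mood progression described above.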

4. USER EVALUATION

The evaluation was based on the research questions:

1. How do users perceive the low-dimensional mood space projection?

2. Is the mood-based visualisation useful in music exploration and selection?

3. Which techniques do users develop to create playlists?

A pilot study evaluated the viability of the system and guided the design of the main longitudinal “in the wild” user study, which was conducted to extract detailed usage behaviour over the course of several weeks. Adapting to a new media interface involves understanding how personal preferences and personal media collections are represented. Longitudinal study is essential for capturing the behaviour that develops over time, beyond superficial aesthetic reactions, and can – in contrast to lab-based studies – cover common use cases (choosing tracks for a party, playing something in the background while studying).

Eight participants (1 female, 7 male, 5 from the UK and 3 from Germany, undergraduate and research students) – each with their own Spotify account and personal music collection – were recruited. The mood interface was used to visualise the personal music collections of the participants. The participants used the interface at home as their music player, to whatever extent and in whatever way they wanted. Two participants also used the system at work.

All subjects used a desktop to access the interface. As a reward, and to facilitate use together with the Spotify Web Player, participants were given a voucher for a three-month premium subscription to Spotify.

The Shannon entropy H of the 6 mood features of each user’s music collection gives an impression of the diversity of content. Using the maximum mood feature for each song, H = −Σ_i p_i log₂ p_i, where p_i = N_mood_i / N.

User  1     2     3     4     5     6     7     8
H     2.51  2.36  2.49  2.42  1.84  2.38  1.93  2.5
N     3679  2623  4218  3656  2738  2205  1577  3781

Table 1. Entropy H and number of tracks N of users’ collections.
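The diversity measure above is a one-liner in practice. The sketch below computes the entropy over each song's maximum-mood label; the label array is synthetic, chosen so the result is easy to check against the formula.

```python
# Sketch: Shannon entropy of the mood-label distribution,
# H = -sum_i p_i log2 p_i with p_i = N_mood_i / N.
import numpy as np

def mood_entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# six equally common moods give the maximum H = log2(6) ~ 2.585 bits,
# close to the most diverse collections in Table 1
labels = np.repeat(np.arange(6), 100)
print(round(mood_entropy(labels), 3))  # -> 2.585
```

Lower values, such as users 5 and 7 in Table 1, indicate collections dominated by a few moods.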

The study took place in two blocks, each with nominally four days of usage, although the actual duration varied slightly. One of the key aims was to find out if the probabilistic background highlighting provides an enhanced experience, so the study comprised two conditions in a counterbalanced within-subjects arrangement:

A. Music Map without background highlighting.
B. Music Map with background highlighting: the probabilistic models are included, with the composite view of the mood distribution as well as dynamic mood highlighting on cursor movements.

Each participant was randomly assigned either condition A in week 1 followed by B in week 2, or vice versa. At the beginning of each condition and at the end of the study, questionnaires were administered to capture participants’ experience with the interface. Interface events, including playlist generation, navigation and all mouse events (incl. movements), were recorded.

5. RESULTS

Most participants used the software extensively, generating an average of 21 playlists per user per week, as shown in Table 2. On average, users actively interacted with the system for 77 minutes each week (roughly 20 minutes a day) – time spent passively listening to playlists is not included in this figure. Both groups generated more playlists in week 1 than in week 2, as they explored the system.

User  1   2    3  4  5   6   7   8
Np,A  27  164  4  7  18  8   5   25
Np,B  46  39   3  7  5   17  18  53

Table 2. Number of playlists generated per user for conditions A and B. Users 1–5 had A in week 1, while users 6–8 had A in week 2.

5.1 Mood perception

After each condition, users were asked to rate their satisfaction with interacting via the mood space. The overall opinion was encouraging. The majority of participants reported that they felt their collection was ordered in a meaningful way. Six stated that the mood-based categorisation made sense. Initially, the distinction of different music types was not rated as consistently across conditions. This might be due to the fact that people usually discuss music in terms of genres rather than moods. However, the difficulty rating of mood changed over sessions. While six


users rated mood-based categorisation as difficult at the start, only three participants still rated mood as difficult to categorise by the second week. This suggests that users can quickly learn the unfamiliar mood-based model.

5.2 Interactions with the Mood Space

Browsing the Space: Analysis of mouse movements provided insight into how participants explored mood spaces. Heatmaps were generated showing the accumulated mouse positions in each condition (Figure 4). Participants explored the space more thoroughly in week one of the study. Some participants concentrated exploration on small pockets, while others explored the whole space relatively evenly.

Figure 4. Heatmaps of interaction (mouse activity) of user 3 in week 1 (left) and week 2 (right). Interaction becomes more focused in the second week.

The browse/select ratio dropped noticeably in the second week for users with condition A first, as shown in Table 3. This suggests that participants browsed much more for each playlist in the first part of the study, and were more targeted in the second part. The browsing could have been either curiosity-driven exploration or a frustrating experience, because the non-linear nature of the mapping made it difficult for users to predict the response to movements in the space. However, from Table 3 we can see that users who had the highlights in the first week seemed to have much more focused navigation from the start, and did not decrease their browsing much in week 2 when they lost the highlighting mechanism.

        Condition A       Condition B
Week 1  1218.39 (483.49)  447.94 (176.85)
Week 2  319.25 (29.27)    576.5 (416.72)

Table 3. Browse/select ratios (std. dev. in parentheses) for week 1 and week 2 of the experiment, in conditions A and B.

Selections and Playlist Creation: Figure 5 shows playlists from two different participants in condition B. User 1 (left) created playlists by neighbourhood selection, and also drew a few trajectory playlists. User 7 (right) moved over a more diverse set, with a number of trajectory playlists. The paths partially follow the contours of the background highlight, which suggests this user explored contrasts in the mood space on these boundaries.

Neighbourhood selection was used more often (341 neighbourhood selections and 105 path selections). A rise in

Figure 5. Created playlists under condition B for user 1 (H = 2.51) and user 7 (H = 1.93). Note the different class layouts for the collections with high/low entropy H.

the use of path selections, and a decline in neighbourhood selections, can be seen in condition B versus A. In condition A, five times more neighbourhood than path selections were recorded, and only twice as many in condition B (see Table 4). This could be explained by the background distributions suggesting that mood influences change gradually over the space. This information may encourage users to create trajectory playlists that gradually change from one mood intensity to another.

Selection           A     B     Total
Path                42    63    105
Neighbourhood       216   125   341
Neighbourhood/Path  5.1×  2.0×  3.3×

Table 4. Usage of the two different selection types in each condition. The neighbourhood/path ratio shows the increased use of the path tool in condition B.

5.3 Qualitative feedback

Background Highlighting: We asked whether background highlighting was valuable to the users. The answer was clearly in favour of background highlighting: 6/8 users valued the highlighting, one user was indifferent and one preferred the version without highlighting. The reasons given in favour of the highlighting were that they could more easily identify different regions and remember specific “locales” in the mood space. They recognised that songs had different mood influences and enjoyed following the colour highlights to areas of different intensity. One user stated that he liked the vividness of the implicit highlighting. The user who preferred no highlighting found it a cleaner look that was less confusing. Six participants stated that they did not find the highlighting confusing. Seven participants answered that it did not distract from the playlist creation task. Qualitative feedback also indicated a preference for highlighting: “[with highlighting] I could easier identify how the mood was distributed over my library”, “coloured areas provided some kind of ‘map’ and ‘landmarks’ in the galaxy”.

Preference for neighbourhood versus path playlists: The domination of neighbourhood over path playlists in the logged data is supported by feedback from questionnaires,


which shows that users were generally happier with neighbourhood selection than with the more novel path selection technique. The attitude towards path selection differed, however, between conditions. Participants were more satisfied with path selection under condition B, with interactive background highlighting. After condition A, four participants agreed the path playlist was effective, and three disagreed. After condition B, however, five users agreed and only one disagreed.

Advantages of the Interface: The subjective feedback revealed that users had fun exploring the mood space and enjoyed the casual creation of playlists: “fun to explore the galaxy”, “easy generation of decent playlists”. Users also appreciated the casual nature of the interface: “It was very easy to take a hands-off approach”, “I didn’t have to think about specific songs”. Users made specific observations indicating that they were engaged in the exploration task and learned the structure of the map, although this varied among users: “I discovered that most older jazz pieces were clustered in the uppermost corner”, “It was easy to memorize such findings [...] the galaxy thus became a more and more personal space”. Satisfaction with the quality of selections was high, although some participants found stray tracks that did not fit with neighbouring songs: “The detected mood was a bit off for a few songs”. Several users stated that they appreciated the consistency of created playlists and the diversity of different artists, in contrast to their usual artist-based listening. There was concern that playlists did not offer enough diversity – “some songs that dominated the playlists”, “too much weight given to ‘archive’ material”, “some way to reorder the playlists to keep them fresh” – while others enjoyed this aspect: “I rediscovered many songs I had not listened to in a long time”.

Shared versus personal: Visualising a shared (i.e. inter-user) mood space with personal collections embedded was not rated as very important by most users (only one user thought it important). However, personalisation of the space was rated of high importance by half of the users. Ensuring that nearby songs are subjectively similar was additionally rated as important by the majority of participants (five users). These user priorities lead to trade-offs between very large music maps and maps that reliably uncover intrinsic clusters of similar items.

Improvement requests: The most requested missing feature was text search. The use of Spotify for playback also led to a disjointed user experience, which would be easily improved in a fully integrated mood-map music player. Users also requested the integration of recommendations and the ability to compare different mood spaces.

6. CONCLUSIONS AND FUTURE WORK

We presented an interactive tool for music exploration, with musical mood and genre inferred directly from tracks. It features probabilistic representations of multivariable predictions of subjective characteristics of the music, giving users subtle, nuanced visualisations of the map. These explicitly represent the vagueness and overlap among features. The user-based, in-the-wild evaluation of this novel highlighting technique provided answers to the initial research questions:

Can users create viable mental models of the music space? The feedback from the 'in-the-wild' evaluation indicates that people enjoyed using these novel interfaces on their own collections, at home, and that mood-based categorisation can usefully describe personal collections, even if initially unfamiliar. Analysis of logged data revealed distinct strategies in experiencing the mood space. Some users explored diverse parts of the mood space and switched among them, while others quickly homed in on areas of interest and then concentrated on those. The questionnaire responses suggest that they learned the composition of the space and used it more constructively in the later sessions. Users made plausible mental models of the visualisation – they knew where their favourite songs were – and could use this model to discover music and formulate playlists.

Which interface features enable people to navigate and explore the music space? Interactive background highlighting seemed to reduce the need to browse intensively with the mouse (Table 3). Subjective feedback confirmed that it helped users understand the music space, with 6/8 users preferring it over no highlighting. Most users did not feel disturbed by the implicitly changing background highlighting. Both the neighbourhood and path playlist generators were used by the participants, although neighbourhood selections were subjectively preferred and were made three times more often than path selections. Subjective feedback highlights the contrast between interfaces which adapt to an individual user's taste and those which reflect a global model, in which all users can collaborate, share and discuss music, trading greater relevance against greater communicability. Similarly, how can we adapt individual user maps as the user's musical horizons are expanded via the exploratory interface? Users' preference for comparing visualisations over interacting in one large music space hints that an alignment of visualisations is a valid solution to this problem.
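Such an alignment could, for instance, be computed as an orthogonal Procrustes fit over tracks shared between two maps. The sketch below is illustrative only; the function and variable names are our own and not part of the described system:

```python
import numpy as np

def align_maps(ref, other):
    """Align one 2D music map to another with an orthogonal Procrustes fit.

    `ref` and `other` are (n, 2) arrays of map coordinates for the same
    n tracks, one row per track. Returns `other` rotated/reflected and
    translated so that the shared tracks line up with `ref` as closely
    as a rigid transform allows. Names are illustrative assumptions.
    """
    a = ref - ref.mean(axis=0)          # centre both maps at the origin
    b = other - other.mean(axis=0)
    u, _, vt = np.linalg.svd(b.T @ a)   # optimal rotation via SVD
    rotation = u @ vt
    return b @ rotation + ref.mean(axis=0)
```

Aligning each user's personal map against a shared reference this way would keep familiar landmarks in place even as the projection is recomputed, preserving the learned mental model of the space.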

Can a single interface enable casual, implicit and focused interaction? Users valued the ability to vary the level of engagement. Their feedback also suggested that incorporating previews and control over the playing time of playlists would be useful, e.g. move towards "happy" over 35 minutes. A recurring theme was that playlists tended to be repetitive. One solution would be to allow the jittering of playlist trajectories, and to do this jittering in the high-dimensional space. The low-dimensional path then specifies a prior in the high-dimensional music space, which can be perturbed to explore alternative expressions of that path.
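As a rough illustration of that idea, assuming each track has both 2D map coordinates and a high-dimensional feature vector (all names below are our own, not the paper's implementation):

```python
import numpy as np

def jittered_playlist(path_2d, coords_2d, features, sigma=0.2, seed=None):
    """Sketch of high-dimensional jittering of a low-dimensional path.

    For each waypoint on the 2D path, find its nearest track in the map,
    treat that track's high-dimensional feature vector as the prior mean,
    perturb it with Gaussian noise of scale `sigma`, and return the track
    closest to each perturbed target in feature space. Re-running with a
    different seed yields an alternative expression of the same path.
    """
    rng = np.random.default_rng(seed)
    playlist = []
    for p in path_2d:
        # nearest track to the waypoint in the 2D projection
        anchor = np.argmin(np.linalg.norm(coords_2d - p, axis=1))
        # perturb its high-dimensional feature vector (the prior mean)
        target = features[anchor] + rng.normal(0.0, sigma, features.shape[1])
        # pick the track closest to the perturbed target in feature space
        playlist.append(int(np.argmin(np.linalg.norm(features - target, axis=1))))
    return playlist
```

With `sigma = 0` this degenerates to plain path selection; increasing `sigma` trades fidelity to the drawn path for variety across regenerations.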

Post-evaluation: An enhanced version with a text search function was distributed at the end of the study. Encouragingly, a month later, 3 of 8 participants still returned to the interface on a regular basis – once every few days – with one user generating 68 new playlists in the following weeks.


7. REFERENCES

[1] Dominikus Baur, Bernhard Hering, Sebastian Boring, and Andreas Butz. Who needs interaction anyway? Exploring mobile playlist creation from manual to automatic. In Proc. 16th Int. Conf. on Intelligent User Interfaces, pages 291–294, 2011.

[2] Dominikus Baur, Bartholomäus Steinmayr, and Andreas Butz. SongWords: Exploring music collections through lyrics. In Proc. ISMIR, pages 531–536, 2010.

[3] Daniel Boland, Ross McLachlan, and Roderick Murray-Smith. Inferring Music Selections for Casual Music Interaction. In EuroHCIR, pages 15–18, 2013.

[4] Daniel Boland, Ross McLachlan, and Roderick Murray-Smith. Engaging with mobile music retrieval. In MobileHCI 2015, Copenhagen, 2015.

[5] Daniel Boland and Roderick Murray-Smith. Information-theoretic measures of music listening behaviour. In Proc. ISMIR, Taipei, 2014.

[6] Vincent Castaignet and Frederic Vavrille. Musicovery. http://musicovery.com/, accessed 23.04.2014.

[7] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Second Int. Conf. on Knowledge Discovery and Data Mining, pages 226–231, 1996.

[8] Raphael A. Finkel and Jon Louis Bentley. Quad trees: a data structure for retrieval on composite keys. Acta Informatica, 4(1):1–9, 1974.

[9] Alinka Greasley, Alexandra Lamont, and John Sloboda. Exploring Musical Preferences: An In-Depth Qualitative Study of Adults' Liking for Music in Their Personal Collections. Qualitative Research in Psychology, 10:402–427, 2013.

[10] Jiajun Zhu and Lie Lu. Perceptual Visualization of a Music Collection. In 2005 IEEE Int. Conf. on Multimedia and Expo, pages 1058–1061. IEEE, 2005.

[11] JungHyun Kim, Seungjae Lee, SungMin Kim, and Won Young Yoo. Music mood classification model based on arousal-valence values. In 13th Int. Conf. on Advanced Comm. Technology, pages 292–295, 2011.

[12] Cyril Laurier, Mohamed Sordo, Joan Serrà, and Perfecto Herrera. Music mood representations from social tags. In Proc. ISMIR, pages 381–386, 2009.

[13] Adam J. Lonsdale and Adrian C. North. Why do we listen to music? A uses and gratifications analysis. British Journal of Psychology, 102(1):108–134, 2011.

[14] Dominik Lübbers and Matthias Jarke. Adaptive Multimodal Exploration of Music Collections. In Proc. ISMIR, pages 195–200, 2009.

[15] Gerald M. Murch. Physiological principles for the effective use of color. IEEE Computer Graphics and Applications, 4(11):48–55, 1984.

[16] Robert Neumayer, Michael Dittenbach, and Andreas Rauber. PlaySOM and PocketSOMPlayer, alternative interfaces to large music collections. In Proc. ISMIR, pages 618–623, 2005.

[17] Elias Pampalk, Andreas Rauber, and Dieter Merkl. Content-based organization and visualization of music archives. In Proc. 10th ACM Int. Conf. on Multimedia, page 570, 2002.

[18] Henning Pohl and Roderick Murray-Smith. Focused and casual interactions: allowing users to vary their level of engagement. In Proc. ACM SIGCHI Conf. on Human Factors in Computing Systems, pages 2223–2232, 2013.

[19] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

[20] James A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39:1161–1178, 1980.

[21] Markus Schedl and Arthur Flexer. Putting the User in the Center of Music Information Retrieval. In Proc. ISMIR, Porto, Portugal, 2012.

[22] Devinderjit Sivia and John Skilling. Data Analysis: A Bayesian Tutorial. Technometrics, 40(2):155, 1998.

[23] Sebastian Stober. Adaptive Methods for User-Centered Organization of Music Collections. PhD thesis, Otto-von-Guericke-Universität Magdeburg, 2011.

[24] Bob L. Sturm. A simple method to determine if a music information retrieval system is a horse. IEEE Transactions on Multimedia, 16(6):1636–1644, 2014.

[25] Laurens van der Maaten and Geoffrey Hinton. Visualizing Data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008.

[26] Rob van Gulik and Fabio Vignoli. Visual playlist generation on the artist map. In Proc. ISMIR, pages 520–523, 2005.

Acknowledgments. This work was supported in part by the Danish Council for Strategic Research of DASTI under the CoSound project, 11-115328.