Lubos palcomaio2010

Retention of visitors coming to Palco 3.0 via the news in SAPO: Decoy optimization Jindřich Tandler Luboš Popelínský Jan Kocourek Knowledge Discovery Lab FI, Masaryk University Brno Czech Republic


Transcript of Lubos palcomaio2010

Page 1: Lubos palcomaio2010

Retention of visitors coming to Palco 3.0 via the news in SAPO: Decoy optimization

Jindřich Tandler

Luboš Popelínský

Jan Kocourek

Knowledge Discovery Lab FI, Masaryk University Brno, Czech Republic

Page 2: Lubos palcomaio2010

Introduction

Readers of SAPO (www.sapo.pt) are presented with Palco news.

Occasionally a news item appears that redirects the user to PalcoPrincipal (www.palcoprincipal.pt).

Problem: people who come to Palco this way usually don't stay there long enough.

Question: How can we help Palco retain them?

Page 3: Lubos palcomaio2010

Main task

Task: reduce the current bounce rate of users who arrive at www.palcoprincipal.pt through news

Hypothesis: Shared interests are more common among users who clicked on a specific link (and are therefore viewing the same content) than among users who did not.

The clicked link(s) are probably the only information we can obtain about users coming from www.sapo.pt
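One way to act on this hypothesis is to rank, for each news link, the pages that earlier visitors arriving through the same link went on to view. The sketch below is only an illustration under assumptions: it presumes clickstream sessions are available as ordered lists of page paths, and all paths shown are hypothetical.

```python
from collections import Counter, defaultdict

def build_next_page_index(sessions):
    """Map each landing page to counts of pages viewed later in the same session."""
    index = defaultdict(Counter)
    for pages in sessions:
        if len(pages) < 2:
            continue  # a bounce carries no follow-up pages to learn from
        landing = pages[0]
        # Count each follow-up page once per session.
        index[landing].update(set(pages[1:]) - {landing})
    return index

def recommend_for_landing(index, landing, top_n=2):
    """Return the follow-up pages most often viewed after the given landing page."""
    ranked = sorted(index.get(landing, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]

# Hypothetical sessions; the first entry of each list is the news link that was clicked.
sessions = [
    ["/news/fado", "/artist/a", "/artist/b"],
    ["/news/fado", "/artist/a"],
    ["/news/fado"],
    ["/news/rock", "/artist/c"],
]
index = build_next_page_index(sessions)
recommended = recommend_for_landing(index, "/news/fado")
print(recommended)
```

Keying the index on the landing page uses only the clicked link, which matches the constraint that nothing else is known about visitors coming from SAPO.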

Page 4: Lubos palcomaio2010

What else could be done?

Improving the loyalty of already registered users by exploiting the social network

Hypothesis: socially related users share interests

e.g. we can identify users of Palco who showed interest in particular items and use their activity to make suggestions to socially related users
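The idea of using friends' activity for suggestions can be sketched as follows. This is a minimal illustration, assuming friendship ties and per-user interests are available as plain dictionaries; the user ids and band names are invented, and the slides do not prescribe this ranking.

```python
from collections import Counter

def suggest_from_friends(user, friends, interests, top_n=3):
    """Rank items liked by a user's friends that the user has not liked yet."""
    seen = interests.get(user, set())
    counts = Counter()
    for friend in friends.get(user, set()):
        for item in interests.get(friend, set()):
            if item not in seen:
                counts[item] += 1
    # Sort by descending count, then by name for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical toy data, invented for illustration only.
friends = {"u1": {"u2", "u3"}, "u2": {"u1"}, "u3": {"u1"}}
interests = {
    "u1": {"band_a"},
    "u2": {"band_a", "band_b"},
    "u3": {"band_b", "band_c"},
}
suggestions = suggest_from_friends("u1", friends, interests)
print(suggestions)
```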

Page 5: Lubos palcomaio2010

Current work

Data has been collected into the database.

Page 6: Lubos palcomaio2010

Current work

The social network structure has been extracted from the database to Pajek format

Using R and the igraph library, we can compute the assortative mixing coefficient

This can be used for verification of the second hypothesis (i.e. socially related users share interests)

R in combination with igraph will also be used to explore the properties of the social network and to better understand its specific features
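To make the assortative mixing coefficient concrete, the sketch below computes Newman's coefficient for a categorical vertex attribute. It is a plain-Python stand-in, not the R/igraph workflow named on the slide, and the choice of attribute (a favourite genre per user) is an assumption made for illustration. With e_ij the fraction of edge ends joining category i to category j and a_i the sum of e_ij over j, the coefficient is r = (sum_i e_ii - sum_i a_i^2) / (1 - sum_i a_i^2) for an undirected network.

```python
from collections import defaultdict

def attribute_assortativity(edges, attr):
    """Newman's assortativity coefficient for a categorical vertex attribute.

    edges: iterable of undirected (u, v) pairs; attr: dict vertex -> category.
    """
    e = defaultdict(float)  # e[(i, j)]: number of edge ends joining category i to j
    total = 0
    for u, v in edges:
        # Count each undirected edge in both directions so that e is symmetric.
        e[(attr[u], attr[v])] += 1
        e[(attr[v], attr[u])] += 1
        total += 2
    a = defaultdict(float)  # a[i]: fraction of edge ends attached to category i
    trace = 0.0
    for (i, j), count in e.items():
        frac = count / total
        a[i] += frac
        if i == j:
            trace += frac
    expected = sum(x * x for x in a.values())
    if abs(1.0 - expected) < 1e-12:
        raise ValueError("coefficient undefined: all edge ends share one category")
    return (trace - expected) / (1.0 - expected)

# Hypothetical toy network: four users, two genres, three friendship ties.
genre = {1: "rock", 2: "rock", 3: "fado", 4: "fado"}
ties = [(1, 2), (3, 4), (1, 3)]
r = attribute_assortativity(ties, genre)
print(round(r, 3))
```

A value near 1 would support the hypothesis that socially related users share interests, a value near 0 would indicate no such tendency, and negative values would indicate that friends tend to differ.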

Page 7: Lubos palcomaio2010

Current work – network properties

Exploring social network properties to understand its specific features

Small-world phenomenon, degree distribution, degree correlations, community structure, ...
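As an illustration of one of these properties, the sketch below reads undirected edges from a minimal Pajek-style string and computes the degree distribution. It is a simplified sketch: it assumes the ties are stored in an `*Edges` section with one `u v` pair per line, ignores weights and `*Arcs`, and the sample network is invented.

```python
from collections import Counter

def read_pajek_edges(text):
    """Read undirected edges from the *Edges section of a minimal Pajek .net string."""
    edges, in_edges = [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section headers start with '*'; only the *Edges section is read here.
            in_edges = line.lower().startswith("*edges")
            continue
        if in_edges:
            u, v = line.split()[:2]
            edges.append((int(u), int(v)))
    return edges

def degree_distribution(edges):
    """Return a Counter mapping degree -> number of vertices with that degree.

    Vertices without any edge do not appear in the result.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

# Hypothetical four-vertex star network.
sample = """*Vertices 4
1 "a"
2 "b"
3 "c"
4 "d"
*Edges
1 2
1 3
1 4
"""
edges = read_pajek_edges(sample)
distribution = degree_distribution(edges)
print(dict(distribution))
```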

A new student has been assigned to help with this task (Pavel Kocourek)

Page 8: Lubos palcomaio2010

Possible future work

Clustering

division of users into groups with similar characteristics can be used for content recommendation

Association rule mining

for discovering direct or indirect relationships between web pages in users' browsing behavior
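The association rule idea can be sketched on browsing sessions as follows. This is a deliberately reduced illustration that only considers rules between pairs of pages (it is not a full Apriori implementation), and the page names and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return rules (antecedent, consequent, support, confidence) over page pairs."""
    n = len(sessions)
    page_count, pair_count = Counter(), Counter()
    for session in sessions:
        pages = set(session)  # each page counts once per session
        page_count.update(pages)
        pair_count.update(combinations(sorted(pages), 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of sessions containing both pages
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / page_count[x]  # estimate of P(y in session | x in session)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)

# Hypothetical browsing sessions.
sessions = [
    ["news", "artist", "playlist"],
    ["news", "artist"],
    ["news", "events"],
    ["artist", "playlist"],
]
rules = pair_rules(sessions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```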

Page 9: Lubos palcomaio2010

Data needed

Actual bounce rate, time spent on the pages, etc. (Google Analytics?)

Clickstream data – browsing history of particular users (important for the main task)

Clickstream data assigned to the registered Palco users could be useful for mining as well (privacy issues?)
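If clickstream data of this kind were available, the bounce rate per landing page could be computed along the following lines. The sketch assumes that a bounce is a session with exactly one page view and that each session is an ordered list of page paths; both are assumptions, and the paths are hypothetical.

```python
from collections import defaultdict

def bounce_rate_by_landing_page(sessions):
    """Return, per landing page, the share of sessions with a single page view."""
    totals, bounces = defaultdict(int), defaultdict(int)
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        landing = pages[0]
        totals[landing] += 1
        if len(pages) == 1:
            bounces[landing] += 1
    return {page: bounces[page] / totals[page] for page in totals}

# Hypothetical sessions; the first entry is the page reached from the news link.
sessions = [
    ["/news/1"],
    ["/news/1", "/artist/x"],
    ["/news/2"],
    ["/news/2"],
]
rates = bounce_rate_by_landing_page(sessions)
print(rates)
```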

Page 10: Lubos palcomaio2010

Some other questions

To what extent is it possible to change the layout and content of Palco pages?

What does it take to change something? (time, people involved, …)

What algorithm is currently being used for content recommendation?

Is it possible to perform A/B testing of the site with the old and a new layout/content?
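Should A/B testing be possible, the bounce rates of the old and new variants could be compared with a two-proportion z-test. The sketch below is one common choice, not something the slides specify, and the visit counts are invented for illustration.

```python
import math

def two_proportion_z_test(bounces_a, visits_a, bounces_b, visits_b):
    """Return (z, two-sided p-value) for the difference between two bounce rates."""
    p_a = bounces_a / visits_a
    p_b = bounces_b / visits_b
    # Pooled proportion under the null hypothesis of equal bounce rates.
    pooled = (bounces_a + bounces_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: old layout (A) versus new layout (B).
z, p_value = two_proportion_z_test(600, 1000, 550, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The normal approximation is reasonable only when both variants have many visits; the function does not guard against a pooled rate of exactly 0 or 1.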

Page 11: Lubos palcomaio2010

Data overview

Page 12: Lubos palcomaio2010

Data – statistics*

Query              No. of results
users              68314
listeners          50663 (74%)
artists            17660 (26%)
friendship ties    168342 (avg 2.5)
mainstream bands   114062
fans               42228
comments           817370
tags               55018

* February 2010

Page 13: Lubos palcomaio2010

Data – users

User: CreatedAt, LastLogin, IsActive

User_profile: genre_id, Name, Slug, Category, Type, Culture, ModerationStatus, About, CreatedAt, UpdatedAt

Artist: MusicUploads, GooglePageRank, GooglePageRankUpdatedAt

Page 14: Lubos palcomaio2010

Data – users

Friend: InviterUserId, InvitedUserId, StatusId, CreatedAt, UpdatedAt

Listener_data: GenderId, BirthDate

Page 15: Lubos palcomaio2010

Data - contact

Contact: country_id, city_id, Place, Address, PhoneNumber, Homepage

Country

City

State

Page 16: Lubos palcomaio2010

Data – activities

Owner_activities: ActionId, OwnerId, TargetId, CreatorId, CreatedAt, Slug

Activity_stream_action: ActionTag (new event added, band music added, ...), PrivacyTagTitle, PrivacySubscriberTagTitle, ActionTitle

Page 17: Lubos palcomaio2010

Data - music

Playlist: listener_id, photo_id, Name, Description, Hash, Slug, Position, CreatedAt, UpdatedAt

Playlist_music: playlist_id, track_id, Position, CreatedAt

Track: album_id, Name, CreatedAt, UpdatedAt, Complete, Downloadable, Position, IdS3, FilenameS3, LinkS3, UploadedS3, Slug, Processed

Genre

Page 18: Lubos palcomaio2010

Data – tags, comments

Tag: Culture, Name, Hash, Slug, CleanSlug, Length, CreatedAt

Comments: AuthorId, OwnerId, CommentableModel, CommentableId, Aproved, Text, CreatedAt

Page 19: Lubos palcomaio2010

Data – bands

mainstreamBand: Summary, About, Name, CreatedAt, UpdatedAt, LastFmLink, Hash, Slug

fans: mainstreamBand_id, user_id

Albums: artist_id, photo_id, Name, Description, ReleaseDate, CreatedAt, UpdatedAt, Hash, Slug