A Scalable Solution for Personalized Recommendations in Large-scale Social Networks

Sardianos Christos, Varlamis Iraklis. Harokopio University of Athens, Dept. of Informatics & Telematics. {sardianos}{varlamis}@hua.gr. PCI 2014, 18th Panhellenic Conference in Informatics, Athens, October 2-4, 2014.


Transcript of A Scalable Solution for Personalized Recommendations in Large-scale Social Networks

Page 1: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

A Scalable Solution for Personalized Recommendations in Large-scale Social Networks

Sardianos Christos, Varlamis Iraklis

Harokopio University of Athens, Dept. of Informatics & Telematics

{sardianos}{varlamis}@hua.gr

PCI 2014, Athens October 2-4, 2014 18th Panhellenic Conference in Informatics

Page 2: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Role of Recommender Systems

In many Web 2.0 applications, users interact socially: they can express trust in another user or in another user's review.

A recommender system recommends items (e.g. products, articles) to users, based on their previous activity.

With existing techniques, this is a difficult process on large social and bipartite graphs.

Page 3: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Structure of Recommender Systems


We consider two types of entities:

• Users
• Items

Users express their preferences for some of the available items by rating them (directly or indirectly).

These preferences are usually stored in a user rating matrix (also called a utility matrix).

System's goal: predict a user's preference for items they have not yet "rated" and recommend the k items most likely to be preferred.
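The utility matrix and the top-k goal described above can be sketched in a few lines. This is only an illustration: the toy ratings, the function names, and the item-mean predictor are assumptions made for the example and are not the method used in the talk.

```python
from collections import defaultdict

# Hypothetical toy ratings as (user, item, rating) triples; not from the talk's datasets.
ratings = [
    ("ann", "i1", 5.0), ("ann", "i2", 3.0),
    ("bob", "i1", 4.0), ("bob", "i3", 2.0),
    ("eve", "i2", 4.0), ("eve", "i3", 5.0), ("eve", "i4", 4.0),
]

def build_utility_matrix(triples):
    """Sparse utility matrix: user -> {item: rating}; missing entries are unrated."""
    matrix = defaultdict(dict)
    for user, item, rating in triples:
        matrix[user][item] = rating
    return dict(matrix)

def item_means(matrix):
    """Mean rating per item, used here as a stand-in predictor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in matrix.values():
        for item, rating in row.items():
            sums[item] += rating
            counts[item] += 1
    return {item: sums[item] / counts[item] for item in sums}

def recommend(matrix, user, k):
    """Return the k unrated items with the highest predicted preference."""
    means = item_means(matrix)
    rated = matrix.get(user, {})
    candidates = [(score, item) for item, score in means.items() if item not in rated]
    # Highest score first; ties broken by item id so the output is deterministic.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in candidates[:k]]

matrix = build_utility_matrix(ratings)
print(recommend(matrix, "ann", k=2))
```

A real system would replace `item_means` with a collaborative filtering predictor; the matrix layout and the top-k selection stay the same.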

Page 4: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Recommender Systems Approaches

Recommender system approaches fall broadly into the following categories.*

Collaborative Filtering (CF)

Content-based

Hybrid Systems

* P. Melville, V. Sindhwani. "Recommender Systems", Encyclopedia of Machine Learning, Springer, 2010.


Page 5: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Limitations of Existing Approaches

Social networks such as Facebook and Twitter have over 1.5 billion and 95 million users, respectively. Scalability is therefore a major limitation for recommender systems.

Generating recommendations for users about whom the system has insufficient information (cold-start users) is a known problem of recommender systems.


Page 6: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Research Question Definition

Is it possible to achieve equally good recommendations by applying CF over subgraphs of the original graph?

Can these subgraphs also provide a solution to the cold-start problem?

Proposed solution: create subgraphs based on social information content.


Page 7: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Proposed Approach & Tools

o Partitioning using METIS from Karypis Lab*

o CF using the LensKit Recommender Toolkit (GroupLens Research**)

Pipeline (from the slide diagram): Social Graph + Bipartite Graph → Partitioning → Subgraphs → Collaborative Filtering (SVD, User-User, Item-Item) → Recommendations

* http://glaros.dtc.umn.edu/gkhome/index.php

** http://lenskit.grouplens.org/
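One way to picture the step from a partitioned social graph to per-subgraph rating data is sketched below. The slides do not spell out how ratings are assigned to subgraphs, so this assumes each rating simply follows the part of the user who gave it. The `user_part` mapping stands in for a partitioner's output and is hypothetical, as are all names in the snippet.

```python
from collections import defaultdict

def split_ratings_by_partition(ratings, user_part):
    """Group (user, item, rating) triples by the subgraph of the rating user.

    Ratings by users missing from `user_part` (for example, raters with no
    social edges) are returned separately instead of being dropped silently.
    """
    subgraphs = defaultdict(list)
    unassigned = []
    for user, item, rating in ratings:
        part = user_part.get(user)
        if part is None:
            unassigned.append((user, item, rating))
        else:
            subgraphs[part].append((user, item, rating))
    return dict(subgraphs), unassigned

# Hypothetical partitioner output: user id -> part id.
user_part = {"u1": 0, "u2": 0, "u3": 1}
ratings = [("u1", "a", 4), ("u2", "a", 5), ("u3", "b", 2), ("u4", "b", 3)]

subgraphs, unassigned = split_ratings_by_partition(ratings, user_part)
for part, rows in sorted(subgraphs.items()):
    print(part, rows)
print("unassigned:", unassigned)
```

Each per-part list can then be handed to a CF algorithm on its own, which is what makes the approach easy to distribute.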


Page 8: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Description of the model functionality

Preparation of the Social Graph

Social Graph Partitioning

Bipartite Graph Partitioning

Recommendations Evaluation

Page 9: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Evaluation Metrics

• ByUser
• ByRating

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
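The two error metrics can be computed as in the following sketch, where y is the actual rating and ŷ the predicted one. The reading of the two aggregation modes is an assumption: "ByRating" is taken to pool all test ratings into one error value, and "ByUser" to compute the error per user and then average across users. The data and function names are made up for the example.

```python
import math
from collections import defaultdict

def mae(pairs):
    """Mean absolute error over (actual, predicted) pairs."""
    return sum(abs(y - y_hat) for y, y_hat in pairs) / len(pairs)

def rmse(pairs):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in pairs) / len(pairs))

def by_rating(metric, predictions):
    """Pool all (user, actual, predicted) rows and apply the metric once."""
    return metric([(y, y_hat) for _, y, y_hat in predictions])

def by_user(metric, predictions):
    """Apply the metric per user, then average the per-user values."""
    per_user = defaultdict(list)
    for user, y, y_hat in predictions:
        per_user[user].append((y, y_hat))
    return sum(metric(pairs) for pairs in per_user.values()) / len(per_user)

# Hypothetical predictions: (user, actual rating, predicted rating).
predictions = [
    ("u1", 5.0, 4.0), ("u1", 3.0, 3.0), ("u1", 4.0, 4.0),
    ("u2", 2.0, 4.0),
]

for name, metric in (("MAE", mae), ("RMSE", rmse)):
    print(name, "ByRating:", by_rating(metric, predictions),
          "ByUser:", by_user(metric, predictions))
```

The two modes diverge when users have very different numbers of ratings: pooling weights heavy raters more, while per-user averaging gives every user the same weight.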

Page 10: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Dataset Characteristics Comparison

Characteristics                      Epinions      Flixster
Social Graph
  Num. of Distinct Users             131,828       786,936
  Num. of Social Edges               841,372       7,058,819
  Average Degree                     12.765        17.94
Bipartite Graph
  Num. of Distinct Users (Raters)    120,492       147,612
  Num. of Distinct Items             755,760       48,794
  Num. of Ratings                    13,668,320    8,196,077
  Avg. outDegree/User                113.44        10.42
  Avg. inDegree/Item                 18.09         167.97
  Rating scale                       1 to 5        0.5 to 5
  Rating precision (step)            1.0           0.5
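The derived rows of the table can be checked from the raw counts. The sketch below assumes the average degree is 2 × edges / users (each social edge counted at both endpoints), which reproduces the reported values. One observation from the arithmetic, not stated on the slides: the Flixster outDegree/User of 10.42 matches ratings divided by the 786,936 social-graph users, whereas dividing by the 147,612 raters would give about 55.52.

```python
# Raw counts copied from the dataset comparison table.
stats = {
    "Epinions": {"social_users": 131_828, "social_edges": 841_372,
                 "raters": 120_492, "items": 755_760, "ratings": 13_668_320},
    "Flixster": {"social_users": 786_936, "social_edges": 7_058_819,
                 "raters": 147_612, "items": 48_794, "ratings": 8_196_077},
}

def average_degree(s):
    # Assumption: each social edge contributes to the degree of two users.
    return 2 * s["social_edges"] / s["social_users"]

def ratings_per_rater(s):
    return s["ratings"] / s["raters"]

def ratings_per_social_user(s):
    return s["ratings"] / s["social_users"]

def ratings_per_item(s):
    return s["ratings"] / s["items"]

for name, s in stats.items():
    print(name,
          "avg degree:", round(average_degree(s), 3),
          "ratings/rater:", round(ratings_per_rater(s), 2),
          "ratings/social user:", round(ratings_per_social_user(s), 2),
          "ratings/item:", round(ratings_per_item(s), 2))
```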

Page 11: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Experimental Procedure

Platform used for experiments

• Okeanos IaaS cloud, provided by the Greek Research and Technology Network (GRNET S.A.)
• Two Linux-based systems:
o Ubuntu Desktop 64-bit
o 2 CPUs (QEMU Virtual CPU version 1.7.0, 2.1 GHz, 512 KB cache)
o 6 GB RAM

Experimental procedure implementation

• Model implemented in Java
• Evaluation process run through Groovy scripts

Page 12: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Evaluation of the Experimental Procedure

Algorithms evaluated:
o User-User
o Item-Item
o FunkSVD (SVD implementation)

We performed 5-fold cross-validation over the training and testing samples.
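A minimal sketch of 5-fold cross-validation over rating triples follows. It splits individual ratings at random, which is an assumption made for illustration; the evaluation toolkit used in the talk may build its folds differently (for instance by partitioning users). The function name and toy data are hypothetical.

```python
import random

def k_fold_splits(ratings, k=5, seed=42):
    """Yield (train, test) pairs; every rating lands in exactly one test fold."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled ratings into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [row for j, fold in enumerate(folds) if j != held_out for row in fold]
        yield train, test

# Hypothetical toy data: 20 ratings by 4 users.
ratings = [(f"u{n % 4}", f"i{n}", 1 + n % 5) for n in range(20)]
splits = list(k_fold_splits(ratings, k=5))

for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train)} test={len(test)}")
```

The reported metric is then the average of the per-fold results.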

The numbers of subgraphs examined were s = {1, 2, 4, 8, 16, 33, 65, 125, 250, 500, 1000}, using the whole neighborhood as the k nearest neighbors.

For s = {4, 65, 1000}, we examined the performance of the User-User algorithm for different neighborhood sizes (kNN), with k = {1, 3, 5, 10, 25, 50, 100, 500, 1000}.

The number of features used for training by the FunkSVD algorithm was set to FeatureCount = 100.

The list size for the Top-N nDCG metric was set to N = 5.
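For readers unfamiliar with Top-N nDCG, the sketch below computes it for a single user's recommendation list with N = 5. It uses the common log2(rank + 1) discount and the user's true ratings as gains; both choices are assumptions, and the exact variant implemented by the evaluation toolkit may differ.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with the common log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_n(recommended, true_ratings, n=5):
    """nDCG of a ranked item list; items the user never rated count as gain 0."""
    gains = [true_ratings.get(item, 0.0) for item in recommended[:n]]
    ideal = sorted(true_ratings.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical held-out ratings for one user.
true_ratings = {"a": 5.0, "b": 3.0, "c": 1.0}

print(ndcg_at_n(["a", "b", "c"], true_ratings))  # ideal ordering
print(ndcg_at_n(["c", "b", "a"], true_ratings))  # reversed ordering scores lower
```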

Page 13: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Evaluation Findings

o Evaluation time drops rapidly as the number of subgraphs increases.
o For s > 16 (~7,530 users per subgraph), the Item-Item algorithm runs faster than User-User and SVD.
o Running the Item-Item and User-User algorithms over the full graph was impossible, and the SVD algorithm could not be run for s < 4 (~30,123 users per subgraph), because of insufficient memory caused by the way the SVD algorithm works.
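The approximate subgraph sizes quoted above are consistent with dividing the 120,492 Epinions raters evenly among s subgraphs. The snippet below reproduces them under that assumption; real partitions are only roughly balanced, so actual sizes will vary.

```python
EPINIONS_RATERS = 120_492  # "Num. of Distinct Users (Raters)" from the dataset table

def users_per_subgraph(total_users, s):
    """Average subgraph size, assuming the partitioner balances the parts."""
    return total_users / s

for s in (1, 4, 16, 1000):
    print(f"s={s}: about {int(users_per_subgraph(EPINIONS_RATERS, s)):,} users per subgraph")
```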


Page 14: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks


Evaluation Findings

o SVD and Item-Item appear to achieve normalized gain (nDCG), unlike User-User, which performs poorly because of the notably large number of items per subgraph.

o Item-Item can predict similar items (based on the ratings), while SVD creates a smaller and denser item space. The result is better performance.
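The "smaller and denser item space" can be illustrated with a small matrix factorization trained by stochastic gradient descent, in the spirit of FunkSVD. This is a simplified sketch and not the implementation evaluated in the talk: it updates all features together, uses 2 features instead of 100, omits bias terms, and every name and hyperparameter here is an assumption.

```python
import math
import random

def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item feature vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))

def train_factors(ratings, n_features=2, epochs=200, lr=0.01, reg=0.02, seed=0):
    """Learn dense vectors so that predict(P, Q, u, i) approximates rating r."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(0.0, 0.1) for _ in range(n_features)] for u in users}
    Q = {i: [rng.uniform(0.0, 0.1) for _ in range(n_features)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_features):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def training_rmse(ratings, P, Q):
    return math.sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical toy ratings: (user, item, rating).
ratings = [("a", "x", 5), ("a", "y", 4), ("b", "x", 4), ("b", "z", 1),
           ("c", "y", 5), ("c", "z", 2), ("d", "x", 5), ("d", "z", 1)]

P0, Q0 = train_factors(ratings, epochs=0)   # untrained baseline
P, Q = train_factors(ratings)
print("RMSE before training:", training_rmse(ratings, P0, Q0))
print("RMSE after training: ", training_rmse(ratings, P, Q))
```

Every item ends up with a short dense vector in `Q` regardless of how few ratings it received, which is one reason factorization copes better with sparse rating data than neighborhood methods.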

Page 15: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks


Evaluation Findings

o Results are comparable to those from Epinions.
o The User-User algorithm still does not perform well, but it behaves more stably.
o However, unlike the Item-Item and SVD algorithms, User-User shows a larger standard deviation in performance across subgraphs for the different values of s.

Page 16: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Conclusions

Is it possible to create a model that will take into account the social network of the users for creating personalized recommendations in large-scale social networks?

The performance of the proposed model (CF on subgraphs) is comparable to that of the traditional techniques (CF on the full graph).

On sparse bipartite graphs, the performance of this model may be reduced.

However, algorithms such as SVD provide a solution even for sparse bipartite graphs.

The proposed approach could be utilized to implement a distributed recommender system, minimizing the execution time and producing high quality recommendations.


Page 17: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks


Future Work

Deploy the proposed model over a distributed architecture

Partitioning is fast; CF is the bottleneck

• Based on graph (and subgraph) statistics, decide whether to partition and how many partitions to use

Graph partitioning produces many cross-cluster edges, which are currently ignored

• What happens when we take these edges into account?
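As a starting point for studying the ignored edges, the following sketch counts how many social edges cross cluster boundaries for a given partition. The edge list and the `user_part` mapping are hypothetical; in practice they would come from the social graph and the partitioner's output.

```python
def count_cross_cluster_edges(edges, user_part):
    """Return (cut, total): cut is the number of edges whose endpoints lie in different parts."""
    cut = sum(1 for u, v in edges if user_part[u] != user_part[v])
    return cut, len(edges)

# Hypothetical social edges and partition assignment.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
user_part = {"a": 0, "b": 0, "c": 1, "d": 1}

cut, total = count_cross_cluster_edges(edges, user_part)
print(f"{cut} of {total} edges cross clusters ({cut / total:.0%})")
```

Tracking this ratio for each value of s would show how much social information is discarded as the graph is split into more parts.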

Page 18: A Scalable Solution  for  Personalized Recommendations  in  Large-scale Social Networks

Thank you for your time.
