



A Hybrid Multigroup Coclustering Recommendation Framework Based on Information Fusion

SHANSHAN HUANG, JUN MA, and PEIZHE CHENG, Shandong University
SHUAIQIANG WANG, University of Jyväskylä

Collaborative Filtering (CF) is one of the most successful algorithms in recommender systems. However, it suffers from data sparsity and scalability problems. Although many clustering techniques have been incorporated to alleviate these two problems, most of them fail to achieve further significant improvement in recommendation accuracy. First, most of them assume each user or item belongs to a single cluster. Since users usually hold multiple interests and items may belong to multiple categories, it is more reasonable to assume that users and items can join multiple clusters (groups), where each cluster is a subset of like-minded users and the items they prefer. Furthermore, most clustering-based CF models utilize only historical rating information in the clustering procedure and ignore other data resources in recommender systems, such as the social connections of users and the correlations between items. In this article, we propose HMCoC, a Hybrid Multigroup CoClustering recommendation framework, which can cluster users and items into multiple groups simultaneously with different information resources. In our framework, we first integrate information from user–item rating records, user social networks, and item features extracted from the DBpedia knowledge base. We then use an optimization method to mine meaningful user–item groups with all the information. Finally, we apply a conventional CF method in each cluster to make predictions. By merging the predictions from each cluster, we return the top-n recommendations to the target users. Extensive experimental results demonstrate the superior performance of our approach in top-n recommendation in terms of MAP, NDCG, and F1 compared with other clustering-based CF models.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—Information filtering

General Terms: Algorithms, Performance, Experimentation

Additional Key Words and Phrases: Recommender systems, collaborative filtering, coclustering, information fusion, data sparsity

ACM Reference Format:
Shanshan Huang, Jun Ma, Peizhe Cheng, and Shuaiqiang Wang. 2015. A hybrid multigroup coclustering recommendation framework based on information fusion. ACM Trans. Intell. Syst. Technol. 6, 2, Article 27 (March 2015), 22 pages.
DOI: http://dx.doi.org/10.1145/2700465

This work was supported by the Natural Science Foundation of China (61272240, 60970047, 61103151, 71402083), the Doctoral Fund of Ministry of Education of China (20110131110028), the Natural Science Foundation of Shandong province (ZR2012FM037, BS2012DX012), the Humanity and Social Science Foundation of Ministry of Education of China (12YJC630211), and the Microsoft Research Fund (FY14-RES-THEME-25).
Authors' addresses: S. Huang, J. Ma, and P. Cheng, School of Computer Science and Technology, Shandong University, 1500 Shunhua Road, Jinan 250101, China; emails: [email protected], [email protected], [email protected]; S. Wang, Department of Computer Science and Information Systems, University of Jyväskylä, Agora, 5. krs., Mattilanniemi 2, 40100 Jyväskylä, Finland; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2015 ACM 2157-6904/2015/03-ART27 $15.00
DOI: http://dx.doi.org/10.1145/2700465

ACM Transactions on Intelligent Systems and Technology, Vol. 6, No. 2, Article 27, Publication date: March 2015.



1. INTRODUCTION

In the age of information overload, Recommender Systems (RSs) have become indispensable tools for helping people find items of potential interest and filter out uninteresting ones [Adomavicius and Tuzhilin 2005]. They can be used to discover relevant items or information and make personalized recommendations based on users' past behaviors. RSs not only benefit users by saving time but also help online shops satisfy customers and make more profit.

Collaborative Filtering (CF) [Huang et al. 2007; Deshpande and Karypis 2004] is one of the most popular techniques for building recommender systems. The underlying assumption of CF algorithms is that if users had similar tastes in the past, they are likely to have similar preferences for items in the future. An important advantage of CF is the ability to make recommendations without any domain knowledge. However, CF-based recommendation algorithms also suffer from several drawbacks that limit their performance [Sarwar et al. 1998]. The first is the data sparsity problem, which is common in real-world applications: when users have rated few items in common, CF methods cannot find accurate neighbors. The second is the scalability problem, caused by the ever-increasing numbers of users and items.

Some research has been conducted to address these two problems by various clustering techniques [Sarwar et al. 2002; Gong 2010a; George and Merugu 2005]. Clustering methods are usually processed as an intermediate step in CF-based recommendation algorithms. They can be used to cluster users or items based on the rating information. Most clustering methods used in recommender systems assume that a user or an item falls into a single cluster. However, this is not a reasonable assumption in reality. In most situations, users and items can belong to several clusters; for example, one user may be interested in different kinds of movies such as horror, comedy, and drama, and one movie could have multiple genres. In addition, in order to cope with the data sparsity problem, many hybrid recommendation algorithms have been proposed by incorporating other information resources such as social networks [Massa and Avesani 2007; Jamali and Ester 2010; Jiang et al. 2012] or item attributes [Middleton et al. 2004; Di Noia et al. 2012b; Ostuni et al. 2013]. However, this information has not been used for clustering in CF models.

Xu et al. [2012] proposed a multiclass coclustering (MCoC) model by assuming each user and item belongs to multiple clusters. However, MCoC clusters users and items based only on rating information, which is usually very sparse. The sparse relationships between users and items may not be sufficient to find meaningful clusters; especially when a user has rated very few items or an item has been rated by only a few users, the clustering result will not be accurate for this user (item). Fortunately, in some recommender systems, people not only have interactions with items but also have social relations with each other. Besides, items usually have metadata, such as descriptions, categories, and so forth. Based on this additional information, we can cluster strongly connected users and highly correlated items into the same groups.

With all the aforementioned concerns in mind, in this article we propose HMCoC, a Hybrid Multigroup CoClustering recommendation framework, which extends conventional CF-based recommendation algorithms with a novel clustering method. In HMCoC, we assume that each user and item can belong to multiple groups (clusters). To alleviate the data sparsity problem, besides rating records, we utilize additional information resources, such as user social networks and item correlations. In order to calculate item correlations, we extract items' category information from DBpedia. To utilize multiple information sources together in the clustering model, we first fuse rating records, social relations, and item correlations into a unified graph model. We then




formulate a novel clustering problem to cocluster both users and items into multiple groups at the same time with all the information we have. Finally, we combine these groups with existing CF methods to generate top-n recommendations. In the top-n recommendation process, we can partition the rating matrix into several submatrices according to the clustering result and choose one CF method to make recommendations independently on each submatrix. The recommendation results from all groups are merged together to generate the final top-n recommendation. An advantage of our framework is its generality: it does not rely on any specific CF algorithm. For a concrete problem, we can choose any suitable rating-based CF method and integrate it into this framework without modification.
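The final merging step described above can be sketched as follows. The per-group score dictionaries and the merge-by-maximum rule are illustrative assumptions, since the framework leaves the choice of CF method (and hence the exact merge) open.

```python
import heapq

def merge_topn(group_preds, n=10):
    """Merge per-group prediction dicts {item: score} into one top-n list.

    When an item is scored in several groups, keep its maximum score;
    this merge rule is an assumption, not part of the framework.
    """
    merged = {}
    for preds in group_preds:
        for item, score in preds.items():
            merged[item] = max(score, merged.get(item, float("-inf")))
    # Items sorted by merged score, highest first, truncated to n.
    return heapq.nlargest(n, merged, key=merged.get)

# Two clusters produced overlapping candidate scores for one target user.
print(merge_topn([{"a": 4.2, "b": 3.1}, {"b": 4.8, "c": 2.0}], n=2))  # → ['b', 'a']
```

Any other merge rule (e.g., membership-weighted averaging) would slot in at the `max` line without changing the rest of the pipeline.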

The main contributions of this work can be summarized as follows:

—To the best of our knowledge, this is the first work that utilizes heterogeneous information to cluster users and items in recommender systems.

—We propose a novel recommendation framework, HMCoC, to cope with the limitations of conventional CF methods via coclustering users and items into multiple groups at the same time.

—We embed social networks and knowledge bases as complementary data resources in the clustering process to satisfy connectivity coherency and topical consistency.

—We formulate our Hybrid Multigroup Coclustering method as an optimization problem that combines one-sided and two-sided clustering techniques. Furthermore, we provide an effective approximate solution to the problem for finding meaningful user–item groups.

The remainder of this article is organized as follows. We discuss the related work in Section 2. We introduce our new recommendation framework, HMCoC, in Section 3. Section 4 presents the experimental settings. We report experimental results and discussion in Section 5. Finally, Section 6 gives the conclusions and future work.

2. RELATED WORK

2.1. Collaborative Filtering

In recommender systems, CF algorithms can be mainly classified into two kinds of approaches: memory-based algorithms [Wang et al. 2006; Huang et al. 2007; Deshpande and Karypis 2004] and model-based algorithms [Lee and Seung 2000; Hofmann 2004; Mnih and Salakhutdinov 2007]. In memory-based CF algorithms, the entire user–item rating matrix is directly used to predict unknown ratings for each user. User-based [Desrosiers and Karypis 2011] and item-based [Sarwar et al. 2001; Deshpande and Karypis 2004] CF algorithms are the two best-known methods that fall into this category. User-based CF methods first find several nearest neighbors with high similarities for each user and then make predictions based on the weighted average ratings of his or her neighbors. The neighbors can be determined by various similarity measures [Wang et al. 2006], such as the Pearson correlation coefficient and cosine similarity in rating space. Similarly, item-based CF methods find the nearest neighbors for each item. The similarity computation process is computationally expensive for large datasets, and neighbors cannot be found accurately in highly sparse data [Adomavicius and Tuzhilin 2005].

In model-based CF algorithms, a predictive model is trained from observed ratings in advance. In this category, latent factor models (LFMs) [Hofmann and Puzicha 1999; Hofmann 2004; Koren 2008] are very competitive and widely adopted to build recommender systems. They assume that only a few latent factors influence user rating behaviors. Latent factor models seek to factorize the user–item rating matrix into two




low-rank user-specific and item-specific matrices and then utilize the factorized matrices to make further predictions. Latent factor models can reduce data sparsity through dimensionality reduction and usually generate more accurate recommendations than memory-based CF algorithms. PureSVD [Sarwar et al. 2000], Matrix Factorization (MF) [Mnih and Salakhutdinov 2007], and Nonnegative Matrix Factorization (NMF) [Lee and Seung 2000] are commonly used LFM methods.
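The factorization idea can be sketched with plain stochastic gradient descent on the observed entries. The update rule, hyperparameters, and function name below are illustrative assumptions and do not reproduce any specific cited method (PureSVD, MF, or NMF).

```python
import numpy as np

def factorize(R, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Fit R ≈ P @ Q.T on nonzero (observed) entries by SGD."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    P = 0.1 * rng.standard_normal((m, k))  # user-specific factors
    Q = 0.1 * rng.standard_normal((n, k))  # item-specific factors
    rows, cols = np.nonzero(R)
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])  # regularized gradient step
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)  # 0 marks a missing rating
P, Q = factorize(R)
pred = P @ Q.T  # the factorized matrices also fill in the unobserved cells
print(round(float(np.abs(pred - R)[R > 0].mean()), 3))
```

The mean absolute error on observed cells should be small after training, while the zero cells of `R` receive nontrivial predicted scores.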

When the numbers of users and items grow tremendously, traditional CF algorithms, either memory based or model based, suffer a serious scalability problem: the computational cost goes beyond practical or acceptable levels. Clustering CF models address the scalability problem by making recommendations within smaller clusters instead of the entire database, demonstrating a promising tradeoff between scalability and recommendation accuracy.

2.2. Clustering CF Models

Various clustering techniques have been investigated in an attempt to address the problems of sparsity and scalability in recommender systems [Sarwar et al. 2002; George and Merugu 2005; Xu et al. 2012]. In clustering CF models, clustering is often an intermediate process, and the clustering results are further used in CF recommendation algorithms.

User clustering [Sarwar et al. 2002; Xue et al. 2005] and item clustering methods [Gong 2010b] (also called one-sided clustering) cluster users or items according to their rating vectors, and the prediction is then calculated separately in each cluster. Some other clustering CF models cluster users and items at the same time (also called two-sided clustering). In George and Merugu [2005], the key idea is to simultaneously obtain user and item neighborhoods via coclustering. The proposed method generates predictions based on the average ratings of the coclusters while taking into account the individual biases of users and items. Leung et al. [2011] propose a Collaborative Location Recommendation (CLR) framework that employs a dynamic clustering algorithm to cluster trajectory data into groups of similar users, similar activities, and similar locations. The advantage of applying clustering techniques in CF is that they can improve scalability and alleviate the sparsity problem by partitioning the whole rating space into smaller and denser subspaces. However, all the aforementioned approaches assume that users and items belong to a single cluster, which is not a reasonable assumption in real-world applications.
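One way to read the bias-corrected cocluster prediction mentioned above: predict the cocluster's average rating, adjusted by how the user and the item deviate from their clusters' averages. This is a simplified sketch, not the exact formulation of George and Merugu [2005], and the single-membership toy clustering is invented for illustration.

```python
import numpy as np

def cocluster_predict(R, user_c, item_c, u, i):
    """Cocluster average plus user and item bias corrections.

    R uses 0 for missing ratings; all averages ignore missing cells.
    """
    mask = R > 0
    m, n = R.shape

    def avg(rows, cols):
        sm = mask[np.ix_(rows, cols)]
        return R[np.ix_(rows, cols)][sm].mean() if sm.any() else 0.0

    all_u, all_i = np.ones(m, bool), np.ones(n, bool)
    same_u, same_i = user_c == user_c[u], item_c == item_c[i]
    only_u, only_i = np.arange(m) == u, np.arange(n) == i
    return (avg(same_u, same_i)                        # cocluster average
            + avg(only_u, all_i) - avg(same_u, all_i)  # user bias
            + avg(all_u, only_i) - avg(all_u, same_i)) # item bias

R = np.array([[5, 4, 0],
              [4, 5, 1],
              [1, 1, 5]], dtype=float)
user_c = np.array([0, 0, 1])  # hypothetical user cluster labels
item_c = np.array([0, 0, 1])  # hypothetical item cluster labels
print(round(cocluster_predict(R, user_c, item_c, 0, 2), 2))  # → 1.7
```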

Another clustering technique, the one most related to our model, is the MCoC method [Xu et al. 2012]. It assumes each user and item can appear in multiple groups and clusters users and items by limited rating records. However, the rating matrix is usually very sparse because most users rate only a small fraction of items. The sparse relationships between users and items may not be sufficient to find meaningful clusters.

In this article, different from previous clustering techniques used in recommender systems, we try to find meaningful user–item groups by integrating multisource information. To the best of our knowledge, there has been no attempt to incorporate heterogeneous information from various resources for clustering in clustering CF models.

2.3. Recommendation Using Additional Information

Due to the lack of sufficient rating records, much research has been done on exploiting additional information to enhance recommendation performance, such as metadata [Ahn and Shi 2009], tags [Peng et al. 2010; Zhang et al. 2010], social relations [Massa and Avesani 2007; Jamali and Ester 2010], and other social media information [Bu et al. 2010].




Fig. 1. Overview of the recommendation framework.

Ahn and Shi [2009] use five types of cultural metadata provided by IMDb (user comments, plot outlines, synopses, plot keywords, and genres) to calculate similarities between movies. Zhang et al. [2010] propose a recommendation algorithm by integrating diffusion on user–tag–item tripartite graphs. Symeonidis et al. [2008] represent the relationships between users, items, and tags by a tensor and then decompose the full folksonomy tensor using Higher-Order Singular Value Decomposition (HOSVD).

Social relations, such as trust and friendship relations, have been regarded as potentially valuable information in recommender systems because of the homophily and selection effects. The common rationale behind this is that a user's taste is influenced by his or her trusted friends in social networks. SocialMF [Jamali and Ester 2010] incorporates trust propagation into probabilistic matrix factorization and achieves better recommendation accuracy.

Bu et al. [2010] use a unified hypergraph to model the high-order relations in social media, together with acoustic-based music content, to make recommendations. This approach is an application of ranking on graph data and requires learning a ranking function. Their hypergraph model needs to store the vertex–hyperedge incidence matrix, which demands much larger memory space, and the time complexity of computing the matrix inverse is relatively high. Besides, in their work, different kinds of relations contribute equally to the recommendation.

As far as we know, no research has studied whether such additional information contributes to clustering users and items in recommender systems.

3. HYBRID MULTIGROUP COCLUSTERING RECOMMENDATION FRAMEWORK

3.1. Overview

As illustrated in Figure 1, our framework is composed of three main modules: information fusion, hybrid multigroup coclustering, and top-n recommendation.

(1) Information Fusion: In this module, we integrate information from different sources. Besides the rating matrix, we also use users' social networks and items' topic information. In recommender systems, it is usually time-consuming to obtain topics of items from large amounts of plain text with natural language processing techniques. Instead, we utilize a publicly available knowledge base (e.g., DBpedia) to extract accurate and specific properties of items. Lastly, we use a unified graph model to represent the integrated information.

(2) Hybrid Multigroup Coclustering: With the information from the first module, we cocluster users and items into multiple groups simultaneously. We assume that users and the items to which they have given high rating scores should belong to the same one or more groups. To satisfy connectivity coherency and topical consistency, users who have tight social relationships are likely to appear in the same groups,




Fig. 2. A fraction of attributes extracted from DBpedia related to the movie Avatar.

and items that have strong implicit correlations are also likely to appear in the same groups. We combine one-sided and two-sided clustering techniques and present a fuzzy c-means-based clustering method to discover user–item clusters from different information sources.

(3) Top-n Recommendation: In this module, we choose a suitable CF recommendation algorithm and run it independently on the user–item submatrices derived from the clustering results. By merging the predictions from each cluster, we finally make top-n recommendations to the target users.

3.2. Information Fusion Module

Conventional CF algorithms make recommendations based on a set of users U = {u_1, u_2, ..., u_m}, a set of items I = {i_1, i_2, ..., i_n}, and the rating matrix R ∈ ℝ^{m×n}, where each element R_{ij} denotes the rating score that user u_i gives to item i_j.

With the increasing popularity of social networks, many people maintain their social relations online, such as friendships on Facebook1 or Last.fm2 and trust relations on Epinions.3 Social relations have been regarded as potentially valuable information in recommender systems because they can be usefully applied to find users' like-minded neighbors and reduce the data sparsity problem. Many researchers have successfully exploited social relations to improve the performance of online recommender systems [Massa and Avesani 2007; Jamali and Ester 2010]. In this article, we investigate whether social relations can contribute to clustering users.

Apart from social relations, other relations such as item–category relations could also be incorporated into recommender systems to make up for the lack of rating information [Zhang et al. 2013]. However, in many recommender systems, the category information about items is absent or too general, for example, Action or Drama in the movie domain. In recent years, thanks to the advancement of the Web of Data, we have access to abundant knowledge bases such as Freebase4 and DBpedia.5 These knowledge bases typically contain a set of concepts, instances, and relations, where the information about instances is structured, specific, and comprehensive. In Figure 2, we show partial information about the movie Avatar extracted from DBpedia. It can be observed that the category information in this figure represents the topics expressed by the movie. Several research works have improved recommendation by exploiting knowledge bases [Di Noia et al. 2012a, 2012b]. In our study, for each item

1 https://www.facebook.com.
2 http://www.last.fm.
3 http://www.epinions.com.
4 http://www.freebase.com/.
5 http://dbpedia.org/.




in the recommender system, we try to map it to an instance in DBpedia and extract its associated properties to build its profile. Since the category property contains most of the information about an item, we use only the category information of items in this article.

In DBpedia, the categories are modeled as a hierarchical structure, which allows us to catch implicit relations and expand information. In HMCoC, we extract not only the items' categories but also the categories' parent categories from DBpedia in one step. In order to calculate the implicit correlations between each pair of items, we adopt the vector-based method used in Di Noia et al. [2012a], where each item i_j is represented as a vector ω_j = (ω_{ja_1}, ω_{ja_2}, ..., ω_{ja_t}). The nonbinary weights in the vector are TF-IDF weights of category terms. More precisely, they are computed as

    ω_{ja_i} = tf_{ja_i} × log(n / n_{a_i}),    (1)

where n is the number of items and n_{a_i} is the number of items that belong to category a_i; tf_{ja_i} = 1 if item i_j belongs to category a_i, and 0 otherwise. Thus, more general categories, such as English-language films in the movie domain, will have lower weights. As is common in the classical vector space model, we evaluate the implicit correlation between items i_j and i_k by the cosine similarity between their vectors:

    sim(ω_j, ω_k) = (Σ_{i=1}^{t} ω_{ja_i} × ω_{ka_i}) / (√(Σ_{i=1}^{t} ω_{ja_i}²) · √(Σ_{i=1}^{t} ω_{ka_i}²)).    (2)
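Equations (1) and (2) can be sketched in code as follows; the item names and category labels are toy data invented for illustration, not real DBpedia categories.

```python
import math

def category_vectors(item_categories):
    """TF-IDF weights per Equation (1): w = tf * log(n / n_a), tf ∈ {0, 1}."""
    n = len(item_categories)
    df = {}  # n_a: number of items carrying each category
    for cats in item_categories.values():
        for a in cats:
            df[a] = df.get(a, 0) + 1
    return {item: {a: math.log(n / df[a]) for a in cats}
            for item, cats in item_categories.items()}

def cosine(v, w):
    """Cosine similarity per Equation (2), on sparse dict vectors."""
    dot = sum(v[a] * w[a] for a in v if a in w)
    nv = math.sqrt(sum(x * x for x in v.values()))
    nw = math.sqrt(sum(x * x for x in w.values()))
    return dot / (nv * nw) if nv and nw else 0.0

items = {
    "Avatar":  {"3D films", "Science fiction films", "Films shot in New Zealand"},
    "Gravity": {"3D films", "Science fiction films", "Films shot in New Zealand"},
    "Amelie":  {"Romantic comedy films"},
}
vecs = category_vectors(items)
print(round(cosine(vecs["Avatar"], vecs["Gravity"]), 2))  # → 1.0
print(round(cosine(vecs["Avatar"], vecs["Amelie"]), 2))   # → 0.0
```

Note that a category shared by all n items gets weight log(1) = 0, which is exactly how Equation (1) suppresses overly general categories.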

So far, we have three different types of relations in our framework: rating behaviors R, user social relations F, and item implicit correlations S. In fact, we can fuse this information in a heterogeneous graph model G = (V, E), where V = U ∪ I and E = R ∪ F ∪ S. In this graph, we have two different kinds of vertices, namely, users and items, and three different types of edges. R ⊆ U × I represents the rating behaviors, and the weights of these edges are the rating scores. F ⊆ U × U represents the social relations between users; F_{ij} = 1 if users u_i and u_j are friends or u_i trusts u_j, and F_{ij} = 0 otherwise. Similarly, S ⊆ I × I represents the implicit correlations between items, and the weight of each edge is the cosine similarity computed by Equation (2).
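The fused graph G can be pictured as a symmetric (m+n)×(m+n) block adjacency matrix with F and S on the diagonal blocks and R off the diagonal; the toy sizes and edge weights below are illustrative assumptions.

```python
import numpy as np

m, n = 3, 4  # 3 users, 4 items (toy sizes)

R = np.zeros((m, n)); R[0, 1], R[1, 2], R[2, 0] = 5, 3, 4  # rating edges (weight = score)
F = np.zeros((m, m)); F[0, 1] = F[1, 0] = 1                # social edges (friend/trust)
S = np.zeros((n, n)); S[1, 2] = S[2, 1] = 0.8              # item edges (cosine similarity)

# Vertices are ordered as V = U ∪ I: users first, then items.
A = np.block([[F, R],
              [R.T, S]])
assert np.allclose(A, A.T)  # the fused graph is undirected
print(A.shape)  # → (7, 7)
```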

3.3. Hybrid Multigroup Coclustering Module

In this module, the goal is to cocluster users and items into multiple groups simultaneously. What differentiates our work from prior methods is that any user or item can belong to more than one group to different degrees, and furthermore, we utilize not only the rating information in the clustering procedure but also users' social relations and items' implicit correlations.

Important notations used in the rest of the article are listed in Table I. We use subscripts i and j to index the ith or jth row of a matrix, and ij to index the cell in the ith row and jth column of a matrix.

We first define the concept of a group and then present the formulation of the hybrid multigroup coclustering problem. Lastly, we propose an approximate solution to the optimization problem.

Definition 3.1 (Group). Let G(V, E) be a graph where V = U ∪ I is the vertex set of G, and U and I are the user set and item set, respectively. For a given positive integer L and a fuzzy clustering method on V, if V can be partitioned into L nonempty subsets V_k = U_k ∪ I_k (1 ≤ k ≤ L) such that ∪_{k=1}^{L} U_k = U and ∪_{k=1}^{L} I_k = I, we call V_k a group.




Table I. Notations

Notation   Description
U, I       the user set and item set
R          the rating matrix
F          the social relationship matrix
S          the item correlation matrix
m, n       the numbers of users and items
V_k        the kth group
Y          the membership matrix of clustering results
P, Q       the membership matrices of users and items
D          the diagonal degree matrix
L          the number of groups
α, β       the tradeoff parameters

We use the membership matrix Y ∈ [0, 1]^{(m+n)×L} to represent the clustering results, where each element Yik is the relative weight of entry i belonging to group Vk. We can fix the number of groups that each user or item can belong to at, for example, K groups (1 ≤ K ≤ L). Then we have exactly K nonnegative weights in each row, and the remaining entries are set to zero. Specifically, matrix Y can be written as

$$
Y = \begin{bmatrix} P \\ Q \end{bmatrix}, \tag{3}
$$

where P ∈ [0, 1]^{m×L} is the membership matrix for users and Q ∈ [0, 1]^{n×L} for items. Different numbers of groups may have different effects on the recommendation performance, and we conduct experiments to investigate this in Section 5.1.
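The structure of Y in Equation (3) can be illustrated concretely. Below is a hand-made toy example, with hypothetical membership values, showing a Y with L = 4 groups where every user and item keeps exactly K = 2 nonzero, normalized weights.

```python
import numpy as np

# Hypothetical memberships: L = 4 groups, K = 2 nonzero weights per row.
P = np.array([[0.7, 0.3, 0.0, 0.0],      # user 1: groups 1 and 2
              [0.0, 0.4, 0.6, 0.0]])     # user 2: groups 2 and 3
Q = np.array([[0.5, 0.0, 0.5, 0.0],      # item 1: groups 1 and 3
              [0.0, 0.0, 0.2, 0.8],      # item 2: groups 3 and 4
              [0.9, 0.1, 0.0, 0.0]])     # item 3: groups 1 and 2

# Equation (3): stack user and item memberships into one matrix.
Y = np.vstack([P, Q])
```

Each row sums to 1 and has exactly K nonzeros, matching the constraints imposed later in Equation (8).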

We aim to group users and items simultaneously and allow each user and item to belong to multiple groups. Intuitively, if a user gave a high rating score to an item, this user and item are likely to belong to the same one or more groups. Furthermore, if two users are connected in social networks, they probably appear in one or more groups together, and if two items have strong implicit correlations, they might belong to one or more of the same groups. Note that unlike spectral clustering [Von Luxburg 2007] and bipartite spectral graph partitioning [Dhillon 2001] algorithms, we have two distinct types of vertices and three different kinds of adjacency matrices in graph model G. These algorithms cannot be directly used here without adaptation.

The pairwise relationships between users and items are represented in our undirected weighted graph G. Considering the differences between the scales of weights and structure, we need to model the inter- and intrarelationships between users and items separately. In our clustering method, we assume that if users and items are strongly associated, their group indicator vectors Pi and Qj should be as close as possible. The local variation between two connected objects is the difference between their group indicator vectors. However, before computing the local variation, we need to split each object's indicator vector among its adjacent objects to make it balanced.

We use L(P, Q) to denote our loss function, which comprises three terms. The first term indicates that if a user has rated an item, their group indicator vectors (Pi and Qj) should be close. The second term implies that if two users are friends, they should have similar interests and their group indicator vectors (Pi and Pj) should be close. The last term states that if two items are correlated with each other, their group indicator vectors (Qi and Qj) should be close. Each term is proportional to the relationship weight between the users and/or items, that is, Rij, Fij, or Sij. In order to group strongly associated users and items, inspired by Zhang et al. [2012] and Von Luxburg [2007], we propose our clustering problem as follows:

$$
\begin{aligned}
L(P, Q) = {} & \sum_{i=1}^{m} \sum_{j=1}^{n} \left\| \frac{P_i}{\sqrt{D^{\mathrm{row}}_{ii}}} - \frac{Q_j}{\sqrt{D^{\mathrm{col}}_{jj}}} \right\|^2 R_{ij}
+ \alpha \sum_{i=1}^{m} \sum_{j=1}^{m} \left\| \frac{P_i}{\sqrt{D^{F}_{ii}}} - \frac{P_j}{\sqrt{D^{F}_{jj}}} \right\|^2 F_{ij} \\
& + \beta \sum_{i=1}^{n} \sum_{j=1}^{n} \left\| \frac{Q_i}{\sqrt{D^{S}_{ii}}} - \frac{Q_j}{\sqrt{D^{S}_{jj}}} \right\|^2 S_{ij},
\end{aligned} \tag{4}
$$

where D^row ∈ R^{m×m} and D^col ∈ R^{n×n} are the two diagonal degree matrices of users and items, respectively, with D^row_ii = ∑_{j=1}^{n} R_ij and D^col_jj = ∑_{i=1}^{m} R_ij. Usually the user social relations and item implicit correlations are symmetric, so we use D^F and D^S to denote the diagonal degree matrices of F and S. Parameter α ≥ 0 controls the social-relation-constrained user-side clustering, and β ≥ 0 controls the implicit-correlation-constrained item-side clustering. The joint objective function concerns not only the user–item preferences but also the connectivity coherency between users and the topical consistency between items.

After some algebraic derivations, Equation (4) can be rewritten in matrix form as follows:

$$
\begin{aligned}
L(P, Q) = {} & \left( \sum_{i=1}^{m} \|P_i\|^2 + \sum_{j=1}^{n} \|Q_j\|^2 - \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{2 P_i Q_j^{\top} R_{ij}}{\sqrt{D^{\mathrm{row}}_{ii}} \sqrt{D^{\mathrm{col}}_{jj}}} \right) \\
& + \alpha \left( \sum_{i=1}^{m} \|P_i\|^2 + \sum_{j=1}^{m} \|P_j\|^2 - \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{2 P_i P_j^{\top} F_{ij}}{\sqrt{D^{F}_{ii}} \sqrt{D^{F}_{jj}}} \right) \\
& + \beta \left( \sum_{i=1}^{n} \|Q_i\|^2 + \sum_{j=1}^{n} \|Q_j\|^2 - \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{2 Q_i Q_j^{\top} S_{ij}}{\sqrt{D^{S}_{ii}} \sqrt{D^{S}_{jj}}} \right) \\
= {} & \operatorname{Tr}\bigl( P^{\top} P + Q^{\top} Q - 2 P^{\top} A Q + \alpha ( P^{\top} P + P^{\top} P - 2 P^{\top} B P ) + \beta ( Q^{\top} Q + Q^{\top} Q - 2 Q^{\top} C Q ) \bigr) \\
= {} & \operatorname{Tr}\left( [P^{\top} \; Q^{\top}] \begin{bmatrix} I_m & -A \\ -A^{\top} & I_n \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} + [P^{\top} \; Q^{\top}] \begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} + [P^{\top} \; Q^{\top}] \begin{bmatrix} 0 & 0 \\ 0 & 2\beta(I_n - C) \end{bmatrix} \begin{bmatrix} P \\ Q \end{bmatrix} \right) \\
= {} & \operatorname{Tr}\left( Y^{\top} \begin{bmatrix} I_m + 2\alpha(I_m - B) & -A \\ -A^{\top} & I_n + 2\beta(I_n - C) \end{bmatrix} Y \right) = \operatorname{Tr}(Y^{\top} M Y).
\end{aligned} \tag{5}
$$

In Equation (5), we have

$$
A = (D^{\mathrm{row}})^{-\frac{1}{2}} R (D^{\mathrm{col}})^{-\frac{1}{2}}, \qquad
B = (D^{F})^{-\frac{1}{2}} F (D^{F})^{-\frac{1}{2}}, \qquad
C = (D^{S})^{-\frac{1}{2}} S (D^{S})^{-\frac{1}{2}}, \tag{6}
$$


and

$$
M = \begin{bmatrix} I_m + 2\alpha(I_m - B) & -A \\ -A^{\top} & I_n + 2\beta(I_n - C) \end{bmatrix}. \tag{7}
$$

Matrices A, B, and C are the symmetrically normalized versions of R, F, and S, respectively. Finally, with the loss function in Equation (5), we define the hybrid multigroup coclustering problem as in Definition 3.2.
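The construction of A, B, C, and M in Equations (6) and (7) can be sketched as follows in NumPy. The helper names `sym_normalize` and `build_M` are ours, and the guard against zero degrees is an implementation detail not spelled out in the text.

```python
import numpy as np

def sym_normalize(W, d_row, d_col):
    """Compute D_row^{-1/2} W D_col^{-1/2}, guarding against zero degrees."""
    dr = np.where(d_row > 0, 1.0 / np.sqrt(d_row), 0.0)
    dc = np.where(d_col > 0, 1.0 / np.sqrt(d_col), 0.0)
    return (W * dr[:, None]) * dc[None, :]

def build_M(R, F, S, alpha, beta):
    """Assemble matrix M of Equation (7) from R, F, S and tradeoffs alpha, beta."""
    m, n = R.shape
    d_row, d_col = R.sum(axis=1), R.sum(axis=0)   # diagonals of D^row, D^col
    d_F, d_S = F.sum(axis=1), S.sum(axis=1)       # diagonals of D^F, D^S
    A = sym_normalize(R, d_row, d_col)            # Equation (6)
    B = sym_normalize(F, d_F, d_F)
    C = sym_normalize(S, d_S, d_S)
    I_m, I_n = np.eye(m), np.eye(n)
    return np.block([[I_m + 2 * alpha * (I_m - B), -A],
                     [-A.T, I_n + 2 * beta * (I_n - C)]])
```

For any nonnegative R and symmetric nonnegative F and S, the resulting M is symmetric positive semidefinite, which is exactly the property established in Theorem 3.1.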

Definition 3.2. The hybrid multigroup coclustering problem is defined as

$$
\begin{aligned}
\min_{Y} \quad & \operatorname{Tr}(Y^{\top} M Y) \\
\text{s.t.} \quad & Y \in [0, 1]^{(m+n) \times L}, \\
& Y \mathbf{1}_L = \mathbf{1}_{m+n}, \\
& |Y_i| = K, \quad i = 1, \ldots, (m+n),
\end{aligned} \tag{8}
$$

where L is the number of clusters and K is the maximum number of groups each user or item can belong to. Matrix M is given by Equation (7). The notation |Y_i| denotes the number of nonzero elements in the ith row of matrix Y.

Before we discuss the solution to our problem (Equation (8)), let us first prove a property of matrix M, given in Theorem 3.1.

THEOREM 3.1. Matrix M given by Equation (7) is positive semidefinite.

PROOF. In linear algebra, an n × n real matrix M is said to be positive semidefinite if z^⊤Mz is nonnegative for every nonzero column vector z of n real numbers.

From Equation (5), we see that M is positive semidefinite if we can prove that the three following matrices are positive semidefinite:

$$
\begin{bmatrix} I_m & -A \\ -A^{\top} & I_n \end{bmatrix}, \qquad
\begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 0 \\ 0 & 2\beta(I_n - C) \end{bmatrix}. \tag{9}
$$

For any vectors x = [x_1, \ldots, x_m]^⊤ and y = [y_1, \ldots, y_n]^⊤,

$$
\begin{aligned}
[x^{\top}, y^{\top}] \begin{bmatrix} I_m & -A \\ -A^{\top} & I_n \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= {} & x^{\top}x - y^{\top}A^{\top}x - x^{\top}Ay + y^{\top}y \\
= {} & x^{\top}x + y^{\top}y - 2\sum_{i=1}^{m}\sum_{j=1}^{n} \frac{\sqrt{R_{ij}}}{\sqrt{D^{\mathrm{row}}_{ii}}} \frac{\sqrt{R_{ij}}}{\sqrt{D^{\mathrm{col}}_{jj}}} x_i y_j
+ \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{R_{ij}}{D^{\mathrm{row}}_{ii}} x_i^2 + \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{R_{ij}}{D^{\mathrm{col}}_{jj}} y_j^2 - x^{\top}x - y^{\top}y \\
= {} & \sum_{i=1}^{m}\sum_{j=1}^{n} \left( \frac{\sqrt{R_{ij}}}{\sqrt{D^{\mathrm{row}}_{ii}}} x_i - \frac{\sqrt{R_{ij}}}{\sqrt{D^{\mathrm{col}}_{jj}}} y_j \right)^2 \geq 0
\end{aligned} \tag{10}
$$


and

$$
\begin{aligned}
[x^{\top}, y^{\top}] \begin{bmatrix} 2\alpha(I_m - B) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= {} & \alpha \left( 2x^{\top}x - 2x^{\top}Bx \right) \\
= {} & \alpha \left( 2\sum_{i=1}^{m} x_i^2 - 2\sum_{i=1}^{m}\sum_{j=1}^{m} \frac{\sqrt{F_{ij}}}{\sqrt{D^{F}_{ii}}} \cdot \frac{\sqrt{F_{ij}}}{\sqrt{D^{F}_{jj}}} x_i x_j \right) \\
= {} & \alpha \sum_{i=1}^{m}\sum_{j=1}^{m} \left( \frac{\sqrt{F_{ij}}}{\sqrt{D^{F}_{ii}}} x_i - \frac{\sqrt{F_{ij}}}{\sqrt{D^{F}_{jj}}} x_j \right)^2 \geq 0.
\end{aligned} \tag{11}
$$

The proof for the third matrix is similar to the proof of Equation (11). The detailed proof is left to the interested reader.

Since the sum of positive semidefinite matrices is still positive semidefinite, matrix M is positive semidefinite.

However, it is not easy to solve the optimization problem in Equation (8), because it is nonconvex and discontinuous. In order to solve it efficiently, we relax Equation (8) according to the spectral clustering method given in Von Luxburg [2007]. First, we map all the users and items into a common low-dimensional subspace, and then we cluster them simultaneously in this subspace. Let Z ∈ R^{(m+n)×r} be the matrix whose rows are the low-dimensional representations of users and items in the r-dimensional subspace. The optimal Z* is obtained by solving the following problem:

$$
\min_{Z} \ \operatorname{Tr}(Z^{\top} M Z) \quad \text{s.t.} \quad Z \in \mathbb{R}^{(m+n) \times r}, \ Z^{\top}Z = I. \tag{12}
$$

Since matrix M is positive semidefinite, according to the Rayleigh–Ritz theorem [MacDonald 1933], the optimal solution Z* can be given by the solution of the eigenvalue problem MZ = λZ. Z* = [z_1, \ldots, z_r] can be used as the approximate solution to our hybrid multigroup coclustering problem, where z_1, \ldots, z_r are the eigenvectors corresponding to the r smallest eigenvalues of matrix M.
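For the eigenvalue step, a dense sketch can use `numpy.linalg.eigh`, which returns eigenvalues in ascending order; for a large sparse M, an iterative routine such as `scipy.sparse.linalg.eigsh` would be the natural substitute. The function name below is ours.

```python
import numpy as np

def smallest_eigvecs(M, r):
    """Return Z whose columns are eigenvectors of the symmetric matrix M
    for its r smallest eigenvalues (eigh sorts eigenvalues ascending)."""
    _, vecs = np.linalg.eigh(M)
    return vecs[:, :r]            # Z in R^{(m+n) x r}
```

Each row of the returned Z then serves as the low-dimensional feature vector of one user or item.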

Once we obtain the unified representation Z of users and items, each row of Z can be used as the feature vector of the corresponding user or item. We utilize fuzzy c-means [Lkeski 2003] to cluster users and items into L groups with Z. After the clustering procedure, for each row of the membership matrix Y, only the K largest entries are retained and normalized to sum to 1. The pseudocode of the Hybrid Multigroup Coclustering method is shown in Algorithm 1.
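The clustering and retention steps can be sketched as follows. We implement the standard fuzzy c-means membership update directly rather than relying on a specific library; the function names, fuzzifier value, and iteration count are our choices, not the paper's.

```python
import numpy as np

def fuzzy_cmeans(Z, L, fuzz=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns a membership matrix Y in [0,1]^{N x L}."""
    rng = np.random.default_rng(seed)
    Y = rng.random((Z.shape[0], L))
    Y /= Y.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = Y ** fuzz
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]      # weighted centroids
        d = np.linalg.norm(Z[:, None, :] - centers[None], axis=2) + 1e-12
        Y = 1.0 / (d ** (2.0 / (fuzz - 1.0)))             # standard FCM update
        Y /= Y.sum(axis=1, keepdims=True)
    return Y

def retain_topk(Y, K):
    """Final step of the procedure: keep the K largest entries per row,
    zero the rest, and renormalize each row to sum to 1."""
    out = np.zeros_like(Y)
    idx = np.argsort(Y, axis=1)[:, -K:]
    rows = np.arange(Y.shape[0])[:, None]
    out[rows, idx] = Y[rows, idx]
    return out / out.sum(axis=1, keepdims=True)
```

The retention step is what turns the soft c-means output into a membership matrix satisfying the |Y_i| = K constraint of Equation (8).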

3.4. Top-n Recommendation Module

Now we describe how to combine the groups obtained in the previous section with conventional collaborative filtering methods.

Intuitively, for each group Vk, we can get a submatrix of the original user–item matrix R containing only the users and items appearing in that group. Let R_k ∈ R^{m_k×n_k} denote the rating matrix for group Vk, where k = 1, \ldots, L, and m_k and n_k are the numbers of users and items in that group, respectively. For a traditional CF method such as user-based CF or NMF, the input is the user–item rating matrix and the output is the predicted scores for the missing values in that matrix. We can apply any rating-based CF method to each submatrix independently and merge the prediction results together from all the


Fig. 3. Illustration by example of recommendation procedure in HMCoC.

ALGORITHM 1: Hybrid Multigroup Coclustering Algorithm
Input: Rating matrix R ∈ R^{m×n}, user social relations F ∈ R^{m×m}, item implicit correlations S ∈ R^{n×n}, the number of groups L, and the number of feature vectors r.
Output: Group membership matrix Y.
1 Compute the normalized matrices A, B, and C according to Equation (6);
2 Construct matrix M from A, B, and C according to Equation (7);
3 Compute the eigenvectors z_1, \ldots, z_r of M corresponding to the r smallest eigenvalues;
4 Let Z ∈ R^{(m+n)×r} be the matrix containing the vectors z_1, \ldots, z_r as columns;
5 For i = 1, \ldots, m+n, let y_i be the vector corresponding to the ith row of Z;
6 Cluster the points {y_i} (i = 1, \ldots, m+n) with fuzzy c-means into L groups, resulting in the membership matrix Y ∈ [0, 1]^{(m+n)×L};
7 For each row of Y, retain the K largest entries, set the others to zero, and normalize the row.

groups at last. The dimensions of the submatrices are much smaller than those of the original matrix, and the CF models can be executed in parallel; therefore, many CF models can be applied online with very large datasets. To be more specific, we show an example in Figure 3.
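The per-group submatrix extraction described above can be sketched as follows. The helper name and the convention that the first m rows of Y index users (and the remaining n rows index items) are ours.

```python
import numpy as np

def group_submatrix(R, Y, k, m):
    """Extract R_k: rows are users in group k, columns are items in group k.
    Y is the (m+n) x L membership matrix; its first m rows are users."""
    users = np.flatnonzero(Y[:m, k] > 0)    # indices of users in V_k
    items = np.flatnonzero(Y[m:, k] > 0)    # indices of items in V_k
    return R[np.ix_(users, items)], users, items
```

A chosen CF model is then run on each R_k independently, which also makes the per-group computations trivially parallelizable.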

Since each user and item can belong to multiple (K ≥ 1) groups, we need to merge the prediction results generated from these groups. As in Xu et al. [2012], we define the final prediction score of user u_i for item i_j as

$$
R_{ij} =
\begin{cases}
\sum_{k} r(u_i, i_j, k) \cdot \omega_{ik} & \text{if } u_i \text{ and } i_j \text{ belong to one or more common groups}, \\
0 & \text{otherwise},
\end{cases} \tag{13}
$$

where r(u_i, i_j, k) is the prediction score of u_i for item i_j in the kth group given by the chosen CF algorithm, and ω_{ik} is the weight. ω_{ik} can be set to the relative weight of user u_i belonging to group k. For simplicity, we set ω_{ik} = 1 if Y_{ik} is the maximum satisfying Y_{ik} ≠ 0 and Y_{jk} ≠ 0 (1 ≤ k ≤ L), and ω_{ik} = 0 otherwise.
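The simple weighting rule above can be sketched as follows; `preds` (one score table per group) and the function name are hypothetical, and Y is assumed to stack user rows before item rows.

```python
import numpy as np

def merge_prediction(preds, Y, i, j, m):
    """Equation (13) with the simple weighting: take the prediction from the
    shared group k where user i's membership Y[i, k] is largest.
    preds[k] maps (user, item) -> score from the k-th per-group CF model."""
    shared = [k for k in range(Y.shape[1]) if Y[i, k] > 0 and Y[m + j, k] > 0]
    if not shared:
        return 0.0                      # user and item share no group
    k = max(shared, key=lambda g: Y[i, g])
    return preds[k].get((i, j), 0.0)
```

When the user and item share no group, the item is effectively filtered out of that user's candidate list, which is the pruning behavior described below.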

With the recommendation framework described earlier, for each user, we sort the prediction scores in decreasing order and recommend the top-n items to the user. In fact, HMCoC can filter out many items for a user if these items are not in any of the groups this user belongs to. The pseudocode of the top-n recommendation process is shown in Algorithm 2.


ALGORITHM 2: Top-n Recommendation Algorithm
Input: Rating matrix R ∈ R^{m×n}, all the groups {V_1, \ldots, V_L}, a chosen CF method, and the number of items in the recommendation list N.
Output: Recommendation list for each user.
1 for k ← 1 to L do
2   Extract submatrix R_k from rating matrix R with the users and items belonging to group V_k;
3   Apply the CF recommendation method with R_k as input and predict the missing scores r(u_i, i_j, k).
4 end
5 for i ← 1 to m do
6   for j ← 1 to n do
7     if R_ij is missing then
8       Find the group index k = argmax_k {Y_ik | Y_ik ≠ 0 and Y_jk ≠ 0};
9       if k is null then
10        Set R_ij = 0;
11      end
12      else
13        Set R_ij = r(u_i, i_j, k);
14      end
15    end
16  end
17  Generate the top-n recommendation list for user u_i according to the decreasing order of the predicted scores.
18 end

4. EXPERIMENTAL SETTINGS

4.1. Datasets

Our experiments are carried out on two real datasets, Movielens-1M6 (ML1M) and Last.fm7 (LF), and we use the mappings of items to DBpedia instances published by Di Noia et al. [2012b] and Ostuni et al. [2013].8

The ML1M dataset contains user rating scores for different movies on a 1 to 5 star scale. This dataset does not have user social networks. The second dataset comes from the Last.fm online music system. Last.fm is an implicit feedback dataset, in which each user has a list of most-listened-to music artists, and the weight indicates the listening frequency of a user for an artist. Its users are interconnected in a social network generated from Last.fm bidirectional friend relations. We delete the artists that have been listened to only once. The basic statistics of these two datasets are shown in Table II.

In our experiments, the datasets were partitioned into five parts for fivefold cross-validation, where four parts were used for training and the remaining part for testing, and the averaged performances were reported.

4.2. Evaluation Metrics

In reality, recommender systems care more about personalized rankings of items than about absolute rating predictions [Cremonesi et al. 2010; Deshpande and Karypis 2004]. To be consistent with other top-n recommendation literature, three classical measures are selected to evaluate the accuracy of the ranked list: F1-measure, MAP

6 http://www.grouplens.org/node/73.
7 http://ir.ii.uam.es/hetrec2011/datasets.html.
8 http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/.


Table II. Statistics of Datasets

                              Movielens-1M    Last.fm
# of users                    6,040           1,885
# of items                    3,952           6,953
# of items found in DBpedia   3,148           5,209
# of categories               9,042           18,134
# of ratings                  1,000,209       82,155
# of relations                —               25,334
# of ratings per user         165.60          43.58
# of ratings per item         253.09          11.82
# of friends per user         —               13.44
# of categories per item      49.66           29.19

(Mean Average Precision), and NDCG (Normalized Discounted Cumulative Gain). For each item in the recommendation list, if the user has a rating for it in the test data, we assume that he or she was interested in this item.

To compute the F1-measure, let precision and recall be the user-oriented averaged precision and recall for the ranked list:

$$
F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}. \tag{14}
$$

For each user u, given a ranked list with n items, we denote prec(j) as the precision at rank position j and pref(j) as the preference indicator of the item at position j. If the item at position j is rated by user u in the test set, pref(j) = 1, and otherwise 0. Average Precision (AP) is computed as the average of the precisions computed at each position in the ranked list. MAP is the mean of AP over all users:

$$
\mathrm{AP}(u) = \frac{\sum_{j=1}^{n} \mathrm{prec}(j) \times \mathrm{pref}(j)}{n}, \qquad
\mathrm{MAP} = \frac{1}{|U|} \sum_{u \in U} \mathrm{AP}(u). \tag{15}
$$

In addition, when evaluating lists of recommended items, the position of a relevant item in the ranked list is also important. NDCG gives more weight to items ranked closer to the front:

$$
\mathrm{NDCG} = \frac{1}{\mathrm{IDCG}} \times \sum_{j=1}^{n} \frac{2^{\mathrm{pref}(j)} - 1}{\log_2(j + 1)}, \tag{16}
$$

where IDCG is the value produced by a perfect ranking algorithm. Higher F1, MAP, and NDCG imply better recommendation performance.

In experiments, we recommend the top-20 items for each user.
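For reference, the three measures can be computed as in the sketch below. The implementations follow Equations (14)–(16) as written (in particular, AP divides by the list length n and pref(j) is binary); the function names are ours.

```python
import numpy as np

def f1_score(precision, recall):
    """Equation (14)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def average_precision(ranked, relevant):
    """AP(u) from Equation (15): prec(j) * pref(j) averaged over all n positions."""
    pref = np.array([1.0 if item in relevant else 0.0 for item in ranked])
    prec = np.cumsum(pref) / (np.arange(len(ranked)) + 1)   # prec(j) at each rank
    return float((prec * pref).sum() / len(ranked))

def ndcg(ranked, relevant):
    """NDCG from Equation (16) with binary pref(j); IDCG is the DCG of a
    perfect ranking of the same list."""
    pref = np.array([1.0 if item in relevant else 0.0 for item in ranked])
    discounts = np.log2(np.arange(len(ranked)) + 2)         # log2(j + 1), j = 1..n
    dcg = ((2.0 ** pref - 1.0) / discounts).sum()
    idcg = ((2.0 ** np.sort(pref)[::-1] - 1.0) / discounts).sum()
    return float(dcg / idcg) if idcg > 0 else 0.0
```

A list that ranks all relevant items first attains NDCG = 1, and AP decreases as relevant items slip down the list.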

4.3. Comparisons

Here we chose four popular CF models as the basic CF algorithms: one memory-based method, User-based CF (UserCF) [Huang et al. 2007], and three model-based methods, PureSVD [Sarwar et al. 2000], NMF [Lee and Seung 2000], and SLIM [Ning and Karypis 2011]. For user-based CF, we used the Pearson correlation to measure user–user similarities. We set the dimension of the latent features to six in PureSVD and NMF. For SLIM, we set the regularization parameters λ = 0.01 and the number of neighbors k = 30. To investigate the effect of clustering models in CF recommendation, we used several


Table III. Performance Comparisons on LF in Terms of MAP, NDCG, and F1 with Group = 10 and 20

                        10 Groups                       20 Groups
Methods             MAP       NDCG@10   F1@10       MAP       NDCG@10   F1@10
UserCF              0.2084    0.2076    0.0800      0.2084    0.2076    0.0800
Single+UserCF       0.2253    0.2217    0.0760      0.1359    0.1547    0.0385
MCoC+UserCF         0.2381    0.2327    0.0816      0.2386    0.2393    0.0809
HMCoC+UserCF        0.2614**  0.2532*   0.0989*     0.2742**  0.2511*   0.1026*
PureSVD             0.2091    0.2419    0.0998      0.2091    0.2419    0.0998
Single+PureSVD      0.2695    0.2772    0.1124      0.1800    0.1763    0.0609
MCoC+PureSVD        0.3147    0.2794    0.1326      0.3208    0.2799    0.1172
HMCoC+PureSVD       0.3308**  0.2845    0.1347      0.3488**  0.2942*   0.1328**
NMF                 0.3189    0.2697    0.1408      0.3189    0.2697    0.1408
Single+NMF          0.2699    0.2585    0.1204      0.1968    0.1893    0.0713
MCoC+NMF            0.3287    0.2674    0.1501      0.3302    0.2713    0.1515
HMCoC+NMF           0.3420**  0.2906**  0.1522      0.3514**  0.2941*   0.1507
SLIM                0.3119    0.2619    0.1307      0.3119    0.2619    0.1307
Single+SLIM         0.2696    0.2627    0.1142      0.2082    0.1877    0.0729
MCoC+SLIM           0.3279    0.2665    0.1396      0.3273    0.2618    0.1422
HMCoC+SLIM          0.3380*   0.2842**  0.1405      0.3401**  0.2857**  0.1436

Bold typeset indicates the best performance. ** indicates statistical significance at p < 0.001. * indicates statistical significance at p < 0.01 compared to the second best.

variant clustering models in combination with UserCF, PureSVD, NMF, and SLIM. The clustering models are as follows:

—Single: In the single-cluster model, we use k-means to cluster users and items with the eigenvectors Z computed according to Equation (12). Each user and item can only belong to one cluster.

—MCoC: This model only uses the rating information in clustering, the same as Xu et al. [2012]. We use this model to investigate whether or not the assumption that users and items belong to multiple groups is more reasonable.

—HMCoC: In this model, besides the rating information, we also use user social network information and item implicit correlations when clustering.

5. EXPERIMENTAL RESULTS AND DISCUSSION

Performance on Last.fm. In LF, the user–item ratings span a wide range; for a user, some artists are listened to just once and some artists are listened to more than 10,000 times. So we rescaled the ratings by R_ij = log_2(R_ij) to reduce the large variance.

In this experimental setting, we set α = 0.3, β = 0.3, and K = ⌈log_2 L⌉. Table III shows the experimental results on the Last.fm dataset with different combinations of CF models and clustering models. From Table III, we observe that MCoC and HMCoC yield better performance under most of the evaluation conditions. This verifies the assumption that recommendation performance can be improved if we consider that users and items can belong to multiple groups. When the number of groups changes from 10 to 20, we can see that the Single+CF models perform better when the number of groups L is small (L = 10) and drop quickly when L increases (L = 20). This result is consistent with some previous clustering CF models [George and Merugu 2005; Sarwar et al. 2002], because a small number of clusters can help filter out irrelevant items or users and reduce noise. Usually, when L becomes large, the number of items in each cluster becomes too small to support recommendations for each user. However, in our model, the performance is more stable and even better when the number of groups L gets bigger. We believe that this is because the fuzzy weights may be more accurate as L increases


Fig. 4. Performance comparisons on different values of N (top-N) with 20 groups.

Table IV. Performance Comparisons on ML1M in Terms of MAP, NDCG, and F1 with Group = 10 and 20

                        10 Groups                       20 Groups
Methods             MAP       NDCG@10   F1@10       MAP       NDCG@10   F1@10
UserCF              0.2918    0.2872    0.0768      0.2918    0.2872    0.0768
Single+UserCF       0.2208    0.3015    0.0470      0.1372    0.2562    0.0228
MCoC+UserCF         0.2919    0.2838    0.0701      0.2790    0.2690    0.0637
HMCoC+UserCF        0.3109*   0.2989*   0.0717      0.2890*   0.2784*   0.0712
PureSVD             0.3870    0.3649    0.1042      0.3870    0.3649    0.1042
Single+PureSVD      0.2655    0.3657    0.0688      0.1731    0.3128    0.0320
MCoC+PureSVD        0.4151    0.3731    0.1199      0.4142    0.3760    0.1256
HMCoC+PureSVD       0.4264*   0.3838*   0.1283      0.4306**  0.3872*   0.1301
NMF                 0.4043    0.3811    0.1203      0.4043    0.3811    0.1203
Single+NMF          0.2753    0.3573    0.0729      0.1987    0.2976    0.0377
MCoC+NMF            0.4121    0.3701    0.1330      0.4137    0.3742    0.1364
HMCoC+NMF           0.4261*   0.3800*   0.1376      0.4256**  0.3839**  0.1403
SLIM                0.4348    0.3767    0.1433      0.4348    0.3767    0.1433
Single+SLIM         0.2812    0.3470    0.0724      0.2017    0.2863    0.0366
MCoC+SLIM           0.4312    0.3664    0.1406      0.4329    0.3747    0.1440
HMCoC+SLIM          0.4335    0.3696    0.1428      0.4346    0.3824    0.1447

Bold typeset indicates the best performance. ** indicates statistical significance at p < 0.001. * indicates statistical significance at p < 0.01 compared to the second best.

and meanwhile the number of items in each group will not drop drastically, since each item can appear in multiple groups. We also observe that our HMCoC model yields the best performance in most cases. This verifies that besides ratings, user social networks and item implicit correlations are both helpful for finding more accurate groups for CF.

In Figure 4, we plot the precision values of different methods when the length of the recommendation list varies (one to 10). We choose a memory-based CF method (UserCF) and a model-based CF method (SVD) as examples. Similar to the results in Table III, the single-cluster model performs worse than the baseline CF methods when the number of groups L = 20. As can be seen, our model consistently outperforms the other methods.

Performance on Movielens-1M. In Movielens, there is no user social relation information, so we set α = 0. The experimental results are shown in Table IV. From Table II, we can see that Movielens-1M is a much denser dataset than Last.fm, so the basic CF models can achieve fairly good performance. We found the results are similar to those on Last.fm; our recommendation framework still performs the best in most situations. The single-cluster model performs very poorly when the number of


Fig. 5. Impact of parameters L and K on Last.fm (a) and Movielens-1M (b).

groups is large. Surprisingly, MCoC and our model have a negative effect on SLIM. The reason may be that in the ML dataset, the number of items is much smaller than the number of users, and SLIM needs a feature selection procedure to learn its parameters.

5.1. Parameter Selection

In this section, we conduct various experiments to investigate how the parameters in our HMCoC model affect the recommendation accuracy. We use PureSVD as our basic CF model; UserCF and NMF have similar results, which are omitted here.

5.1.1. Impact of L and K. In our HMCoC model, L is the number of groups and K is the maximum number of groups a user or an item can belong to (1 ≤ K ≤ L). We conduct experiments on both LF and ML with L varying from two to 20. The impacts of these two parameters on recommendation performance are plotted in Figure 5. We can observe that our model achieves better performance when K is small. This result corresponds with our expectation, because people have diverse but also limited interests. In addition, we find that when L gets bigger, K needs to be bigger to achieve higher MAP. This is because when we divide users and items into more clusters, users and items need to belong to more clusters to keep the clusters big enough to make recommendations. Based on this analysis, we set K = ⌈log_2 L⌉.


Fig. 6. Impact of parameter r on Last.fm (a) and Movielens-1M (b).

Fig. 7. Impact of parameters α and β on Last.fm (a) and Movielens-1M (b).

5.1.2. Impact of r. Parameter r is the number of eigenvectors computed in Equation (12) and also the dimension of the feature vectors in fuzzy c-means clustering. We conduct experiments on both the LF and ML datasets with 10 and 20 groups and plot the MAP results in Figure 6. From the figure, we can see that our recommendation performance is competitive when we use just a few eigenvectors in the fuzzy c-means clustering process. So in our experiments, we select r = 4.

5.1.3. Impact of α and β. Another two important parameters, α and β, control the social-network-constrained user-side clustering and the item-implicit-correlation-constrained item-side clustering. Figure 7 shows how α and β affect the performance of HMCoC on the Last.fm and Movielens-1M datasets. They have similar trends as α and β increase. When α and β are small, they have little effect on the performance, because the information of the user social network and item implicit correlations is largely ignored. When they increase further (>1), the user social network and item implicit information overwhelm the rating information and degrade performance. From Figure 7(a), we also can see that social relations among users contribute more to the recommendation performance than item implicit correlations. When α and β are around 0.5, we obtain the best MAP.

5.2. Discussion

Sparsity. In order to show how our model can alleviate the sparsity problem in CF recommendation, we record the sparsity (the percentage of zero elements in a matrix) of the original rating matrix and the average sparsity of the groups in Table V. In the table, "Random" means each user or item is assigned to multiple groups randomly,


Table V. Sparsity Comparisons on ML1M and LF

                   ML                        LF
           10 Groups   20 Groups     10 Groups   20 Groups
Original   0.9581      0.9581        0.9937      0.9937
Random     0.9785      0.9781        0.9816      0.9814
HMCoC      0.9512      0.9507        0.9536      0.9458

and the number of groups each user and item can belong to is the same as for HMCoC (K = ⌈log_2 L⌉). From the table, we can observe that the sparsity is largely reduced by our HMCoC model. Furthermore, by comparing the average sparsity values in the last two rows of Table V, we can see that our clustering strategy is more effective than the random clustering strategy in alleviating sparsity.
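The sparsity quantities reported in Table V can be computed as in the following sketch; the helper names are ours, and Y is again assumed to stack user rows before item rows.

```python
import numpy as np

def sparsity(R):
    """Fraction of zero (unobserved) entries in a rating matrix."""
    return float((R == 0).sum()) / R.size

def avg_group_sparsity(R, Y, m):
    """Average sparsity over the per-group submatrices R_k (Table V quantity)."""
    vals = []
    for k in range(Y.shape[1]):
        users = np.flatnonzero(Y[:m, k] > 0)
        items = np.flatnonzero(Y[m:, k] > 0)
        if len(users) and len(items):
            vals.append(sparsity(R[np.ix_(users, items)]))
    return float(np.mean(vals))
```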

Scalability. Our recommendation framework includes three main processes, that is, the information fusion process, the hybrid multigroup coclustering process, and the top-n recommendation process. Since information fusion and the clustering process can be done offline, the running time of top-n recommendation is not increased. In our clustering process, the time-consuming parts are the eigenvector computation and fuzzy c-means clustering. However, our matrix M is highly sparse and positive semidefinite, and we require only a few eigenvectors, so the eigenvector computation and fuzzy clustering process are relatively fast. It takes O((m + n)^2) time [Leordeanu and Hebert 2005] to compute the eigenvectors and O((m + n)dc) time [Kolen and Hutcheson 2002] to execute fuzzy c-means clustering, where m and n are the numbers of users and items, respectively, d is the dimension of the features, and c is the number of clusters. Furthermore, there are many existing software packages supporting parallel eigenvector computation and fuzzy clustering for large datasets. In the top-n recommendation process, the original user–item matrix R can be partitioned into much smaller submatrices according to the clustering results. Thus, the CF model can be executed independently on multiprocessing systems; therefore, many CF models can be applied online with very large datasets.

The presented experimental results suggest that it is more reasonable to assume that users and items can belong to multiple groups. Furthermore, integrating additional information resources, such as user social networks and item categories, can help generate better groups for recommendation. However, one problem with our framework is that the groups we get may be unbalanced. In extreme cases, there may exist some groups with only a few items in them. In this situation, one solution is to add several popular items for recommendation. Our experiments are conducted on datasets whose items are in the same domain, such as movies (Movielens) and music (Last.fm). The distinctions between items may be hard to capture using only the categories in DBpedia. Other domain-specific knowledge bases with descriptions of movie plots or of the feelings evoked by music could be further investigated.

6. CONCLUSION

In this article, we proposed a Hybrid Multigroup CoClustering recommendation framework, denoted HMCoC, which extends conventional CF-based recommendation algorithms with a novel clustering method. This framework allows users and items to be clustered into multiple groups. To generate the groups, we employed information from different sources, for example, the rating matrix, user social networks, and a knowledge base, and we represented this information with a unified graph model. In our top-n recommendation process, many traditional rating-based CF models can be used directly without any modification. The experimental results showed that our framework can reduce the sparsity problem and is effective in top-n recommendation on the

ACM Transactions on Intelligent Systems and Technology, Vol. 6, No. 2, Article 27, Publication date: March 2015.

Page 20: A Hybrid Multigroup Coclustering Recommendation Framework ...users.jyu.fi/~swang/publications/TIST15.pdf · A Hybrid Multigroup Coclustering Recommendation Framework Based on Information

27:20 S. Huang et al.

Movielens-1M and Last.fm datasets in terms of MAP, NDCG, and F1. The experimen-tal results on Last.fm also showed that user social relations contribute more to theperformance improvement of our recommendation framework.

In the future, we would like to test our framework on other multidomain datasets such as Epinions9 and Douban10. In addition, we will investigate other clustering methods, such as community topic mining, and find better ways to combine groups and CF algorithms.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the article.

REFERENCES

Gediminas Adomavicius and Alexander Tuzhilin. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17, 6 (2005), 734–749.

Shinhyun Ahn and Chung-Kon Shi. 2009. Exploring movie recommendation system using cultural metadata. In Transactions on Edutainment II. Springer, Berlin, 119–134.

Jiajun Bu, Shulong Tan, Chun Chen, Can Wang, Hao Wu, Lijun Zhang, and Xiaofei He. 2010. Music recommendation by unified hypergraph: Combining social media information and music content. In Proceedings of the 18th International Conference on Multimedia. ACM, 391–400.

Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the 4th International Conference on Recommender Systems. ACM, 39–46.

Mukund Deshpande and George Karypis. 2004. Item-based top-n recommendation algorithms. ACM Transactions on Information Systems 22, 1 (2004), 143–177.

Christian Desrosiers and George Karypis. 2011. A comprehensive survey of neighborhood-based recommendation methods. In Recommender Systems Handbook. Springer US, 107–144.

Inderjit S. Dhillon. 2001. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the 7th International Conference on Knowledge Discovery and Data Mining. ACM, 269–274.

Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, and Davide Romito. 2012a. Exploiting the web of data in model-based recommender systems. In Proceedings of the 6th International Conference on Recommender Systems. ACM, 253–256.

Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, and Markus Zanker. 2012b. Linked open data to support content-based recommender systems. In Proceedings of the 8th International Conference on Semantic Systems. ACM, 1–8.

Thomas George and Srujana Merugu. 2005. A scalable collaborative filtering framework based on co-clustering. In 5th IEEE International Conference on Data Mining. IEEE, 625–628.

Songjie Gong. 2010a. A collaborative filtering recommendation algorithm based on user clustering and item clustering. Journal of Software 5, 7 (2010), 745–752.

Songjie Gong. 2010b. An efficient collaborative recommendation algorithm based on item clustering. In Advances in Wireless Networks and Information Systems. Springer, 381–387.

Thomas Hofmann. 2004. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems 22, 1 (2004), 89–115.

Thomas Hofmann and Jan Puzicha. 1999. Latent class models for collaborative filtering. In Proceedings of the 16th International Joint Conference on Artificial Intelligence. ACM, 688–693.

Zan Huang, Daniel Zeng, and Hsinchun Chen. 2007. A comparison of collaborative-filtering recommendation algorithms for e-commerce. IEEE Intelligent Systems 22, 5 (2007), 68–78.

Mohsen Jamali and Martin Ester. 2010. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the 4th International Conference on Recommender Systems. ACM, 135–142.

9 http://www.epinions.com/.
10 http://www.douban.com/.


Meng Jiang, Peng Cui, Rui Liu, Qiang Yang, Fei Wang, Wenwu Zhu, and Shiqiang Yang. 2012. Social contextual recommendation. In Proceedings of the 21st International Conference on Information and Knowledge Management. ACM, 45–54.

John F. Kolen and Tim Hutcheson. 2002. Reducing the time complexity of the fuzzy c-means algorithm. IEEE Transactions on Fuzzy Systems 10, 2 (2002), 263–267.

Yehuda Koren. 2008. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th International Conference on Knowledge Discovery and Data Mining. ACM, 426–434.

Marius Leordeanu and Martial Hebert. 2005. A spectral technique for correspondence problems using pairwise constraints. In Proceedings of the 10th IEEE International Conference on Computer Vision, Vol. 2. IEEE, 1482–1489.

Kenneth Wai-Ting Leung, Dik Lun Lee, and Wang-Chien Lee. 2011. CLR: A collaborative location recommendation framework based on co-clustering. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 305–314.

Jacek Lkeski. 2003. Towards a robust fuzzy clustering. Fuzzy Sets and Systems 137, 2 (2003), 215–233.

J. K. L. MacDonald. 1933. Successive approximations by the Rayleigh-Ritz variation method. Physical Review 43, 10 (1933), 830–833.

Paolo Massa and Paolo Avesani. 2007. Trust-aware recommender systems. In Proceedings of the 1st International Conference on Recommender Systems. ACM, 17–24.

Stuart E. Middleton, David De Roure, and Nigel R. Shadbolt. 2004. Ontology-based recommender systems. In Handbook on Ontologies. Springer, Berlin, 477–498.

Andriy Mnih and Ruslan Salakhutdinov. 2007. Probabilistic matrix factorization. In Advances in Neural Information Processing Systems. MIT Press, 1257–1264.

Xia Ning and George Karypis. 2011. SLIM: Sparse linear methods for top-n recommender systems. In 11th IEEE International Conference on Data Mining. IEEE, 497–506.

Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, and Roberto Mirizzi. 2013. Top-n recommendations from implicit feedback leveraging linked open data. In Proceedings of the 7th ACM Conference on Recommender Systems. ACM, 85–92.

Jing Peng, Daniel Dajun Zeng, Huimin Zhao, and Fei-yue Wang. 2010. Collaborative filtering in social tagging systems based on joint item-tag recommendations. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management. ACM, 809–818.

Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2000. Application of dimensionality reduction in recommender system - a case study. In Proceedings of the ACM WebKDD Web Mining for E-Commerce Workshop.

Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web. ACM, 285–295.

Badrul M. Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2002. Recommender systems for large-scale e-commerce: Scalable neighborhood formation using clustering. In Proceedings of the 5th International Conference on Computer and Information Technology, Vol. 1.

Badrul M. Sarwar, Joseph A. Konstan, Al Borchers, Jon Herlocker, Brad Miller, and John Riedl. 1998. Using filtering agents to improve prediction quality in the GroupLens research collaborative filtering system. In Proceedings of the 12th ACM Conference on Computer Supported Cooperative Work. ACM, 345–354.

Daniel D. Lee and H. Sebastian Seung. 2000. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems. MIT Press, 556–562.

Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Tag recommendations based on tensor dimensionality reduction. In Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, 43–50.

Ulrike Von Luxburg. 2007. A tutorial on spectral clustering. Statistics and Computing 17, 4 (2007), 395–416.

Jun Wang, Arjen P. De Vries, and Marcel J. T. Reinders. 2006. Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In Proceedings of the 29th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 501–508.

Bin Xu, Jiajun Bu, Chun Chen, and Deng Cai. 2012. An exploration of improving collaborative recommender systems via user-item subgroups. In Proceedings of the 21st International Conference on World Wide Web. ACM, 21–30.

Gui-Rong Xue, Chenxi Lin, Qiang Yang, WenSi Xi, Hua-Jun Zeng, Yong Yu, and Zheng Chen. 2005. Scalable collaborative filtering using cluster-based smoothing. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 114–121.


Lijun Zhang, Chun Chen, Jiajun Bu, Zhengguang Chen, Deng Cai, and Jiawei Han. 2012. Locally discriminative coclustering. IEEE Transactions on Knowledge and Data Engineering 24, 6 (2012), 1025–1035.

Xi Zhang, Jian Cheng, Ting Yuan, Biao Niu, and Hanqing Lu. 2013. TopRec: Domain-specific recommendation through community topic mining in social network. In Proceedings of the 22nd International Conference on World Wide Web. 1501–1510.

Zi-Ke Zhang, Tao Zhou, and Yi-Cheng Zhang. 2010. Personalized recommendation via integrated diffusion on user–item–tag tripartite graphs. Physica A: Statistical Mechanics and its Applications 389, 1 (2010), 179–186.

Received November 2013; revised July 2014; accepted September 2014