Supervisor: Associate Prof. Jiuyong Li (John). Student: Kang Sun. Date: 28th May 2010.

Filling in Values for Unrated Items in Collaborative Filtering

Outline

• Introduction
• Motivations
• Related work
• Experiments
• Conclusion

Introduction

• Friends and neighbours used to be the main source of recommendations.
• Typical recommendations from friends:
a) the best café in the local area
b) the best book on a particular topic

Motivation

• Find a more reliable and accurate solution.
• A large database is supposed to help users get more accurate results; however, once recommendation moves online, similar users become hard to find.

Research question

• How can a framework be built to improve prediction accuracy on recommendation data sets?

Dilemma

• The data stored in a large online recommendation database normally contains many unrated items.
• Unrated items can affect the recommendation results.

Related work

Sparse Matrix Prediction Filling in Collaborative Filtering [Liu et al. 2009b]
• Develops an approach to overcome the sparsity problem in user-based and item-based collaborative filtering
• Similarity computation based on the Boolean matrix

Related work

Effective Missing Data Prediction for Collaborative Filtering [Ma, King & Lyu 2007]
• Combines user information and item information to give better performance

Related work

A Hybrid User and Item-based Collaborative Filtering with Smoothing on Sparse Data [Rong & Yansheng 2006]
• A framework to alleviate the sparsity problem
• In their experiments, smoothing increased the quality of recommendation

Research data sets

• The whole Jester data set has 617,000 ratings of 100 jokes by 24,900 users. Ratings range from −10 to +10. The whole rating matrix is filled to about 25%.
• This research used part 1 of the three parts: data from 24,983 users who have rated 36 or more jokes, a matrix with dimensions 24983 × 101.
• The data set is in .CSV format.
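The "filled to about 25%" figure is a fill rate: rated cells divided by all cells. The following is a minimal Python sketch of that calculation, not the author's code. It assumes unrated cells carry the marker 99 (as the pre-processing slide states); the function name `fill_rate` and the tiny sample matrix are hypothetical.

```python
UNRATED = 99.0  # marker the Jester files use for "no rating"


def fill_rate(rows, unrated=UNRATED):
    """Return the fraction of matrix cells that hold a real rating."""
    total = sum(len(row) for row in rows)
    rated = sum(1 for row in rows for value in row if value != unrated)
    return rated / total if total else 0.0


# Hypothetical 2 x 4 sample for illustration only, not taken from the data set.
sample = [
    [99.0, 8.35, 99.0, 1.8],
    [99.0, 99.0, 99.0, 6.21],
]
print(fill_rate(sample))  # 3 rated cells out of 8 -> 0.375
```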

Data pre-processing

• The Jester data sets use 99 to represent an unrated value.
• The first step is to change all the unrated values to 0.
• The second step, the most important part of this research, is to predict the necessary unrated values for future prediction generation.
• All data processing was done in R.
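The first pre-processing step can be sketched as follows. The original work was done in R; this Python version is only an illustration under stated assumptions: the file is plain comma-separated numbers, and the names `load_ratings`, `unrated` and `fill` are hypothetical. The sample values are borrowed from the data set preview.

```python
import csv
import io

UNRATED = 99.0  # marker for an unrated joke


def load_ratings(csv_text, unrated=UNRATED, fill=0.0):
    """Parse CSV text into rows of floats, mapping the unrated marker to `fill`."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        values = [float(cell) for cell in record]
        rows.append([fill if v == unrated else v for v in values])
    return rows


# Two short rows using values that appear in the preview.
sample_csv = "99,8.35,99,99,1.8\n8.5,4.61,-4.17,-5.39,1.36\n"
cleaned = load_ratings(sample_csv)
print(cleaned[0])  # [0.0, 8.35, 0.0, 0.0, 1.8]
```

Because 0 also lies inside the −10 to +10 rating range, an implementation may want to keep a separate record of which cells were originally unrated.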

Data set preview

-7.82   8.79  -9.66  -8.16  -7.52  -8.5   -9.85   4.17  -8.98  -4.76
 4.08  -0.29   6.36   4.37  -2.38  -9.66  -0.73  -5.34   8.88   9.22
99     99     99     99      9.03   9.27   9.03   9.27  99     99
99      8.35  99     99      1.8    8.16  -2.82   6.21  99      1.84
 8.5    4.61  -4.17  -5.39   1.36   1.6    7.04   4.61  -0.44   5.73
-6.17  -3.54   0.44  -8.5   -7.09  -4.32  -8.69  -0.87  -6.65  -1.8
99     99     99     99      8.59  -9.85   7.72   8.79  99     99
 6.84   3.16   9.17  -6.21  -8.16  -1.7    9.27   1.41  -5.19  -4.42
-3.79  -3.54  -9.42  -6.89  -8.74  -0.29  -5.29  -8.93  -7.86  -1.6
 3.01   5.15   5.15   3.01   6.41   5.15   8.93   2.52   3.01   8.16
-2.91   4.08  99     99     -5.73  99      2.48  -5.29  99      1.46
 1.31   1.8    2.57  -2.38   0.73   0.73  -0.97   5     -7.23  -1.36
99     99     99     99      5.87  99      5.58   0.53  99      7.14
 9.22   9.27   9.22   8.3    7.43   0.44   3.5    8.16   5.97   8.98
 8.79  -5.78   6.02   3.69   7.77  -5.83   8.69   8.59  -5.92   7.52
-3.5    1.55   2.33  -4.13   4.22  -2.28  -2.96  -0.49   2.91   1.99
99     -9.27  99     99     -7.38  99      8.74  -6.31  99      2.33
 3.16   7.62   3.79   8.25   4.22   7.62   2.43   0.97   0.53   0.83
 4.22   3.64  99     99      2.52  99      4.13  -5.19  99      7.91
99      7.62  99     99     -8.64   2.43   8.93  -6.6   99     -9.47
 2.57  -0.73  99     99      2.57  99     -4.22   2.67  99     -1.31
 7.28   5.39  99     99     -4.22  99      8.93   3.5   99      6.12

Data processing approach

        Joke1  Joke2  Joke3  Joke4  Joke5  Joke6  Joke7  Joke8
User 2    0      0      1      3      4      3      0      0
User 3    0      0      0      4      5      4      0      0
User 4    0      0      2      3      3      2      0      0

Manhattan distance measure is applied

Data processing

• The distance between user 2 and user 3 is four.
• The distance between user 2 and user 4 is three.
• User 4 appears closer to user 2 than user 3 is.
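The two distances quoted above can be checked with a short Python sketch of the Manhattan distance (the sum of absolute differences over all jokes). The rating vectors are copied from the table; the function name `manhattan` is an assumption for illustration and not the author's R code.

```python
def manhattan(a, b):
    """Sum of absolute rating differences between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("rating vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


# Ratings for Joke1..Joke8, with unrated items already set to 0.
user2 = [0, 0, 1, 3, 4, 3, 0, 0]
user3 = [0, 0, 0, 4, 5, 4, 0, 0]
user4 = [0, 0, 2, 3, 3, 2, 0, 0]

print(manhattan(user2, user3))  # 4
print(manhattan(user2, user4))  # 3
```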

Data processing approach (cont'd.)

        Joke1  Joke2  Joke3  Joke4  Joke5  Joke6  Joke7  Joke8
User 2    0      0      1      3      4      3      0      0
User 3    0      0      1      4      5      4      0      0
User 4    0      0      2      3      3      2      0      0

Data processing

• The distance between user 2 and user 3 is three.
• The distance between user 2 and user 4 is three.
• Users 3 and 4 are at the same distance from user 2.

Measurement of accuracy

• Relative squared error is used to compute the accuracy.
• Traditional CF accuracy for joke 3: Accuracy = 1 − (1 − 2)² / 1 = 0%
• Current approach accuracy: Accuracy = 1 − (1 − 1.5)² / 1 = 75%
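The arithmetic on this slide can be reproduced with a small Python sketch. It follows the expression exactly as written, 1 − (actual − predicted)² / denominator, with the actual rating 1 and the predictions 2 and 1.5 taken from the slide. The slide does not say what the denominator of 1 represents, so the parameter name `scale` is a hypothetical placeholder rather than a definition of relative squared error.

```python
def accuracy(actual, predicted, scale=1.0):
    """Evaluate 1 - (actual - predicted)^2 / scale, as written on the slide."""
    return 1.0 - (actual - predicted) ** 2 / scale


print(accuracy(1, 2))    # traditional CF: 0.0
print(accuracy(1, 1.5))  # current approach: 0.75
```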

User similarity comparison

[Chart: User 2, User 3 and User 4 compared across Joke 1 to Joke 8]

Conclusion

• Heavy computation force
• Methods for both unrated value and missing

References

Liu, Z, Wang, H, Qu, W, Liu, W & Fan, R 2009b, Sparse Matrix Prediction Filling in Collaborative Filtering, IEEE Computer Society, pp. 304-307.

Ma, H, King, I & Lyu, MR 2007, Effective missing data prediction for collaborative filtering, ACM, Amsterdam, The Netherlands, pp. 39-46.

Rong, H & Yansheng, L 2006, 'A Hybrid User and Item-Based Collaborative Filtering with Smoothing on Sparse Data', paper presented at the Artificial Reality and Telexistence--Workshops, 2006. ICAT '06. 16th International Conference on, Nov. 2006.
