
Eindhoven University of Technology

MASTER

Hybrid recommendation systems combining user-preferences with domain-expert knowledge

Tufis, V.

Award date:2014


Disclaimer

This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration.

General rights

Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.

• You may not further distribute the material or use it for any profit-making activity or commercial gain.

https://research.tue.nl/en/studentthesis/hybrid-recommendation-systems-combining-userpreferences-with-domainexpert-knowledge(1cbe376a-c2e2-4732-9e5b-ddfedcd83dfb).html

Vlad Tufis

Hybrid recommendation systems combining user-preferences with domain-expert knowledge

School of Science

Thesis submitted for examination for the degree of Master of Science in Technology.

Espoo 30.06.2014

Thesis supervisor:

Prof. Heikki Saikkonen

Thesis advisor:

Lic. Tech. Håkan Mitts

Aalto University School of Science

Abstract of the Master's Thesis

Author: Vlad Tufis

Title: Hybrid recommendation systems combining user-preferences with domain-expert knowledge

Date: 30.06.2014 Language: English Number of pages: 10+69

Department of Computer Science and Engineering

Professorship: Software Systems Code: T-106

Supervisor: Prof. Heikki Saikkonen

Advisor: Lic. Tech. Håkan Mitts

The ever-growing popularity and adoption of smartphones into everyday life has transformed these devices from mere productivity tools into true real-life companions. However, by staying connected all the time, users generate large quantities of data, which in turn overwhelm them at later points in time, making the task of choosing the right piece of information in an optimal way virtually impossible. One solution to this problem is to equip applications with intelligent modules able to filter out non-relevant information and present highly focused information guaranteed to be relevant for the end-user. This work will focus on one subclass of such intelligent modules: recommendation systems. Recommendation systems can help achieve an efficient filtering of large amounts of information and match it against a previously inferred user profile. However, there are two aspects worth noting: first, from a technical standpoint, there is the challenge of understanding the domain of applicability and making the right design decisions that will yield a higher accuracy for the RS which, in turn, will lead to increased user satisfaction; second, from a business perspective, an increasing number of economic agents realize that having an intelligent algorithm as part of their value proposition might be the difference between success and failure. The question in this situation is therefore “how to design an intelligent module such that it will stand out from the crowd while providing accurate and valuable information?” In particular, this work will focus on recommendation systems that combine user preferences with domain-expert knowledge. As an operating domain I chose a mixture of Fitness and Occupational Health and Wellbeing, and I will provide answers to the following questions: (1) How can domain-expert information be used to enhance user-preference based recommendations? (2) What user benefits can be achieved by augmenting preference-based recommendations with domain-expert information?

Keywords: recommendation system, expert system, domain-expert, knowledge, hybrid, mobile, health


Hosting Institutions and Organizations

EIT ICT Labs

EIT ICT Labs is an initiative of the European Union, implemented by the European Institute of Innovation and Technology to establish Knowledge and Innovation Communities. EIT ICT Labs creates a platform to bring together researchers, members of academia and business people in order to drive European leadership in ICT innovation for economic growth and quality of life.

EIT ICT Labs offers higher education in ICT, integrated with innovation and entrepreneurial education through its Doctoral School, Master School, Open School and Summer School. The EIT ICT Labs Master School¹ is a joint initiative of the leading technical universities and business schools in Europe, coupled with mentoring and partnering from leading European research organizations and business partners. The Master School offers two-year educational programs along seven technical majors. Each program includes a minor in Innovation & Entrepreneurship, and features geographical mobility between the first and the second year of studies. Also, a winter school, a summer school and an internship in a company are compulsory elements of the master program.

Eindhoven University of Technology

Eindhoven University of Technology is a top university in the Netherlands, known worldwide for the quality of its educational programs and research activities. It is ranked 106th in the world according to the Times Higher Education World University Rankings 2013-2014, the best Dutch engineering and science university by the Study Guide to Universities 2013, and the best university in the Netherlands according to the weekly magazine Elsevier.

Eindhoven University of Technology was the entry-point university in this Master program, providing fundamental education in Business Information Systems, Business Process Management Systems, Introduction to Services, Innovation and Entrepreneurship.

Aalto University

Aalto University is a new university founded in 2010, but with centuries of experience. Aalto University was created from the merger of three top Finnish universities, the Helsinki School of Economics, Helsinki University of Technology and the University of Art and Design Helsinki, to encompass and stimulate new joint research and teaching programs.

Aalto School of Science and Technology is located in Otaniemi, the largest technology, innovation and business hub in Finland and in Northern Europe with respect to the number of companies and R&D centers located in the area. Through its close connections with industry, Aalto University provides students with excellent research and entrepreneurial opportunities.

¹ http://www.eitictlabs.eu/education/master-school/

Aalto School of Science was the exit-point university in this Master program, providing education in the areas of Digital Services, Smart Spaces, Multimedia and Mobile Services.

Framgo

Framgo is a Finnish start-up founded in September 2012, operating in the domain of occupational health and wellbeing².

Framgo provided the necessary setup for completing the internship required by the EIT ICT Labs Master School.

² http://www.framgo.com


Acknowledgement

To my family, who has constantly supported me throughout my studies. To my girlfriend, for accepting to spend most of the last two years apart. You were the driving force that motivated me to complete this program. Thank you!

I would like to thank Prof. Heikki Saikkonen and especially Håkan Mitts for their valuable feedback and the insightful conversations we had during the preparation of this thesis. A special thank you goes to Prof. Mykola Pechenizkiy at Eindhoven University of Technology, for providing me with the theoretical foundation needed to complete this thesis.

Last, but not least, I would like to thank all my colleagues from the past two years. You were part of my learning experience; I had a wonderful time with you and I am very happy to have known you.

Otaniemi, 30.06.2014

Vlad Tufis


Contents

Abstract
Hosting Institutions and Organizations
Acknowledgement
Contents
Abbreviations

1 Introduction
  1.1 Problem description
  1.2 Research questions
  1.3 Thesis scope
  1.4 Thesis structure

2 Theoretical Background
  2.1 Recommendation systems basics
  2.2 Personalization process
  2.3 Similarity measures and distance metrics
    2.3.1 Utility matrix
    2.3.2 Jaccard index
    2.3.3 Cosine similarity
    2.3.4 Euclidean distance
  2.4 Content-based filtering
    2.4.1 A basic architecture
    2.4.2 Content-based recommendation advantages
    2.4.3 Examples from literature
  2.5 Collaborative filtering
    2.5.1 A basic architecture
    2.5.2 User-User filtering
    2.5.3 Item-Item filtering
    2.5.4 Collaborative filtering advantages
    2.5.5 Examples from literature
  2.6 Demographic filtering
    2.6.1 Examples from literature
  2.7 Common problems and limitations of recommendation systems
    2.7.1 Over-specialization
    2.7.2 Limited content analysis
    2.7.3 Cold-start (new-user)
    2.7.4 Cold-start (new-item)
    2.7.5 Serendipity
    2.7.6 Shilling-attacks
    2.7.7 Gray sheep
  2.8 Hybrid filtering
    2.8.1 Classic examples of hybridization
    2.8.2 “Exotic” hybrid approaches
  2.9 Summary

3 Expert systems
  3.1 Expert systems architecture
  3.2 The knowledge acquisition process
  3.3 Limitations and pitfalls
    3.3.1 Choosing the right problem
    3.3.2 Collaborating with the domain-expert
    3.3.3 Liability issues
  3.4 Combining recommendation systems and expert systems
  3.5 Summary

4 OmaTauko - Concept Description
  4.1 OmaTauko - Concept description
  4.2 Domain model

5 System Design and Implementation Details
  5.1 Motivation for a domain-expert enhanced recommendation system
  5.2 Domain expert involvement
  5.3 System description
    5.3.1 Objective
    5.3.2 Choosing the similarity measure
    5.3.3 Choosing the similarity threshold
    5.3.4 Oskar architecture

6 System Evaluation
  6.1 Experiment design
  6.2 Results and discussion

7 Conclusion and Future Work
  7.1 Conclusion
  7.2 Future work

8 Appendix - Survey questions


List of Figures

1 Personalization process
2 Utility matrix
3 Content based recommendation architecture
4 Collaborative filtering architecture
5 Collaborative filtering architecture - utility matrix
6 The long tail
7 Demographic filtering architecture
8 Expert systems architecture
9 Application starting screen
10 Break configuration
11 Performing an exercise
12 Monthly statistics view
13 Overall statistics view
14 Scheduling reminders
15 User details
16 OmaTauko - Domain model
17 Domain-expert defined relative weights
18 Similarity metric comparison, t = <m, d, c>
19 Similarity metric comparison, t = <m, m, m, m, d, d, d, d, c, c, c>
20 Similarity metric comparison, t = <m1, m2, m3, m4, d, c>
21 Heat-chart map of similarities, wd1, τ = 0.8
22 Oskar architecture
23 Distribution of skipped tasks over break duration
24 Histogram of recommended/skipped exercises
25 Recommendation system success from a user's standpoint
26 Questionnaire results


List of Tables

1 Long tail data source
2 Recommendation systems - overview
3 Recommendation systems vs. Expert systems


Abbreviations

Abbreviation   Explanation
RS             Recommendation system
CBR            Content-based recommendation
CF             Collaborative filtering
UM             Utility matrix
ES             Expert system
KE             Knowledge engineer
KB             Knowledge base
tf-idf         Term frequency - inverse document frequency
UX             User experience
UI             User interface


1 Introduction

Due to the constant advancements in technology over the last 20 years, people are now able to stay connected to the Internet from virtually any place, at any point in time. Whether they are using their PCs, laptops or mobile devices, they have constant access to the Internet and are able to consume a wide variety of information.

In particular, the adoption of mobile technologies in everyday life has impacted every aspect of how people communicate and run their businesses. In the U.S. alone, mobile device sales are expected to reach $215 million by the end of 2016, an increase of 25% compared to 2009, while revenue from mobile data usage is expected to reach $180 billion, a staggering 85% increase over the same period in the same market [11].

The mobile phone is no longer a device used only to make phone calls and send text messages, but rather a facilitator for interactivity and information sharing, a true real-life companion. The multitude of sensors with which a mobile phone is nowadays equipped, coupled with the forward thinking and creativity of entrepreneurs, has led to the creation of millions of applications in a myriad of domains ranging from orientation, shopping, touristic activities, fitness and workout, up to more sensitive domains like healthcare or automotive (e.g., augmented reality GPS software), where security, privacy and precision are highly important.

1.1 Problem description

However, by always staying connected, people generate an immense amount of data. For example, 90% of the data currently available has been generated in the past two years alone [14]. This process of constantly generating more and more data has led to a big problem for the user: information overload. To alleviate this problem, researchers and business people have proposed and successfully deployed a number of information filtering techniques; search engines, automated information retrieval techniques (web crawlers) and recommendation systems all address the same problem of information overload.

This work will focus on a particular class of information filtering techniques, namely recommendation systems. Recommendation systems have been a hot area of research in recent years and have been successfully deployed in a variety of domains to boost business value. Typical areas include recommendations of movies, music, books or research articles, with large vendors like Netflix, Amazon or eBay investing significant resources in developing state-of-the-art recommendation systems that would improve user experience, customer satisfaction and customer retention, and ultimately increase revenues.

An equally hot topic in recent years, growing more popular together with the proliferation of mobile devices, is mHealth. mHealth is the term coined for “medical and public health care practice supported by mobile devices, such as mobile phones, patient monitoring devices, PDAs, and other wireless devices” [11]. According to the same report, 31% of cell phone users have used their phones to search for health information. With an increase of 134% over one year (from 2010 to 2011), reaching 18.5 million mobile users looking for health information (personal health, fitness, wellness and information on health services), this category is the one gaining most popularity in the mobile users segment. Furthermore, as of April 2012, 40000 mHealth apps are available across all major mobile platforms (iOS, Android, Windows Phone), with 70% of them targeted at laymen and everyday use [11].

However, more is not always better. As Azar et al. claim in [4], many mobile apps fall short of incorporating evidence-based content and theory-based strategies that would lead to behavioral changes in the health habits of mobile users. The most important reason for this failure is that, in their hunt for a larger user base, app vendors focus more on the social aspect of the app (transforming the app into a social game, sharing results with the community, performing actions which are popular in the community, getting approved by the community, etc.). I consider that the focus of such apps should rather be on what people need to do, instead of what people would like to do; since users typically do not know how to articulate a need, especially in a very specific domain such as health care, domain-expert help is much needed. Costabile et al. [8] define a domain-expert as a professional with extensive knowledge in a specific domain. Examples of domain-experts include accountants in the domain of accountancy, lawyers in the domain of law, automotive engineers in the automotive industry, medics in healthcare and personal trainers in fitness.

Additionally, the financial implications of focusing on treatment rather than prevention are considerable. Health care costs amount to 17.9% of the gross domestic product (GDP) in the US, 8-9% in Europe and 4.5% in China [1]. Moreover, the cost of treating chronic diseases represents 75% of all health care costs [1]. The need to prevent instead of treat is logical and obvious.

Mobile technology has the potential to solve at least part of this pressing problem by preventing acute problems (such as neck and back pain, stress, obesity) from developing into chronic ones. By integrating domain-expert knowledge (fitness training, nutrition advice, weight-loss advice, etc.) into information systems, coupling it with the power of recommendation systems to guide users through a vast universe of information, and using mobile phones as the delivery platform, end-users would be able to gain access to a wealth of high-quality information regardless of their location or the location of the domain-expert.

1.2 Research questions

A solution for the problem I intend to tackle can be provided by answering the following research questions:

1. How can domain-expert information be used to enhance user-preference based recommendation?

In order to answer this question, I will implement a recommendation system that mixes domain-expert knowledge with previously collected user preferences. The domain-expert knowledge will be provided by a fitness personal trainer, while user preferences are collected continuously via a mobile application. Sections 5.2 and 5.3 will describe in detail the two information sources I intend to use, as well as the architecture of the recommendation system.

2. What user benefits can be achieved by augmenting preference-based recommendations with domain-expert information?

The answer to this question will consist of an evaluation of the recommendation system along two major coordinates. An objective measurement will assess the performance of the recommendation system by counting the number of successful recommendations explicitly indicated by users through a survey. A subjective measurement will evaluate the benefits of the proposed recommendation system along several dimensions. Section 6 will detail the methodology employed for the evaluation, the results and the accompanying discussion.

1.3 Thesis scope

This work is focused on the implementation and evaluation of a recommendation system that leverages the knowledge of a domain-expert in order to provide better-tailored recommendations for its users. The domain chosen for building the recommendation system is occupational health and well-being. The decision is based on the fact that this work was performed during an internship at a technology-based Finnish startup which operates in this domain.

The commercial nature of the product which will benefit from this recommendation system made it impossible to test the recommendation system in a real-life scenario. Instead, in order to evaluate the system, a survey that emulates the behavior of the real system was designed, and 35 participants were included in the evaluation. Most of the users came into contact with the recommendation system and the product for the first time when filling in the survey; therefore, a short description of the product and a use-case were provided to each participant in order to help them better understand the scenario.

1.4 Thesis structure

The first section presents a short introduction to the topic, the problem formulation and the research questions which are addressed.

Sections 2 and 3 provide the theoretical background of this thesis and define the key concepts. First, the implications and benefits of the personalization process are described. Then, the focus moves to presenting the notion of a recommender system, typical algorithms used to provide recommendations and related terminology. Finally, a brief overview of expert systems and their architecture is offered. Throughout both chapters the existing body of literature is used to provide various examples of the notions being defined.

Section 4 presents the current state of OmaTauko, a commercial product-service system which enables end-users to take and track micro-breaks aimed at decreasing their musculoskeletal problems and increasing their energy levels.


Section 5 presents the design and implementation details of Oskar, a recommendation system which blends user preferences and domain-expert knowledge to recommend physical exercises that fit the user's profile.

Section 6 presents the experiment design and data collection methods, and discusses the obtained results, as well as the limitations of the proposed approach. Finally, section 7 is reserved for presenting future directions of research and a conclusion of this study.


2 Theoretical Background

This chapter explains the key concepts present in this work. First, a definition of the recommendation system is provided. Next, the personalization process is presented and some popular metrics used for measuring similarity between the entities participating in such a system are introduced. Finally, the main approaches to providing recommendations are detailed, before discussing some of the problems and limitations recommendation systems have to face.

2.1 Recommendation systems basics

The existing body of literature provides various definitions for a recommendation system. For example, Meteren et al. [33] define RSs as a special type of information filtering system, where information filtering is concerned with selecting, from a large collection, a subset of items which is likely to be deemed interesting and useful by the user. Similar to [33], Bogers et al. [5] define RSs as a class of personalized information filtering technologies whose aim is to identify which items from a catalog are likely to be of interest to a user.

In [6] a recommendation system is described as a solution for the information overload problem; it entails delivering personalized information services which ideally will help the user in selecting the desired item. Ricci [26] adopts a similar definition of the RS, as an information provider and facilitator in the decision-making process.

In [20], the author argues that a recommendation system makes use of justifications to recommend products to customers and ensure the customers like those products. The justifications can be obtained from preferences specified explicitly or induced from past user behavior.

This work will use the definition provided by Semeraro et al. [19], according to which recommendation systems are one way to guide a user through a large space of possible alternatives, towards those items considered to be of interest. They involve predicting user responses to various options and suggesting those options to the end-user based on either her own or the community's past behavior. In that sense, recommendation systems are both an information-filtering and a personalization technique.

Based on the type of information used for computing the predictions, recommendation systems can be classified in two major categories: content-based, which examine the properties of the recommended items, or collaborative, which exploit the similarities between users or items [18]. Both methods have strengths as well as weaknesses, and the adoption of one over the other depends on the context of the application, as well as the type of data they make use of.

A smaller category of recommendation systems leverages the demographic information of users (age, gender, location, income) in order to provide demographic-based recommendations. Also, in order to compensate for the weaknesses of content-based filtering and collaborative filtering, it is common practice to merge the two methods in order to obtain a hybrid recommendation system.


However, before discussing the different flavors of recommendation techniques, the personalization process is presented, followed by some definitions of the popular metrics used in this domain.

2.2 Personalization process

Adomavicius and Tuzhilin [2] define personalization as the tailoring of offerings (content, services, product recommendations) from providers to consumers based on existing knowledge about their preferences and behavior. The personalization activity is typically performed with particular goals in mind, such as improving the customer's experience when interacting with a product/service, increasing customer retention and satisfaction or increasing sales.

Personalization is an information-intensive activity that involves operating on large data sets, rapid data collection and processing of large volumes of information; also, the results of the analysis have to be quickly actionable. For these reasons, personalization is better suited for the online world than for the offline one.

In [2] the personalization activity is viewed as a process consisting of three main stages. First, the provider needs to understand the customer through data collection and analysis. The output of this stage is a comprehensive repository of user profiles storing information about their behavior on the online platform. In the second stage, the information stored in the profiles is matched against certain rules in order to deliver personalized offerings to the users. The final stage involves measuring the impact of personalization and adjusting the personalization strategy to better suit the customers upon their next visit to the platform. A representation of the personalization process is illustrated in Fig. 1.

Figure 1: Personalization process - adapted from [2]

A similar approach is presented in [34], which breaks down the personalization process into five steps: user identification, user data collection, user data interpretation, deciding upon the personalization itself and adaptation to the new context. On top of the definition provided in [2], the authors of [34] emphasize the importance of including the user in all the stages of the personalization process and identify a set of issues that impact a user's experience when interacting with a personalization system:


• predictability - the user must be able to predict the outcomes of her actions before the new content is generated

• comprehensibility - the user must be able to understand how she is being modeled by the system and how the personalization process works

• controllability - stemming from predictability, a user should be able to control her user model and what content will be generated

• unobtrusiveness - the user can complete her task using the information system without being distracted by the personalization process

• privacy - the user should not have the feeling that the user model violates her privacy

• breadth of experience - the user should not be prevented from discovering new items using the information system; that is, the personalization process does not develop only in one direction

• system competence - the user must not have the feeling that the system generates faulty recommendations or that the user model is built in a faulty manner [34].

As presented in [2], the process starts with the task of data acquisition from various sources. These include explicit user actions (such as filling in a profile, rating various items, bookmarking websites, liking pages) or implicit actions coupled with various heuristics to interpret them (e.g., spending more than t seconds on a web page means that the user deems it relevant for her needs; watching a video until the end means that the user has enjoyed it).
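The two implicit-feedback heuristics mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record types `PageVisit` and `VideoPlay`, their field names and the default threshold of 30 seconds are hypothetical and do not come from [2].

```python
from dataclasses import dataclass

# Hypothetical value for the threshold t; the text does not fix it.
DWELL_THRESHOLD_SECONDS = 30.0


@dataclass
class PageVisit:
    """One observed visit of a user to a web page (illustrative record)."""
    user_id: str
    page_id: str
    seconds_on_page: float


@dataclass
class VideoPlay:
    """One observed playback of a video by a user (illustrative record)."""
    user_id: str
    video_id: str
    seconds_watched: float
    video_length_seconds: float


def page_is_relevant(visit: PageVisit, t: float = DWELL_THRESHOLD_SECONDS) -> bool:
    """Heuristic from the text: more than t seconds on a page implies relevance."""
    return visit.seconds_on_page > t


def video_was_enjoyed(play: VideoPlay) -> bool:
    """Heuristic from the text: watching until the end implies enjoyment."""
    return play.seconds_watched >= play.video_length_seconds
```

In practice the threshold t would have to be tuned per platform; the sketch only shows how an implicit action is turned into a binary preference signal.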

Once the data is collected, it needs to be structured into user profiles. User profiles are matched with item profiles to determine the extent to which a set of items fits the user's needs and wants. Matching techniques include recommendation systems, rule-based or case-based expert systems and statistics-based approaches [2]. The matching process generates lists or sets of relevant items. The list may be ordered by relevance or predicted rating, or simply left unsorted. An explanation of why the delivered items are considered relevant is usually included.

The final stage involves measuring the impact of the personalization. To this end, metrics should be defined and performance goals set (e.g., the accuracy of the predictor has to be at least 80%, meaning that 80% of the predicted ratings were estimated correctly). The information collected in the measurement stage can then be used as feedback and integrated at each stage of the personalization process to improve the performance or refine the behavior of each component.

2.3 Similarity measures and distance metrics

Before moving on to describe the various categories of recommendation systems, I will first summarize some of the popular notions and metrics used in this domain; namely, the notion of utility matrix, Jaccard index, cosine similarity and Euclidean distance will be considered.

2.3.1 Utility matrix

The utility matrix (UM) (Fig. 2) captures the preference relationship between a user and an item. One dimension of the matrix represents the users of the system; the other dimension represents the items present in the system, which should be recommended to the users.

I1 I2 I3 I4 I5 I6U1 1 5 4U2 2 3U3 1 5 3 5U4 4 5

Figure 2: Utility matrix representing the ratings of four users over six items

An element of the matrix located at the intersection of row i and column j represents the rating user_i has awarded to item_j, under the assumption that users are represented on rows and items on columns. An example of a utility matrix consisting of four users and six items is shown in Fig. 2. In this example, the users were able to award ratings on a scale from 1 to 5. If the ratings are binary (e.g., 0 - not liked, 1 - liked), then the elements of the UM change accordingly.
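Because most user-item pairs carry no rating, a utility matrix is often stored sparsely. The following is a minimal sketch, assuming a nested dictionary as the storage format; the names `utility`, `rating`, `user_row` and `item_column` are hypothetical, and the ratings are illustrative rather than those of Fig. 2.

```python
# Sparse utility matrix stored as user -> {item: rating}.
# A missing key means "no rating"; the values below are illustrative only.
utility = {
    "U1": {"I1": 4, "I2": 5, "I3": 4},
    "U2": {"I1": 5, "I2": 3, "I4": 4},
    "U3": {"I1": 1, "I2": 2, "I3": 1, "I4": 3},
}


def rating(um, user, item):
    """Return the rating `user` gave to `item`, or None if it is missing."""
    return um.get(user, {}).get(item)


def user_row(um, user):
    """Row of the matrix: all ratings awarded by one user."""
    return dict(um.get(user, {}))


def item_column(um, item):
    """Column of the matrix: all ratings received by one item."""
    return {u: row[item] for u, row in um.items() if item in row}
```

Rows correspond to the vectors used in user-centric comparisons and columns to those used in item-centric ones.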

2.3.2 Jaccard index

The Jaccard index is a metric that measures the overlap, or similarity, between two finite sets. It is defined by the formula:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{1}$$

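If A = B = ∅, then J(A, B) = 1 by convention.

If A and B have the same elements, then

$$A \cap B = A \cup B \Rightarrow J(A, B) = 1 \tag{2}$$

On the other hand, if A ∩ B = ∅ and the sets are not both empty, then

$$J(A, B) = 0 \tag{3}$$

Since A ∩ B ⊆ A ∪ B, (2) and (3) are the two extreme cases, and it follows that

$$0 \le J(A, B) \le 1 \tag{4}$$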

The intuition behind using the Jaccard index as a similarity measure between two items represented as sets of elements is that the more two sets differ, the fewer elements their intersection contains and the more elements their union contains; thus the ratio between the intersection and the union will be closer to 0. Conversely, the more similar the two sets are, the more elements their intersection contains, and the closer this number is to the number of elements in their union; thus, the ratio will be closer to 1.

The rows or columns of a UM, such as the one in Fig. 2, can be interpreted as vectors. However, a processing step can be applied to the UM so that, instead of holding ratings on a discrete scale from min to max, the matrix is binarized with a given threshold (for example, all ratings ≤ 2 are discarded, and all ratings ≥ 3 are replaced by 1). The result will contain only 1s or empty cells.

One could then interpret the matrix rows or columns as sets, with element_i ∈ set_j if UM(i, j) = 1.

Then, the Jaccard index, as defined by (1), can be computed for two sets, set_j1 and set_j2, regardless of whether they represent users or items.
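The binarization step and equation (1) can be sketched as follows. This is an illustrative sketch only: the function names and the two rating rows are hypothetical, and the threshold of 3 mirrors the example given above.

```python
def jaccard(a, b):
    """Jaccard index of two finite sets, following equation (1)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention for two empty sets
    return len(a & b) / len(a | b)


def liked_items(ratings, threshold=3):
    """Binarize one row of the utility matrix: keep items rated >= threshold."""
    return {item for item, value in ratings.items() if value >= threshold}


# Two illustrative rows of a utility matrix (ratings on a 1-5 scale).
u_a = {"I1": 4, "I2": 5, "I3": 4}
u_b = {"I1": 5, "I2": 3, "I4": 4}

similarity = jaccard(liked_items(u_a), liked_items(u_b))
```

Here the two users share two liked items out of four distinct ones, so the index is 0.5.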

2.3.3 Cosine similarity

The cosine similarity is derived from the dot product of two vectors a and b:

$$a \cdot b = \|a\| \cdot \|b\| \cdot \cos(\theta) \tag{5}$$

Therefore,

$$\mathrm{sim}(a, b) = \cos(\theta) = \frac{a \cdot b}{\|a\| \cdot \|b\|} = \frac{\sum_{i=1}^{n} a_i b_i}{\|a\| \cdot \|b\|} \tag{6}$$

and

$$\|v\| = \sqrt{\sum_{i=1}^{n} v_i^2} \tag{7}$$

The cosine similarity measures the cosine of the angle between two vectors a and b. The intuition behind using this metric as a similarity measure is that the smaller the angle between two vectors of features, the more similar they are. In a vector space in which all the elements of a vector are non-negative, the cosine similarity ranges from 0 to 1, where 1 indicates that the vectors point in the same direction (complete similarity), and 0 indicates orthogonality (complete dissimilarity) between the two considered vectors.

A potential shortcoming of this method is that the cosine similarity fails to capture the difference in magnitude between the two vectors. For example, assume two vectors, v1 = <1, 1, 1> and v2 = <10, 10, 10>, in a 3-dimensional space. According to (6):

$$\cos(\theta) = \frac{v_1 \cdot v_2}{\|v_1\| \|v_2\|} = \frac{10 + 10 + 10}{\sqrt{3} \cdot \sqrt{300}} = \frac{30}{30} = 1 \tag{8}$$

According to this metric, the two vectors considered are 100% similar. Obviously this is not the case, as v2 is 10 times bigger than v1 in all its dimensions.
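Equations (6) and (7) translate directly into code. The sketch below is a minimal illustration using only the standard library; the function names are hypothetical, and raising an error for a zero vector is an assumption made here because the quotient is undefined in that case.

```python
import math


def norm(v):
    """Euclidean norm of a vector, equation (7)."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, equation (6)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    denominator = norm(a) * norm(b)
    if denominator == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / denominator


v1 = [1, 1, 1]
v2 = [10, 10, 10]
# The two vectors point in the same direction, so the similarity is
# (up to floating-point rounding) 1 despite the tenfold difference in size.
similarity = cosine_similarity(v1, v2)
```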

2.3.4 Euclidean distance

The Euclidean distance between two vectors a and b is the length of the line segment connecting them. Suppose a = <a1, a2, ..., an> and b = <b1, b2, ..., bn>. Then, the Euclidean distance is defined by the equation:

$$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \tag{9}$$

The Euclidean distance is a non-negative measure, bounded below by 0. That is, if two vectors are identical, the distance between them is 0 and their similarity reaches its maximum. The more different two vectors are, the bigger the distance between them will be.

As compared to (6), equation (9) also takes into account the magnitude of the vectors. Using the same example from section 2.3.3, the Euclidean distance between v1 and v2 is:

$$d(v_1, v_2) = \sqrt{81 + 81 + 81} \approx 15.6 \tag{10}$$
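For comparison with the cosine example, equation (9) can be sketched in a few lines. The function name is hypothetical; the vectors are those of section 2.3.3.

```python
import math


def euclidean_distance(a, b):
    """Length of the segment connecting a and b, equation (9)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


v1 = [1, 1, 1]
v2 = [10, 10, 10]
distance = euclidean_distance(v1, v2)  # sqrt(243), about 15.6
```

Unlike the cosine similarity, the distance grows with the difference in magnitude, and it is 0 only for identical vectors.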

2.4 Content-based filtering

In a content-based filtering setting, the system recommends to the user items similar to the ones the user has liked in the past. Content-based filtering techniques analyze common features among items and select new items based on the correlations between those and the user's past preferences [2] [33]. Similar definitions are provided in [25] and [23], which describe the outcome of content-based recommendations as resulting from the analysis of items rated by the user in the past and matching them against candidates from the set of unrated documents.

Content-based recommendations (CBR) assume that each item in the system has a profile attached to it. An item profile is a collection of properties which can be extracted for an item [18]. For example, suppose an information system provides recommendations for recipes, such as in [32]. A recipe can be represented as a vector of properties including cuisine, list of ingredients (set), diet type, region, occasion, number of servings, etc. Determining the elements of the item profile is the first major design decision to be taken when developing a content-based recommendation system [25].

Ideally, all the items in a collection have roughly the same set of properties, and data sparsity³ is not an issue. In this situation, a similarity measure (e.g., cosine similarity or the Jaccard index) can be computed to determine to which extent a previously rated item is similar to a new one. If the result meets a certain threshold, then the new item is recommended to the user. Choosing the similarity metric and the similarity threshold represents the other major design decision to be taken when developing a content-based recommendation system [25].

³ That is, item profiles consistently have the same properties.

2.4.1 A basic architecture

A basic architecture of a CBR is presented in Fig. 3. First, the information is collected from various information sources (bottom-left corner of the figure). For example, one could use a crawler to harvest tweets, recipes, images or articles from various third-party platforms. Most likely, the information will not be in a structured form that would allow easy usage and straightforward inclusion in a recommendation engine. Therefore, the next step is adding structure to it, cleaning it of noise and extracting only those features that are relevant for the specific situation (the Content Analysis block). The items are then stored in a database, or other form of persistent storage, for future reference (the Items Collection block). The typical form of representing an item for a CBR system is a vector of features.

Figure 3: Content based recommendation - basic architecture. Adapted from [19]

A user will start using the platform, and will be presented with items. The system typically has a way of collecting feedback from the user for the items she is viewing. For collecting feedback, two techniques are available: explicit feedback and implicit feedback [19]. In an explicit feedback setting, a user is requested to explicitly provide a rating for the item (typically on a scale from a minimum to a maximum, such as 1 to 5 stars, or through a metaphor that is translated into a scale, such as various emoticons to express the feeling about an item). Using implicit feedback entails tracking user actions and assigning them a weight (e.g., viewing a video until the end earns it 5 points, while only viewing the page containing it earns 2 points). In either case, the feedback is stored for future reference (the Feedback block, top-right corner).

The system periodically analyses those items that the user has recently rated and infers a profile (the Profile Learning block). When providing new recommendations, new items are extracted from the Items repository and compared against the user profile. The Filtering block uses a similarity measure to determine which of those unseen items best fit the current profile of the user. Usually, only the top-k items will make it to the final recommendation list.

The process will then enter a new iteration in which the user views the recommended items, judges and rates them, and her profile is updated for the next round of recommendations.
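The Filtering block described above can be sketched as a ranking of unseen items by similarity to the user profile. This is an illustrative sketch, not the architecture of Fig. 3 itself: the item vectors, the profile and the names `cosine` and `top_k` are hypothetical, and treating a zero vector as having similarity 0 is an assumption made to keep the ranking well defined.

```python
import math


def cosine(a, b):
    """Cosine similarity as in equation (6); 0.0 is returned for a zero vector."""
    denominator = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if denominator == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / denominator


def top_k(profile, items, seen, k=2):
    """Rank unseen items by similarity to the profile and keep the best k."""
    scored = [
        (cosine(profile, vector), name)
        for name, vector in items.items()
        if name not in seen
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical item collection: each item is a vector of three features.
items = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0],
    "C": [0.0, 1.0, 0.0],
    "D": [0.5, 0.5, 0.0],
}
profile = [1.0, 0.0, 0.0]  # hypothetical learned user profile
recommended = top_k(profile, items, seen={"A"}, k=2)
```

Item A is excluded because the user has already seen it; of the remaining items, B and D are closest to the profile.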

2.4.2 Content-based recommendation advantages

Semeraro et al. present in [19] a list of advantages that CBR systems possess. First, CBRs rely only on the ratings of the user for whom recommendations are actually provided.

This user independence leads to another attractive feature: CBR systems do not suffer from the first-rater problem, which means that they are capable of recommending items newly added to the system, items that have not yet been rated by any user.

Finally, a CBR recommendation can be easily explained to the active user. Such an explanation would be, for example: “You are seeing this item because you previously liked items B, C and D which are similar to the current one”. Providing explanations for the recommendations is a good way to increase the trust of the user in the recommendation system [19].

2.4.3 Examples from literature

[36] presents BlogMuse, an application built to help blog writers connect with their audience. Readers who want to read about a certain topic but are not able to find it can submit a request in the system. The request is then routed to potential matching users based on the interests from their profiles. In that sense, BlogMuse implements a simple form of content-based recommendation in which an author who has indicated an interest in a topic will be notified when that topic is requested by the potential audience.

If the writer decides to write about the topic, the requester is also notified. In order to support audiences larger than one person, a topic submitted by a person is public and can be viewed by other community members. The topics can be rated, and the voters are notified when someone has written about that topic. Also, whenever interest in a topic increases as a result of a rating, potential matching authors are notified and can decide to write about the topic. Therefore, BlogMuse also implements a collaborative approach to recommending topics: the more readers request a topic, the more likely it is to be recommended to a potential author.

A more classical CBR approach is presented in [33]. PRES (Personalized REcommender System) is a recommendation system exploiting CBR techniques to create dynamic hyperlinks for web pages containing advice on “do it yourself” home improvements. The architecture of PRES is typical for a CBR system and hence similar to the one presented in Fig. 3. A user profile is learned from the feedback the user is providing; the RS compares the user profile with the documents in the collection and feeds a list of recommendations ranked on various dimensions such as novelty, similarity, proximity and relevance [33].

Due to the nature of the domain, the authors argue that the profile learned by PRES has to be very dynamic: it is highly likely that a user is not interested in performing the same home improvement twice over a very short period of time. As a consequence, the authors use implicit feedback heuristics to infer user preferences from their actions instead of asking users to provide explicit feedback for the pages they visited. Thus, the more time a user spends on a page, the more relevant the document is considered to be for her. However, the authors claim that using the same heuristic to detect non-relevant documents is not suitable, because a small amount of time spent on reading a document might also be an indication that the document is too similar to one she has previously read. For this reason, they do not use negative examples to learn the user profile.

PRES uses the relevance feedback model introduced by Rocchio [21] and defined by the following equation:

$$P_m = \alpha P + \beta \frac{1}{|D_r|} \sum_{D_j \in D_r} D_j + \gamma \frac{1}{|D_{nr}|} \sum_{D_j \in D_{nr}} D_j \tag{11}$$

where:

• P_m - the updated user profile

• P - the initial user profile

• D_r - the set of relevant documents

• D_nr - the set of non-relevant documents

• α, β, γ - constants to control the relative importance of the initial profile and of the sets of relevant and non-relevant documents

However, because PRES does not make use of negative examples, γ is set to 0. Furthermore, β is set to 1 because a document is only considered to be relevant or not; α is set to a value between 0 and 1 via experimentation to reduce the weights in the current profile.
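The update of equation (11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the PRES implementation: the function name is hypothetical, profiles and documents are plain lists of feature weights, and α = 0.5 is an arbitrary value from the interval mentioned above. The defaults β = 1 and γ = 0 follow the text.

```python
def rocchio_update(profile, relevant, non_relevant=(), alpha=0.5, beta=1.0, gamma=0.0):
    """Return the updated profile P_m of equation (11)."""

    def centroid(documents):
        # Average of the document vectors; a zero vector if the set is empty.
        if not documents:
            return [0.0] * len(profile)
        return [sum(column) / len(documents) for column in zip(*documents)]

    rel = centroid(relevant)
    non_rel = centroid(non_relevant)
    return [
        alpha * p + beta * r + gamma * n
        for p, r, n in zip(profile, rel, non_rel)
    ]


profile = [1.0, 0.0]
relevant_docs = [[0.0, 2.0], [0.0, 4.0]]
updated = rocchio_update(profile, relevant_docs)  # [0.5, 3.0]
```

With α < 1 the weights already in the profile decay at every update, which matches the requirement that the learned profile be very dynamic.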

The documents are parsed offline on a periodic basis and represented in the document collection as vectors of features. The term frequency-inverse document frequency (tf-idf) measure is used to determine discriminative terms in a document, while cosine similarity, as defined by equation (6), is used to compute similarity between the items in a user's profile and a document in the collection. For sorting the results, the authors use two approaches: recommending the top-k results ranked by similarity, or recommending all results that comply with a similarity threshold.

To evaluate the performance of the RS, the authors used two metrics: precision (how many of the retrieved documents are actually relevant) and recall (how many of the relevant documents were actually retrieved). The most important finding of the study is that the precision varies depending on the considered topic [33]. The authors attribute this finding to the intrinsic properties of the similar documents (similar documents contain different terms which cannot be associated). I consider that this problem might have been overcome had a synonym dictionary been used when parsing the documents, to merge together terms with similar meanings.
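The two evaluation metrics mentioned above can be computed from the sets of retrieved and relevant documents. The sketch below is illustrative; the function name and the document identifiers are hypothetical, and returning 0 for an empty set is a convention assumed here.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one recommendation list."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]   # what the system recommended
relevant = ["d2", "d4", "d5"]          # what the user actually found relevant
precision, recall = precision_recall(retrieved, relevant)
```

In this example two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were retrieved (recall 2/3).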

Finally, one of the approaches presented by Pazzani in [25] uses a content-based approach to build user profiles based on the description of the items in his system. In order to learn the user profile, he uses text-mining techniques to parse the descriptions of the items and applies the Winnow algorithm⁴ to identify only the relevant attributes in a pool of many possible attributes. Once the relevant features are identified on a per-item basis, a user profile is built by inspecting the items which were previously rated by the user.

I consider content-based recommendation to be relevant for this work. Therefore, part of the algorithm described in section 5 will employ this technique. Specifically, I am interested in using the vector space representation used in [33], as well as heuristics to infer the user feedback from her actions rather than explicitly requesting feedback. I will also build user profiles based on the items previously rated by the user.

2.5 Collaborative filtering

Unlike content-based filtering, collaborative filtering (CF) techniques leverage the correlation between users with similar tastes [33]. The assumption behind collaborative filtering techniques is that if two users have similar behaviors with respect to a set of items, they will have a similar behavior over other unseen items as well [31]. The key difference is that, when computing similarity, collaborative methods usually exploit the rating behavior of the users instead of looking at the features of the users or the features of the items.

There are two main categories of CF techniques. Memory-based techniques use the whole dataset (or a subset of it) to compute ratings on the fly. Such methods are easy to implement and widely deployed in commercial systems such as Amazon [31]. Model-based techniques use the ratings awarded by users to items to estimate and learn user models that will generate rating predictions. The rest of this section will focus on the first category, as it is more popular and easier to grasp. Henceforth, the term “collaborative filtering” refers to memory-based collaborative filtering techniques. Providing the fundamentals for the second category is out of scope for the thesis; however, [31] contains a comprehensive review of model-based CF techniques.

2.5.1 A basic architecture

The architecture of a collaborative filtering system is sketched in Fig. 4.

⁴ http://en.wikipedia.org/wiki/Winnow_(algorithm)

According to Fig. 4, User A has rated positively items I1, I2 and I3, while User B has rated positively items I1, I2 and I4; User C has provided ratings for all of I1, I2, I3 and I4, but consistently lower than both User A and User B. In this situation users A and B seem to be similar; therefore, if A is the active user (the user for whom a recommendation will be computed), then she should be recommended I4. Alternatively, if user B is the active user, then she should be recommended I3.

A more formal representation of the collaborative filtering architecture can be achieved by making use of the utility matrix introduced in section 2.3.1; it is presented in Fig. 5.

Figure 4: Collaborative filtering architecture

      I1  I2  I3  I4
UA    4   5   4   ?
UB    5   3   ?   4
UC    1   2   1   3

Figure 5: Collaborative filtering architecture - utility matrix representation of figure 4

2.5.2 User-User filtering

In a user-centered approach to the CF technique, the algorithm tries to infer the ratings User_i will award to unrated items by comparing her rating behavior against users who have already rated those items. The system will estimate that User_i will award the same rating to an item as the user who is most similar to User_i.

For example, consider the UM from Fig. 5 and suppose we want to predict the rating UA will give to I4. Both UB and UC have rated I4; therefore, two similarities have to be computed, between the vectors UA, UB and UA, UC. Let us assume we are using the Euclidean distance, as defined by (9), as a similarity measure between the two vectors. Because data sparsity (lack of ratings) is an issue for memory-based CF techniques, only items that have been co-rated by both users should be considered when computing the similarity between two users.

Thus, d(UA, UB) = √5, while d(UA, UC) = √27. Because d(UA, UB) < d(UA, UC), UA is more similar in behavior to UB than to UC; therefore the predicted rating of UA for I4 is 4, which is a strong indication that the item should be recommended.
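The worked example can be reproduced with a short sketch. The data are the ratings of Fig. 5; the function names are hypothetical, and using only the single nearest neighbour follows the description above rather than any particular library.

```python
import math

# Utility matrix of Fig. 5; a missing key stands for "?".
UM = {
    "UA": {"I1": 4, "I2": 5, "I3": 4},
    "UB": {"I1": 5, "I2": 3, "I4": 4},
    "UC": {"I1": 1, "I2": 2, "I3": 1, "I4": 3},
}


def corated_distance(r1, r2):
    """Euclidean distance (9) restricted to the items both users have rated."""
    common = r1.keys() & r2.keys()
    if not common:
        return math.inf  # nothing to compare: treat as maximally distant
    return math.sqrt(sum((r1[i] - r2[i]) ** 2 for i in common))


def predict_user_user(um, user, item):
    """Predict a rating by copying it from the nearest user who rated the item."""
    neighbours = [
        (corated_distance(um[user], ratings), name)
        for name, ratings in um.items()
        if name != user and item in ratings
    ]
    if not neighbours:
        return None
    _, nearest = min(neighbours)
    return um[nearest][item]


prediction = predict_user_user(UM, "UA", "I4")  # UB is nearest, so 4
```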

2.5.3 Item-Item filtering

The item-centered approach is orthogonal to the user-centered approach. Instead of computing the similarity between the vectors of ratings awarded by two users, item-item CF computes the similarity between the ratings awarded by all the users to a pair of items. Let us consider again the example from Fig. 5 and assume we would like to determine the rating UB will give to I3. Because UB has awarded ratings to items I1, I2 and I4, we need to compute three similarities, between the pairs (I3, I1), (I3, I2) and (I3, I4). Using (9) and the same heuristic as in the user-user filtering technique, according to which only users who have rated both items of a pair are considered, we obtain the following values:

Pair        d
(I3, I1)    0
(I3, I2)    √2
(I3, I4)    √4

I1 is the item closest to I3, so it will best approximate the rating UB will give to I3; hence the predicted rating is 5, the rating UB awarded to I1. Again, this is a strong indication that I3 will be liked by UB and therefore it should be recommended.
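Item-item filtering amounts to running the same comparison on the columns of the utility matrix instead of its rows. The sketch below illustrates this with the data of Fig. 5; the function names are hypothetical.

```python
import math

# Utility matrix of Fig. 5; a missing key stands for "?".
UM = {
    "UA": {"I1": 4, "I2": 5, "I3": 4},
    "UB": {"I1": 5, "I2": 3, "I4": 4},
    "UC": {"I1": 1, "I2": 2, "I3": 1, "I4": 3},
}


def transpose(um):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    columns = {}
    for user, ratings in um.items():
        for item, value in ratings.items():
            columns.setdefault(item, {})[user] = value
    return columns


def corated_distance(c1, c2):
    """Euclidean distance (9) over the users who rated both items."""
    common = c1.keys() & c2.keys()
    if not common:
        return math.inf
    return math.sqrt(sum((c1[u] - c2[u]) ** 2 for u in common))


def predict_item_item(um, user, item):
    """Copy the user's rating of the item closest to the target item."""
    columns = transpose(um)
    target = columns.get(item, {})
    rated = um[user]
    distances = {other: corated_distance(target, columns[other]) for other in rated}
    nearest = min(distances, key=distances.get)
    return rated[nearest], distances


prediction, distances = predict_item_item(UM, "UB", "I3")
```

The distances come out as 0, √2 and 2 for I1, I2 and I4 respectively, so the prediction is the rating UB gave to I1.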


2.5.4 Collaborative filtering advantages

The memory-based CF techniques used for recommendations have a series of advantages that make them attractive and very popular; the authors of [31] present a list of these advantages.

First, as compared to their content-based counterparts, CF recommendation systems do not require a representation of the items in a potentially n-dimensional space. This solves a big challenge because, in a content-based setting, a pre-processing step is often required to extract the relevant features for each item.

Second, and connected to the first point, when adding new items to the system no pre-processing steps are required. An item is simply added to the collection with no rating and can immediately be included in the recommendation process.

Finally, another major advantage of memory-based CF techniques is their low implementation complexity. Usually, a matrix representation of the problem and a choice of similarity metric are all that is needed to set up such a recommendation system.


Pazzani presents in [25] a memory-based collaborative recommendation system which predicts the ratings that users award to restaurants. The author uses implicit feedback methods to infer a user's rating behavior. Thus, if a user has added the restaurant's web page to her online profile, it means that the restaurant has received a positive rating.

The author addresses both user-centric and item-centric approaches in his experiment. As a similarity metric, he chooses Pearson's r measure of correlation, defined by equation (12), in order to find the degree of correlation between two users; next, the algorithm predicts the rating for an item as a weighted average of the ratings the other users have awarded to the same item.

$$r(x, y) = \frac{\sum_{d \in docs} (R_{x,d} - \bar{R}_x)(R_{y,d} - \bar{R}_y)}{\sqrt{\sum_{d \in docs} (R_{x,d} - \bar{R}_x)^2 \sum_{d \in docs} (R_{y,d} - \bar{R}_y)^2}} \tag{12}$$

where R_{x,d} is the rating awarded by user x to document d in the collection and \bar{R}_x is the average rating awarded by user x to all the documents she has rated.
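Equation (12) can be sketched as below. This is an illustration under stated assumptions rather than Pazzani's implementation: the sums run over the documents rated by both users, the means are taken over all documents each user has rated (as in the definition above), and a correlation of 0 is returned when the denominator vanishes. The function names and the ratings are hypothetical.

```python
import math


def mean_rating(ratings):
    """Average rating awarded by one user to all documents she has rated."""
    return sum(ratings.values()) / len(ratings)


def pearson(rx, ry):
    """Pearson's r between two users, equation (12)."""
    common = rx.keys() & ry.keys()
    mean_x, mean_y = mean_rating(rx), mean_rating(ry)
    numerator = sum((rx[d] - mean_x) * (ry[d] - mean_y) for d in common)
    denominator = math.sqrt(
        sum((rx[d] - mean_x) ** 2 for d in common)
        * sum((ry[d] - mean_y) ** 2 for d in common)
    )
    if denominator == 0:
        return 0.0  # assumption: no measurable correlation
    return numerator / denominator


x = {"d1": 1, "d2": 2, "d3": 3}
y = {"d1": 2, "d2": 4, "d3": 6}   # same taste, more generous scale
z = {"d1": 3, "d2": 2, "d3": 1}   # opposite taste

r_xy = pearson(x, y)
r_xz = pearson(x, z)
```

Unlike the Euclidean distance, the correlation ignores differences in rating scale: x and y are perfectly correlated although y rates everything twice as high, while x and z are perfectly anti-correlated.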

Another example of collaborative filtering is detailed in [5]; the authors use web-page tag aggregation, or folksonomies, to enrich the recommendations delivered using CF techniques. A folksonomy is defined as a tripartite graph consisting of users, web pages and tags. A user is connected to a web page if she has added that page to her profile. A tag is connected to both a web page and a user if the user has used that tag to mark that specific web page. Thus, the folksonomy is represented as a 3D matrix containing implicitly collected information, with all items added by a user receiving a score of 1 in the matrix.

The authors build standard user-user and item-item CF algorithms using a k-Nearest Neighbor approach (only the top-k most similar users or items are considered for predicting ratings) and leveraging aggregated information from the matrix representation of the folksonomy:

• R - matrix containing ratings awarded by the users to the pages. This is a binary matrix containing 0 if a user has not added the web page to her profile, and 1 if she has.

• UI - matrix containing how many tags each user has assigned to each item

• UT - matrix specifying how many times a user has used a certain tag to annotate the items

• IT - matrix specifying how many times a tag has been used to annotate a page

To compute similarity between users and items they use the cosine similarity measure as defined in equation (6).

Furthermore, they investigate the possibility of using tag overlap to measure similarity between users or items. To this end they use the matrices UT and IT and compare the performance of the Jaccard index, Dice's coefficient⁵ and cosine similarity in measuring similarity.

The results show that the user-user CF algorithm performs better than its item counterpart, on account of the average number of items per user being much higher than the average number of users per item, which in turn leads to less sparse user vectors when computing the similarity measurement [5]. For the tag-overlap similarity the results are mirrored. The authors attribute this finding to the higher average number of tags per item, compared to the average number of items per user.

The findings from [5] are consistent with real-world behavior, where there are more users who purchase a large number of different items than there are popular items purchased by all the users. For this reason, user-centric CF algorithms are more popular and more frequently deployed in real-life situations.

The Long Tail

Moreover, the findings from [5] are also backed up by the long-tail theory, which sits at the root of recommendation systems. According to the long-tail theory, there is a significantly larger pool of unpopular or unrated items as compared to a smaller pool of very popular ones [3]. The purpose of recommendation systems is to recommend less popular items residing in the long tail to a larger number of users; hence the phrase “less is more”. To better illustrate this concept, consider the plot from Fig. 6, in which the x-axis represents the items that participate in a recommendation system, while the y-axis represents the number of times an item has been rated. The plot was obtained using the data from Table 1.

⁵ https://www.google.fi/?gfe_rd=cr&ei=M76NU7flH-nJ8gfa_4HoBg

Item        I1   I2   I3   I4   I5   I6   I7   I8   ...  I100
# Ratings   100  65   45   35   20   10   5    5    ...  5

Table 1: Distribution of ratings over a pool of 100 items

As can easily be seen, the head of the distribution (suppose we define as belonging to the head only those items that have at least 20 ratings, that is I1 − I5) contains a total of 265 ratings, while the long tail contains 480, almost double the amount. For the sake of argument, let us suppose that the action of rating implies that the user has bought the item, and that all items have the same price. Thus, even though items I1 − I5 are very popular among the community of users, they have generated roughly 45% less revenue than the rest of the items, I6 − I100, residing in the long tail. Hence the motivation of recommendation systems to recommend new, unseen items.
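The totals quoted above can be checked with a few lines of arithmetic. The sketch assumes that the entries elided by "..." in Table 1 (I9 to I99) each have 5 ratings, like their listed neighbours; the variable names are hypothetical.

```python
# Table 1: I1..I6 as listed; I7..I100 are assumed to have 5 ratings each.
ratings = [100, 65, 45, 35, 20, 10] + [5] * 94

HEAD_THRESHOLD = 20  # items with at least 20 ratings belong to the head

head = sum(r for r in ratings if r >= HEAD_THRESHOLD)   # I1..I5
tail = sum(r for r in ratings if r < HEAD_THRESHOLD)    # I6..I100

# Relative shortfall of the head with respect to the tail, assuming one
# purchase per rating and a uniform price.
shortfall = 1 - head / tail
```

Under these assumptions the head accounts for 265 ratings and the tail for 480, so the five most popular items bring in a little over half of what the remaining 95 items do.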

Figure 6: The long tail (x-axis: Items; y-axis: Frequency)

We limit the discussion of the various collaborative filtering techniques to this point. Although collaborative filtering is a very popular technique and is widely deployed in real-life scenarios, it will not be implemented in the recommendation system presented in this paper. The reason for which I do not consider it relevant for this study is that the addressed domain, health and well-being, is of such a nature that contribution from the community is much less important than the domain expert's knowledge and the content of the recommended items.

2.6 Demographic filtering

A method that combines concepts from both content-based and collaborative recommendations is demographic filtering. When implementing demographic filtering, a user is represented as a set of features, much like in a content-based approach. However, the vector of features characterizing a user consists of demographic data such as gender, age, income, location or profession. Most of the features that define a user's profile are typically collected in an initialization step, shortly after the user's registration on the platform; also, the features are usually specified explicitly.

The purpose of demographic filtering is to identify typologies of users (users that like a similar product) and focus the offering on that specific market segment [25]. In order to identify classes of users, user-user similarity is usually computed using one of the metrics mentioned in section 2.3 (cosine similarity, Euclidean distance, Jaccard coefficient). In that sense, demographic filtering uses concepts typical of the user-centric collaborative approach.

Fig. 7 presents the concept of user similarity from a demographic perspective.

Figure 7: Demographic filtering architecture


Recognizing that eliciting user demographics can be quite a challenge, Pazzani describes in [25] an alternative approach to obtain demographic information for the users of his system. By crawling and text-mining the home pages of existing users, the author minimizes the effort required to obtain demographic information. As with the author's content-based algorithm, which was described in section 2.4.3, the Winnow algorithm is used to learn the relevant characteristics of home pages. User similarity is then calculated between inferred profiles and a recommendation is computed.

Vozalis and Margaritis use demographic data to enhance the results of the baseline collaborative user-user and item-item recommendation algorithms [35]. A user in their system has three demographic components: age, split into four categories; gender, split into two categories; and occupation, split into 21 categories. For each component, only one category can be active at a time. The resulting vector of features has 27 dimensions; an example is u = <0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>, representing a user in the second age category (e.g., 25-35 years old), female, and having a job that fits in category 21 (e.g., accountant). For computing similarity between the vectors of demographic data, cosine similarity is used as a metric. The similarity score is then multiplied by the correlation score between the two vectors of ratings awarded by the users to various items, and thus an enhanced metric is obtained.
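The 27-dimensional encoding can be sketched as a concatenation of three one-hot vectors. This is an illustration of the representation described above, not the code of [35]: the function names are hypothetical, category indices are assumed to be 1-based, and the mapping of gender category 2 to "female" follows the example vector.

```python
import math

AGE, GENDER, OCCUPATION = 4, 2, 21  # number of categories per component


def one_hot(category, size):
    """Binary vector of length `size` with a single 1 at the 1-based index."""
    if not 1 <= category <= size:
        raise ValueError("category out of range")
    return [1 if i == category else 0 for i in range(1, size + 1)]


def demographic_vector(age, gender, occupation):
    """Concatenate the three components into one 27-dimensional vector."""
    return one_hot(age, AGE) + one_hot(gender, GENDER) + one_hot(occupation, OCCUPATION)


def cosine(a, b):
    """Cosine similarity, equation (6)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


u = demographic_vector(age=2, gender=2, occupation=21)  # the example user
v = demographic_vector(age=2, gender=1, occupation=21)  # differs in gender only
similarity = cosine(u, v)
```

Every such vector contains exactly three 1s, so all vectors have the same norm and the cosine similarity reduces to the fraction of shared categories (here two out of three).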

Demographic data is an important element for the domain addressed in this paper, as well as for the proposed recommendation system. Equally relevant is the representation of the vector of features as a binary vector in a multi-dimensional space; although such a representation significantly increases the dimensionality of the working space, it compensates for the shortcoming of the cosine similarity method of not accounting for the magnitude of the vectors. Consequently, this representation will be considered as an alternative representation in the vector-space model of the items, and its performance will be tested in order to take an informed design decision with respect to the implementation of the proposed algorithm.

However, at the time of development, I did not possess enough accurate demographic information about the users to properly leverage it in the system. Thus, the lack of consistent demographic data constitutes a limitation of this system, and the implementation of a demographic component is deferred to future work.

2.7 Common problems and limitations of recommendation systems

Taken individually, each of the above methods has certain problems and limitations, which will be discussed in the following paragraphs.

2.7.1 Over-specialization

Over-specialization refers to an RS's inability to recommend things that are too different from the ones the user has rated highly. Over-specialization is a limitation of CBR systems. For example, in a system recommending news articles, if the user has indicated as relevant some articles on the theme “Nokia acquired by Microsoft”, the vast majority of the future recommendations will be on the same theme.

Techniques to overcome over-specialization include building hybrid RSs (using a mix of CBR and CF), including a small random component in the final result [19], or mixing the set resulting from CBR with the set resulting from demographic filtering. Also, one could prevent the recommendation of overly similar results by lowering the threshold of the similarity measure [19] (e.g., instead of recommending items that score 80% or higher in the similarity test, recommend those items whose similarity lies between 60% and 75%).

2.7.2 Limited content analysis

CBR systems are also limited with respect to the size and contents of the items’ profiles. Whether the data cleansing is performed manually or automatically, only a limited set of features can be extracted; often a trade-off needs to be made regarding what an item profile should contain. For example, in a recipe recommendation system, some recipes crawled from one website are characterized by features like cuisine, ingredients and occasion, while recipes from a different website are characterized by ingredients, number of servings and diet. A decision has to be made about which set should be used. Of course, a middle-ground solution would be to use both, but then data-sparsity issues may arise, because the item profiles will not be populated consistently.

Moreover, to accurately represent items for a CBR, domain-expert knowledge is often needed [19]. Using the recipe recommendation system again as an example, one would have to validate that the recipes indeed have a minimum set of features that makes them usable for the users of the system. The role of the domain-expert is detailed further in section 3.

2.7.3 Cold-start (new-user)

Both CBR and CF systems struggle to provide recommendations for users who have newly entered the system. Without a minimal base for reasoning, a CBR will not be able to provide reliable recommendations. In the CF setting, the problem is related to the data-sparsity problem. The UM used to compute similarity between new users and new items is usually very sparse (provided the user base and/or the collection of items is large): only a small fraction of the user-item pairs will typically have a rating assigned.

In a CF system, a new user is equivalent to an empty row (or column, depending on how the information is represented) in the UM. A user for whom no prior ratings are available will not be recommended any items, because her vector of ratings has all elements equal to 0 and, according to equation (6), the dot-product will also be 0; therefore, a rating prediction cannot be computed.
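The new-user case can be illustrated with a small sketch. It assumes that equation (6) is the cosine similarity between two rating vectors and that unrated items are stored as 0; the function name and the sample vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equally long rating vectors.

    Returns None when either vector has zero length, i.e. when the
    similarity is undefined and no neighbourhood can be built.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else None


new_user = [0, 0, 0, 0]   # empty row of the utility matrix
regular = [5, 3, 0, 1]    # a user with some ratings

print(cosine(new_user, regular))  # no usable similarity for the new user
print(cosine(regular, regular))
```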


2.7.4 Cold-start (new-item)

This problem is common in CF systems. An item without any prior ratings cannot yield rating predictions, for the same reasons as above. CBRs, though, are not affected by this shortcoming, because this class of recommendation systems focuses on the content of items.

2.7.5 Serendipity

Serendipity is the counterpart of over-specialization. It is a desired feature of an RS, and a limitation if the RS does not possess it. Serendipity is the ability of the RS to provide surprising recommendations that the user would otherwise not have the chance to come across [19]. There is a clear distinction between the novelty property of an RS and serendipity, and it stems from the probability of the user discovering the recommended item on her own. A novel item has a higher probability of being discovered even if it is not recommended. Thus, an item that is serendipitous is also novel, but the reverse does not hold [19].

2.7.6 Shilling-attacks

The “shilling-attack” is a vulnerability of collaborative recommendation systems. Shilling-attacks entail creating fake user profiles that rate a specific set of target items in the same manner. Next, the same user profiles are used to rate other items such that their rating behavior becomes similar to that of regular users. The final outcome is that, in a user-centric CF approach, regular users will be recommended the target items, on account of the fake user having a rating behavior similar to that of the active user [7]. There are two forms of shilling-attacks: “push” attacks (promote attacks), in which the target items are given a high rating and the outcome is the one previously described; and “nuke” attacks (demote attacks), in which the target items are given a low rating and the outcome is that the target items do not get the chance to be recommended due to their poor average score.

2.7.7 Gray sheep

“Gray-sheep” users are defined as users that have unusual tastes compared to the rest of the community [12]. Their ratings partially agree with some users and partially disagree with others. Two potential problems may result from this scenario: first, “gray-sheep” users may not receive accurate recommendations due to their inconsistent purchasing/rating behavior; second, their contribution to the system is unreliable and might affect the quality of the recommendations for the other users. The authors of [12] present a solution for detecting “gray-sheep” users: they adapt the k-Means++ [?] clustering algorithm in order to cluster the utility matrix and detect gray-sheep users, and use the results to generate accurate recommendations for this category of users.


2.8 Hybrid filtering

To compensate for the disadvantages previously mentioned, and to reinforce the strong points of each approach, various hybrid methods have emerged.

One trivial approach would be to implement a content-based system and a collaborative one separately. Each system generates a set of results, which can then be combined into a final list, using various heuristics to rank the list in a way that is meaningful for the user [2]. Another approach, as previously seen in section 2.6, is to use demographic data to boost the results of collaborative filtering.

Over time, practitioners have observed the advantages of building hybrid recommendation systems. The next section presents some of the most interesting examples encountered in the literature. Approaches that are a hybrid of CBR and CF techniques are presented, but more “exotic” implementations are also considered.

2.8.1 Classic examples of hybridization

Collaboration via content

In line with one of the hybridization techniques proposed by Adomavicius and Tuzhilin [2], Pazzani [25] merges content-based and collaborative techniques in a single recommendation model. His method, collaboration via content, exploits the content-based profile of each user (described in section 2.4.3) to compute similarity between pairs of users using the Pearson measure defined by equation (12). The ratings are predicted using the same technique described in section 2.5.5, by computing a weighted average of the ratings all users have awarded to a certain restaurant, where the weights are the previously calculated correlation factors.

The results obtained with this approach (accuracy of 70.1% for the predicted ratings) outperform the user-user CF approach (67.9%), the item-item CF approach (57.9%) and the pure content-based approach (61.5%).

Merging all algorithms together

Another approach, also proposed by Pazzani in [25], merges all four previously mentioned algorithms in the following manner:

1. each algorithm runs to provide a list of recommendations, out of which the top 5 are retained.

2. each recommendation receives a number of points equal to 6 − k, where k is the rank of the recommendation (i.e., rank 1 receives 5 points, rank 2 receives 4 points, etc.).

3. the results are then aggregated into one list, with the points for the same recommendation being added up.

On average, this method is successful in 72.1% of the cases and outperforms all the other methods, thus demonstrating that hybrid techniques can successfully be used to overcome the limitations of stand-alone methods.
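The three merging steps can be sketched as follows. This is only an illustration of the point scheme described above: the function name `merge_rankings`, the sample restaurant ids and the alphabetical tie-break are assumptions and do not come from [25].

```python
from collections import defaultdict


def merge_rankings(ranked_lists, top_n=5):
    """Merge ranked recommendation lists with the 6 - k point scheme.

    Each list is ordered best-first. Only the top `top_n` entries of a list
    earn points: rank k earns (top_n + 1) - k points. Points for the same
    item are summed across lists.
    """
    points = defaultdict(int)
    for ranking in ranked_lists:
        for k, item in enumerate(ranking[:top_n], start=1):
            points[item] += (top_n + 1) - k
    # Highest total first; ties broken alphabetically (an assumption).
    return sorted(points.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical output of three of the individual algorithms.
lists = [
    ["r1", "r2", "r3", "r4", "r5"],
    ["r2", "r1", "r6", "r3", "r7"],
    ["r2", "r8", "r1", "r9", "r3"],
]
for item, score in merge_rankings(lists):
    print(item, score)
```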


Quickstep

Quickstep [23] is a hybrid RS combining CBR and CF techniques for research paper recommendations. The research papers are represented as vectors of relevant terms. When parsing a paper, term weights are computed as

$$w = \frac{\text{term frequency}}{\text{total number of terms}} \qquad (13)$$

and a pre-processing step that applies Porter stemming (suffix stripping) and removes stop words and common words is implemented.
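Equation (13) can be sketched in a few lines. The sketch makes several assumptions that are not stated in [23]: tokens are split on whitespace, the stop-word list is a tiny illustrative one, stemming is omitted, and the total number of terms is counted after stop-word removal.

```python
from collections import Counter

# Illustrative stop-word list only; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "in", "to"}


def term_weights(text):
    """Weight of each term = its frequency / total number of retained terms."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    total = len(tokens)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(tokens).items()}


print(term_weights("the agent learns the agent"))
```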

As with [33], the authors use implicit feedback heuristics that assign different values to user actions (e.g., browsing a paper, following a recommendation, rating a paper as relevant or not); providing explicit feedback is also possible and helps enhance the system’s accuracy. Based on the user actions, a topic interest value is computed.

For labeling new research papers, an inductive supervised learning method (nearest-neighbor) is used in conjunction with a multi-class representation, where each class is a research paper topic. The authors investigate the possibility of augmenting user profiles with a research-paper ontology. Thus, when a paper receives interest (as described above), its immediate super-class receives a share of that interest (e.g., 50%), the next super-class a smaller share (25%), and so on until the top level of the ontology is reached [23].
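The propagation of interest up the ontology can be sketched as below. The three-level ontology, the function name and the rule that the share halves at every further level (50%, 25%, 12.5%, ...) are assumptions made for illustration; the text only states the first two shares.

```python
# Hypothetical topic ontology: child topic -> parent topic (None marks the top level).
PARENT = {
    "recommender systems": "information retrieval",
    "information retrieval": "computer science",
    "computer science": None,
}


def propagate_interest(topic, interest, parent=PARENT, share=0.5):
    """Give `interest` to `topic` and a shrinking share to each super-class."""
    profile = {topic: interest}
    fraction = share
    node = parent.get(topic)
    while node is not None:
        profile[node] = profile.get(node, 0.0) + interest * fraction
        fraction *= share          # the next super-class receives a smaller share
        node = parent.get(node)
    return profile


print(propagate_interest("recommender systems", 8.0))
```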

The recommendations are delivered by matching the user’s current topics of interest against the papers classified as belonging to those topics. The confidence with which a recommendation is delivered is given by the equation:

$$\text{recommendation confidence} = \text{classification confidence} \times \text{topic interest value} \qquad (14)$$

The collaborative component consists of users being able to provide new examples of topics and to correct papers that were assigned to the wrong class.

I consider the augmentation of the topic list with a research-topic ontology to be valuable, as it might alleviate the over-specialization problem to which the content-based filtering component is prone. This is all the more important because the effectiveness of the collaborative component, which assumes a user will explicitly modify a paper’s class, is questionable and conflicts with the authors’ decision to use implicit feedback techniques to increase the unobtrusiveness of the system.

Personalized Learning Recommendation System (PLRS)

Lu [20] presents PLRS, a framework for personalized learning recommender systems, consisting of four components: student profile builder, student requirement identification, learning material matching analysis and learning recommendation generation.

The student profile is built from a mix of implicit and explicit student actions. As a source of intentional (implicit) information, PLRS uses web-mining techniques to analyze the click-stream of a student in an e-learning platform. This analysis reveals the behavior of a student in terms of which materials she is viewing and what she considers to be of interest, and serves as the basis for the CBR component of the recommendation algorithm.

The student requirement identification component uses multi-criteria analysis to build a model of the student. The reason provided by the author is that student requirements are difficult to approximate through precise values, and fuzzy values fit this domain better (requirements are represented as “important”, “less important”, etc.). Furthermore, the criteria for student requirement identification are enriched through a mix of demographic filtering (collecting requirements from students with similar learning styles and from members of different academic groups, such as the business faculty or the science faculty) and collaborative filtering (inspecting which learning material other students access).

The learning material matching analysis makes use of a set of matching rules and a learning material tree to match a set of requirements against a learning material set. Finally, the recommendations are delivered to the student using a top-N technique.

2.8.2 “Exotic” hybrid approaches

Conversational recommendation system - MobyRek

Ricci and Nguyen [27] challenge the efficiency of implicit user feedback as a source of information in recommendation systems, and underline two major problems with this method: first, the user actions need to be interpreted and translated into meaningful user profiles; second, implicit feedback is often noisy, as the reason and objective of a user action vary from user to user and from context to context.

With this in mind, they propose MobyRek, a conversational recommendation system developed for mobile platforms. Conversational recommendation systems assume human-computer interaction in successive cycles, with the result of the recommendation being adjusted after each cycle until it converges to the user’s desired outcome. MobyRek recommends restaurants to users, where a restaurant is modeled as a vector in an n-dimensional vector space.

A user query is composed of three parts: the logical query (QL) models the “must” conditions, each of which needs to be satisfied by the recommendations; the favorite pattern (p) models the “should” conditions, as many of which as possible should be satisfied; finally, a vector (w) reflects the importance of some features over others.
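The structure of such a query can be sketched as a filter followed by a weighted ranking. This is not MobyRek’s actual scoring function, which is not described here: the equality-based matching, the additive score, the function name and the sample restaurants are all assumptions.

```python
def answer_query(items, must, favorite, weights):
    """Filter on the "must" conditions, then rank by weighted "should" matches.

    must / favorite map a feature name to its required or preferred value;
    weights maps a feature name to its importance (default 1.0).
    """
    def satisfies_must(item):
        return all(item.get(f) == v for f, v in must.items())

    def score(item):
        return sum(weights.get(f, 1.0) for f, v in favorite.items() if item.get(f) == v)

    candidates = [item for item in items if satisfies_must(item)]
    return sorted(candidates, key=score, reverse=True)


# Hypothetical restaurant catalogue.
restaurants = [
    {"name": "A", "open_now": True, "parking": True, "price": "low"},
    {"name": "B", "open_now": True, "parking": False, "price": "low"},
    {"name": "C", "open_now": False, "parking": True, "price": "low"},
]
ranked = answer_query(
    restaurants,
    must={"open_now": True},
    favorite={"parking": True, "price": "low"},
    weights={"parking": 2.0, "price": 1.0},
)
print([r["name"] for r in ranked])
```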

When the user initializes a recommendation session, the user’s past history is consulted to retrieve an initial list of recommendations. A list of ranked recommendations (based on the vector w) is delivered to the user, after which one of three situations may occur. If the user considers one of the recommendations appropriate, she may choose it and the process terminates; the recommendation is added to the user’s history and will be used in the future to provide new recommendations. If none of the recommendations is appropriate and the user terminates the session, the current case is recorded as a failed one and, again, used for future reference. Finally, the user may consider that a recommendation might suit her needs but that some features are not completely in line with her requirements. She can choose to criticize the recommendation and indicate new features, as well as their type (“wish” or “must”). A new set of recommendations is then computed and, again, the three situations are possible.

Results of MobyRek’s evaluation show that the critique-based RS converges to a successful recommendation in 2-3 cycles. There may be situations in which this approach is effective (especially on a desktop/laptop computer); however, given users’ reluctance to provide feedback on recommendations (even in a single cycle and even when the recommendation is a good match), I consider that a critique-based RS would generally have a hard time eliciting user preferences, particularly when accessed from a mobile platform.

Networks of recipe ingredients

Teng et al. [32] present a recommendation system for recipes that leverages the information encoded in the ingredient networks. Parsing the collection of recipes and their ingredients, the authors use pointwise mutual information (see footnote 6) to determine which ingredients occur together, and build the complement network. The substitute network, which is derived by mining user-generated content for suggested recipe modifications, captures ingredients that can be replaced by other ingredients in the network.
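The co-occurrence measure can be sketched with the standard definition of pointwise mutual information, PMI(a, b) = log(p(a, b) / (p(a) p(b))), where the probabilities are estimated as fractions of recipes. Whether [32] uses exactly this estimate is an assumption, and the function name and the four toy recipes are made up.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(recipes):
    """PMI for every ingredient pair that co-occurs in at least one recipe."""
    n = len(recipes)
    single, pair = Counter(), Counter()
    for ingredients in recipes:
        unique = sorted(set(ingredients))      # count each ingredient once per recipe
        single.update(unique)
        pair.update(combinations(unique, 2))   # pairs in alphabetical order
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }


recipes = [
    ["flour", "sugar"],
    ["flour", "sugar"],
    ["flour", "salt"],
    ["salt", "pepper"],
]
for ingredients, value in sorted(pmi_scores(recipes).items()):
    print(ingredients, round(value, 3))
```

A positive score means the two ingredients appear together more often than chance would predict, which is the kind of signal a complement network could be built from.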

In order to predict recipe ratings, the authors apply stochastic gradient boosting trees and support vector machines [32] and demonstrate that the structure of the ingredient networks contains valuable information, which improves the recommendation results.

At its core, the method is another hybrid approach, in which the content-based component is represented by the ingredients included in each recipe, while the contributions of the community of users are leveraged to determine the substitute network of ingredients.

2.9 Summary

For a convenient overview of this chapter, table 2 summarizes the main concepts discussed above. The summary includes: the main types of recommendation techniques, the data each technique uses, advantages and disadvantages of each approach, and examples from the reviewed literature.

6 http://en.wikipedia.org/wiki/Pointwise_mutual_information



| Algorithm | Data used | Advantages | Disadvantages | Examples |
| Content-based filtering | Features of items; vector space representation of an item; active user’s ratings for items | Relies only on the ratings the active user has awarded; does not suffer from the first-rater problem; easy explanation of the recommendation | Over-specialization; limited content analysis; limited serendipity | Pazzani [25]; BlogMuse [36]; PRES [33] |
| Collaborative filtering | Ratings awarded by users to items; user-centric; item-centric | Low complexity development; data pre-processing not required; item dimensionality reduction | Cold-start; “Gray sheep”; shilling-attacks; scalability issues | Pazzani [25]; Folksonomies [5] |
| Demographic filtering | User demographic data | Can be used to enhance the recommendations of the previous approaches | Demographic data is not accurate and difficult to collect | Pazzani [25]; Vozalis et al. [35] |
| Hybrid filtering | Merges concepts from several of the above approaches, either in a sequential way, or in parallel | Able to overcome the weak points of individual approaches while building on their strong points | High complexity development | Collaboration via content, Merging several techniques together [25]; Quickstep [23]; PLRS [20]; MobyRek [27]; Teng et al. [32] |

Table 2: Recommendation systems - overview


3 Expert systems

Edward Feigenbaum, considered to be the father of expert systems (ES), defines them as “an intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant expertise” [28].

DeTore [9] defines ES as computer programs that exploit knowledge from human experts in order to solve problems in a non-procedural manner. A similar definition is provided in [30], where ES are perceived as computerized systems with embedded human-expert problem-solving knowledge and inference capabilities. Williams [37] sees the potential of expert systems as alternatives to human experts that can, again, be deployed in a wide range of narrow domains.

Sasikumar et al. [28] refer to ES as applications which should be able to solve very complex problems at least as well as human experts. In fulfilling this goal, they do not make use of algorithms, but rather of rules of thumb from a very specific domain. Singh [29] reinforces the statement that ES should be deployed in a very specific and limited domain.

A few recurrent topics emerge from the above definitions. First, there is unanimous agreement that an expert system should embed the knowledge of a human expert. In that respect, DeTore [9] makes a clear distinction between knowledge and information: while information can exist by itself, knowledge is information processed such that a decision can be made based on it.

Second, an expert system’s domain of applicability should be very narrow. This idea is strongly connected to the one above. Because an expert system should replace the interaction of the user with the human expert, the person that provides the domain knowledge for the system needs to be highly proficient in that domain. A human individual can achieve excellence in a specific domain only if she dedicates the majority of her time to investigating that domain. Hence, an expert system that leverages the deep, focused knowledge of a human individual will typically be very narrowly scoped.

Finally, an expert system should mimic the interaction between the user and the human expert from one end of the experience to the other. This is typically achieved through a set of “if-then” rules, which are fired in a cascading manner [28]. For an enhanced user experience, the line of reasoning used to infer certain facts can be explained at the end of the decision-making process [37].

3.1 Expert systems architecture

The default architecture of an ES is presented in Fig. 8.

The knowledge base (KB) contains the domain-specific knowledge. It is the task of the knowledge engineer to collect and encode the human knowledge from a domain-expert, such that a computer can understand it. The KB typically consists of both theoretical and practical knowledge (heuristics and rules of thumb) [9]. The knowledge can be represented either as past cases or as if-then rules.

The working memory represents a set of facts used to describe a particular situation. It encompasses all the inputs of a program and determines the starting point of the inference engine. Unlike an algorithm, an ES can start at different points of its flow, depending on the current data [9].

Figure 8: Expert systems - basic architecture

The inference engine is the heart of an ES; it schedules the rules from the KB to determine their sequence of execution and fires them using the data from the working memory as input parameters [28]. The inference engine works by reasoning (chaining facts) about the problem at hand. It does so in one of two possible ways: forward chaining or backward chaining.

Forward chaining constructs a solution starting from the initial information. The approach is suitable in situations in which there are a small number of initial conditions and a large number of potential solutions [9]. Because it proceeds from the available data towards a conclusion, forward chaining is also considered a data-driven approach to solving a problem.

Backward chaining selects a possible answer and navigates backwards to see if the input parameters match. It is a suitable technique when there are many initial conditions but few possible results. Backward chaining is considered to be a hypothesis-driven approach.
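The two chaining strategies can be contrasted with a minimal sketch over if-then rules. The rule format (a set of premises and one conclusion), the function names and the wine-advisor facts are illustrative assumptions and do not reproduce any particular ES shell.

```python
def forward_chain(facts, rules):
    """Data-driven: fire rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, seen=frozenset()):
    """Hypothesis-driven: try to prove `goal` from the facts via the rules."""
    if goal in facts:
        return True
    if goal in seen:               # guard against circular rules
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


# Hypothetical knowledge base: (premises, conclusion).
rules = [
    ({"red meat"}, "full-bodied wine"),
    ({"full-bodied wine", "budget high"}, "recommend barolo"),
]
facts = {"red meat", "budget high"}   # contents of the working memory

print(forward_chain(facts, rules))
print(backward_chain("recommend barolo", facts, rules))
```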

The user interface is the part through which the user interacts with the expert system and the main entry point of data into the program. Depending on the type of application, ES can communicate with their users in either an interactive or a non-interactive way [9]. For example, a wine-advisor expert system might model the interaction with the user as a sequence of questions. Each time the user provides an answer, the ES adjusts the interaction to fit the newly enriched context. On the other hand, a recommendation system that makes use of domain-expert knowledge might produce recommendations in a non-interactive way, simply by populating the working memory with facts from the user’s past behavior.


3.2 The knowledge acquisition process

The knowledge acquisition process has two actors. On one hand, there is the knowledge engineer (KE), a person with extensive knowledge of using and building expert systems. To some extent, the KE should also act as a solution architect when defining how the rules should fire. On the other hand, there is the domain expert, who provides the knowledge to be represented in the KB. The KE observes the domain-expert making decisions and reasoning about various situations. In addition, the domain-expert should also explain her reasoning to the KE. The task of the KE is to accurately translate the domain-expert knowledge into the set of rules that will later be incorporated in the KB.

When eliciting knowledge, the KE should consider a number of different sources (such as textbooks or reference manuals) to reinforce, enrich and understand the knowledge shared by the domain-expert [28].

3.3 Limitations and pitfalls

In [28], Sasikumar et al. list a set of characteristics a domain should have in order to support the decision of building an ES for it:

• A domain expert should be available and willing to share her knowledge about the area

• The problem the system is trying to solve could be solved by talking to the domain-expert in person

• The domain expert can solve the problem in a short amount of time

• The domain expert builds up her skills gradually as she solves more cases

• There is a book or manual which contains the same knowledge as the domain expert possesses

Even though some or all of the above items may be present in a situation, there are still limitations and pitfalls of which ES users should be aware. They are discussed in the following paragraphs.

3.3.1 Choosing the right problem

Choosing a problem that is too difficult to solve will require more resources: for example, the problem can span several domains, thus requiring more domain-experts and perhaps more KEs; it can also translate into an increased number of rules in the KB, which will negatively impact the development time, and even the quality and performance of the final system.

Also, from a business standpoint, the problem solved by the ES must justify the costs incurred by the development.

Finally, from a technological perspective, for a problem to be suitable for an ES, it should not be easily solvable using an algorithmic approach.


3.3.2 Collaborating with the domain-expert

The interaction with a domain expert can be a tedious and sometimes frustrating process. As stated earlier, knowledge acquisition should happen with the KE observing actions performed by the domain-expert. The domain-expert might not find enough time to schedule the interviews and observations through which the knowledge is elicited.

Provided the KE and the domain-expert find some common ground for the observations to take place, the KE must then demonstrate sufficient skill to interpret the rules, which the domain expert may specify in either too simplistic or too complex a way.

Finally, the KE might find herself having to cooperate with a domain-expert who doubts the effectiveness of an ES, which makes the cooperation more difficult.

3.3.3 Liability issues

Although it might not always be the case, managers who decide to implement an ES as part of their offering should be aware of the implications that potential failures entail. All actors involved in the life-cycle of an ES are, to some extent, subject to legal action [24]. For example, communication problems during the knowledge acquisition process could lead to situations where the KE, the domain-expert or the company owning the ES might be charged with negligence. On one hand, KEs can misinterpret information transmitted by the domain-expert, or they can invalidate it due to biased opinions or an overestimation of their own capabilities. Likewise, domain-experts might not be able to properly articulate the knowledge they are trying to pass on, or might not completely recall their line of reasoning with respect to specific situations. Companies may become liable simply by being situated higher in the hierarchy responsible for the ES [24].

End-users are also responsible for their actions with respect to an ES, especially in the interactive subclass of ESs. Their responses (erroneous or not) and their queries (poorly formulated or not) affect the final recommendation an ES will provide.

This subsection stops here, as the liability issues connected to ES operation are far more involved and out of scope for this study. For further law-related details, the reader is referred to [24].

3.4 Combining recommendation systems and expert systems

The approach of merging recommendation systems and expert systems into one information system is not new. This section reports on previous attempts to create hybrid systems that strive to exploit the positive aspects of RS and ES while countering their negative aspects.


Fuzzy cognitive agents

Miao et al. [22] present a new type of recommendation system called fuzzy cognitive agents. A fuzzy cognitive agent provides recommendations based on the current user’s preferences, other users’ common preferences and domain-expert knowledge. The agent’s knowledge model is represented as a fuzzy cognitive map: a weighted, signed and directed graph consisting of concepts and weights, defined as a 2-tuple $M_{FCA} = \{C, W\}$ where:

• $C = \{c_i \mid c_i \in [-1, 1]\}$ is the set of concepts

• $W = \{w_{ij} \mid w_{ij} \in [-1, 1]\}$ is the set of weights, with $i, j = 1, \dots, n$

The edges in the map indicate cause-effect relationships between two concepts, while the weight of an edge defines the strength of the relationship: a positive weight $w_{ij}$ means that the higher concept $c_i$ is, the higher concept $c_j$ will be; a negative weight means that the higher concept $c_i$ is, the lower concept $c_j$ will be. The value $c_i \in [-1, 1]$ indicates to what extent the concept is present in the map.

The system proposed in [22] is designed for an online used-car store. The domain-expert has identified five attributes that buyers are usually concerned with when they consider purchasing a second-hand car: price, model (particulars of the engine, e.g., 2.0 hybrid), mileage, age and make (the brand of the car, e.g., Toyota Prius). The domain-expert has also identified the relationships between these attributes and the customer’s satisfaction degree: the higher the price, age or mileage, the lower the satisfaction degree; in turn, the model and make attributes are positively correlated with the satisfaction degree. Moreover, price is negatively correlated with age and mileage (the higher the mileage, or the older the car, the lower the price), while the model and make attributes are positively correlated with the price.
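The used-car map can be sketched as a dictionary of signed edge weights. Only the signs follow the relationships described above; the weight magnitudes, the concept activations and the inference step (a tanh-squashed weighted sum of incoming influences, a common choice for fuzzy cognitive maps) are assumptions, since the rule used in [22] is not given here.

```python
import math

# (cause, effect) -> weight in [-1, 1]. Signs follow the text; magnitudes are made up.
WEIGHTS = {
    ("price", "satisfaction"): -0.8,
    ("age", "satisfaction"): -0.5,
    ("mileage", "satisfaction"): -0.5,
    ("model", "satisfaction"): 0.6,
    ("make", "satisfaction"): 0.6,
    ("age", "price"): -0.4,
    ("mileage", "price"): -0.4,
    ("model", "price"): 0.3,
    ("make", "price"): 0.3,
}


def effect_on(target, activations, weights=WEIGHTS):
    """Squash the weighted sum of all concepts that influence `target` into (-1, 1)."""
    total = sum(
        w * activations[cause]
        for (cause, effect), w in weights.items()
        if effect == target
    )
    return math.tanh(total)


# Hypothetical concept activations in [-1, 1] for two cars that differ only in price.
cheap = {"price": -0.5, "age": 0.2, "mileage": 0.1, "model": 0.4, "make": 0.5}
pricey = dict(cheap, price=0.9)

print(effect_on("satisfaction", cheap))
print(effect_on("satisfaction", pricey))
```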

The recommendation is computed as a mix of the user’s currently elicited preferences and other users’ preferences. Initially, the knowledge agent is initialized with the domain-expert information. The map is then adjusted by applying case-based reasoning to other users’ past history, and neural-network learning to infer the community’s general preferences. The current preferences are elicited through explicit interaction with the user. Two recommendation lists are delivered to the user: one that takes into account all three information sources, and one that does not take into account the user’s individual preferences.

The authors run two experiments: one in which the fuzzy cognitive map of the knowledge agent model is transferred onto a neural network without taking into account the weights of the map, and a second one that includes the weights. The results indicate a mean square error of 0.01% and 0.005%, respectively, while the accuracy reaches 72.8% and 79.6%.

Multi-agent expert system for electronic store

Lee [17] presents a multi-agent system that uses domain-expert knowledge and collaborative filtering techniques to provide product recommendations for an online electronics store. The ES he develops does not build user profiles to capture user preferences, but rather uses the ephemeral information provided by the user while visiting the online store. He motivates this design decision by pointing to the low frequency with which users purchase electronics items [17].

The multi-agent ES comprises four agents. The interface agent collects a set of requirements the user indicates for the future purchase. Because not all users have the knowledge required to provide quantitative information about electronic products, the interface agent collects the requirements in a qualitative way and sends the result to the decision-making agent.

The domain-expert interacts with the knowledge agent to share her knowledge about the products existing in the system. Following their interaction, a product that initially has a profile consisting of quantitative features will also have a corresponding profile consisting of qualitative ones. Several domain-experts can input their knowledge about a certain product; when this happens, their input is combined, with all experts weighted equally.

The decision-making agent receives the qualitative information from the interface agent and compares it against the qualitative features resulting from the knowledge acquisition process. The recommended product is the one with the largest benefit indicator value and the smallest cost indicator value; that is, it is positioned closest to the best solution and farthest from the worst solution that matches the criteria [17].
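The “closest to the best, farthest from the worst” criterion resembles a TOPSIS-style ranking, which can be sketched as below. Whether [17] uses exactly this formula is an assumption, as are the function name, the unweighted Euclidean distance, the common 1-5 qualitative scale and the three sample products.

```python
import math


def closeness(products, benefit_keys, cost_keys):
    """Relative closeness to the ideal solution, in [0, 1]; higher is better.

    products maps a product name to {attribute: score}. Benefit attributes
    should be high, cost attributes low. All attributes are assumed to share
    one scale, so no normalisation step is applied.
    """
    keys = list(benefit_keys) + list(cost_keys)
    best, worst = {}, {}
    for k in keys:
        values = [p[k] for p in products.values()]
        hi, lo = max(values), min(values)
        best[k], worst[k] = (hi, lo) if k in benefit_keys else (lo, hi)

    scores = {}
    for name, p in products.items():
        d_best = math.sqrt(sum((p[k] - best[k]) ** 2 for k in keys))
        d_worst = math.sqrt(sum((p[k] - worst[k]) ** 2 for k in keys))
        total = d_best + d_worst
        scores[name] = d_worst / total if total else 0.0
    return scores


# Hypothetical qualitative scores on a 1-5 scale.
products = {
    "tv_a": {"picture": 5, "sound": 4, "price": 2},
    "tv_b": {"picture": 3, "sound": 3, "price": 4},
    "tv_c": {"picture": 4, "sound": 5, "price": 5},
}
scores = closeness(products, benefit_keys=["picture", "sound"], cost_keys=["price"])
print(max(scores, key=scores.get))
```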

Finally, to minimize the interaction between the user and the interface agent, the behavior-matching agent analyzes the current user’s behavior when answering the qualitative questions and tries to find users whose behavior is similar to that of the current user. Thus, after each adjustment of preferences, the user is recommended products that were previously recommended to the matching users.

The idea of incorporating the knowledge of several domain-experts is valuable, but the reader should recall the limitations of ES mentioned in section 3.3.2. While the automation of the knowledge acquisition process is definitely a plus of the system presented in [17], combining the domain-experts’ knowledge in equal parts might prove problematic in certain situations (e.g., when a domain expert does not fully understand the purpose of the application, or does not fully understand the interface).

A recommendation system for the same domain but with a different implemen-tation is presented in [6]. Many of the authors‘ assumptions and design decisions aresimilar with the ones from [17]. For example, it is assumed that the customers do nothave enough knowledge to answer quantitative questions when eliciting user needs,and qualitative ones are used instead; also, it is assumed that customers do not buyelectronic products so often, hence there is no need to store a past user history; fi-nally, domain-expert knowledge is used to translate user preferences in quantitativemetrics, as well as to accurately represent the products in the database.

The difference between the two is that, if in [17] the author uses multi-attributedecision making to simultaneously consider customer‘s needs, Cao et al. [6] trans-late user preferences in triangular fuzzy numbers and computes similarity measuresbetween two sets of fuzzy numbers. Further technical details can be obtained byconsulting source [6].


GymSkill

GymSkill [16] is a smartphone application aimed at addressing several shortcomings identified by the authors in a previous extensive app review. GymSkill consists of an exercise database, a module for collecting sensor data (using RFID, accelerometer and magnetometer data), a module for evaluating the user's skill and presenting the feedback, and a module that recommends new exercises based on the current skill. GymSkill is designed for balance board exercises. When performing an exercise session, the user is required to place the smartphone on which GymSkill is running on the balance board. The balance board is augmented with an RFID⁷ tag, which enables the phone, through the accelerometer and magnetometer sensors, to record deviations from the initial position [16].

After the completion of the exercise, the recorded data is evaluated against ground-truth data and feedback is presented to the user. New exercises are recommended to the user based on the previous skill assessment. The domain-expert's involvement in this system consists of providing the ground-truth facts, based on which the end-user receives customized feedback.

The evaluation of GymSkill shows that the application might help in reaching a training goal, and can provide long-term motivation for the end user as well as recommendations aimed at improving certain parts of the human body in a systematic way; thus, the integration of the domain-expert knowledge in the system proves to be beneficial.

Wine advisor expert system

Finally, Dinuca and Istrate [10] present a wine advisor expert system. In this scenario, rather than encoding domain-expert knowledge in lists of weights and rankings of importance [17], [6], an “if-then-else” rule-based approach coupled with forward chaining as defined in 3.1 is preferred. I consider the approach to be suitable and I will adapt it for the system proposed in section 5. However, [10] lacks evidence with respect to the evaluation of the proposed ES; therefore, a comparison between the results of my implementation and [10] will not be possible.

3.5 Summary

This section summarizes the expert system concepts defined above by comparing them to the recommendation system concepts presented in section 2.9; next, it outlines areas in which merging recommendation systems with expert systems can prove to be efficient. The results of the comparison are presented in table 3.

⁷ http://en.wikipedia.org/wiki/Radio-frequency_identification


Concept | Advantages | Disadvantages
Recommendation systems | Easy implementation for memory-based techniques; easy explanation for CBR | Over-specialization (CBR); cold-start (CF); limited content analysis
Expert systems | Work better on very narrow domains; easy implementation of rule-based decision systems | Potential faulty collaboration between KE and domain-expert; liability issues

Table 3: Recommendation systems vs. Expert systems

By augmenting a recommendation system with the knowledge of a domain expert, several of the negative aspects of both approaches can be mitigated. First, as discussed in section 2.7, one major problem of content-based filtering techniques is over-specialization, that is, the recommendation system's inability to recommend items very different from the ones a user has previously indicated as relevant. By plugging in a component that leverages a domain-expert's knowledge, recommendations can be enriched with potentially novel items. The assumption behind this statement is that a domain-expert is able to tell what a user needs, which may be different from what a user likes. A good example where domain-expert knowledge is used for this purpose is presented in [6].

Second, a domain-expert enhanced recommendation system can overcome the cold-start problem of a collaborative filtering technique. Cold-start refers to the inability of a new product (with very few ratings) to participate in the recommendation process, or to the inability of the system to recommend any items to a user with no ratings. By using knowledge from past experience, a domain-expert can better articulate the needs of a new user; alternatively, the rules on which the expert system operates can lead to recommending items newly added to the system.

Next, in a content-based approach, an item is typically represented as a vector in an n-dimensional space. Determining the dimension of the vector space is a design decision which has direct implications on the result of the similarity metric used in the system and, consequently, affects the results of the recommendations (e.g., the similarity between two items represented in a 2-dimensional space may be fundamentally different from the similarity between the same two items represented with n extra dimensions). A domain-expert can provide new interpretations for the values of an item's features and can potentially enrich the recommendation list or improve the novelty of the results.
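The effect of the chosen dimensionality on a similarity score can be illustrated with a small example. The vectors below are invented for illustration only, and cosine similarity is used merely as one possible metric.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Two hypothetical items that share nothing in a 2-dimensional representation.
item_a_2d = (1.0, 0.0)
item_b_2d = (0.0, 1.0)

# The same items once two extra features, on which they agree, are added.
item_a_4d = item_a_2d + (1.0, 1.0)
item_b_4d = item_b_2d + (1.0, 1.0)

print(cosine(item_a_2d, item_b_2d))  # the items look unrelated
print(cosine(item_a_4d, item_b_4d))  # the items now look fairly similar
```

The items themselves have not changed between the two calls; only the representation has, which is why the choice of features deserves the attention of a domain-expert.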

Finally, the data used by a recommendation system can be mined, and use-cases can be built and then used to improve the collaboration process between the knowledge engineer and the domain-expert; the knowledge engineer can use the data to better illustrate the behavior of a particular user, while the domain-expert can better exemplify the result of applying a certain piece of knowledge to a specific scenario.

These ideas represent the main arguments for the decision to build the hybrid recommendation system, which will be described in section 5.


4 OmaTauko - Concept Description

This section briefly presents the current state of OmaTauko, a commercial system that will benefit from the development of the hybrid recommendation system proposed in this study. First, the fundamental idea of OmaTauko is presented, together with an explanation of the basic interaction of the user with the system. Next, the entities involved in the system are described by detailing the domain model.

4.1 OmaTauko - Concept description

Framgo is a start-up founded in September 2012 in Helsinki, Finland, operating in the domain of occupational health and well-being. Framgo provides occupational health services and products to small and medium-sized companies along three lines:

• The digital service OmaTauko

• An ergonomics division selling products related to ergonomics - the products included in this offering consist of small add-ons, such as back supports for office chairs, ergonomic mice and keyboards, or larger ones, such as ergonomic chairs or tables with adjustable height

• A wearable device able to measure muscle and fat tissue, heart rate, blood pressure and several other physiological markers.

OmaTauko is the most developed component of the company's offering and has already been launched on the Finnish market. OmaTauko is an occupational health product-service system (PSS) which provides a way to take short micro-breaks by combining physical workout gear with a smartphone application. The best way to decrease musculoskeletal problems and prevent a number of connected illnesses is regular physical exercise. Additionally, according to [13], 12 minutes of break per day give people energy and decrease stress levels. The exercises featured in the mobile app are specifically designed by a personal trainer (who in this context acts as the domain expert) to decrease neck, shoulder and back pain.

The contents of OmaTauko can be tailored to a customer's needs; a starter package consists of the following elements:

• four kettlebells weighing 8 kg, 6 kg, 4 kg and 2.5 kg

• a foam roller

• a balance board

• individual stress relief balls for each employee

• a specially designed shelf for storing and easily accessing the equipment

• an introduction session held by a personal trainer at the customer's location


In addition to the starter package, the employees of the company gain access to the smartphone application. At the moment, Framgo addresses two out of the three major mobile platforms - Windows Phone (WP) and iOS - as well as the web platform.

From a technological perspective, OmaTauko is architected as a client-server application. The server-side component is developed in Python using the Flask framework and communicates with a PostgreSQL database. Both the WP and the iOS clients are built as thin clients in terms of data storage, with all the content being served over the network; therefore, in order to use the application, a user needs access to an Internet connection.

When using the application, the user has an overview of the weekly progress as well as of the current day. A day in which the user has completed the target of 12 minutes of break is marked differently from a day that was either partially completed or not started at all (Fig. 9).

Figure 9: Application starting screen

The user can choose to start a new break; when she does so, she is required to select the devices (tennis ball, kettlebell, own weight, foam roller) with which she would like to exercise during the break. She is also requested to select the duration of the break (current possible values are 2, 4 or 6 minutes) (Fig. 10). Once she does that, she is offered a list of tasks which corresponds to her selection. When viewing an exercise, the user is shown a title and a short description, together with a video playing in an infinite loop. Should she decide that the current exercise is not relevant for her break or is too difficult, she has the option of skipping it and moving on to the next one (Fig. 11). However, if so many exercises are skipped that the selected break duration can no longer be completed, the process is aborted and the user is asked to provide feedback on the reason for skipping the exercises.


Figure 10: Break configuration

Figure 11: Performing an exercise

While using the app, the user also has access to a small set of statistical data, such as a monthly overview of her breaks, the total number of completed, partially completed or incomplete days (Fig. 12), the current number of completed days in a row, or the distribution of exercises with respect to the types of devices available (e.g., 55% of the exercises have been completed using a kettlebell, while 45% of the exercises have been completed doing stretching routines) (Fig. 13).

Figure 12: Monthly statistics view

Figure 13: Overall statistics view

The user has the option of scheduling customizable recurrent in-app reminders that let her know when it is time to have a micro-break in order to restore her energy levels (Fig. 14). Finally, in the settings area, the user can fill in a minimal set of personal information (name, birth date, gender) (Fig. 15).


Figure 14: Scheduling reminders

Figure 15: User details

4.2 Domain model

The UML diagram representing the fraction of the domain model that is relevant for this work is presented in figure 16.

Figure 16: OmaTauko - Domain model


User

A user gains access to the system following a registration process. After she registers, she is asked to enter a minimal set of demographic information (date of birth and gender). However, at this point in time the demographic information is not mandatory; therefore, it cannot be used in the recommendation algorithm.

Basetask

A basetask is an exercise which a user should perform during a break. It is identified in the system by the following attributes:

• name - the title of the task

• description - a short description of the movement the user should perform during the exercise

• category - the type of exercise; the duration of a task is defined based on it. For the time being, all the tasks have the same type and a duration of 30 seconds. Future development will address other types of tasks with variable durations

• device id - the device with which the task should be executed; a basetask is performed with exactly one device

• muscle group - the muscle group that the current task aims to improve; a basetask addresses one primary muscle group

• complexity - the complexity of the movement required by the current task; a basetask has exactly one complexity level
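The attributes listed above can be pictured as a simple record. The following is an illustrative sketch only: the class, the field types and the sample values are assumptions and do not reproduce the actual OmaTauko schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basetask:
    name: str                   # title of the task
    description: str            # short description of the movement
    category: str               # type of exercise; determines the duration
    device_id: int              # exactly one device per basetask
    muscle_group: str           # one primary muscle group
    complexity: str             # exactly one complexity level
    duration_seconds: int = 30  # all tasks currently last 30 seconds

# Hypothetical sample record, invented for illustration.
example = Basetask(
    name="Kettlebell row",
    description="Pull the kettlebell towards the hip while keeping the back straight.",
    category="standard",
    device_id=1,
    muscle_group="back",
    complexity="medium",
)
print(example.name, example.duration_seconds)
```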

Devices

A device represents a physical object with which a task is completed. The current version of the system includes tasks that can be performed with two physical devices - kettlebell and tennis balls - and tasks that can be performed without any devices - the user's own body weight and stretching moves. As such, even though body weight or stretching movements cannot be considered proper devices, they are included in the list and the user can select them to indicate her preferred type of exercise.

Other devices that are being considered for inclusion in the system are a foam roller, a gym stick, an elastic band and a balance board.

A device can be associated with several tasks.

Muscle groups

The tasks in the system have been designed by a professional trainer to reduce musculoskeletal problems and pain in three major areas of the body, as well as to improve overall posture. The areas addressed by the exercises are back, shoulders and legs. A small subset of exercises is designed for miscellaneous tasks such as simple massage techniques or wrist rotations. In the system, these exercises fall under the category labeled “maintenance”.

A muscle group can be associated with several tasks.

Complexity levels

Complexity levels capture differences between various tasks along two major dimensions: the number of muscle groups involved in a movement (e.g., often a movement does not completely isolate a muscle group but instead targets a primary group and incidentally trains a secondary one) and the complexity of the movement (e.g., flexion, joint rotation, rotation and translation, etc.).

A complexity level can be associated with several tasks.

Break

A break represents a set of tasks executed by a user in a certain order, such that the user exercises for at least the period of time indicated at the beginning of the break (e.g., 2 minutes). A completed break is identified in the system by the following elements:

• id - an identifier that uniquely points to a break and indicates which tasks have been included in that break

• user id - the id of the user who has performed the break

• date - the moment in time when the user has performed the break

• duration - the duration of the break measured in seconds; a completed break has a duration equal to the duration indicated by the user when she started the break; a break that was not completed (as a result of skipping too many tasks) is saved in the database with a duration of 0 seconds; a break that was aborted is not saved in the database

Completed task

A completed task is an instance of a basetask; several completed tasks are part of a break. The duration of the break dictates the number of tasks included in that break. Considering that for the moment all the tasks have a duration of 30 seconds, the number of tasks for each type of break is as follows:

• a 2-minute break consists of 8 basetasks, out of which at least 4 need to be completed




The role of the 4 extra tasks that are included in each break is to allow the user to skip a task should she not like it or find it too complex.

A completed task is identified through the id of the basetask, the id of the break in which it was included and a boolean flag indicating whether the task has been completed or not. The flag is set to TRUE if the task is watched until the end without skipping it, or to FALSE if a “skip” event has occurred before the duration of the task has expired.
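The arithmetic behind the number of tasks in a break, and the way the completion flags could determine the stored duration, can be sketched as follows. The function names are hypothetical, and deriving the stored duration from the flags in this way is an assumption based on the description of the break entity, not the actual implementation.

```python
TASK_SECONDS = 30  # current duration of every task
EXTRA_TASKS = 4    # spare tasks included in each break so that the user can skip

def tasks_in_break(break_minutes):
    """Return (tasks that must be completed, tasks offered) for a break."""
    required = break_minutes * 60 // TASK_SECONDS
    return required, required + EXTRA_TASKS

def stored_duration(break_minutes, completed_flags):
    """Seconds saved for a break: the full length if enough tasks were completed, else 0."""
    required, _ = tasks_in_break(break_minutes)
    done = sum(1 for flag in completed_flags if flag)
    return break_minutes * 60 if done >= required else 0

print(tasks_in_break(2))                                 # (4, 8)
print(stored_duration(2, [True] * 4 + [False] * 4))      # 120
print(stored_duration(2, [True] * 3 + [False] * 5))      # 0
```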


5 System Design and Implementation Details

This chapter delves into the details of Oskar - a recommendation system that makes use of domain-expert knowledge and user preferences to give users tailored recommendations. Oskar is the recommendation engine that powers the OmaTauko product-service system.

First, a motivation for the decision to build a domain-expert knowledge-enhanced recommendation system is provided. Next, the overall architecture of Oskar and its components is detailed.

5.1 Motivation for a domain-expert enhanced recommendation system

Having described the current state of OmaTauko and detailed the domain model, and using the insights presented in section 3.5, I will now provide the reasons why I consider that a recommendation system leveraging the knowledge of a domain-expert is a good fit for OmaTauko.

First, the domain addressed by OmaTauko is highly specialized and very narrow. Occupational health is a sub-domain of health and well-being and I consider that a system targeted at this domain (or any of its sub-domains) should benefit from the knowledge of a domain expert. Furthermore, OmaTauko recommends physical exercises to its end-users. The exercises involve various types of movements and working with potentially heavy objects; thus, the risk of misuse and injury exists. The domain-expert is already part of the OmaTauko offering through his participation in the videos used to illustrate the exercises. However, I consider it highly relevant that the domain-expert be involved in the decision process as well, in order to enhance the quality of recommendations for the end-users. This decision is also backed by findings from the literature which show that too few health and well-being mobile apps include evidence-based content and theory-based strategies that would lead to a significant improvement in the user's life [4], [16].

Second, the current pool of basetasks already contains over 100 items. This, coupled with the fact that the primary delivery platform for OmaTauko is the mobile device, renders the display of all the exercises virtually impossible. One solution would be to present the basetasks in a hierarchical view and allow the user to select the basetask herself. However, this approach would involve a lot of user interaction with the system, and the whole time allocated for the break would likely be spent browsing the list of exercises. In the current format, OmaTauko allows a user to start a break and exercise with only three touches of the screen. Another argument against this solution is that it would allow the user to repeatedly select the same exercise(s) and potentially over-develop only one part of the body. A recommendation system would solve these issues by rotating the exercises in an intelligent manner such that a user does not have to repeat an exercise on successive days.

I am opting for a hybrid recommendation system which blends a content-based filtering technique with the knowledge elicited from a domain-expert. I base my decision to include domain-expert knowledge in the recommendation system on the fact that, in a setting such as health and well-being, only the knowledge of a domain-expert could lead to generating recommendations that are in line with the user's current needs. The choice of a CBR component over a collaborative approach is motivated by the fact that users are quite different in terms of physical condition and endurance; what is suitable for one user may be harmful for another and, therefore, the voice of the community should not have a high impact on the final recommendations. User demographics could have potentially improved this setting; however, at the moment, due to technical limitations and lack of reliable information, a demographic component cannot be implemented. Finally, with respect to over-specialization - the major shortcoming of the CBR approach - I expect this to be compensated by the domain-expert component, as will be illustrated shortly.

5.2 Domain expert involvement

For the purpose of collecting the domain knowledge required for our RS, I collaborated with a domain-expert. In this particular scenario, the domain-expert is a personal trainer with advanced knowledge of fitness training techniques and training schedule creation.

Over the course of several meetings we discussed the aspects that should be taken into consideration when building a list of recommended tasks that are included in a break. We encoded the results of our discussions in “if-then” rules, as presented in [10]. The domain-expert is actively involved in three of the four components of the algorithm, namely: the user level decision component, the expert recommendation component and the tasklist ranking component.

Additionally, using concepts presented in [17] and [22], I asked the domain-expert to assign weights to the following concepts in the system: devices, muscle groups and complexity levels. By using these weights, I wanted to better capture the differences between various types of devices or muscle groups. For example, working with a kettlebell is fundamentally different from working with a tennis ball. Similarly, performing an exercise for legs has a different impact than performing an exercise for shoulders. Moreover, I used the weights to compute various similarity measures and decide on the one that best captures the difference between two basetasks. The results are synthesized in section 5.3.2.

5.3 System description

5.3.1 Objective

The entities participating in the proposed recommendation system are:

• active user - the user who explicitly takes a break and is recommended a list of exercises

• basetask - an exercise with a title and description, which must be executed with one of the available devices, having a well-defined complexity level and targeting a primary muscle group

• user preferences

– a list of breaks performed in the past and their corresponding tasks. The user preferences aggregate the total duration of the workout over a certain period of time, and whether the tasks were liked by the user or not. The system differentiates between a liked and a not-liked task by using an implicit feedback heuristic: if an exercise runs until the end of its duration (e.g., 30 seconds), it is inferred that it was a suitable exercise and it is marked as liked; if an exercise is skipped, regardless of whether it was started or not, it is inferred that the exercise was not suitable and it is marked as unliked

– break duration - the duration of the current break, in minutes, explicitly indicated by the user through the user interface

– list of devices - a list of devices with which the user would like to perform the exercises in the current break; explicitly defined by the user through the user interface

• domain-expert knowledge

– set of rules according to which the expert recommendation should be computed; the rules are detailed in section 5.3.4

– set of weights associated with the concepts of muscle group, devices and complexity levels, aimed at better discriminating between different values of the same concept
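The implicit feedback heuristic listed among the user preferences can be sketched in a few lines. The record layout (pairs of basetask id and completion flag) and the function name are assumptions made for illustration.

```python
def infer_feedback(completed_tasks):
    """Map each (basetask_id, completed) record to 'liked' or 'unliked'.

    A task that ran until the end of its duration counts as liked;
    a skipped task counts as unliked, whether or not it was started.
    """
    return {
        task_id: "liked" if completed else "unliked"
        for task_id, completed in completed_tasks
    }

# Hypothetical history of one break: basetask id and completion flag.
history = [(17, True), (42, False), (8, True)]
feedback = infer_feedback(history)
print(feedback)
```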

Thus, the problem the algorithm is trying to solve can be stated as follows.

Problem statement

Given the user's explicit preferences with respect to the duration of the break and the devices with which she would like to exercise, the algorithm makes use of past user preferences and domain-expert knowledge to compute a list of exercises that fit the current level of the user, the desired duration and the selected devices.

5.3.2 Choosing the similarity measure

In order to decide which similarity measure to use, I performed several experiments computing various similarity measures between each possible pair of tasks from our collection. I searched for a similarity measure that would transition smoothly from the minimum possible value (0) to the maximum possible value (1) across the whole set of possible pairs. I was interested in a smooth transition such that, when applying a similarity threshold, I would not lose too many items due to large gaps between similarity scores.


Relative weights

First, I tried an approach in which a task has exactly one muscle group associated with it. Thus, a task is represented by a vector in a 3-dimensional space having the following configuration:

• t = <muscle_group, device, complexity>

where the concepts of muscle group, device and complexity are encoded using the values from Fig. 17. The weights were assigned by the domain-expert and are designed to capture the differences between two elements from the same category. For example, an exercise for the back is closer in execution to an exercise for the shoulders than to an exercise for the legs. All of them are fundamentally different from the maintenance exercises. Likewise, an exercise performed with the user's body weight is more similar to an exercise performed with a kettlebell, and both are fundamentally different from an exercise performed with the tennis ball.

Muscle group | Weight
Maintenance | 0.001
Shoulder | 0.1
Back | 0.3
Leg | 1

Device | Weight
Tennisball | 0.001
Stretching | 0.01
Bodyweight | 0.3
Kettlebell | 1

Complexity | Weight
Low | 0.01
Medium | 0.1
High | 1

Figure 17: Domain-expert defined relative weights
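As an illustration, the weights from Fig. 17 can be used to encode a task as a 3-dimensional vector. The dictionary and function names below are hypothetical; only the numeric weights come from the figure.

```python
# Relative weights as given in Fig. 17.
MUSCLE_GROUP = {"maintenance": 0.001, "shoulder": 0.1, "back": 0.3, "leg": 1.0}
DEVICE = {"tennisball": 0.001, "stretching": 0.01, "bodyweight": 0.3, "kettlebell": 1.0}
COMPLEXITY = {"low": 0.01, "medium": 0.1, "high": 1.0}

def encode_task(muscle_group, device, complexity):
    """Encode a task as t = <muscle_group, device, complexity>."""
    return (MUSCLE_GROUP[muscle_group], DEVICE[device], COMPLEXITY[complexity])

print(encode_task("maintenance", "tennisball", "low"))  # (0.001, 0.001, 0.01)
print(encode_task("leg", "kettlebell", "high"))         # (1.0, 1.0, 1.0)
```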

I measured the cosine similarity, the Jaccard index (interpreting a tuple as a set) and the Euclidean distance. The results are shown in Fig. 18.

Not surprisingly, the Jaccard index is not a good similarity measure in the current representation, due to the small number of distinct elements in a tuple. Euclidean distance performs better than the Jaccard index; however, there are still areas on the plot where the transition from one pair to another is done in significant steps. Cosine similarity best captured the pattern I was looking for; however, cosine similarity is unable to capture the difference in magnitude between two vectors. For example, consider two tasks t1 = <0.001, 0.001, 0.01> and t2 = <1, 1, 1>, corresponding to the two most different tasks in the system. According to (6), sim(t1, t2) ≈ 0.69, while the Euclidean distance between the two, as defined by (9), is d(t1, t2) = 0.98, which corresponds to a similarity of 0.02 and captures the difference between the two items more accurately.
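The insensitivity of cosine similarity to magnitude can be checked directly on the two vectors from the example. The sketch below uses the standard cosine formula, which is assumed to correspond to (6); the Euclidean distance of (9) is not reproduced here because its exact definition is given elsewhere in the text.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

t1 = (0.001, 0.001, 0.01)
t2 = (1.0, 1.0, 1.0)

# Scaling t1 changes its magnitude but not its direction.
scaled_t1 = tuple(100 * x for x in t1)

print(cosine_similarity(t1, t2))
print(cosine_similarity(scaled_t1, t2))  # same value up to rounding
print(math.hypot(*t1), math.hypot(*t2))  # the magnitudes differ greatly
```

Both calls return the same score although the vectors differ by two orders of magnitude, which is the reason a distance-based measure is preferred in the remainder of this section.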


[Plot: similarity (y-axis) against pair number (x-axis, 0 to 1,200); curves: cosine similarity, Jaccard index, Euclidean distance]

Figure 18: Similarity metric comparison for task representation t = <m, d, c>

11-dimensional, boolean representation

In the second approach I increased the number of dimensions in which a task is represented, in a fashion similar to the one described in [35]. Thus, a task is represented in an 11-dimensional space having the following format:

• features 1-4 encode the muscle group targeted by the task; only one feature in positions 1-4 has a value of 1, while the others have a value of 0; the sequence of features 1-4 is <shoulder, back, leg, maintenance>

• features 5-8 encode the device with which the task should be performed; only one feature in positions 5-8 has a value of 1, while the others have a value of 0; the sequence of features 5-8 is <tennisball, stretching, bodyweight, kettlebell>

• features 9-11 encode the complexity of the task; only one feature in positions 9-11 has a value of 1, while the others have a value of 0; the sequence of features 9-11 is <low, medium, high>

Therefore, a task targeted at the legs, of medium complexity and executed with bodyweight has the encoding t = <0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0>.
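The 11-dimensional boolean encoding described above can be sketched as a concatenation of three one-hot blocks. The helper names are hypothetical; the feature order follows the sequences given in the list.

```python
MUSCLE_GROUPS = ("shoulder", "back", "leg", "maintenance")           # features 1-4
DEVICES = ("tennisball", "stretching", "bodyweight", "kettlebell")   # features 5-8
COMPLEXITIES = ("low", "medium", "high")                             # features 9-11

def one_hot(value, vocabulary):
    """Return a list with 1 at the position of value and 0 elsewhere."""
    if value not in vocabulary:
        raise ValueError(f"unknown value: {value}")
    return [1 if item == value else 0 for item in vocabulary]

def encode_task(muscle_group, device, complexity):
    """Concatenate the three one-hot blocks into an 11-dimensional vector."""
    return (
        one_hot(muscle_group, MUSCLE_GROUPS)
        + one_hot(device, DEVICES)
        + one_hot(complexity, COMPLEXITIES)
    )

print(encode_task("leg", "bodyweight", "medium"))
# [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
```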

For this approach I only considered cosine similarity and Euclidean distance, and ruled out the Jaccard index from the start on the grounds that the only distinct elements in this space are 0 and 1, so the Jaccard index would have had only three potential values: {0, 0.5, 1}.

The results obtained using this representation were worse than the results yielded by the first approach (Fig. 19).


[Plot: similarity (y-axis) against pair number (x-axis, 0 to 1,200); curves: cosine similarity, Euclidean distance]

Figure 19: Similarity metric comparison for task representation t = <m, m, m, m, d, d, d, d, c, c, c>

6-dimensional, fuzzy representation of the muscle groups

Finally, in the third approach I represented the task as a 6-tuple, t = <m1, m2, m3, m4, d, c>, with the following features:

• features 1-4 capture the targeted muscle group of the exercise; however, in this approach I leverage the fact that the exercises in the system do not fully isolate a muscle group; a task primarily addresses a dominant muscle group, but at least one other group may be trained incidentally and to a small extent. The sum of the first four components should be 1.

• feature 5 represents the device with which the task should be performed

• feature 6 represents the complexity of the task

• each of the components has an associated weight, defined as:

$$w_{m_i} = 0.25,\ i = 1, \dots, 4; \qquad w_d = 1; \qquad w_c = 1 \tag{15}$$

The domain-expert indicated the dominant muscle group for each exercise and, if the task does not fully isolate the movement, at least one adjacent muscle group that is incidentally trained. I tested this representation with cosine similarity, classic Euclidean distance, and three variations of a weighted distance defined by the equation:

$$wd_k(a, b) = \left( \frac{\sum_{i=1}^{n} w_i \cdot (a_i - b_i)^2}{\sum_{i=1}^{n} w_i} \right)^{k} \tag{16}$$

for k = 0.5, 1, 2. The similarity between two items, when using the wd_k distances, is defined as:

$$swd_k(a, b) = 1 - wd_k(a, b) \tag{17}$$
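Equations (15) to (17) can be sketched in code as follows. The function names and the two sample tasks are hypothetical, and the assumption that the components d and c take the relative weights of Fig. 17 is mine; the exponent k is applied to the whole normalized sum, as in (16).

```python
# Component weights from (15): 0.25 for each muscle-group feature, 1 for d and c.
WEIGHTS = (0.25, 0.25, 0.25, 0.25, 1.0, 1.0)

def weighted_distance(a, b, k, weights=WEIGHTS):
    """wd_k(a, b): weighted mean of squared differences, raised to the power k."""
    numerator = sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))
    return (numerator / sum(weights)) ** k

def weighted_similarity(a, b, k, weights=WEIGHTS):
    """swd_k(a, b) = 1 - wd_k(a, b)."""
    return 1 - weighted_distance(a, b, k, weights)

# Hypothetical tasks <m1, m2, m3, m4, d, c>; the first four components sum to 1.
shoulder_task = (0.8, 0.2, 0.0, 0.0, 1.0, 0.1)  # mainly shoulder, some back
leg_task = (0.0, 0.0, 1.0, 0.0, 0.3, 0.1)       # legs only

for k in (0.5, 1, 2):
    print(k, round(weighted_similarity(shoulder_task, leg_task, k), 3))
```

For values below 1, a larger k shrinks the distance and therefore raises the similarity, which is consistent with wd2 being the least discriminative of the three variants.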

[Plot: x-axis "Items" (0 to 3,200), y-axis "Frequency" (−0.2 to 1.2); curves: d, wd1, wd2, wd1/2, sim, jac]

Figure 20: Similarity metric comparison for task representation t = <m1, m2, m3, m4, d, c>

The results from Fig. 20 suggest the following:

• Jaccard index (curve jac - purple color) exhibits very sharp transitions from one value of similarity to the next; therefore, it is not the desired metric

• Cosine similarity (curve sim - orange color) has a smooth transition over the range of possible values and its shape would recommend it as a good candidate; however, as discussed earlier, it does not account for the magnitude of the vectors between which the similarity is computed, so it will not be able to detect differences in magnitude for the components d and c of the task. Consequently, I discarded this measure as well


• Euclidean distance (curve d - red color) exhibits a very sharp increase around the similarity value of 0.65. Moreover, if a threshold of 0.6 were used to define similarity between two items, only approx. 600 pairs of items would pass it with this distance, which accounts for about 35 distinct items (k items generate k(k − 1)/2 pairs). Considering that some of the items might not fit the user-preference requirement, I considered the resulting number to be too low; lowering the threshold would defeat the purpose of item similarity. Therefore, I discarded this measure as well.

• Weighted distance (k = 1/2) (curve wd1/2 - black color) has a shape similar to curve d; therefore, the same arguments apply and I did not consider the metric suitable

• Weighted distance (k = 2) (curve wd2 - blue color) is not discriminative enough for the current set of items. Even a restrictive threshold of 0.8 would result in considering approx. 2200 pairs (approx. 65 items).

• Weighted distance (k = 1) (curve wd1 - green color) is the metric which I decided to use, since it exhibits the best trade-off in terms of discriminative power and shape.

5.3.3 Choosing the similarity threshold

In order to determine the value of the threshold used for deciding on the similarity between two items, I have computed the similarity of each item against all the other items, and plotted the values in a heat-chart map presented in Fig. 21.

Figure 21: Heat-chart map of the similarities between all pairs of items, wd1, τ = 0.8


The matrix has the dimension 78 × 78, with the element Sim(i, j) representing the similarity between the item at position i and the item at position j. Because the considered metric is symmetric (wd1(i, j) = wd1(j, i)), the matrix is symmetrical with respect to the main diagonal.

In order to determine a suitable value for the similarity threshold (τ), I applied the following rules:

colorcell(i, j, \tau) =
\begin{cases}
\text{white}, & Sim(i, j) < \tau \\
\text{red}, & Sim(i, j) = \tau \\
\text{green}, & Sim(i, j) = 1 \\
\text{linearly interpolated between red and green}, & \tau < Sim(i, j) < 1
\end{cases}

for τ = 0.8.

Fig. 21 suggests that, with only three exceptions, τ = 0.8 is a good value for the similarity threshold, as for each item there is a list of at least 30 potential similar candidates.
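The coloring rule and the per-item candidate count can be sketched as below. This is only an illustration: the RGB encoding of the colors, the function names and the toy similarity matrix are assumptions, not part of the original implementation.

```python
from typing import List, Sequence, Tuple

WHITE = (255, 255, 255)


def color_cell(sim: float, tau: float = 0.8) -> Tuple[int, int, int]:
    """Map a similarity value to an RGB color following the colorcell rule.

    Below tau the cell is white; at tau it is red; at 1 it is green;
    in between, red and green are linearly interpolated. Requires tau < 1.
    """
    if sim < tau:
        return WHITE
    # Position of sim between tau (0.0) and 1 (1.0), clamped for safety.
    t = min(max((sim - tau) / (1.0 - tau), 0.0), 1.0)
    return (round(255 * (1.0 - t)), round(255 * t), 0)


def candidates_per_item(sim_matrix: Sequence[Sequence[float]],
                        tau: float = 0.8) -> List[int]:
    """For each item, count the other items whose similarity reaches tau."""
    return [
        sum(1 for j, s in enumerate(row) if j != i and s >= tau)
        for i, row in enumerate(sim_matrix)
    ]


# Hypothetical 3 x 3 symmetric similarity matrix.
toy_matrix = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.85],
    [0.5, 0.85, 1.0],
]
print(candidates_per_item(toy_matrix))
```

Counting the candidates per row is one way to check the claim that a given τ leaves enough similar items for each task.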

5.3.4 Oskar architecture

The architecture of Oskar is presented in Fig. 22. The green (dark) blocks represent the areas where the domain-expert is involved, while the yellow (light) blocks represent the areas where user preferences are captured or exploited.

User Interface

The user interface module provides the user with the means to interact with the system. In the current setting the user has two available platforms of interaction: mobile (using touch technology) and web (using classic keyboard and mouse interaction). The user interface is the entry point of the recommendation process. From the starting screen (Fig. 9) the user indicates that she would like to start a break. The next step is to select the devices with which she would like to exercise during the break, and the duration of the break. Once this information is provided, the recommendation process is initiated.

User Level Update Component

The User Level Update Component takes as input the user's current level and her past history and, using rules defined by the domain-expert, computes an updated user level. The currently possible levels are beginner, intermediate and advanced. The domain-expert has defined the following rules for upgrading/downgrading from one user level to another:

• [Upgrade rule #1] a user can advance from beginner level to intermediate level if she exercises an average of 40 minutes per week for the past three weeks; the reason behind this rule is that beginner users have to show a fair level of commitment and develop a healthy habit of exercising

• [Upgrade rule #2] a user can advance from intermediate level to advanced level if she exercises an average of 55 minutes per week for the past two weeks; the reason behind this rule is that an intermediate user who has already developed a healthy habit of exercising must show an increased level of commitment for the results to start paying off

• [Upgrade rule #3] a user can maintain her advanced level if she exercises an average of 60+ minutes per week in the past week; the intuition behind this rule is that staying in top shape should be more difficult than getting there

• [Downgrade rule] a user is downgraded to the previous level if she fails to achieve the average weekly break duration for her level
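The level rules above can be sketched as a small function. The text does not state which weekly average an intermediate user must reach to keep her level, so the sketch assumes it is the 40-minute, three-week average that granted the level; this assumption, the function names and the treatment of missing weeks as zero minutes are illustrative only.

```python
from typing import Sequence


def average_minutes(weekly_minutes: Sequence[float], weeks: int) -> float:
    """Average exercise minutes over the last `weeks` weeks (oldest first).

    Weeks missing from the history count as zero minutes.
    """
    return sum(weekly_minutes[-weeks:]) / weeks


def update_level(level: str, weekly_minutes: Sequence[float]) -> str:
    """Apply the expert-defined upgrade, maintain and downgrade rules."""
    if level == "beginner":
        # Upgrade rule #1: 40 min/week on average over the past three weeks.
        return "intermediate" if average_minutes(weekly_minutes, 3) >= 40 else "beginner"
    if level == "intermediate":
        # Upgrade rule #2: 55 min/week on average over the past two weeks.
        if average_minutes(weekly_minutes, 2) >= 55:
            return "advanced"
        # ASSUMPTION: keeping the intermediate level requires the same
        # average that granted it; otherwise the downgrade rule applies.
        return "intermediate" if average_minutes(weekly_minutes, 3) >= 40 else "beginner"
    if level == "advanced":
        # Upgrade rule #3: 60+ minutes in the past week to stay advanced.
        return "advanced" if average_minutes(weekly_minutes, 1) >= 60 else "intermediate"
    raise ValueError(f"unknown level: {level}")


print(update_level("beginner", [40, 45, 35]))
print(update_level("advanced", [70, 59]))
```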

Figure 22: Oskar architecture


Training Type Decision Component

The training type component receives as input the updated user level and the user's past preferences, and decides on the type of training for the next break. Two options are available: holistic training and focused training.

A holistic training is an exercise plan for a break which targets all the muscle groups the system is trying to address (e.g., back, shoulder, leg, maintenance). By contrast, a focused training is an exercise plan for a break which targets at most two muscle groups.

The decision regarding the type of training is made using the following rules:

• [Holistic training rule #1] If the user level is beginner, then the training is holistic

• [Holistic training rule #2] If the user level is at least intermediate and, in the last week, she did not have at least three days with at least three trainings each, then the training is holistic

• [Focused training rule] Otherwise (the else branch of [Holistic training rule #2]), the training is focused

The decision regarding which muscle groups should be included in the break is made as follows:

• [Muscle groups rule #1] If the training type is holistic, then all the muscle groups are included in the training

• [Muscle groups rule #2] If the training type is focused, the next muscle group(s) that should be trained is (are) selected. The sequence in which the muscle groups should be trained is defined by the domain-expert.

• [Muscle groups rule #3] If the duration of the training is 2 or 4 minutes, only one group is included; if the duration is 6 minutes, two muscle groups are included

The result of this component is a list of muscle groups which should be targeted by the exercises in the new break.
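A minimal sketch of these decision rules is given below. It assumes that [Muscle groups rule #3] applies to focused trainings only, that the expert-defined sequence is the order in which the muscle groups are listed in the text, and that the sequence wraps around; the function and parameter names are hypothetical.

```python
from typing import List, Sequence

# ASSUMPTION: the expert-defined training sequence follows this order.
MUSCLE_GROUPS = ["back", "shoulder", "leg", "maintenance"]


def training_type(level: str, trainings_per_day: Sequence[int]) -> str:
    """Decide between holistic and focused training.

    `trainings_per_day` holds the number of trainings done on each day
    of the last week.
    """
    if level == "beginner":
        return "holistic"  # Holistic training rule #1
    active_days = sum(1 for count in trainings_per_day if count >= 3)
    # Holistic training rule #2 and its else branch (focused training rule).
    return "focused" if active_days >= 3 else "holistic"


def select_muscle_groups(kind: str, duration_minutes: int,
                         last_group_index: int) -> List[str]:
    """Return the muscle groups to target in the next break.

    `last_group_index` is the position in MUSCLE_GROUPS of the group
    trained most recently in a focused break.
    """
    if duration_minutes not in (2, 4, 6):
        raise ValueError("only breaks of 2, 4 and 6 minutes are supported")
    if kind == "holistic":
        return list(MUSCLE_GROUPS)  # Muscle groups rule #1
    count = 2 if duration_minutes == 6 else 1  # Muscle groups rule #3
    start = (last_group_index + 1) % len(MUSCLE_GROUPS)
    # Muscle groups rule #2: take the next group(s) in the sequence.
    return [MUSCLE_GROUPS[(start + i) % len(MUSCLE_GROUPS)] for i in range(count)]


kind = training_type("advanced", [3, 4, 3, 0, 0, 0, 0])
print(kind, select_muscle_groups(kind, 6, last_group_index=3))
```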

Domain Expert Recommendation Component

The domain-expert recommendation component uses the muscle groups and the list of devices, and returns all the tasks that match these restrictions. The result serves as one of the inputs for the tasklist ranking component.

User Preferences Recommendation Component

The user preferences recommendation component implements a content-based filtering technique as described in section 2.4.

First, the date of the last workout for the current user is retrieved. If the last break is at most a week in the past, then the tasks included in all the breaks performed in the last week are retrieved, and the 15 most recent ones are taken as the user's history. The number is a control parameter and can be adjusted for fine-tuning the algorithm.

The reason I have decided on this value is that in the current setting, in which all the tasks have a duration of 30 seconds, 15 tasks cover 7.5 minutes of breaks, which is more than the longest possible break. Considering that only breaks of 2, 4 and 6 minutes are currently possible, this condition is strong enough to guarantee that a user does not perform the same tasks twice in consecutive breaks (e.g., the worst-case scenario is a break of 6 minutes followed by another break of 6 minutes).

If the last break is more than a week in the past, then the user history consists of the last 5 completed breaks. This parameter is also subject to change for the purpose of fine-tuning the algorithm. Again, the size of the user's history is truncated to at most 15 tasks, this time for optimization purposes. The final set of at most 15 tasks is referred to as the user_history_set.

Next, the remaining tasks, which are not included in the user_history_set, are retrieved. This set is referred to as the candidate_set. A similarity measure between each element of the user_history_set and each element of the candidate_set is computed, and only those tasks for which the similarity meets a predefined threshold are kept. The resulting set of similar tasks is delivered as the second input for the ranking component.
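The filtering step can be sketched as follows. The sketch assumes that a candidate is kept when it is similar enough to at least one task in the history (the text does not say whether one or all history tasks must match), uses the wd1 similarity with the threshold of 0.8 chosen in section 5.3.3, and represents tasks as a hypothetical mapping from task id to feature vector.

```python
from typing import Dict, List, Sequence

HISTORY_LIMIT = 15          # control parameter named in the text
SIMILARITY_THRESHOLD = 0.8  # threshold chosen in section 5.3.3


def user_history_set(recent_task_ids: Sequence[str],
                     limit: int = HISTORY_LIMIT) -> List[str]:
    """Keep the `limit` most recent distinct tasks (input is most recent first)."""
    history: List[str] = []
    for task_id in recent_task_ids:
        if len(history) == limit:
            break
        if task_id not in history:
            history.append(task_id)
    return history


def wd1_similarity(a: Sequence[float], b: Sequence[float],
                   w: Sequence[float]) -> float:
    """Similarity 1 - wd_1 for vectors with components in [0, 1]."""
    distance = sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)) / sum(w)
    return 1.0 - distance


def similar_tasks(history: Sequence[str], tasks: Dict[str, Sequence[float]],
                  weights: Sequence[float],
                  tau: float = SIMILARITY_THRESHOLD) -> List[str]:
    """Return the candidates similar to at least one task in the history."""
    candidate_set = [task_id for task_id in tasks if task_id not in history]
    return [
        candidate for candidate in candidate_set
        if any(wd1_similarity(tasks[h], tasks[candidate], weights) >= tau
               for h in history)
    ]


# Hypothetical two-component task vectors.
toy_tasks = {"a": [1.0, 0.0], "b": [0.9, 0.0], "c": [0.0, 1.0]}
print(similar_tasks(["a"], toy_tasks, weights=[1.0, 1.0]))
```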

Ranking Component

The objective of the ranking component is to merge the two lists of tasks (expert recommendations and user-preferred), select a number of tasks per break according to the duration of the break, and order them in such a way that the number of skipped tasks is minimized.

• [Ranking rule #1] The tasks recommended by the domain expert have priority over user-preferred similar tasks

• [Ranking rule #2] If the duration is 2 minutes, the tasks with the more important devices (Fig. 17) are scheduled at the beginning of the break

• [Ranking rule #3] If the duration is 4 or 6 minutes, no two tasks that target the same muscle group should come in succession

• [Ranking rule #4] If the duration is 4 or 6 minutes, the tasks with low complexity come before the tasks with higher complexity (to allow the user a minimal warm-up period)

An additional four tasks are added to the final list of recommendations, in order to give the end-user some room to tailor her training if the domain-expert recommendation does not fully suit her. The extra tasks are primarily retrieved from the user-preferred set.
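Ranking rules #3 and #4 can pull in different directions, and the text does not say which one wins. The sketch below shows one possible greedy ordering for 4- and 6-minute breaks, under the assumption that avoiding consecutive tasks for the same muscle group takes precedence and complexity decides among the remaining options. The `Task` type and the example tasks are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Task(NamedTuple):
    name: str
    muscle_group: str
    complexity: int


def order_tasks(tasks: Sequence[Task]) -> List[Task]:
    """Greedy ordering: lowest complexity first, avoiding repeated muscle groups."""
    remaining = sorted(tasks, key=lambda task: task.complexity)
    ordered: List[Task] = []
    while remaining:
        previous_group = ordered[-1].muscle_group if ordered else None
        # Pick the least complex task that targets a different muscle group;
        # fall back to the least complex task when no such task is left.
        pick = next(
            (task for task in remaining if task.muscle_group != previous_group),
            remaining[0],
        )
        remaining.remove(pick)
        ordered.append(pick)
    return ordered


break_tasks = [
    Task("a", "back", 1),
    Task("b", "back", 2),
    Task("c", "leg", 3),
    Task("d", "leg", 4),
]
print([task.name for task in order_tasks(break_tasks)])
```

In this example the strict complexity order would place the two back exercises next to each other, so the greedy step moves the first leg exercise between them.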


Feedback Component

The ranked list of tasks is delivered to the end-user so she can start her break. If the user completes an exercise, it is marked as liked. Alternatively, if the user skips an exercise, it is marked as not liked. Therefore, an implicit feedback collection technique is being used, similar to the ones presented in [25] or [33]. A completed break, either fully completed or fully skipped, is saved in the Usage Log component for future reference.


6 System Evaluation

6.1 Experiment design

For the purpose of evaluating the recommendation system, I have designed a survey to measure the performance of Oskar from two perspectives. An objective measurement was aimed at capturing the global success rate of the recommendation system, defined as:

GRS_{success} = \frac{\#\,\text{relevant breaks}}{\#\,\text{total breaks}} \qquad (18)

where a break is considered relevant if fewer than 5 tasks in it were skipped.

A second objective measurement was aimed at capturing the success rate of the recommendation from a user's perspective, and is defined by the equation:

URS_{success} = \frac{\#\,\text{relevant breaks per user}}{\#\,\text{total breaks per user}} \qquad (19)
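Both success rates can be computed from a log of breaks, as in the following sketch. The input format, a list of (user id, number of skipped tasks) pairs, and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

MAX_SKIPS = 5  # a break is relevant if fewer than 5 of its tasks were skipped

Break = Tuple[str, int]  # (user id, number of skipped tasks)


def is_relevant(skipped_tasks: int) -> bool:
    return skipped_tasks < MAX_SKIPS


def grs_success(breaks: Iterable[Break]) -> float:
    """Global success rate, equation (18)."""
    flags = [is_relevant(skipped) for _, skipped in breaks]
    return sum(flags) / len(flags)


def urs_success(breaks: Iterable[Break]) -> Dict[str, float]:
    """Per-user success rate, equation (19)."""
    per_user: Dict[str, List[bool]] = defaultdict(list)
    for user, skipped in breaks:
        per_user[user].append(is_relevant(skipped))
    return {user: sum(flags) / len(flags) for user, flags in per_user.items()}


# Hypothetical log: two users with three breaks each.
log = [("u1", 0), ("u1", 5), ("u1", 2), ("u2", 6), ("u2", 1), ("u2", 4)]
print(grs_success(log))
print(urs_success(log))
```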

A subjective measurement was aimed at capturing the user benefits that can be achieved through a recommendation system that blends domain-expert knowledge with user preferences. For this measurement, several dimensions were considered:

• Motivation - is the system able to motivate the user to exercise regularly

• Usefulness - is the system able to help prevent the occurrence of neck and back pain resulting from spending long hours in front of the computer

• Fitness relevance - can the system help in reaching a training goal

• Freshness - is the system able to recommend fresh items over a period of time such that a user is motivated to keep using the system

• Customization - is the system able to provide tailored recommendations, in accordance with the user's preferences and her skill level.

The questions used in the survey can be found in section 8.

In order to generate accurate recommendations, past user history was required for both the domain-expert-based component and the user-preferences-based one. However, the commercial nature of the system and the external context were important factors that prevented the evaluation of the system in a full-fledged real-life scenario. Specifically, OmaTauko's customers were not willing to participate in a research project and generate enough information to help Oskar provide accurate recommendations. Reasons for declining participation varied from lack of interest to lack of resources (e.g. time, personnel). Therefore, the evaluation scenario had to be slightly adapted in order to allow proper evaluation.

To this end, exploiting a shortcoming of the current setting - namely that before implementing Oskar, recommendations were generated in a purely random fashion - I have generated several user profiles and three weeks of user history behavior attached to those profiles. The generated user behaviors fit 4 categories:

• “power” users - work 5 days per week, an average of 12+ minutes per day

• “average+” users - work 4 days per week, an average of 8 minutes per day

• “average-” users - work 3 days per week, an average of 4-6 minutes per day

• “lazy” users - work 2 days per week, an average of 2-4 minutes per day

The contents of each break were generated randomly using the old random algorithm. The rating behavior of the users was generated in such a way that all the breaks were considered relevant; each break contained 3 skipped tasks in order to better capture the user's preference with respect to a particular task. It should be noted that this heuristic differs from a real-life scenario, where a user would probably consistently indicate that she does not like a task. In order to compensate for this shortcoming, when the user-preferences-based recommendation is computed, a task is considered to be part of the user's history and contributes to the recommendation process if, in the considered period of time, it was liked more times than it was skipped.
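The compensation rule at the end of this paragraph can be sketched as a simple count over feedback events. The event format, a sequence of (task id, liked flag) pairs, and the function name are assumptions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def tasks_in_history(events: Iterable[Tuple[str, bool]]) -> Set[str]:
    """Return the tasks that were liked more times than they were skipped.

    Each event is a (task id, liked) pair, where liked is False for a skip.
    """
    liked: Counter = Counter()
    skipped: Counter = Counter()
    for task_id, was_liked in events:
        if was_liked:
            liked[task_id] += 1
        else:
            skipped[task_id] += 1
    return {task_id for task_id in liked if liked[task_id] > skipped[task_id]}


# Hypothetical feedback: "a" liked twice and skipped once, "b" tied, "c" skipped.
feedback = [("a", True), ("a", False), ("a", True),
            ("b", True), ("b", False), ("c", False)]
print(tasks_in_history(feedback))
```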

With the user profiles generated and in place, an anonymous online survey was designed to collect the desired information. The survey consisted of 5 steps:

• the first three steps were aimed at collecting the respondent's opinion about recommended breaks. The respondent was asked to evaluate three breaks - 2 minutes, 4 minutes and 6 minutes long - for which the equipment (tennis ball, kettle-bell, stretching, body-weight) was randomly assigned. The respondent was shown the targeted muscle groups for each break, as well as the title, description and video of the exercise for each task. The rating of the exercise was captured through an explicit action: pressing an OK button to indicate the relevance of the exercise, or pressing a Skip button to indicate its irrelevance. Moreover, when indicating the relevance/irrelevance of an exercise, the participants were instructed to evaluate the exercise from the following perspectives: compliance with personal fitness levels, complexity of the movement, and whether it is indeed relevant for the targeted muscle groups.

• in the fourth step, the respondent was asked to fill in a questionnaire consisting of five closed questions (1-5 Likert scale) in order to capture her opinion about the five previously mentioned user benefits.

• the fifth step consisted of collecting minimal demographic information from the respondent: age and gender

It is worth noting that when a survey respondent was recommended a new break, the break was generated each time for a previously created user profile that fit one of the categories “power”, “average+”, “average-” or “lazy” user. In this way, a relatively small number of respondents was used to evaluate recommendations for three times as many synthetic user profiles from the database.


6.2 Results and discussion

The survey was sent to 50 participants over the course of a week; no incentive for participating in the survey was provided. 35 participants successfully filled in the survey, resulting in a participation rate of 70%. The gender distribution of the participants was 28.5% women and 71.5% men (age mean: 26.65 years, standard deviation: 3.11 years). All the participants in the survey were involved in a form of work which entailed spending many hours in front of the computer, thus fitting the market segment targeted by OmaTauko.

The first finding of this study is related to the accuracy of the recommendation system. A total of 35 users successfully participated in the survey; for each user 3 breaks were generated, one for each available duration (2, 4 and 6 minutes), resulting in a total of 105 breaks. Using the heuristic previously described - a break is successful if it contains fewer than 5 skipped tasks - the analysis revealed a total of 77 successful breaks, leading to GRSsuccess = 73.3%.

Next, I was interested in finding out whether there is a pattern in the nature of the skipped breaks with respect to the duration of the break.

Figure 23: Distribution of skipped tasks over break duration (x-axis: break duration in minutes; y-axis: number of skipped breaks)

As expected, the 6-minute breaks were the ones most often skipped, while the 2-minute breaks were skipped only once. One reason for this might be the excitement level of the survey respondent.

On the one hand, a 2-minute break contains only 8 exercises, hence the list was rather short; moreover, most of the users were in contact with the concept of OmaTauko and Oskar for the first time, and the novelty of the system might have contributed to the high success rate of the 2-minute breaks, which were provided to the user first. On the other hand, a 4-minute break contains 12 exercises, while a 6-minute break contains 16 exercises. The possibly high number of exercises included in a 4- or 6-minute break, coupled with the repetitive nature of the task of rating exercises, might have contributed to a significant drop in the success rate of 4- and 6-minute breaks, compared to 2-minute breaks.

More importantly, as previously described, a survey participant was not assigned only one user profile from the database; instead, the participant rated recommendations generated for three different user profiles. This detail might have led to situations in which a respondent was asked to rate, in a 6-minute break, an exercise which had previously been completed in a 2-minute break. It is highly likely that in these situations respondents skipped the exercise the second time it was recommended.

Fig. 24 shows, for each task, how often it was recommended and how often it was skipped. As the figure suggests, all the tasks were involved in the recommendation process for the considered user base, demonstrating that the algorithm is able to cover the whole dataset. Each task was recommended at least 6 times, and on average 15.88 times. With respect to the number of skipped tasks, all but 5 items were skipped at least once.

Figure 24: Histogram of recommended/skipped exercises (x-axis: Task ID; y-axis: number of recommendations/skips; series: Skipped, Recommended)

There were 10 tasks that were skipped in more than 50% of the cases in which they were recommended. 6 of these tasks targeted the shoulders, one involved working with the body weight, and the remaining three involved working with a tennis ball.

Fig. 25 illustrates the success of the recommendation system from a user's perspective.

Figure 25: URSsuccess (x-axis: # skipped breaks / # recommended breaks; y-axis: number of users)

As the figure suggests, 48.6% (17) of the survey respondents did not invalidate any break through their skipping behavior, 22.9% (8) of them indicated that one break did not contain enough relevant exercises, while the remaining 28.5% (10) indicated two breaks that did not contain enough relevant exercises. None of the users invalidated all three recommended breaks.

Finally, the results of the subjective evaluation of the recommendation system are presented in Fig. 26. The survey captured the user benefits of Oskar in terms of 5 dimensions.

First, with respect to motivation, 80% of the users provided positive feedback (either Strongly Agree or Agree) regarding the system's capability of increasing end-users' motivation to exercise regularly.

Second, regarding the system's usefulness in preventing the occurrence of neck and back problems, 85.7% (30 users) responded positively, while only one user disagreed with the statement.

Third, in terms of the system's capability to help reach a training goal, 48.5% of the respondents replied positively, 28.5% were neutral and the remaining 23% were skeptical. The comparatively poor results on this dimension might be attributed to the fact that OmaTauko is not primarily aimed at regular training and achieving fitness goals, but rather at preventing musculoskeletal problems. In that sense, the users' responses suggest that this question might not have been relevant in the context of the survey and the described service.

Fourth, in terms of the system's capacity to recommend fresh tasks on a regular basis, 80% of the respondents believed the system is able to provide fresh, non-redundant recommendations, while the remaining 20% were neutral with respect to this dimension. None of the respondents answered negatively to this question.

Fifth, in terms of the system's capability to provide tailored recommendations in accordance with the user's profile, 77% replied positively (34% - Strongly Agree, 43% - Agree), while only 5% replied negatively. The high scores of the last two items support the results obtained through the objective measurements (recommendation success rate from a global and user perspective) and validate the findings of the survey.

Figure 26: Subjective evaluation of the recommendation system (x-axis: Motivation, Usefulness, Fitness, Freshness, Customization; y-axis: number of respondents; series: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree)

In conclusion, the evaluation results are generally positive and show that such a recommendation system would be of good use for the end-users. However, the small sample of respondents to the survey and the pseudo-synthetic nature of part of the data represent an important limitation of this study, which should be addressed in future research. The next logical step is to confirm the results of this evaluation process by deploying the recommendation system in a real-life scenario and to perform relevant quantitative and qualitative analysis over a longer period of time.


7 Conclusion and Future Work

7.1 Conclusion

Information overload refers to the situation in which a user's access to information is limited by the high number of available options, which adds an overhead to the decision process of selecting the relevant information. This work has tackled the problem of information overload in the domain of occupational health and well-being. In a world where individuals spend increasing amounts of time connected and generating data, the problem of information overload is becoming increasingly relevant. Recommendation systems are an information filtering technique that has the potential to solve the problem of information overload; they guide the user through a large universe of information towards items that are likely to be relevant for her.

The second facet of the problem consists of the significant resources (material and human) spent in the domain of healthcare and the high costs of treating a set of diseases that could otherwise be easily prevented at lower cost. The adoption of mHealth - medical and health-related services and products supported by mobile devices - has offered the possibility that a set of diseases (e.g., obesity, circulatory diseases, stress, musculo-skeletal problems, etc.) could be prevented by making use of information consumed through a mobile phone and a data connection.

In such a context, this study has highlighted the importance of users having access to high-quality, health-related information through their mobile devices, and suggested as a solution the integration of domain-expert knowledge into recommendation systems, in order to provide relevant information for end-users.

Two research questions were addressed. First, this study attempted to find out how domain-expert knowledge can be used to enhance user-preference-based recommendations. As an answer, Oskar - a hybrid recommendation system that blends domain-expert knowledge with user preferences - was implemented and presented. Oskar is the recommendation system that powers OmaTauko - a health and well-being product-service system, which enables end-users to keep and track micro-breaks aimed at decreasing their musculo-skeletal problems and increasing their energy levels.

Second, this study tried to elicit the user benefits that can be achieved by augmenting preference-based recommendations with domain-expert information. Evaluation results indicate that 73.3% of the recommendations were accurate; moreover, all the evaluation participants were able to complete at least one break, and 48.6% of them found all three breaks relevant and completed all of them. Furthermore, on average about 74% of the respondents (from 48.5% for fitness relevance to 85.7% for usefulness) provided positive feedback with respect to the system's ability to motivate them, its usefulness, fitness relevance, freshness and demonstrated capabilities of customization.


7.2 Future work

Future development of this work should focus on a number of topics. The first major area that should be addressed is a more thorough evaluation of the system. The commercial nature of the service has made evaluation quite difficult, and the sample used for measuring the performance of the recommendation system was small. This is clearly a limitation of this study, and one step to overcome it would be to arrange a real-life scenario with the deployment of the recommendation system in a production environment. As a suggested method, I recommend collecting users' behavior over a certain period of time using the current state of the system (without a recommendation system in place). Next, without letting the user base know that a major feature has been released, deploy the recommendation system and repeat the same measurements performed in the first part of the experiment. At the end of the second period, the users' behavior before and after the installation of the recommendation system can be compared in order to objectively measure the impact of the recommendation system and the accuracy of the recommendations.

Second, with respect to the system's user experience and user interface (UX & UI), an explanation interface for the provided recommendations should be developed, in order to justify the decision to recommend one item instead of another. In the same area of UX & UI, the user should be aware of the rules used by the recommendation system, and appropriate elements should be included (e.g., let the user know how much more she needs to exercise until she reaches the next level, and what the benefits of the next level are).

Finally, in order to mitigate the end-users' possible reluctance to use the service, the service should include functionality that allows exercising in groups; accordingly, a group-recommendation system would be of high value in this context. Group recommendation systems aggregate the models of individual users to provide a meaningful recommendation for the active group [15]. Enabling end-users to work in groups is likely to reduce the social pressure which some of the end-users might experience; also, this direction of development provides opportunities for including a collaborative-filtering component in the recommendation algorithm, and for allowing the user community to have a stronger voice in the recommendation process.


8 Appendix - Survey questions

1. [MOTIVATION] This system would motivate me to exercise regularly.
A. Strongly Disagree B. Disagree C. Neutral D. Agree E. Strongly Agree

2. [USEFULNESS] This system will help in preventing the occurrence of neck and back problems associated to long hours spent while sitting in front of the computer.
A. Strongly Disagree B. Disagree C. Neutral D. Agree E. Strongly Agree

3. [FITNESS RELEVANCE] This system could help reaching a training goal.
A. Strongly Disagree B. Disagree C. Neutral D. Agree E. Strongly Agree

4. [FRESHNESS] I am satisfied with the variety of exercises provided per break (in terms of not repeating the same exercises over a certain period of time).
A. Strongly Disagree B. Disagree C. Neutral D. Agree E. Strongly Agree

5. [CUSTOMIZATION] I could use a system that would provide more tailored exercise suggestions (with respect to my preferences and skill level).
A. Strongly Disagree B. Disagree C. Neutral D. Agree E. Strongly Agree


References

[1] mhealth - mobile technology poised to enable a new era in health care, Tech. report, Ernst & Young, 2012.

[2] Gediminas Adomavicius and Alexander Tuzhilin, Personalization technologies - a process oriented perspective, Communications of the ACM 48 (2005), no. 10.

[3] Chris Anderson, The long tail: Why the future of business is selling less of more, Hyperion ebook, 2009.

[4] Kristen M.J. Azar, Lenard I. Lesser, Brian Y. Laing, Janna Stephens, Magi S. Aurora, Lora E. Burke, and Latha P. Palaniappan, Mobile applications for weight management, American Journal of Preventive Medicine 5 (2013), no. 45, 583–589.

[5] Toine Bogers and Antal van den Bosch, Collaborative and content-based filtering for item recommendation on social bookmarking websites, Proceedings of the ACM RecSys’09, Workshop on Recommender Systems & the Social Web (2009).

[6] Yukun Cao and Yunfeng Li, An intelligent fuzzy-based recommendation system for consumer electronic products, Elsevier, Expert Systems with Applications 33 (2007).

[7] Paul-Alexandru Chirita, Wolfgang Nejdl, and Cristian Zamfir, Preventing shilling attacks in online recommender systems, Proceedings of the 7th annual ACM international workshop on Web information and data management (New York, NY, USA), ACM, 2005, pp. 67–74.

[8] M.F. Costabile, D. Fogli, C. Letondal, P. Mussio, and A. Piccinno, Domain-expert users and their needs of software development, Proceedings Session on End-User Development held at HCI International 2003 Conference (2003).

[9] Arthur W. DeTore, An introduction to expert systems, Journal of Insurance Medicine 21 (1989), no. 4.

[10] Elena Claudia Dinuca and Mihai Istrate, Wine advisor expert system using decision rules, Annals of the University of Oradea, Economic Science Series 22 (2013), no. 1, 1853–1864.

[11] Deloitte Center for Health Solutions, mhealth in an mworld - how mobile technology is transforming health care, Tech. report, Deloitte Center for Health Solutions, 2012.

[12] Mustansar Ghazanfar and Adam Prugel-Bennett, Fulfilling the needs of gray-sheep users in recommender systems, a clustering solution, 2011 International Conference on Information Systems and Computational Intelligence, January 2011.


[13] http://www.healthisajourney.net/fitness-community-blog/100-60-minutes-of-exercise-a-week-can-change everything, 60 minutes of exercise per week canchange everything.

[14] http://www.sciencedaily.com/releases/2013/05/130522085217.htm, Big data,for better or worse: 90sciencedaily.

[15] Anthony Jameson, More than the sum of its members: Challenges for grouprecommender systems, Proceedings of the Working Conference on AdvancedVisual Interfaces (New York, NY, USA), AVI ’04, ACM, 2004, pp. 48–54.

[16] Matthias Kranz, Andreas Möller, Nils Hammerlac, Stefan Diewaldb, ThomasPlötz, Patrick Olivier, and Luis Roalter, The mobile fitness coach - towardsindividualized skill assessment using personalized mobile devices, Elsevier, Per-vasive & Mobile Computing (2012).

[17] Wei-Po Lee, Applying domain knowledge and social information to product anal-ysis and recommendations - an agent-based decision support system, ExpertSystems 21 (2004), no. 3.

[18] Jure Leskovec, Anand Rajaraman, and Jeff Ullman, Mining of massive datasets,2 ed., 2013.

[19] Pasquale Lops, Marco de Gemmis, and Giovanni Semeraro, Recommender sys-tems handbook, ch. 3, Springer, 2011.

[20] Jie Lu, A personalized e-learning material recommender system, Proceedings ofthe 2nd International Conference on Information Technology for Application(ICITA 2004) (2004).

[21] Christopher Manning, Prabkhar Raghavan, and Hinrich Schütze, Introductionto information retrieval, Cambridge University Press, 2009.

[22] Chunyan Miao, Qiang Yang, Haijing Fang, and Angela Goh, A cognitiveapproach for agent-based personalized recommendation, Elsevier, KnowledgeBased Systems (2007), no. 20.

[23] Stuart E. Middleton, David C. De Roure, and Nigel R. Shadbolt, Capturingknowledge of user preferences - ontologies in recommender systems, Proceedingsof the 1st international conference on Knowledge capture (2001).

[24] Kathleen Mykytyn, Peter P. Mykytyn Jr., and Craig W. Slinkman, Expertsystems - a question of liabiiity?, MIS Quarterly 14 (1990), no. 1, 27–42.

[25] Michael J. Pazzani, A framework for collaborative, content-based and demo-graphic filtering, Journal of Artificial Intelligence Review - Special issue ondata mining on the Internet (1999).

[26] Francesco Ricci, Travel recommendation systems, IEEE Intelligent Systems (Nov-Dec, 2002).

[27] Francesco Ricci and Quang Nhat Nguyen, Critique-based mobile recommender systems, ÖGAI Journal, ÖGAI Press 24 (2005), no. 4.

[28] M Sasikumar, S Ramani, S Muthu Raman, KSR Anjaneyulu, and R Chandrasekar, A practical introduction to rule based expert systems, Narosa Publishing House, New Delhi, 2007.

[29] A. Singh, Knowledge based expert systems in organization of higher learning, Proceedings of the International Conference and Workshop on Emerging Trends in Technology (New York, NY, USA), ICWET ’10, ACM, 2010, pp. 571–574.

[30] Il-Yeol Song and Joseph LaGue, Predicting expert system success: An expert system for expert systems, Proceedings of the 1990 ACM SIGBDP Conference on Trends and Directions in Expert Systems (New York, NY, USA), SIGBDP ’90, ACM, 1990, pp. 88–110.

[31] Xiaoyuan Su and Taghi M. Khoshgoftaar, A survey of collaborative filtering techniques, Advances in Artificial Intelligence (2009).

[32] Chun-Yuen Teng, Yu-Ru Lin, and Lada Adamic, Recipe recommendation using ingredient networks, Proceedings of the 4th International Conference on Web Science (WebSci’12) (2012).

[33] Robin van Meteren and Maarten van Someren, Using content-based filtering for recommendation, Proceedings of the Machine Learning in the New Information Age: MLnet/ECML2000 Workshop (2000).

[34] Lex van Velsen, Thea van der Geest, and Michaël Steehouder, The contribution of technical communicators to the user-centered design process of personalized systems, Technical Communication 57 (May 2010), no. 2.

[35] Manolis Vozalis and Konstantinos G. Margaritis, On the enhancement of collaborative filtering by demographic data, Web Intelligence and Agent Systems 4 (2006), no. 2, 117–138.

[36] Werner Geyer and Casey Dugan, Inspired by the audience – a topic suggestion system for blog writers and readers, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2010).

[37] Joseph Williams, When expert systems are wrong, Proceedings of the 1990 ACM SIGBDP Conference on Trends and Directions in Expert Systems (New York, NY, USA), SIGBDP ’90, ACM, 1990, pp. 661–669.
