Reco4J @ London Meetup (June 26th)

Alessandro Negro, Reco4J Project @ London Meetup, June 2013. Reco4J Project: Intelligent Recommendations for Your Business

description

This presentation shows Reco4J's features and vision. In particular, we introduce the concept of context-aware recommendation and show how we integrate it into Reco4J. This new presentation also includes some code that shows how simple it is to integrate our software. See the project site for more details: http://www.reco4j.org

Transcript of Reco4J @ London Meetup (June 26th)

Page 1: Reco4J @ London Meetup (June 26th)

Alessandro Negro, Reco4J Project @ London Meetup - June 2013

Reco4J Project
Intelligent Recommendations for Your Business

Page 2: Reco4J @ London Meetup (June 26th)

Recommender Systems
• A system that can recommend or present items to the user based on the user's interests and interactions
• One of the best ways to provide a personalized customer experience
• Built by exploiting collective intelligence to perform predictions
• Examples: Amazon, YouTube, Netflix, Yahoo, TripAdvisor, Last.fm, IMDb

Page 3: Reco4J @ London Meetup (June 26th)

The Example: Netflix
• The world's largest online movie rental service, with 33 million members in 40 countries
• 60% of members select movies based on recommendations (September 2008)
• Netflix Prize: US$ 1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06% (September 2009)
• 75% of the content watched on the service comes from its recommendation engine (April 2012)
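The Netflix Prize scored entries by root mean squared error (RMSE) on held-out ratings, so the 10.06% figure is a relative reduction in RMSE. The sketch below shows how such a figure is computed. The ratings and the two prediction lists are made-up toy values, not Netflix data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Toy data (hypothetical): true ratings and two sets of predictions.
actual = [4, 3, 5, 2]
baseline = [3, 3, 4, 3]
improved = [4, 3, 4, 3]

gain = relative_improvement(rmse(baseline, actual), rmse(improved, actual))
print(f"relative RMSE improvement: {gain:.2%}")
```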

Page 4: Reco4J @ London Meetup (June 26th)

Why Recommender Systems
• Standard uses:
  – Increase the number of items sold
  – Sell more diverse items
  – Increase user satisfaction
  – Increase user fidelity
  – Better understand what the user wants
• Advanced uses:
  – Create ad hoc campaigns (per geographic area, per type of user)
  – Optimize product distribution over a wide area for large retail chains

Page 5: Reco4J @ London Meetup (June 26th)

Problem
• There are no available software products for state-of-the-art recommender systems
• There is no "best solution"
• There is no "one solution fits all"
• The Netflix winner composed 104 different algorithms
• A high-end recommender engine can be built only through expensive custom projects
• Large scale user/item datasets require a big data approach
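Composing several algorithms can be as simple as a weighted average of their predictions. The slides do not say how the Netflix winner combined its algorithms, so the linear blend below is only an illustrative assumption, and both toy models and their weights are hypothetical.

```python
def blend(models, user, item):
    """Weighted average of the ratings predicted by several models.

    `models` is a list of (predict_function, weight) pairs.
    """
    total_weight = sum(weight for _, weight in models)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(weight * predict(user, item) for predict, weight in models)
    return weighted / total_weight


# Two hypothetical toy models.
def global_mean_model(user, item):
    """Always predicts the same overall average rating."""
    return 3.5


def item_mean_model(user, item):
    """Predicts a per-item average, falling back to 3.5 for unknown items."""
    return {"matrix": 4.5}.get(item, 3.5)


models = [(global_mean_model, 1.0), (item_mean_model, 3.0)]
print(blend(models, "alice", "matrix"))
```

In practice the weights would be fitted on held-out ratings rather than chosen by hand.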

Page 6: Reco4J @ London Meetup (June 26th)

Solution: Reco4J

A graph-based recommender engine

Page 7: Reco4J @ London Meetup (June 26th)

Reco4J Main Goals
• Implement the state of the art in recommendation on top of a graph model
• Ready-to-use framework
• Extend/improve existing software:
  – Neo4j
  – Elasticsearch
  – R
• Provide software / cloud services / consultancy
• Contribute to the RecSys research field

Page 8: Reco4J @ London Meetup (June 26th)

Reco4J Features
• Core
  – Based on collaborative filtering approach
  – Independent from source knowledge datasets
  – Persistent models (multi-model supported)
  – Updatable models
  – Composable models/algorithms
• Algorithms
  – Commercial and research-oriented algorithms
  – Context-aware recommendations
  – Social recommendations
• Operations
  – Cluster and cloud-ready for Big Data analysis
  – Multitenant

Page 9: Reco4J @ London Meetup (June 26th)

Reco4J Under the Hood
• J is for Java
• Customized algorithm implementation based on graph data model
• Terracotta® Big Memory integration
• Neo4J graph database:
  – Data source repository
  – Persistent model repository
• Apache Hadoop
  – Map/Reduce based model building
• Apache Mahout
  – Graph data model
  – Recommender
  – Alternating Least Squares algorithms (Hadoop version)

Page 10: Reco4J @ London Meetup (June 26th)

Algorithms Roadmap
• Collaborative filtering
  – Memory based (Neighborhood)
    • User/Item based
      – Several distance algorithms (Cosine, Euclidean, Tanimoto, etc.)
    • Graph based
      – Path-based similarity (Shortest Path, Number of Paths)
      – Random walk similarity (ItemRank, average first-passage/commute time)
  – Model based (Latent factor)
    • Stochastic gradient descent
    • Alternating least squares
    • SVD++ (by Koren)
• Social recommendation
  – Trust-based approach
  – Probabilistic approach
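To make the model-based (latent factor) branch concrete, here is a minimal sketch of matrix factorization trained with stochastic gradient descent. It is a generic textbook formulation in plain Python, not Reco4J's implementation. The function names, hyperparameters, and toy ratings are all assumptions chosen for illustration.

```python
import random


def train_sgd(ratings, n_factors=2, epochs=300, lr=0.05, reg=0.02, seed=0):
    """Learn user and item factor vectors from (user, item, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting factors; the global mean serves as a baseline.
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    mu = sum(r for _, _, r in ratings) / len(ratings)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + sum(a * b for a, b in zip(p[u], q[i])))
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                # Gradient step on squared error with L2 regularization.
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return mu, p, q


def predict(model, user, item):
    """Predicted rating: global mean plus the user-item factor dot product."""
    mu, p, q = model
    return mu + sum(a * b for a, b in zip(p[user], q[item]))


# Toy ratings (hypothetical).
ratings = [
    ("alice", "matrix", 5), ("alice", "inception", 4), ("alice", "titanic", 1),
    ("bob", "matrix", 4), ("bob", "titanic", 2),
    ("carol", "matrix", 1), ("carol", "titanic", 5), ("carol", "inception", 2),
]

model = train_sgd(ratings)
# Bob has not rated "inception"; the model estimates it from shared factors.
print(round(predict(model, "bob", "inception"), 2))
```

Alternating least squares optimizes the same objective but solves for the user factors and the item factors in turn instead of taking per-rating gradient steps.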

Page 11: Reco4J @ London Meetup (June 26th)

Algorithms Roadmap (2)
• Cross-cutting features (all algorithms)
  – Context awareness
  – Composability
  – Real time
  – Parallelization

Page 12: Reco4J @ London Meetup (June 26th)

Context-Aware Recommendation

"The ability to reach out and touch customers anywhere means that companies must deliver not just competitive products but also unique, real-time customer experiences shaped by customer context"
C. K. Prahalad

• Incorporate contextual information in the recommendation process
• Modeling contextual information
  – From: User x Item -> Rating
  – To: User x Item x Context -> Rating
• Hierarchical structure
• Three approaches
  – Contextual pre-filtering
  – Contextual post-filtering
  – Contextual modeling
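Of the three approaches, contextual pre-filtering is the simplest to sketch: select only the ratings observed in the target context, then run an ordinary User x Item recommender on what remains. In the sketch below the "recommender" is just a per-item mean, chosen for brevity. The context labels, data, and function names are hypothetical and do not come from Reco4J.

```python
from collections import defaultdict


def prefilter(ratings, context):
    """Reduce User x Item x Context -> Rating to User x Item -> Rating."""
    return [(u, i, r) for u, i, c, r in ratings if c == context]


def item_means(triples):
    """Average rating per item over (user, item, rating) triples."""
    totals = defaultdict(lambda: [0.0, 0])
    for _, item, rating in triples:
        totals[item][0] += rating
        totals[item][1] += 1
    return {item: s / n for item, (s, n) in totals.items()}


def recommend(ratings, context, top_n=1):
    """Rank items by mean rating among ratings given in `context`."""
    means = item_means(prefilter(ratings, context))
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]


# Toy data (hypothetical): (user, item, context, rating).
ratings = [
    ("alice", "comedy", "weekend", 5),
    ("bob", "comedy", "weekend", 4),
    ("alice", "documentary", "weekday", 5),
    ("bob", "documentary", "weekend", 2),
    ("carol", "comedy", "weekday", 2),
]

print(recommend(ratings, "weekend"))
print(recommend(ratings, "weekday"))
```

Post-filtering would instead run the recommender on all ratings and adjust the resulting list by context afterwards, while contextual modeling uses the context inside the model itself.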

Page 13: Reco4J @ London Meetup (June 26th)

Advantage of graph database
• NoSQL database to handle Big Data
• Extensibility
• No aggregate-oriented database
• Minimal information needed
• Natural way of representing connections:
  – User-to-item
  – Item-to-item
  – User-to-user
• Graph-based/social algorithms
• Graph partitioning (sharding)
• Performance
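A small sketch of why the graph view is convenient: once user-to-item connections are edges, a path-based score such as "number of paths" is a short traversal. The code uses plain dictionaries as a stand-in for a graph database and counts user -> item -> user -> item paths. The data and function names are hypothetical.

```python
def build_graph(likes):
    """Build both directions of a bipartite user-item graph."""
    user_items, item_users = {}, {}
    for user, item in likes:
        user_items.setdefault(user, set()).add(item)
        item_users.setdefault(item, set()).add(user)
    return user_items, item_users


def count_paths(user_items, item_users, user):
    """Count length-3 paths from `user` to each item the user has not seen."""
    seen = user_items.get(user, set())
    scores = {}
    for item in seen:
        for neighbor in item_users[item]:
            if neighbor == user:
                continue
            for candidate in user_items[neighbor] - seen:
                scores[candidate] = scores.get(candidate, 0) + 1
    return scores


# Toy edges (hypothetical): (user, item) pairs.
likes = [
    ("alice", "matrix"), ("alice", "inception"),
    ("bob", "matrix"), ("bob", "inception"), ("bob", "memento"),
    ("carol", "matrix"), ("carol", "titanic"),
]

user_items, item_users = build_graph(likes)
print(count_paths(user_items, item_users, "alice"))
```

Items reachable by more paths are connected to the user through more shared neighbors, so they rank higher.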

Page 14: Reco4J @ London Meetup (June 26th)

Example: Find Neighbors
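The figure for this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of neighbor finding with cosine similarity over sparse rating vectors, one of the distance measures named in the algorithms roadmap. It is not the query or code shown on the original slide, and the data and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(ratings, user, k=2):
    """Return the k users most similar to `user`, most similar first."""
    scored = [
        (cosine(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


# Toy data (hypothetical): user -> {item: rating}.
ratings = {
    "alice": {"matrix": 5, "inception": 4},
    "bob": {"matrix": 4, "inception": 5},
    "carol": {"titanic": 5, "matrix": 1},
}

print(nearest_neighbors(ratings, "alice", k=1))
```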

Page 15: Reco4J @ London Meetup (June 26th)

Why Neo4J?
• Java based
• Embeddable/Extensible
• Native graph storage with native graph processing engine
• Open Source, with commercial version
• Property Graph
• ACID support
• Scalability/HA
• Comprehensive query/traversal options

Page 16: Reco4J @ London Meetup (June 26th)

Recommendation Model

Page 17: Reco4J @ London Meetup (June 26th)

Persistence  Model  

Page 18: Reco4J @ London Meetup (June 26th)

Persistence  Model  

Page 19: Reco4J @ London Meetup (June 26th)

Persistence  Model  

Page 20: Reco4J @ London Meetup (June 26th)

A  code  example  

Page 21: Reco4J @ London Meetup (June 26th)

Reco4J + Hadoop
• Queue-based process
• Operates both on cluster and cloud
• Each process downloads data from Neo4J/Reco4J before or during computation
• Stores data into the Reco4J model
• Scales by increasing the number of:
  – Neo4J nodes (only one master)
  – Hadoop nodes
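The map/reduce style of model building can be imitated in a single process to show the shape of the computation. The sketch below builds an item co-occurrence table, a simple kind of item-to-item model. It does not use the Hadoop API and is not Reco4J's actual job. The function names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_user(items):
    """Map step: emit ((item_a, item_b), 1) for each item pair of one user."""
    for a, b in combinations(sorted(items), 2):
        yield (a, b), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one item pair."""
    return key, sum(values)


def build_cooccurrence(baskets):
    """Run map, shuffle, and reduce over user -> items data."""
    emitted = (pair for items in baskets.values() for pair in map_user(items))
    return dict(reduce_counts(k, v) for k, v in shuffle(emitted).items())


# Toy data (hypothetical): user -> items the user interacted with.
baskets = {
    "alice": ["matrix", "inception"],
    "bob": ["matrix", "inception", "memento"],
    "carol": ["matrix", "titanic"],
}

model = build_cooccurrence(baskets)
print(model[("inception", "matrix")])
```

Because each user is mapped independently and each key is reduced independently, both steps can be spread across more worker nodes as the data grows.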

Page 22: Reco4J @ London Meetup (June 26th)

Reco4J in the Cloud
• Recommendation as a service (RaaS)
• Reco4J cloud infrastructure offers:
  – Pay as you need
  – Pay as you grow
  – Support for burst
  – Periodical analysis at lower costs
  – Test/evaluate several algorithms on a reduced dataset
  – Compose algorithms dynamically

Page 23: Reco4J @ London Meetup (June 26th)

Consultancy Goals
• Analysis
• Data Source
• Exploration
• Process Definition
• Import Data
• Test/Evaluation
• Deploy

Page 24: Reco4J @ London Meetup (June 26th)

Thank you

Alessandro Negro
LinkedIn: http://it.linkedin.com/in/alessandronegro/
Email: [email protected]
Reco4J Site: http://www.reco4j.org
Twitter: @reco4j
GitHub: https://github.com/reco4j