Reco4J @ London Meetup (June 26th)
Alessandro Negro Reco4J Project @ London Meetup - June 2013
Reco4J Project: Intelligent Recommendations for Your Business
Recommender Systems
• A system that can recommend or present items to the user based on the user’s interests and interactions
• One of the best ways to provide a personalized customer experience
• Built by exploiting collective intelligence to perform predictions
• Examples: Amazon, YouTube, Netflix, Yahoo, TripAdvisor, Last.fm, IMDb
The Example: Netflix
• The world's largest online movie rental service, with 33 million members in 40 countries
• 60% of members select movies based on recommendations (September 2008)
• Netflix Prize: US$ 1,000,000 was awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06% (September 2009)
• 75% of the content watched on the service comes from its recommendation engine (April 2012)
Why Recommender Systems
• Standard uses:
  – Increase the number of items sold
  – Sell more diverse items
  – Increase user satisfaction
  – Increase user fidelity
  – Better understand what the user wants
• Advanced uses:
  – Create ad hoc campaigns (per geographic area, per type of user)
  – Optimize product distribution over a wide area for large retail chains
Problem
• There are no available software products for state-of-the-art recommender systems
• There is no "best solution"
• There is no "one solution fits all"
• The Netflix winner composed 104 different algorithms
• A high-end recommender engine can be built only through expensive custom projects
• Large scale user/item datasets require a big data approach
Solution: Reco4J
A graph-based recommender engine
Reco4J Main Goals
• Implement state-of-the-art recommendation techniques on top of a graph model
• Ready-to-use framework
• Extend/improve existing software:
  – Neo4j
  – Elasticsearch
  – R
• Provide software / cloud services / consultancy
• Contribute to the RecSys research field
Reco4J Features
• Core
  – Based on collaborative filtering approach
  – Independent from source knowledge datasets
  – Persistent models (multi-model supported)
  – Updatable models
  – Composable models/algorithms
• Algorithms
  – Commercial and research-oriented algorithms
  – Context-aware recommendations
  – Social recommendations
• Operations
  – Cluster and cloud-ready for Big Data Analysis
  – Multitenant
Reco4J Under the Hood
• J is for Java
• Customized algorithm implementation based on graph data model
• Terracotta® BigMemory integration
• Neo4j graph database:
  – Data source repository
  – Persistent model repository
• Apache Hadoop
  – Map / Reduce based model building
• Apache Mahout
  – Graph data model
  – Recommender
  – Alternating Least Squares algorithms (Hadoop version)
Algorithms Roadmap
• Collaborative filtering
  – Memory based (Neighborhood)
    • User/Item based
      – Several distance algorithms (Cosine, Euclidean, Tanimoto, etc.)
    • Graph based
      – Path Based Similarity (Shortest Path, Number of Paths)
      – Random Walk Similarity (Item Rank, Average first-passage/commute time)
  – Model based (Latent factor)
    • Stochastic gradient descent
    • Alternating least squares
    • SVD++ (by Koren)
• Social recommendation
  – Trust based approach
  – Probabilistic approach
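Two of the distance measures named under memory-based collaborative filtering, Cosine and Tanimoto, can be illustrated with a minimal Java sketch. This is not Reco4J code: the class name `SimilaritySketch`, the method names, and the map-based rating representation are assumptions made purely for illustration (Java 9+ is assumed for `Map.of`).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of Reco4J.
public class SimilaritySketch {

    // Cosine similarity between two sparse rating vectors (item id -> rating).
    // Items missing from a map are treated as zero.
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // no ratings on one side: define similarity as 0
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Tanimoto (Jaccard) coefficient over the sets of rated items:
    // |A intersect B| / |A union B|. Rating values are ignored.
    public static double tanimoto(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int union = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / union;
    }

    public static void main(String[] args) {
        Map<String, Double> alice = Map.of("m1", 5.0, "m2", 3.0);
        Map<String, Double> bob = Map.of("m1", 4.0, "m3", 2.0);
        System.out.println("cosine   = " + cosine(alice, bob));
        System.out.println("tanimoto = " + tanimoto(alice.keySet(), bob.keySet()));
    }
}
```

The same two functions apply to user-based or item-based neighborhoods; only the meaning of the map keys changes (items rated by a user, or users who rated an item).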
Algorithms Roadmap (2)
• Cross-cutting features (all algos)
  – Context awareness
  – Composability
  – Real time
  – Parallelization
Context-Aware Recommendation
“The ability to reach out and touch customers anywhere means that companies must deliver not just competitive products but also unique, real-time customer experiences shaped by customer context”
C. K. Prahalad
• Incorporate contextual information in the recommendation process
• Modeling contextual information
  – From: User x Item -> Rating
  – To: User x Item x Context -> Rating
• Hierarchical structure
• Three approaches
  – Contextual pre-filtering
  – Contextual post-filtering
  – Contextual modeling
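Of the three approaches, contextual pre-filtering is the simplest to sketch: ratings are first restricted to the target context, which reduces User x Item x Context -> Rating back to User x Item -> Rating, and any ordinary recommender can then run on what remains. The Java below is a minimal illustration under stated assumptions, not Reco4J code: the `Rating` class, the string-valued context, and the per-item mean used as a stand-in recommender are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of Reco4J.
public class ContextPreFilterSketch {

    // One observation of User x Item x Context -> Rating.
    public static final class Rating {
        public final String user;
        public final String item;
        public final String context;
        public final double value;

        public Rating(String user, String item, String context, double value) {
            this.user = user;
            this.item = item;
            this.context = context;
            this.value = value;
        }
    }

    // Pre-filtering step: keep only the ratings observed in the target context.
    public static List<Rating> preFilter(List<Rating> ratings, String targetContext) {
        List<Rating> kept = new ArrayList<>();
        for (Rating r : ratings) {
            if (r.context.equals(targetContext)) {
                kept.add(r);
            }
        }
        return kept;
    }

    // Stand-in for a context-free recommender: mean rating per item.
    public static Map<String, Double> meanRatingPerItem(List<Rating> ratings) {
        Map<String, double[]> sumAndCount = new HashMap<>();
        for (Rating r : ratings) {
            double[] acc = sumAndCount.computeIfAbsent(r.item, k -> new double[2]);
            acc[0] += r.value;
            acc[1] += 1.0;
        }
        Map<String, Double> means = new HashMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    public static void main(String[] args) {
        List<Rating> ratings = List.of(
                new Rating("u1", "film", "weekend", 5.0),
                new Rating("u2", "film", "weekday", 1.0),
                new Rating("u3", "film", "weekend", 3.0));
        System.out.println(meanRatingPerItem(preFilter(ratings, "weekend")));
    }
}
```

Post-filtering would instead run the recommender on all ratings and adjust or filter its output by context, while contextual modeling would make the context an input of the model itself.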
Advantages of a graph database
• NoSQL database able to handle Big Data
• Extensibility
• Not an aggregate-oriented database
• Minimal information needed
• Natural way of representing connections:
  – User-to-item
  – Item-to-item
  – User-to-user
• Graph-based/social algorithms
• Graph partitioning (sharding)
• Performance
Example: Find Neighbors
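The figure for this slide did not survive in the transcript. As a rough idea of what finding neighbors on a user-to-item graph can look like, the Java sketch below walks two-hop paths (user -> item <- other user) and counts how many such paths reach each other user. It is an assumption about the general technique, not the example shown in the talk and not Reco4J or Neo4j code; the class name `FindNeighborsSketch` and the in-memory adjacency map are hypothetical (Java 9+ is assumed for `Map.of` and `Set.of`).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of Reco4J or Neo4j.
public class FindNeighborsSketch {

    // userItems: adjacency of the bipartite graph (user -> items the user rated).
    // Returns, for every other user sharing at least one item with `user`,
    // the number of two-hop paths user -> item <- other (the shared item count).
    public static Map<String, Integer> neighbors(Map<String, Set<String>> userItems, String user) {
        // Build the reverse adjacency: item -> users who rated it.
        Map<String, Set<String>> itemUsers = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : userItems.entrySet()) {
            for (String item : e.getValue()) {
                itemUsers.computeIfAbsent(item, k -> new HashSet<>()).add(e.getKey());
            }
        }
        // Walk user -> item -> other user and count the paths per other user.
        Map<String, Integer> pathCounts = new HashMap<>();
        for (String item : userItems.getOrDefault(user, Set.of())) {
            for (String other : itemUsers.get(item)) {
                if (!other.equals(user)) {
                    pathCounts.merge(other, 1, Integer::sum);
                }
            }
        }
        return pathCounts;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = Map.of(
                "alice", Set.of("m1", "m2"),
                "bob", Set.of("m1", "m2", "m3"),
                "carol", Set.of("m3"));
        System.out.println(neighbors(graph, "alice"));
    }
}
```

In a graph database the reverse adjacency comes for free, since relationships can be traversed in both directions; the explicit `itemUsers` index here only emulates that.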
Why Neo4j?
• Java based
• Embeddable/Extensible
• Native graph storage with native graph processing engine
• Open Source, with commercial version
• Property Graph
• ACID support
• Scalability/HA
• Comprehensive query/traversal options
Recommendation Model
Persistence Model
A code example
Reco4J + Hadoop
• Queue-based process
• Operates both on cluster and cloud
• Each process downloads data from Neo4j/Reco4J before or during computation
• Stores data into the Reco4J model
• Scales by increasing the number of:
  – Neo4j nodes (only one master)
  – Hadoop nodes
Reco4J in the Cloud
• Recommendation as a service (RaaS)
• Reco4J cloud infrastructure offers:
  – Pay as you need
  – Pay as you grow
  – Support for burst
  – Periodical analysis at lower costs
  – Test/evaluate several algorithms on a reduced dataset
  – Compose algorithms dynamically
Consultancy Goals
Analysis
Data Source
Exploration
Process Definition
Import Data
Test/Evaluation
Deploy
Thank you
Alessandro Negro
LinkedIn: http://it.linkedin.com/in/alessandronegro/
Email: [email protected]
Reco4J Site: http://www.reco4j.org
Twitter: @reco4j
GitHub: https://github.com/reco4j