The Big Data Revolution: “When Bigger is Better”...

Posted on 27-Jun-2020

The Big Data Revolution: “When Bigger is Better” Real Use Cases for the Online Gaming Business

Roberto Fenaroli & Brunino Criniti, Big Data Engineers

About Ringmaster

Ringmaster was founded on 27 October 2011 as a joint venture between Lottomatica Group (now IGT) and Reply S.p.A., with the aim of creating a software factory for building multi-user gaming platforms for online media and gaming.

Ringmaster employs about 60 software engineers, more than 90% of whom hold a university degree, in a young, dynamic, and informal environment.

Reply Worldwide

Americas
• US (Chicago, Detroit)
• Brazil (Belo Horizonte, Sao Paulo)

Europe
• Germany (Berlin, Bremen, Dusseldorf, Frankfurt, Gutersloh, Hamburg, Munich)
• Italy (Bari, Milano, Padova, Roma, Torino, Trieste, Verona)
• The UK (London, Basingstoke, Chester, Cockpole Green)
• Benelux & France (Amsterdam, Brussels, Luxembourg, Paris)
• Poland & Romania (Katowice, Bucharest)
• Belarus (Minsk)

Asia
• China (Beijing)

Romania & Poland: near shore
Italy: home country
Germany & UK: huge presence

6000 employees (70% in Italy)

Introduction

• «The data volumes are exploding, more data has been created in the past two years than in the entire previous history of the human race.»

• «Within five years there will be over 50 billion smart connected devices in the world, all developed to collect, analyze and share data.»

• «73% of organizations have already invested or plan to invest in Big Data by 2016.»

At the moment, less than 1% of all data is ever analyzed and used.

«Big Data is like teenage sex. Everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.»

Dan Ariely, Duke University

What is Big Data?

Customer’s Goals

Customers want to get the most out of their data
• Analyze raw data in order to transform it into valuable information
• In other words... they want to make money!

Our Client’s Goals

Create smart applications based on Big Data technologies, for any kind of use case, such as:

• Analytic reports and dashboards
• User Profiling
• Games Management

JokeR* - IGT Recommendation Engine

A Recommendation Engine is a perfect example of a “smart” application that reaches all these goals and returns valuable information to the client.

JokeR* - Overview

• Promote appropriate content to players
  • Provide similar content
  • Smart targeting for player engagement
  • Drive up player retention

• Use different techniques/algorithms
  • Collaborative filtering
  • Content-based filtering
  • Matrix Factorization algorithms
  • Configurable Boosting Factor

Advanced analytics and event-driven products for user retention and revenue boosting
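To make the techniques listed above more concrete, here is a minimal sketch of item-based collaborative filtering with a boosting factor, in plain Python. It is not the JokeR* implementation: the data layout, the function names, and the reading of the "Configurable Boosting Factor" as a per-game multiplier on the final score are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(plays):
    """Cosine similarity between games, from {player: {game: play_count}}."""
    # Invert the data to game -> {player: play_count}.
    by_game = defaultdict(dict)
    for player, games in plays.items():
        for game, count in games.items():
            by_game[game][player] = count
    norms = {g: sqrt(sum(c * c for c in v.values())) for g, v in by_game.items()}

    sims = defaultdict(dict)
    games = list(by_game)
    for i, a in enumerate(games):
        for b in games[i + 1:]:
            shared = by_game[a].keys() & by_game[b].keys()
            if not shared:
                continue  # no common players, similarity is zero
            dot = sum(by_game[a][p] * by_game[b][p] for p in shared)
            sims[a][b] = sims[b][a] = dot / (norms[a] * norms[b])
    return sims


def recommend(player, plays, sims, boosts=None, top_n=3):
    """Rank games the player has not played; boosts is a hypothetical
    {game: multiplier} map standing in for a configurable boosting factor."""
    boosts = boosts or {}
    played = plays.get(player, {})
    scores = defaultdict(float)
    for game, count in played.items():
        for other, sim in sims.get(game, {}).items():
            if other not in played:
                scores[other] += sim * count
    for game in scores:
        scores[game] *= boosts.get(game, 1.0)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda g: (-scores[g], g))[:top_n]


# Made-up sample data: play counts per player and game.
plays = {
    "alice": {"slots": 5, "poker": 1},
    "bob": {"slots": 3, "bingo": 4},
    "carol": {"poker": 2, "roulette": 5},
    "dave": {"slots": 4, "bingo": 1, "roulette": 1},
}
sims = item_similarities(plays)
plain = recommend("alice", plays, sims)
boosted = recommend("alice", plays, sims, boosts={"roulette": 2.0})
```

With this sample data, boosting "roulette" is enough to move it ahead of "bingo" for alice, which is the kind of business-driven override a boosting factor allows on top of the similarity scores.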

JokeR* – Solution

• We addressed these challenges by choosing among the best open-source solutions available

• Our architecture relies on:
  • Cloudera CDH
  • Apache Spark
  • MongoDB
  • Apache Mahout
  • Spring Framework

Why Cloudera CDH?

• CDH is the most complete and popular distribution of Apache Hadoop (and related products); we use it to ensure computational power and scalability

• CDH provides:
  • Scalability
  • Availability
  • Efficiency
  • Flexibility
  • Security
  • Usability
  • Integration

Cloudera Manager

JokeR* - High Level Architecture

[Architecture diagram]

Layers and technologies:
• JokeR* Core – Java, Spark, Mahout
• GameR Backoffice – AngularJS
• REST Interface – Spring REST APIs
• Data Gathering Components – Spark, Flume, Kafka
• Big Data Environment – Cloudera CDH, MongoDB

Data flow: Data Ingestion → Data Processing → Data Consumption

Components shown: External Systems, Game Transactions, Games Catalog, Game Attributes Handler, Data Gathering Component, Big Data Environment, NoSQL Database, JokeR Core, Data Digest, JokeR ML Engine, JokeR REST Interface, GameR Backoffice

JokeR* - Lifecycle

A Model is a combination of Dataset, Similarity Metrics and Algorithm/Parameters.

Training and Validation Phase
• Create Model: a dataset, a KPI, and an algorithm are chosen from a set of available ones
• Configure Parameters: a specific configuration is defined based on the parameters offered by the algorithm
• Train the Model: a training data set is needed to train our algorithm
• Test and Validate the Model: the model is tested to evaluate the quality of the produced outcomes
• Model is Ready for Production: as a result, a set of potential Active Models is provided

Production Phase
• Choose an Active Model
• Schedule Recommendations
• Refresh (daily, monthly, ...)

Once a set of Active Models is available, we are ready to provide recommendations to any client system. The model can also be scheduled to be retrained based on new data gathered.
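The lifecycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ModelConfig`, the toy popularity "algorithm", the hit-rate KPI, and the promotion threshold are hypothetical names and choices, not part of JokeR*. The similarity metric is stored in the configuration but not used by the toy algorithm.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    """Create Model / Configure Parameters: dataset + similarity + algorithm."""
    dataset: str
    similarity: str  # recorded only; the toy algorithm below ignores it
    algorithm: str
    params: dict = field(default_factory=dict)


def train_popularity(train_events, top_n):
    """Train the Model: toy algorithm that recommends the top_n most-played games."""
    counts = Counter(game for _, game in train_events)
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:top_n]


def hit_rate(recommended, test_events):
    """Test and Validate: share of held-out plays that were recommended."""
    if not test_events:
        return 0.0
    hits = sum(1 for _, game in test_events if game in recommended)
    return hits / len(test_events)


def run_lifecycle(configs, train_events, test_events, min_hit_rate):
    """Return (config, score) pairs good enough to become Active Models."""
    active = []
    for cfg in configs:
        model = train_popularity(train_events, cfg.params["top_n"])
        score = hit_rate(model, test_events)
        if score >= min_hit_rate:
            active.append((cfg, score))
    return sorted(active, key=lambda pair: -pair[1])


# Made-up (player, game) events, split into training and held-out test data.
train_events = [("p1", "slots"), ("p2", "slots"), ("p3", "poker"),
                ("p1", "bingo"), ("p2", "poker"), ("p4", "slots")]
test_events = [("p5", "slots"), ("p6", "poker"), ("p7", "bingo"), ("p8", "slots")]

configs = [
    ModelConfig("plays", "cosine", "popularity", {"top_n": 1}),
    ModelConfig("plays", "cosine", "popularity", {"top_n": 2}),
]
active_models = run_lifecycle(configs, train_events, test_events, min_hit_rate=0.6)
```

In the production phase, one of the returned models would be chosen as the Active Model, and the same function could be rerun on a schedule with fresh events to retrain it.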

JokeR* – Summary

• Information gathering from different sources
  • Games played and amount spent data
  • Games database for game attributes

• Leverages Big Data technologies
  • Hadoop and Spark, and the Java Machine Learning Library Mahout

• Transform raw data into valuable information
  • Provide “explorable” aggregated data to Business Analysts
  • Generate Games Recommendations for final users
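As a small illustration of turning raw data into "explorable" aggregates, the sketch below rolls up game transactions into plays and amount spent per player and game. It is plain Python rather than a Spark job, and the transaction field names (`player`, `game`, `amount`) are assumptions, not the actual JokeR* schema.

```python
from collections import defaultdict


def aggregate(transactions):
    """Roll up raw transactions into plays and total spend per (player, game)."""
    totals = defaultdict(lambda: {"plays": 0, "spent": 0.0})
    for tx in transactions:
        key = (tx["player"], tx["game"])
        totals[key]["plays"] += 1
        totals[key]["spent"] += tx["amount"]
    return dict(totals)


# Made-up raw game transactions.
transactions = [
    {"player": "alice", "game": "slots", "amount": 2.5},
    {"player": "alice", "game": "slots", "amount": 1.5},
    {"player": "alice", "game": "poker", "amount": 10.0},
    {"player": "bob", "game": "slots", "amount": 4.0},
]
summary = aggregate(transactions)
```

The resulting per-player, per-game table is the kind of input both a dashboard and a recommender could consume.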