Adience - Turning low level behavioural signals into user profiles

Pablo Rosenman, VP Development

Adience

Leading the user-centric mobile revolution

- Harness Deep Learning to profile mobile app users

- Distill user/app interaction into actionable segmentation data

Adience Insights

Adience SDK

- Runs on tens of millions of devices

- Runs in the background, without interfering with the device’s operations

- Collects raw data from the system and environment (according to available permissions)

- Reduces dimensionality and anonymizes the data

- Sends results to the SDK Server

SDK Server

- Receives tens of millions of data submissions per day from the Mobile SDK installations

- It should be able to scale by two orders of magnitude

- It should handle requests quickly, so as not to hang the client (i.e. the Mobile SDK)

- It should avoid losing data
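
As a rough illustration of what these requirements imply, the arithmetic below converts a daily submission count into an average request rate. The figure of 50 million submissions per day is an assumed value standing in for "tens of millions"; it is not from the talk, and real traffic would peak above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(submissions_per_day: int) -> float:
    """Average request rate, ignoring daily peaks and troughs."""
    return submissions_per_day / SECONDS_PER_DAY


# Assumed figure standing in for "tens of millions" (not from the talk).
current_daily = 50_000_000
# "Two orders of magnitude" means a factor of 100.
target_daily = current_daily * 100

current_rps = average_requests_per_second(current_daily)
target_rps = average_requests_per_second(target_daily)
print(f"current: {current_rps:,.0f} req/s, target: {target_rps:,.0f} req/s")
# prints "current: 579 req/s, target: 57,870 req/s"
```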

SDK Server (Architecture)

- Data is sent from the mobile SDK to an Apache server running on EC2

- SDK Server verifies validity of incoming data

- Incoming data gets written immediately (no processing) to S3

[Diagram: Mobile Client -> Amazon EC2 -> Amazon S3]
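
The "validate, then write immediately" flow could look roughly like the sketch below. It is not the Adience server code: the required fields, the key layout and the `put_object` callable are all hypothetical, and a plain dict stands in for the S3 bucket so the example runs without AWS credentials.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Callable

REQUIRED_FIELDS = {"device_id", "payload"}  # hypothetical report schema


def is_valid(raw_body: bytes) -> bool:
    """Cheap validity check: a JSON object that has the required fields."""
    try:
        report = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return False
    return isinstance(report, dict) and REQUIRED_FIELDS.issubset(report)


def handle_submission(raw_body: bytes,
                      put_object: Callable[[str, bytes], None]) -> int:
    """Validate, then store the body untouched. Returns an HTTP status code."""
    if not is_valid(raw_body):
        return 400
    now = datetime.now(timezone.utc)
    # Hypothetical key layout: one object per submission, grouped by date.
    key = f"reports/{now:%Y/%m/%d}/{uuid.uuid4().hex}.json"
    put_object(key, raw_body)  # no processing on the request path
    return 200


# Usage with an in-memory dict standing in for the S3 bucket.
fake_bucket = {}
good_body = b'{"device_id": "0123", "payload": [1, 2]}'
status_ok = handle_submission(good_body, fake_bucket.__setitem__)
status_bad = handle_submission(b"not json", fake_bucket.__setitem__)
```

Keeping the handler to a validity check plus one write is what lets it answer quickly; all heavier work is deferred to later stages.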

SDK Server (Scaling)

- The ELB balances the load across all the servers

- Auto Scaling will make sure there are enough servers to handle the load

[Diagram: Mobile Client -> Elastic Load Balancer -> Amazon EC2 (Auto Scaling) -> Amazon S3]

Insights Workflow

- Create insights on the device’s owner when new data arrives from the device

- Doesn’t have to be real-time (as the data arrives), but shouldn’t be far behind

Insights Workflow (cont.)

- Data report sent by SDK consists of:

  - Simple data points requiring simple statistical and arithmetic operations, for example:

    - Device model

    - OS version

  - More complex data matrices requiring matrix operations, for example:

    - Machine Learning features on time series data

    - Machine Learning features on photos

Insights Workflow (Architecture)

- Simple pattern for streamlined processing server application:

  - Read input S3 filename from input SQS

  - Read the file from the input S3 bucket, and process it

  - Write results to file in output S3 bucket

  - Send output S3 filename to output SQS

[Diagram: input Amazon SQS and input S3 Bucket -> EC2 Servers (Auto Scaling) -> output S3 Bucket and output Amazon SQS]
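
The four steps of this pattern can be sketched as a small worker loop. This is an illustration only: a `deque` stands in for each SQS queue, a dict for each S3 bucket, and `process` is a hypothetical placeholder for the real processing step.

```python
from collections import deque


def process(data: bytes) -> bytes:
    """Placeholder for the real processing step (hypothetical)."""
    return data.upper()


def run_worker(input_queue, input_bucket, output_bucket, output_queue):
    """Drain the input queue, applying the four steps to each message."""
    while input_queue:
        key = input_queue.popleft()          # 1. read input filename from the input queue
        result = process(input_bucket[key])  # 2. read the file and process it
        output_bucket[key] = result          # 3. write results to the output bucket
        output_queue.append(key)             # 4. send output filename to the output queue


input_queue = deque(["report-1", "report-2"])
input_bucket = {"report-1": b"abc", "report-2": b"def"}
output_bucket, output_queue = {}, deque()
run_worker(input_queue, input_bucket, output_bucket, output_queue)
```

Because a worker only talks to queues and buckets, the output queue of one stage can serve as the input queue of the next, and each stage can be scaled on its own.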

Insights Workflow (Architecture)

- Aggregate the data from all reports to a single device object

- Create insights from all the device’s aggregated data

- Advantages of this architecture:

  - Scalability

  - Decoupling

[Diagram components: Reports S3 Bucket, Devices Servers, Devices S3 Bucket, Deep Learning Servers (GPU), Insights Servers, Insights DynamoDB Table, and three Amazon SQS queues]
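
The step of aggregating all reports into a single device object might look like the sketch below. The report fields (`device_id`, `timestamp`, `data`) and the "newest value wins" merge rule are assumptions made for illustration; the talk does not describe the actual device object.

```python
from collections import defaultdict


def aggregate_reports(reports):
    """Fold per-report dicts into one object per device."""
    devices = defaultdict(lambda: {"report_count": 0, "latest": {}})
    # Apply reports oldest first so that newer values overwrite older ones.
    for report in sorted(reports, key=lambda r: r["timestamp"]):
        device = devices[report["device_id"]]
        device["report_count"] += 1
        device["latest"].update(report["data"])
    return dict(devices)


# Hypothetical reports; they arrive out of order on purpose.
reports = [
    {"device_id": "0123", "timestamp": 2, "data": {"os_version": "6.0"}},
    {"device_id": "0123", "timestamp": 1,
     "data": {"os_version": "5.1", "model": "X"}},
]
devices = aggregate_reports(reports)
```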

Adience Events

Events SDK

- Receives events based on user interaction with the app

- Some events are implemented automatically (e.g. app was started)

- Custom events are the real driving force (e.g. user has made an in-app purchase for $3.99)

- Events should be sent to the Events Server

Events Server

- Receives hundreds of millions of data submissions per day from the Mobile SDK installations

- It should be able to scale by two orders of magnitude

- It should handle requests quickly, so as not to hang the client

- Analytics engine should work on all data from the last 30 days

- Data should be enriched with the user insights

Events Server (Architecture)

- All incoming events are written to a file on the local volume

- Once every hour, we close the file in each instance and ship it to S3

[Diagram: Mobile Client -> Elastic Load Balancer -> Amazon EC2 (Auto Scaling), with Amazon EBS and logrotate -> Amazon S3]
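
The hourly close-and-ship idea can be sketched as follows. The diagram names logrotate as the tool in use; this Python sketch is not logrotate and only illustrates the logic, with a hypothetical file naming scheme and instance id.

```python
from datetime import datetime, timezone


def hourly_file_name(instance_id: str, ts: datetime) -> str:
    """File that an event with timestamp ts is appended to (hypothetical naming)."""
    return f"events-{instance_id}-{ts:%Y%m%d-%H}.log"


def files_ready_to_ship(file_names, instance_id: str, now: datetime):
    """Every file except the one for the current hour is closed and can be shipped."""
    current = hourly_file_name(instance_id, now)
    return sorted(name for name in file_names if name != current)


now = datetime(2016, 1, 4, 10, 15, tzinfo=timezone.utc)
local_files = ["events-i-abc-20160104-09.log", "events-i-abc-20160104-10.log"]
ready = files_ready_to_ship(local_files, "i-abc", now)
# ready == ["events-i-abc-20160104-09.log"]
```

Writing locally keeps the request path fast; the trade-off is that up to an hour of events lives only on the instance volume until the file is shipped.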

Insights MapReduce

- At the end of each day, all events from that day are in the events S3 bucket

- We add to these a “mock event” per report sent to the SDK Server

- Ultimately, we wish to compare all the app’s users in the last 30 days to a subset of those users

Insights MapReduce (cont.)

- Using Amazon EMR, we aggregate the data per app, device, day, and event type

  - Example: device 0123, on 2016-01-04, in app Blappy Fird, purchased in-app goods worth a total of $100

[Diagram: Events S3 Bucket and Mock Events S3 Bucket -> Raw2Daily (Amazon EMR) -> Daily S3 Bucket]
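
A minimal in-memory sketch of the Raw2Daily aggregation is shown below. It is plain Python rather than an EMR job, and the event fields are assumed for illustration: the grouping key plays the role of the map output key and the sum plays the role of the reduce step.

```python
from collections import defaultdict


def raw_to_daily(events):
    """Sum event values per (app, device, day, event type)."""
    totals = defaultdict(float)
    for e in events:
        key = (e["app"], e["device"], e["day"], e["type"])
        totals[key] += e.get("value", 0.0)  # events without a value add 0
    return dict(totals)


# Hypothetical raw events matching the example on the slide.
events = [
    {"app": "Blappy Fird", "device": "0123", "day": "2016-01-04",
     "type": "purchase", "value": 60.0},
    {"app": "Blappy Fird", "device": "0123", "day": "2016-01-04",
     "type": "purchase", "value": 40.0},
    {"app": "Blappy Fird", "device": "0123", "day": "2016-01-04",
     "type": "app_start"},
]
daily = raw_to_daily(events)
# daily[("Blappy Fird", "0123", "2016-01-04", "purchase")] == 100.0
```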

Insights MapReduce (cont.)

- Using the Daily data for the last 30 days, we run an additional EMR job to aggregate per app, device, and event type

  - Example: device 0123, in app Blappy Fird, purchased in-app goods worth a total of $1000 (in the last 30 days)

- We enrich the data by adding the device’s insights to each record

[Diagram: Daily S3 Bucket and Insights DynamoDB Table -> Daily2Aggregate (Amazon EMR) -> Aggregate S3 Bucket]
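
The Daily2Aggregate step, including the enrichment, could be sketched as below. This is again plain Python with assumed record shapes; it assumes the input already contains only the last 30 days, and a dict stands in for the Insights DynamoDB table.

```python
from collections import defaultdict


def daily_to_aggregate(daily, insights):
    """Sum daily totals per (app, device, event type) and attach device insights."""
    totals = defaultdict(float)
    for (app, device, _day, event_type), value in daily.items():
        totals[(app, device, event_type)] += value
    return [
        {"app": app, "device": device, "type": event_type, "total": total,
         "insights": insights.get(device, {})}
        for (app, device, event_type), total in totals.items()
    ]


# Hypothetical daily records and insights, matching the example on the slide.
daily = {
    ("Blappy Fird", "0123", "2016-01-03", "purchase"): 400.0,
    ("Blappy Fird", "0123", "2016-01-04", "purchase"): 600.0,
}
insights = {"0123": {"country": "US", "gender": "male", "age": "25-34"}}
aggregate = daily_to_aggregate(daily, insights)
```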

Insights MapReduce (cont.)

- Accessing DynamoDB per event type is costly

- We know the previous day’s users, so we save them to an in-memory cache

[Diagram: Daily S3 Bucket -> Daily2Aggregate (Amazon EMR) -> Aggregate S3 Bucket; other components: Insights Servers, Insights DynamoDB Table, Insights ElastiCache]
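
One way to picture the caching idea is a lookup that checks a pre-warmed cache first and falls back to the table only on a miss. The fallback behaviour is an assumption (the slide only says the previous day's users are saved to the cache), and dicts stand in for both ElastiCache and DynamoDB.

```python
class InsightsLookup:
    """Check the cache first and fall back to the table only on a miss."""

    def __init__(self, cache: dict, table: dict):
        self.cache = cache
        self.table = table
        self.table_reads = 0  # counts the costly lookups

    def get(self, device_id: str) -> dict:
        if device_id in self.cache:
            return self.cache[device_id]
        self.table_reads += 1
        insights = self.table.get(device_id, {})
        self.cache[device_id] = insights  # remember for the next record
        return insights


table = {"0123": {"country": "US"}, "4567": {"country": "IL"}}
cache = {"0123": table["0123"]}  # pre-warmed with the previous day's users
lookup = InsightsLookup(cache, table)
results = [lookup.get("0123"), lookup.get("0123"),
           lookup.get("4567"), lookup.get("4567")]
# Four lookups, but only one of them reached the table.
```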

Insights MapReduce (cont.)

- Using the Aggregate data for the last 30 days, we run an additional EMR job to aggregate per app, country, age, gender, and subset type

  - Example: app Blappy Fird, in the US, for males aged 25-34 who purchased in-app goods worth a total of more than $500 (in the last 30 days), 70% are tech savvy, 40% are commuters, etc.

[Diagram: Aggregate S3 Bucket -> Aggregate2Subset (Amazon EMR) -> Subset S3 Bucket]
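
The subset computation in the example above can be sketched as a filter followed by a percentage per insight. The record fields and the `tags` list are hypothetical, and the sample data is made up, so the percentages differ from those on the slide.

```python
from collections import Counter


def subset_profile(records, app, country, age, gender, min_total):
    """Share of matching devices that carry each insight tag, in percent."""
    matching = [
        r for r in records
        if r["app"] == app and r["country"] == country and r["age"] == age
        and r["gender"] == gender and r["total"] > min_total
    ]
    tag_counts = Counter(tag for r in matching for tag in r["tags"])
    # tag_counts is empty when nothing matches, so no division by zero occurs.
    return {tag: 100.0 * n / len(matching) for tag, n in tag_counts.items()}


# Hypothetical aggregate records, already enriched with insights.
records = [
    {"app": "Blappy Fird", "country": "US", "age": "25-34", "gender": "male",
     "total": 900.0, "tags": ["tech savvy", "commuter"]},
    {"app": "Blappy Fird", "country": "US", "age": "25-34", "gender": "male",
     "total": 600.0, "tags": ["tech savvy"]},
    {"app": "Blappy Fird", "country": "US", "age": "25-34", "gender": "male",
     "total": 100.0, "tags": ["commuter"]},
]
profile = subset_profile(records, "Blappy Fird", "US", "25-34", "male", 500.0)
# profile == {"tech savvy": 100.0, "commuter": 50.0}
```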

Insights MapReduce (cont.)

- How can we show data on apps that haven’t integrated our SDK?

- Create a mock event per app that we know is installed on the device!

[Diagram: Events S3 Bucket and Mock Events S3 Bucket -> Raw2Daily (Amazon EMR) -> Daily S3 Bucket -> Daily2Aggregate (Amazon EMR, with Insights DynamoDB Table) -> Aggregate S3 Bucket -> Aggregate2Subset (Amazon EMR) -> Subset S3 Bucket]
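
Generating mock events from a device report might look like this. The report fields and the `"mock"` event type are assumptions; the point is only that each installed app yields one synthetic event that can flow through the same pipeline as real events.

```python
def mock_events_for_report(report):
    """One synthetic event per app the report says is installed on the device."""
    return [
        {"app": app, "device": report["device_id"],
         "day": report["day"], "type": "mock"}
        for app in report["installed_apps"]
    ]


# Hypothetical report from the SDK Server.
report = {"device_id": "0123", "day": "2016-01-04",
          "installed_apps": ["Blappy Fird", "Another App"]}
mock_events = mock_events_for_report(report)
```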

Next Generation

SDK Server (Next Generation)

[Diagram, current: Mobile Client -> Elastic Load Balancer -> Amazon EC2 (Auto Scaling) -> Amazon S3]

[Diagram, next generation: Mobile Client -> Amazon API Gateway -> AWS Lambda -> Amazon S3]

Insights Workflow (Next Generation)

[Diagram, current components: Reports S3 Bucket, Devices Servers, Devices S3 Bucket, Deep Learning Servers (GPU), Insights Servers, Insights DynamoDB Table, and three Amazon SQS queues]

[Diagram, next-generation components: Reports S3 Bucket, Devices Lambda, Devices S3 Bucket, Deep Learning Servers (GPU), Amazon SQS, Staging S3 Bucket, Insights Lambda, Insights DynamoDB Table]

Events Server (Next Generation)

[Diagram, current: Mobile Client -> Elastic Load Balancer -> Amazon EC2 (Auto Scaling), with Amazon EBS and logrotate -> Amazon S3]

[Diagram, next generation: Mobile Client -> Amazon API Gateway -> AWS Lambda -> Amazon Kinesis Firehose -> Amazon S3]

Bonus: ELK with Amazon

ELK with Amazon

- Server code sends logs to local ZMQ process

- ZMQ process then asynchronously sends to Kinesis

- Logstash pulls the Kinesis stream, and writes in batches to ElasticSearch

[Diagram: Server Code -> ZMQ -> Amazon Kinesis -> Logstash -> Amazon ElasticSearch]
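
The "writes in batches" step is the part of this pipeline that is easiest to show in isolation. The generator below is not Logstash's implementation; it is a generic sketch of grouping a stream of records into fixed-size batches, with the batch size chosen arbitrarily.

```python
def batches(records, batch_size):
    """Yield lists of at most batch_size records, preserving order."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch


log_lines = [f"log line {i}" for i in range(5)]
written = list(batches(log_lines, batch_size=2))
# written holds three batches: two full ones and a final batch of one line.
```

Batching trades a little latency for far fewer write requests to the index.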

We’re Hiring!

- Server Developer

- Full Stack Web Developer

- Algorithm Developer

- DevOps Engineer

THANK YOU

[email protected]