Real time data analytics in a Serverless · 2019-05-17 · User Needs • Real time Analytics and...

Post on 25-Apr-2020

Real time data analytics in a Serverless environment Manoj Aggarwal Principal Engineer

Agenda

• User Needs
• Potential Solutions
• Serverless Architecture
• Lessons Learnt

User Needs

• Real-time Analytics and Visualization platform
  – Real-time visualization
  – Create and serve new dashboards on demand

• Business Constraints
  – Low-cost solution
  – Start with 1000 users, scale on the go

• Technology Constraints
  – Zero impact on the current application

Potential Solutions

| ETL Features | AWS - 1 | Azure | AWS - 2 | Self-Hosted |
|---|---|---|---|---|
| Extraction | AWS DMS; Custom Jobs | MongoDB Stitch | MongoDB Stitch | MongoDB Stitch; Custom Jobs |
| Transformation | AWS S3; AWS Glue | Azure Service Bus; Azure Functions | AWS SQS; AWS Lambda | None [1] |
| Storage | AWS S3 | Azure SQL Database | AWS Aurora [2] | Local MongoDB; EBS; EC2 |
| Visualization | AWS Athena; Tableau server; Tableau embedded analytics [6] | Power BI [3]; Power BI embedded [4] | Tableau server; Tableau embedded analytics [6] | Dremio [5]; Tableau server; Tableau embedded analytics [6] |
| Handle Big Data (3V) | Yes [7] | No [8] | No [8] | No [9] |
| Configurable | High [10] | Medium [11] | Medium [11] | Low [12] |
| Learning Curve | High [13] | Medium [14] | Medium [14] | Low [15] |
| Real Time Sync | Yes, but may be costly due to DMS | Yes | Yes | Yes, but may incur additional costs and development time |
| Support for Mobile App | Yes | Yes | Yes | Yes |
| Scalable | Yes | Yes | Yes | No |

Solution Architecture – Data Source

Zero impact on the current application!

Solution Architecture – ETL

Real Time Processing as and when the data changes!
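
One of the options compared above pairs AWS SQS with AWS Lambda for transformation. As an illustration of processing data "as and when it changes", here is a minimal sketch of a Lambda-style handler that flattens change events as they arrive from a queue. The outer envelope (`Records`, `body`) follows the standard SQS-to-Lambda event shape; the payload fields (`op`, `collection`, `doc`) and the `to_row` helper are hypothetical and not taken from the talk.

```python
import json


def to_row(change):
    """Flatten one (hypothetical) change event into a row for the analytics store."""
    doc = change["doc"]
    return {
        "id": str(doc["_id"]),
        "collection": change["collection"],
        "operation": change["op"],
        "total": float(doc.get("total", 0)),
    }


def handler(event, context=None):
    """Invoked once per batch of SQS messages; each body is a JSON change event."""
    rows = [to_row(json.loads(record["body"])) for record in event["Records"]]
    # A real function would write `rows` to the analytics database here.
    return rows
```

Because the handler only depends on the event dictionary, it can be exercised locally with a hand-built event before any cloud resources exist.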

Solution Architecture – Storage

Real-time, low-cost solution; scale on the go!

Solution Architecture – Visualization

Fully integrated, seamless experience for the users!

ETL in MongoDB Stitch?

• Monolith
• Increased cost due to additional processing
• Single point of failure
• Not extensible

Lessons Learnt

• Keep cost in mind – Serverless can be expensive too
• Actively monitor Serverless executions, raise alerts when the threshold is crossed
• Beware of cold start
• Know your database limits, limit parallel executions
• Benchmark your infrastructure
• Transparently process failed messages
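
The last lesson, processing failed messages transparently, could look roughly like the sketch below, assuming an SQS-triggered Lambda function with partial batch responses (`ReportBatchItemFailures`) enabled on the event source mapping. The `process` function and its validation rule are hypothetical placeholders; only the `batchItemFailures` / `itemIdentifier` response shape comes from the Lambda SQS integration.

```python
import json


def process(payload):
    """Hypothetical per-message work; raises when the message cannot be handled."""
    if "doc" not in payload:
        raise ValueError("missing doc")
    return payload["doc"]


def handler(event, context=None):
    """Process each message independently and report only the ones that failed."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception as exc:
            # Log the failure so it is visible in monitoring, then flag the message.
            print(f"failed message {record['messageId']}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    # Messages listed here are retried; the rest of the batch is treated as done.
    return {"batchItemFailures": failures}
```

With this shape, one bad message does not force the whole batch to be retried, and messages that keep failing can be routed to a dead-letter queue if one is configured on the source queue. Capping the function's concurrency is one way to respect the database limits mentioned in the same list.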

Anything that can go wrong will go wrong! – Murphy’s Law