Real-time data analytics in a Serverless environment
Manoj Aggarwal, Principal Engineer
Agenda
• User Needs
• Potential Solutions
• Serverless Architecture
• Lessons Learnt
User Needs
• Real-time Analytics and Visualization platform
  – Real-time visualization
  – Create and serve new dashboards on demand
• Business Constraints
  – Low-cost solution
  – Start with 1000 users, scale on the go
• Technology Constraints
  – Zero impact on the current application
Potential Solutions

| ETL Features | AWS - 1 | Azure | AWS - 2 | Self-Hosted |
|---|---|---|---|---|
| Extraction | AWS DMS, Custom Jobs | MongoDB Stitch | MongoDB Stitch | MongoDB Stitch, Custom Jobs |
| Transformation | AWS S3, AWS Glue | AZR Service Bus, AZR Functions | AWS SQS, AWS Lambda | None [1] |
| Storage | AWS S3 | AZR SQL Database | AWS Aurora [2] | Local MongoDB, EBS, EC2 |
| Visualization | AWS Athena, Tableau server, Tableau embedded analytics [6] | Power BI [3], Power BI embedded [4] | Tableau server, Tableau embedded analytics [6] | Dremio [5], Tableau server, Tableau embedded analytics [6] |
| Handle Big Data (3V) | Yes [7] | No [8] | No [8] | No [9] |
| Configurable | High [10] | Medium [11] | Medium [11] | Low [12] |
| Learning Curve | High [13] | Medium [14] | Medium [14] | Low [15] |
| Real Time Sync | Yes, but may be costly due to DMS | Yes | Yes | Yes, but may incur additional costs and development time |
| Support for Mobile App | Yes | Yes | Yes | Yes |
| Scalable | Yes | Yes | Yes | No |
Solution Architecture – Data Source
Zero impact on the current application!
Solution Architecture – ETL
Real Time Processing as and when the data changes!
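As an illustration of processing each change as it arrives, here is a minimal Python sketch of the transformation step. It assumes the extraction layer forwards MongoDB-style change events (fields `operationType`, `documentKey`, `fullDocument`) and that the trigger is configured to include the full document on updates. The function name `change_event_to_row` and the columns `user_id` and `amount` are hypothetical, chosen only for the example; the slides do not specify the actual schema.

```python
from typing import Any, Dict, Optional


def change_event_to_row(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Flatten one change event into a row for the analytics store.

    Returns None for operation types this sketch does not handle.
    """
    op = event.get("operationType")
    doc_id = str(event["documentKey"]["_id"])

    if op in ("insert", "update", "replace"):
        # Assumption: the full document is attached to the event.
        doc = event.get("fullDocument") or {}
        return {
            "id": doc_id,
            "deleted": False,
            # Hypothetical analytics columns.
            "user_id": doc.get("userId"),
            "amount": float(doc.get("amount", 0)),
        }

    if op == "delete":
        # Keep a tombstone row so dashboards can drop the record.
        return {"id": doc_id, "deleted": True, "user_id": None, "amount": 0.0}

    return None


event = {
    "operationType": "insert",
    "documentKey": {"_id": "a1"},
    "fullDocument": {"_id": "a1", "userId": "u7", "amount": "12.5"},
}
row = change_event_to_row(event)
```

Keeping the transformation a pure function of one event makes it easy to run inside a per-message serverless function and to test without any cloud resources.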
Solution Architecture – Storage
Real-time, low-cost solution, scale on the go!
Solution Architecture – Visualization
Fully integrated, seamless experience for the users!
ETL in MongoDB Stitch?
• Monolith
• Increased cost due to additional processing
• Single point of failure
• Not extensible
Lessons Learnt
• Keep cost in mind – Serverless can be expensive too
• Actively monitor Serverless executions, raise alerts when the threshold is crossed
• Beware of cold start
• Know your database limits, limit parallel executions
• Benchmark your infrastructure
• Transparently process failed messages
Anything that can go wrong will go wrong! – Murphy’s Law