Apache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup Berlin

  • Date posted: 21-Jan-2018

Transcript of Apache Flink Meets Apache Mesos And DC/OS @ Mesos Meetup Berlin

  1. Till Rohrmann (trohrmann@apache.org, @stsffap): Apache Flink Meets Apache Mesos and DC/OS
  2. MapReduce is crunching data
  3. We need to turn faster!
  4. FMACK Stack. Events: ubiquitous data streams from connected devices (sensors, devices, clients). Pipeline stages: ingest, store, analyze, act. Components: Apache Kafka (ingest millions of events per second); Apache Cassandra (distributed & highly scalable database); Apache Flink (real-time and batch data processing); Akka (visualize data & build data-driven apps). Everything runs on Mesos / DC/OS.
  5. Datacenter
  6. Naive Approach. Typical datacenter: siloed, over-provisioned servers, low utilization (industry average: 12-15% utilization). Workloads: MySQL, microservice, Cassandra, Flink, Kafka.
  7. © 2017 Mesosphere, Inc. All Rights Reserved.
  8. Apache Mesos. Typical datacenter: siloed, over-provisioned servers, low utilization (industry average: 12-15% utilization). With Mesos: automated schedulers, workload multiplexing onto the same machines. Workloads: MySQL, microservice, Cassandra, Flink, Kafka.
  9. Original creators of Apache Flink. Providers of the dA Platform, a supported Flink distribution.
  10. Apache Flink in a Nutshell. Stateful, event-driven, event-time-aware processing for: event-driven applications (event sourcing, CQRS); stream processing / analytics (data streams, windows, ...); batch processing (data sets).
  11. Programming Model. Diagram: sources feed computations (transformations), which feed sinks; each computation holds its own state.
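
The programming model on slide 11 (sources, stateful computations, sinks) can be illustrated with a minimal plain-Python sketch. This is not Flink's API: the function names `source`, `running_sum`, `sink`, and `run_pipeline`, and the `(key, value)` event shape, are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[str, float]  # hypothetical event shape: (key, value)


def source(events: Iterable[Event]) -> Iterator[Event]:
    """Source: emits events one at a time, as a stream would."""
    for event in events:
        yield event


def running_sum(stream: Iterable[Event]) -> Iterator[Event]:
    """Stateful computation: keeps one running total per key."""
    state: Dict[str, float] = defaultdict(float)  # per-key state
    for key, value in stream:
        state[key] += value
        yield key, state[key]


def sink(stream: Iterable[Event]) -> List[Event]:
    """Sink: collects results; a real sink would write to an external system."""
    return list(stream)


def run_pipeline(events: Iterable[Event]) -> List[Event]:
    """Wire source -> stateful computation -> sink."""
    return sink(running_sum(source(events)))
```

Because every stage is a generator, each event flows through the whole chain before the next one is read, which loosely mirrors record-at-a-time stream processing.
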
  12. What is Flink Good For?
  13. Detecting fraud in real time. Live 24/7 service. Input: credit card transactions. Output: notifications and alerts. Evolving fraud models built by data scientists; as fraudsters get better, models need to be updated without downtime.
  14. AthenaX at Uber (https://eng.uber.com/athenax/). Streaming analytics platform with SQL as abstraction layer. Inputs: streams from Hadoop, Kafka, etc.; SQL, thresholds, actions. Outputs: analytics, alerts, derived streams.
  15. Blink, based on Flink, at Alibaba. A core system in Alibaba Search. Machine learning, search, recommendations. A/B testing of search algorithms. Online feature updates to boost conversion rate.
  16. Drivetribe: complete social network, implemented using event sourcing and CQRS (Command Query Responsibility Segregation). https://data-artisans.com/blog/drivetribe-cqrs-apache-flink
  17. Apache Flink & Apache Mesos
  18. Why Apache Mesos? Mesos offers full functionality to implement fault-tolerant and elastic distributed applications. 30% of survey respondents were running Flink on Mesos (September 2016, prior to proper Mesos support).
  19. Flink's Mesos Integration. Components: Mesos Master; Apache Flink framework, running as a Mesos app master that contains the Flink Mesos ResourceManager and the JobManager; TaskManagers, each running as a Mesos task. Interactions: allocate resources; launch Mesos tasks; register; execute job.
  20. Resource Manager Components. Connection Monitor: monitors the connection to Mesos. Launch Coordinator: resource offer processing and task scheduling; gathers offers and matches them to tasks using Fenzo. Task Monitor and Reconciliation Coordinator: monitor Mesos tasks; trigger reconciliation; make sure tasks are properly killed; reconcile the task view between ResourceManager and Mesos Master.
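
Slide 20 states that the Launch Coordinator gathers offers and matches them to tasks using Fenzo. The sketch below shows the general idea with a simple first-fit strategy in plain Python. It is a stand-in, not Fenzo's actual algorithm or API; the `Offer` and `TaskRequest` types and the `match_offers` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Offer:
    """Hypothetical resource offer from one agent."""
    offer_id: str
    cpus: float
    mem_mb: float


@dataclass
class TaskRequest:
    """Hypothetical resource request for one TaskManager."""
    task_id: str
    cpus: float
    mem_mb: float


def match_offers(offers: List[Offer], tasks: List[TaskRequest]) -> Dict[str, str]:
    """Assign each task to the first offer with enough remaining resources.

    Returns a mapping task_id -> offer_id. Tasks that fit nowhere are left
    out and would have to wait for a later round of offers.
    """
    remaining = {o.offer_id: [o.cpus, o.mem_mb] for o in offers}
    assignments: Dict[str, str] = {}
    for task in tasks:
        for offer in offers:
            cpus, mem = remaining[offer.offer_id]
            if task.cpus <= cpus and task.mem_mb <= mem:
                # Deduct the task's share so one offer can host several tasks.
                remaining[offer.offer_id] = [cpus - task.cpus, mem - task.mem_mb]
                assignments[task.task_id] = offer.offer_id
                break
    return assignments
```

A production scheduler would typically also weigh how well a task fits an offer and respect placement constraints; first-fit is used here only because it is the shortest strategy that demonstrates offer-to-task matching.
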
  21. Component Interplay. Actors: ResourceManager (Connection Monitor, Launch Coordinator, Task Monitor, Reconciliation Coordinator); Mesos Master; Mesos Task. Interactions: resource offers; launch tasks; monitor tasks; status messages; trigger reconciliation; reconcile tasks; start TaskManagers; recover tasks; kill task.
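
Slides 20 and 21 describe reconciling the task view between the ResourceManager and the Mesos Master. A minimal sketch of such a comparison, in plain Python, is shown below. It is not Flink's or Mesos's implementation: the `reconcile` function, the state string `"RUNNING"`, and the decision to relaunch or kill are assumptions chosen for illustration.

```python
from typing import Dict, Set, Tuple


def reconcile(expected: Set[str], reported: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Compare the local task view with the states the master reports.

    expected: task IDs the resource manager believes are running.
    reported: task ID -> state string reported by the master
              (the state names are illustrative).
    Returns (to_relaunch, to_kill).
    """
    running = {task_id for task_id, state in reported.items() if state == "RUNNING"}
    to_relaunch = expected - running  # expected locally, but not running
    to_kill = running - expected      # running, but unknown locally
    return to_relaunch, to_kill
```

For example, if the local view expects `tm-1` and `tm-2` while the master reports `tm-1` running, `tm-2` lost, and an unknown `tm-9` running, the sketch marks `tm-2` for relaunch and `tm-9` for killing.
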
  22. DC/OS. Layers: Datacenter Operating System (DC/OS); Distributed Systems Kernel (Mesos); any infrastructure (physical, virtual, cloud). Modern app components: big data + analytics engines and microservices (in containers), covering streaming, batch, machine learning, analytics, functions & logic, search, time series, SQL / NoSQL databases.
  23. Demo Time. Financial data is generated by a generator and written to Kafka topics. The Kafka topics are consumed by Flink. The Flink pipeline operates on the Kafka data. Results are written back into Kafka.
  24. Conclusion
  25. TL;DL. Apache Flink is a modern stream processor for real-time processing and event-driven applications. Apache Flink runs on Mesos using Fenzo. DC/OS offers an easy-to-use Flink package.
  26. Thank you! @stsffap @ApacheFlink @dataArtisans
  27. We are hiring! data-artisans.com/careers